In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

In [2]:
import performance_benchmark as pbench

# Intro

Several Python packages support Formal Concept Analysis (FCA). This notebook compares their performance on the basic FCA task: constructing the concept lattice of a formal context.

We consider two packages: `FCApy` and `Concepts`.

More packages may be added to the comparison in the future.
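To make the task concrete, here is a minimal sketch in plain Python of what computing the concepts of a formal context involves. The toy context and the helper names (`extent`, `intent`, `all_concepts`) are illustrative assumptions and not part of `FCApy` or `Concepts`. The brute-force enumeration is exponential in the number of attributes and is not meant to reflect the algorithms the libraries use.

```python
from itertools import combinations

# Toy formal context (hypothetical data, not one of the benchmark datasets):
# each object maps to the set of attributes it has.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}
attributes = set().union(*context.values())


def extent(attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(g for g, has in context.items() if attrs <= has)


def intent(objs):
    """Attributes shared by every object in `objs`."""
    shared = set(attributes)
    for g in objs:
        shared &= context[g]
    return frozenset(shared)


def all_concepts():
    """Enumerate (extent, intent) pairs by closing every attribute subset.

    Exponential in the number of attributes, so only suitable for toy inputs.
    """
    found = set()
    for size in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            ext = extent(set(attrs))
            found.add((ext, intent(ext)))
    return found


concepts = all_concepts()
for ext, itt in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

Ordering these (extent, intent) pairs by inclusion of their extents yields the concept lattice. The benchmark below measures how long each library takes to build that structure.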

# Install competitor libraries

`FCApy` package (by Egor Dudyrev, HSE Moscow): https://github.com/EgorDudyrev/FCApy 

`Concepts` package (by Sebastian Bank, University of Leipzig): https://github.com/xflr6/concepts

`fcapsy` package (by Tomáš Mikula, Palacký University): https://github.com/mikulatomas/fcapsy

Update: we dropped the `fcapsy` package from the benchmark because it now uses the `concepts` package under the hood.

# Load data

First, we load some classic FCA contexts (datasets).

In [3]:
contexts_to_test = ['animal_movement', 'digits', 'gewaesser','lattice', 'liveinwater', 'tealady']
frames_classic = pbench.load_classic_context(contexts_to_test)

We add the Bob Ross dataset, which has more objects and attributes than the classic FCA datasets.

In [4]:
frames_classic['bob_ross'] = pbench.load_bob_ross_dataframe()

These classic real-world contexts are small, so we also add larger random contexts to the comparison.

In [5]:
n_objects_vars = [10, 30, 100]
n_attributes_vars = [10, 30, 50]
densities_vars = [0.1, 0.5, 0.9]

frames_random = pbench.generate_random_contexts(n_objects_vars, n_attributes_vars, densities_vars)
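As a rough illustration of what a random context with a given size and density could look like, the sketch below fills a boolean cross table cell by cell. `random_context` is a hypothetical helper and not the implementation of `pbench.generate_random_contexts`. It also assumes the three parameter lists are combined as a full grid, which is consistent with the 680 runs counted later in the notebook.

```python
import random
from itertools import product


def random_context(n_objects, n_attributes, density, seed=0):
    """Return a boolean cross table as a list of rows.

    Hypothetical helper: each cell is True independently with
    probability `density`, so the realised density only approximates it.
    """
    rng = random.Random(seed)
    return [
        [rng.random() < density for _ in range(n_attributes)]
        for _ in range(n_objects)
    ]


# Assumed full grid over the three parameter lists: 3 * 3 * 3 combinations.
grid = list(product([10, 30, 100], [10, 30, 50], [0.1, 0.5, 0.9]))
print(len(grid))  # 27

table = random_context(n_objects=10, n_attributes=10, density=0.1)
n_connections = sum(sum(row) for row in table)
print(n_connections / (10 * 10))  # realised density; may differ from the requested 0.1
```

With independent sampling of this kind, a context requested at density 0.1 can end up slightly sparser or denser than requested.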

In [6]:
frames = dict(frames_classic, **frames_random)
#frames = dict(frames_classic)

# Run benchmarks

## Default lattice visualizations

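Let us take one classic FCA context, 'animal movement', and a bigger one, the 'bob ross' dataset. Note that the cell below selects 'animal_movement' and 'tealady', with 'bob_ross' commented out.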

Description of the 'animal movement' context:
* objects (rows) are animals
* attributes (columns) are actions
* the table shows whether an animal can perform an action

Description of the Bob Ross dataset:
* objects (rows) are paintings by Bob Ross
* attributes (columns) are specific elements in these paintings
* the table shows whether an element appears in a painting

In [7]:
K_names = ['animal_movement', 'tealady']  # , 'bob_ross']

### Visualization by `concepts`

In [8]:
pbench.visualize_by_concepts(K_names, frames, "imgs/lattice_visualization/concepts")

animal_movement
Lattice constructed in 0.001369 seconds
Executed in 0.043852 seconds
tealady
Lattice constructed in 0.006612 seconds
Executed in 0.05267 seconds


### Visualization by `fcapy`

In [9]:
pbench.visualize_by_fcapy(K_names, frames, "imgs/lattice_visualization/fcapy")

animal_movement
Lattice constructed in 0.001838 seconds
Visualizer constructed in 0.002062 seconds
Png saved in 0.1977 seconds
tealady
Lattice constructed in 0.027946 seconds
Visualizer constructed in 0.028471 seconds
Png saved in 0.834006 seconds


## Time to construct a lattice

Run the benchmarks: every library is timed on every context `n_runs` times, with a timeout of `timeout_secs` seconds.

In [10]:
n_runs = 10
timeout_secs = 5*60
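A minimal sketch of how repeated timings can be collected is shown below. `time_runs` is a hypothetical helper and not part of `performance_benchmark`. It only measures wall-clock time and does not enforce a timeout; interrupting a run after `timeout_secs` would require extra machinery, such as running each build in a separate process.

```python
import time


def time_runs(build, n_runs=10):
    """Call `build()` n_runs times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        build()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload; in the benchmark this would be a lattice construction call.
durations = time_runs(lambda: sum(range(10_000)), n_runs=10)
print(min(durations), max(durations))
```

Repeating each measurement several times makes it possible to report a median or a spread instead of a single, noisy timing.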

In [11]:
frames_order = sorted(frames, key=lambda K_name: pbench.get_context_stat(frames[K_name])['n_connections'])

In [12]:
ctx_names_vals = frames_order
lib_names_vals = ['concepts', 'fcapy']#, 'fcapsy']
n_runs*len(ctx_names_vals)*len(lib_names_vals)

680

In [13]:
import pandas as pd

In [14]:
stats_df = pd.read_csv('benchmark_stats_tmp.csv', index_col=0)

In [15]:
print(stats_df['is_computed'].mean())

0.9882352941176471


In [16]:
stats_df[stats_df['is_computed']].to_csv('benchmark_stats.csv')
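The numbers reported so far can be cross-checked with a little arithmetic. The sketch below assumes 34 contexts in total: the 6 classic ones, `bob_ross`, and a full 3 x 3 x 3 grid of random contexts.

```python
n_runs = 10
n_contexts = 7 + 27  # 6 classic contexts + bob_ross, plus 3*3*3 random ones (assumed full grid)
n_libs = 2

planned = n_runs * n_contexts * n_libs
computed = round(0.9882352941176471 * planned)
print(planned, computed, planned - computed)
```

This gives 680 planned runs, of which 672 were computed and 8 were not. The 672 agrees with the number of rows reported when `benchmark_stats.csv` is reloaded.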

# Analyze the results

In [17]:
import pandas as pd

In [18]:
stats_df = pd.read_csv('benchmark_stats.csv', index_col=0)
print(stats_df.shape)
stats_df.head()

(672, 13)


| | run_number | ctx_name | lib_name | is_computed | lattice_construction_time (secs) | intent_time (secs) | extent_time (secs) | timeout_seconds | n_objects | n_attributes | n_connections | density | is_random |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | random_10_10_0.1 | concepts | True | 0.000682 | 5e-06 | 5e-06 | 300.0 | 10.0 | 10.0 | 9.0 | 0.09 | True |
| 1 | 0 | random_10_10_0.1 | fcapy | True | 0.002102 | 4e-06 | 6e-06 | 300.0 | 10.0 | 10.0 | 9.0 | 0.09 | True |
| 2 | 0 | animal_movement | concepts | True | 0.00202 | 5e-06 | 4e-06 | 300.0 | 16.0 | 4.0 | 24.0 | 0.375 | False |
| 3 | 0 | animal_movement | fcapy | True | 0.001931 | 5e-06 | 7e-06 | 300.0 | 16.0 | 4.0 | 24.0 | 0.375 | False |
| 4 | 0 | gewaesser | concepts | True | 0.001259 | 4e-06 | 4e-06 | 300.0 | 8.0 | 6.0 | 24.0 | 0.5 | False |


In [19]:
stats_df = stats_df.fillna(timeout_secs)

In [20]:
context_stat_feats = ['n_objects', 'n_attributes', 'n_connections', 'density']
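The rows of `stats_df` shown earlier are consistent with density being the share of filled cells: 24 / (16 * 4) = 0.375 for `animal_movement`, 24 / (8 * 6) = 0.5 for `gewaesser`, and 9 / (10 * 10) = 0.09 for `random_10_10_0.1`. The sketch below computes the four features from a boolean cross table. `context_stats` is a hypothetical helper and not the implementation of `pbench.get_context_stat`.

```python
def context_stats(table):
    """Compute size statistics of a boolean cross table given as a list of rows."""
    n_objects = len(table)
    n_attributes = len(table[0]) if table else 0
    n_connections = sum(sum(row) for row in table)
    n_cells = n_objects * n_attributes
    return {
        "n_objects": n_objects,
        "n_attributes": n_attributes,
        "n_connections": n_connections,
        "density": n_connections / n_cells if n_cells else 0.0,
    }


# Toy data: 2 objects x 3 attributes with 3 crosses.
stats = context_stats([[True, False, True], [False, True, False]])
print(stats)
```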

In [21]:
pbench.save_context_stats(stats_df, context_stat_feats)

In [22]:
pbench.save_lattice_time_plot('imgs/lattice_construction_time/classic_contexts.png', context_stat_feats, stats_df[~stats_df['is_random']], timeout_secs)
pbench.save_lattice_time_plot('imgs/lattice_construction_time/random_contexts.png', context_stat_feats, stats_df[stats_df['is_random']], timeout_secs)
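Alongside the plots, a per-context ratio of construction times gives a quick numeric comparison of the two libraries. The sketch below uses only the standard library and four rows copied from the `stats_df` preview shown earlier, so it illustrates the computation and does not summarise the full benchmark.

```python
from collections import defaultdict
from statistics import median

# (ctx_name, lib_name, lattice construction time in seconds),
# copied from the first rows of the stats_df preview.
rows = [
    ("random_10_10_0.1", "concepts", 0.000682),
    ("random_10_10_0.1", "fcapy", 0.002102),
    ("animal_movement", "concepts", 0.00202),
    ("animal_movement", "fcapy", 0.001931),
]

# Group the timings by (context, library).
times = defaultdict(list)
for ctx_name, lib_name, secs in rows:
    times[(ctx_name, lib_name)].append(secs)

# Median over runs, then the fcapy-to-concepts ratio per context.
medians = {key: median(vals) for key, vals in times.items()}
for ctx_name in sorted({ctx for ctx, _ in medians}):
    ratio = medians[(ctx_name, "fcapy")] / medians[(ctx_name, "concepts")]
    print(f"{ctx_name}: fcapy / concepts = {ratio:.2f}")
```

On the full table, `stats_df.groupby(['ctx_name', 'lib_name'])['lattice_construction_time (secs)'].median()` would produce the same per-group medians with pandas.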

In [None]:
pbench.save_extent_intent_time_plot('imgs/intent_extent_time/all_data.png', context_stat_feats, stats_df)