

Benchmark suite for EasyMiner

Preparing data

Prerequisites include Python 2 with scikit-learn and pandas.

The benchmark uses standard open datasets from the UCI repository. To ensure that algorithm implementations on all platforms (Weka, R) operate on exactly the same folds, the folds are materialized. Two versions of the folds are created, one without discretization of numerical attributes and one with it. Missing values are treated in both versions.
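As an illustration of what materializing folds means, the following is a minimal sketch that assigns rows to stratified folds with a fixed seed and writes one train and one test CSV file per fold. It is not the repository's preparation script: the function names, the file naming scheme (`train0.csv`, `test0.csv`, ...) and the default of ten folds are assumptions made for this example, and discretization and missing-value treatment are left out.

```python
import csv
import os
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each row index to a fold, spreading every class evenly across folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    assignment = [None] * len(labels)
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Round-robin assignment; the offset keeps fold sizes balanced
        # when moving from one class to the next.
        for pos, idx in enumerate(indices):
            assignment[idx] = (pos + offset) % n_folds
        offset += len(indices)
    return assignment


def materialize_folds(header, rows, label_col, out_dir, n_folds=10, seed=0):
    """Write trainK.csv and testK.csv for every fold K into out_dir (hypothetical naming)."""
    labels = [row[label_col] for row in rows]
    assignment = stratified_folds(labels, n_folds, seed)
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    for fold in range(n_folds):
        for part in ("train", "test"):
            path = os.path.join(out_dir, "%s%d.csv" % (part, fold))
            with open(path, "w") as handle:
                writer = csv.writer(handle, lineterminator="\n")
                writer.writerow(header)
                for row, row_fold in zip(rows, assignment):
                    # A row goes to the test file of its own fold
                    # and to the train file of every other fold.
                    if (row_fold == fold) == (part == "test"):
                        writer.writerow(row)
    return assignment
```

Because the files are written once and then read by every platform, each implementation sees identical train/test splits regardless of how its own cross-validation routine would shuffle the data.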

The output is saved into


The process also creates a temporary folder


Since this process takes a long time, precomputed folds are shipped zipped in


and can be unzipped with

Running benchmarks - WEKA

Weka implementations of PART, J48 and RIPPER with grid-based metaparameter optimization are executed using


All benchmarks use raw, undiscretized data.

If interrupted, running the file again will compute the missing results.
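The resume behaviour can be pictured with a small sketch: before running a dataset, check whether its result file already exists, and skip it if so. This is only an assumed mechanism for illustration; the function name `run_missing`, the `.result` file extension and the one-file-per-dataset layout are hypothetical and not taken from the benchmark scripts.

```python
import os


def run_missing(datasets, result_dir, run_one):
    """Call run_one(name) only for datasets that have no saved result yet.

    Returns the list of dataset names computed in this call.
    """
    if not os.path.isdir(result_dir):
        os.makedirs(result_dir)
    computed = []
    for name in datasets:
        result_path = os.path.join(result_dir, name + ".result")
        if os.path.exists(result_path):
            continue  # finished in an earlier run
        result = run_one(name)
        # Write to a temporary file first, so that an interruption never
        # leaves a half-written file that looks like a finished result.
        tmp_path = result_path + ".tmp"
        with open(tmp_path, "w") as handle:
            handle.write(str(result))
        os.rename(tmp_path, result_path)
        computed.append(name)
    return computed
```

With this pattern, a second invocation after an interruption only pays for the datasets that did not finish the first time.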

Running benchmarks - Python

scikit-learn decision tree benchmarks are run with


Uses raw, undiscretized data.

Running benchmarks - EasyMiner

First, enter a valid API_KEY and API_URL into

The default benchmark (cba_d) of the rCBA implementation in EasyMiner is run with


The benchmark of auto-tuned CBA (cba_a) can be run with


By default, the benchmarks run in five parallel threads. This can be changed by passing PARALLEL_THREADS command line option to or
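A thread count with a default of five could be wired up roughly as follows. This is a sketch under assumptions, not the benchmark's actual code: it assumes PARALLEL_THREADS is given as an optional positional argument, and the helper names `parse_threads` and `run_parallel` are made up for the example. `ThreadPool` from `multiprocessing.pool` is used because it is available in both Python 2 and Python 3.

```python
import argparse
from multiprocessing.pool import ThreadPool


def parse_threads(argv=None):
    """Read the optional thread count; fall back to five when it is omitted."""
    parser = argparse.ArgumentParser()
    # Assumption: the option is positional, e.g. `python script.py 3`.
    parser.add_argument("parallel_threads", nargs="?", type=int, default=5)
    return parser.parse_args(argv).parallel_threads


def run_parallel(tasks, worker, n_threads):
    """Apply worker to every task using n_threads threads; results keep task order."""
    pool = ThreadPool(n_threads)
    try:
        return pool.map(worker, tasks)
    finally:
        pool.close()
        pool.join()
```

Threads are a reasonable fit when each task mostly waits on a remote API, since the waiting time of one request overlaps with the others.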

Uses discretized data.

If interrupted, running the file again will compute the missing results.

Note that returns slightly different results in each execution due to time limits used in the optimization algorithm.

Generating won-tie-loss matrix

The won-tie-loss matrix is generated and the Wilcoxon signed-rank test is run using:
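For readers unfamiliar with the format, a won-tie-loss matrix counts, for each ordered pair of classifiers, on how many datasets the first one scored higher, the same, or lower than the second. The sketch below shows one way to compute it from per-dataset scores. It is illustrative only and not the repository's script; the function name, the input layout and the `tolerance` parameter are assumptions.

```python
def won_tie_loss(scores, tolerance=0.0):
    """Count wins, ties and losses for every ordered pair of classifiers.

    scores maps a classifier name to a list of per-dataset scores, with all
    lists in the same dataset order. The result maps (a, b) to a tuple
    (won, tie, loss) seen from classifier a's point of view.
    """
    names = sorted(scores)
    matrix = {}
    for a in names:
        for b in names:
            if a == b:
                continue
            won = tie = loss = 0
            for x, y in zip(scores[a], scores[b]):
                if abs(x - y) <= tolerance:
                    tie += 1  # scores within tolerance count as a tie
                elif x > y:
                    won += 1
                else:
                    loss += 1
            matrix[(a, b)] = (won, tie, loss)
    return matrix
```

The same paired per-dataset scores are the natural input for a Wilcoxon signed-rank test, which SciPy provides as `scipy.stats.wilcoxon`.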


All benchmarks are saved into: