This repository contains code used for experiments in the paper "Explaining deviating subsets through explanation networks" (A. Ukkonen, V. Dzyuba & M. van Leeuwen, ECMLPKDD 2017).
The implementation is written mainly in R. Please see the sources for dependencies on other R packages (such as AUC, foreign, plyr, and importantly, rJava) and install those separately.
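As a starting point, the packages named above could be installed with something like the following. This is only a sketch: the list covers just the packages mentioned in this README (the sources may load others), the variable names are made up for illustration, and it assumes all four packages are available from your configured CRAN mirror. rJava additionally needs a working Java installation.

```r
# Packages named in this README; the sources may require more.
pkgs <- c("AUC", "foreign", "plyr", "rJava")

# Find which of them are not yet installed (the variable name is arbitrary).
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
to_install <- pkgs[!is_installed]

# Install only the missing ones (assumes they are available on your CRAN mirror).
if (length(to_install) > 0) {
  install.packages(to_install)
}
```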
A few parts are implemented in Java, but for these only a pre-compiled library (the file tools.jar) is provided at the moment. It contains some routines that speed up computing the explanation strengths, as well as the greedy seed selection algorithm. If you are interested in the Java sources, please get in touch with me.
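If the scripts fail with Java-related errors, a quick sanity check along these lines may help to tell whether rJava can start a JVM and see the jar. This is a sketch, not part of the repository's code: it assumes the working directory is the repository root, and the R scripts presumably set up the class path themselves, so this is not needed for normal use.

```r
library(rJava)

# Start the JVM; this fails if no usable Java installation is found.
.jinit()

# Resolve the jar to an absolute path; errors if tools.jar is not in the
# current working directory (assumed here to be the repository root).
jar <- normalizePath("tools.jar", mustWork = TRUE)

# Add the jar to the JVM class path and print the class path to confirm.
.jaddClassPath(jar)
print(.jclassPath())
```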
To run a simple test with artificial data generated by the Bayes network (see the paper for details), please run source('test_seedselect_bayes.R'). This will produce a table that compares all algorithms, sorted by pattern set score.
To produce a plot similar to the ones in Figure 1 (top), run:
source('pagerank_parameter_test.R')
foo <- run_bayes_net_test()
make_all_algs_plot( foo )
More documentation will be added, soonish, I hope.