Transition package to demonstrate fminer2 reproduces KDD 2009 results
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
README
RUNME

README

Welcome to Fminer2 transition package.

This package aims to port and reproduce the KDD 2009 results on BBRC features to Fminer 2.0 version.
More specifically, it reproduces the leave-one-out crossvalidation results from table 4, p. 7. You may download the article from http://cs.maunz.de ("Large-Scale Graph Mining Using Backbone Refinement Classes").

STEPS
=====

The application "RUNME" downloads and compiles fminer2 as well as the C++ version of lazar ("lazar-core"). It also downloads the CPDB carcinogenicity data that was used in the original study. For details see http://bbrc.maunz.de.
The validation is done using the original validation software (loo2summary.rb).

USAGE
=====

Assumptions: you run a debian-based environment (e.g. Ubuntu, used here).

1) Perhaps you have R installed, otherwise install it:
sudo aptitude install r-base-dev r-base r-base-core git-core # required base packages
Create a link to R: 
sudo ln -sf /usr/lib/R/lib/libR.so /usr/lib/ # or whereever your libR.so resides (default value).

2) Install the following packages:
sudo aptitude install build-essential libopenbabel-dev libgsl0-dev ruby1.8-dev libboost1.40-dev
sudo updatedb # to find header files later on

3) Install the gem "rsruby" to integrate R in ruby. 
sudo aptitude install rubygems
export "R_HOME=/usr/lib/R"
echo "R_HOME=/usr/lib/R" >> $HOME/.bashrc
sudo gem install rsruby -- --with-R-dir=$R_HOME --with-R-include=/usr/share/R/include/

For issues when setting up rsruby, see http://rubyforge.org/projects/rsruby/.

4) Run RUNME. It will try to download and compile fminer2, lazar-core and the cpdbdata. If compilation fails, see the respective repository READMEs at http://github.com/amaunz/fminer2 and http://github.com/amaunz/lazar-core.

If compilation was successful, it will download the CPDB data and run leave-one-out crossvalidation on it. Individual predictions can be found in the .loo file, test statistics in the .summary file. They should correspond closely to the figures given in the KDD 2009 paper, table 4, last row ("BBRC Repr."), which you can get from http://cs.maunz.de.

(c) 2010 Andreas Maunz, sep 2010