[DEPRECATED] An innovative technique that constructs an ensemble of decision trees and converts this ensemble into a single, interpretable decision tree with an enhanced predictive performance

IBCNServices/GENESIM

GENESIM: GENetic Extraction of a Single, Interpretable Model

This repository contains an algorithm that first constructs an ensemble using well-known decision tree induction algorithms such as CART, C4.5, QUEST and GUIDE, combined with bagging and boosting. This ensemble is then converted to a single, interpretable decision tree in a genetic fashion. For a certain number of iterations, random pairs of decision trees are merged by first converting them to sets of k-dimensional hyperplanes and then calculating the intersection of these two sets (a classic problem from computational geometry). Moreover, in each iteration, an individual is mutated with a certain probability. After these iterations, the accuracy on a validation set is measured for each of the decision trees in the population, and the one with the highest accuracy (and the lowest number of nodes in case of a tie) is returned. Example.py contains code that runs all implemented algorithms and reports their average predictive performance, computational complexity and model complexity on a number of datasets.
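The merge and selection steps described above can be illustrated with a minimal sketch. It is not the code from this repository: it assumes that each leaf of a tree can be represented as an axis-aligned box (one interval per feature) with a class distribution, it intersects the two sets of boxes by brute force, and it averages the class distributions of overlapping boxes. The names `intersect_boxes`, `merge_regions` and `select_best`, as well as the data layout, are hypothetical.

```python
from itertools import product

# Assumed representation (hypothetical, not GENESIM's internal structure):
# a region is a pair (box, probs), where box maps each feature name to a
# half-open interval (low, high) and probs maps class labels to probabilities.


def intersect_boxes(box_a, box_b):
    """Return the overlap of two axis-aligned boxes, or None if they are disjoint."""
    overlap = {}
    for feature in box_a:
        low = max(box_a[feature][0], box_b[feature][0])
        high = min(box_a[feature][1], box_b[feature][1])
        if low >= high:
            return None  # empty interval on this feature: no overlap
        overlap[feature] = (low, high)
    return overlap


def merge_regions(regions_a, regions_b):
    """Intersect every leaf region of one tree with every leaf region of the other."""
    merged = []
    for (box_a, probs_a), (box_b, probs_b) in product(regions_a, regions_b):
        overlap = intersect_boxes(box_a, box_b)
        if overlap is None:
            continue
        # Assumption: overlapping regions combine by averaging class distributions.
        classes = set(probs_a) | set(probs_b)
        probs = {c: (probs_a.get(c, 0.0) + probs_b.get(c, 0.0)) / 2 for c in classes}
        merged.append((overlap, probs))
    return merged


def select_best(population):
    """Pick the individual with the highest accuracy, then the fewest nodes.

    population is a list of (name, validation_accuracy, node_count) tuples.
    """
    return max(population, key=lambda individual: (individual[1], -individual[2]))


# Tree A splits on x at 5, tree B splits on y at 3.
tree_a = [({"x": (0, 5), "y": (0, 10)}, {"A": 1.0}),
          ({"x": (5, 10), "y": (0, 10)}, {"B": 1.0})]
tree_b = [({"x": (0, 10), "y": (0, 3)}, {"A": 1.0}),
          ({"x": (0, 10), "y": (3, 10)}, {"B": 1.0})]

merged = merge_regions(tree_a, tree_b)
best = select_best([("t1", 0.9, 7), ("t2", 0.9, 5), ("t3", 0.8, 3)])
```

The pairwise loop is quadratic in the number of leaves, which is fine for illustration; a real implementation would likely use a more efficient intersection algorithm and would still have to convert the merged regions back into a tree.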

Dependencies

An install.sh script is provided that installs all required dependencies.

Documentation

A documentation page is available in the doc/ directory. Download the complete directory and open index.html.

Decision Tree Induction Algorithm Wrappers

Wrappers are provided around Orange C4.5, sklearn CART, GUIDE and QUEST. Each wrapper returns a decision tree object, which is defined in decisiontree.py. Several methods are available on this decision tree: classifying new, unknown samples; visualising the tree; exporting it to string, JSON and DOT; etc.
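To give an idea of what classification and DOT export on such an object involve, here is a minimal sketch. The `TreeNode` class, its attributes and its method names are hypothetical and do not reproduce the class in decisiontree.py; consult the documentation in doc/ for the actual interface.

```python
class TreeNode:
    """Hypothetical minimal binary decision tree node (not the repository's class)."""

    def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
        self.feature = feature      # feature tested at an internal node
        self.threshold = threshold  # samples with value <= threshold go left
        self.left = left
        self.right = right
        self.label = label          # class label, set only on leaves

    def classify(self, sample):
        """Walk from the root to a leaf and return that leaf's label."""
        node = self
        while node.label is None:
            if sample[node.feature] <= node.threshold:
                node = node.left
            else:
                node = node.right
        return node.label

    def to_dot(self):
        """Export the tree as a Graphviz DOT string."""
        lines = ["digraph tree {"]
        counter = [0]  # mutable counter shared by the nested function

        def visit(node):
            node_id = counter[0]
            counter[0] += 1
            if node.label is not None:
                lines.append(f'  n{node_id} [label="{node.label}", shape=box];')
            else:
                lines.append(f'  n{node_id} [label="{node.feature} <= {node.threshold}"];')
                left_id = visit(node.left)
                right_id = visit(node.right)
                lines.append(f'  n{node_id} -> n{left_id} [label="yes"];')
                lines.append(f'  n{node_id} -> n{right_id} [label="no"];')
            return node_id

        visit(self)
        lines.append("}")
        return "\n".join(lines)


tree = TreeNode("x", 5, left=TreeNode(label="A"), right=TreeNode(label="B"))
dot = tree.to_dot()
```

The resulting string can be saved to a file and rendered with Graphviz, for example with `dot -Tpng tree.dot -o tree.png`.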

Ensemble Technique Wrappers

Wrappers are provided around the well-known ensemble techniques XGBoost and Random Forests.

Similar techniques

A wrapper written around the R package inTrees and an implementation of ISM can be found in the constructors package.

New dataset

A new dataset can easily be plugged into the benchmark. To do so, write a load_dataset() function in load_datasets.py.
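As a rough illustration of what such a loader might look like, the sketch below parses a small CSV and separates the feature columns from the label column. The function name `load_example_dataset`, the inline data and, in particular, the returned tuple are assumptions made for this example; check the existing loaders in load_datasets.py for the format the benchmark actually expects.

```python
import csv
import io

# Hypothetical inline data standing in for a CSV file on disk.
_EXAMPLE_CSV = """sepal_length,sepal_width,species
5.1,3.5,setosa
6.2,2.9,versicolor
5.9,3.0,virginica
"""


def load_example_dataset(path=None):
    """Load a dataset and describe its columns.

    Returns (rows, feature_names, label_name, dataset_name). This return
    shape is an assumption, not the contract used by the benchmark.
    """
    label_name = "species"
    # Fall back to the inline example when no file path is given.
    handle = io.StringIO(_EXAMPLE_CSV) if path is None else open(path, newline="")
    with handle:
        reader = csv.DictReader(handle)
        feature_names = [name for name in reader.fieldnames if name != label_name]
        rows = []
        for record in reader:
            # Features are numeric in this example; the label stays a string.
            row = {name: float(record[name]) for name in feature_names}
            row[label_name] = record[label_name]
            rows.append(row)
    return rows, feature_names, label_name, "example"


rows, feature_names, label_name, dataset_name = load_example_dataset()
```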

Contact

You can contact me at givdwiel.vandewiele at ugent.be for any questions, proposals or if you wish to contribute.

Referring

Please cite my work when you use it, by referring either to this GitHub repository or to the following (as yet unpublished) paper:

@article{vandewiele2016genesim,
  title={GENESIM: genetic extraction of a single, interpretable model},
  author={Vandewiele, Gilles and Janssens, Olivier and Ongenae, Femke and De Turck, Filip and Van Hoecke, Sofie},
  journal={arXiv preprint arXiv:1611.05722},
  year={2016}
}
