monoensemble

This package implements monotone versions of the Random Forest and Gradient Boosting classification algorithms. The two provided classifiers (MonoRandomForestClassifier and MonoGradientBoostingClassifier) are fast, achieve perfect monotonicity, and support multi-class problems. They also support partial monotonicity (i.e. the ability to specify both monotone and non-monotone features). Both are based on the corresponding scikit-learn classifiers and provide all the associated functionality (e.g. out-of-bag performance estimates). The theory is described in Bartley C., Liu W., Reynolds M., 2019, Enhanced Random Forest for Partially Monotone Ordinal Classification, AAAI 2019 (prepub).
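
To make the monotonicity property concrete, here is a minimal sketch of a spot check: it sweeps one feature of a single sample and confirms that the predicted ordinal label never moves in the wrong direction. The helper is_monotone_in_feature and the stand-in model toy_predict are hypothetical names introduced for illustration and are not part of this package. The check only probes one sample, so it is a sanity test rather than a proof.

```python
def is_monotone_in_feature(predict, row, col, values, increasing=True):
    """Spot-check monotonicity of `predict` along one feature.

    predict: callable taking a list of rows and returning one ordinal
             label per row (scikit-learn style predict).
    row:     a single sample (list of feature values) held fixed.
    col:     zero-based position of the feature to sweep.
    values:  candidate values for that feature.
    """
    preds = []
    for v in sorted(values):
        probe = list(row)      # copy so the caller's sample is untouched
        probe[col] = v
        preds.append(predict([probe])[0])
    pairs = list(zip(preds, preds[1:]))
    if increasing:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)


# Hypothetical stand-in model: the label rises with feature 0 and ignores feature 1.
def toy_predict(rows):
    return [0 if r[0] < 2 else (1 if r[0] < 5 else 2) for r in rows]


sweep = [0, 1, 2, 4, 6]
ok_incr = is_monotone_in_feature(toy_predict, [0.0, 3.0], 0, sweep)
ok_decr = is_monotone_in_feature(toy_predict, [0.0, 3.0], 0, sweep, increasing=False)
```

With a fitted classifier, its predict method could be passed in place of toy_predict, assuming the class labels are ordered.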

Code Example

First we define the monotone features, using the corresponding one-based X array column indices:

incr_feats=[6,9]
decr_feats=[1,8,13]
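
Because the indices are one-based while Python lists are zero-based, it can be safer to derive them from column names than to count by hand. The following is a minimal sketch; the helper one_based_indices and the column names are hypothetical and only illustrate the conversion.

```python
def one_based_indices(all_columns, selected):
    """Return the one-based positions of `selected` within `all_columns`."""
    return [all_columns.index(name) + 1 for name in selected]


# Hypothetical column names for a 13-column X array (illustration only).
columns = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m"]

incr_feats = one_based_indices(columns, ["f", "i"])
decr_feats = one_based_indices(columns, ["a", "h", "m"])

# A feature cannot be constrained in both directions at once.
overlap = set(incr_feats) & set(decr_feats)
if overlap:
    raise ValueError("features listed as both increasing and decreasing: %s" % sorted(overlap))
```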

Then specify the hyperparameters (see the original paper for an explanation):

# Ensure you have a reasonable number of trees
n_estimators=200
mtry = 3

Then initialise and fit the classifier, following scikit-learn conventions:

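# Assumption: mono_forest is the monoensemble submodule, e.g. imported with
#   from monoensemble import mono_forest
# X and y are the training features and class labels.
clf = mono_forest.MonoRandomForestClassifier(n_estimators=n_estimators,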
                                             max_features=mtry,
                                             incr_feats=incr_feats,
                                             decr_feats=decr_feats)
clf.fit(X, y)
y_pred = clf.predict(X)

Of course, the above will usually be embedded in some estimate of generalisation error, such as the out-of-bag (oob) score or cross-validation.
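
As a rough illustration of the cross-validation option, the sketch below runs k-fold cross-validation in plain Python against any object with scikit-learn style fit and predict methods. The function k_fold_accuracy and the MajorityClassifier stand-in are hypothetical names, used so the example runs without this package installed. In practice scikit-learn's own cross_val_score is the more usual route.

```python
import random


def k_fold_accuracy(make_clf, X, y, k=5, seed=0):
    """Estimate accuracy by k-fold cross-validation.

    make_clf: zero-argument callable returning a fresh, unfitted classifier
              with scikit-learn style fit(X, y) and predict(X) methods.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        clf = make_clf()
        clf.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = clf.predict([X[i] for i in test_idx])
        correct = sum(int(p == y[i]) for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / k


class MajorityClassifier:
    """Hypothetical stand-in: always predicts the most common training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


# Toy data: 20 samples, 15 of class 0 and 5 of class 1.
X_toy = [[i] for i in range(20)]
y_toy = [0] * 15 + [1] * 5
score = k_fold_accuracy(MajorityClassifier, X_toy, y_toy, k=5)
```

To evaluate the monotone forest instead, make_clf could be a function that builds the MonoRandomForestClassifier with the hyperparameters shown above.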

Documentation

For more examples see the documentation.

Installation

To install, clone this repo, then run setup.py with the arguments 'build_ext --inplace' (i.e. python setup.py build_ext --inplace) to build the package for your machine.

(The pip and conda versions are not compatible with current scikit-learn versions.)

Contributors

Pull requests welcome! Notes:

  • We use the PEP8 code formatting standard, and we enforce this by running a code-linter called flake8 during continuous integration.
  • Continuous integration is used to run the tests in /monoensemble/tests/test_monoensemble.py, using Travis (Linux) and Appveyor (Windows).

License

BSD 3-Clause. Copyright (c) 2017, Christopher Bartley. All rights reserved.