This package implements monotone versions of the Random Forest and Gradient Boosting classification algorithms. The two provided classifiers (MonoRandomForestClassifier and MonoGradientBoostingClassifier) are fast, achieve perfect monotonicity, support multi-class problems, and offer partial monotonicity (i.e. the ability to specify both monotone and non-monotone features). They are based on the corresponding scikit-learn classifiers and provide all the associated functionality (such as out-of-bag performance estimates). The theory is described in Bartley C., Liu W., Reynolds M., 2019, Enhanced Random Forest for Partially Monotone Ordinal Classification, AAAI 2019.
First we define the monotone features, using their one-based column indices in the X array:
    incr_feats = [6, 9]
    decr_feats = [1, 8, 13]
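For a self-contained run of the example below, you can first generate a toy dataset (purely illustrative; any `(n_samples, n_features)` array with at least 13 columns matches the indices above):

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.rand(500, 13)  # 13 features, so the one-based indices above are valid
    # Toy binary target: increases with features 6 and 9 (columns 5 and 8),
    # decreases with features 1, 8 and 13 (columns 0, 7 and 12)
    score = X[:, 5] + X[:, 8] - X[:, 0] - X[:, 7] - X[:, 12]
    y = (score > np.median(score)).astype(int)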
Then specify the hyperparameters (see the original paper for an explanation):
    # Ensure you have a reasonable number of trees
    n_estimators = 200
    mtry = 3
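If you are unsure what to use for `mtry`, the usual random forest heuristic of the square root of the number of features is a reasonable starting point (our suggestion, not one from the paper):

    import numpy as np

    mtry = max(1, int(np.sqrt(X.shape[1])))  # common default for classification forests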
Then initialise and fit the classifier, following scikit-learn conventions:
    # the import path is an assumption; adjust to wherever mono_forest
    # lives in your installation of this package
    from monoensemble import mono_forest

    clf = mono_forest.MonoRandomForestClassifier(n_estimators=n_estimators,
                                                 max_features=mtry,
                                                 incr_feats=incr_feats,
                                                 decr_feats=decr_feats)
    clf.fit(X, y)
    y_pred = clf.predict(X)
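As a quick sanity check of the monotonicity guarantee, you can verify that increasing a monotone-increasing feature never lowers the predicted class. This is a minimal sketch assuming the standard scikit-learn `predict` API and integer ordinal class labels:

    import numpy as np

    X_bumped = X.copy()
    X_bumped[:, 5] += 0.5  # increase feature 6 (one-based) for every sample
    # With perfect monotonicity, predictions can only stay the same or go up
    assert np.all(clf.predict(X_bumped) >= y_pred)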
Of course, in practice the above will usually be embedded in some estimate of generalisation error, such as the out-of-bag (OOB) score or cross-validation.
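For example, because the classifiers follow the scikit-learn estimator interface, the standard cross-validation utilities can be used directly (a sketch using plain accuracy scoring):

    from sklearn.model_selection import cross_val_score

    scores = cross_val_score(clf, X, y, cv=5)
    print("CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))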
For more examples see the documentation.
To use, clone this repo, then build for your machine by running `python setup.py build_ext --inplace` (the pip and conda versions are not compatible with current scikit-learn versions).
Documentation is provided here.
Pull requests welcome! Notes:

- We use the PEP8 code formatting standard, and we enforce this by running a code linter called flake8 during continuous integration.
- Continuous integration is used to run the tests in `/monoensemble/tests/test_monoensemble.py`, using Travis (Linux) and AppVeyor (Windows).
BSD 3-Clause, Copyright (c) 2017, Christopher Bartley. All rights reserved.