
anpatton/lrboost



lrboost is a scikit-learn compatible package for linear residual boosting. lrboost first fits a linear estimator to remove linear trends from the data, then fits a separate non-linear estimator to model the remaining residuals. The best use cases are extrapolation tasks and data with both linear and non-linear components. Not every modeling task will benefit from lrboost, but we use it in our own work and wanted to share a package that makes it easy to use.
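
As a rough sketch of the idea (not lrboost's actual implementation), the two stages can be reproduced with plain scikit-learn estimators; Ridge and HistGradientBoostingRegressor below are arbitrary stand-ins for the primary and secondary models:

from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.ensemble import HistGradientBoostingRegressor

X, y = load_diabetes(return_X_y=True)

linear = Ridge().fit(X, y)                     # stage 1: capture linear trends
residuals = y - linear.predict(X)              # what the linear model missed
nonlinear = HistGradientBoostingRegressor().fit(X, residuals)  # stage 2: model the residuals

final_predictions = linear.predict(X) + nonlinear.predict(X)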

Install the latest stable version with pip:

pip install lrboost

lrboost was inspired by 'Regression-Enhanced Random Forests' by Haozhe Zhang, Dan Nettleton, and Zhengyuan Zhu. Gabby Shklovsky's excellent PyData talk, 'Random Forest Best Practices for the Business World', explains the intuition underlying the approach.

  • LRBoostRegressor can be used like any other scikit-learn estimator and is built from the scikit-learn estimator template.
  • predict(X) returns an array-like of final predictions.
  • predict(X, detail=True) returns a dictionary with the primary, secondary, and final predictions.
from sklearn.datasets import load_diabetes
from lrboost import LRBoostRegressor
X, y = load_diabetes(return_X_y=True)
lrb = LRBoostRegressor().fit(X, y)
predictions = lrb.predict(X)                        # final predictions only
detailed_predictions = lrb.predict(X, detail=True)  # primary, secondary, and final predictions

print(lrb.primary_model.score(X, y))  # R2 of the linear (primary) model alone
print(lrb.score(X, y))                # R2 of the full two-stage model

>>> 0.512
>>> 0.933
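
The detailed predictions are returned as a dictionary keyed by stage. The key names below ("primary", "secondary", "final") are an assumption based on the description above; check the documentation for the exact keys:

detailed_predictions = lrb.predict(X, detail=True)
print(detailed_predictions["primary"][:3])    # linear-stage predictions (assumed key)
print(detailed_predictions["secondary"][:3])  # non-linear residual predictions (assumed key)
print(detailed_predictions["final"][:3])      # primary + secondary (assumed key)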

More detailed documentation can be found here!

Andrew Patton, Nathan Walker, and Kostya Medvedovsky

