Skip to content

Stochastic Gradient Boosted Decision Trees as Standalone, TMVAPlugin and Python-Interface

License

Notifications You must be signed in to change notification settings

qinghsui/FastBDT

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

85 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

FastBDT

Stochastic Gradient Boosted Decision Trees, usable standalone, as TMVA Plugin and via Python Interface.

FastBDT: A speed-optimized and cache-friendly implementation of stochastic gradient-boosted decision trees for multivariate classification

Stochastic gradient-boosted decision trees are widely employed for multivariate classification and regression tasks. This paper presents a speed-optimized and cache-friendly implementation for multivariate classification called FastBDT. FastBDT is one order of magnitude faster during the fitting-phase and application-phase, in comparison with popular implementations in software frameworks like TMVA, scikit-learn and XGBoost. The concepts used to optimize the execution time and performance studies are discussed in detail in this paper. The key ideas include: An equal-frequency binning on the input data, which allows replacing expensive floating-point with integer operations, while at the same time increasing the quality of the classification; a cache-friendly linear access pattern to the input data, in contrast to usual implementations, which exhibit a random access pattern. FastBDT provides interfaces to C/C++, Python and TMVA. It is extensively used in the field of high energy physics by the Belle II experiment.

Installation

  • cmake .
  • make
  • make install

Usage

Before you do anything you want to execute the unittests:

  • ./unittest

There are multiple ways to use FastBDT. I prepared some standalone executables

  • ./FastBDTMain
  • or python3 examples/FastBDTMain.py

for the python3 version you probably have to set the LD_LIBRARY_PATH

But usually it should be more convinient to use FastBDT as a library and integrate FastBDT directly into your application using

  • the C++ shared/static library,
  • the C shared library,
  • the TMVA plugin if you're using ROOT ( root examples/TMVAExample.C("files/TMVA.root") ),
  • or the Python3 library python/FastBDT.py.

About

Stochastic Gradient Boosted Decision Trees as Standalone, TMVAPlugin and Python-Interface

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 89.8%
  • CMake 6.1%
  • Python 4.0%
  • C 0.1%