Biosphere

Simple, fast random forests.

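Random forests with a runtime of O(n d log(n) + n_estimators d n max_depth) instead of O(n_estimators mtry n log(n) max_depth), where n is the number of samples, d the number of features, and mtry the number of features considered at each split.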
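One common way to reach a bound of this shape is to sort every feature once up front and then keep the sorted order intact while the data is split, so that no node ever has to sort again. The following is a minimal sketch of that idea for a regression split, not biosphere's actual implementation; the helper names `presort`, `best_split` and `partition` are hypothetical and chosen for illustration only.

```python
def presort(X):
    """Sort row indices by each feature once: O(n d log n) for the whole forest."""
    n, d = len(X), len(X[0])
    return [sorted(range(n), key=lambda i, j=j: X[i][j]) for j in range(d)]


def best_split(X, y, order, j):
    """Linear scan over feature j using its presorted index list (no sorting here)."""
    n = len(order)
    total = sum(y[i] for i in order)
    left_sum = 0.0
    best_gain, best_threshold = 0.0, None
    for k in range(n - 1):
        i, nxt = order[k], order[k + 1]
        left_sum += y[i]
        if X[i][j] == X[nxt][j]:
            continue  # no threshold fits between two equal values
        n_left, n_right = k + 1, n - k - 1
        right_sum = total - left_sum
        # Reduction in the sum of squared errors achieved by this split.
        gain = left_sum**2 / n_left + right_sum**2 / n_right - total**2 / n
        if gain > best_gain:
            best_gain = gain
            best_threshold = (X[i][j] + X[nxt][j]) / 2
    return best_gain, best_threshold


def partition(X, orders, j, threshold):
    """Stable partition of all d index lists: O(d n), children stay sorted."""
    left = [[i for i in order if X[i][j] <= threshold] for order in orders]
    right = [[i for i in order if X[i][j] > threshold] for order in orders]
    return left, right


# Toy data: 4 samples, 2 features (values are made up for illustration).
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 9.0], [4.0, 1.0]]
y = [0.0, 0.0, 1.0, 1.0]

orders = presort(X)
gain, threshold = best_split(X, y, orders[0], 0)
left, right = partition(X, orders, 0, threshold)
print(gain, threshold, left, right)
```

In this sketch the search over candidate features costs O(mtry n) per tree level, while the stable partition touches all d index lists, which is where a d n term per level would come from.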

biosphere is available as a Rust crate and as a Python package.

Benchmarks

Benchmarks were run on an M1 Pro with n_jobs=4. Each entry is the wall time to fit a random forest with 400 trees, including the OOB score, to the NYC Taxi dataset, taken as the minimum over 10 runs. After feature engineering, the dataset consists of 5 numerical and 7 one-hot encoded features.

model            1000     2000     4000     8000    16000    32000    64000   128000   256000   512000  1024000  2048000
biosphere       0.04s    0.08s    0.15s    0.32s    0.65s    1.40s    2.97s    6.48s   15.53s   37.91s   96.69s  231.82s
scikit-learn    0.28s    0.34s    0.46s    0.69s    1.23s    2.47s    4.99s   10.49s   22.11s   51.04s  118.95s  271.03s
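To make the table easier to read, the ratio of the two timings can be computed per column. The snippet below is a small sketch that copies the numbers from the table above and assumes the column headers are sample counts.

```python
# Timings in seconds, copied from the benchmark table.
sizes = [1000, 2000, 4000, 8000, 16000, 32000, 64000,
         128000, 256000, 512000, 1024000, 2048000]
biosphere = [0.04, 0.08, 0.15, 0.32, 0.65, 1.40, 2.97,
             6.48, 15.53, 37.91, 96.69, 231.82]
scikit_learn = [0.28, 0.34, 0.46, 0.69, 1.23, 2.47, 4.99,
                10.49, 22.11, 51.04, 118.95, 271.03]

# Speedup = scikit-learn wall time divided by biosphere wall time.
speedup = {n: s / b for n, b, s in zip(sizes, biosphere, scikit_learn)}

for n, ratio in speedup.items():
    print(f"{n:>8}  {ratio:5.2f}x")
```

On these numbers the ratio falls from roughly 7x at 1000 samples to roughly 1.2x at 2048000 samples.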
