
HistGradientBoostingClassifier is not tracked on https://scikit-learn.org/scikit-learn-benchmarks/ #18775

Closed
ogrisel opened this issue Nov 6, 2020 · 3 comments · Fixed by #18851

Comments

@ogrisel
Member

ogrisel commented Nov 6, 2020

https://scikit-learn.org/scikit-learn-benchmarks/

I think we should bench it with a medium-sized multiclass dataset (e.g. 5 classes, 100 features, 1e4 samples) or something similar.

Ideally, a fit should last at least 5 to 10 seconds.
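For illustration, a minimal ASV-style fit benchmark along those lines could look like the sketch below (class and parameter names are hypothetical, not the actual benchmark code; note that with scikit-learn < 1.0 the estimator also required `from sklearn.experimental import enable_hist_gradient_boosting`):

```python
# Illustrative sketch only: asv times every `time_*` method of a suite
# class, and `setup` runs before the measurements.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier


class HistGradientBoostingClassifierSuite:  # hypothetical name
    def setup(self):
        # Medium-sized multiclass dataset: 5 classes, 100 features, 1e4 samples.
        self.X, self.y = make_classification(
            n_samples=10_000,
            n_features=100,
            n_classes=5,
            n_informative=20,  # needs n_classes * 2 <= 2 ** n_informative
            random_state=0,
        )
        self.estimator = HistGradientBoostingClassifier(random_state=0)

    def time_fit(self):
        self.estimator.fit(self.X, self.y)
```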

The code for the benchmarks of the estimators in the sklearn.ensemble package is hosted here:

The documentation on how to use ASV to run the benchmarks locally is here:

@NicolasHug
Member

Thanks for opening the issue.

For prediction, we can probably consider quite a few more samples (1e6?).

We should also have two versions of each benchmark: one with OMP_NUM_THREADS set to 1, and one where we leave it at the default (which hopefully does not suffer from oversubscription). Some changes can be dramatically bad in the multi-threaded case and yet go undetected in single-threaded benchmarks.
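A sketch of the prediction side of this suggestion, assuming the same kind of suite as above: fit once in `setup`, then time `predict` on a much larger array (the sizes and all names are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier


class PredictSuite:  # hypothetical name
    def setup(self):
        # Fit once on a modest training set; only predict() is timed.
        X, y = make_classification(
            n_samples=10_000, n_features=100, n_classes=5,
            n_informative=20, random_state=0,
        )
        self.estimator = HistGradientBoostingClassifier(random_state=0).fit(X, y)
        # Much larger array for the timed prediction (~1e6 rows).
        rng = np.random.RandomState(0)
        self.X_pred = rng.standard_normal(size=(1_000_000, 100))

    def time_predict(self):
        self.estimator.predict(self.X_pred)
```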

@ogrisel
Member Author

ogrisel commented Nov 8, 2020

Setting the OMP_NUM_THREADS environment variable for a single test is probably not easy with asv, but we can use threadpoolctl from within the test instead.
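A minimal sketch of that approach, assuming a suite like the ones above; `threadpool_limits` is threadpoolctl's context manager for capping the size of OpenMP/BLAS thread pools:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from threadpoolctl import threadpool_limits


class SingleThreadFitSuite:  # hypothetical name
    def setup(self):
        self.X, self.y = make_classification(
            n_samples=10_000, n_features=100, n_classes=5,
            n_informative=20, random_state=0,
        )
        self.estimator = HistGradientBoostingClassifier(random_state=0)

    def time_fit_single_thread(self):
        # Caps all supported thread pools (OpenMP, BLAS, ...) to one
        # thread for the duration of the with block; no env variable needed.
        with threadpool_limits(limits=1):
            self.estimator.fit(self.X, self.y)
```

Pairing this with an unrestricted `time_fit` would give the single-threaded/multi-threaded pair suggested above.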

@jeremiedbb
Member

I opened a PR; it doesn't deal with the number of threads for now. I'll handle that in a separate PR because it involves a few other estimators (KMeans, t-SNE, ...) and I'm not sure how to do it yet.
