For prediction we can probably use quite a few more samples (1e6?).
We should also have 2 versions of each benchmark: one with OMP_NUM_THREADS set to 1, and one where we leave it at the default (which is hopefully not suffering from oversubscription). Some changes can be dramatically bad in the multi-threaded case and yet go undetected in single-threaded benchmarks.
Setting the OMP_NUM_THREADS environment variable for a single test is probably not easy for asv, but we can use threadpoolctl from within the test instead.
I opened a PR. It doesn't deal with the number of threads for now; I'll do that in a separate PR because it involves a few other estimators (k-means, t-SNE, ...) and I'm not sure how to do it yet.
https://scikit-learn.org/scikit-learn-benchmarks/
I think we should bench it with a medium sized multiclass dataset (e.g. 5 classes, 100 features, 1e4 samples) or something similar.
Ideally a fit should last at least 5s to 10s.
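One way to generate a dataset matching the suggestion above is `make_classification`; the exact parameters beyond the stated sizes (e.g. `n_informative`) are assumptions for illustration:

```python
# Medium-sized multiclass dataset: 5 classes, 100 features, 1e4 samples.
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=10_000,   # bump to ~1e6 for prediction benchmarks
    n_features=100,
    n_informative=20,   # assumed; not specified in the discussion
    n_classes=5,
    random_state=42,
)
```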
The code of the benchmark for estimators in the sklearn.ensemble package is hosted here:

The documentation on how to use ASV to run the benchmarks locally: