Additional performance benchmarks #4
What's the issue here? The plots look fine to me. Some notes:
@siboehm no real issue here; I just wanted to share the findings from the benchmark. To me, the important take-away is that for most inference payloads *we* are seeing (usually 1-100 samples at a time), lleaves provides a performance gain, although only with parallelization disabled. Since the break-even point can vary wildly, in high-performance settings it may be important to smartly toggle parallelization on/off depending on the number of samples to be predicted at once.
That's true! Thanks for sharing your benchmark results. I thought there was some performance issue you were bringing up, but even after squinting hard at the plots I couldn't see anything out of the ordinary :D So I'm happy lleaves is working well for you! Regarding the parallelization:
If it's OK with you, feel free to close the issue, but do keep me in the loop if you find any other outliers / observations :) I'm interested in how people are using lleaves and whether it makes more sense to develop the library in the easy-to-use or the highest-possible-performance direction.
Hi, I'm currently evaluating this as a potential performance enhancement for our MLOps / inference stack.
Thought I'd share some numbers here (measured on a 2019 MacBook Pro).
The test is set up as follows:
a) generate artificial data: X = 1E6 x 200 float64, Y = X.sum() for regression, Y = X.sum() > 100 for the binary classifier
b) for n_feat in [...]: fit a model on 1000 samples with n_feat features; compile the model
c) for batchsize in [...]: predict a randomly sampled batch of the data 10 times, using (1) LGBM.predict(), (2) lleaves.predict(), (3) lleaves.predict(n_jobs=1); measure the TOTAL time taken
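For reference, the timing loop in step (c) can be sketched roughly like this. It's a pure-NumPy harness: `predict_fn` stands in for `LGBM.predict()` / `lleaves.predict()`, and the names (`bench`, the stand-in predictor) are illustrative, not part of either library:

```python
import time
import numpy as np

def bench(predict_fn, X, batch_sizes, n_repeats=10, rng=None):
    """Time predict_fn on randomly sampled batches, as in step (c)."""
    rng = rng or np.random.default_rng(0)
    results = {}
    for bs in batch_sizes:
        start = time.perf_counter()
        for _ in range(n_repeats):
            # draw a random batch of bs rows from X
            idx = rng.integers(0, len(X), size=bs)
            predict_fn(X[idx])
        # TOTAL time across all repeats, matching the setup above
        results[bs] = time.perf_counter() - start
    return results

# Demo with a trivial stand-in predictor; in the real benchmark this would be
# the LightGBM booster or the compiled lleaves model.
X = np.random.default_rng(1).normal(size=(10_000, 20))
timings = bench(lambda batch: batch.sum(axis=1), X, batch_sizes=[10, 100, 1000])
```

Plotting `timings` per batch size (one curve per predictor) reproduces the comparison described above.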
For regression, the results are:
Independent of the number of features, the break-even between parallel lleaves and a single job seems to be around 1k samples at once. Using this logic, we would get better performance than LGBM at any number of samples.
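The toggle idea from earlier could then be a one-liner around that break-even. A minimal sketch, assuming the ~1k figure from this machine (`smart_n_jobs` and `break_even` are hypothetical names; lleaves' `predict` does accept an `n_jobs` argument):

```python
import os

def smart_n_jobs(n_samples, break_even=1000):
    """Use single-threaded prediction below the measured break-even batch size.

    break_even=1000 matches the regression results above; it should be
    re-measured per machine and model.
    """
    return 1 if n_samples < break_even else os.cpu_count()

# Hypothetical usage with a compiled lleaves model:
# preds = llvm_model.predict(batch, n_jobs=smart_n_jobs(len(batch)))
```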
For classification:
Here, too, the break-even is around 1k samples.
For classification with HIGHLY IMBALANCED data (1/50 positive), the break-even is only reached at 10k samples. Any ideas why this is the case?