
Predict Speed Dependency on Number of Trees #2094

sniemi opened this issue Apr 11, 2019 · 3 comments

@sniemi sniemi commented Apr 11, 2019

I am trying to understand how prediction speed depends on the number of trees. My use case has very stringent requirements on single-sample prediction speed because of a real-time streaming setting. My simple testing, with everything else held exactly the same, implies that if the number of trees increases, say from 700 to 5600, the prediction time does not increase by a factor of 8 but by something smaller. On average, I was measuring about 4 ms and 7 ms for the smaller and larger models respectively. This is within my requirement, but I was expecting the dependency to be at least linear, so I was hoping to gain some clarity on this.

Having looked at some of the other issues (e.g. #144) and also the C++ code, it seems that multithreading can be used even in the case of single-sample prediction. This would be consistent with my findings. Can someone confirm this?

Are there any benchmarks available on how the prediction speed (for a single sample) depends on the number of trees? Can someone also confirm that at prediction time each tree can be traversed independently, unlike at training time when tree induction proceeds sequentially, because the final prediction is just the sum of the leaf values? In this regard, prediction for a single sample is map-reducible.
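For context, the map-reduce view I mean can be sketched in plain Python (a toy stand-in with decision stumps, not LightGBM's actual C++ implementation): each tree is traversed on its own, and the leaf values are summed.

```python
# Toy sketch: single-sample prediction over an ensemble of decision stumps.
# This is NOT LightGBM's implementation -- just an illustration that each
# tree can be traversed independently ("map") and the leaf values summed
# ("reduce"), so per-tree work is embarrassingly parallel at predict time.

class Stump:
    def __init__(self, feature, threshold, left_value, right_value):
        self.feature = feature
        self.threshold = threshold
        self.left_value = left_value
        self.right_value = right_value

    def predict_one(self, x):
        # Traverse this tree alone; no other tree is consulted.
        return self.left_value if x[self.feature] <= self.threshold else self.right_value

def ensemble_predict_one(trees, x):
    # "map": traverse each tree independently; "reduce": sum the leaf values.
    return sum(t.predict_one(x) for t in trees)

trees = [Stump(0, 0.5, -1.0, 1.0), Stump(1, 2.0, 0.5, -0.5)]
print(ensemble_predict_one(trees, [0.3, 3.0]))  # -1.0 + (-0.5) = -1.5
```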


@guolinke guolinke commented Apr 12, 2019

The multi-threading in LightGBM prediction is at the sample level, not the tree level.
Therefore, there is no multi-threading acceleration for single-instance prediction.

LightGBM focuses more on training efficiency. For prediction efficiency, you can try treelite, which supports fast inference for both LightGBM and XGBoost.
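To illustrate the distinction with a toy sketch (plain Python, not LightGBM code): when parallelism is over samples, each worker handles one full sample end to end, so a batch of one sample gets no speedup from extra threads.

```python
# Toy illustration of sample-level parallelism (as described above).
# Threads are distributed across *samples*, not trees, so a single-sample
# "batch" runs on one thread and sees no parallel speedup.
from concurrent.futures import ThreadPoolExecutor

def predict_one(trees, x):
    # One sample walks every tree sequentially; per-tree work stays serial.
    return sum(left if x <= thr else right for (thr, left, right) in trees)

def predict_batch(trees, samples, n_threads=4):
    # Sample-level parallelism: one worker per sample, not per tree.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda x: predict_one(trees, x), samples))

trees = [(0.5, -1.0, 1.0), (1.5, 0.2, -0.2)]
print(predict_batch(trees, [0.0, 1.0, 2.0]))
```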


@shashankg7 shashankg7 commented Aug 7, 2019

Hi @guolinke,

I am also facing latency issues. I am working on a real-time application, where prediction speed is very critical.

Do you think treelite will help in performance for single-instance prediction too? On their website, it is mentioned that it helps in prediction for large #examples.

Also, do you have any suggestions for optimizing prediction time? Does changing the num_iteration param help?


@hayesall hayesall commented Aug 8, 2019

Hey @shashankg7,

I have not worked with treelite, but it seems like good throughput should also correspond to fast prediction on individual instances.

> Also, do you have any suggestions for optimizing prediction time? Does changing the num_iteration param help?

Prediction speed could be affected by quite a few things, but one of the big ones should be the number of trees. LightGBM uses a second-order approximation, so in theory it should reach a reasonable solution after a fairly small number of iterations [1, 2].

Maybe set up an experiment where you vary num_iterations and learning_rate (num_leaves and max_depth might also be good), compile the model with treelite, and find parameters fitting your hardware/software/real-time constraints while maintaining reasonable predictive performance.
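A minimal skeleton for such an experiment might look like the following. The timing harness is generic Python; the toy predictor is just a placeholder you would swap for a call into your treelite-compiled or LightGBM model:

```python
# Skeleton: measure single-sample latency as the number of trees grows.
# The toy stump predictor is a placeholder -- substitute your own
# treelite/LightGBM predict call to benchmark the real model.
import time

def make_toy_trees(n_trees):
    # Hypothetical stand-in trees: (threshold, left_value, right_value).
    return [(i % 7 * 0.1, -1.0, 1.0) for i in range(n_trees)]

def predict_one(trees, x):
    return sum(left if x <= thr else right for (thr, left, right) in trees)

def time_single_sample(trees, x, repeats=200):
    # Average wall-clock time per single-sample prediction.
    start = time.perf_counter()
    for _ in range(repeats):
        predict_one(trees, x)
    return (time.perf_counter() - start) / repeats

for n in (100, 800):  # e.g. an 8x increase in tree count
    trees = make_toy_trees(n)
    print(f"{n} trees: {time_single_sample(trees, 0.35) * 1e6:.1f} us/sample")
```

Note also that LightGBM's Python `Booster.predict` accepts a `num_iteration` argument, so you can measure the latency of a truncated ensemble at prediction time without retraining.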

I'd be interested to know what you find!

[1]: Mukherjee et al., "Parallel Boosting with Momentum", ECML PKDD, 2013
[2]: Biau et al., "Accelerated Gradient Boosting", Machine Learning Journal, 2019

@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020