performance issue with random forests #18
Hi Amir, thanks for the report. Here are some thoughts. The inference speed of the model depends on multiple factors:
How many examples are there in the benchmark? If this value is small, the stability of the results can be improved by using more rounds (i.e., increasing the value of …)
Thanks for the reply and suggestions. Update: …
Can you do the following: …
Thanks for the suggestions.
3. I've followed the example in the user manual: … and the execution time comes down to about 35 μs.
Thanks for the details and the patience :). 50 μs / example / core is slow, but it is possible for a large model. As I mentioned before, using gradient boosted trees would lead to significantly faster models.

The histogram you shared indicates that most of the forest paths go to full depth (15). While this is common on large datasets, on small datasets it can indicate that the model is spending a lot of capacity on features that fail to generalize. This type of problem is typically observed when feeding string IDs (e.g., example ID, user ID) to the model. Example IDs are not generalizable. Looking at the other histograms in the model description can help you identify which feature is at fault.

This benchmark code has a few issues.
A simple and accurate way to run a benchmark is to use the model.benchmark() method in the Python API: …
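The built-in benchmark handles warm-up and averaging for you. As a sanity check, the same measurement pattern can be reproduced with a hand-rolled harness; the sketch below is generic Python (the `predict` stub is a placeholder, not the real YDF call):

```python
import time
import statistics

def predict(batch):
    # Placeholder for the real model call (e.g., via the YDF C++ or Python API).
    return [sum(row) for row in batch]

def benchmark(fn, batch, warmup_runs=10, num_runs=100):
    """Time fn(batch), discarding warm-up runs (cold caches, lazy init, etc.)."""
    for _ in range(warmup_runs):
        fn(batch)
    times = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn(batch)
        times.append(time.perf_counter() - start)
    per_example = [t / len(batch) for t in times]
    return {
        "min_us": min(per_example) * 1e6,
        "avg_us": statistics.mean(per_example) * 1e6,
        "max_us": max(per_example) * 1e6,
    }

batch = [[0.1, 0.2, 0.3]] * 256
stats = benchmark(predict, batch)
print(f"avg {stats['avg_us']:.2f} us/example")
```

Timing over a sizable batch and discarding warm-up runs is what makes single-example figures like the ones below comparable.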
Hi,
I'm running a RANDOM_FOREST model trained in TF-DF through the Yggdrasil C++ API, and inference takes about 50 μs per example; as you said, it probably shouldn't take more than 10 μs.
Also, running large batches (vs. batch size = 1) or using --copt=-mavx2 makes no difference at all.
I've used the benchmark_inference tool and the result was the same.
Another interesting observation is the difference between min and max execution times per instance;
exec times for 10 runs (values in ns, consistent with the ~50 μs average):
########################################
0 max 2133059 min 17293 avg 52250
1 max 1054634 min 16696 avg 52982
2 max 1038110 min 14949 avg 45468
3 max 1068611 min 16752 avg 53064
4 max 1657415 min 16790 avg 54514
5 max 1125537 min 16432 avg 53145
6 max 1939590 min 17591 avg 74354
7 max 2997816 min 17284 avg 70325
8 max 1064365 min 16554 avg 56063
9 max 1044182 min 16145 avg 51429
########################################
Even if I ignore some of the first execution times (to account for cache misses), the variance between exec times is still high.
########################################
0 max 1488318 min 15841 avg 51085
1 max 955384 min 16501 avg 45567
2 max 928377 min 16370 avg 44606
3 max 1018261 min 15124 avg 44204
4 max 1429345 min 17299 avg 79810
5 max 1628887 min 17539 avg 80997
6 max 2126679 min 16487 avg 67346
7 max 1058939 min 16616 avg 53941
8 max 1098449 min 16242 avg 48047
9 max 1103341 min 16659 avg 53750
########################################
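To quantify the spread in the second table above, the per-run averages can be summarized with the coefficient of variation (standard deviation relative to the mean), a rough stability measure. Plain arithmetic on the numbers already posted:

```python
import statistics

# "avg" column (per-run average exec time, ns) from the second table above.
avgs = [51085, 45567, 44606, 44204, 79810, 80997, 67346, 53941, 48047, 53750]

mean = statistics.mean(avgs)
stdev = statistics.stdev(avgs)
cv = stdev / mean  # coefficient of variation: spread relative to the mean
print(f"mean={mean:.0f} ns  stdev={stdev:.0f} ns  cv={cv:.2f}")
```

A coefficient of variation around 0.25 across run averages is indeed high for a fixed workload, supporting the suspicion that something outside the model (e.g., the virtualized environment) is interfering.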
I'm wondering whether there is a problem with the model or the inference setup, or whether this is the best performance I can get.
model spec:
RANDOM_FOREST
300 root(s), 618972 node(s), and 28 input feature(s).
RandomForestOptPred engine
compiled with these flags:
--config=linux_cpp17 --config=linux_avx2 --repo_env=CC=gcc-9 --copt=-mavx2
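As a back-of-the-envelope check on the spec above (plain arithmetic, independent of YDF): with one root-to-leaf path per tree, each prediction visits on the order of a few thousand nodes, which puts the observed 50 μs in context:

```python
trees = 300
nodes = 618_972
depth = 15                     # full depth reported in the histogram above
avg_us_per_example = 50        # observed average inference time

nodes_per_tree = nodes / trees
# Upper bound: one root-to-leaf path per tree, each path at most `depth` nodes.
visits_per_example = trees * depth
ns_per_visit = avg_us_per_example * 1000 / visits_per_example

print(f"{nodes_per_tree:.0f} nodes/tree, "
      f"<= {visits_per_example} node visits/example, "
      f"~{ns_per_visit:.1f} ns per node visit")
```

At roughly 11 ns per node visit, the per-node cost is plausible for a tree that does not fit in cache, so the total latency is dominated by model size rather than by an obviously broken setup.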
system spec:
Ubuntu 18.04.4
cpu Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
on an ESXi virtual machine