Yeah, I know the speed is an issue for the current implementation.
At each level, FFT will iterate through every single feature to get the median and evaluate the split. One quick thing to do is to cache FFT results for each level.
E.g., when building FFT0, we have already iterated over and evaluated all features for the WHOLE dataset.
While building FFT1, FFT2, ..., FFTN, we just repeat that same evaluation on the WHOLE dataset every time.
    # 1st level
    "All data":
        feature1: median, metrics
        feature2: median, metrics
        ...
    # 2nd level
    "feature1 < mid1":  # added when feature1 is selected for the split in the 1st level
        feature2: median, metrics
        feature3: median, metrics
    "featureX < midX":  # added when featureX is selected for the split in the 1st level
        feature1: median, metrics
        feature2: median, metrics
So if we cache these intermediate results, building the trees should be much faster.
If the number of features expands up to 1000, building 32 trees takes forever.
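
A rough sketch of what such a cache could look like, keyed by the chain of split conditions that leads to a subset of the data. Everything here is hypothetical and not the current FFT code: `SplitCache`, `score_split`, the row-as-dict layout, and the accuracy-style metric are placeholders for whatever the real implementation uses.

```python
from statistics import median


def score_split(rows, feature, threshold, label="label"):
    """Placeholder metric (assumption): accuracy of predicting label == 1
    whenever the feature value is below the threshold."""
    if not rows:
        return 0.0
    hits = sum((r[feature] < threshold) == bool(r[label]) for r in rows)
    return hits / len(rows)


class SplitCache:
    """Caches per-feature (median, metric) results for each data subset.

    A subset is identified by `path`, a tuple of (feature, threshold, side)
    conditions, e.g. () for all data or (("feature1", 2.5, ">="),).
    """

    def __init__(self, features):
        self.features = list(features)
        self._cache = {}
        self.misses = 0  # how many times we actually had to scan the data

    def evaluate(self, path, rows):
        if path not in self._cache:
            self.misses += 1
            used = {f for f, _, _ in path}  # features already split on
            result = {}
            for f in self.features:
                if f in used:
                    continue
                mid = median(r[f] for r in rows)
                result[f] = (mid, score_split(rows, f, mid))
            self._cache[path] = result
        return self._cache[path]


rows = [
    {"a": 1, "b": 5, "label": 1},
    {"a": 2, "b": 1, "label": 1},
    {"a": 3, "b": 4, "label": 0},
    {"a": 4, "b": 2, "label": 0},
]
cache = SplitCache(["a", "b"])

# 1st level: computed once while building FFT0 ...
root = cache.evaluate((), rows)
# ... and reused as-is by FFT1, FFT2, ..., FFTN.
again = cache.evaluate((), rows)

# 2nd level: only computed the first time this particular split is taken.
path = (("a", root["a"][0], ">="),)
level2 = cache.evaluate(path, [r for r in rows if r["a"] >= root["a"][0]])
```

With this layout every tree shares one `SplitCache`, so the full scan over all features happens once per distinct subset instead of once per tree.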