FFT takes enormous amount of time for high number of features #2

Open
amritbhanu opened this issue Mar 11, 2018 · 2 comments
Labels
enhancement New feature or request

Comments

@amritbhanu
Contributor

When the number of features grows to around 1000, building the 32 trees takes an impractically long time.

@amritbhanu added the enhancement label on Mar 11, 2018
@amritbhanu
Contributor Author

@dichen001: is there any way to make it faster? Where is there room for improvement?

@dichen001

Yeah, I know speed is an issue with the current implementation.
At each level, FFT iterates through every single feature to compute its median and evaluate the resulting split. One quick fix is to cache these per-feature results for each level.

E.g. when building FFT0, we have already iterated over and evaluated all features on the WHOLE dataset.
Yet while building FFT1, FFT2, ..., FFTN, we repeat exactly the same evaluation on the WHOLE dataset every time.
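
To make the repeated work concrete, here is a minimal sketch of a first-level scan. It is not the project's actual code: `evaluate_features` is a hypothetical name, the data is synthetic, and plain accuracy of the rule "predict 1 when value >= median" stands in for whatever metrics FFT really computes.

```python
import numpy as np

def evaluate_features(X, y):
    """Hypothetical per-level scan: {feature_index: (median, accuracy)}."""
    results = {}
    for j in range(X.shape[1]):
        mid = np.median(X[:, j])
        # Assumed metric: accuracy of "predict 1 when value >= median".
        pred = (X[:, j] >= mid).astype(int)
        results[j] = (float(mid), float((pred == y).mean()))
    return results

# Synthetic data (an assumption for illustration): feature 3 drives the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = (X[:, 3] > 0).astype(int)

# Without caching, each of the 32 trees repeats the identical scan
# of every feature over the whole dataset.
scans = [evaluate_features(X, y) for _ in range(32)]
best = max(scans[0], key=lambda j: scans[0][j][1])
```

All 32 scans return the same dictionary, so 31 of them are wasted work, and that waste grows linearly with the number of features.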

# 1st level
"All data":
   feature1: median, metrics
   feature2: median, metrics
   ...

# 2nd level
"feature1 < mid1:"  # this one is added when it is selected as the feature for the split in the 1st level.
   feature2: median, metrics
   feature3: median, metrics

"featureX < mid1:"  # this one is added when it is selected as the feature for the split in the 1st level.
   feature1: median, metrics
   feature2: median, metrics

So if we cache these intermediate results, building the trees should be much faster.
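
A rough sketch of such a cache, keyed by the path of splits taken so far, might look like the following. `LevelCache`, its methods, and the accuracy metric are all hypothetical and not part of the current FFT code; the sketch also assumes every split leaves a non-empty subset.

```python
import numpy as np

class LevelCache:
    """Hypothetical memo of per-level feature scans, keyed by split path."""

    def __init__(self, X, y):
        self.X, self.y = X, y
        self.store = {}   # path -> {feature: (median, metric)}
        self.scans = 0    # number of real (uncached) scans performed

    def _mask(self, path):
        # Select the rows that satisfy every (feature, median, op) in the path.
        mask = np.ones(len(self.y), dtype=bool)
        for feature, mid, op in path:
            col = self.X[:, feature]
            mask &= (col < mid) if op == "<" else (col >= mid)
        return mask

    def evaluate(self, path=()):
        key = tuple(path)  # () means "All data"
        if key not in self.store:
            mask = self._mask(key)
            Xs, ys = self.X[mask], self.y[mask]
            used = {feature for feature, _, _ in key}
            result = {}
            for j in range(Xs.shape[1]):
                if j in used:
                    continue  # skip features already split on along this path
                mid = np.median(Xs[:, j])
                # Assumed metric: accuracy of "predict 1 when value >= median".
                pred = (Xs[:, j] >= mid).astype(int)
                result[j] = (float(mid), float((pred == ys).mean()))
            self.scans += 1
            self.store[key] = result
        return self.store[key]

# Synthetic data (an assumption for illustration): feature 3 drives the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = (X[:, 3] > 0).astype(int)
cache = LevelCache(X, y)

for _ in range(32):  # 32 trees sharing the same first two levels
    level1 = cache.evaluate()
    best = max(level1, key=lambda j: level1[j][1])
    mid = level1[best][0]
    level2 = cache.evaluate([(best, mid, "<")])

print(cache.scans)  # 2 real scans instead of 64
```

Keying on the whole path rather than on the level number keeps results from different branches apart, which matches the layout above where each selected split gets its own entry.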
