Implementation of (fast) quantile regression #1076

Open
MaxTailt opened this issue Nov 30, 2021 · 2 comments

@MaxTailt

Hi all,

I have seen that the R packages ranger and quantregForest now use a fast version of the quantile computation; see:
imbs-hl/ranger#207
lorismichel/quantregForest#3

I wonder whether Meinshausen's quantile regression forest algorithm (and generalized random forests) uses this fast implementation in the "grf" package. I know that most of the grf contributors are also ranger contributors, but I want to be sure, as I am not familiar with C/C++ routines.

I am working on adapting forest methods to extreme quantile regression, and I would like to run a proper version of Meinshausen's algorithm so that I can compare it with the fast implementation and see how sensitive the latter is for extreme rainfall prediction.

Thanks in advance,

Max

@jtibshirani
Member

Hello @MaxTailt, grf's quantile_forest method does not actually implement Meinshausen's quantile regression forest algorithm. A major difference is that grf makes splits that are sensitive to quantiles, whereas Meinshausen's method uses standard CART splits. The grf paper gives more details on the difference in section 5: https://arxiv.org/pdf/1610.01271.pdf.

I haven't taken a close look at the performance optimization in lorismichel/quantregForest#3 to see whether it could apply to grf. Currently we don't have any similar optimization.
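
For readers following the comparison, here is a minimal Python sketch of the prediction step described in Meinshausen's quantile regression forest paper: each tree gives equal weight to the training points that share the test point's leaf, the weights are averaged over trees, and the quantile is read off the weighted empirical CDF. This is an illustration only, not the code used by grf, ranger, or quantregForest. The function names `forest_weights` and `weighted_quantile` and the leaf-id array layout are assumptions made for this example.

```python
import numpy as np

def forest_weights(train_leaves, test_leaves):
    """Averaged leaf co-membership weights for a single test point.

    train_leaves: (n_trees, n_train) leaf id of each training point per tree
    test_leaves:  (n_trees,) leaf id of the test point per tree
    (hypothetical layout, chosen for this sketch)
    """
    same_leaf = train_leaves == test_leaves[:, None]    # (n_trees, n_train)
    leaf_sizes = same_leaf.sum(axis=1, keepdims=True)   # training points in the test point's leaf
    per_tree = same_leaf / np.maximum(leaf_sizes, 1)    # guard against empty leaves
    return per_tree.mean(axis=0)                        # average over trees

def weighted_quantile(y, weights, q):
    """Smallest y whose weighted empirical CDF reaches level q."""
    order = np.argsort(y)
    cdf = np.cumsum(weights[order])
    idx = np.searchsorted(cdf, q, side="left")
    return y[order][min(idx, len(y) - 1)]

# Toy example: 2 trees, 4 training points.
y = np.array([1.0, 2.0, 3.0, 4.0])
train_leaves = np.array([[0, 0, 1, 1],
                         [0, 1, 1, 1]])
test_leaves = np.array([0, 1])

w = forest_weights(train_leaves, test_leaves)
median = weighted_quantile(y, w, 0.5)
upper = weighted_quantile(y, w, 0.9)
```

The splitting rule does not appear anywhere in this step: it only determines which leaf ids end up in `train_leaves`, which is where the CART-versus-quantile-sensitive difference mentioned above comes in.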

@MaxTailt
Author

Hello @jtibshirani,
Thanks for your response.
I understand the differences between grf splits and standard CART splits.
However, I read in the documentation that grf's quantile_forest method corresponds to Meinshausen's algorithm when the option regression.splitting=TRUE is set.

So what happens to the performance optimization with regression.splitting=TRUE?

Thanks,

Max
