Quantile Regression #719

I'd like to investigate quantile regression, i.e. `Q_alpha[y|X]` instead of `E[y|X]`.

sklearn's `GradientBoostingRegressor` has `quantile` as a loss function. For each loss function there's a class which implements `__call__()`, `negative_gradient()` and `update_terminal_regions()`, which implement the loss function, its first derivative and the associated leaf predictor respectively (i.e. L2 loss uses `mean()` for the predictor, L1 uses `median()`, and quantile uses `percentile(alpha)` across the leaf targets). I've looked at the custom loss examples in catboost; from what I can see they require first and second derivative functions, but I don't see a way to replace the leaf prediction. Is this possible (ideally initially in Python, then, if the result is promising, I'll attempt a C++ implementation)?

Also, in general, I don't see anything in the documentation indicating which leaf prediction function is used for different loss functions. Are you using mean for RMSE and median for MAE?

Also, I just found #37, which implies you've implemented quantile loss even if it's not exposed via the API. Is that the case? Is there an easy way to change alpha from 0.5 to, say, 0.95?
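For concreteness, a rough sketch of what the derivative-only interface looks like for quantile loss, using the `calc_ders_range` hook from the catboost custom-objective examples (the class name is hypothetical, and how catboost copes with the zero second derivative is exactly the open question here):

```python
from catboost import CatBoostRegressor

class QuantileObjective:
    """Hypothetical custom quantile (pinball) objective for catboost.

    calc_ders_range returns a (1st derivative, 2nd derivative) pair per
    document, in the 'maximise the objective' convention used by the
    catboost custom-objective examples.
    """

    def __init__(self, alpha=0.95):
        self.alpha = alpha

    def calc_ders_range(self, approxes, targets, weights):
        ders = []
        for approx, target in zip(approxes, targets):
            # Negative gradient of the pinball loss w.r.t. the prediction:
            # alpha when under-predicting, alpha - 1 when over-predicting.
            der1 = self.alpha if target > approx else self.alpha - 1
            # The pinball loss has zero curvature; this is the problem
            # discussed in the comments below.
            der2 = 0.0
            ders.append((der1, der2))
        return ders

# eval_metric must be supplied explicitly when using a custom objective.
model = CatBoostRegressor(loss_function=QuantileObjective(alpha=0.95),
                          eval_metric='Quantile:alpha=0.95')
```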
Comments
I can only answer your last question. You can specify the quantile loss as follows:
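A minimal sketch, assuming the standard `loss_function` string syntax (the alpha value and data here are illustrative):

```python
from catboost import CatBoostRegressor

# 'Quantile:alpha=0.95' selects the pinball loss at the 0.95 quantile.
model = CatBoostRegressor(loss_function='Quantile:alpha=0.95', verbose=False)
model.fit([[1], [2], [3], [4]], [10, 20, 30, 40])
```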
Thanks @ian-contiamo, when I fitted […]
OK, I think I've got to the bottom of this: quantile regression does work, but it converges very slowly, if at all. It's likely related to microsoft/LightGBM#1199; there's a good description here. I'm not 100% sure, but if the leaf values are approximated by the Newton step `-L'(X, y) / L''(X, y)`, then it's no surprise that it doesn't work so well for the quantile loss with a high or low alpha, given the zero second derivative. Specifically, I fitted the model
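A sketch of that setup (my reconstruction: ten points per cluster with targets 1..10 and 21..30, which gives the means of 5.5 and 25.5 described below):

```python
import numpy as np
from catboost import CatBoostRegressor

# Two clusters: X=1 with targets 1..10 (mean 5.5), X=2 with targets 21..30 (mean 25.5).
X = np.array([[1]] * 10 + [[2]] * 10)
y = np.array(list(range(1, 11)) + list(range(21, 31)), dtype=float)

model = CatBoostRegressor(loss_function='Quantile:alpha=0.95',
                          learning_rate=1, depth=1,  # depth is catboost's max_depth
                          iterations=100, verbose=False)
model.fit(X, y)
```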
i.e. there are two clusters, labeled 1 and 2, with target means 5.5 and 25.5, and I attempted to fit using `Quantile:alpha=0.95`. I set `learning_rate` and `max_depth` to 1 and experimented with the number of iterations. Both lightgbm and sklearn's `GradientBoostingRegressor` converge in a single iteration to a solution of y=8|X=1, y=28|X=2 (GradientBoostingRegressor) or y=8.5|X=1, y=28.5|X=2 (lightgbm); catboost takes around 100 iterations to converge. The table below shows that after the first iteration all the leaf estimates are the same and there's a large error. It appears that there's no split; I think it has split at the correct place, though (I tried to verify using the standalone_evaluator, but I couldn't easily debug the STL containers using vscode). At each step the estimates improve very slowly. I tried to set `--leaf-estimation-method Newton`, but this isn't supported. Are there other parameters I can try? y_pred[n], e[n] and L[n] represent the predictions, error and loss function at iteration n.

*[table not preserved]*
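For comparison, the sklearn side of the experiment might look like this (reusing `X` and `y` from the sketch above; the exact leaf values depend on the percentile interpolation used):

```python
from sklearn.ensemble import GradientBoostingRegressor

# One boosting iteration is enough here: update_terminal_regions sets each
# leaf directly to the 0.95 percentile of its residuals.
gbr = GradientBoostingRegressor(loss='quantile', alpha=0.95,
                                learning_rate=1.0, max_depth=1, n_estimators=1)
gbr.fit(X, y)
print(gbr.predict([[1], [2]]))  # the thread reports roughly [8, 28]
```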
The following is quoted from http://jmarkhou.com/lgbqr/; it would be nice if catboost could implement the bolded bullet as an option:
> […]
I'm (slowly) starting to understand what's happening here. In this case, since the loss is quantile, Newton's method isn't used to update the leaves, so the comment above doesn't really apply; the gradient method is used. What's actually happening is that the initial approx is zero. This means the derivatives for every document at iteration 1 are equal to alpha (0.75), due to the symmetry of the quantile loss function. The delta approximation is then `(10 * alpha) / (10 + 3)`, where 10 is the number of values per leaf (since it split down the middle) and 3 is the L2 regularisation. So after a single iteration both leaves have a value of 0.58. After a few iterations we start to get negative errors in some of the leaves, and then the derivatives start to diverge. The issue seems to be the poor initial approximation; I might experiment with starting from the mean to see if that provides more direction. I also experimented with […]. And finally, I still think having the option to optimise the true loss for the leaves may be the best option?
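To sanity-check that arithmetic (assuming `l2_leaf_reg` keeps its default of 3):

```python
alpha = 0.75        # quantile level used in this experiment
n_docs = 10         # documents per leaf (the tree split down the middle)
l2_leaf_reg = 3     # catboost's default L2 regularisation term

# With an initial approx of 0, every residual is positive, so the (negative)
# gradient of the pinball loss is alpha for every document in the leaf.
grad_sum = n_docs * alpha

# Gradient-method leaf delta: sum of gradients / (document count + L2 term).
leaf_delta = grad_sum / (n_docs + l2_leaf_reg)
print(round(leaf_delta, 2))  # 0.58, matching the value quoted above
```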
We have implemented the Exact method for quantile regression, and also starting from the best constant approximation for this mode. And thank you very much for your input, we really appreciate it!
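A sketch of how to opt in, assuming the new behaviour is exposed through the `leaf_estimation_method` and `boost_from_average` parameters of the Python API:

```python
from catboost import CatBoostRegressor

model = CatBoostRegressor(loss_function='Quantile:alpha=0.95',
                          leaf_estimation_method='Exact',  # true per-leaf quantile
                          boost_from_average=True,         # start from the best constant
                          verbose=False)
model.fit([[1], [1], [2], [2]], [1.0, 10.0, 21.0, 30.0])
```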
Closing this issue; stay tuned, we will post it in the release notes when boost_from_average becomes True by default.
Thanks @annaveronika. Note that if you set […]
Thanks for the report! We'll add an exception for this case.
Any updates? Is catboost better now for quantile regression than lightgbm? microsoft/LightGBM#1182