[python-package] Use update() finish condition on booster loop #5193
Conversation
Hey @Samsagax , thanks for your interest in LightGBM!
Strange. It works for me with this change. If I don't apply it, then when a booster trains and finds no splittable features I get a spam of warnings and a constant output metric. The warning says "Stopped training because there are no more leaves that meet the split requirements". Looking through the code, I find I'm hitting this line: LightGBM/src/boosting/gbdt.cpp Line 438 in da9072f
If I understood correctly, that `if` will return `true` to the caller when training stops. Following the Python code, I see the C++ function `TrainOneIter()` is called by the Python `booster.update()` function, and that is what's passed to the loop. Maybe it's specific to the GBDT booster, anyway.
Thanks for taking the time to review the patch.
This may mean that, due to incorrect parameters, your training actually builds only one tree, and that's why there's no difference in your case. This mechanism shouldn't work as early stopping. That warning should help users understand that something is wrong in their setup. Refer to #4178 for more details.
I just want to add one more comment for those finding this PR by searching for that warning. It's related to #5051 (comment). Another reason this shouldn't be used as an early stopping mechanism is that "can't find any more splits" in one iteration doesn't mean LightGBM won't be able to find splits in the next iteration. For example, if you're using `feature_fraction < 1`, each iteration considers a different random subset of features.
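To illustrate the point above, here is a toy simulation (pure Python with made-up numbers, not lightgbm itself) of per-iteration feature subsampling as with `feature_fraction < 1`: an iteration whose random subset contains no informative feature finds no split, while a later iteration may still draw one.

```python
import random

# Toy simulation (hypothetical setup, not lightgbm internals): with
# feature_fraction < 1, each boosting iteration sees only a random
# subset of the features.
random.seed(0)

n_features = 10
informative = {0}          # pretend only feature 0 yields a valid split
feature_fraction = 0.3
n_sampled = max(1, int(n_features * feature_fraction))  # 3 features per iter

outcomes = []
for _ in range(50):
    subset = random.sample(range(n_features), n_sampled)
    # True if this iteration could find a split, False otherwise.
    outcomes.append(any(f in informative for f in subset))

# Both values appear: some iterations find no split even though
# later iterations do, so a single "no more splits" iteration is
# not a reliable signal to stop training.
print(outcomes)
```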
Thanks for the clarification; I have experienced that situation myself (with feature_fraction < 1). I agree training should continue and this patch is not desired behaviour. I was just replicating the behaviour of LightGBM/src/boosting/gbdt.cpp Line 266 in da9072f
I still find a discrepancy in how the CLI version works, since its implementation of GBDT uses the return value of TrainOneIter() as an is_finished condition (LightGBM/src/boosting/gbdt.cpp Line 271 in da9072f). Maybe that condition should be removed to bring both implementations in line?
I think that's exactly the conclusion of the discussion in #5051, yep! If you'd like to attempt a pull request to implement that change in behavior, please comment on #5051 and see if @shiyu1994 or @guolinke will agree to help review and answer questions. Thanks for your continued interest in LightGBM!
As per the discussion on GH-microsoft#5051 and GH-microsoft#5193, the python package does not stop training if a single-leaf tree (stump) is found, and relies on early stopping methods to stop training. This commit removes the training finish condition based on the result of `TrainOneIter()` and sets the `is_finished` flag from early stopping alone.
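A minimal sketch of the behaviour that commit message describes, using stand-in classes rather than the actual lightgbm source: the loop deliberately ignores the "found only a stump" boolean from `update()` and stops only when early stopping sets `is_finished`.

```python
# Illustrative stand-ins, not the actual lightgbm implementation.
class StumpBooster:
    """Booster whose update() mirrors TrainOneIter()'s boolean,
    pretending every iteration produces only a stump."""
    def __init__(self):
        self.iteration = 0

    def update(self):
        self.iteration += 1
        return True  # "no more splits" this iteration

def train_with_early_stopping(booster, num_boost_round, stop_after=None):
    is_finished = False
    for i in range(num_boost_round):
        # Return value intentionally ignored: a stump in one iteration
        # is not treated as a stopping signal.
        booster.update()
        # is_finished is set by early stopping alone (stop_after is a
        # hypothetical stand-in for a real early-stopping check).
        if stop_after is not None and i + 1 >= stop_after:
            is_finished = True
        if is_finished:
            break
    return booster.iteration

# All requested rounds run despite update() returning True every time:
print(train_with_early_stopping(StumpBooster(), num_boost_round=10))  # -> 10
# Early stopping, not update(), ends training:
print(train_with_early_stopping(StumpBooster(), 10, stop_after=4))    # -> 4
```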
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Currently, the boosting loop in Python does not stop when the booster says it is finished. This patch captures the return value of the booster.update() call and, if the booster is finished, breaks the loop at the end of the current iteration.
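A minimal sketch of the originally proposed behaviour, with a stand-in `Booster` whose `update()` mimics the boolean the C++ `TrainOneIter()` reports (illustrative only, not the actual lightgbm code):

```python
# Illustrative stand-in, not the actual lightgbm implementation.
class Booster:
    """Booster whose update() returns True once training is 'finished',
    mimicking the boolean surfaced from TrainOneIter()."""
    def __init__(self, finish_at):
        self.iteration = 0
        self.finish_at = finish_at

    def update(self):
        """Perform one boosting iteration; return True when finished."""
        self.iteration += 1
        return self.iteration >= self.finish_at

def train(booster, num_boost_round):
    for _ in range(num_boost_round):
        is_finished = booster.update()
        if is_finished:
            # Stop at the end of the current iteration, as the patch proposes.
            break
    return booster.iteration

# The loop exits after 3 iterations even though 10 rounds were requested:
print(train(Booster(finish_at=3), num_boost_round=10))    # -> 3
# If the booster never reports finished, all rounds run:
print(train(Booster(finish_at=100), num_boost_round=10))  # -> 10
```

As the thread above concludes, this break-on-`update()` behaviour was ultimately rejected in favour of relying on early stopping alone.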