FIX several bugs in initial predictions for GBDT #12983
What does this implement/fix? Explain your changes.
For more context, see #12436 (comment) and the discussion that follows (not sure if this is much clearer ^^)
Any other comments?
I've added a check in the losses that raises a ValueError when the user passes an init estimator that wasn't trained on the same kind of data as the GradientBoosting instance, for example an initial estimator trained on 3 classes while only 2 classes are passed to the GBDT for training. I chose to raise a ValueError with an error message basically saying "this is the user's fault".
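A minimal sketch of the misuse this check guards against (the data and the choice of LogisticRegression as init estimator are made up for illustration; only the ValueError behaviour comes from this PR):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.RandomState(0)
X = rng.rand(30, 4)
y = np.repeat([0, 1, 2], 10)        # the init estimator is fit on 3 classes

init_est = LogisticRegression().fit(X, y)

mask = y < 2                        # ...but the GBDT only sees classes 0 and 1
gbdt = GradientBoostingClassifier(init=init_est)
gbdt.fit(X[mask], y[mask])          # with this PR: ValueError
```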
The thing is, the ValueError can also be raised because of early stopping: the training data is split into a training and a validation subset, and the training subset that the init estimator is fitted on may end up missing some of the classes.
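For instance, a sketch with made-up data (train_test_split is what the GBDT uses internally when n_iter_no_change is set):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = np.zeros(100, dtype=int)
y[:2] = 1                           # class 1 is very rare: only 2 samples

for seed in range(20):
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.5, random_state=seed)
    if len(np.unique(y_tr)) < 2:
        print("seed %d: class 1 fell entirely into the validation set" % seed)
```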
Maybe splitting the data with stratify=y could help?
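Roughly what that would look like (same toy data as above, re-created so the snippet is self-contained):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = np.zeros(100, dtype=int)
y[:2] = 1                           # rare class with exactly 2 samples

X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
print(np.unique(y_tr), np.unique(y_val))    # both subsets keep class 1

# Caveat: with a single-sample class, stratification itself raises
# "The least populated class in y has only 1 member, which is too few."
```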
Note: please ignore the removed #FIXME comments in e2e1a5c; I'll add them back if I can, but the fix forces me to remove these checks.
Stratified splitting could help, but might not if there is a very rare class. Is the problem merely that you don't know how to align the prediction to the validation data? Would initialising the classifier loss with a list of classes known from training facilitate fixing this problem?
No, the problem isn't related to the validation data; it's related to the training data.
Basically, the ValueErrors raised in e2e1a5c may happen in 2 scenarios:

1. the user passes an init estimator that was trained on different data than what is passed to fit (e.g. a different number of classes): this really is the user's fault;
2. early stopping is on and the internal split leaves some classes out of the training subset: this is not the user's fault, yet they get the same error message.
I'm sorry I find it hard to explain clearly and concisely.
I think that using stratify in the internal train_test_split would fix that second scenario.
Yes, but we still need to identify which of the classes are missing in the training subset.
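Something like this, for instance (the arrays and names are just illustrative):

```python
import numpy as np

y = np.array([0, 0, 1, 2, 2])      # classes seen by the GBDT at fit time
y_train = np.array([0, 2, 2])      # subset left after the early-stopping split

missing = np.setdiff1d(np.unique(y), y_train)
print(missing)                     # [1]
```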
I don't think the same issue happens in other estimators. This issue happens because our main estimator (the GBDT) relies on another estimator (the init estimator) which is passed only a subset of the data, where some classes might be missing because of early stopping. GBDTs are the only estimators that combine early stopping with another estimator, as far as I know.
I took a look at the different early stopping implementations.
Would it be acceptable for me to open a new PR to bugfix the multilayer perceptron and gradient boosting by making their early-stopping splits stratified, and wait for it to be merged?
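The change could look roughly like this (the helper name and plumbing are hypothetical; the actual splits live inside each estimator's fit):

```python
from sklearn.base import is_classifier
from sklearn.model_selection import train_test_split

def _train_validation_split(estimator, X, y, validation_fraction, random_state):
    # Stratify the early-stopping split for classifiers so that no class
    # can vanish from the training subset; regressors keep a plain split.
    stratify = y if is_classifier(estimator) else None
    return train_test_split(
        X, y,
        test_size=validation_fraction,
        random_state=random_state,
        stratify=stratify)
```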