
scikit-learn api section documentation correction #3967

Merged 3 commits on Dec 14, 2018

Conversation

@mbouznif (Contributor) commented Dec 5, 2018

The documentation is inconsistent in the scikit-learn API section: the fit paragraph says that when early stopping occurs, the last iteration is returned rather than the best one, while the predict paragraph says that when predict is called without ntree_limit specified, ntree_limit defaults to best_ntree_limit.

Thus, reading the fit part, one could think the best iteration has to be specified explicitly when calling predict; reading the predict part, the best iteration is used by default, and it is the last iteration that has to be specified if needed.
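
For context, here is a minimal sketch of the behaviour described above, assuming the XGBoost scikit-learn wrapper API from around this release; the toy data and parameter values are made up:

```python
# Sketch only: assumes the scikit-learn wrapper API of this era
# (early_stopping_rounds/eval_metric passed to fit, ntree_limit on predict).
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Made-up toy data just to exercise the API.
X = np.random.rand(1000, 10)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

clf = xgb.XGBClassifier(n_estimators=500)
clf.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    eval_metric="auc",
    early_stopping_rounds=10,  # fit() keeps every tree up to the last iteration
)

# Without ntree_limit, predict() uses best_ntree_limit by default,
# i.e. the best iteration rather than the last one ...
preds_best = clf.predict(X_val)

# ... whereas ntree_limit=0 means "use all trees", i.e. the last iteration.
preds_last = clf.predict(X_val, ntree_limit=0)
```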

@hcho3 (Collaborator) commented Dec 14, 2018

Thanks!

@Edvard88

What about this documentation:
https://xgboost.readthedocs.io/en/latest/python/python_intro.html?highlight=early%20stopping
Will it be fixed too?

@lyxthe, I've written in https://stackoverflow.com/questions/53483648/is-the-xgboost-documentation-wrong-early-stopping-rounds-and-best-and-last-it:
If you fit with the "best iteration" from the early-stopping summary, for example:

Stopping. Best iteration:
[109] validation_0-auc:0.996667

then fitting with 109 will not give you the best score. You should fit with "plus one" iteration, i.e. 110, because iterations start from 0. Then you will get the best score, and it will indeed be the best iteration. Is this an issue?
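
A minimal sketch of the off-by-one described here, using the native Booster API from the linked docs page; the data, parameters, and the example log line above are illustrative only:

```python
# Sketch only: illustrates that best_iteration is a 0-based round index
# while ntree_limit / best_ntree_limit count trees.
import numpy as np
import xgboost as xgb

# Made-up toy data just to exercise the API.
X = np.random.rand(1000, 10)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X[:800], label=y[:800])
dval = xgb.DMatrix(X[800:], label=y[800:])

bst = xgb.train(
    {"objective": "binary:logistic", "eval_metric": "auc"},
    dtrain,
    num_boost_round=500,
    evals=[(dval, "validation_0")],
    early_stopping_rounds=10,
)

# If the log reports "Best iteration: [109]", then bst.best_iteration == 109
# (0-based index of the round) while bst.best_ntree_limit == 110 (a tree
# count), so reproducing the best score requires best_iteration + 1 trees.
preds_best = bst.predict(dval, ntree_limit=bst.best_iteration + 1)
# equivalently:
preds_best2 = bst.predict(dval, ntree_limit=bst.best_ntree_limit)
```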

lock bot locked as resolved and limited conversation to collaborators Mar 27, 2019