Recommend pickling as the way to save XGBClassifier / XGBRegressor / XGBRanker #3829

Merged: 1 commit merged into dmlc:master from doc_sklearn_save_model on Oct 25, 2018

hcho3 (Collaborator) commented Oct 25, 2018

The save_model() and load_model() methods only save the part of the model that's common to all language interfaces and do not preserve Python-specific attributes, such as feature_names. More crucially, the label encoder is not preserved either; the scikit-learn wrapper needs it, since you may have string labels.

Fix: Explicitly recommend pickling as the way to save scikit-learn model objects.
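
A minimal sketch of the pickling approach recommended here. The helper names `save_estimator`, `load_estimator`, and `demo` are hypothetical and not part of XGBoost; `demo` assumes `numpy` and `xgboost` are installed, and uses integer labels so the sketch does not depend on how a given version handles label encoding.

```python
import os
import pickle
import tempfile


def save_estimator(model, path):
    """Pickle the whole Python object, including Python-only attributes."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_estimator(path):
    """Restore an object previously written by save_estimator()."""
    with open(path, "rb") as f:
        return pickle.load(f)


def demo():
    """Round-trip a small XGBClassifier through pickle (illustrative only)."""
    # Imported here so the helpers above stay usable without xgboost.
    import numpy as np
    from xgboost import XGBClassifier

    rng = np.random.RandomState(0)
    X = rng.rand(100, 4)
    y = (X[:, 0] > 0.5).astype(int)  # integer class labels 0/1

    clf = XGBClassifier(n_estimators=5).fit(X, y)

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "clf.pkl")
        save_estimator(clf, path)
        restored = load_estimator(path)

    # The restored wrapper should predict exactly like the original.
    assert (restored.predict(X) == clf.predict(X)).all()
    return restored
```

Calling `demo()` trains a tiny classifier, pickles it, and checks that the restored object gives the same predictions. Because the whole wrapper object is serialized, attributes that `save_model()` leaves out travel with it.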

Related issue: #3828

…XGBRanker

The `save_model()` and `load_model()` methods only save the part of the model
that's common to all language interfaces and do not preserve Python-specific
attributes, such as `feature_names`. More crucially, the label encoder is not
preserved either; the scikit-learn wrapper needs it, since you may have
string labels.

Fix: Explicitly recommend pickling as the way to save scikit-learn model
objects.

hcho3 (Collaborator, Author) commented Oct 25, 2018

In the longer term, let's consider migrating to an extensible model format that can store auxiliary information: JSON, Protocol Buffers, or INI config. LightGBM uses an INI-like format, which works well for them.
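
To make the idea of storing auxiliary information concrete, here is a rough sketch of what a JSON sidecar file could look like. This is purely illustrative: `dump_aux`, `load_aux`, and the key names are hypothetical and do not describe any format XGBoost implements.

```python
import json


def dump_aux(feature_names, classes, path):
    """Write Python-side metadata to a JSON file (hypothetical layout)."""
    aux = {
        "feature_names": list(feature_names),
        "classes": list(classes),  # original labels, e.g. strings
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(aux, f, indent=2)


def load_aux(path):
    """Read the metadata back as a plain dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A text format like this is readable from any language interface, which pickling is not; the trade-off is that every attribute to be preserved has to be listed explicitly.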

codecov-io commented Oct 25, 2018

Codecov Report

Merging #3829 into master will increase coverage by 0.19%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master    #3829      +/-   ##
============================================
+ Coverage     51.86%   52.06%   +0.19%     
  Complexity      203      203              
============================================
  Files           181      181              
  Lines         14358    14358              
  Branches        495      495              
============================================
+ Hits           7447     7475      +28     
+ Misses         6673     6645      -28     
  Partials        238      238

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| python-package/xgboost/sklearn.py | 84.75% <ø> (ø) | 0 <0> (ø) ⬇️ |
| python-package/xgboost/compat.py | 90.9% <0%> (+1.81%) | 0% <0%> (ø) ⬇️ |
| python-package/xgboost/core.py | 82.23% <0%> (+4.44%) | 0% <0%> (ø) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2a59ff2...0869404.

hcho3 merged commit d83c818 into dmlc:master on Oct 25, 2018
hcho3 deleted the doc_sklearn_save_model branch on October 25, 2018 20:51
CodingCat pushed a commit to CodingCat/xgboost that referenced this pull request Oct 25, 2018
…XGBRanker (dmlc#3829)

alois-bissuel pushed a commit to criteo-forks/xgboost that referenced this pull request Dec 4, 2018
…XGBRanker (dmlc#3829)

lock bot locked this conversation as resolved and limited it to collaborators on Jan 23, 2019