Recommend pickling as the way to save XGBClassifier / XGBRegressor / XGBRanker #3829

hcho3 · 2018-10-25T08:46:18Z

The save_model() and load_model() methods only save the part of the model that is common to all language interfaces and do not preserve Python-specific attributes, such as feature_names. More crucially, the label encoder is not preserved either; the scikit-learn wrapper needs it, since you may have string labels.

Fix: Explicitly recommend pickling as the way to save scikit-learn model objects.
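A minimal sketch of what the recommendation looks like in practice, using only the standard `pickle` module. The helper names `save_estimator`, `load_estimator`, and `demo`, the file name, and the toy data are illustrative assumptions and are not part of XGBoost. `demo()` additionally assumes that `xgboost` and `numpy` are installed, and it uses integer labels so that it does not depend on how a particular XGBoost version treats string labels.

```python
import pickle
import tempfile
from pathlib import Path


def save_estimator(model, path):
    """Pickle the whole Python object, so wrapper-level attributes travel with it."""
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_estimator(path):
    """Load a pickled estimator. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


def demo():
    # Assumption: xgboost and numpy are installed; the data below is made up.
    import numpy as np
    from xgboost import XGBClassifier

    rng = np.random.RandomState(0)
    X = rng.rand(100, 4)
    y = (X[:, 0] > 0.5).astype(int)

    clf = XGBClassifier(n_estimators=5).fit(X, y)

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "clf.pkl"  # hypothetical file name
        save_estimator(clf, path)
        restored = load_estimator(path)

    # The restored wrapper should predict exactly as the original does.
    assert (restored.predict(X) == clf.predict(X)).all()
```

Calling `demo()` trains a small classifier, round-trips it through a pickle file, and checks that the predictions match. Note that a pickle is tied to Python and is generally safest to load with the same library versions that wrote it.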

Related issue: #3828

hcho3 · 2018-10-25T08:49:21Z

In the longer term, let's consider migrating to an extensible model format that can store auxiliary information: JSON, Protocol Buffers, or INI config. LightGBM uses an INI-like format, which works well for them.

codecov-io · 2018-10-25T09:44:08Z

Codecov Report

Merging #3829 into master will increase coverage by 0.19%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master    #3829      +/-   ##
============================================
+ Coverage     51.86%   52.06%   +0.19%     
  Complexity      203      203              
============================================
  Files           181      181              
  Lines         14358    14358              
  Branches        495      495              
============================================
+ Hits           7447     7475      +28     
+ Misses         6673     6645      -28     
  Partials        238      238

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| python-package/xgboost/sklearn.py | `84.75% <ø> (ø)` | `0 <0> (ø)` |
| python-package/xgboost/compat.py | `90.9% <0%> (+1.81%)` | `0% <0%> (ø)` |
| python-package/xgboost/core.py | `82.23% <0%> (+4.44%)` | `0% <0%> (ø)` |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2a59ff2...0869404.

hcho3 mentioned this pull request Oct 25, 2018

multi-GPU training is not adaptable to other GPU counts or CPU #3342

Closed

hcho3 merged commit d83c818 into dmlc:master Oct 25, 2018

hcho3 deleted the doc_sklearn_save_model branch October 25, 2018 20:51

lock bot locked as resolved and limited conversation to collaborators Jan 23, 2019
