Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking deprecated features. #3986

Open
trivialfis opened this issue Dec 11, 2018 · 12 comments
Open

Tracking deprecated features. #3986

trivialfis opened this issue Dec 11, 2018 · 12 comments

Comments

@trivialfis
Copy link
Member

trivialfis commented Dec 11, 2018

Feature PR(or time) Replacement Removed
GPU Objectives. #3643 Now all objectives having GPU port will use GPU by default. #4690
grow_fast_histmaker #3836 Renamed to grow_quantile_histmaker.
Some of R parameters documented in R/xgb.importance.R. #1964 ?? ??
num_pbuffer_deprecated ?? ??
python/training.py: train::learning_rates #1797 Now use use callback API. #5155
python/sklearn.py: seed ?? Now use random_state. #4929
python/sklearn.py: nthread ?? Now use n_jobs. #4929
XGDMatrixCreateFromCSC #1600 XGDMatrixCreateFromCSCEx
XGDMatrixCreateFromCSR #1600 XGDMatrixCreateFromCSR_omp
silent #3982 Now use verbosity #5476
gpu_exact #4527 Users are encouraged to use hist or approx on GPU. #4742
n_gpus #4749 Use distributed frameworks like dask or spark #6821
reg:linear #4267 Renamed to reg:squarederror #4267
XGDMatrixSetGroup #4864 Use XGDMatrixSetUIntInfo instead.
TreeParam.max_depth #5101 This variable is never used (TrainParam.max_depth is still supported) #5101
Old Python callback functions #6199 A new callback interface is designed. #7280
Label encoder in XGBClassifier (use_label_encoder=True) #6269 User should perform label encoding manually. #7357
ntree_limit in Python #6668 Use iteration_range or model slicing instead. #8345 (2.0.0)
positional arguments in Python package #6365 Use keyword arguments instead.
use_gpu in PySpark #9390 (2.0.0) Use device instead.
gpu_hist tree method #9385 (2.0.0) Use device instead.
gpu_coord_descent updater #9507 (2.0.0) Use device instead.
predictor (cpu_predictor, gpu_predictor) #9129 (2.0.0) Use device instead. #9129 (2.0.0)
gpu_id tree method #9385 (2.0.0) Use device instead.
fit parameters in the sklearn interface #6751 (1.6.0) Use set_params instead. #9986 (2.1.0
Command line interface #9485 (2.1.0) Use other language bindings instead
Reading files from URLs like S3 #9504 (2.1.0) Use third-party libraries instead. #9504 (2.1.0)
MPI support in RABIT #9525 (2.1.0) Use RABIT or NCCL instead (along with grpc for federated) #9525 (2.1.0)
XGDMatrixSetDenseInfo #10139 (2.1.0) Use XGDMatrixSetInfoFromInterface
manylinux2014 Python wheels #10475 (2.1.0) Use a recent Linux distribution and manylinux2_28 wheels instead
@trivialfis
Copy link
Member Author

trivialfis commented Dec 11, 2018

Inspired by @thvasilo in #3929 (Thanks). Let's keep track of deprecated features here. Great chance that I missed something, anyone who notice please help adding it back. :) @hcho3 Please help filling the ?? if it's still possible (please don't spend too much time on it, it's deprecated), otherwise I will summit a PR to remove everything that loss track.

@trivialfis trivialfis changed the title Tracking deprected features. Tracking deprecated features. Dec 11, 2018
@hcho3
Copy link
Collaborator

hcho3 commented Dec 14, 2018

Let's discuss whether num_roots should be kept or not. I think it is not documented and experimental and should be deprecated. Already, I decided not to support it when I first wrote hist updater.

@hcho3 hcho3 pinned this issue Dec 14, 2018
@trivialfis
Copy link
Member Author

@hcho3 I tried to find out what it does. It turns out num_roots is not documented. Is there any reference for it?

@RAMitchell
Copy link
Member

Its always 1 in code. According to @tqchen it was intended to be used for multiclass trees (he can maybe confirm this). There is also something in metainfo to do with root index that is not documented or used - suggest also deprecating this.

@tqchen
Copy link
Member

tqchen commented Dec 14, 2018

I think it is fine dropping them, but let us also keep the backward compatibility in mind. It might be useful to first introduce json exchange format, and then allow upgrade via json

@hcho3
Copy link
Collaborator

hcho3 commented Dec 14, 2018

@trivialfis Why is XGDMatrixCreateFromCSC deprecated? It is used by the R wrapper.

@tqchen Got it

@hcho3
Copy link
Collaborator

hcho3 commented Dec 14, 2018

Just found out that TreeParam.max_depth is never set and never used:

int max_depth;

This field is different from the TrainParam.max_depth, which is given at

int max_depth;

Every occurrence of max_depth refers to TrainParam.max_depth, not TreeParam.max_depth. (Just verified with CLion)

It is not possible to remove max_depth from TreeParam right away, since TreeParam is serialized byte-wise and we cannot change the byte count. However, we can certainly remove TreeParam.max_depth from the new JSON serialization schema.

@trivialfis
Copy link
Member Author

@tqchen

It might be useful to first introduce json exchange format

Yes, JSON is the priority. This issue is intended to be long term tracking.

Why is XGDMatrixCreateFromCSC deprecated? It is used by the R wrapper.

No idea. The document said so:

* \deprecated

@trivialfis
Copy link
Member Author

@hcho3 I'm only tracking the features that's already decided to be deprecated here, so that we can have a time frame for removing those code, as you can see some of them are really old and still lingering.

Let's move whether some other features needed to be deprecated into new issues. :)

@hcho3
Copy link
Collaborator

hcho3 commented Dec 15, 2018

@trivialfis

I'm only tracking the features that's already decided to be deprecated here, so that we can have a time frame for removing those code

Got it. I'll file a new issue to deprecate TreeParam.max_depth.

No idea. The document said so

Turns out XGDMatrixCreateFromCSC is indeed deprecated, in favor of XGDMatrixCreateFromCSCEx.

@thvasilo
Copy link
Contributor

There's an issue on the board about num_feature which I guess was used to subsample features? This is taken care of by colsample* now so I guess this parameter should be removed. @trivialfis should we open an issue to track this?

@trivialfis
Copy link
Member Author

@hcho3 Actually I deprecated a lots of things in #5101, but there is no user visible effect, as all deprecated parameters are duplicated in some way. I just can't remove them as binary IO is still the default IO method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants