Support null value in CUDA array interface. #46

quasiben · 2022-11-28T16:47:58Z

…mlc#6751) A new parameter `custom_metric` is added to `train` and `cv` to distinguish its behaviour from the old `feval`, which is now deprecated. The new `custom_metric` receives the transformed prediction when a built-in objective is used. This enables XGBoost to use cost functions from other libraries like scikit-learn directly, without going through the definition of the link function. In the sklearn interface, `eval_metric` and `early_stopping_rounds` are moved from `fit` to `__init__` and are now saved as part of the scikit-learn model; the old ones in the `fit` function are deprecated. The new `eval_metric` in `__init__` has the same behaviour as `custom_metric`. More detailed documentation is added for the behaviour of custom objectives and metrics.
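The difference between a metric that sees raw margins and one that sees transformed predictions can be sketched in plain Python. This is only an illustration under stated assumptions: it does not call XGBoost, the function names (`feval_style_metric`, `custom_metric_style`) are hypothetical, the inputs are plain lists rather than a `DMatrix`, and it assumes a logistic objective whose inverse link is the sigmoid.

```python
import math


def sigmoid(margin: float) -> float:
    """Inverse link of a logistic objective: raw margin -> probability."""
    return 1.0 / (1.0 + math.exp(-margin))


def log_loss(labels, probabilities):
    """Mean binary log loss; expects probabilities, not raw margins."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def feval_style_metric(raw_margins, labels):
    # Assumed old-style behaviour: the metric is handed untransformed
    # margins, so it has to apply the link function itself.
    return "logloss", log_loss(labels, [sigmoid(m) for m in raw_margins])


def custom_metric_style(probabilities, labels):
    # New-style behaviour described above: predictions arrive already
    # transformed, so a library cost function can be used directly.
    return "logloss", log_loss(labels, probabilities)


labels = [1, 0, 1, 1]
raw_margins = [2.0, -1.0, 0.5, -0.3]
transformed = [sigmoid(m) for m in raw_margins]

old_name, old_value = feval_style_metric(raw_margins, labels)
new_name, new_value = custom_metric_style(transformed, labels)
print(old_name, old_value)
print(new_name, new_value)
```

Both calls compute the same loss; the second simply no longer needs to know the link function, which is the point of receiving transformed predictions.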

This was already partially supported but never properly tested, so the only way to use it was to call `numpy.ndarray.flatten` on `base_margin` before passing it into XGBoost. This PR adds proper support for most of the data types, along with tests.

- Optionally switch to C++17.
- Use rmm CMake target.
- Work around compiler errors.
- Fix GPUMetric inheritance.
- Run death tests even if it's built with RMM support.

Co-authored-by: jakirkham <jakirkham@gmail.com>

nicovdijk and others added 30 commits October 21, 2021 16:22

[XGBoost4J-Spark] Serialization for custom objective and eval (dmlc#7274)

31a307c

* added type hints to custom_obj and custom_eval for Spark persistence

Co-authored-by: Bobby Wang <wbo4958@gmail.com>

[doc] Remove num_pbuffer. [skip ci] (dmlc#7356)

864d236

[doc] Document the status of RTD hosting. [skip ci] (dmlc#7353)

e36b066

Avoid omp reduction in rank metric. (dmlc#7349)

fd61c61

[jvm-packages] Fix for space in sys.executable path in create_jni.py (dmlc#7358)

a6bcd54

Re-implement PR-AUC. (dmlc#7297)

d434942

* Support binary/multi-class classification, ranking.
* Add documents.
* Handle missing data.

Update GPU doc for PR-AUC. [skip ci] (dmlc#7368)

b9414b6

Remove old custom objective demo. (dmlc#7369)

2eee874

We have two new custom objective demos, covering both regression and classification, with accompanying tutorials in the documentation.

Handle missing values in dataframe with category dtype. (dmlc#7331)

ac9bfaa

* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.

Avoid OMP reduction in AUC. (dmlc#7362)

d05754f

[breaking] Remove label encoder deprecated in 1.3. (dmlc#7357)

3c4aa9b

Update setup.py. (dmlc#7360)

6b074ad

* Add new classifiers.
* Typehint.

Typehint for subset of core API. (dmlc#7348)

c676948

[jvm-packages] Fix json4s binary compatibility issue (dmlc#7376)

b81ebbe

Spark 3.2 depends on json4s 3.7.0-M11, which changed the signatures of some implicit functions. As a result, xgboost4j built against Spark 3.0/3.1 fails when saving the model.

Move macos test to github action. (dmlc#7382)

239dbb3

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>

Use double precision in metric calculation. (dmlc#7364)

0f7a9b4

Add clang-format config. (dmlc#7383)

8211e5f

Generated using `clang-format -style=google -dump-config > .clang-format`, with column width changed from 80 to 100 to be consistent with existing cpplint check.

Fix span reverse iterator. (dmlc#7387)

6295dc3

* Fix span reverse iterator.
* Disable `rbegin` on device code to avoid calling host function.
* Add `trbegin` and friends.

Support building with CTK11.5. (dmlc#7379)

32e673d

* Support building with CTK11.5.
* Require system cub installation for CTK11.4+.
* Check thrust version for segmented sort.

Move callbacks from fit to __init__. (dmlc#7375)

154b150

Cleanup the train function. (dmlc#7377)

c74df31

* Move attribute setter to callback.
* Remove the internal train function.
* Remove unnecessary initialization.

Add test for invalid categorical data values. (dmlc#7380)

a55d43c

* Add test for invalid categorical data values.
* Add check during sketching.

Change shebang used in CLI demo. (dmlc#7389)

e6ab594

Change the shebang from the system Python to the environment's `python3`. On Ubuntu 20.04 only `python3` is available and there is no `python`, so `python3` is at least consistent across Python virtual environments, Ubuntu, and Anaconda.
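A minimal sketch of what such a script header presumably looks like. The script body and the helper name `interpreter_summary` are hypothetical and only there to show which interpreter the `env` lookup resolved to; the actual demo script is not reproduced here.

```python
#!/usr/bin/env python3
"""Hypothetical CLI demo script using an environment-resolved interpreter."""
import sys


def interpreter_summary() -> str:
    # Report the interpreter that `env` found first on PATH, e.g. the one
    # from an activated virtual environment or conda environment.
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return f"Python {version} at {sys.executable}"


if __name__ == "__main__":
    print(interpreter_summary())
```

With `/usr/bin/env python3`, the interpreter is looked up on `PATH` rather than hard-coded, so an activated environment takes precedence over the system installation.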

Handle OMP_THREAD_LIMIT. (dmlc#7390)

57a4b4f

Support building gradient index with cat data. (dmlc#7371)

ccdabe4

Pass information about objective to tree methods. (dmlc#7385)

4100827

* Define the `ObjInfo` and pass it down to every tree updater.

Add note about CRAN release [skip ci] (dmlc#7395)

232144c

Implement a general array view. (dmlc#7365)

b06040b

* Replace existing matrix and vector view. This is to prepare for handling higher dimension data and prediction when we support multi-target models.

trivialfis and others added 27 commits June 7, 2022 14:20

[backport] Fix pylint errors. (dmlc#7967) (dmlc#7981)

a55d3bd

* Fix pylint errors. (dmlc#7967)
* Rebase error.

[backport] Update CUDA docker image and NCCL. (dmlc#8139) (dmlc#8162)

39c1488

* Update CUDA docker image and NCCL. (dmlc#8139)
* Rest of the CI.
* CPU test dependencies.

[backport] Fix compatibility with latest cupy. (dmlc#8129) (dmlc#8160)

140c377

* Fix compatibility with latest cupy.
* Freeze mypy.

Fix monotone constraint with tuple input. (dmlc#7891) (dmlc#8159)

9c65337

[CI] Test with latest RAPIDS. (dmlc#7816) (dmlc#8164)

9d816d9

[dask] Use an invalid port for test. (dmlc#8064) (dmlc#8167)

97d89c3

Verify shared object version at load. (dmlc#7928) (dmlc#8168)

0e2b5c4

[backport] Limit max_depth to 30 for GPU. (dmlc#8098) (dmlc#8169)

2e6444b

[dask] Deterministic rank assignment. (dmlc#8018) (dmlc#8165)

b18c984

[backport] Fix Python package source install. (dmlc#8036) (dmlc#8171)

e82162d

* Copy gputreeshap.

[backport] Fix LTR with weighted Quantile DMatrix. (dmlc#7975) (dmlc#8170)

51c3301

* Fix LTR with weighted Quantile DMatrix.
* Better tests.

Make 1.6.2 patch release. (dmlc#8175)

2d54f7d

Disable modin test on 1.6.0 branch. (dmlc#8176)

7036d4f

[CI] Fix R build on Jenkins. (dmlc#8154) (dmlc#8180)

922d213

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

[backport] Fix loading DMatrix binary in distributed env. (dmlc#8149) (dmlc#8185)

0fd6391

* Fix loading DMatrix binary in distributed env. (dmlc#8149)
- Try to load DMatrix binary before trying to parse text input.
- Remove some unmaintained code.
* Fix.

Fix release script. (dmlc#8187)

1fbb452

Fix typo. (dmlc#8192)

b993424

Add missing Thrust header includes.

b400967

Merge pull request rapidsai#39 from bdice/thrust-includes

2e1e95e

Add missing Thrust header includes.

Update gputreeshap submodule.

26ea6e2

Merge pull request rapidsai#40 from bdice/update-gputreeshap

3732123

Update gputreeshap submodule.

Merge branch 'release_1.6.0' of https://github.com/dmlc/xgboost into patch-1.6-cupy

e6d5624

Merge pull request rapidsai#41 from beckernick/patch-1.6-cupy

1a2012d

Update to latest xgboost release_1.6.0 to include cupy compatibility patch

Merge pull request rapidsai#42 from rapidsai/branch-22.10

e6756c7

Merge branch-22.10 into branch-22.12 (Update to latest xgboost release_1.6.0 to include cupy compatibility patch)

Simple tests.

f6c429a

Extract mask.

4fe8204

quasiben changed the base branch from branch-22.10 to branch-21.12 November 28, 2022 16:48

quasiben closed this Nov 28, 2022

quasiben deleted the trivialfis-arr-null-fields branch November 28, 2022 17:46
