Batch UpdatePosition calls #7925

RAMitchell · 2022-05-20T11:10:08Z

Also removes position information from inside the RowPartitioner class. We don't actually need it anywhere.

FinalisePosition has been moved outside the class, and now calculates positions as well as leaf values, which can be used for UpdatePredictionCache.
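To illustrate the idea of computing positions and leaf values in the same pass, here is a minimal CPU sketch. It is not the code in this PR: `Node`, `FinalPositions` and `FinalisePositionSketch` are hypothetical names, the tree layout is a simplified stand-in for the real tree type, and missing values and categorical splits are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened tree node (assumption, not XGBoost's RegTree).
struct Node {
  int left;          // index of the left child, -1 marks a leaf
  int right;         // index of the right child, -1 marks a leaf
  int feature;       // split feature (internal nodes only)
  float threshold;   // a row goes left when its value < threshold
  float leaf_value;  // weight of the node (leaves only)
};

// Per-row output: the leaf each row ended in and that leaf's value.
struct FinalPositions {
  std::vector<int> position;
  std::vector<float> leaf_value;
};

// Walk every row from the root to its leaf and record both the leaf index
// and the leaf value, so that no second traversal is needed afterwards.
FinalPositions FinalisePositionSketch(
    const std::vector<Node>& tree,
    const std::vector<std::vector<float>>& rows) {
  assert(!tree.empty());
  FinalPositions out;
  out.position.reserve(rows.size());
  out.leaf_value.reserve(rows.size());
  for (const auto& row : rows) {
    int nidx = 0;  // start at the root
    while (tree[nidx].left != -1) {
      const Node& n = tree[nidx];
      assert(static_cast<std::size_t>(n.feature) < row.size());
      nidx = row[n.feature] < n.threshold ? n.left : n.right;
    }
    out.position.push_back(nidx);
    out.leaf_value.push_back(tree[nidx].leaf_value);
  }
  return out;
}
```

With a result like this, a prediction cache update reduces to adding `leaf_value[i]` to the cached prediction of row `i`.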

I have removed prediction caching for external memory in gpu_hist.

This reverts commit 80a3e78.

This reverts commit a1cddaa.


RAMitchell · 2022-05-20T11:17:38Z

Some thoughts on further re-factoring:

- Updaters should not see the task. Their behaviour should not depend on the objective function.
- The prediction cache update is perhaps best returned from the tree->Update call alongside position information. It is very easy to calculate the prediction while you are calculating the row positions. Alternatively, GBTree can just calculate this using the positions obtained from the update.
- Our lives would be made much easier if every updater consistently supported position information and prediction caching, without exceptions.
- The gradient-based sampling needs to transparently show which samples it has selected to enable the above for gpu_hist.
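A rough sketch of what the second and fourth points could look like on the booster side, assuming the updater hands back a position per row plus a mask of the rows the sampler selected. `UpdateResult`, `UpdateCacheFromPositions` and `SampledRowsInLeaf` are hypothetical names for illustration only and do not exist in XGBoost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical return value of a tree update (assumption, not an XGBoost type).
struct UpdateResult {
  std::vector<int> position;  // final leaf index for every row
  std::vector<bool> sampled;  // true if the sampler selected the row
};

// Booster-side cache update: needs only positions and per-node leaf values,
// so it does not depend on the updater or on the objective.
void UpdateCacheFromPositions(const UpdateResult& result,
                              const std::vector<float>& node_leaf_value,
                              std::vector<float>* preds) {
  assert(preds != nullptr);
  assert(result.position.size() == preds->size());
  for (std::size_t i = 0; i < preds->size(); ++i) {
    const int nidx = result.position[i];
    assert(nidx >= 0 &&
           static_cast<std::size_t>(nidx) < node_leaf_value.size());
    (*preds)[i] += node_leaf_value[nidx];  // every row is updated
  }
}

// Rows that both ended in leaf `nidx` and were selected by the sampler.
std::vector<std::size_t> SampledRowsInLeaf(const UpdateResult& result,
                                           int nidx) {
  assert(result.position.size() == result.sampled.size());
  std::vector<std::size_t> rows;
  for (std::size_t i = 0; i < result.position.size(); ++i) {
    if (result.position[i] == nidx && result.sampled[i]) {
      rows.push_back(i);
    }
  }
  return rows;
}
```

The explicit mask is one possible way to make the sampler's choice visible; another encoding would work as long as every consumer can tell selected rows from unselected ones.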

RAMitchell · 2022-05-20T11:31:01Z

Single GPU benchmarks, max_depth=20

| dataset | master | batch-position-scan |
|---|---|---|
| airline | 5014.992213 | 4659.965927 |
| bosch | 94.40516737 | 92.6324375 |
| covtype | 190.9750907 | 162.9997489 |
| epsilon | 986.6671753 | 979.8350237 |
| fraud | 1.618839191 | 1.577903566 |
| higgs | 2577.564037 | 2177.161882 |
| year | 2251.341695 | 2081.802971 |
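To make the comparison easier to read, the ratio master / batch-position-scan can be computed per dataset. The sketch below assumes the two columns are timings in the same unit where lower is better (the comment does not state the unit); `BenchmarkRow`, `BenchmarkTable` and `Speedup` are names made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the benchmark table above.
struct BenchmarkRow {
  std::string dataset;
  double master;  // value measured on master
  double branch;  // value measured on batch-position-scan
};

// The figures exactly as posted in the comment.
std::vector<BenchmarkRow> BenchmarkTable() {
  return {
      {"airline", 5014.992213, 4659.965927},
      {"bosch", 94.40516737, 92.6324375},
      {"covtype", 190.9750907, 162.9997489},
      {"epsilon", 986.6671753, 979.8350237},
      {"fraud", 1.618839191, 1.577903566},
      {"higgs", 2577.564037, 2177.161882},
      {"year", 2251.341695, 2081.802971},
  };
}

// Ratio above 1.0 means the branch value is smaller than the master value.
double Speedup(const BenchmarkRow& row) {
  assert(row.branch > 0.0);
  return row.master / row.branch;
}
```

Under that assumption, every dataset has a ratio above 1.0, with covtype and higgs showing the largest change and epsilon the smallest.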

RAMitchell · 2022-05-23T11:33:18Z

There is a performance regression on the airline dataset (many rows) with max_depth=8. One of the batched kernels performs poorly and needs to be profiled and optimised.
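For readers unfamiliar with the batching idea, here is a sequential CPU analogue, assuming that "batched" means several node splits are handled in one call instead of one call per node. It is not the CUDA kernel from this PR: `NodeSegment` and `BatchUpdatePosition` are hypothetical, and the loop over segments stands in for work that a GPU implementation would spread across a single launch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one node to split (assumption, not PR code).
struct NodeSegment {
  std::size_t begin;  // first slot of this node's rows in row_index
  std::size_t end;    // one past the last slot
  int feature;        // split feature
  float threshold;    // a row goes left when its value < threshold
};

// Partition the rows of every node in the batch in a single call.
// Returns the number of rows sent left for each segment.
std::vector<std::size_t> BatchUpdatePosition(
    const std::vector<NodeSegment>& batch,
    const std::vector<std::vector<float>>& data,
    std::vector<std::size_t>* row_index) {
  assert(row_index != nullptr);
  std::vector<std::size_t> left_count;
  left_count.reserve(batch.size());
  for (const NodeSegment& seg : batch) {
    assert(seg.begin <= seg.end && seg.end <= row_index->size());
    auto first = row_index->begin() + static_cast<std::ptrdiff_t>(seg.begin);
    auto last = row_index->begin() + static_cast<std::ptrdiff_t>(seg.end);
    // Stable partition keeps the relative row order within each side.
    auto mid = std::stable_partition(first, last, [&](std::size_t ridx) {
      return data[ridx][seg.feature] < seg.threshold;
    });
    left_count.push_back(static_cast<std::size_t>(mid - first));
  }
  return left_count;
}
```

The returned left counts are what a caller would need to derive the child segments for the next level.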

RAMitchell added 30 commits April 21, 2022 03:19

Remove single_precision_histogram

2b4cf67

Batch nodes from driver

f140ebc

Categoricals broken

80a3e78

Refactor categoricals

e1fb702

Refactor categoricals 2

dc100cf

Skip copy if no categoricals

bc74458

Review comment

c4f8eac

Merge branch 'master' of github.com:dmlc/xgboost into categorical

2a53849

Revert "Categoricals broken"

a1cddaa

This reverts commit 80a3e78.

Merge branch 'master' of github.com:dmlc/xgboost into fuse

829bda6

Merge branch 'categorical' of github.com:RAMitchell/xgboost into fuse

0bc8745

Lint

fd0e25e

Merge branch 'master' of github.com:dmlc/xgboost into fuse

9fab64e

Revert "Revert "Categoricals broken""

56785f3

This reverts commit a1cddaa.

Limit concurrent nodes

1dd1a6c

Lint

8751d14

Basic blockwise partitioning

49809bf

Working block partition

181d7cf

Reduction

666eb9b

Some failing tests

66173c7

Handle empty candidate

ec7fea8

Cleanup

49c5f90

experiments

bd48082

Improvements

c3ef1f6

Fused scan

ba8bbdf

Register blocking

f4ef4ca

Cleanup

9c27dd0

Working tests

0bcc84a

Transplanted new code

723ff47

Optimised

199bed9

RAMitchell added 8 commits May 19, 2022 06:57

Do not initialise data structures to maximum possible tree size.

0e35e99

Comments, cleanup

daa9b56

Refactor FinalizePosition

8ab989e

Remove redundant functions

d50ec4b

Lint

c34c3ad

Merge branch 'master' of github.com:dmlc/xgboost into batch-position-scan

e534edc

Remove old kernel

47bfc6e

Add tests for AtomicIncrement

a53ba87

RAMitchell mentioned this pull request Jun 1, 2022

Batch UpdatePosition using cudaMemcpy #7964

Merged

RAMitchell closed this Jun 15, 2022
