Run prediction on histogram index. #5319

Closed

@trivialfis (Member) commented Feb 17, 2020

  • Unify GPU Predict tree node. (merged)
  • Add wrapper for ellpack matrix in GPU Predictor. (merged)
  • Add PageExists to DMatrix for testing whether gradient index is available. (merged)
  • Move CPU histogram index into DMatrix (external memory is still not supported).
  • Add predict on CPU gradient index.

@trivialfis (Member Author):

@RAMitchell Your last PR made the SparsePage a value instead of a pointer, so it always exists. This PR runs prediction on the Ellpack page only when the SparsePage has not been created, to avoid a performance drop on a normal DMatrix.
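The page-selection logic just described can be sketched as follows. This is a minimal model, not the actual DMatrix API: the set-of-page-names representation and the helper name are hypothetical.

```python
# Hypothetical sketch of the page-selection logic described above; the
# set-of-page-names model is not the real DMatrix interface.
def choose_predict_source(pages, precise=True):
    if precise and "SparsePage" in pages:
        # Exact feature values; avoids a performance drop on a normal DMatrix.
        return "SparsePage"
    if "EllpackPage" in pages:
        # Quantised index already materialised (e.g. on GPU).
        return "EllpackPage"
    # Otherwise fall back to the CPU gradient index added by this PR.
    return "GradientIndexPage"

print(choose_predict_source({"SparsePage", "EllpackPage"}))  # SparsePage
print(choose_predict_source({"EllpackPage"}))                # EllpackPage
```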

@RAMitchell (Member) left a comment

Changes to gpu_predictor look great.

@@ -493,13 +505,33 @@ class DMatrix {
virtual BatchSet<CSCPage> GetColumnBatches() = 0;
virtual BatchSet<SortedCSCPage> GetSortedColumnBatches() = 0;
virtual BatchSet<EllpackPage> GetEllpackBatches(const BatchParam& param) = 0;
virtual BatchSet<GradientIndexPage> GetGradientIndexBatches(const BatchParam& param) = 0;

virtual bool EllpackExists() const = 0;

RAMitchell (Member):

Is there a way to avoid adding these virtual private methods and make PageExists() virtual instead?

trivialfis (Member Author):

@RAMitchell Not entirely sure how to do that right now. Maybe add an enum for each type of page?

Resolved review threads: include/xgboost/data.h, src/common/hist_util.h (outdated), src/common/quantile.h (outdated), src/data/gradient_index_source.h, src/data/simple_dmatrix.cc (outdated), src/predictor/cpu_predictor.cc (outdated)

if (precise && dmat->PageExists<SparsePage>()) {
do_predict_sparse_page();
} else if (!dmat->PageExists<EllpackPage>()) {
RAMitchell (Member):

We need to make sure this EllpackPage exists at the time of the first prediction call, otherwise the SparsePage will be created on device in the first iteration.

trivialfis (Member Author):

That's the part where I need to know the plan for DMatrix initialization.

@trivialfis (Member Author) commented Feb 19, 2020

@RAMitchell

> Why would predictions from the quantised matrix be different from the SparsePage?

This is quite subtle. Here is an explanation based on some simple examples:

  • An input value can be exactly equal to a tree node's split condition. When traversing the tree we use fvalue < split_cond, which is false when the two values are equal, so the row goes to the right child.

  • Shifting the value by an epsilon helps, but doesn't solve the problem.

Here is a dataset I generated from a normal distribution, with shape 8 × 4:

[[ 1.76405235  0.40015721  0.97873798  2.2408932 ]
 [ 1.86755799 -0.97727788  0.95008842 -0.15135721]
 [-0.10321885  0.4105985   0.14404357  1.45427351]
 [ 0.76103773  0.12167502  0.44386323  0.33367433]
 [ 1.49407907 -0.20515826  0.3130677  -0.85409574]
 [-2.55298982  0.6536186   0.8644362  -0.74216502]
 [ 2.26975462 -1.45436567  0.04575852 -0.18718385]
 [ 1.53277921  1.46935877  0.15494743  0.37816252]]

The cuts for the first column (max_bin=2) are:

1.53277921676635742, 4.53951930999755859, 

The cut values combined with min_values cover the whole range of this column, but min_value is not used in prediction (should it be?). The last row will be assigned to the second bin, since the sorted first column is:

[-2.55298982,
 -0.10321885,
 0.76103773,
 1.49407907,
 1.53277921,
 1.76405235,
 1.86755799,
 2.26975462]

and 1.53277921 is in the second half. So it gets 4.5395 as its cut value and hence as its feature value. But if you are using the original data, it is exactly equal to the cut point value 1.53.... If we shift the value by an epsilon or change < to <=, we again obtain a different result.
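The mismatch can be reproduced with a small sketch (assuming, as in the thread, that a value is represented by the upper boundary of its bin, and that a value equal to a boundary falls into the next bin):

```python
import bisect

# Upper bin boundaries for the first column from the example above (max_bin=2).
cuts = [1.53277921676635742, 4.53951930999755859]
split_cond = cuts[0]  # a split condition learned on these same cuts

def quantised_fvalue(v):
    # Represent a value by the upper boundary of its bin; a value equal to
    # a boundary falls into the next bin, matching the observation that
    # 1.53277921 lands in the second bin.
    i = min(bisect.bisect_right(cuts, v), len(cuts) - 1)
    return cuts[i]

raw = 0.76103773                                     # belongs in the first bin
left_raw = raw < split_cond                          # True: the raw value goes left
left_quantised = quantised_fvalue(raw) < split_cond  # False: 1.532... < 1.532...
print(left_raw, left_quantised)                      # True False
```

So the same row takes different branches depending on which representation is used for prediction.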

@RAMitchell (Member):

@trivialfis I don't follow. If we have a split condition f0 < 1.53277921, the row containing 1.53277921 will go right whether we predict from quantised data or floating-point data.

@trivialfis (Member Author) commented Feb 19, 2020

In the above example, 0.76103773 should go left, but since it is assigned to the 1.532 bin, it goes right.

@RAMitchell (Member):

I see. Any element assigned to bin 0 must have a value < 1.532 by the way the bins are constructed, so if we return the floating-point value 1.532 for bin 0, we would need to use the <= condition for splits, and that is not consistent with prediction directly from floating-point data. I think the solution is to return a cut point less than 1.532, in this case the min value.

Maybe you can produce a few test cases verifying these scenarios. Fundamentally I still think there should be no difference between quantile prediction and normal prediction, assuming the tree has been constructed on the same quantile cuts.
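The suggested fix, representing a bin by the cut point below it (the column minimum for the first bin), can be sketched like this; the helper name is hypothetical, and the numbers reuse the example column above:

```python
import bisect

# Example column from the thread: upper bin boundaries plus the column minimum.
cuts = [1.53277921676635742, 4.53951930999755859]
min_value = -2.55298982
split_cond = cuts[0]

def representative_lower(v):
    # Represent a bin by its *lower* boundary (min_value for the first bin),
    # so the strict `<` split comparison stays consistent with prediction
    # on raw floating-point values.
    i = min(bisect.bisect_right(cuts, v), len(cuts) - 1)
    return min_value if i == 0 else cuts[i - 1]

raw = 0.76103773
print(representative_lower(raw) < split_cond)         # True: goes left, like the raw value
print(representative_lower(1.76405235) < split_cond)  # False: goes right, like the raw value
```

With this representative, rows on either side of the boundary take the same branch as they would with the raw floating-point values.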

Commit:

* Move CPU histogram index into DMatrix (external memory is still not supported).
* Add predict on CPU gradient index.
* Check whether the entry is missing when getting the global index.
@trivialfis (Member Author):

The feature is basically done. Waiting for #5351 to be resolved.

@trivialfis (Member Author) commented Mar 13, 2020

@SmirnovEgorRu @ShvetsKS Hi, this PR moves the histogram index into the DMatrix object and prepares for in-place sketching, so that we can avoid copying data into CSR for the hist and gpu_hist algorithms. The resulting memory-usage reduction is expected to be significant. The GPU version is already merged; there the histogram index is stored in an Ellpack matrix. Part of in-place sketching on GPU is also merged in #5365. Now we are working on the user interface. There's an issue, #5351, waiting to be resolved before this one can be merged. Feel free to take a look at this PR when time allows.
