
[REVIEW] MNMG RF broadcast feature #3349

Merged — 23 commits — Mar 25, 2021

Conversation

viclafargue
(Contributor)

Answers #3343 and #3342

This adds the following new features to MNMG RF:

  • Possibility to train on the whole dataset (every worker receives the full dataset)
  • Possibility to avoid transferring the model before inference: instead, each worker performs partial inference with the trees at its disposal, and the results are reduced on the client side. This is particularly helpful when the dataset is actually smaller than the model that would otherwise be transferred.
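The two data-distribution modes above can be illustrated with a minimal single-process sketch (function and parameter names here are hypothetical, not the cuML API):

```python
import numpy as np

def distribute(X, n_workers, broadcast=False):
    # Illustrative sketch of the two modes, not the cuML implementation.
    if broadcast:
        # Broadcast mode: every worker receives the full dataset.
        return [X] * n_workers
    # Default mode: each worker receives only its own partition.
    return np.array_split(X, n_workers)

X = np.arange(12).reshape(6, 2)
parts = distribute(X, 3)                  # 3 partitions of 2 rows each
full = distribute(X, 3, broadcast=True)   # 3 copies of all 6 rows
```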

@viclafargue viclafargue requested a review from a team as a code owner January 6, 2021 17:46
@viclafargue viclafargue changed the title [ENH] MNMG RF broadcast feature [WIP] MNMG RF broadcast feature Jan 6, 2021

hcho3 commented Jan 6, 2021

@viclafargue Thanks for working on this. I will make sure to review once it's marked ready for review. Also, let me know if you'd like some early feedback.

@viclafargue viclafargue changed the title [WIP] MNMG RF broadcast feature [REVIEW] MNMG RF broadcast feature Jan 11, 2021
@viclafargue viclafargue added the 3 - Ready for Review Ready for review by team label Jan 11, 2021

viclafargue commented Jan 11, 2021

Thanks @hcho3! I think it should be ready for a first review/discussion. For now, the broadcast feature is implemented for training, but only partially implemented for inference (predict, but not predict_proba). Reducing results produced by different sets of trees may not match the results of running inference with the full model, which is why I'm a bit dubious about the validity of the predictions with the broadcast feature enabled. Do you have any tips for getting more faithful results during the reduction operation? I'm also wondering how to implement the reduction for predict_proba.

Additionally, is it preferable to perform the reduction in a distributed or a local fashion? Some of the Dask features I am using only work with host arrays. A better solution might be to assume there is enough device memory on the client and to perform the reduction there with GPU acceleration.

@dantegd dantegd requested a review from hcho3 January 11, 2021 21:04
@JohnZed JohnZed requested review from JohnZed and removed request for hcho3 January 14, 2021 17:14
@drobison00 (Contributor) left a comment

Looks reasonable to me.

Have you tested in a MNMG environment?

@JohnZed (Contributor) left a comment

The training code looks really nice and clean. Per discussion, I think it's worth moving the prediction code over to predict_proba + average + vote for classification, to ensure no change in predictions. I'd really hope we can get it all onto GPU too with dask.array/CuPy arrays; if mean isn't working on GPU for the reduction you need, it may be worth conversing with the Dask team.
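A toy illustration of why the predict_proba route matters: reducing per-worker hard labels by majority vote can disagree with averaging probabilities first (which is what the full forest effectively does). The numbers below are invented for the example:

```python
import numpy as np

# Per-worker class probabilities for a single row (binary problem),
# each worker holding the same number of trees. Numbers are invented.
worker_probas = np.array([
    [0.55, 0.45],   # worker 1 -> hard label 0
    [0.55, 0.45],   # worker 2 -> hard label 0
    [0.05, 0.95],   # worker 3 -> hard label 1
])

# Majority vote over per-worker hard labels:
hard_labels = worker_probas.argmax(axis=1)          # [0, 0, 1]
vote = np.bincount(hard_labels).argmax()            # -> class 0

# Averaging probabilities first, then taking argmax:
proba_label = worker_probas.mean(axis=0).argmax()   # -> class 1

# vote != proba_label: averaging probabilities reproduces the
# full-forest behaviour; voting on per-worker labels does not.
```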

Resolved review threads (outdated): python/cuml/dask/ensemble/base.py (×5)
def _func_predict(model, input_data, **kwargs):
    X = concatenate(input_data)
    with cuml.using_output_type("numpy"):
        prediction = model.predict(X, **kwargs)
    return prediction
Inline review comment:

Per discussion, for classification this should probably use predict_proba, and then we take the per-class average (perhaps building a 3-D array of shape n_rows × n_classes × n_workers and averaging over the third dimension). We'd also want to weight by the number of trees per worker, since these can differ a bit (e.g. training 101 trees on 10 workers would leave one worker with an extra tree). The same approach (weighted average reduction) should work for regression, treating it as n_classes = 1.
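The suggested reduction could be sketched as follows (names are illustrative and this is a sketch of the idea, not the cuML implementation):

```python
import numpy as np

def reduce_proba(worker_probas, trees_per_worker):
    """Tree-count-weighted average of per-worker predict_proba outputs.

    worker_probas: list of (n_rows, n_classes) arrays, one per worker.
    trees_per_worker: number of trees each worker's partial model holds.
    """
    stacked = np.stack(worker_probas, axis=2)   # (n_rows, n_classes, n_workers)
    w = np.asarray(trees_per_worker, dtype=np.float64)
    w /= w.sum()                                # normalise tree counts to weights
    return (stacked * w).sum(axis=2)            # weighted average over workers

# Worker A holds 3 trees, worker B holds 1 (tree counts per worker can
# differ), so A's probabilities get 3x the weight:
out = reduce_proba(
    [np.array([[0.9, 0.1]]), np.array([[0.5, 0.5]])],
    trees_per_worker=[3, 1],
)
# out -> [[0.8, 0.2]]
```

For regression the same function applies with a single "class" column holding the predicted values.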

Resolved review threads (outdated): python/cuml/dask/ensemble/base.py (×2), python/cuml/dask/ensemble/randomforestclassifier.py, python/cuml/dask/ensemble/randomforestregressor.py
@viclafargue

Looks reasonable to me.

Have you tested in a MNMG environment?

Thanks, only MG for now. I need to do this.

@github-actions github-actions bot added the Cython / Python Cython or Python issue label Jan 21, 2021
@viclafargue viclafargue added 4 - Waiting on Reviewer Waiting for reviewer to review or respond improvement Improvement / enhancement to an existing function non-breaking Non-breaking change 4 - Waiting on Author Waiting for author to respond to review and removed 3 - Ready for Review Ready for review by team 4 - Waiting on Reviewer Waiting for reviewer to review or respond labels Jan 27, 2021
@drobison00

@viclafargue Looks good to me.

@viclafargue

@viclafargue Looks good to me.

Thank you for the review! I'll test the PR in an MNMG setting. Unfortunately, it's a little bit hard to set one up at the moment because of technical problems. I'll keep you updated.

@JohnZed JohnZed added this to PR-WIP in v0.18 Release via automation Feb 8, 2021
@viclafargue viclafargue changed the base branch from branch-0.18 to branch-0.19 February 9, 2021 09:34
@JohnZed JohnZed removed this from PR-WIP in v0.18 Release Feb 9, 2021
@JohnZed JohnZed added this to PR-WIP in v0.19 Release via automation Feb 9, 2021
@viclafargue viclafargue requested review from a team as code owners March 5, 2021 16:49
@codecov-io
Codecov Report

Merging #3349 (7cc2495) into branch-0.19 (6dfff66) will increase coverage by 1.67%.
The diff coverage is 95.24%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.19    #3349      +/-   ##
===============================================
+ Coverage        79.21%   80.89%   +1.67%     
===============================================
  Files              226      227       +1     
  Lines            17900    17796     -104     
===============================================
+ Hits             14180    14396     +216     
+ Misses            3720     3400     -320     
Flag Coverage Δ
dask 45.49% <75.17%> (+1.74%) ⬆️
non-dask 72.90% <82.42%> (+1.42%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
python/cuml/dask/cluster/dbscan.py 97.29% <ø> (+0.07%) ⬆️
python/cuml/dask/linear_model/elastic_net.py 100.00% <ø> (ø)
python/cuml/datasets/arima.pyx 97.56% <ø> (ø)
python/cuml/datasets/blobs.py 77.27% <ø> (ø)
python/cuml/datasets/regression.pyx 98.11% <ø> (ø)
python/cuml/decomposition/incremental_pca.py 94.70% <ø> (ø)
python/cuml/feature_extraction/_tfidf.py 94.73% <ø> (+0.11%) ⬆️
python/cuml/metrics/pairwise_distances.pyx 98.83% <ø> (ø)
python/cuml/neighbors/__init__.py 100.00% <ø> (ø)
python/cuml/preprocessing/LabelEncoder.py 94.73% <ø> (ø)
... and 94 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bacb05e...7cc2495. Read the comment docs.

@JohnZed JohnZed self-assigned this Mar 16, 2021
@JohnZed JohnZed added 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 4 - Waiting on Author Waiting for author to respond to review labels Mar 16, 2021
v0.19 Release automation moved this from PR-WIP to PR-Needs review Mar 17, 2021
@JohnZed (Contributor) left a comment

Looking good... I just had a few suggestions, then I'll approve quickly. The weighting scheme via reduction looks good, but it may be non-obvious to readers of the code, so it would be good to beef up the comments there a bit. A few small test suggestions too. Otherwise great!

Resolved review threads: python/cuml/dask/ensemble/base.py (×2), python/cuml/dask/ensemble/randomforestregressor.py, python/cuml/test/dask/test_random_forest.py (×2)
@JohnZed (Contributor) left a comment

Looks great, Victor!

v0.19 Release automation moved this from PR-Needs review to PR-Reviewer approved Mar 25, 2021

JohnZed commented Mar 25, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 9244037 into rapidsai:branch-0.19 Mar 25, 2021
v0.19 Release automation moved this from PR-Reviewer approved to Done Mar 25, 2021
Labels
4 - Waiting on Reviewer Waiting for reviewer to review or respond Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
v0.19 Release (Done)
Development

Successfully merging this pull request may close these issues.

None yet

5 participants