RF: Make experimental-backend default for regression tasks and deprecate old-backend. #3872

venkywonka · 2021-05-18T14:34:43Z

This PR follows "Fix RF regression performance" #3845 and resolves "[FEA] Promote experimental RF backend to default for regression as well" #3520.
Makes the new backend the default for regression tasks. Now, for both classification and regression tasks, the experimental backend (new backend) is better than the old one 😀
Adds a deprecation warning when using the old backend in the C++ DecisionTree layer, so that the warning also shows up for the decision tree C++ API.
Sets the default n_bins to 128
Updates some docs
Updates some Python tests
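
As a rough illustration of the behavior described above (new backend by default, a deprecation warning when the old backend is requested, default n_bins of 128), here is a minimal Python sketch. The function `resolve_rf_settings`, its parameters, and the warning text are hypothetical and chosen only for illustration; they are not the cuML API, and the actual warning in this PR is emitted from the C++ DecisionTree layer.

```python
import warnings

# Default number of bins stated in this PR description.
DEFAULT_N_BINS = 128


def resolve_rf_settings(use_experimental_backend=True, n_bins=DEFAULT_N_BINS):
    """Hypothetical helper (not part of cuML): pick a backend, warn on the old one."""
    if not use_experimental_backend:
        # Assumed wording; the real message lives in the C++ layer.
        warnings.warn(
            "The old backend is deprecated; the experimental backend is now the default.",
            FutureWarning,
        )
        backend = "old"
    else:
        backend = "experimental"
    return {"backend": backend, "n_bins": n_bins}
```

Calling `resolve_rf_settings()` with no arguments returns the new defaults silently, while `resolve_rf_settings(use_experimental_backend=False)` still works but raises a `FutureWarning`.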

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

teju85

Minor nitpicks. Overall LGTM.

cpp/src/decisiontree/decisiontree_impl.cuh

python/cuml/ensemble/randomforestclassifier.pyx

python/cuml/ensemble/randomforestregressor.pyx

teju85 · 2021-05-18T16:45:26Z

This should only be merged after #3845!

Co-authored-by: Thejaswi. N. S <rao.thejaswi@gmail.com>

RAMitchell

This should get removed:

cuml/cpp/src/decisiontree/decisiontree_impl.cuh

Line 426 in c1ce535

CUML_LOG_WARN("Using experimental backend for growing trees\n");

venkywonka · 2021-05-19T03:38:26Z

This should get removed:

cuml/cpp/src/decisiontree/decisiontree_impl.cuh

Line 426 in c1ce535

CUML_LOG_WARN("Using experimental backend for growing trees\n");

I thought so too, @RAMitchell, but wanted confirmation since that output was being tested; will remove it 👍

teju85

Changes LGTM.

venkywonka · 2021-05-28T18:16:46Z

@venkywonka The dask issues in CI seem to have cleared up, but now the failures seem relevant to the PR:
=========================== short test summary info ============================
FAILED cuml/test/test_api.py::test_fit_function[RandomForestClassifier] - Val...
FAILED cuml/test/test_pickle.py::test_small_rf[10-20-100-RandomForestClassifier-float32]
FAILED cuml/test/test_random_forest.py::test_rf_regression[special_reg0-1-log2-True-100-float32-1.0]
FAILED cuml/test/test_random_forest.py::test_rf_regression[special_reg0-1-sqrt-True-100-float32-1.0] 
Yes @dantegd, there are some unforeseen changes caused by changing the default to n_bins=128, and others that are still mysterious. Digging into it.

It seems it was a combination of two causes: one is the reason quoted above, and the second is detailed in issue #3910. My latest commit closes #3910.

dantegd · 2021-05-28T22:32:31Z

rerun tests

dantegd · 2021-05-29T12:51:53Z

rerun tests

dantegd · 2021-05-29T17:17:36Z

It seems like all the test jobs timed out in the same test (Jenkins UI doesn’t show it):

cuml/test/test_random_forest.py::test_rf_regression[special_reg0-1-log2-True-100-float32-1.0] 
Build timed out (after 40 minutes). Marking the build as failed.

venkywonka · 2021-05-30T17:09:40Z

It seems like all the test jobs timed out in the same test (Jenkins UI doesn’t show it):
cuml/test/test_random_forest.py::test_rf_regression[special_reg0-1-log2-True-100-float32-1.0] 
Build timed out (after 40 minutes). Marking the build as failed.

I added more tests for the experimental backend in this PR, which exposed an unusual problem: setting n_bins=100 seems to hang the regressor (but n_bins=128 does not)! This happens not only in this PR but also in the branch-21.06 build.
I will create an issue shortly and dig in deeper.
Fixing that issue is beyond the scope of this PR (the issue has nothing to do with this PR; this PR just happened to expose it).
Hence, for now, I have set n_bins=128 in the tests so that they do not hang.

dantegd · 2021-06-01T14:34:35Z

@venkywonka do we have a list of values that could make the tests hang?

venkywonka · 2021-06-01T16:12:28Z

@venkywonka do we have a list of values that could make the tests hang?

Thanks to @vinaydes, we discovered the bug; it was a regression in regression (again) (sorry).
It is detailed in issue #3919 and has to do with some buggy code that's on me!
TL;DR: for all values of n_bins > 64 with n_bins % 64 != 0, this will hang because of a deadlock caused by this line.

I have patched the fix in PR #3921, and that should resolve this deadly bug.
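
The condition in the TL;DR can be written as a small predicate. This is only a sketch of the rule as stated in this comment; `n_bins_triggers_hang` is a hypothetical helper, not code from cuML, and the actual fix is the one in #3921.

```python
def n_bins_triggers_hang(n_bins):
    """Rule as stated in the comment above: n_bins > 64 and not a multiple of 64."""
    return n_bins > 64 and n_bins % 64 != 0


# n_bins values that appear in this thread and in the test parameters.
for n_bins in (11, 16, 17, 32, 100, 128):
    print(n_bins, n_bins_triggers_hang(n_bins))
```

Under this rule, n_bins=100 is flagged and n_bins=128 is not, which matches what was observed in CI.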

dantegd · 2021-06-01T19:23:10Z

python/cuml/test/test_random_forest.py

@@ -217,10 +221,14 @@ def test_rf_classification(small_clf, datatype, split_algo,
     (1, 'sqrt', False, 100),
     (1, 1.0, True, 17),
     (1, 1.0, True, 32),
+     (0, 1.0, True, 16),
+     (1, 1.0, True, 11),
+     (0, 'auto', True, 128),


Fantastic news on solving the bug, @venkywonka (and @vinaydes). I think we should just add some test cases like n_bins=100 back for better test coverage, and then we're good to merge.

yep will add them right away!

vinaydes · 2021-06-02T02:08:06Z

python/cuml/test/test_random_forest.py

@@ -341,7 +366,7 @@ def test_rf_classification_float64(small_clf, datatype, convert_dtype):
        fil_preds = np.reshape(fil_preds, np.shape(cu_preds))

        fil_acc = accuracy_score(y_test, fil_preds)
-        assert fil_acc >= (cu_acc - 0.02)
+        assert fil_acc >= (cu_acc - 0.07)


Maybe add a comment here as a reminder to restore the old tolerance value later.

sure will do
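
One way to keep the relaxed tolerance visible is to name it and attach the reminder to the constant. A minimal sketch, assuming the hypothetical names `FIL_ACC_TOLERANCE` and `fil_accuracy_ok`; only the values 0.02 and 0.07 come from the diff above.

```python
# Reminder: temporarily relaxed from 0.02; restore the old tolerance later.
FIL_ACC_TOLERANCE = 0.07


def fil_accuracy_ok(fil_acc, cu_acc, tolerance=FIL_ACC_TOLERANCE):
    """Return True if the FIL accuracy is at most `tolerance` below the cuML accuracy."""
    return fil_acc >= (cu_acc - tolerance)
```

With a single named constant, restoring the stricter check later is a one-line change rather than a search through the assertions.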

dantegd · 2021-06-02T04:08:17Z

@gpucibot merge

codecov-commenter · 2021-06-02T06:10:13Z

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.06@b30d527).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-21.06    #3872   +/-   ##
===============================================
  Coverage                ?   77.30%           
===============================================
  Files                   ?      215           
  Lines                   ?    17042           
  Branches                ?        0           
===============================================
  Hits                    ?    13174           
  Misses                  ?     3868           
  Partials                ?        0

Flag	Coverage Δ
non-dask	`77.30% <0.00%> (?)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b30d527...20af360.
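
The figures in the report are consistent with each other, which a few lines of arithmetic can confirm. This is a small sketch that uses only the numbers shown above; the variable names are ad hoc.

```python
hits = 13174
misses = 3868
lines = 17042

# Every tracked line is either hit or missed (partials are 0 in this report).
assert hits + misses == lines

coverage_pct = 100 * hits / lines
print(f"{coverage_pct:.2f}%")  # prints 77.30%
```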

…ate old-backend. (rapidsai#3872)

* This PR follows rapidsai#3845 and resolves rapidsai#3520
* Makes new-backend default for regression tasks. Now, for both classification and regression tasks, experimental-backend (new-backend) is better than old 😀
* Adds deprecation warning when using old-backend in the C++ DecisionTree layer so that the warning reflects for the decision trees C++ API too.
* Sets default `n_bins` to 128
* Some docs update
* Some python tests update

Authors:
  - Venkat (https://github.com/venkywonka)
  - Rory Mitchell (https://github.com/RAMitchell)

Approvers:
  - Rory Mitchell (https://github.com/RAMitchell)
  - Thejaswi. N. S (https://github.com/teju85)
  - Philip Hyunsu Cho (https://github.com/hcho3)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#3872

RAMitchell and others added 8 commits May 9, 2021 21:32

Fix MSE perf

c9dc9f6

Disable MAE

3725f8e

Update cpp/src/decisiontree/decisiontree.cu

496c17f

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

Lint

4fa2394

Merge branch 'mae' of github.com:RAMitchell/cuml into mae

5aeabb3

deprecate and default to new backend

620bc2f

Delete build symlink

3ac289f

Delete raft symlink

9c03707

teju85 requested changes May 18, 2021

cpp/src/decisiontree/decisiontree_impl.cuh (outdated, resolved)

python/cuml/ensemble/randomforestclassifier.pyx (outdated, resolved)

python/cuml/ensemble/randomforestregressor.pyx (outdated, resolved)

teju85 added the breaking (Breaking change) and improvement (Improvement / enhancement to an existing function) labels May 18, 2021

dantegd added the 0 - Blocked (Cannot progress due to external reasons) label May 18, 2021

teju85 added the 5 - Merge After Dependencies (Depends on another PR: do not merge out of order) label May 18, 2021

change to calendar-versions

c1ce535

Co-authored-by: Thejaswi. N. S <rao.thejaswi@gmail.com>

RAMitchell approved these changes May 18, 2021

Disable mae in gtests

e3d64ae

venkywonka added 2 commits May 19, 2021 00:55

Merge branch 'mae' of https://github.com/RAMitchell/cuml into enh-ext-rf-relegate-old-backend

89ba638

remove warning, modify tests

eca21f9

github-actions bot added the CUDA/C++ and Cython / Python (Cython or Python issue) labels May 19, 2021

dantegd marked this pull request as ready for review May 27, 2021 16:17

dantegd requested review from a team as code owners May 27, 2021 16:17

hcho3 self-requested a review May 27, 2021 16:19

fixing conflicts

23ef24d

teju85 approved these changes May 27, 2021

venkywonka added 2 commits May 28, 2021 08:58

clang format fix

75de200

flake fix

d03acde

venkywonka added 2 commits May 28, 2021 23:00

explicit for small datasets

6611867

reflect tolerance increase due to accuracy increase

7131453

hcho3 approved these changes May 28, 2021

dantegd added this to PR-WIP in v21.06 Release via automation May 28, 2021

venkywonka added 2 commits May 29, 2021 10:07

stricter tolerance between fil and skl

863a073

Revert "stricter tolerance between fil and skl"

ab4bf3e

This reverts commit 863a073.

changing warning and keeping n_bins to 128 in tests

21cbe0e

v21.06 Release automation moved this from PR-WIP to PR-Needs review Jun 1, 2021

dantegd requested changes Jun 1, 2021

JohnZed reviewed Jun 1, 2021

python/cuml/test/test_random_forest.py (resolved)

dantegd added the 4 - Waiting on Author (Waiting for author to respond to review) label and removed the 0 - Blocked (Cannot progress due to external reasons) and 5 - Merge After Dependencies (Depends on another PR: do not merge out of order) labels Jun 1, 2021

vinaydes reviewed Jun 2, 2021

venkywonka added 2 commits June 2, 2021 09:32

add tests and comments

e6387b4

Merge branch 'branch-21.06' of https://github.com/rapidsai/cuml into enh-ext-rf-relegate-old-backend

20af360

dantegd approved these changes Jun 2, 2021

v21.06 Release automation moved this from PR-Needs review to PR-Reviewer approved Jun 2, 2021

rapids-bot bot merged commit fc23461 into rapidsai:branch-21.06 Jun 2, 2021

v21.06 Release automation moved this from PR-Reviewer approved to Done Jun 2, 2021
