Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the new RF backend by default for classification #3686

Merged
merged 14 commits into from
Apr 7, 2021

Conversation

hcho3
Copy link
Contributor

@hcho3 hcho3 commented Mar 31, 2021

The new RF backend now significantly outperforms the old backend in performance, while retaining a similar model accuracy.

The default is only changed for the classifier, as the performance is still not good for the regression task.

gbm-bench-depth30

@hcho3 hcho3 requested a review from a team as a code owner March 31, 2021 23:01
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Mar 31, 2021
@hcho3 hcho3 added breaking Breaking change improvement Improvement / enhancement to an existing function 2 - In Progress Currenty a work in progress labels Mar 31, 2021
@vinaydes
Copy link
Contributor

vinaydes commented Apr 2, 2021

When we flip the switch on default backend, we should change it in all unit tests, benchmark code in bench/ and elsewhere as well.

@hcho3
Copy link
Contributor Author

hcho3 commented Apr 2, 2021

Blocked by #3706

@hcho3 hcho3 requested a review from a team as a code owner April 7, 2021 05:07
@hcho3 hcho3 added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currenty a work in progress labels Apr 7, 2021
Copy link
Member

@teju85 teju85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

@vinaydes
Copy link
Contributor

vinaydes commented Apr 7, 2021

Now that experimental backend is default, should we change this warning

CUML_LOG_WARN("Using experimental backend for growing trees\n");
?

We should probably warn user when it goes back to old backend due to some limitation (quantile per tree or histogram approximation).

@JohnZed JohnZed added this to PR-WIP in v0.19 Release via automation Apr 7, 2021
@hcho3
Copy link
Contributor Author

hcho3 commented Apr 7, 2021

@vinaydes I fixed all the failing tests (thanks for your suggestion). As for the warning, we should still keep it, since we are changing the default only for the classification.

v0.19 Release automation moved this from PR-WIP to PR-Needs review Apr 7, 2021
Copy link
Contributor

@JohnZed JohnZed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a tiny comment change suggested

v0.19 Release automation moved this from PR-Needs review to PR-Reviewer approved Apr 7, 2021
@JohnZed
Copy link
Contributor

JohnZed commented Apr 7, 2021

Looks great, @hcho3 ! I made a tiny change to the comment string but looks great overall.

@JohnZed
Copy link
Contributor

JohnZed commented Apr 7, 2021

@gpucibot merge

@JohnZed
Copy link
Contributor

JohnZed commented Apr 7, 2021

@gpucibot merge

@hcho3
Copy link
Contributor Author

hcho3 commented Apr 7, 2021

rerun tests

@codecov-io
Copy link

Codecov Report

Merging #3686 (8df1942) into branch-0.19 (fd9ec89) will increase coverage by 5.24%.
The diff coverage is n/a.

❗ Current head 8df1942 differs from pull request most recent head 8b39957. Consider uploading reports for the commit 8b39957 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.19    #3686      +/-   ##
===============================================
+ Coverage        80.70%   85.95%   +5.24%     
===============================================
  Files              227      225       -2     
  Lines            17615    16987     -628     
===============================================
+ Hits             14217    14601     +384     
+ Misses            3398     2386    -1012     
Flag Coverage Δ
dask 48.96% <ø> (+3.97%) ⬆️
non-dask 77.79% <ø> (+4.88%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
python/cuml/__init__.py 95.58% <ø> (+0.20%) ⬆️
...cuml/_thirdparty/sklearn/preprocessing/__init__.py 100.00% <ø> (ø)
...on/cuml/_thirdparty/sklearn/preprocessing/_data.py 67.22% <ø> (+4.10%) ⬆️
...hirdparty/sklearn/preprocessing/_discretization.py 83.33% <ø> (-0.88%) ⬇️
...l/_thirdparty/sklearn/preprocessing/_imputation.py 85.54% <ø> (+22.74%) ⬆️
python/cuml/_thirdparty/sklearn/utils/extmath.py 97.22% <ø> (+40.32%) ⬆️
...cuml/_thirdparty/sklearn/utils/skl_dependencies.py 79.54% <ø> (+25.62%) ⬆️
...ython/cuml/_thirdparty/sklearn/utils/validation.py 64.10% <ø> (+41.65%) ⬆️
python/cuml/cluster/__init__.py 100.00% <ø> (ø)
python/cuml/cluster/agglomerative.pyx 96.47% <ø> (ø)
... and 163 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f40344...8b39957. Read the comment docs.

@rapids-bot rapids-bot bot merged commit 6729fb0 into rapidsai:branch-0.19 Apr 7, 2021
v0.19 Release automation moved this from PR-Reviewer approved to Done Apr 7, 2021
@hcho3 hcho3 deleted the use_new_backend_for_classifier branch April 7, 2021 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team breaking Breaking change Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function
Projects
No open projects
v0.19 Release
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

5 participants