
Enable warp-per-tree inference in FIL for regression and binary classification #3760

Merged
40 commits merged into rapidsai:branch-21.08 on Jun 14, 2021

Conversation

levsnv (Contributor) commented Apr 17, 2021

In FIL, this enables using multiple threads to infer a single tree on multiple rows (fewer than, equal to, or more than a warp, though often close to a full warp). This greatly speeds up inference on the first ~6 levels of each tree, which can be significant even for fairly deep models: e.g., random forests with a max depth of 40 may have an average depth of 13, barely double the highly accelerated portion.

This PR is marked breaking because it adds two new mandatory C API parameters: threads_per_tree and n_items.
The Python API is not breaking because it defaults to threads_per_tree=1 and n_items=0, which I am currently verifying causes no performance regressions.
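As a rough illustration only (this is not the actual FIL CUDA kernel, and the exact lane layout here is an assumption), threads_per_tree can be pictured as partitioning a warp's 32 lanes into groups that each cooperate on one tree, with every lane in a group handling a different row:

```python
WARP_SIZE = 32  # lanes per CUDA warp

def lane_assignment(threads_per_tree):
    """Illustrative mapping of warp lanes to (tree, row) work items.

    With threads_per_tree == 1, every lane infers a different tree on the
    same row (the classic one-thread-per-tree layout); with
    threads_per_tree == 32, all lanes cooperate on a single tree, each
    taking a different row from a 32-row tile.
    """
    assert WARP_SIZE % threads_per_tree == 0, "must evenly divide the warp"
    return [(lane // threads_per_tree,  # which tree this lane traverses
             lane % threads_per_tree)   # which row within the row tile
            for lane in range(WARP_SIZE)]

# threads_per_tree=8: lanes 0-7 share tree 0 across rows 0-7, lanes 8-15
# share tree 1, and so on.
print(lane_assignment(8)[:9])
```

Under this sketch, larger threads_per_tree values trade trees-per-warp for rows-per-tree, which is where the speedup on the upper tree levels comes from.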

@github-actions github-actions bot added CUDA/C++ Cython / Python Cython or Python issue labels Apr 17, 2021
@levsnv levsnv added breaking Breaking change CUDA / C++ CUDA issue Perf Related to runtime performance of the underlying code labels Apr 18, 2021
@levsnv levsnv added the 4 - Waiting on Author Waiting for author to respond to review label Apr 18, 2021

levsnv commented Apr 19, 2021

The Python tests expose a bug that did not appear in the C++ tests with similar parameters. I will investigate, but I expect the code will not change much as a result.

Review threads (resolved) on:
cpp/include/cuml/fil/fil.h
cpp/src/fil/fil.cu
cpp/src/fil/infer.cu
cpp/test/sg/fil_test.cu
python/cuml/fil/fil.pyx
@levsnv levsnv changed the title [WIP] Enable warp-per-tree inference in FIL [WIP] Enable warp-per-tree inference in FIL for regression and binary classification Apr 27, 2021
@levsnv levsnv changed the title [WIP] Enable warp-per-tree inference in FIL for regression and binary classification Enable warp-per-tree inference in FIL for regression and binary classification May 4, 2021
@levsnv levsnv requested a review from canonizer May 4, 2021 08:06
@levsnv levsnv added 3 - Ready for Review Ready for review by team and removed 4 - Waiting on Author Waiting for author to respond to review labels May 4, 2021
@levsnv levsnv marked this pull request as ready for review May 4, 2021 08:06
@levsnv levsnv requested review from a team as code owners May 4, 2021 08:06
@levsnv levsnv added the improvement Improvement / enhancement to an existing function label May 4, 2021

levsnv commented Jun 9, 2021

In test_base.py and test_pickle.py:
E ImportError: cannot import name 'make_meta' from 'dask.dataframe.core' (/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py)

v21.08 Release automation moved this from PR-Reviewer approved to PR-Needs review Jun 9, 2021
@github-actions github-actions bot added CMake gpuCI gpuCI issue labels Jun 9, 2021

levsnv commented Jun 9, 2021

Hopefully, by the time CI finishes, this will be a normal FIL-only PR again.

@levsnv levsnv requested a review from a team as a code owner June 9, 2021 03:00

levsnv commented Jun 9, 2021

This has #3941 merged in, which, in turn, was scheduled for gpucibot to merge.


levsnv commented Jun 10, 2021

Somehow, the build system changes did not vanish from "Files changed". Asked Dante :)

v21.08 Release automation moved this from PR-Needs review to PR-Reviewer approved Jun 10, 2021
@github-actions github-actions bot removed CMake gpuCI gpuCI issue labels Jun 10, 2021
@levsnv levsnv removed the request for review from a team June 10, 2021 23:29

dantegd commented Jun 14, 2021

rerun tests

@codecov-commenter

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@8fe1b05).
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.08    #3760   +/-   ##
===============================================
  Coverage                ?   85.32%           
===============================================
  Files                   ?      230           
  Lines                   ?    18095           
  Branches                ?        0           
===============================================
  Hits                    ?    15439           
  Misses                  ?     2656           
  Partials                ?        0           
Flag Coverage Δ
dask 47.90% <0.00%> (?)
non-dask 77.67% <0.00%> (?)

Flags with carried forward coverage won't be shown.


Continue to review the full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8fe1b05...cec71e5. Read the comment docs.


dantegd commented Jun 14, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit bcb7b6c into rapidsai:branch-21.08 Jun 14, 2021
v21.08 Release automation moved this from PR-Reviewer approved to Done Jun 14, 2021
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
…ification (rapidsai#3760)

In FIL, this enables using multiple threads to infer a single tree on multiple rows (fewer than, equal to, or more than a warp, though often close to a full warp). This greatly speeds up inference on the first ~6 levels of each tree, which can be significant even for fairly deep models: e.g., random forests with a max depth of 40 may have an average depth of 13, barely double the highly accelerated portion.

This PR is marked breaking because it adds two new mandatory C API parameters: `threads_per_tree` and `n_items`.
The Python API is not breaking because it defaults to `threads_per_tree=1` and `n_items=0`, which I am currently verifying causes no performance regressions.

Authors:
  - https://github.com/levsnv
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Andy Adinets (https://github.com/canonizer)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#3760
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge), breaking (Breaking change), CUDA / C++ (CUDA issue), CUDA/C++, Cython / Python (Cython or Python issue), feature request (New feature or request), Perf (Related to runtime performance of the underlying code)