Add Categorical Naive Bayes #4150

lowener · 2021-08-06T17:43:23Z

This is a continuation of PRs #1763, #4053, and #4079, to add Categorical Naive Bayes.
It should be merged after #4079.
Linking issue #1666.

lowener · 2021-08-16T13:56:08Z

Here is a comparison of cuML and scikit-learn performance on Categorical NB.
The benchmark uses a synthetic dataset generated by make_classification with 4 classes.
The GPU is an RTX 8000, and the CPU is an i9-10920X @ 3.50GHz.

cjnolet · 2021-08-25T23:34:10Z

@lowener can you include the steps taken for the timings in your benchmark chart? I'm mostly interested in whether these timings are only for training or if they also include the likelihoods.

cjnolet

Your implementation looks great overall. This algorithm is immensely popular on sparse inputs / bigraphs, though, so we should strive to support sparse inputs and, as a result, assume a significantly large upper bound on the number of features.

python/cuml/naive_bayes/naive_bayes.py

lowener · 2021-09-07T13:19:39Z

I added support for sparse inputs and removed the loops over n_features.
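To illustrate what removing the loop over n_features can look like, here is a minimal NumPy sketch, not the code from this PR. It covers the dense case only, and the helper names `category_counts` and `feature_log_prob` are hypothetical. It assumes every feature takes integer values in `0..n_categories-1` and uses a single scatter-add to build the (class, feature, category) count tensor.

```python
import numpy as np


def category_counts(X, y, n_classes, n_categories):
    """Count each (class, feature, category) triple with no Python loop over features."""
    n_samples, n_features = X.shape
    counts = np.zeros((n_classes, n_features, n_categories), dtype=np.int64)
    # Broadcast class labels and feature indices to the shape of X.
    cls = np.broadcast_to(y[:, None], X.shape)
    feat = np.broadcast_to(np.arange(n_features)[None, :], X.shape)
    # np.add.at is an unbuffered scatter-add, so repeated indices accumulate.
    np.add.at(counts, (cls, feat, X), 1)
    return counts


def feature_log_prob(counts, alpha=1.0):
    """Laplace-smoothed log P(category | class, feature)."""
    smoothed = counts + alpha
    return np.log(smoothed) - np.log(smoothed.sum(axis=2, keepdims=True))


X = np.array([[0, 1], [1, 1], [0, 2], [1, 0]])
y = np.array([0, 0, 1, 1])
counts = category_counts(X, y, n_classes=2, n_categories=3)
log_prob = feature_log_prob(counts)
```

A real implementation would also have to handle a different number of categories per feature and sparse inputs, which this sketch leaves out.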

For the benchmark previously posted I was only timing the fit() operation. Now the timing also includes the predict():

CategoricalNB().fit(X, Y).predict(X)

And we can see that removing the loop over n_features greatly improved performance.
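For reference, the timing described above could be reproduced along these lines. This is a sketch under assumptions, not the script used for the chart: `time_fit_predict` is a hypothetical helper, and `TinyCategoricalNB` is a pure-Python stand-in so the snippet runs without a GPU. In practice one would pass `cuml.naive_bayes.CategoricalNB` or `sklearn.naive_bayes.CategoricalNB` as `make_model` instead.

```python
import math
import time
from collections import Counter, defaultdict


class TinyCategoricalNB:
    """Pure-Python stand-in (hypothetical) with the same fit/predict shape."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.class_count_ = Counter(y)
        n_features = len(X[0])
        # Categories per feature, assuming values are integers 0..max.
        self.n_categories_ = [max(row[j] for row in X) + 1 for j in range(n_features)]
        self.counts_ = defaultdict(int)  # (class, feature, value) -> count
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts_[(label, j, v)] += 1
        return self

    def _joint_log_likelihood(self, row, c):
        total = sum(self.class_count_.values())
        score = math.log(self.class_count_[c] / total)
        for j, v in enumerate(row):
            num = self.counts_.get((c, j, v), 0) + self.alpha
            den = self.class_count_[c] + self.alpha * self.n_categories_[j]
            score += math.log(num / den)
        return score

    def predict(self, X):
        return [max(self.classes_, key=lambda c: self._joint_log_likelihood(row, c))
                for row in X]


def time_fit_predict(make_model, X, y, repeats=3):
    """Return the best wall-clock time of fit(X, y).predict(X) and the last prediction."""
    best = float("inf")
    pred = None
    for _ in range(repeats):
        start = time.perf_counter()
        pred = make_model().fit(X, y).predict(X)
        best = min(best, time.perf_counter() - start)
    return best, pred


X = [[0, 0], [0, 1], [1, 2], [1, 2]]
y = [0, 0, 1, 1]
elapsed, pred = time_fit_predict(TinyCategoricalNB, X, y)
```

When timing a GPU estimator, it is probably worth making sure the device has finished its work before stopping the clock, since kernel launches can be asynchronous.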

cjnolet

The changes look really good and the benchmarks are super impressive. Just one little cleanup opportunity remains.

python/cuml/naive_bayes/naive_bayes.py

codecov-commenter · 2021-09-08T15:48:08Z

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@0e770fa).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-21.10    #4150   +/-   ##
===============================================
  Coverage                ?   86.07%           
===============================================
  Files                   ?      231           
  Lines                   ?    18637           
  Branches                ?        0           
===============================================
  Hits                    ?    16042           
  Misses                  ?     2595           
  Partials                ?        0

Flag	Coverage Δ
dask	`47.05% <0.00%> (?)`
non-dask	`78.73% <0.00%> (?)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0e770fa...af27bfa.

cjnolet

LGTM

cjnolet · 2021-09-08T19:12:58Z

@gpucibot merge

This is a continuation of PR rapidsai#1763, rapidsai#4053, and rapidsai#4079, to add Categorical Naive Bayes. This is supposed to be merged after rapidsai#4079. Linking issue rapidsai#1666.

Authors:
- Micka (https://github.com/lowener)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#4150

lowener added 2 commits August 6, 2021 03:39

Add Categorical NB base

2b138e3

Add CNB doc to API

854a640

github-actions bot added the Cython / Python Cython or Python issue label Aug 6, 2021

lowener added feature request New feature or request non-breaking Non-breaking change labels Aug 6, 2021

lowener added 2 commits August 10, 2021 12:22

Merge branch 'branch-21.10' into 21.10-categorical-nb

9b4940f

Add raw-kernel for Categorical Naive Bayes

5d5bf37

caryr35 added this to PR-WIP in v21.10 Release via automation Aug 10, 2021

lowener added 2 commits August 12, 2021 18:01

Add test on categorical parameters

7c14ae7

Fix style

750e3d7

lowener marked this pull request as ready for review August 12, 2021 16:13

lowener requested a review from a team as a code owner August 12, 2021 16:13

lowener added 2 commits August 16, 2021 15:44

Update doc for Categorical NB

35bef79

Update base param names for Categorical NB

2b54db5

lowener added the 3 - Ready for Review Ready for review by team label Aug 16, 2021

v21.10 Release automation moved this from PR-WIP to PR-Needs review Aug 25, 2021

cjnolet requested changes Aug 25, 2021

lowener added 4 commits August 30, 2021 16:19

Merge branch 'branch-21.10' into 21.10-categorical-nb

95831e2

Merge branch 'branch-21.10' into 21.10-categorical-nb

5646416

Merge branch 'branch-21.10' into 21.10-categorical-nb

2bf5d70

Remove host loop on fit and jll of cnb

7643508

lowener mentioned this pull request Sep 5, 2021

cuML Documentation Example Errors #4188

Open

lowener added 4 commits September 6, 2021 20:34

Almost working sparse cnb

bbd9787

Working version of sparse categoricalNB

aad21e3

Fix coding style and comments

89bdb07

Fix comments

209e3a4

lowener requested a review from cjnolet September 7, 2021 13:19

lowener added 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 3 - Ready for Review Ready for review by team labels Sep 7, 2021

cjnolet requested changes Sep 7, 2021

Use cp Elementwise Kernel

af27bfa

lowener requested a review from cjnolet September 8, 2021 10:27

v21.10 Release automation moved this from PR-Needs review to PR-Reviewer approved Sep 8, 2021

cjnolet approved these changes Sep 8, 2021

rapids-bot bot merged commit 496ddf0 into rapidsai:branch-21.10 Sep 8, 2021

v21.10 Release automation moved this from PR-Reviewer approved to Done Sep 8, 2021

lowener deleted the 21.10-categorical-nb branch September 17, 2021 16:10

lowener mentioned this pull request Sep 20, 2021

[BUG] test_gaussian_partial_fit pytest requires an update #4180

Closed
