Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Gaussian Naive Bayes #4079

Merged
merged 30 commits into from
Aug 9, 2021
Merged

Conversation

lowener
Copy link
Contributor

@lowener lowener commented Jul 21, 2021

This is a continuation of PR #1763 and #4053, to add Gaussian Naive Bayes.
This is supposed to be merged after #4053

Here is a comparison of cuML and SKLearn performance on Gaussian NB.
This is done using a synthetic dataset generated by make_regression.
The GPU used is a RTX 8000, and the CPU is i9-10920X @ 3.50GHz
gaussian

Linking issue #1666

cjnolet and others added 25 commits February 19, 2020 10:47
Conflicts:
	python/cuml/naive_bayes/naive_bayes.py
…when fitting all classes at once instead of one at a time.
@lowener lowener requested a review from a team as a code owner July 21, 2021 23:31
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Jul 21, 2021
@lowener lowener added feature request New feature or request non-breaking Non-breaking change labels Jul 21, 2021
@caryr35 caryr35 added this to PR-WIP in v21.10 Release via automation Jul 22, 2021
@caryr35 caryr35 moved this from PR-WIP to PR-Needs review in v21.10 Release Jul 22, 2021
@caryr35 caryr35 removed this from PR-Needs review in v21.10 Release Jul 27, 2021
@caryr35 caryr35 added this to PR-WIP in v21.08 Release via automation Jul 27, 2021
@caryr35 caryr35 moved this from PR-WIP to PR-Needs review in v21.08 Release Jul 27, 2021
@dantegd dantegd added this to PR-WIP in v21.10 Release via automation Jul 27, 2021
@dantegd dantegd removed this from PR-Needs review in v21.08 Release Jul 27, 2021
v21.10 Release automation moved this from PR-WIP to PR-Needs review Jul 30, 2021
Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work. Only 1 small nitpick and I think this is ready to go.

Parameters
----------

X : {array-like, cupy sparse matrix} of shape (n_samples, n_features)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to make a mention here that the "optimal" input for the sparse matrix is COOrdinate format, otherwise it will be copied in order to convert to that format.

@cjnolet cjnolet changed the base branch from branch-21.08 to branch-21.10 July 30, 2021 19:34
@cjnolet
Copy link
Member

cjnolet commented Jul 30, 2021

I updated the target branch to 21.10 so we can see CI pass

@dantegd
Copy link
Member

dantegd commented Aug 3, 2021

rerun tests

@lowener
Copy link
Contributor Author

lowener commented Aug 4, 2021

CI failed on 3 tests of Nearest Neighbors due to timeouts:

  • cuml.test.test_nearest_neighbors.test_ivfpq_pred[8-512-4000-False-4-32-8]
  • cuml.test.test_nearest_neighbors.test_ivfpq_pred[8-512-4000-True-4-32-8]
  • cuml.test.test_nearest_neighbors.test_ivfpq_pred[8-512-4000-True-6-32-8]

@lowener
Copy link
Contributor Author

lowener commented Aug 4, 2021

rerun tests

@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@40af8af). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.10    #4079   +/-   ##
===============================================
  Coverage                ?   85.96%           
===============================================
  Files                   ?      232           
  Lines                   ?    18500           
  Branches                ?        0           
===============================================
  Hits                    ?    15904           
  Misses                  ?     2596           
  Partials                ?        0           
Flag Coverage Δ
dask 47.76% <0.00%> (?)
non-dask 78.56% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 40af8af...750779e. Read the comment docs.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

v21.10 Release automation moved this from PR-Needs review to PR-Reviewer approved Aug 9, 2021
@cjnolet
Copy link
Member

cjnolet commented Aug 9, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 3b439a0 into rapidsai:branch-21.10 Aug 9, 2021
v21.10 Release automation moved this from PR-Reviewer approved to Done Aug 9, 2021
@lowener lowener deleted the 21.08-gaussian-nb branch August 9, 2021 20:24
rapids-bot bot pushed a commit that referenced this pull request Sep 8, 2021
This is a continuation of PR #1763, #4053, and #4079, to add Categorical Naive Bayes.
This is supposed to be merged after #4079.
Linking issue #1666.

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #4150
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
This is a continuation of PR rapidsai#1763 and rapidsai#4053, to add Gaussian Naive Bayes.
This is supposed to be merged after rapidsai#4053 

Here is a comparison of cuML and SKLearn performance on Gaussian NB.
This is done using a synthetic dataset generated by make_regression.
The GPU used is a RTX 8000, and the CPU is i9-10920X @ 3.50GHz
![gaussian](https://user-images.githubusercontent.com/9810050/126572439-8982faa8-5ad1-4bca-91ab-76704050bf33.png)

Linking issue rapidsai#1666

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#4079
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
This is a continuation of PR rapidsai#1763, rapidsai#4053, and rapidsai#4079, to add Categorical Naive Bayes.
This is supposed to be merged after rapidsai#4079.
Linking issue rapidsai#1666.

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#4150
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cython / Python Cython or Python issue feature request New feature or request non-breaking Non-breaking change
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants