Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Fix for multi GPU PCA compute failing bug after transform and added error handling when n_components is not passed #3912

Merged
merged 3 commits into from
Jun 16, 2021

Conversation

akaanirban
Copy link
Contributor

@akaanirban akaanirban commented May 28, 2021

This PR fixes #3913 where Multi GPU PCA is failing on calling compute() after transform.

This fix is similar to what has been done in a previous PR #3320 . The current PR :

  1. Fixes the problem of AttributeError when calling compute() after transform() while using cuml.dask.decomposition.PCA
  2. Added setting of n_components=1 when n_components is not set in the multi GPU case. (The tests were failing if I don't set n_components, however, I did not find a place where n_component is set to 1. I may be wrong. PR Fix for default arguments of PCA #3320 seems to fix n_component=min(n_cols, n_rows as of 0.16 for single GPU. Not entirely sure if that should also be the case for MG as well. )
  3. Added test cases for the above two changes.

@akaanirban akaanirban requested a review from a team as a code owner May 28, 2021 20:47
@github-actions github-actions bot added the Cython / Python Cython or Python issue label May 28, 2021
@Nanthini10 Nanthini10 added bug Something isn't working non-breaking Non-breaking change labels May 28, 2021
@Nanthini10
Copy link
Contributor

Closes #3913

Copy link
Contributor

@lowener lowener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix it's a good catch. Regarding n_components, it was set to 1 in single GPU before 0.16, after it was decided to set this parameter to min(n_cols, n_rows) so we should try to stay consistent here and always set it by default to min(n_cols, n_rows).

python/cuml/decomposition/base_mg.pyx Outdated Show resolved Hide resolved
python/cuml/decomposition/base_mg.pyx Outdated Show resolved Hide resolved
Copy link
Contributor

@lowener lowener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@caryr35 caryr35 added this to PR-WIP in v21.08 Release via automation Jun 2, 2021
@Nanthini10
Copy link
Contributor

rerun tests

@akaanirban akaanirban changed the title Fix for multi GPU PCA compute failing bug after transform and added error handling when n_components is not passed [REVIEW] Fix for multi GPU PCA compute failing bug after transform and added error handling when n_components is not passed Jun 7, 2021
@lowener lowener added the 3 - Ready for Review Ready for review by team label Jun 9, 2021
@dantegd
Copy link
Member

dantegd commented Jun 11, 2021

rerun tests

1 similar comment
@akaanirban
Copy link
Contributor Author

rerun tests

@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@e1b7804). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.08    #3912   +/-   ##
===============================================
  Coverage                ?   85.32%           
===============================================
  Files                   ?      230           
  Lines                   ?    18096           
  Branches                ?        0           
===============================================
  Hits                    ?    15440           
  Misses                  ?     2656           
  Partials                ?        0           
Flag Coverage Δ
dask 48.06% <0.00%> (?)
non-dask 77.66% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1b7804...84eb007. Read the comment docs.

@Nanthini10 Nanthini10 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Jun 14, 2021
@dantegd
Copy link
Member

dantegd commented Jun 16, 2021

@gpucibot merge

Copy link
Member

@dantegd dantegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python code owner approval

v21.08 Release automation moved this from PR-WIP to PR-Reviewer approved Jun 16, 2021
@rapids-bot rapids-bot bot merged commit 87f8e90 into rapidsai:branch-21.08 Jun 16, 2021
v21.08 Release automation moved this from PR-Reviewer approved to Done Jun 16, 2021
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
…rror handling when n_components is not passed (rapidsai#3912)

This PR fixes rapidsai#3913 where Multi GPU PCA is failing on calling compute() after transform. 

This fix is similar to what has been done in a previous PR rapidsai#3320 . The current PR :

1. Fixes the problem of `AttributeError` when calling `compute()` after `transform()` while using `cuml.dask.decomposition.PCA`
2. Added setting of `n_components=1` when n_components is not set in the multi GPU case.  (The tests were failing if I don't set n_components, however, I did not find a place where n_component is set to 1. I may be wrong.  PR rapidsai#3320 seems to fix `n_component=min(n_cols, n_rows` as of 0.16 for single GPU. Not entirely sure if that should also be the case for MG as well. )
3. Added test cases for the above two changes.

Authors:
  - Anirban Das (https://github.com/akaanirban)

Approvers:
  - Micka (https://github.com/lowener)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#3912
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge bug Something isn't working Cython / Python Cython or Python issue non-breaking Non-breaking change
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

[BUG] Multi GPU PCA using dask is failing on call of compute after a transform
5 participants