Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Provide workaround for cupy.percentile bug #3315

Merged

Conversation

wphicks
Copy link
Contributor

@wphicks wphicks commented Dec 17, 2020

Ensure that the 100th quantile value returned by cupy.percentile is the maximum of the input array rather than (possibly) NaN due to cupy/cupy#4451. This eliminates an intermittent failure observed in tests of KBinsDiscretizer, which makes use of cupy.percentile. Note that this includes an alteration of the included sklearn code and should be reverted once the upstream cupy issue is resolved.

Resolve failure due to ValueError described in #2933.

Ensure that the 100th quantile value returned by cupy.percentile is the
maximum of the input array rather than (possibly) NaN due to
cupy/cupy#4451
@wphicks wphicks requested a review from a team as a code owner December 17, 2020 01:30
@wphicks wphicks added bug Something isn't working non-breaking Non-breaking change labels Dec 17, 2020
@wphicks
Copy link
Contributor Author

wphicks commented Dec 17, 2020

An open question for this PR is whether we should pull off the xfail on that KBinsDiscretizer test. I'm confident that the ValueError issue has been eliminated, but #2933 describes 2 other sporadic errors. I'll run 100K cycles tonight to try to see if those other errors are still reproducible. If not, I'd lean toward pulling off the xfail in this PR and then just keeping an eye out for those errors in CI, but I'm open to other opinions on that.

@codecov-io
Copy link

codecov-io commented Dec 17, 2020

Codecov Report

Merging #3315 (ee8c1ef) into branch-0.18 (2e4388d) will decrease coverage by 0.03%.
The diff coverage is 92.88%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.18    #3315      +/-   ##
===============================================
- Coverage        71.45%   71.42%   -0.04%     
===============================================
  Files              205      207       +2     
  Lines            16594    16741     +147     
===============================================
+ Hits             11858    11958     +100     
- Misses            4736     4783      +47     
Impacted Files Coverage Δ
python/cuml/neighbors/__init__.py 100.00% <ø> (ø)
python/cuml/neighbors/ann.pxd 87.05% <87.05%> (ø)
python/cuml/common/sparsefuncs.py 91.66% <88.23%> (-2.34%) ⬇️
python/cuml/linear_model/base.pyx 97.36% <97.36%> (ø)
python/cuml/neighbors/nearest_neighbors.pyx 92.36% <98.24%> (+1.90%) ⬆️
...hirdparty/sklearn/preprocessing/_discretization.py 84.21% <100.00%> (+0.11%) ⬆️
python/cuml/linear_model/elastic_net.pyx 88.00% <100.00%> (-0.89%) ⬇️
python/cuml/linear_model/lasso.pyx 90.47% <100.00%> (-0.83%) ⬇️
python/cuml/linear_model/linear_regression.pyx 87.95% <100.00%> (-1.68%) ⬇️
python/cuml/linear_model/ridge.pyx 87.09% <100.00%> (-1.70%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34efaf8...ee8c1ef. Read the comment docs.

@wphicks
Copy link
Contributor Author

wphicks commented Dec 17, 2020

My 100K run errored out because of how pytest-repeat gathers tests, but a 10K run passed without issue, and (conservatively) I've probably run this about 500K times since I last saw the other errors reported in #2933. I'm going to pull off the xfail in this PR unless reviewers talk me out of it, and we can see if anything turns up in CI down the line. My prejudice is that it's better to have the test in place and running some validation of KBinsDiscretizer, even if we're not absolutely certain that the other errors are gone.

@wphicks wphicks changed the title Provide workaround for cupy.percentile bug [REVIEW] Provide workaround for cupy.percentile bug Dec 17, 2020
Copy link
Member

@dantegd dantegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just one small comment

@dantegd dantegd added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Dec 17, 2020
Copy link
Contributor

@viclafargue viclafargue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for this in-depth investigation! Hope we will soon have a fix in cuPy.

@dantegd dantegd added 6 - Okay to Auto-Merge and removed 5 - Ready to Merge Testing and reviews complete, ready to merge labels Dec 17, 2020
@rapids-bot rapids-bot bot merged commit 550121b into rapidsai:branch-0.18 Dec 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants