[REVIEW] Enable covariance/correlation in dask_cudf #3393
Codecov Report
@@ Coverage Diff @@
## branch-0.11 #3393 +/- ##
==============================================
+ Coverage 87.3% 87.3% +<.01%
==============================================
Files 49 49
Lines 9214 9218 +4
==============================================
+ Hits 8044 8048 +4
Misses 1170 1170
With dask/dask#5597 merged in, I think this is good
Closes #3363
In order to use the upstream dask covariance/correlation implementation, a few changes are needed in cudf/dask_cudf:

1. `concat_cudf` must support `axis=1` (already supported in `cudf`, so this is a trivial change)
2. `cov` method

This PR addresses both of these changes. However, (2) is only addressed with a cupy-based implementation (which does not handle dataframes with null values).
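
To illustrate the null-value caveat, here is a minimal sketch of an array-based covariance, not the code in this PR. It assumes NumPy as a CPU stand-in for cupy (`cupy.cov` mirrors `numpy.cov`), and the helper `cov_dense` and its dict-of-columns input are hypothetical. NaN is used to mimic a missing value; cudf tracks nulls with a separate mask, which a plain array computation does not see.

```python
import numpy as np  # assumption: CPU stand-in for cupy, whose cov mirrors numpy.cov


def cov_dense(columns, xp=np):
    """Covariance matrix of equal-length, null-free columns.

    `columns` is a hypothetical dict mapping column names to sequences;
    `xp` is the array module (numpy here, cupy on a GPU).
    """
    names = list(columns)
    # Stack columns as rows: cov treats each row as one variable by default.
    data = xp.asarray([columns[name] for name in names], dtype="float64")
    return names, xp.cov(data)


# Null-free input gives the usual sample covariance (ddof=1).
names, cov = cov_dense({"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 4.0, 6.0, 8.0]})

# A missing value (NaN here) propagates into every entry involving column "a".
_, cov_nan = cov_dense(
    {"a": [1.0, float("nan"), 3.0, 4.0], "b": [2.0, 4.0, 6.0, 8.0]}
)
```

In the second call only the `b`/`b` entry stays finite, which is why an implementation of this kind needs a separate null-aware path before it can handle dataframes with missing values.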
TODO Before Merging: