[FEA] Series or DataFrame support for equivalent of numpy.isclose #5105

Closed
paul-tqh-nguyen opened this issue May 5, 2020 · 4 comments · Fixed by #5442
Labels: feature request (New feature or request), Python (Affects Python cuDF API)

Comments

@paul-tqh-nguyen

Is your feature request related to a problem? Please describe.

It's useful for testing to be able to perform the equivalent of numpy.isclose on instances of cudf.DataFrame and cudf.Series.

Describe the solution you'd like

This would be a nice thing to have:

>>> import cudf
>>> s1 = cudf.Series([1.9876543,   2.9876654,   3.9876543])
>>> s2 = cudf.Series([1.987654321, 2.987654321, 3.987654321])
>>> rel_tol=1e-5
>>> abs_tol=0.0
>>> s2.isclose(s1, rel_tol, abs_tol)
0    True
1    True
2    True
dtype: bool
>>> 

Describe alternatives you've considered

Here's my current hand-rolled solution:

>>> import cudf
>>> s1 = cudf.Series([1.9876543,   2.9876654,   3.9876543])
>>> s2 = cudf.Series([1.987654321, 2.987654321, 3.987654321])
>>> rel_tol=1e-5
>>> abs_tol=0.0
>>> s2.abs().mul(rel_tol).add(abs_tol).sub(s1.sub(s2).abs()).ge(0)
0    True
1    True
2    True
dtype: bool
>>> 

There's nothing wrong with using this approach pervasively. I figured it'd be more convenient to have isclose as a built-in method.

I'm happy to help in whatever way I can.
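For readers who want the tolerance check behind the hand-rolled chain spelled out, here is a minimal pure-Python sketch of the per-element test that numpy.isclose documents, |a - b| <= atol + rtol * |b|. The function name `isclose_elementwise` and its parameter names are hypothetical, chosen only for illustration; plain lists stand in for cudf.Series so the sketch runs without a GPU.

```python
def isclose_elementwise(values, reference, rel_tol=1e-5, abs_tol=0.0):
    """Return one bool per pair: |v - r| <= abs_tol + rel_tol * |r|.

    The tolerance scales with `reference` only, so the check is asymmetric,
    as in numpy.isclose.
    """
    return [
        abs(v - r) <= abs_tol + rel_tol * abs(r)
        for v, r in zip(values, reference)
    ]


# Same sample data as in the issue, as plain lists (stand-ins for Series).
s1 = [1.9876543, 2.9876654, 3.9876543]
s2 = [1.987654321, 2.987654321, 3.987654321]

result = isclose_elementwise(s1, s2, rel_tol=1e-5, abs_tol=0.0)
print(result)  # [True, True, True]

# The non-strict <= matters when both values are exactly zero and abs_tol is 0.
print(isclose_elementwise([0.0], [0.0]))  # [True]
```

Because the relative tolerance is taken from the second argument, swapping the two inputs can change the result for values that sit right at the tolerance boundary.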

paul-tqh-nguyen added the Needs Triage and feature request labels on May 5, 2020
@beckernick
Member

Could you use cupy.isclose for this? I believe with pandas you would still use np.isclose.

kkraus14 added the Python label and removed the Needs Triage label on May 6, 2020
@kkraus14
Collaborator

kkraus14 commented May 6, 2020

> Could you use cupy.isclose for this? I believe with pandas you would still use np.isclose.

@paul-tqh-nguyen this would go from cudf to cupy with zero copy, so it wouldn't cause any performance degradation. Could you give that a shot?
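A rough sketch of what that suggestion could look like with the sample data from the issue. It assumes that a Series' `.values` attribute exposes the column as an array (a cupy array for cudf, a numpy array for pandas) and that the columns contain no nulls. The aliases `df_lib` and `xp` are illustrative, and the pandas/numpy fallback is there only so the snippet also runs on a machine without a GPU stack.

```python
try:
    import cudf as df_lib    # GPU DataFrame library
    import cupy as xp        # GPU array library
except ImportError:
    # Assumption: CPU stand-ins with the same call shapes, for machines without cudf/cupy.
    import pandas as df_lib
    import numpy as xp

s1 = df_lib.Series([1.9876543, 2.9876654, 3.9876543])
s2 = df_lib.Series([1.987654321, 2.987654321, 3.987654321])

# isclose(a, b, rtol, atol) tests |a - b| <= atol + rtol * |b| element by element.
close = xp.isclose(s2.values, s1.values, rtol=1e-5, atol=0.0)

# Wrap the boolean array back into a Series to stay in the DataFrame API.
result = df_lib.Series(close)
print(result)
```

Note that cupy.isclose and numpy.isclose name their tolerances `rtol` and `atol`, not `rel_tol` and `abs_tol` as in the original request.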

@paul-tqh-nguyen
Author

Using cupy.isclose solves all of my problems!

My apologies; I should've looked there earlier.

Thanks for the quick and helpful responses!

@kkraus14
Collaborator

kkraus14 commented May 6, 2020

One note: this won't handle null values, so I'm going to reopen this to provide some syntactic sugar in cuDF around the cupy function in the future.
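To make the null-handling concern concrete, here is a small pure-Python sketch of one behavior such a wrapper might choose. The thread does not say what the semantics should be, so the rule used here (a null in either input yields a null in the output) is purely an assumption, and the name `isclose_nullable` is hypothetical. `None` stands in for a null entry and plain lists stand in for Series.

```python
def isclose_nullable(values, reference, rel_tol=1e-5, abs_tol=0.0):
    """Element-wise closeness check that propagates nulls (None).

    Assumed rule: if either element is None, the result for that row is None;
    otherwise apply |v - r| <= abs_tol + rel_tol * |r|.
    """
    out = []
    for v, r in zip(values, reference):
        if v is None or r is None:
            out.append(None)  # null in, null out (assumption, not confirmed by the thread)
        else:
            out.append(abs(v - r) <= abs_tol + rel_tol * abs(r))
    return out


a = [1.9876543, None, 3.9876543]
b = [1.987654321, 2.987654321, None]

flags = isclose_nullable(a, b)
print(flags)  # [True, None, None]
```

Other choices are possible, such as treating any null row as not close or treating two nulls as equal; the sketch only shows where that decision would sit.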

4 participants