[BUG] cudf.testing methods do not respect tolerances #8646

Closed
vyasr opened this issue Jul 2, 2021 · 0 comments · Fixed by #8649


vyasr commented Jul 2, 2021

Describe the bug
cudf's testing utilities treat two frame-like objects as equal only if they match exactly, even though the functions accept atol and rtol parameters. These parameters are ignored by cudf.testing.assert_column_equal, which simply calls lhs.equal(rhs).

Steps/Code to reproduce bug

>>> import cudf
>>> import pandas as pd
>>> eps = 1e-4
>>> cudf.testing.assert_series_equal(cudf.Series([1e6 + eps]), cudf.Series([1e6 + 2*eps]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nfs/vyasr/local/rapids/cudf/python/cudf/cudf/testing/testing.py", line 489, in assert_series_equal
    atol=atol,
  File "/home/nfs/vyasr/local/rapids/cudf/python/cudf/cudf/testing/testing.py", line 222, in assert_column_equal
    obj, f"values are different ({np.round(diff, 5)} %)", msg1, msg2,
  File "/home/nfs/vyasr/local/rapids/cudf/python/cudf/cudf/testing/testing.py", line 37, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: ColumnBase are different

values are different (100.0 %)
[left]:  [1000000.0001]
[right]: [1000000.0002]
>>> pd.testing.assert_series_equal(pd.Series([1e6 + eps]), pd.Series([1e6 + 2*eps]))
>>>

Expected behavior
The first assertion should succeed for a sufficiently small value of eps (small relative to the tolerances, but still larger than machine epsilon at this magnitude so that the two values remain distinct in cudf/pandas).
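
To make the difference between the observed and expected behavior concrete, here is a minimal pure-Python sketch. It is not cudf's or pandas' implementation: `exactly_equal` and `approx_equal` are hypothetical helpers, the closeness rule is the standard library's `math.isclose`, and the default tolerances are chosen to mirror the pandas defaults of `rtol=1e-5` and `atol=1e-8`.

```python
import math


def exactly_equal(lhs, rhs):
    """Hypothetical stand-in for an exact, element-wise comparison."""
    return len(lhs) == len(rhs) and all(a == b for a, b in zip(lhs, rhs))


def approx_equal(lhs, rhs, rtol=1e-5, atol=1e-8):
    """Hypothetical tolerance-aware comparison built on math.isclose.

    Assumption: math.isclose is used as the closeness rule here; the rule
    actually applied by cudf or pandas may differ in detail.
    """
    return len(lhs) == len(rhs) and all(
        math.isclose(a, b, rel_tol=rtol, abs_tol=atol) for a, b in zip(lhs, rhs)
    )


# Same values as in the reproduction above.
eps = 1e-4
left = [1e6 + eps]
right = [1e6 + 2 * eps]

print(exactly_equal(left, right))  # False: the two floats are distinct
print(approx_equal(left, right))   # True: the gap is far below rtol * 1e6
```

With both tolerances set to zero, `approx_equal` degenerates to the exact comparison, which matches the behavior reported in this issue.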

Additional context
Since #7495 removed the assert_eq function from cudf's public namespace, this problem blocks downstream packages that need to adapt their tests to use these functions.

@vyasr vyasr added bug Something isn't working Needs Triage Need team to review and classify labels Jul 2, 2021
@vyasr vyasr self-assigned this Jul 2, 2021
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this issue Jul 2, 2021
…ing (#430)

This PR contains three distinct changes required to get cuspatial builds working and tests passing again:
1. RMM switched to rapids-cmake (rapidsai/rmm#800), which requires CMake 3.20.1, so this PR includes the required updates for that.
2. The Arrow upgrade in cudf also moved the location of testing utilities (rapidsai/cudf#7495). Long term cuspatial needs to move away from use of the testing utilities, which are not part of cudf's public API, but we are currently blocked by rapidsai/cudf#8646, so this PR just imports the internal `assert_eq` method as a stopgap to get tests passing.
3. The changes in rapidsai/cudf#8373 altered the way that metadata is propagated to libcudf outputs from previously existing cuDF Python objects. The metadata copying functions are now called on the output DataFrame rather than the input one, so the new code paths require cuspatial to override metadata copying at the GeoDataFrame rather than the GeoColumn level to ensure that information about column types is not lost in the libcudf round trip.

This PR supersedes #427, #428, and #429, all of which can now be closed.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Christopher Harris (https://github.com/cwharris)

URL: #430
@rapids-bot rapids-bot bot closed this as completed in #8649 Jul 9, 2021
rapids-bot bot pushed a commit that referenced this issue Jul 9, 2021
Resolves #8646 so that testing equality between different types of frames can be based on approximate rather than exact equality. Note that this is a blocker for packages that need to move away from relying on `cudf.tests.utils` for testing functions, since that module is no longer exposed by `cudf`.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - https://github.com/brandon-b-miller
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #8649
rapids-bot bot pushed a commit to rapidsai/cuspatial that referenced this issue Jul 21, 2021
As of rapidsai/cudf#7495, the `cudf.tests.utils` module (and in particular the `assert_eq` function) is no longer part of the public API. This PR switches tests to use the public testing functions in the `cudf.testing` subpackage.

This PR is currently blocked by #430 and rapidsai/cudf#8646.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Christopher Harris (https://github.com/cwharris)
  - Paul Taylor (https://github.com/trxcllnt)

URL: #431
@bdice bdice removed the Needs Triage Need team to review and classify label Mar 4, 2024