Conversation

@j-bowhay (Member) commented May 2, 2024

Different array libraries have different default floating-point dtypes, and as such the default rtol=1e-7 of xp_assert_close can be too stringent when working with the default dtype. This leaves two options:

  1. Specify the dtype of the input data, e.g. xp.asarray(..., dtype=xp.float64)
  2. Specify a looser (larger) value of rtol

The first is potentially undesirable, as it is reasonable to expect users to be working with the default dtype most of the time. The second is also undesirable, as it makes the test unnecessarily less stringent for some libraries.
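To make the mismatch concrete, the machine epsilons of the two common float dtypes can be compared against the old default. This is a NumPy-only illustration; the PR itself targets arbitrary array API backends:

```python
import numpy as np

# Machine epsilon and its square root for the common float dtypes.
# A single ulp of float32 (~1.19e-7) already exceeds the old default
# rtol=1e-7, so that tolerance is unattainably strict for float32.
for dtype in (np.float32, np.float64):
    eps = np.finfo(dtype).eps
    print(dtype.__name__, eps, np.sqrt(eps))
```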

As for the choice of value:

Good question. It is a nice number that puts us about halfway between sqrt(eps) and the old 1e-7 default for floats, and it was on my mind because it's used in the definition of tolerances in optimize._zeros.py. And yeah, it happened to be enough so that no old tests started failing. I can add a comment that any multiple that keeps us under the old default would be OK. sqrt(eps) can be theoretically motivated in some contexts.

Originally posted by @mdhaber in #20595 (comment)

@j-bowhay j-bowhay added the array types Items related to array API support and input array validation (see gh-18286) label May 2, 2024
@github-actions github-actions bot added scipy.cluster scipy._lib enhancement A new feature or improvement labels May 2, 2024
Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>
@lucascolley (Member) left a comment

I'm happy if Ralf is happy

@lucascolley lucascolley changed the title ENH: add dtype dependent default rtol to xp_assert_close TST: add dtype dependent default rtol to xp_assert_close May 2, 2024
@lucascolley lucascolley added maintenance Items related to regular maintenance tasks and removed enhancement A new feature or improvement labels May 2, 2024
@mdhaber (Contributor) commented May 2, 2024

As examples of how this can clean things up, see 5bc13c9 and j-bowhay#3.
The latter also raises a question about default dtypes that has come up a few times: what should we do with integer arrays when a calculation assumes floating-point input? We could let an invalid operation raise an error (which is backend dependent), convert to a fixed dtype such as float64, or convert to the default float dtype of the given backend.
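The second of those options (converting to a fixed dtype) could look roughly like the following NumPy-only sketch; the helper name `as_float` and the choice of float64 are illustrative, not what the PR implements:

```python
import numpy as np

def as_float(x):
    """Illustrative helper: promote integer arrays to float64 before a
    calculation that assumes floating-point input. A real array-API
    helper would instead operate in the input's namespace, and the third
    option would pick that backend's default float dtype instead."""
    x = np.asarray(x)
    if np.issubdtype(x.dtype, np.integer):
        x = x.astype(np.float64)
    return x
```

Floating-point input passes through unchanged, so only integer arrays are affected.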

Numerically, the rtol for float64 ends up a tiny bit stricter than what we have now (~6e-8 vs 1e-7), and for float32 it is ~1e-3. Of course, we can set these to whatever we want, as long as the default for float64 stays at least as tight as it is now. Using sqrt(eps) just looks less arbitrary than setting separate numerical values manually.
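One formula consistent with the numbers quoted above is a small multiple of sqrt(eps). This is an assumption for illustration, not necessarily the PR's exact code:

```python
import numpy as np

def default_rtol(dtype):
    # Hypothetical dtype-dependent default: 4 * sqrt(eps) lands close to
    # the values quoted above (~6e-8 for float64, ~1.4e-3 for float32)
    # while staying under the old 1e-7 default for float64.
    return 4 * np.sqrt(np.finfo(dtype).eps)

print(default_rtol(np.float64))
print(default_rtol(np.float32))
```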

@rgommers (Member) commented May 2, 2024

I have only browsed this PR briefly, just wanted to give a 👍🏼 to the general idea of dtype-dependent tolerances.

@mdhaber (Contributor) commented May 2, 2024

We just discussed this on a call about the NASA grant, and scikit-learn is doing something similar: they manually set the values to 1e-4 for float32 and 1e-7 for everything else. That's close enough to what we have here, so I'm going to go ahead and merge this so it no longer holds up other PRs.

@j-bowhay, would you make a post on the forum to alert developers to the change, and invite them to submit a PR with a different specific choice of defaults if there are strong objections to these numerical values?
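The manual per-dtype mapping attributed to scikit-learn above could be sketched as follows; the names `_RTOL_OVERRIDES` and `rtol_for` are illustrative:

```python
import numpy as np

# Illustrative fixed mapping in the style described above:
# 1e-4 for float32, the old 1e-7 default for everything else.
_RTOL_OVERRIDES = {np.dtype(np.float32): 1e-4}

def rtol_for(dtype, default=1e-7):
    # Normalize to np.dtype so both np.float32 and dtype objects work.
    return _RTOL_OVERRIDES.get(np.dtype(dtype), default)
```

Compared with a sqrt(eps)-based formula, fixed values are easier to read off but need a new entry for each dtype (e.g. float16) rather than generalizing automatically.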

@mdhaber mdhaber merged commit d553004 into scipy:main May 2, 2024
@lucascolley lucascolley added this to the 1.14.0 milestone May 2, 2024