
Float strategy generates -0.0 for max_value=0.0 when exclude_max=True #2201

Closed
adnilsson opened this issue Nov 11, 2019 · 6 comments · Fixed by #2206
Labels
docs opinions-sought question

Comments

@adnilsson commented Nov 11, 2019

As a user, I'd expect values drawn from a strategy with exclude_max=True to have the invariant value < max_value. However, I discovered that a floats strategy with max_value set to 0.0 can generate -0.0 even if exclude_max=True. This is unexpected since 0.0 == -0.0.

Example program (hypothesis 4.43.6):

from hypothesis import given
from hypothesis.strategies import floats


max_val = 0.0
less_than_zero = floats(max_value=max_val, exclude_max=True, allow_nan=False)

@given(less_than_zero)
def less_than_max(val):
    assert val < max_val


if __name__ == "__main__":
    less_than_max()

Example output (excluding traceback):

Falsifying example: less_than_max(val=-0.0)
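For reference, the failure comes from IEEE754 signed-zero semantics: the two zeros compare equal even though they are distinct values. A minimal standard-library illustration (my addition, not part of the original report):

```python
import math

# The two zeros compare equal under IEEE754...
assert -0.0 == 0.0
assert not (-0.0 < 0.0)  # so `val < max_val` is False when val == -0.0

# ...yet they are distinct values, distinguishable via the sign bit:
assert math.copysign(1.0, -0.0) == -1.0
assert math.copysign(1.0, 0.0) == 1.0
```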

The issue seems to stem from changes to floats.py in 36d641d. The version prior to this commit matches the referenced Stack Overflow solution and would not cause this particular issue.

If I understood it correctly, PR #1860 solved the issue of floats(min_value=0.0, max_value=-0.0) and of excluding endpoints at infinity (#1859). Perhaps this was an unintended side-effect?

For now, I can avoid this with, for instance:

less_than_zero = less_than_zero.filter(lambda x: x < max_val)

but this should not really be necessary. Would it be possible to consolidate solutions to both issues?

@Zac-HD added the question and opinions-sought labels and removed the bug and tests/build/CI labels Nov 11, 2019
@Zac-HD (Member) commented Nov 11, 2019

> As a user, I'd expect values drawn from a strategy with exclude_max=True to have the invariant value < max_value. However, I discovered that a floats strategy with max_value set to 0.0 can generate -0.0 even if exclude_max=True. This is unexpected since 0.0 == -0.0.

FWIW this was intentional as it exposes the underlying - and I agree unintuitive - behaviour of IEEE754 floats regarding signed zeros.

I can see that it might be a better compromise to ensure that excluding either zero excludes both though!
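A rough sketch of that compromise using `math.nextafter` (Python ≥ 3.9); the helper name here is illustrative, not a Hypothesis internal:

```python
import math

def exclusive_upper_bound(max_value: float) -> float:
    # Illustrative only: step the bound strictly downward.
    # math.nextafter follows the IEEE754 nextDown convention and skips
    # the opposite zero, so stepping down from either +0.0 or -0.0
    # lands on -5e-324, excluding both signed zeros in one move.
    return math.nextafter(max_value, -math.inf)

bound = exclusive_upper_bound(0.0)
assert bound < 0.0
assert exclusive_upper_bound(-0.0) == bound  # both zeros excluded alike
```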

@DRMacIver (Member) commented Nov 11, 2019

I think regardless of what we decide as the API our documentation of the floats strategy needs to clearly signal it - right now it's pretty non-obvious from the documentation that this is the behaviour!

@Zac-HD added the docs label Nov 11, 2019
@adnilsson (Author) commented Nov 11, 2019

> FWIW this was intentional as it exposes the underlying - and I agree unintuitive - behaviour of IEEE754 floats regarding signed zeros.

Was there any reason in particular to distinguish between signed/unsigned zeros in next_up?

In any case, should this still be the desired behaviour, then I definitely agree that the docs should be made clearer! But I do think that the API should exclude both zeros in exclude_min/max.

@Zac-HD (Member) commented Nov 13, 2019

> Was there any reason in particular to distinguish between signed/unsigned zeros in next_up?

Yes - they are in fact distinct values, and code - including st.floats() - might behave differently around them! So we need to handle both, and can't skip one without breaking x == next_up(next_down(x)) for all finite x.

@adnilsson (Author) commented Nov 13, 2019

I understand that they are distinct values and that treating -0.0 as 0.0, as was done before, would break the property you describe. My point was just that other properties one would intuitively expect are next_down(x) < x and next_up(x) > x for all finite x, but this breaks since e.g. next_down(0.0) currently evaluates to -0.0. But I suppose these properties and x == next_up(next_down(x)) are mutually exclusive.

Is it more important to guarantee that next_up and next_down take steps that are equally large (i.e., x == next_up(next_down(x))), or that they produce a comparably smaller/greater number (i.e. next_down(x) < x and next_up(x) > x)? I'm not sure what the implications would be on the API, but it could be an interesting discussion to have.
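The trade-off can be made concrete with a bit-level sketch (my own illustration, not Hypothesis's actual implementation) that orders finite doubles by their bit patterns, so the two zeros are adjacent but distinct:

```python
import math
import struct

def _to_ordinal(x: float) -> int:
    # Map a finite double to an integer preserving bit-level order:
    # ..., -5e-324 -> -2, -0.0 -> -1, +0.0 -> 0, 5e-324 -> 1, ...
    u = struct.unpack("<Q", struct.pack("<d", x))[0]
    return u if u < 1 << 63 else (1 << 63) - 1 - u

def _from_ordinal(n: int) -> float:
    u = n if n >= 0 else (1 << 63) - 1 - n
    return struct.unpack("<d", struct.pack("<Q", u))[0]

def next_down(x: float) -> float:
    # Sketch only: finite values, no inf/nan edge handling.
    return _from_ordinal(_to_ordinal(x) - 1)

def next_up(x: float) -> float:
    return _from_ordinal(_to_ordinal(x) + 1)

# Round-tripping holds for every finite x, sign of zero included:
assert next_up(next_down(0.0)) == 0.0
assert math.copysign(1.0, next_up(next_down(0.0))) == 1.0

# ...but strict monotonicity does not: next_down(0.0) is -0.0,
# which compares equal to 0.0.
assert not (next_down(0.0) < 0.0)
assert math.copysign(1.0, next_down(0.0)) == -1.0
```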

Anyway, thanks for the helpful responses!

@Zac-HD (Member) commented Nov 13, 2019

Huh, it turns out that the IEEE754 standard prefers the opposite convention to us: "nextUp(±0) is the positive number of least magnitude in x’s format" and explicitly skips +0 on the way up and -0 on the way down.
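For comparison, Python's `math.nextafter` (added in 3.9) follows that IEEE754 convention and skips the opposite zero (a standard-library illustration, not Hypothesis's `next_up`):

```python
import math

# IEEE754: nextUp(±0) is the smallest positive subnormal, so both
# zeros step straight to 5e-324, skipping +0.0 on the way up.
assert math.nextafter(-0.0, math.inf) == 5e-324
assert math.nextafter(0.0, math.inf) == 5e-324

# Symmetrically, nextDown(±0) skips -0.0 on the way down:
assert math.nextafter(0.0, -math.inf) == -5e-324
```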

Fortunately our next_up function isn't part of the public API and nor do we claim it's standards-compliant, so leaving it as-is seems OK 😅

I'm more confident now that changing the public-facing behaviour is correct, though!

@Zac-HD closed this in #2206 Nov 22, 2019