
Float strategy generates -0.0 for max_value=0.0 when exclude_max=True #2201

Closed · adnilsson opened this issue Nov 11, 2019 · 6 comments · Fixed by #2206

@adnilsson commented Nov 11, 2019

As a user, I'd expect values drawn from a strategy with exclude_max=True to have the invariant value < max_value. However, I discovered that a floats strategy with max_value set to 0.0 can generate -0.0 even if exclude_max=True. This is unexpected since 0.0 == -0.0.

Example program (hypothesis 4.43.6):

from hypothesis import given
from hypothesis.strategies import floats


max_val = 0.0
less_than_zero = floats(max_value=max_val, exclude_max=True, allow_nan=False)

@given(less_than_zero)
def less_than_max(val):
    # Expected invariant: every generated value is strictly below max_value.
    assert val < max_val


if __name__ == "__main__":
    less_than_max()

Example output (excluding traceback):

Falsifying example: less_than_max(val=-0.0)

The issue seems to stem from changes to floats.py in 36d641d. The version prior to this commit matches the referenced Stack Overflow solution and would not cause this particular issue.

If I understood it correctly, PR #1860 solved the issue of floats(min_value=0.0, max_value=-0.0) and of excluding endpoints at infinity (#1859). Perhaps this was an unintended side-effect?

For now, I can avoid this with, for instance:

less_than_zero = less_than_zero.filter(lambda x: x < max_val)

but this should not really be necessary. Would it be possible to consolidate solutions to both issues?
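
(For reference, the filter catches -0.0 precisely because IEEE 754 comparison treats the two zeros as equal in value; a quick sanity check:)

assert -0.0 == 0.0           # the zeros compare equal in value...
assert not (-0.0 < 0.0)      # ...so -0.0 fails the strict comparison and is filtered out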

@Zac-HD (Member) commented Nov 11, 2019

> As a user, I'd expect values drawn from a strategy with exclude_max=True to have the invariant value < max_value. However, I discovered that a floats strategy with max_value set to 0.0 can generate -0.0 even if exclude_max=True. This is unexpected since 0.0 == -0.0.

FWIW this was intentional as it exposes the underlying - and I agree unintuitive - behaviour of IEEE754 floats regarding signed zeros.

I can see that it might be a better compromise to ensure that excluding either zero excludes both though!

@DRMacIver (Member) commented Nov 11, 2019

I think regardless of what we decide for the API, our documentation of the floats strategy needs to clearly signal it - right now it's pretty non-obvious from the documentation that this is the behaviour!

Zac-HD added the docs label Nov 11, 2019
@adnilsson (Author) commented Nov 11, 2019

> FWIW this was intentional as it exposes the underlying - and I agree unintuitive - behaviour of IEEE754 floats regarding signed zeros.

Was there any reason in particular to distinguish between the two signed zeros in next_up?

In any case, should this still be the desired behaviour, then I definitely agree that the docs should be made clearer! But I do think that the API should exclude both zeros in exclude_min/max.

@Zac-HD (Member) commented Nov 13, 2019

> Was there any reason in particular to distinguish between the two signed zeros in next_up?

Yes - they are in fact distinct values, and code - including st.floats() - might behave differently around them! So we need to handle both, and can't skip one without breaking x == next_up(next_down(x)) for all finite x.
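
A minimal sketch of this convention (not Hypothesis's actual implementation; the to_bits/from_bits helpers are illustrative), treating the two zeros as adjacent but distinct values so that the round trip is exact:

import struct


def to_bits(x: float) -> int:
    # Reinterpret the float's bytes as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(n: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", n))[0]


def next_up(x: float) -> float:
    # -0.0 and 0.0 are treated as distinct neighbours:
    # next_up(-0.0) == 0.0 and next_up(0.0) == 5e-324.
    bits = to_bits(x)
    if bits == 0x8000_0000_0000_0000:  # -0.0 steps up to +0.0
        return 0.0
    if bits >> 63:  # negative: stepping up shrinks the magnitude
        return from_bits(bits - 1)
    return from_bits(bits + 1)


def next_down(x: float) -> float:
    return -next_up(-x)

With this scheme, x == next_up(next_down(x)) holds bit-for-bit for every finite x, including both zeros.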

@adnilsson (Author) commented Nov 13, 2019

I understand that they are distinct values and that treating -0.0 as 0.0, as was done before, would break the property you describe. My point was just that other properties one would intuitively expect are next_down(x) < x and next_up(x) > x for all finite x, but this breaks since e.g. next_down(0.0) currently evaluates to -0.0. But I suppose these properties and x == next_up(next_down(x)) are mutually exclusive.

Is it more important to guarantee that next_up and next_down take steps that exactly invert each other (i.e., x == next_up(next_down(x))), or that they produce a strictly smaller/greater number (i.e., next_down(x) < x and next_up(x) > x)? I'm not sure what the implications would be for the API, but it could be an interesting discussion to have.
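
Concretely, reusing the next_up/next_down sketch above, the trade-off at zero looks like this:

import math

x = 0.0
d = next_down(x)                       # -0.0 under the distinct-zeros convention
assert math.copysign(1.0, d) == -1.0   # d really is negative zero
assert not (d < x)                     # strict decrease fails: -0.0 < 0.0 is False
assert next_up(d) == x                 # but the step is exactly reversible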

Anyway, thanks for the helpful responses!

@Zac-HD (Member) commented Nov 13, 2019

Huh, it turns out that the IEEE754 standard prefers the opposite convention to us: "nextUp(±0) is the positive number of least magnitude in x’s format" and explicitly skips +0 on the way up and -0 on the way down.
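
For comparison, math.nextafter (added in Python 3.9, after this thread) follows the IEEE-style convention and steps straight over the opposite-signed zero:

import math

print(math.nextafter(0.0, -math.inf))   # -5e-324: -0.0 is skipped on the way down
print(math.nextafter(-0.0, math.inf))   # 5e-324: +0.0 is skipped on the way up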

Fortunately our next_up function isn't part of the public API, nor do we claim it's standards-compliant, so leaving it as-is seems OK 😅

I'm more confident now that changing the public-facing behaviour is correct, though!

Zac-HD closed this in #2206 Nov 22, 2019