Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gen_range(f64::MIN..f64::MAX) should panic instead of silently overflow #1090

Closed
nlopes opened this issue Jan 24, 2021 · 7 comments · Fixed by #1108
Closed

gen_range(f64::MIN..f64::MAX) should panic instead of silently overflow #1090

nlopes opened this issue Jan 24, 2021 · 7 comments · Fixed by #1108

Comments

@nlopes
Copy link

nlopes commented Jan 24, 2021

A change request is considered a (small) Request-For-Comment, and requires a concrete proposal and motivation.

Summary

Although we can find a mention in the book of how to use f64 it's not clear that doing rand::thread_rng().gen_range(f64::MIN..f64::MAX); will not give the results intended.

Details

I think potentially guarding the range when doing gen_range on floats is a good idea. The above will not complain but won't (assume a normal distribution) give floats in the range specified (they will all be < 0).

Motivation

Make it easier for newcomers to the rand library to not shoot themselves in the foot. Be explicit programmatically in what a function accepts is better than leaving it to documentation (explicit is better than implicit).

Alternatives

@vks
Copy link
Collaborator

vks commented Jan 24, 2021

Looks like you hit a corner case, probably resulting in an overflow. We can probably fix it to return a more sensible result (or panic), but I'm not sure what a more sensible result would be. Generating floats uniformly from such a big range is quite tricky, and probably not something that comes up in practice? You would probably need to use a different algorithm for sampling, such as in #531.

I don't agree that we should restrict gen_range to 0..1, getting other ranges is more subtle than just multiplying by the (shifted) maximum and adding the minimum, therefore it is good to guide users away from this pitfall by letting them set the range directly.

@dhardy
Copy link
Member

dhardy commented Jan 25, 2021

I fully agree with @vks. We did explore alternative sampling methods that would not lose precision on some parts of the range just because others support less precision (e.g. #320 #531) but never finished these, because most of the time "reasonably good precision and fast" seems to be the preferred choice, and we never settled on how to expose an alternative.

gen_range(f64::MIN..f64::MAX)

It is not clear to me what this is supposed to do anyway. f64::MAX > 10.pow(308). Very few things want numbers this large. What's more, floats are designed to give more precision closer to zero — but with a uniform distribution over this range, it is very unlikely that the result will be even slightly close to zero (e.g. -10.pow(306) .. 10.pow(306) is just 1% of the range). Trying to use floating-point without restrictions on your range is always going to cause problems.

@nlopes
Copy link
Author

nlopes commented Jan 25, 2021

Thanks for the replies, all totally fair and I do understand that it's an odd usage. I think just being more explicit to the user on when usage of the function takes a turn would provide a nicer user experience.

It took me a minute to actually realise that what I was doing probably wasn't going to be supported.

Feel free to close this if you think there are no good ways to do so (beyond #531).

@vks
Copy link
Collaborator

vks commented Feb 4, 2021

@nlopes

I think just being more explicit to the user on when usage of the function takes a turn would provide a nicer user experience.

Do you have a suggestion how we could improve our documentation in that regard?

@vks
Copy link
Collaborator

vks commented Feb 20, 2021

I think we should at least add a debug_assert catching the overflow in Float::sample_single.

@vks vks changed the title CHANGE: Restrict the use of gen_range on f64 and f32 to 0 and 1 gen_range(f64::MIN..f64::MAX) should panic instead of silently overflow Feb 20, 2021
@vks
Copy link
Collaborator

vks commented Feb 20, 2021

From the implementation of Float::sample_single:

                    // This handles a number of edge cases.
                    // * `low` or `high` is NaN. In this case `scale` and
                    //   `res` are going to end up as NaN.
                    // * `low` is negative infinity and `high` is finite.
                    //   `scale` is going to be infinite and `res` will be
                    //   NaN.
                    // * `high` is positive infinity and `low` is finite.
                    //   `scale` is going to be infinite and `res` will
                    //   be infinite or NaN (if value0_1 is 0).
                    // * `low` is negative infinity and `high` is positive
                    //   infinity. `scale` will be infinite and `res` will
                    //   be NaN.
                    // * `low` and `high` are finite, but `high - low`
                    //   overflows to infinite. `scale` will be infinite
                    //   and `res` will be infinite or NaN (if value0_1 is 0).
                    // So if `high` or `low` are non-finite, we are guaranteed
                    // to fail the `res < high` check above and end up here.
                    //
                    // While we technically should check for non-finite `low`
                    // and `high` before entering the loop, by doing the checks
                    // here instead, we allow the common case to avoid these
                    // checks. But we are still guaranteed that if `low` or
                    // `high` are non-finite we'll end up here and can do the
                    // appropriate checks.
                    //
                    // Likewise `high - low` overflowing to infinity is also
                    // rare, so handle it here after the common case.

gen_range(f64::MIN..f64::MAX) results in high - low overflowing, so it seems that this case is not handled, unlike the comment claims. This seems to be a bug: I cannot find any check that would catch high - low overflowing to inf.

@dhardy
Copy link
Member

dhardy commented Feb 20, 2021

I would guess that this is a result of checks being moved earlier to provide better error messages. So either there's a trade-off or more checks are needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants