
torch.rand can sample the upper bound for lower precision floating point dtypes on CUDA #96947

Closed
pmeier opened this issue Mar 16, 2023 · 0 comments
Assignees
peterbell10
Labels
high priority, module: bfloat16, module: correctness (silent), module: half, module: random, triaged

Comments


pmeier commented Mar 16, 2023

Documentation:

Returns a tensor filled with random numbers from a uniform distribution on the [half-open] interval [0,1)

import itertools

import torch

for device, dtype in itertools.product(
    ["cpu", "cuda"],
    [
        torch.float16,
        torch.bfloat16,
        torch.float32,
        torch.float64,
        torch.complex32,
        torch.complex64,
        torch.complex128,
    ],
):
    torch.manual_seed(0)
    # I only used this high number of samples to make sure the other dtypes are not affected
    # On my machine 1_000 was sufficient for the check to fail for bfloat16, 
    # and 10_000 for float16 and complex32
    t = torch.rand(10_000_000, dtype=dtype, device=device)
    if dtype.is_complex:
        t = torch.view_as_real(t)

    print(f"{dtype}, {device}: {'PASS' if (t != 1).all() else 'FAIL'}")
torch.float16, cpu: PASS
torch.bfloat16, cpu: PASS
torch.float32, cpu: PASS
torch.float64, cpu: PASS
torch.complex32, cpu: PASS
torch.complex64, cpu: PASS
torch.complex128, cpu: PASS
torch.float16, cuda: FAIL
torch.bfloat16, cuda: FAIL
torch.float32, cuda: PASS
torch.float64, cuda: PASS
torch.complex32, cuda: FAIL
torch.complex64, cuda: PASS
torch.complex128, cuda: PASS

Failures happen for float16, bfloat16, and complex32, and only on CUDA.

This was detected in #96331, which uses Tensor.uniform_ under the hood, but presumably it is the same kernel internally.
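
For reference, a minimal sketch of the same check written directly against Tensor.uniform_ (assuming a CUDA device is available; it should exercise the same code path as torch.rand):

import torch

torch.manual_seed(0)
# Tensor.uniform_ defaults to the range [0, 1), so sampling exactly 1.0
# on CUDA would exhibit the same rounding problem as torch.rand.
t = torch.empty(10_000_000, dtype=torch.bfloat16, device="cuda").uniform_()
print("PASS" if (t != 1).all() else "FAIL")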

cc @ezyang @gchanan @zou3519 @pbelevich

peterbell10 added a commit to peterbell10/pytorch that referenced this issue Mar 16, 2023
Fixes pytorch#96947

If we generate 1.0 - float_eps, the BFloat16 and Half constructors will round this to 1.0, which is outside of the half-open range. This changes the rounding of the last bit in the BFloat16 representation to never round up. As a result, we never go past the upper end point, and the from end point is now equally likely, where before it was only half as likely.
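
To see the rounding behavior the commit describes, here is a small illustration (a sketch only): torch.nextafter gives the largest float32 strictly below 1.0, and round-to-nearest conversion to the lower-precision dtypes pushes it back up to exactly 1.0.

import torch

# Largest float32 strictly below 1.0, i.e. 1.0 - 2**-24.
below_one = torch.nextafter(torch.tensor(1.0), torch.tensor(0.0))
print(below_one.item())              # 0.9999999403953552

# Converting with round-to-nearest lands exactly on 1.0,
# which is outside the half-open interval [0, 1).
print(below_one.to(torch.float16))   # tensor(1., dtype=torch.float16)
print(below_one.to(torch.bfloat16))  # tensor(1., dtype=torch.bfloat16)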
ezyang added the triaged, module: half, high priority, module: random, and module: correctness (silent) labels on Mar 16, 2023
yuantailing added a commit to yuantailing/pytorch that referenced this issue Mar 19, 2023
yuantailing added a commit to yuantailing/pytorch that referenced this issue Mar 19, 2023
@peterbell10 peterbell10 self-assigned this Mar 20, 2023
peterbell10 pushed a commit to peterbell10/pytorch that referenced this issue Mar 20, 2023