Conversation

@XiaobingSuper (Collaborator) commented Aug 22, 2022

Given the following case:

```
import torch
a = torch.ones(1, 3, 320, 480).bfloat16().to(memory_format=torch.channels_last)
out_bf16 = torch.nn.functional.interpolate(a, size=(640, 960), scale_factor=None, mode='bilinear', align_corners=False, recompute_scale_factor=None, antialias=False)
out_fp32 = torch.nn.functional.interpolate(a.float(), size=(640, 960), scale_factor=None, mode='bilinear', align_corners=False, recompute_scale_factor=None, antialias=False)
print(out_bf16[0, 2, :, :])
print(out_fp32[0, 2, :, :])
```

the boundary of the bfloat16 output gets wrong values:

```
tensor([[1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00,
         1.0000e+00],
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00,
         1.0000e+00],
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00,
         1.0000e+00],
        ...,
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00,
         1.0000e+00],
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00,
         1.0000e+00],
        [0.0000e+00, 0.0000e+00, 1.8367e-40,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00]], dtype=torch.bfloat16)
tensor([[1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.],
        ...,
        [1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.]])
```

The expected behavior is that the bfloat16 output values should also all be one. The main reason for the bug is that the CPU path uses low precision (the input's scalar type) to compute the source index:

```
const scalar_t real_input_index = area_pixel_compute_source_index<scalar_t>(
```

(https://github.com/pytorch/pytorch/blob/fcb124406bdf86bc2d15e999d5a3e09b86238bba/aten/src/ATen/native/UpSample.h#L448)

We should use high precision for this computation instead, as the GPU path does:

```
accscalar_t src_idx = scale * (dst_index + static_cast<accscalar_t>(0.5)) -
```

(https://github.com/pytorch/pytorch/blob/fcb124406bdf86bc2d15e999d5a3e09b86238bba/aten/src/ATen/native/cuda/UpSample.cuh#L123)
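
To make the precision issue concrete, here is a minimal sketch (not the ATen kernel itself) of the align_corners=False source-index formula, src = scale * (dst + 0.5) - 0.5, using the sizes from the reproducer (input height 320, output height 640, so scale = 0.5) and the last output row. Rounding every intermediate to bfloat16 lands on 320, one past the last valid row index 319, while float32 gives the correct 319.25:

```python
import torch

# Source-index formula for bilinear upsampling with align_corners=False:
#   src = scale * (dst + 0.5) - 0.5
# Sizes from the reproducer: input height 320 -> output height 640, scale = 0.5.
scale, dst = 0.5, 639  # dst = last output row

def source_index(dtype):
    # Round every intermediate result to `dtype`, mimicking a kernel that
    # performs the whole computation in the input's scalar type.
    s = torch.tensor(scale, dtype=dtype)
    d = torch.tensor(dst, dtype=dtype)
    return s * (d + 0.5) - 0.5

print(source_index(torch.bfloat16))  # tensor(320., dtype=torch.bfloat16) -> out of range [0, 319]
print(source_index(torch.float32))   # tensor(319.2500) -> correct
```

An out-of-range source index like this is consistent with the garbage values in the last row of the bfloat16 output above, and it goes away once the index is computed in high precision, as in the GPU path.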

@facebook-github-bot (Contributor) commented Aug 22, 2022

🔗 Helpful links

✅ No Failures (0 Pending)

As of commit f3ad87e (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI.

@XiaobingSuper force-pushed the xiaobing/upsample_channels_last branch from 8fc5a14 to f3ad87e on August 22, 2022 16:05
@XiaobingSuper marked this pull request as ready for review on August 22, 2022 16:06
@frank-wei (Contributor) commented:

Thanks for fixing this issue. Does it also happen with other interpolation modes? Maybe try more config combinations to make sure we did not miss anything.

@XiaobingSuper (Collaborator, Author) commented:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@XiaobingSuper (Collaborator, Author) commented:

> Thanks for fixing this issue. Does it also happen with other interpolation modes? Maybe try more config combinations to make sure we did not miss anything.

So far I haven't seen it happen with other modes, but I will run more tests to check them as a next step.
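
As an illustration of such a sweep (a hypothetical sketch, not the test plan of this PR), one could compare the bfloat16 result against a float32 reference on an all-ones input across a few mode and memory-format combinations:

```python
import itertools
import torch
import torch.nn.functional as F

# Hypothetical sweep: compare bfloat16 output against a float32 reference on an
# all-ones input for several interpolation modes and memory formats.
modes = ['nearest', 'bilinear', 'bicubic', 'area']
memory_formats = [torch.contiguous_format, torch.channels_last]

for mode, mf in itertools.product(modes, memory_formats):
    a = torch.ones(1, 3, 320, 480).to(memory_format=mf)
    kwargs = {'size': (640, 960), 'mode': mode}
    if mode in ('bilinear', 'bicubic'):
        kwargs['align_corners'] = False
    try:
        out_bf16 = F.interpolate(a.bfloat16(), **kwargs)
    except RuntimeError as e:
        # Skip combinations this build does not support in bfloat16.
        print(f'{mode:10s} {str(mf):28s} skipped: {e}')
        continue
    out_fp32 = F.interpolate(a, **kwargs)
    max_diff = (out_bf16.float() - out_fp32).abs().max().item()
    print(f'{mode:10s} {str(mf):28s} max abs diff {max_diff:.4f}')
```

On an all-ones input the float32 reference is exactly one everywhere, so a large maximum difference flags a boundary or indexing problem rather than ordinary bfloat16 rounding.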

@github-actions (Contributor) commented:

Hey @XiaobingSuper.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Aug 24, 2022
… to compute index (#83847) (#83847)

Summary: same as the PR description above. The boundary of the bfloat16 output of channels-last bilinear upsampling gets wrong values because the source index is computed in low precision; compute it in high precision instead, as the GPU path does.

Pull Request resolved: #83847
Approved by: https://github.com/frank-wei

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/658f958bc4bb314d9c6030eeaf3e1784792b5d15

Reviewed By: weiwangmeta

Differential Revision: D38947080

fbshipit-source-id: eef6bfe50a4becd4550b20a88b119da1e1fc46c0