fix upsample bf16 issue for channels last path by using high precision to compute index #83847
Conversation
Dr. CI: ✅ No failures (0 pending) as of commit f3ad87e.
Force-pushed `… to compute index` from 8fc5a14 to f3ad87e.
Thanks for fixing this issue. Does it also happen with other interpolation modes? Maybe try more config combinations to make sure we didn't miss anything.
@pytorchbot merge

@pytorchbot successfully started a merge job. Check the current status here.
Currently I haven't found it happening with other modes, but I will run more tests to check them as a next step.
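Such a sweep could look like the minimal sketch below. The mode list, output size, and bfloat16 tolerance are illustrative assumptions, not part of this PR:

```python
import itertools
import torch
import torch.nn.functional as F

modes = ['nearest', 'bilinear', 'bicubic']
memory_formats = [torch.contiguous_format, torch.channels_last]

for mode, mf in itertools.product(modes, memory_formats):
    a = torch.ones(1, 3, 320, 480).bfloat16().to(memory_format=mf)
    # align_corners may only be set for the interpolating modes.
    align = False if mode in ('bilinear', 'bicubic') else None
    out_bf16 = F.interpolate(a, size=(640, 960), mode=mode, align_corners=align)
    out_fp32 = F.interpolate(a.float(), size=(640, 960), mode=mode, align_corners=align)
    # bfloat16 has ~3 decimal digits of precision, so compare loosely.
    ok = torch.allclose(out_bf16.float(), out_fp32, atol=1e-2)
    print(mode, mf, 'match:', ok)
```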
fix upsample bf16 issue for channels last path by using high precision to compute index (#83847)

Summary:
Given the following case:

```python
import torch

a = torch.ones(1, 3, 320, 480).bfloat16().to(memory_format=torch.channels_last)
out_bf16 = torch.nn.functional.interpolate(a, size=(640, 960), scale_factor=None, mode='bilinear', align_corners=False, recompute_scale_factor=None, antialias=False)
out_fp32 = torch.nn.functional.interpolate(a.float(), size=(640, 960), scale_factor=None, mode='bilinear', align_corners=False, recompute_scale_factor=None, antialias=False)
print(out_bf16[0, 2, :, :])
print(out_fp32[0, 2, :, :])
```

the boundary of the bfloat16 output gets a wrong value:

```
tensor([[1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00, 1.0000e+00],
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00, 1.0000e+00],
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00, 1.0000e+00],
        ...,
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00, 1.0000e+00],
        [1.0000e+00, 1.0000e+00, 1.0000e+00,  ..., 1.0000e+00, 1.0000e+00, 1.0000e+00],
        [0.0000e+00, 0.0000e+00, 1.8367e-40,  ..., 0.0000e+00, 0.0000e+00, 0.0000e+00]],
       dtype=torch.bfloat16)
tensor([[1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.],
        ...,
        [1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.],
        [1., 1., 1.,  ..., 1., 1., 1.]])
```

The expected behavior is that the bfloat16 output values should also all be one. The main reason is that the index was computed in low precision, see https://github.com/pytorch/pytorch/blob/fcb124406bdf86bc2d15e999d5a3e09b86238bba/aten/src/ATen/native/UpSample.h#L448; we should use high precision for this computation, as the GPU path does: https://github.com/pytorch/pytorch/blob/fcb124406bdf86bc2d15e999d5a3e09b86238bba/aten/src/ATen/native/cuda/UpSample.cuh#L123

Pull Request resolved: #83847
Approved by: https://github.com/frank-wei
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/658f958bc4bb314d9c6030eeaf3e1784792b5d15
Reviewed By: weiwangmeta
Differential Revision: D38947080
fbshipit-source-id: eef6bfe50a4becd4550b20a88b119da1e1fc46c0
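For intuition, here is a small Python-only sketch of the failure mode, not the kernel code itself: doing the source-index arithmetic entirely in bfloat16 drifts at the boundary, while the same arithmetic in float32 stays correct. The scale and index are chosen to match the 320 → 640 repro above.

```python
import torch

# 320 -> 640 upsample along one dimension: scale = in / out = 0.5,
# and row 639 is the last output row from the repro above.
scale, dst_index = 0.5, 639

# Low-precision path: keep every intermediate in bfloat16, which is
# roughly what the old channels-last CPU index computation did.
idx_bf16 = (torch.tensor(dst_index, dtype=torch.bfloat16) + 0.5) * scale - 0.5

# High-precision path: plain float arithmetic, as the CUDA kernel does.
idx_fp32 = (dst_index + 0.5) * scale - 0.5

print(idx_bf16.item(), idx_fp32.item())
# bfloat16 yields 320.0, float32 yields 319.25; source row 320 is out of
# range for a 320-row input, so the last output row reads wrong data.
```

With only 8 mantissa bits, bfloat16 cannot represent 639.5 (the spacing between representable values near 640 is 2), so each rounding step pushes the computed source index past the last valid input row, which matches the corrupted final row in the output above.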