Output difference resulting from torch.cat #57122

Closed
lawlict opened this issue Apr 28, 2021 · 3 comments
Labels
high priority, module: viewing and reshaping, triage review, triaged

Comments

lawlict commented Apr 28, 2021

🐛 Bug

The following functions unfold1 and unfold2 are expected to produce the same output:

import torch


def unfold1(x, chunk_size):
    hop_size = chunk_size // 2
    x = x.transpose(1, 2)
    x = x.unfold(-1, chunk_size, hop_size)
    return x


def unfold2(x, chunk_size):
    hop_size = chunk_size // 2
    x = x.transpose(1, 2)
    B, C, T = x.shape
    x = x.reshape(B, C, T // hop_size, hop_size)
    x = torch.cat((x[:, :, :-1], x[:, :, 1:]), dim=-1)
    return x


if __name__ == '__main__':
    device = 'cuda'
    x = torch.arange(24).reshape(2, 6, 2).float().to(device)
    print(unfold1(x, 4))
    print('-' * 50)
    print(unfold2(x, 4))

However, the output of unfold2 is wrong:

tensor([[[[ 0.,  2.,  4.,  6.],
          [ 4.,  6.,  8., 10.]],

         [[ 1.,  3.,  5.,  7.],
          [ 5.,  7.,  9., 11.]]],


        [[[12., 14., 16., 18.],
          [16., 18., 20., 22.]],

         [[13., 15., 17., 19.],
          [17., 19., 21., 23.]]]], device='cuda:0')
--------------------------------------------------
tensor([[[[ 0.,  4.,  4.,  8.],
          [ 1.,  5.,  5.,  9.]],

         [[ 2.,  6.,  6., 10.],
          [ 3.,  7.,  7., 11.]]],


        [[[12., 16., 16., 20.],
          [13., 17., 17., 21.]],

         [[14., 18., 18., 22.],
          [15., 19., 19., 23.]]]], device='cuda:0')

The difference appears only when I use pytorch==1.8.1 with CUDA. With pytorch==1.6, or on the CPU device, the two outputs are the same. Specifically, the difference comes from torch.cat in the unfold2 function. Does anyone have any ideas? Many thanks.
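A minimal sketch for checking whether a given install shows the mismatch described above. It reuses unfold1 and unfold2 from the report and compares them on every available device. The helper name outputs_match is hypothetical (not part of the report or of PyTorch), and the CUDA branch only runs when a GPU is visible.

```python
import torch


def unfold1(x, chunk_size):
    # Reference path: built-in sliding windows along the last dimension.
    hop_size = chunk_size // 2
    return x.transpose(1, 2).unfold(-1, chunk_size, hop_size)


def unfold2(x, chunk_size):
    # Path under test: split into hop-sized pieces, then join neighbours.
    hop_size = chunk_size // 2
    x = x.transpose(1, 2)
    B, C, T = x.shape
    x = x.reshape(B, C, T // hop_size, hop_size)
    return torch.cat((x[:, :, :-1], x[:, :, 1:]), dim=-1)


def outputs_match(device, chunk_size=4):
    # Hypothetical helper: True if both paths agree element-wise on `device`.
    x = torch.arange(24).reshape(2, 6, 2).float().to(device)
    return torch.equal(unfold1(x, chunk_size), unfold2(x, chunk_size))


# Always check the CPU; add CUDA only when it is actually available.
devices = ['cpu'] + (['cuda'] if torch.cuda.is_available() else [])
results = {d: outputs_match(d) for d in devices}
print(results)
```

According to the report, an affected setup would print False for 'cuda' while 'cpu' stays True.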

Environment

Collecting environment information...
PyTorch version: 1.8.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: CentOS Linux 7 (Core) (x86_64)
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Clang version: Could not collect
CMake version: version 3.6.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti
GPU 3: GeForce RTX 2080 Ti
GPU 4: GeForce RTX 2080 Ti
GPU 5: GeForce RTX 2080 Ti
GPU 6: GeForce RTX 2080 Ti
GPU 7: GeForce RTX 2080 Ti

Nvidia driver version: 450.80.02
cuDNN version: Probably one of the following:
/usr/local/cuda-8.0/lib64/libcudnn.so.5
/usr/local/cuda-9.0/lib64/libcudnn.so.7
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.8.1
[pip3] torchaudio==0.8.1
[pip3] torchvision==0.9.1
[conda] numpy 1.19.5 pypi_0 pypi
[conda] torch 1.8.1 pypi_0 pypi
[conda] torchaudio 0.8.1 pypi_0 pypi
[conda] torchvision 0.9.1 pypi_0 pypi

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411

zou3519 added the module: viewing and reshaping, triaged, and high priority labels Apr 28, 2021
zou3519 (Contributor) commented Apr 28, 2021

Tentatively marking hi-pri for investigation. I'm not familiar with what unfold does.
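For readers in the same position, here is a rough pure-Python model of what the two functions in the report compute along the last dimension. It is only a sketch of the sliding-window behaviour of unfold(-1, size, step) on a single row, not PyTorch's implementation, and the function names sliding_windows and join_adjacent_chunks are made up for illustration.

```python
def sliding_windows(seq, size, step):
    # Windows of length `size`, one starting every `step` elements.
    # A trailing remainder too short for a full window is dropped.
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def join_adjacent_chunks(seq, hop):
    # Split into consecutive `hop`-sized chunks, then concatenate each
    # chunk with its successor (the reshape + cat idea in unfold2).
    chunks = [seq[i:i + hop] for i in range(0, len(seq), hop)]
    return [a + b for a, b in zip(chunks[:-1], chunks[1:])]


row = [0, 2, 4, 6, 8, 10]  # x[0, 0, :] after the transpose in the report

print(sliding_windows(row, 4, 2))    # [[0, 2, 4, 6], [4, 6, 8, 10]]
print(join_adjacent_chunks(row, 2))  # [[0, 2, 4, 6], [4, 6, 8, 10]]
```

With a window of 4 and a hop of 2, both routes give the same two windows, which matches the first rows of the unfold1 output in the report.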

gchanan (Contributor) commented Apr 28, 2021

I could reproduce this on 1.8.1+cu111; it doesn't reproduce on 1.7.1.

gchanan (Contributor) commented Apr 28, 2021

Bisected it to #46859.

krshrimali pushed a commit to krshrimali/pytorch that referenced this issue May 19, 2021
…rch#57177)

Summary:
Fixes pytorch#57122

Pull Request resolved: pytorch#57177

Reviewed By: zou3519

Differential Revision: D28072674

Pulled By: ngimel

fbshipit-source-id: 1f0b1d6916eb9739c35a5ac5aba33e70c1c43a34