
torch.compile trigger assertion error when executing histogramdd #93274

Closed
Kristoff-starling opened this issue Jan 30, 2023 · 2 comments
Labels
high priority module: inductor oncall: pt2 triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@Kristoff-starling
Contributor

Kristoff-starling commented Jan 30, 2023

🐛 Describe the bug

The following program works fine in eager mode but triggers an assertion error in compile mode.
The value shown in the assertion error message is non-deterministic: it changes on every run and never equals the expected size 34.

import torch

def fn(input):
    return torch.histogramdd(input, [1, 1, 55, 31, 34])

x = torch.rand([5, 6], dtype=torch.float32)
ret_eager = fn(x)
print('==== Eager mode OK! ====')

compiled = torch.compile(fn)
ret_compiled = compiled(x)
print('==== torchcomp mode OK! ====')

Error logs

==== Eager mode OK! ====
[2023-01-30 09:55:40,965] torch._inductor.graph: [WARNING] Creating implicit fallback for:
  target: aten._histogramdd_bin_edges.default
  args[0]: TensorBox(StorageBox(
    InputBuffer(name='arg0_1', layout=FixedLayout('cpu', torch.float32, size=[5, 6], stride=[6, 1]))
  ))
  args[1]: [1, 1, 55, 31, 34]
[2023-01-30 09:55:40,972] torch._inductor.graph: [WARNING] Using FallbackKernel: torch.ops.aten._histogramdd_bin_edges.default
[2023-01-30 09:55:40,974] torch._inductor.graph: [WARNING] Creating implicit fallback for:
  target: aten._histogramdd_from_bin_cts.default
  args[0]: TensorBox(StorageBox(
    InputBuffer(name='arg0_1', layout=FixedLayout('cpu', torch.float32, size=[5, 6], stride=[6, 1]))
  ))
  args[1]: [1, 1, 55, 31, 34]
[2023-01-30 09:55:41,039] torch._inductor.graph: [WARNING] Using FallbackKernel: torch.ops.aten._histogramdd_from_bin_cts.default
Traceback (most recent call last):
  File "repro.py", line 11, in <module>
    ret_compiled = compiled(x)
  File "python3.10/site-packages/torch/_dynamo/eval_frame.py", line 211, in _fn
    return fn(*args, **kwargs)
  File "/home/yuyao/bug_repro/bug1.py", line 3, in fn
    def fn(input):
  File "python3.10/site-packages/torch/_dynamo/eval_frame.py", line 211, in _fn
    return fn(*args, **kwargs)
  File "python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2497, in forward
    return compiled_fn(full_args)
  File "python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1065, in new_fn
    fw_outs = call_func_with_args(compiled_fw, args, disable_amp=disable_amp)
  File "python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1021, in call_func_with_args
    out = normalize_as_list(f(args))
  File "/tmp/torchinductor/2v/c2v26sgp2glrt2qh24cv2shzpq7pbaeilmybqasfa7c5wui6sbon.py", line 37, in call
    assert_size_stride(buf6, (34, ), (1, ))
AssertionError: expected size 594==34, stride 1==1 at dim=0

Versions

PyTorch version: 2.0.0.dev20230129+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Clang version: 11.1.0-6
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-56-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.6.124
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 3090
GPU 1: NVIDIA GeForce RTX 3090
GPU 2: NVIDIA GeForce RTX 3090

Nvidia driver version: 515.86.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.4.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] pytorch-triton==2.0.0+0d7e753227
[pip3] torch==2.0.0.dev20230129+cu117
[pip3] torchaudio==2.0.0.dev20230118+cu117
[pip3] torchvision==0.15.0.dev20230118+cu117
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.3.1               h2bc3f7f_2  
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py39h7f8727e_0  
[conda] mkl_fft                   1.3.1            py39hd3c417c_0  
[conda] mkl_random                1.2.2            py39h51133e4_0  
[conda] numpy                     1.21.5           py39he7a7128_1  
[conda] numpy-base                1.21.5           py39hf524024_1  
[conda] numpydoc                  1.2                pyhd3eb1b0_0  
[conda] torch                     1.14.0a0+gitce2f870          pypi_0    pypi

cc @ezyang @gchanan @zou3519 @soumith @msaroufim @wconstab @ngimel @bdhirsh @mlazos @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@XiaobingSuper
Collaborator

For histogramdd, the sizes of the outputs depend on the input values, so it may be better to remove the shape check.
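For reference, a minimal sketch of the valid case: when the number of bin counts matches the sample dimension (the innermost size of the input), the histogram shape is fully determined by the bin counts and stable across calls.

```python
import torch

# Samples are 6-dimensional (innermost size of x), so six bin counts
# are required: one per dimension.
x = torch.rand(5, 6)
bins = [1, 1, 55, 31, 34, 2]

hist, edges = torch.histogramdd(x, bins)

# The histogram shape equals the bin counts, and each edges tensor
# has one more element than the corresponding bin count.
assert hist.shape == torch.Size(bins)
assert len(edges) == 6
assert edges[-1].numel() == bins[-1] + 1
```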

@albanD albanD added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Feb 7, 2023
@kshitij12345 kshitij12345 self-assigned this Apr 28, 2023
@kshitij12345
Collaborator

The problem is with the operator histogramdd itself.

import torch
import numpy

# invalid bins
bins = [1, 1, 1, 1, 1]

# Valid bins, all asserts pass.
# bins = [1, 1, 1, 1, 1, 1]

def fn(input):
    return torch.histogramdd(input, bins)

x = torch.rand([5, 6], dtype=torch.float32)

# ValueError: The dimension of bins must be equal to the dimension of the sample x.
o1 = numpy.histogramdd(x.numpy(), bins)
o2 = numpy.histogramdd(x.numpy(), bins)
for o_, oo_ in zip(o1, o2):
    numpy.testing.assert_allclose(o_, oo_)

# For invalid bins, consecutive calls return incorrect outputs (with different shapes)
o1 = fn(x)
o2 = fn(x)
for o_, oo_ in zip(o1, o2):
    # AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 1, 1, 1, 1, 273]) != torch.Size([1, 1, 1, 1, 1, 33]).
    torch.testing.assert_close(o_, oo_)
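Until the operator validates its arguments, a hypothetical user-side guard mirroring NumPy's check could reject mismatched bin counts up front (`safe_histogramdd` is an illustrative name, not a PyTorch API):

```python
import torch

def safe_histogramdd(input, bins):
    # Hypothetical guard mirroring NumPy's validation: with a list of
    # bin counts, there must be exactly one count per sample dimension
    # (the innermost size of `input`).
    if isinstance(bins, (list, tuple)) and len(bins) != input.shape[-1]:
        raise ValueError(
            f"histogramdd: got {len(bins)} bin counts for samples of "
            f"dimension {input.shape[-1]}"
        )
    return torch.histogramdd(input, bins)
```

With this guard, the invalid five-element bins above raises immediately instead of producing non-deterministic shapes, while valid bins behave as before.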
