
torch.compile trigger assertion error when executing histogramdd #93274

Closed
Kristoff-starling opened this issue Jan 30, 2023 · 2 comments
Labels
high priority module: inductor oncall: pt2 triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@Kristoff-starling
Contributor

Kristoff-starling commented Jan 30, 2023

🐛 Describe the bug

The following program works fine in eager mode but triggers an assertion error in compile mode.
The value shown in the assertion error message is non-deterministic: it changes on every run and never equals the expected size 34.

import torch

def fn(input):
    return torch.histogramdd(input, [1, 1, 55, 31, 34])

x = torch.rand([5, 6], dtype=torch.float32)
ret_eager = fn(x)
print('==== Eager mode OK! ====')

compiled = torch.compile(fn)
ret_compiled = compiled(x)
print('==== torchcomp mode OK! ====')

Error logs

==== Eager mode OK! ====
[2023-01-30 09:55:40,965] torch._inductor.graph: [WARNING] Creating implicit fallback for:
  target: aten._histogramdd_bin_edges.default
  args[0]: TensorBox(StorageBox(
    InputBuffer(name='arg0_1', layout=FixedLayout('cpu', torch.float32, size=[5, 6], stride=[6, 1]))
  ))
  args[1]: [1, 1, 55, 31, 34]
[2023-01-30 09:55:40,972] torch._inductor.graph: [WARNING] Using FallbackKernel: torch.ops.aten._histogramdd_bin_edges.default
[2023-01-30 09:55:40,974] torch._inductor.graph: [WARNING] Creating implicit fallback for:
  target: aten._histogramdd_from_bin_cts.default
  args[0]: TensorBox(StorageBox(
    InputBuffer(name='arg0_1', layout=FixedLayout('cpu', torch.float32, size=[5, 6], stride=[6, 1]))
  ))
  args[1]: [1, 1, 55, 31, 34]
[2023-01-30 09:55:41,039] torch._inductor.graph: [WARNING] Using FallbackKernel: torch.ops.aten._histogramdd_from_bin_cts.default
Traceback (most recent call last):
  File "repro.py", line 11, in <module>
    ret_compiled = compiled(x)
  File "python3.10/site-packages/torch/_dynamo/eval_frame.py", line 211, in _fn
    return fn(*args, **kwargs)
  File "/home/yuyao/bug_repro/bug1.py", line 3, in fn
    def fn(input):
  File "python3.10/site-packages/torch/_dynamo/eval_frame.py", line 211, in _fn
    return fn(*args, **kwargs)
  File "python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2497, in forward
    return compiled_fn(full_args)
  File "python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1065, in new_fn
    fw_outs = call_func_with_args(compiled_fw, args, disable_amp=disable_amp)
  File "python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1021, in call_func_with_args
    out = normalize_as_list(f(args))
  File "/tmp/torchinductor/2v/c2v26sgp2glrt2qh24cv2shzpq7pbaeilmybqasfa7c5wui6sbon.py", line 37, in call
    assert_size_stride(buf6, (34, ), (1, ))
AssertionError: expected size 594==34, stride 1==1 at dim=0

Versions

PyTorch version: 2.0.0.dev20230129+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Clang version: 11.1.0-6
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-56-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.6.124
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 3090
GPU 1: NVIDIA GeForce RTX 3090
GPU 2: NVIDIA GeForce RTX 3090

Nvidia driver version: 515.86.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.4.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] pytorch-triton==2.0.0+0d7e753227
[pip3] torch==2.0.0.dev20230129+cu117
[pip3] torchaudio==2.0.0.dev20230118+cu117
[pip3] torchvision==0.15.0.dev20230118+cu117
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.3.1               h2bc3f7f_2  
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py39h7f8727e_0  
[conda] mkl_fft                   1.3.1            py39hd3c417c_0  
[conda] mkl_random                1.2.2            py39h51133e4_0  
[conda] numpy                     1.21.5           py39he7a7128_1  
[conda] numpy-base                1.21.5           py39hf524024_1  
[conda] numpydoc                  1.2                pyhd3eb1b0_0  
[conda] torch                     1.14.0a0+gitce2f870          pypi_0    pypi

cc @ezyang @gchanan @zou3519 @soumith @msaroufim @wconstab @ngimel @bdhirsh @mlazos @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@XiaobingSuper
Collaborator

For histogramdd, the sizes of the outputs depend on the input values, so it may be better to remove the shape check.
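For reference, a minimal sketch of the valid case: when the number of bin counts matches the sample dimension (the innermost size of the input), the histogram shape is fully determined by the bin counts and stable across calls.

```python
import torch

# Samples are 6-dimensional (innermost size of x), so six bin counts
# are required: one per dimension.
x = torch.rand(5, 6)
bins = [1, 1, 55, 31, 34, 2]

hist, edges = torch.histogramdd(x, bins)

# The histogram shape equals the bin counts, and each edges tensor
# has one more element than the corresponding bin count.
assert hist.shape == torch.Size(bins)
assert len(edges) == 6
assert edges[-1].numel() == bins[-1] + 1
```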

@albanD albanD added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Feb 7, 2023
@kshitij12345 kshitij12345 self-assigned this Apr 28, 2023
@kshitij12345
Collaborator

The problem is with the operator histogramdd itself.

import torch
import numpy

# invalid bins
bins = [1, 1, 1, 1, 1]

# Valid bins, all asserts pass.
# bins = [1, 1, 1, 1, 1, 1]

def fn(input):
    return torch.histogramdd(input, bins)

x = torch.rand([5, 6], dtype=torch.float32)

# ValueError: The dimension of bins must be equal to the dimension of the sample x.
o1 = numpy.histogramdd(x.numpy(), bins)
o2 = numpy.histogramdd(x.numpy(), bins)
for o_, oo_ in zip(o1, o2):
    numpy.testing.assert_allclose(o_, oo_)

# For invalid bins, consecutive calls return incorrect outputs (with different shapes)
o1 = fn(x)
o2 = fn(x)
for o_, oo_ in zip(o1, o2):
    # AssertionError: The values for attribute 'shape' do not match: torch.Size([1, 1, 1, 1, 1, 273]) != torch.Size([1, 1, 1, 1, 1, 33]).
    torch.testing.assert_close(o_, oo_)
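Until the operator validates its arguments, a hypothetical user-side guard mirroring NumPy's check could reject mismatched bin counts up front (`safe_histogramdd` is an illustrative name, not a PyTorch API):

```python
import torch

def safe_histogramdd(input, bins):
    # Hypothetical guard mirroring NumPy's validation: with a list of
    # bin counts, there must be exactly one count per sample dimension
    # (the innermost size of `input`).
    if isinstance(bins, (list, tuple)) and len(bins) != input.shape[-1]:
        raise ValueError(
            f"histogramdd: got {len(bins)} bin counts for samples of "
            f"dimension {input.shape[-1]}"
        )
    return torch.histogramdd(input, bins)
```

With this guard, the invalid five-element bins above raises immediately instead of producing non-deterministic shapes, while valid bins behave as before.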
