NNPACK Conv2D operation gives wrong result for non-contiguous weights #55781
Comments
High pri as it's a silent correctness issue
I think it was fixed earlier today by #55794
I don't think it was,
I can more or less reproduce the problem on x86 by comparing the convolution result with MKL enabled and disabled.
But the problem goes away if the weight is contiguous...
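A minimal sketch of the contiguity check described in the comment above, not the original reproduction script. The helper name `conv2d_contiguity_gap`, the tensor shapes, and the way the non-contiguous weight is built (permuting a tensor allocated in a different layout) are all assumptions made for illustration. On a build without the bug, the reported difference should be at or near zero.

```python
import torch
import torch.nn.functional as F


def conv2d_contiguity_gap(weight, x, padding=1):
    """Max abs difference between conv2d run with `weight` as given
    and with a contiguous copy of the same weight (hypothetical helper)."""
    out_as_is = F.conv2d(x, weight, padding=padding)
    out_contig = F.conv2d(x, weight.contiguous(), padding=padding)
    return (out_as_is - out_contig).abs().max().item()


torch.manual_seed(0)
x = torch.randn(1, 16, 32, 32)

# Allocate as (in, out, kH, kW) and permute to (out, in, kH, kW):
# same values as a regular conv weight, but non-contiguous strides.
weight = torch.randn(16, 8, 3, 3).permute(1, 0, 2, 3)
assert not weight.is_contiguous()

gap = conv2d_contiguity_gap(weight, x)
print(f"max abs difference: {gap:.3e}")
```

A large difference here would point at the convolution backend mishandling the weight strides, since both calls receive numerically identical weights.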
Summary: Added TestNN.test_conv2d_discontiguous_weight to prevent further regressions
Fixes pytorch#55781
Pull Request resolved: pytorch#56569
Reviewed By: ngimel
Differential Revision: D27926509
Pulled By: malfet
fbshipit-source-id: fa5ce943c3e4db4aa4de1b1cba35bd399fb3c54d
🐛 Bug
Hello,
I'm getting wrong results for a Conv2D operation on an ARM CPU, compared with the correct result I get for the same code on x86_64 architectures. The output tensors are identical in some parts but differ in large blocks of data in others. A quick view of the differences: https://github.com/octavianmm/torch_nn_functional_conv2d_problem/blob/main/results/difference.png
To Reproduce
Steps to reproduce the behavior:
A minimal working example that can reproduce this issue can be found in this Git repository:
https://github.com/octavianmm/torch_nn_functional_conv2d_problem
Expected behavior
The expected (and correct) result can be found in the above Git repo, in results/output_tensor_x86_64.pt (and .txt), whereas the incorrect result I got on the ARM CPU is in results/output_tensor_arm.pt (and .txt).
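One way to quantify how far two such output tensors diverge is sketched below. The helper `summarize_mismatch` and the tolerance are hypothetical, and the tensors are synthetic stand-ins rather than the files from the repository.

```python
import torch


def summarize_mismatch(expected, actual, atol=1e-5):
    """Return (number of mismatched elements, max abs difference).
    Hypothetical helper; `atol` is an assumed tolerance."""
    diff = (expected - actual).abs()
    mismatched = diff > atol
    return int(mismatched.sum().item()), diff.max().item()


# Synthetic stand-ins for the two saved outputs.
expected = torch.zeros(1, 2, 4, 4)
actual = expected.clone()
actual[0, 1, 2:, 2:] = 1.0  # corrupt one 2x2 block to mimic a bad region

count, max_diff = summarize_mismatch(expected, actual)
print(f"{count} mismatched elements, max abs difference {max_diff}")
```

To apply this to the saved results, the two tensors could be loaded with `torch.load("results/output_tensor_x86_64.pt")` and `torch.load("results/output_tensor_arm.pt")`, assuming each file holds a plain tensor.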
Environment
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (aarch64)
GCC version: (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Python version: 3.6 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/aarch64-linux-gnu/libcudnn.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_infer.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_train.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_infer.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_train.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_infer.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_train.so.8.0.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.8.0
[pip3] torchvision==0.9.0
[conda] Could not collect
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @malfet