Description
🐛 Describe the bug
I have a PyTorch model that I want to convert to ONNX format. I have managed to produce a traced version of the model with torch.jit.trace, which works correctly, but exporting this to ONNX produces a model that returns different (wrong) outputs.
I have attached the serialised traced model in this zip folder. The original PyTorch model is a slightly modified version of HOOD, changed to get tracing working (inputs/outputs turned into flat tuples of Tensors instead of HeteroData batches, etc.).
See https://gist.github.com/NathanielB123/75fa9229f49ccdf4af3628abf73c4d12 for the script I used to export to ONNX and print a sample of the outputs.
The full terminal output after running this code is:
/home/nathaniel/miniconda3/envs/hood/lib/python3.10/site-packages/torch/onnx/utils.py:825: UserWarning: no signature found for <torch.ScriptMethod object at 0x7fb25f51bc40>, skipping _decide_input_format
warnings.warn(f"{e}, skipping _decide_input_format")
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
0:
[0.11268847 0.11745856 1.171965 ]
[-0.09882738 -0.75298095 0.57629055]
500:
[ 0.36263114 -0.09879217 0.8864989 ]
[-0.21008204 -0.9361933 0.81566393]
1000:
[-0.6373062 0.28653488 1.4825789 ]
[-1.2088397 -0.39165828 1.3485816 ]
1500:
[-0.17704955 0.5167098 1.6783938 ]
[ 0.0593356 -0.71061885 1.5147011 ]
2000:
[-0.41957054 0.12701258 1.0791388 ]
[-0.35199475 -0.79554534 0.8017714 ]
2500:
[-0.5762653 -0.23102543 0.46685016]
[-1.1951463 -0.57939434 0.12812436]
3000:
[-0.01403345 -0.07792548 0.9337326 ]
[-0.79627764 -0.7208227 0.6344711 ]
3500:
[-0.6358214 0.34578276 1.105367 ]
[-0.8967222 -1.1001766 0.4269399]
4000:
[-0.20787726 0.01798273 1.1856971 ]
[-0.39938125 -0.7097303 0.6083367 ]
As you can see, the vectors output by the two models are very different.
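To put a number on the mismatch, here is a minimal sketch that compares two of the printed pairs element by element. The values are copied from the terminal output above; the helper names (`max_abs_diff`, `vectors_close`) and the tolerances are assumptions chosen for illustration, not part of the linked script. The output does not label which row belongs to which model, so the code refers to them only as `first` and `second`.

```python
import math

# Sample rows copied from the terminal output (indices 0 and 500).
# The output does not say which row is the traced model and which is ONNX,
# so "first" and "second" are placeholder labels.
samples = {
    0: ([0.11268847, 0.11745856, 1.171965],
        [-0.09882738, -0.75298095, 0.57629055]),
    500: ([0.36263114, -0.09879217, 0.8864989],
          [-0.21008204, -0.9361933, 0.81566393]),
}


def max_abs_diff(first, second):
    """Largest element-wise absolute difference between two equal-length vectors."""
    return max(abs(a - b) for a, b in zip(first, second))


def vectors_close(first, second, rel_tol=1e-3, abs_tol=1e-5):
    """True if every pair of elements agrees within the tolerances.

    The tolerances are assumed values for float32 inference, not taken
    from the issue.
    """
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(first, second)
    )


for index, (first, second) in samples.items():
    print(index, max_abs_diff(first, second), vectors_close(first, second))
```

For both sampled indices the largest element-wise difference is above 0.8, far beyond what floating-point rounding between two runtimes would explain. On the full output arrays, `numpy.testing.assert_allclose` performs the same kind of check in one call.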
Versions
Collecting environment information...
PyTorch version: 2.0.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 13.1.0-8ubuntu1~22.04) 13.1.0
Clang version: Could not collect
CMake version: version 3.27.1
Libc version: glibc-2.35
Python version: 3.10.9 (main, Mar 8 2023, 10:47:38) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Quadro T2000 with Max-Q Design
Nvidia driver version: 531.41
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz
CPU family: 6
Model: 165
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
Stepping: 2
BogoMIPS: 5616.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves flush_l1d arch_capabilities
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 192 KiB (6 instances)
L1i cache: 192 KiB (6 instances)
L2 cache: 1.5 MiB (6 instances)
L3 cache: 12 MiB (1 instance)
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed: Mitigation; Enhanced IBRS
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds: Unknown: Dependent on hypervisor status
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] pytorch3d==0.7.4
[pip3] torch==2.0.1
[pip3] torch-cluster==1.6.1
[pip3] torch-geometric==2.3.1
[pip3] torch-scatter==2.1.1
[pip3] torch-sparse==0.6.17
[pip3] torchaudio==2.0.2
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.15.2
[pip3] triton==2.0.0
[conda] blas 1.0 mkl conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2023.1.0 h6d00ec8_46342
[conda] mkl-service 2.4.0 py310h5eee18b_1
[conda] mkl_fft 1.3.6 py310h1128e8f_1
[conda] mkl_random 1.2.2 py310h1128e8f_1
[conda] mxnet-mkl 1.6.0 pypi_0 pypi
[conda] numpy 1.22.4 pypi_0 pypi
[conda] pytorch 2.0.1 py3.10_cuda11.7_cudnn8.5.0_0 pytorch
[conda] pytorch-cuda 11.7 h778d358_5 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] pytorch3d 0.7.4 py310_cu117_pyt201 pytorch3d
[conda] torch 2.0.1 pypi_0 pypi
[conda] torch-cluster 1.6.1 pypi_0 pypi
[conda] torch-geometric 2.3.1 pypi_0 pypi
[conda] torch-scatter 2.1.1 pypi_0 pypi
[conda] torch-sparse 0.6.17 pypi_0 pypi
[conda] torchaudio 2.0.2 py310_cu117 pytorch
[conda] torchinfo 1.8.0 pypi_0 pypi
[conda] torchtriton 2.0.0 py310 pytorch
[conda] torchvision 0.15.2 py310_cu117 pytorch
[conda] triton 2.0.0 pypi_0 pypi