build error: memory_desc.cpp:716:1: internal compiler error: Segmentation fault #123091

wencan · 2024-04-01T13:15:57Z

🐛 Describe the bug

command: python3.12 setup.py develop

error info:

[5867/8226] Building CXX object third_party/ideep/mkl-dnn/src/common/CMakeFiles/dnnl_common.dir/memory_desc.cpp.o
FAILED: third_party/ideep/mkl-dnn/src/common/CMakeFiles/dnnl_common.dir/memory_desc.cpp.o 
/usr/bin/c++ -DDNNL_ENABLE_CPU_ISA_HINTS -DDNNL_ENABLE_ITT_TASKS -DDNNL_ENABLE_MAX_CPU_ISA -DDNNL_X64=1 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -I/home/wencan/Projects/github.com/pytorch/pytorch/build/third_party/ideep/mkl-dnn/include -I/home/wencan/Projects/github.com/pytorch/pytorch/third_party/ideep/mkl-dnn/include -I/home/wencan/Projects/github.com/pytorch/pytorch/cmake/../third_party/benchmark/include -I/home/wencan/Projects/github.com/pytorch/pytorch/third_party/onnx -I/home/wencan/Projects/github.com/pytorch/pytorch/build/third_party/onnx -I/home/wencan/Projects/github.com/pytorch/pytorch/third_party/foxi -I/home/wencan/Projects/github.com/pytorch/pytorch/build/third_party/foxi -I/home/wencan/Projects/github.com/pytorch/pytorch/third_party/ideep/mkl-dnn/src -isystem /home/wencan/Projects/github.com/pytorch/pytorch/build/third_party/gloo -isystem /home/wencan/Projects/github.com/pytorch/pytorch/cmake/../third_party/gloo -isystem /home/wencan/Projects/github.com/pytorch/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -isystem /home/wencan/Projects/github.com/pytorch/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /home/wencan/Projects/github.com/pytorch/pytorch/cmake/../third_party/googletest/googletest/include -isystem /home/wencan/Projects/github.com/pytorch/pytorch/third_party/protobuf/src -isystem /home/wencan/Projects/github.com/pytorch/pytorch/third_party/gemmlowp -isystem /home/wencan/Projects/github.com/pytorch/pytorch/third_party/neon2sse -isystem /home/wencan/Projects/github.com/pytorch/pytorch/third_party/XNNPACK/include -isystem /home/wencan/Projects/github.com/pytorch/pytorch/third_party/ittapi/include -isystem /home/wencan/Projects/github.com/pytorch/pytorch/cmake/../third_party/eigen -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -fopenmp -fvisibility-inlines-hidden  -Wall -Wno-unknown-pragmas 
-fvisibility=internal   -fPIC -Wformat -Wformat-security -fstack-protector-strong  -Wmissing-field-initializers -Wmissing-field-initializers  -Wno-strict-overflow -Wno-maybe-uninitialized  -DITT_API_IPT_SUPPORT -O3 -DNDEBUG -DNDEBUG -D_FORTIFY_SOURCE=2 -std=c++17 -fPIC -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -MD -MT third_party/ideep/mkl-dnn/src/common/CMakeFiles/dnnl_common.dir/memory_desc.cpp.o -MF third_party/ideep/mkl-dnn/src/common/CMakeFiles/dnnl_common.dir/memory_desc.cpp.o.d -o third_party/ideep/mkl-dnn/src/common/CMakeFiles/dnnl_common.dir/memory_desc.cpp.o -c /home/wencan/Projects/github.com/pytorch/pytorch/third_party/ideep/mkl-dnn/src/common/memory_desc.cpp
during GIMPLE pass: evrp
/home/wencan/Projects/github.com/pytorch/pytorch/third_party/ideep/mkl-dnn/src/common/memory_desc.cpp: In lambda function:
/home/wencan/Projects/github.com/pytorch/pytorch/third_party/ideep/mkl-dnn/src/common/memory_desc.cpp:716:1: internal compiler error: Segmentation fault
  716 | }
      | ^
Please submit a full bug report, with preprocessed source.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
[5884/8226] Building CXX object third_party/fbgemm/CMakeFiles/fbgemm_avx2.dir/src/FbgemmI8DepthwiseAvx2.cc.o
ninja: build stopped: subcommand failed.

Versions

code version:
commit dd8a24b (HEAD -> main, origin/main, origin/gh/jgong5/34/base, origin/HEAD)

Python 3.12.2
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0

OS: Fedora Linux 39 (Workstation Edition) x86_64 
Kernel: 6.7.10-200.fc39.x86_64 
Uptime: 3 hours, 36 mins 
Packages: 2370 (rpm), 38 (flatpak) 
Shell: bash 5.2.26 
Resolution: 2560x1440 
DE: GNOME 45.5 
WM: Mutter 
WM Theme: Adwaita 
Theme: Adwaita [GTK2/3] 
Icons: Adwaita [GTK2/3] 
Terminal: gnome-terminal 
CPU: AMD Ryzen 7 1800X (16) @ 3.600GHz 
GPU: NVIDIA GeForce GTX 1080 Ti 
Memory: 3655MiB / 15892MiB

PyTorch version: 2.2.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Fedora Linux 39 (Workstation Edition) (x86_64)
GCC version: (GCC) 13.2.1 20240316 (Red Hat 13.2.1-7)
Clang version: Could not collect
CMake version: version 3.29.0
Libc version: glibc-2.38

Python version: 3.12.2 (main, Feb 21 2024, 00:00:00) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)] (64-bit runtime)
Python platform: Linux-6.7.10-200.fc39.x86_64-x86_64-with-glibc2.38
Is CUDA available: True
CUDA runtime version: 12.4.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 550.67
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        43 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               16
On-line CPU(s) list:                  0-15
Vendor ID:                            AuthenticAMD
Model name:                           AMD Ryzen 7 1800X Eight-Core Processor
CPU family:                           23
Model:                                1
Thread(s) per core:                   2
Core(s) per socket:                   8
Socket(s):                            1
Stepping:                             1
Frequency boost:                      enabled
CPU(s) scaling MHz:                   66%
CPU max MHz:                          3600.0000
CPU min MHz:                          2200.0000
BogoMIPS:                             7185.27
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev
Virtualization:                       AMD-V
L1d cache:                            256 KiB (8 instances)
L1i cache:                            512 KiB (8 instances)
L2 cache:                             4 MiB (8 instances)
L3 cache:                             16 MiB (2 instances)
NUMA node(s):                         1
NUMA node0 CPU(s):                    0-15
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Mitigation; untrained return thunk; SMT vulnerable
Vulnerability Spec rstack overflow:   Mitigation; Safe RET
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Retpolines, IBPB conditional, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] optree==0.11.0
[pip3] torch==2.2.2
[pip3] torchaudio==2.2.2
[pip3] torchinfo==1.8.0
[pip3] torchtext==0.17.2
[pip3] torchvision==0.17.2
[conda] Could not collect

cc @malfet @seemethere @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen

vpirogov · 2024-04-01T15:43:47Z

This is an internal compiler error in GCC. I found a similar report in Bugzilla, but it's not clear whether it is the same issue.

malfet · 2024-04-01T17:33:50Z

Please try downgrading to an older GCC, and also file a Bugzilla report against it, as we have yet to test that our code compiles with gcc-13.2.

wencan · 2024-04-02T08:52:37Z

@malfet
@vpirogov
I downgraded gcc to 13.2.1 20230918 (Red Hat 13.2.1-3), and the issue no longer occurred.

bugzilla bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114558
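
For anyone checking whether their toolchain matches the two builds mentioned in this thread, here is a minimal sketch. It only knows the two Red Hat builds reported above (13.2.1-7 failed, 13.2.1-3 worked); the helper names `red_hat_build` and `classify` are made up for illustration, and other GCC builds may or may not be affected.

```python
import re

# Builds reported in this thread only; not an exhaustive list of affected compilers.
FAILING_BUILD = "13.2.1-7"  # GCC 13.2.1 20240316, hit the internal compiler error
WORKING_BUILD = "13.2.1-3"  # GCC 13.2.1 20230918, compiled memory_desc.cpp cleanly


def red_hat_build(version_text):
    """Extract the Red Hat package build (e.g. '13.2.1-7') from `gcc --version` text.

    Returns None when the text carries no '(Red Hat X.Y.Z-N)' tag.
    """
    match = re.search(r"\(Red Hat (\d+\.\d+\.\d+-\d+)\)", version_text)
    return match.group(1) if match else None


def classify(version_text):
    """Return 'failing', 'working', or 'unknown' based on this thread's reports."""
    build = red_hat_build(version_text)
    if build == FAILING_BUILD:
        return "failing"
    if build == WORKING_BUILD:
        return "working"
    return "unknown"
```

To use it, pass in the first line printed by `c++ --version` (for example, captured with `subprocess.run(["c++", "--version"], capture_output=True, text=True).stdout`). An "unknown" result says nothing either way, since only two builds were tested here.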

drisspg added the triage review label Apr 1, 2024

janeyx99 added the module: build, triaged, module: mkldnn, and module: third_party labels and removed the triage review label Apr 1, 2024

ptrblck mentioned this issue Apr 4, 2024

[Bug Fix] Fix Cuda 12.4 compilation - Refactor SFINAE boxing logic #123377

Closed
