
torch pollutes libgomp symbols when import _C #109446

Closed
sunqm opened this issue Sep 16, 2023 · 1 comment
Labels: module: binaries, module: openmp, module: third_party, triaged

Comments


sunqm commented Sep 16, 2023

🐛 Describe the bug

In torch's __init__.py, the _C library is imported with the RTLD_GLOBAL mode. This pollutes the process-wide symbol namespace with the symbols provided by torch's bundled libgomp: when other libraries are imported later, their libgomp symbols and functions are resolved against the copy shipped with pytorch. This name pollution causes bugs when pytorch is used together with other libraries that rely on libgomp; one may get segfaults or strange results due to libgomp version conflicts.
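A quick way to confirm which copies are in play on Linux is to inspect /proc/self/maps after importing torch (a minimal diagnostic sketch, not part of torch itself):

```python
# Minimal diagnostic (Linux-only): list every libgomp mapped into this
# process. After `import torch`, the copy bundled under torch/lib shows
# up alongside (or instead of) the system one.
import torch  # triggers the RTLD_GLOBAL load in torch/__init__.py

with open("/proc/self/maps") as maps:
    paths = {line.split()[-1] for line in maps if "libgomp" in line}
for path in sorted(paths):
    print(path)
```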

I noticed that in __init__.py, this behavior is controlled by the variable USE_GLOBAL_DEPS. Disabling this option can solve the problem. Is there a way to configure this variable at runtime?
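For reference, the relevant pattern looks roughly like the following (a simplified sketch based on reading the 2.0.x source; platform guards and error handling are elided, and exact names may differ between releases):

```python
# Simplified sketch of the loading pattern in torch/__init__.py.
import ctypes
import os
import platform

USE_GLOBAL_DEPS = True  # module-level flag, read once while `import torch` runs

def _load_global_deps():
    # libtorch_global_deps pulls in shared dependencies such as libgomp;
    # RTLD_GLOBAL exports their symbols to every later dlopen() in the process.
    lib_name = "libtorch_global_deps" + (
        ".dylib" if platform.system() == "Darwin" else ".so"
    )
    lib_path = os.path.join(os.path.dirname(__file__), "lib", lib_name)
    try:
        ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        pass  # the real code inspects the error and may fall back or re-raise

if USE_GLOBAL_DEPS:
    _load_global_deps()
from torch._C import *  # noqa: F403
```

Since the flag is consulted while `import torch` executes, it does not look configurable at runtime short of patching the installed file.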

Versions

Collecting environment information...
PyTorch version: 2.0.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.27.4
Libc version: glibc-2.31

Python version: 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.11.0-36-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 10.1.243
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1050 Ti
Nvidia driver version: 470.199.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 28
On-line CPU(s) list: 0-27
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Core(TM) i9-7940X CPU @ 3.10GHz
Stepping: 4
CPU MHz: 3100.000
CPU max MHz: 4400.0000
CPU min MHz: 1200.0000
BogoMIPS: 6199.99
Virtualization: VT-x
L1d cache: 448 KiB
L1i cache: 448 KiB
L2 cache: 14 MiB
L3 cache: 19.3 MiB
NUMA node0 CPU(s): 0-27
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT vulnerable
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req md_clear flush_l1d arch_capabilities

Versions of relevant libraries:
[pip3] numpy==1.25.2
[pip3] torch==2.0.1
[pip3] triton==2.0.0
[conda] cudatoolkit 10.1.243 h6bb024c_0
[conda] mkl 2019.0 118
[conda] mkl-service 1.1.2 py37h90e4bf4_5
[conda] mkl_fft 1.0.4 py37h4414c95_1
[conda] mkl_random 1.0.1 py37h4414c95_1
[conda] numpy 1.21.6
[conda] numpy 1.15.4 py37h99e49ec_0
[conda] numpy-base 1.15.4 py37h2f8d375_0
[conda] numpydoc 0.8.0 py37_0
[conda] pytorch 1.3.1 py3.7_cuda10.1.243_cudnn7.6.3_0 pytorch

cc @seemethere @malfet

@malfet added the module: binaries, triaged, module: third_party, and module: openmp labels on Sep 18, 2023

wxj6000 commented Sep 21, 2023

It is VERY dangerous to use RTLD_GLOBAL to load libgomp. Other libraries may depend on different versions of OpenMP. In some cases this does not raise any segfault but produces numerical errors because of the 'name pollution'. This was also discussed a long time ago in #3059.

After some investigation, as far as I can tell this commit is the current implementation: c735fd7. OpenMP is loaded along with the 'global dependencies'. With USE_GLOBAL_DEPS = False, the issue on my side is resolved, though I am not sure whether this resolves it completely. I hope this helps.
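For anyone who cannot edit the installed torch/__init__.py, one runtime-only mitigation is to pre-load a single libgomp into the global namespace before importing torch, so later-loaded libraries bind their OpenMP symbols to that one copy. This is an untested sketch, and the library path is an assumption for a typical Ubuntu x86_64 install:

```python
# Hedged workaround sketch: make one libgomp win symbol resolution by
# loading it with RTLD_GLOBAL before torch loads its bundled copy.
# The path is an assumption for Ubuntu x86_64; adjust for your system.
# Roughly equivalent to:
#   LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libgomp.so.1 python app.py
import ctypes

ctypes.CDLL("/usr/lib/x86_64-linux-gnu/libgomp.so.1", mode=ctypes.RTLD_GLOBAL)

import torch  # noqa: E402  (OpenMP references now bind to the copy above first)
```

Whether this is safe depends on the two libgomp versions being ABI-compatible, so treat it as a stopgap rather than a fix.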
