OpenMP runtime detects incorrect processor topology on AMD CPUs #40073

@tycho

Bugzilla Link: 40727
Version: unspecified
OS: Linux
CC: @RKSimon

Extended Description

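The processor topology detection in the OpenMP runtime gets confused by AMD CPUs. On the Ryzen 7 2700X below (8 cores, 2 threads per core), it reports a single core with 16 threads: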

$ env KMP_AFFINITY=scatter,verbose ./nbody --bodies 64 --no-crosscheck --iterations 1 --cycle-after 5
OMP: Info #211: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #212: KMP_AFFINITY: cpuid leaf 11 not supported - decoding legacy APIC ids.
OMP: Info #149: KMP_AFFINITY: Affinity capable, using global cpuid info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0-15
OMP: Info #156: KMP_AFFINITY: 16 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #159: KMP_AFFINITY: 1 packages x 1 cores/pkg x 16 threads/core (1 total cores)
OMP: Info #213: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 thread 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 thread 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 thread 3
OMP: Info #171: KMP_AFFINITY: OS proc 4 maps to package 0 thread 4
OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 thread 5
OMP: Info #171: KMP_AFFINITY: OS proc 6 maps to package 0 thread 6
OMP: Info #171: KMP_AFFINITY: OS proc 7 maps to package 0 thread 7
OMP: Info #171: KMP_AFFINITY: OS proc 8 maps to package 0 thread 8
OMP: Info #171: KMP_AFFINITY: OS proc 9 maps to package 0 thread 9
OMP: Info #171: KMP_AFFINITY: OS proc 10 maps to package 0 thread 10
OMP: Info #171: KMP_AFFINITY: OS proc 11 maps to package 0 thread 11
OMP: Info #171: KMP_AFFINITY: OS proc 12 maps to package 0 thread 12
OMP: Info #171: KMP_AFFINITY: OS proc 13 maps to package 0 thread 13
OMP: Info #171: KMP_AFFINITY: OS proc 14 maps to package 0 thread 14
OMP: Info #171: KMP_AFFINITY: OS proc 15 maps to package 0 thread 15
OMP: Info #144: KMP_AFFINITY: Threads may migrate across 1 innermost levels of machine
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7608 thread 0 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7610 thread 1 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7611 thread 2 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7612 thread 3 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7613 thread 4 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7614 thread 5 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7615 thread 6 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7616 thread 7 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7617 thread 8 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7618 thread 9 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7619 thread 10 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7620 thread 11 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7621 thread 12 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7622 thread 13 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7623 thread 14 bound to OS proc set 0-15
OMP: Info #249: KMP_AFFINITY: pid 7608 tid 7624 thread 15 bound to OS proc set 0-15

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 8
Model name: AMD Ryzen 7 2700X Eight-Core Processor
Stepping: 2
CPU MHz: 4200.705
CPU max MHz: 3700.0000
CPU min MHz: 2200.0000
BogoMIPS: 7384.85
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-15
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

It looks like the OpenMP runtime needs to be taught about the AMD-specific CPUID leaf 0x8000001E. Or, as an ugly fallback, could it learn the topology from /proc/cpuinfo when leaf 0xb is unavailable?
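
To make the leaf 0x8000001E suggestion concrete, here is a minimal sketch in C of what decoding that leaf might look like. It is not code from the OpenMP runtime: the names `amd_topo_t`, `amd_decode_topoext` and `amd_query_topoext` are hypothetical, and the bit layout is the one I understand AMD to document for family 17h (family 23 in the lscpu output above), where EBX[7:0] is the core ID and EBX[15:8] is threads per core minus one. Older families define EBX differently, so treat the field names as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang helper: __get_cpuid, __get_cpuid_count */
#endif

/* Hypothetical result type for this sketch; not an OpenMP runtime type. */
typedef struct {
    uint32_t ext_apic_id;      /* EAX[31:0]: extended APIC ID          */
    uint32_t core_id;          /* EBX[7:0]:  core ID (family 17h)      */
    uint32_t threads_per_core; /* EBX[15:8] + 1 (family 17h)           */
    uint32_t node_id;          /* ECX[7:0]:  node ID                   */
} amd_topo_t;

/* Pure decoding step: split raw leaf 0x8000001E registers into fields. */
amd_topo_t amd_decode_topoext(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    amd_topo_t t;
    t.ext_apic_id = eax;
    t.core_id = ebx & 0xffu;
    t.threads_per_core = ((ebx >> 8) & 0xffu) + 1u;
    t.node_id = ecx & 0xffu;
    return t;
}

/* Query the leaf on the CPU this thread is currently running on.
 * Returns false if the leaf is unavailable (or on non-x86 builds). */
bool amd_query_topoext(amd_topo_t *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* The "topoext" flag is CPUID 0x80000001, ECX bit 22. */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx) ||
        !(ecx & (1u << 22)))
        return false;

    if (!__get_cpuid_count(0x8000001Eu, 0, &eax, &ebx, &ecx, &edx))
        return false;

    *out = amd_decode_topoext(eax, ebx, ecx);
    return true;
#else
    (void)out;
    return false;
#endif
}
```

CPUID reports values for the logical CPU that executes it, so a real implementation would presumably pin the calling thread to each OS proc in turn before calling something like `amd_query_topoext`, and would also check the vendor string before trusting the extended leaf.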
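
For the /proc/cpuinfo fallback, a rough sketch in C of the counting involved: packages are the distinct "physical id" values, cores are the distinct ("physical id", "core id") pairs, and threads are the "processor" stanzas. The function name `cpuinfo_count_topology`, the type `cpuinfo_topo_t` and the `MAX_ENTRIES` bound are made up for illustration, and the sketch assumes the x86 Linux layout in which stanzas are separated by blank lines and no line exceeds the 4096-byte buffer.

```c
#include <stdio.h>

#define MAX_ENTRIES 1024 /* arbitrary bound chosen for this sketch */

/* Hypothetical result type; not part of the OpenMP runtime. */
typedef struct {
    int packages; /* distinct "physical id" values             */
    int cores;    /* distinct (physical id, core id) pairs     */
    int threads;  /* number of "processor" stanzas             */
} cpuinfo_topo_t;

/* Reads /proc/cpuinfo-style text from fp.
 * Returns 0 on success, -1 if the fixed-size tables overflow. */
int cpuinfo_count_topology(FILE *fp, cpuinfo_topo_t *out)
{
    int pkgs[MAX_ENTRIES];
    int core_pkg[MAX_ENTRIES], core_id[MAX_ENTRIES];
    int n_pkgs = 0, n_cores = 0, n_threads = 0;
    int pkg = -1, core = -1;
    char line[4096];

    for (;;) {
        char *got = fgets(line, sizeof line, fp);
        int v;

        /* A blank line (or EOF) ends the current per-processor stanza. */
        if (got == NULL || line[0] == '\n') {
            if (pkg >= 0 && core >= 0) {
                int i;

                /* Record the package if it has not been seen before. */
                for (i = 0; i < n_pkgs && pkgs[i] != pkg; i++) {
                }
                if (i == n_pkgs) {
                    if (n_pkgs == MAX_ENTRIES)
                        return -1;
                    pkgs[n_pkgs++] = pkg;
                }

                /* Record the (package, core) pair if it is new. */
                for (i = 0; i < n_cores &&
                            !(core_pkg[i] == pkg && core_id[i] == core); i++) {
                }
                if (i == n_cores) {
                    if (n_cores == MAX_ENTRIES)
                        return -1;
                    core_pkg[n_cores] = pkg;
                    core_id[n_cores] = core;
                    n_cores++;
                }
            }
            pkg = core = -1;
            if (got == NULL)
                break;
            continue;
        }

        /* Whitespace in the format matches the tabs before the colon. */
        if (sscanf(line, "processor : %d", &v) == 1)
            n_threads++;
        else if (sscanf(line, "physical id : %d", &v) == 1)
            pkg = v;
        else if (sscanf(line, "core id : %d", &v) == 1)
            core = v;
    }

    out->packages = n_pkgs;
    out->cores = n_cores;
    out->threads = n_threads;
    return 0;
}
```

Called on `fopen("/proc/cpuinfo", "r")`, this should agree with the lscpu figures above on a machine like the one in this report, though it only counts entries and does not build the per-OS-proc map that KMP_AFFINITY prints.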

Labels: bugzilla (Issues migrated from bugzilla), openmp