hwloc_topology_check fails on macOS #580

Closed
HadrienG2 opened this issue Apr 25, 2023 · 3 comments

@HadrienG2

What version of hwloc are you using?

2.8.0

Which operating system and hardware are you running on?

GitHub macOS CI nodes. Unfortunately, the configuration of these nodes can vary from one CI run to another, so if someone has a Mac at hand and can help me reproduce this in a more stable environment, it would be much appreciated.

Details of the problem

I tried to add hwloc_topology_check() calls as debug assertions at strategic places in my Rust bindings (end of topology building, end of topology editing...), and I noticed that the assertion fails on macOS even for simple unit tests (e.g. building the topology and checking that the depth is not zero).

Here is a failing CI log from a build with --enable-debug. I'm not sure exactly which test it comes from, but judging from the hwloc logs, I suspect the failure happens right at the end of topology building: https://github.com/HadrienG2/hwlocality/actions/runs/4797854399/jobs/8535371157 .

It would be a lot more convenient to debug this further (e.g. get stack traces) if I had a Mac around, or help from someone who does. A trivial C program that calls hwloc_topology_init(), then hwloc_topology_load(), then hwloc_topology_check() should hopefully be enough to reproduce.

Additional information

hwloc debug output (interleaved with the sysctl output in the original log)
hwloc verbose debug enabled, may be disabled with HWLOC_DEBUG_VERBOSE=0 in the environment.
CPU phase discovery...
CPU phase discovery in component no_os...
 * CPU cpusets *
cpu 0 (os 0) has cpuset 0x00000001
cpu 1 (os 1) has cpuset 0x00000002
cpu 2 (os 2) has cpuset 0x00000004
Add missing single NUMA node
Fixup root sets
Propagate sets
Removing unauthorized sets from all sets
Ok, finished tweaking, now connect
--- PU level has number 1
Removing bridge objects if needed
Removing empty objects
Removing levels with HWLOC_TYPE_FILTER_KEEP_STRUCTURE
Propagate total memory up
hwloc verbose debug enabled, may be disabled with HWLOC_DEBUG_VERBOSE=0 in the environment.
Sorting memory tiers...
  tier 0 = node L#0 P#0 with tier type 0 and local BW #0
UNKNOWN-memory-tier max bandwidth 0
SPM-memory-tier min bandwidth 0
cannot assume SPM means HBM
Cannot bind to PU P#0
Cannot bind to PU P#1
Cannot bind to PU P#2

Output of sysctl hw and sysctl machdep
hw.ncpu: 3
hw.byteorder: 1234
hw.memsize: 15032385536
hw.activecpu: 3
hw.perflevel0.physicalcpu: 3
hw.perflevel0.physicalcpu_max: 3
hw.perflevel0.logicalcpu: 3
hw.perflevel0.logicalcpu_max: 3
hw.perflevel0.l1icachesize: 32768
hw.perflevel0.l1dcachesize: 32768
hw.perflevel0.l2cachesize: 262144
hw.perflevel0.cpusperl2: 1
hw.perflevel0.l3cachesize: 12582912
hw.perflevel0.cpusperl3: 3
hw.optional.floatingpoint: 1
hw.optional.mmx: 1
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.supplementalsse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.x86_64: 1
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.rdrand: 1
hw.optional.f16c: 1
hw.optional.enfstrg: 0
hw.optional.fma: 0
hw.optional.avx2_0: 0
hw.optional.bmi1: 0
hw.optional.bmi2: 0
hw.optional.rtm: 0
hw.optional.hle: 0
hw.optional.adx: 0
hw.optional.mpx: 0
hw.optional.sgx: 0
hw.optional.avx512f: 0
hw.optional.avx512cd: 0
hw.optional.avx512dq: 0
hw.optional.avx512bw: 0
hw.optional.avx512vl: 0
hw.optional.avx512ifma: 0
hw.optional.avx512vbmi: 0
hw.features.allows_security_research: 0
hw.physicalcpu: 3
hw.physicalcpu_max: 3
hw.logicalcpu: 3
hw.logicalcpu_max: 3
hw.cputype: 7
hw.cpusubtype: 4
hw.cpu64bit_capable: 1
hw.cpufamily: 526772277
hw.cpusubfamily: 0
hw.cacheconfig: 3 1 1 3 0 0 0 0 0 0
hw.cachesize: 15032385536 32768 262144 12582912 0 0 0 0 0 0
hw.pagesize: 4096
hw.pagesize32: 4096
hw.busfrequency: 100000000
hw.busfrequency_min: 100000000
hw.busfrequency_max: 100000000
hw.cpufrequency: 3337000000
hw.cpufrequency_min: 3337000000
hw.cpufrequency_max: 3337000000
hw.cachelinesize: 64
hw.l1icachesize: 32768
hw.l1dcachesize: 32768
hw.l2cachesize: 262144
hw.l3cachesize: 12582912
hw.tbfrequency: 1000000000
hw.packages: 1
hw.use_kernelmanagerd: 1
hw.serialdebugmode: 0
hw.nperflevels: 1
hw.targettype: Mac
machdep.vectors.timer: 221
machdep.vectors.IPI: 222
machdep.pmap.hashwalks: 3910844
machdep.pmap.hashcnts: 4154633
machdep.pmap.hashmax: 17
machdep.pmap.kernel_text_ps: 4096
machdep.pmap.kern_pv_reserve: 14000
machdep.memmap.Conventional: 15031271424
machdep.memmap.RuntimeServices: 585728
machdep.memmap.ACPIReclaim: 114688
machdep.memmap.ACPINVS: 16384
machdep.memmap.PalCode: 0
machdep.memmap.Reserved: 262144
machdep.memmap.Unusable: 0
machdep.memmap.Other: 0
machdep.tsc.nanotime.tsc_base: 10871024237
machdep.tsc.nanotime.ns_base: 0
machdep.tsc.nanotime.scale: 1227133513
machdep.tsc.nanotime.shift: 0
machdep.tsc.nanotime.generation: 2
machdep.tsc.frequency: 3500000000
machdep.tsc.deep_idle_rebase: 1
machdep.tsc.at_boot: 0
machdep.tsc.rebase_abs_time: 3106006857
machdep.misc.fast_uexc_support: 1
machdep.misc.panic_restart_timeout: 2147483647
machdep.misc.interrupt_latency_max: 0x0 0x51 0x3a0c58ad
machdep.misc.timer_queue_trace: 
machdep.misc.nmis: 0
machdep.xcpm.mode: 0
machdep.xcpm.pcps_mode: 0
machdep.xcpm.hard_plimit_max_100mhz_ratio: 0
machdep.xcpm.hard_plimit_min_100mhz_ratio: 0
machdep.xcpm.soft_plimit_max_100mhz_ratio: 0
machdep.xcpm.soft_plimit_min_100mhz_ratio: 0
machdep.xcpm.tuib_plimit_max_100mhz_ratio: 0
machdep.xcpm.tuib_plimit_min_100mhz_ratio: 0
machdep.xcpm.lpm_plimit_max_100mhz_ratio: 0
machdep.xcpm.tuib_enabled: 0
machdep.xcpm.lpm_enabled: 0
machdep.xcpm.power_source: 0
machdep.xcpm.bootplim: 0
machdep.xcpm.bootpst: 0
machdep.xcpm.tuib_ns: 0
machdep.xcpm.vectors_loaded_count: 0
machdep.xcpm.ratio_change_ratelimit_ns: 500000
machdep.xcpm.ratio_changes_total: 0
machdep.xcpm.maxbusdelay: 0
machdep.xcpm.maxintdelay: 0
machdep.xcpm.mid_applications: 0
machdep.xcpm.mid_relaxations: 0
machdep.xcpm.mid_mode: 1
machdep.xcpm.mid_cst_control_limit: 0
machdep.xcpm.mid_mode_active: 0
machdep.xcpm.mbd_mode: 1
machdep.xcpm.mbd_applications: 0
machdep.xcpm.mbd_relaxations: 0
machdep.xcpm.forced_idle_ratio: 100
machdep.xcpm.forced_idle_period: 30000000
machdep.xcpm.deep_idle_log: 0
machdep.xcpm.qos_txfr: 1
machdep.xcpm.deep_idle_count: 0
machdep.xcpm.deep_idle_last_stats: n/a
machdep.xcpm.deep_idle_total_stats: n/a
machdep.xcpm.cpu_thermal_level: 0
machdep.xcpm.gpu_thermal_level: 0
machdep.xcpm.io_thermal_level: 0
machdep.xcpm.io_control_engages: 0
machdep.xcpm.io_control_disengages: 0
machdep.xcpm.io_filtered_reads: 0
machdep.xcpm.pcps_rt_override_mode: 0
machdep.xcpm.io_cst_control_enabled: 0
machdep.xcpm.ring_boost_enabled: 0
machdep.xcpm.io_epp_boost_enabled: 0
machdep.xcpm.epp_override: 0
machdep.xcpm.perf_hints: 0
machdep.xcpm.pcps_rt_override_ns: 0
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.tsc_ccc.denominator: 0
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 4096
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 16
machdep.cpu.thermal.sensor: 0
machdep.cpu.thermal.dynamic_acceleration: 0
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.thresholds: 0
machdep.cpu.thermal.ACNT_MCNT: 0
machdep.cpu.thermal.core_power_limits: 0
machdep.cpu.thermal.fine_grain_clock_mod: 0
machdep.cpu.thermal.package_thermal_intr: 0
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.xsave.extended_state1: 0 0 0 0
machdep.cpu.arch_perf.version: 1
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 127
machdep.cpu.arch_perf.fixed_number: 0
machdep.cpu.arch_perf.fixed_width: 0
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.size: 256
machdep.cpu.tlb.inst.small: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.shared: 512
machdep.cpu.address_bits.physical: 43
machdep.cpu.address_bits.virtual: 48
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
machdep.cpu.family: 6
machdep.cpu.model: 58
machdep.cpu.extmodel: 3
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 9
machdep.cpu.feature_bits: 18427078393948011519
machdep.cpu.leaf7_feature_bits: 643 0
machdep.cpu.leaf7_feature_bits_edx: 3154117632
machdep.cpu.extfeature_bits: 4967106816
machdep.cpu.signature: 198313
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.logical_per_package: 4
machdep.cpu.cores_per_package: 4
machdep.cpu.microcode_version: 1070
machdep.cpu.processor_flag: 0
machdep.cpu.core_count: 3
machdep.cpu.thread_count: 3
machdep.user_idle_level: 0
machdep.x2apic_enabled: 1
machdep.eager_timer_evaluations: 0
machdep.eager_timer_evaluation_max: 0
machdep.x86_fp_simd_isr_uses: 0

hwloc-gather-cpuid does not seem to produce useful output, for whatever reason: https://github.com/HadrienG2/hwlocality/suites/12468995049/artifacts/664378635.

@bgoglin
Contributor

bgoglin commented Apr 25, 2023

Likely a duplicate of #564 (machine with perflevel caches not getting ignored), fixed in hwloc 2.9.1 and in the latest v2.8 branch (commit cfab985).

@HadrienG2
Author

Thanks! I'll bump CI to v2.9.1 and see if that fixes it.

@HadrienG2
Author

...and that does fix it. Sorry for the noise!
