I'm using today's main branch (3c8b153) and I get incorrect output from cache-info on a machine booted with the "nosmt" kernel command line parameter:
WITH the "nosmt" parameter (note the weird "shared by XX processors" at the end of each line):
[NOSMT]$ ./cache-info
Max cache size (upper bound): 16777216 bytes
L1 instruction cache: 96 x 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 97 processors
L1 data cache: 96 x 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 97 processors
L2 unified cache: 96 x 512 KB (inclusive), 8-way set associative (1024 sets), 64 byte lines, shared by 97 processors
L3 unified cache: 24 x 16 MB (exclusive), 16-way set associative (16384 sets), 64 byte lines, shared by 100 processors
WITHOUT the "nosmt" parameter:
[SMT]$ ./cache-info
Max cache size (upper bound): 16777216 bytes
L1 instruction cache: 96 x 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 2 processors
L1 data cache: 96 x 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 2 processors
L2 unified cache: 96 x 512 KB (inclusive), 8-way set associative (1024 sets), 64 byte lines, shared by 2 processors
L3 unified cache: 24 x 16 MB (exclusive), 16-way set associative (16384 sets), 64 byte lines, shared by 8 processors
Looking at the x86 logic (I'm horribly unfamiliar with it, so bear with me here :)), it looks like we'd only accidentally generate N caches if we assigned them each different L1I IDs. Digging through the code paths here, I don't see anywhere we would accidentally mark the same ID as different IDs, so that leads me to believe that our l1i_id calculation fails somehow.
// How we calculate the apic_bits for each cache in `cpuinfo_x86_decode_cache_properties`.
const uint32_t cores = 1 + ((regs.eax >> 14) & UINT32_C(0x00000FFF));
const uint32_t apic_bits = bit_length(cores);
...
// How we later mask to acquire the L1I ID.
const uint32_t apic_id = linux_processors[i].apic_id;
const uint32_t l1i_id = apic_id & ~bit_mask(x86_processor.cache.l1i.apic_bits);
I'd hazard a guess that this means either:
- The apic_id is wrong for all processors with 'nosmt' enabled?
- The apic_bits is wrong for all caches with 'nosmt' enabled?
Hello,
lscpu output WITH the "nosmt" parameter:
lscpu output WITHOUT the "nosmt" parameter:
The differences are in the On-line/Off-line CPU(s) lists.
It caught me by surprise with a SIGFPE when using the NNPACK project (Maratyszcza/NNPACK#218).
Thank you for your help.