Temperature mapping is wrong on i7-1355U #1335

Open
leahneukirchen opened this issue Nov 24, 2023 · 2 comments · May be fixed by #1352
Labels
bug 🐛 (Something isn't working), help wanted (Extra attention is needed), Linux 🐧 (Linux related issues)

Comments

@leahneukirchen

This bug is related to #879.

I have a 13th Gen Intel Raptor Lake i7-1355U CPU, which has 2 performance cores (with Hyperthreading), and 8 efficiency cores.

lscpu gives the respective frequencies:

% lscpu -e
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0 0:0:0:0          yes 5000.0000 400.0000 2198.6541
  1    0      0    0 0:0:0:0          yes 5000.0000 400.0000 2199.9771
  2    0      0    1 4:4:1:0          yes 5000.0000 400.0000  559.3190
  3    0      0    1 4:4:1:0          yes 5000.0000 400.0000 1644.4080
  4    0      0    2 8:8:2:0          yes 3700.0000 400.0000 1799.9900
  5    0      0    3 9:9:2:0          yes 3700.0000 400.0000  400.0000
  6    0      0    4 10:10:2:0        yes 3700.0000 400.0000 1658.6350
  7    0      0    5 11:11:2:0        yes 3700.0000 400.0000 1789.9850
  8    0      0    6 12:12:3:0        yes 3700.0000 400.0000 1728.6110
  9    0      0    7 13:13:3:0        yes 3700.0000 400.0000  743.2380
 10    0      0    8 14:14:3:0        yes 3700.0000 400.0000 1759.9810
 11    0      0    9 15:15:3:0        yes 3700.0000 400.0000 1405.3361

The coretemp sensor provides these temperatures:

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +40.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +38.0°C  (high = +100.0°C, crit = +100.0°C)
Core 4:        +33.0°C  (high = +100.0°C, crit = +100.0°C)
Core 8:        +39.0°C  (high = +100.0°C, crit = +100.0°C)
Core 9:        +39.0°C  (high = +100.0°C, crit = +100.0°C)
Core 10:       +39.0°C  (high = +100.0°C, crit = +100.0°C)
Core 11:       +39.0°C  (high = +100.0°C, crit = +100.0°C)
Core 12:       +38.0°C  (high = +100.0°C, crit = +100.0°C)
Core 13:       +38.0°C  (high = +100.0°C, crit = +100.0°C)
Core 14:       +38.0°C  (high = +100.0°C, crit = +100.0°C)
Core 15:       +38.0°C  (high = +100.0°C, crit = +100.0°C)

However, htop 3.2.2 reports temperatures like this:

    1[||||||          13.5%  472MHz 38°C]   7[|||               2.4% 1414MHz 40°C]
    2[|||||||         17.4%  558MHz 33°C]   8[||||              6.0% 1683MHz 40°C]
    3[|||              7.3%  980MHz 38°C]   9[|||               5.4% 1699MHz 40°C]
    4[|||||||         16.2% 1541MHz 38°C]  10[|||               4.8% 1497MHz 39°C]
    5[||||             4.9%  790MHz 38°C]  11[||||||            12.7%  400MHz N/A]
    6[||               4.2%  564MHz 38°C]  12[||                 1.8%  655MHz N/A]

Note that the last two CPUs show N/A. Also, CPUs 1 and 2 (as numbered by htop) show different temperatures,
although they are hyperthreads of the same physical core.

htop should print the same temperature for CPUs 1 and 2, as well as for CPUs 3 and 4, and assign the remaining sensor readings to the remaining CPUs, so that CPU 12 ends up with 39°C instead of N/A. The mapping can be derived by parsing /proc/cpuinfo:


% cat /proc/cpuinfo | grep -e processor -e 'core id' -e '^$'
processor	: 0
core id		: 0

processor	: 1
core id		: 0

processor	: 2
core id		: 4

processor	: 3
core id		: 4

processor	: 4
core id		: 8

processor	: 5
core id		: 9

processor	: 6
core id		: 10

processor	: 7
core id		: 11

processor	: 8
core id		: 12

processor	: 9
core id		: 13

processor	: 10
core id		: 14

processor	: 11
core id		: 15
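
For illustration, the processor-to-core-id mapping shown above could be extracted roughly as follows. This is a minimal Python sketch, not htop's code (htop is written in C); `parse_core_ids` and `SAMPLE` are hypothetical names, and the sample text is a shortened copy of the output above.

```python
def parse_core_ids(cpuinfo_text):
    """Return {processor number: core id} from /proc/cpuinfo-style text."""
    mapping = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line separating two processor entries
        key = key.strip()
        if key == "processor":
            current = int(value)
        elif key == "core id" and current is not None:
            mapping[current] = int(value)
    return mapping


# Shortened sample taken from the output above (first five processors only).
SAMPLE = """\
processor\t: 0
core id\t\t: 0

processor\t: 1
core id\t\t: 0

processor\t: 2
core id\t\t: 4

processor\t: 3
core id\t\t: 4

processor\t: 4
core id\t\t: 8
"""

core_ids = parse_core_ids(SAMPLE)

# Group processors that share a core id (hyperthread siblings).
siblings = {}
for cpu, core in core_ids.items():
    siblings.setdefault(core, []).append(cpu)
```

With this sample, processors 0 and 1 end up grouped under core id 0 and processors 2 and 3 under core id 4, which is why they should show the same temperature.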
@BenBE added the labels bug 🐛, help wanted, and Linux 🐧 on Nov 24, 2023
@leahneukirchen

I'm working on a patch to do this properly; htop makes a few bold assumptions here.

@leahneukirchen

leahneukirchen commented Dec 10, 2023

Ok, work in progress is here: main...leahneukirchen:htop:cpu-topology

I think this also fixes #1176, #806, #1048 in a better way.

Essentially, we store the physical id (socket) and the core id (shared between hyperthreads) for each CPU. The coretemp sensor emits the physical id and the core id in its labels, so we map using those. This fixes the issue on my i7-1355U, where some cores use hyperthreading and others do not.
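
The idea can be sketched like this. It is a rough Python illustration and not the code from the branch (htop is C); `map_temperatures` is a hypothetical helper, it handles a single coretemp device only, and it assumes the label forms `Package id N` and `Core N` seen in the sensors output earlier in this issue. The topology and readings are copied from that output.

```python
import re


def map_temperatures(cpu_topology, sensor_readings):
    """Map per-core sensor readings onto CPUs.

    cpu_topology:    {cpu: (physical_id, core_id)}
    sensor_readings: list of (label, temp_celsius) from ONE coretemp device
    Returns {cpu: temperature, or None if no matching sensor was found}.
    """
    package_id = None
    core_temps = {}
    for label, temp in sensor_readings:
        m = re.fullmatch(r"Package id (\d+)", label)
        if m:
            package_id = int(m.group(1))  # identifies the socket of this device
            continue
        m = re.fullmatch(r"Core (\d+)", label)
        if m:
            core_temps[int(m.group(1))] = temp

    result = {}
    for cpu, (phys, core) in cpu_topology.items():
        # Only CPUs on this device's socket can match; others stay None.
        result[cpu] = core_temps.get(core) if phys == package_id else None
    return result


# Core ids per processor, as listed by /proc/cpuinfo in this issue (socket 0).
CORE_IDS = [0, 0, 4, 4, 8, 9, 10, 11, 12, 13, 14, 15]
topology = {cpu: (0, core) for cpu, core in enumerate(CORE_IDS)}

readings = [
    ("Package id 0", 40.0),
    ("Core 0", 38.0), ("Core 4", 33.0),
    ("Core 8", 39.0), ("Core 9", 39.0), ("Core 10", 39.0), ("Core 11", 39.0),
    ("Core 12", 38.0), ("Core 13", 38.0), ("Core 14", 38.0), ("Core 15", 38.0),
]

temps = map_temperatures(topology, readings)
```

Keying on (physical id, core id) instead of the sensor's position in the list is what lets both hyperthreads of a core pick up the same reading and leaves no CPU without one.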

For Ryzen, I tested on a 3700X, which has only 1 sensor, and on a dual-socket EPYC 7443 with 24 cores (48 threads), which has 2 sensors that report 4 measurements, one for each CCD (some internal grouping). Since, to the best of my knowledge, an exact CCD mapping is not possible, I apply the heuristic that every CCD has the same size and that each CCD contains the cores (not CPUs) in order. The CCD order should not change while htop is running (this could be wrong when hot-swapping CPUs...). The heuristic seems to hold when compared with how other tools do it. I can at least verify that pinning CPU load to one core heats the CCD assigned to it.
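
The CCD heuristic described above could look roughly like this. Again a Python sketch under stated assumptions rather than the actual patch: `assign_ccds` is a hypothetical name, and the example topology (24 cores on one socket, CPU n and CPU n+24 sharing a core, 4 CCDs) is made up for illustration and may not match the real CPU numbering on that machine.

```python
def assign_ccds(cpu_to_core, num_ccds):
    """Assign each CPU of one socket to a CCD index.

    Heuristic: all CCDs have the same number of cores, and cores
    (not CPUs) are distributed over the CCDs in ascending order.
    cpu_to_core: {cpu: core_id} for a single socket.
    """
    cores = sorted(set(cpu_to_core.values()))
    if num_ccds <= 0 or len(cores) % num_ccds != 0:
        raise ValueError("cores cannot be split into equal-size CCDs")
    per_ccd = len(cores) // num_ccds
    # Rank of each core id in sorted order; core ids need not be contiguous.
    core_rank = {core: i for i, core in enumerate(cores)}
    return {cpu: core_rank[core] // per_ccd for cpu, core in cpu_to_core.items()}


# Assumed example topology: 48 CPUs, 24 cores, SMT siblings are n and n + 24.
example_topology = {cpu: cpu % 24 for cpu in range(48)}
ccd_of = assign_ccds(example_topology, num_ccds=4)
```

Because the assignment works on cores rather than CPUs, both hyperthreads of a core always land in the same CCD, regardless of how the CPUs are numbered.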

This adds some complexity to the code, but it should be acceptable.

leahneukirchen added a commit to leahneukirchen/htop that referenced this issue Dec 25, 2023
leahneukirchen added a commit to leahneukirchen/htop that referenced this issue Dec 25, 2023
leahneukirchen added a commit to leahneukirchen/htop that referenced this issue Sep 1, 2024