Dell Latitude 5420 / i5-1145G7 CPU Throttling Under Load #344

Open
JoshuaPK opened this issue Mar 11, 2022 · 24 comments

Comments

@JoshuaPK

This defect is intended to contain information specific to the Latitude 5420 i5-1145G7 regarding the CPU throttling issue. Related defects are #341 and #334.

@61ack1ynx

Hey Joshua. Any news on this?

@sebastianha

Did you test the 5.18-rc kernel with the latest thermald (2.4.9)? Perhaps that fixes the problem, too.

@61ack1ynx

@sebastianha Yes, I'm running 5.18-rc with thermald 2.4.9 on Manjaro. It's much better (it holds 2300 MHz under heavy load), but it does not reach 4+ GHz. I also tried the previous fix (I don't know whether it is still needed on 5.18-rc): echo 7 > /sys/bus/platform/devices/INTC1040:00/enable_policy. It always returns an error ("Permission denied"), and the enable_policy file does not exist. It looks like I can't create files in kernel space.
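
One way to tell "the node is missing" apart from "the node is there but not writable" is to test the path before writing to it. The sketch below is a hypothetical helper (`policy_node_state` is not part of any tool); it only reports the state of whatever path it is given. Files under /sys are created by the kernel driver, so a missing node cannot be created by hand, and its absence most likely means the running kernel does not expose it.

```sh
# Hypothetical helper: classify a sysfs node before trying to write to it.
# Prints "missing", "not-writable" or "writable".
policy_node_state() {
    node="$1"
    if [ ! -e "$node" ]; then
        echo "missing"
    elif [ ! -w "$node" ]; then
        echo "not-writable"
    else
        echo "writable"
    fi
}

# Example call (path taken from the comment above):
# policy_node_state /sys/bus/platform/devices/INTC1040:00/enable_policy
```

If this prints "not-writable" for a regular user, keep in mind that in `sudo echo 7 > file` the redirection is still done by the unprivileged shell; `echo 7 | sudo tee file` avoids that.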

@61ack1ynx

The latest s-tui shows a stable 2.5 GHz under stress.

@AdamantGarth

AdamantGarth commented Apr 14, 2022

For me it works kind of fine even on unpatched 5.17 with latest thermald. When I'm running s-tui stress test or compiling something I get ~4GHz that quickly drops down to 3.3-3.5GHz and stays there for at least 15 minutes (didn't test for longer). The weird thing is - it behaves like that only when I'm stressing it. If I play a game - CPU randomly goes from ~3.5GHz on some cores and 1.5GHz on others, to locked 1.5GHz on all, with some dropping down to like 900MHz. And when it does that, the game starts to lag very hard and the FPS drops from 60 to <20. It happens after a minute or so, despite being able to hold 3.3-3.5GHz for 15 minutes under stress test or compilation.

Maybe this is another issue, with the iGPU driver, for example. No idea.

P.S. This happens with multiple totally different games: Europa Universalis IV, Minecraft, Shadow Tactics. All of them work great under Linux on my other laptop with an older Intel iGPU, so it's unlikely that the problem is with the games.
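
To catch the drop while a game is running, the per-core clocks can be logged straight from sysfs instead of through a stress tool. A minimal sketch, assuming the standard cpufreq layout in which `scaling_cur_freq` reports kHz; the function name and the optional base-directory argument are made up for illustration.

```sh
# Print "cpuN <MHz>" for every core that exposes cpufreq data.
# $1 is an optional base directory (defaults to the real sysfs location).
core_freqs_mhz() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        cpu=${f%/cpufreq/scaling_cur_freq}   # strip the file part
        cpu=${cpu##*/}                       # keep only "cpuN"
        khz=$(cat "$f")
        echo "$cpu $((khz / 1000))"
    done
}

# Example: one sample per second while the game runs.
# while sleep 1; do date +%T; core_freqs_mhz; done > freq.log
```

The resulting log shows whether all cores fall to the same clock at the same moment, which is what the "locked 1.5GHz on all" description suggests.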

@sebastianha

@61ack1ynx The i5-1145G7 CPU runs at a maximum of 4 GHz when all cores are in use. You will only reach more when a single core is being used. So this seems OK to me. What does the package power reading say? As far as I know, the "echo 7" step is no longer valid; it was only a test for when the first available patch had been applied. So this is OK, too.

Can you run "s-tui -c" and then start a stress test? After it has run for 2-3 minutes, upload the resulting CSV file here.
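
Once the CSV exists, a quick summary of one column makes it easier to compare runs. This is a sketch only: `csv_column_stats` is a hypothetical helper, and the column header has to be copied from the first line of the file (check it with `head -n1`), since the exact header names are not shown in this thread. It also assumes plain comma-separated fields without quoted commas.

```sh
# Print min/avg/max of one numeric CSV column, selected by its header name.
# $1 = CSV file, $2 = exact header text of the column.
csv_column_stats() {
    awk -F',' -v col="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) exit 1          # header not found
            next
        }
        $idx != "" {
            v = $idx + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v; n++
        }
        END { if (n) printf "min=%.0f avg=%.0f max=%.0f n=%d\n", min, sum / n, max, n }
    ' "$1"
}

# Example (the header name here is an assumption):
# csv_column_stats s-tui_log.csv "Frequency:Avg"
```

A large gap between max and avg on the frequency column points to a short boost followed by sustained throttling.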

@AdamantGarth Which distro are you using? Ubuntu, for example, has backported the patch to their kernel as far as I know. But in the end your issues sound more like GPU overheating, and because CPU and GPU are all one package, that might cause CPU throttling, too.

@AdamantGarth

> @AdamantGarth Which distro are you using? Ubuntu, for example, has backported the patch to their kernel as far as I know. But in the end your issues sound more like GPU overheating, and because CPU and GPU are all one package, that might cause CPU throttling, too.

I use Arch, btw, with the regular kernel (linux). The thing is, everything is perfect on Windows. I can play Witcher 3 for hours without any noticeable throttling.

@61ack1ynx

@sebastianha No. My CPU is an 11th Gen Intel(R) Core(TM) i7-1185G7, and it's not reaching 4 GHz. It runs for a few seconds, then drops to 2.5 GHz. Let me upload the .csv for you in a minute.

@61ack1ynx

s-tui_log_2022-04-14_15_39_35.csv
@sebastianha Here you go, CSV attached.

@JoshuaPK
Author

JoshuaPK commented Apr 14, 2022

I have been using my 5420 for a while now with just the latest thermald (2.4.9). Everything has been running smoothly: even under very heavy load the processor would fluctuate right around 2.6GHz and wouldn't throttle down. Until this past weekend. I decided to watch a movie on a Blu-ray disc, running at its native 4K resolution on a 4K monitor. About an hour into the movie, the graphics became distorted. I checked, and the processor had throttled down to 400MHz. (Disregard the previous statement: I found that I had thermald disabled on my system. Re-enabling it stopped the throttling to 400MHz, but the CPU throttled to 1.5GHz instead.) I also noted that the ambient temperature in the room was approximately 80 degrees F (27 degrees C). This weekend I'll test with the kernel patch and see whether it behaves differently.

@AdamantGarth

AdamantGarth commented Apr 14, 2022

Here's my CSV, maybe it'll be helpful. This is on the 5.17.1-arch1-1 kernel with balanced mode in the BIOS.

s-tui_log_2022-04-14_17_44_44.csv Edit: Whoops, sorry - forgot to close s-tui so the file is padded with regular monitoring data at the bottom 😅

Also, something that worries me - some cores are close to 100°C all the time. Is this a cause for concern?

@sebastianha

@61ack1ynx can you run s-tui as root? This adds power readings to the CSV file.
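
As far as I know, these power readings come from the RAPL energy counters, which recent kernels only let root read. For a quick check without s-tui, the package power can be estimated from two samples of the counter. A sketch under assumptions: the `intel-rapl:0` path is a guess about this machine, and both function names are hypothetical.

```sh
# Average power in watts from two energy_uj samples taken $3 seconds apart.
watts_from_energy() {
    awk -v e1="$1" -v e2="$2" -v dt="$3" \
        'BEGIN { printf "%.1f\n", (e2 - e1) / 1000000 / dt }'
}

# Sample the package domain over one second (run as root).
# The intel-rapl:0 path is an assumption about this machine.
package_watts() {
    node=/sys/class/powercap/intel-rapl:0/energy_uj
    e1=$(cat "$node") || return 1
    sleep 1
    e2=$(cat "$node") || return 1
    watts_from_energy "$e1" "$e2" 1
}
```

The counter is in microjoules and wraps around from time to time, so a negative result just means the two samples straddled a wrap and the measurement should be repeated.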

The composite temperature is very high and, more interestingly, it is always at exactly 66.8°C; the same goes for "Sensor1". In my experience, a high temperature (above ~45°C) on this sensor causes thermald to throttle down on my system. Did you ever see another value for this sensor? This definitely does not look correct.

@sebastianha

@AdamantGarth This looks good to me. 100°C is not a problem; it is within spec for the CPU as far as I know.

@61ack1ynx

@sebastianha Yes, this is strange. The value of Sensor 1 never changes; it is always 66.8. The sensors command gave this:

nvme-pci-e100
Adapter: PCI adapter
Composite: +66.8°C (low = -273.1°C, high = +82.8°C)
(crit = +86.8°C)
Sensor 1: +66.8°C (low = -273.1°C, high = +65261.8°C)

Looks like some kind of PCIe sensor, not the processor one... New CSV attached:
s-tui_log_2022-04-14_17_44_33.csv
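
Whether the reading is really frozen can be checked by sampling it a few times and counting the distinct values. The chip name nvme-pci-e100 suggests this is the NVMe drive's own sensor, and the -273.1°C and +65261.8°C limits look like 0 K and 65535 K converted to Celsius, i.e. unset thresholds rather than real ones. A small sketch; both helper names are hypothetical, and the label must match the `sensors` output exactly.

```sh
# Extract the reading that follows "LABEL:" from sensors-style text on stdin.
sensor_reading() {
    awk -F': *' -v label="$1" '$1 == label { split($2, a, " "); print a[1] }'
}

# Count distinct lines on stdin; 1 means the value never changed.
count_distinct() {
    sort -u | wc -l | tr -d ' '
}

# Example: six samples, ten seconds apart, ideally while the disk is busy.
# for i in 1 2 3 4 5 6; do
#     sensors nvme-pci-e100 | sensor_reading "Sensor 1"; sleep 10
# done | count_distinct
```

A result of 1 while the drive is under load would support the idea that the sensor value is stuck.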

@sebastianha

OK, then the reading is definitely incorrect. Assuming thermald uses this sensor and decides the value is too high, you are getting throttled. That is my current theory.

You can check the thermald log by running it by hand with

thermald --no-daemon --loglevel=info --adaptive

and see if the sensor is being used.
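
Since the info-level log is long, it may help to save it to a file and search it afterwards. A minimal sketch: `log_mentions` is a hypothetical helper, and the search text ("nvme" below) is only a guess, because the name thermald uses for this sensor is not known here. You probably want to stop the system service first (`sudo systemctl stop thermald`) so that two instances do not run at once.

```sh
# Count log lines that mention a pattern, case-insensitively.
# $1 = saved log file, $2 = text to look for (e.g. a sensor name).
log_mentions() {
    grep -ci -- "$2" "$1"
}

# Example:
# sudo thermald --no-daemon --loglevel=info --adaptive 2>&1 | tee thermald.log
# log_mentions thermald.log nvme
```

A count of 0 does not prove the sensor is unused; it only means that exact text never appears in the log.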

@61ack1ynx

61ack1ynx commented Apr 14, 2022

@sebastianha I can't find it in the log... Can you take a look, please? Attached:
log.log

@61ack1ynx

61ack1ynx commented Apr 14, 2022

@sebastianha By the way, if the thermald service is disabled, it's exactly the same: after a few seconds it throttles to 2.5GHz.

@sebastianha

Hmm, I can't find anything suspicious, but I am definitely not an expert here.

Perhaps @spandruvada can help?

@sebastianha

Just a side note: interestingly, there are many typos in the psvt/apat dumps. I found "Name :sp14t-quient" ("quient") in your log and "name:GDC31_180cf_28C_Optimiazed_Idle" ("Optimiazed") in mine.

@AdamantGarth

A month after the release of Linux 5.18, I can pretty confidently say that the problem is totally gone for me. It went away right after I upgraded to the 5.18 kernel. Before that, the laptop was still throttling sometimes (though I couldn't reproduce that consistently), even though I used a patched kernel. And it consistently throttled really hard in games. But now I play for hours, and everything is buttery smooth all the time! And the inconsistent throttling under load, which I couldn't reproduce reliably: I haven't seen that in a month :)

@gghh

gghh commented Dec 19, 2022

Hello @JoshuaPK, can you tell me if you're still seeing this problem in December 2022? I work on power management at SUSE (Linux distro vendor) and I received a report for this problem back in August on the openSUSE Tumbleweed distribution. It took me a while to get to it, and I'm now trying to understand whether there's still work to do or whether the recent kernel/thermald/Dell firmware updates have fixed this already.

@sebastianha

sebastianha commented Dec 19, 2022

Hi,

I am also running Tumbleweed and everything works fine. But you have to delay the start of thermald. If thermald is started too early during boot, it does not work correctly.

This is my systemd override file:

### Editing /etc/systemd/system/thermald.service.d/override.conf
### Anything between here and the comment below will become the new contents of the file

[Service]
ExecStartPre=/bin/sleep 15
ExecStart=
ExecStart=/usr/sbin/thermald --systemd --dbus-enable --adaptive
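
The "### Editing" header above is what `systemctl edit thermald.service` shows, so the simplest route is to paste the [Service] block into that editor. For scripting the same thing, here is a sketch that writes the drop-in to a directory passed as an argument; `write_thermald_override` is a hypothetical name, and the thermald path is copied from the override above and may differ on other distros.

```sh
# Write the drop-in shown above into the directory given as $1.
write_thermald_override() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
ExecStartPre=/bin/sleep 15
ExecStart=
ExecStart=/usr/sbin/thermald --systemd --dbus-enable --adaptive
EOF
}

# Example (as root):
# write_thermald_override /etc/systemd/system/thermald.service.d
# systemctl daemon-reload && systemctl restart thermald
```

The empty `ExecStart=` line is needed to clear the command from the packaged unit before the new one is set. `systemctl cat thermald.service` shows the merged result.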

@gghh

gghh commented Dec 19, 2022

Hi @sebastianha , thanks for the info. By the way, do I remember correctly that you have a Dell Latitude 7320 and not a 5420? I've read so many threads here on the thermald issues and the dell.com forum that I may be mixing up names.
I recall that Srinivas Pandruvada (Intel employee and thermald maintainer) said the different Latitude models all behave in slightly different ways.

@sebastianha

Yes, I have a 7320 with an i7-1185G7.
This was my original debugging issue: #341
If you need any more information just let me know, I spent some hours/days on this topic ;)
