Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for DPTF adaptive performance policy #224

Closed
wants to merge 45 commits into from

Conversation

mjg59
Copy link
Contributor

@mjg59 mjg59 commented Apr 13, 2020

This is a very rough attempt at implementing support for the adaptive performance DPTF policy. I have no documentation on this, so it's a best guess in many places. It only supports a subset of the conditions and actions and I'm really not sure that the hack that uses the dram RAPL engine if the sensor is called TMEM is correct, so please be as critical as necessary.

mjg59 and others added 30 commits April 12, 2020 17:41
Different thermal policies require different UUIDs to be written to the
INT3400 interface. Avoid hardcoding that and allow engines to override it
as needed.
Systems may provide the DPTF policy information in an ACPI table. Allow
that to be used directly, and disable parsing of XML config to avoid
confusion.
If we're going to read a binary sysfs file, we want to know how big it is
in advance. Add a helper to do that. In addition, add support for reading
files greater than a single page.
We can obtain PPCC parameters from the firmware, so abstract that rather
than calling directly into the XML parser code.
The adaptive target tables may include information targeted at specific
device types. Add core support for passing that through to the cooling
devices.
Firmware may provide multiple PPCC entries, including one for DRAM. Try to
use it if we can, but fix up the XML parser to it makes sure that we don't
break existing setups.
Add support for setting RAPL parameters based on adaptive targets.
Incomplete at present.
In adaptive mode we prefer the use of the MMIO RAPL interface if available.
We want engines to be able to override this.
This is something of a hack right now (it should be event driven rather
than polling), but we need to update the adaptive engine state to check
whether we should transition to a new target.
The firmware doesn't pay attention to the configured UUID in the INT3400
device until _OSC is called, and this only happens when we enable the
associated thermal zone. Make sure we do so, and disable it again on exit.
DPTM provides an adaptive power policy that allows for improved integration
between the platform and the OS. Firmware can provide a set of complex
conditions to the OS, and an OS agent (in this case thermal_daemon) is
responsible for evaluating them, potentially making use of information that
is easily available to the OS and not the firmware. The agent evaluates
each set of conditions in turn, and once the first evaluates completely it
triggers a set of actions.

This implementation only covers a subset of the conditions - notably, it
only currently handles conditions that are associated with OEM-specific
variables exported via ACPI. It is also only currently capable of managing
RAPL targets or loading new passive thermal policies, since that's all that
my test system does.
Add a command line option to enable the use of the adaptive engine.
GDDV payloads can be compressed using LZMA. Add support for handling that.
PTTV entries may have a uint64 rather than a string for the target argument,
so make sure we handle that and also deal with cases where we have multiple
PSVTs in the parent object rather than in the shared space by passing
through a unique name.
Skipping a uint64 requires also skipping the type defintion, so fix up the
offset calculations.
We're not going to do anything with this data right now, so if it's not
something we can parse, bail.
This changes both the fatal into an error, and the loop to report _all_
unsupported conditions at once.

It should make it easier to gauge how much work is neded to support a
certain firmware.
Some systems use APPC to provide additional information associated with
conditions. Parse that and then merge it into our conditions.
Doing it this way was a bad idea, but here we are.
Make sure we don't inappropriately walk off the end of the APAT table.
PSVT tables may contain multiple passive trip points. Add support for that.
Add rough support for temperature comparisons. Right now this ignores
hysteresis.
If the APPC doesn't have anything that looks like a legitimate version
number at the start, ignore it.
This relies on upower, so add a dependency on that as well.
Add support for conditions that depend on whether the system is on AC or not
Add a handler to report that the workload is bursty at all times - we don't
currently have a good way to report something meaningful here.
@bhack
Copy link

bhack commented Apr 25, 2020

@mj59 Thanks for you effort. What do you think about this Lenovo statement. It Is not clear to me the how that firmware will interact with this.

@bhack
Copy link

bhack commented Apr 25, 2020

/cc @mrhpearson if he can disclose something more from Lenovo.

@mrhpearson
Copy link

Hi,
I was looking @mjg59 blog the other day about this change. First off (with my Lenovo hat off) - very cool and kudos. I wonder if this will encourage Intel to release some details publicly?

Lenovo hat back on - I don't really know how this will interact with our new firmware. I assume if you're not using thermald then it doesn't really matter, but otherwise the two pieces will potentially be fighting one another. I was planning to try it out on one of our devices to see what happens but haven't had time this week.
All our 2020 (linux supported) platforms will have the thermal firmware and we have a few older platforms that should be getting updated next month (fingers crossed...), I don't think this change is really needed for our platforms that have the updated thermal firmware - it is already taking care of the different conditions and has been tuned for optimal performance in the factory. I am sure there will be people in the community who want to tweak the behaviour and that will be trickier - we have safety regulations we have to meet and when I've asked previously about adding the ability to just disable any power throttling the response was a firm no. We have some improvements coming soon - I'm just waiting on permission to disclose publicly...

I did forward the details about @mjg59 work being done to the firmware team last week to get their feedback but they didn't really have anything to add. If there is anything specific that I should ask them please let me know.

Sorry - in all that rambling nothing very useful. If you are looking for specifics let me know and I'll do the best to find answers. Thanks for adding me to the thread.

@bhack
Copy link

bhack commented Apr 25, 2020

Thank for your feedback. Yes It would be really nice to know if the new firmware will conflict with Thermald with or without this patch and so if Is It the case to uninstall Thermald in each cases or just if this will not merged on these Lenovo devices.

I.e. the other workarund solution conflicted with Thermald: https://github.com/erpalma/throttled/blob/master/README.md#thermald

/cc @erpalma

We were out of sync for higher conditions, which caused misinterpretation
of legitimate values.
@mrhpearson
Copy link

I played with this some and I think I need some guidance on what needs testing and if what I'm doing makes sense. I appreciate I really need to read the code but it's been a crazy week and I wanted to get some results back so haven't done that.

I built and ran this version of thermald. I needed to generate a thermal-conf.xml so used dptfxtract for that - it was interesting how it generated 5 different config files and I really need to understand what it is doing as I suspect that is important (let me know if you want them posted somewhere to review) . I've not personally had much to do with thermald so if there are pointers on what is better please let me know.

I'm running the latest version of our firmware and it has a feature where you can switch between thermal modes (low, normal, high - with corresponding power settings of 7.5W for low, 14.5 for normal and 51W for high).

With this thermald running I seem to be limited to 15W (and around 75C). I'm running Intels Thermal Analysis Tool and the Package power was clamped at 15 (running a kernel build to keep things busy).

If I'm not running thermald but just Lenovo firmware then in performance mode the power goes up to 51W and Temperature to 92 - so better performance (as long as it's in desk mode - you won't get that when it's in lap mode). So...in the 'battle' to control the power thermald is winning, but in this case it's not giving the optimal performance. I suspect that's related to the config files I've generated.

I tested lap mode. In this case our firmware detects it's on a lap and lowers the power to 14.5W (or 7.5W if you have low mode selected). If I'm running thermald I get 10W (it was a bit quirky - I saw 15W at one point...but only briefly).

Hope the above is helpful - happier to collect more details or try anything out if it would be useful. Especially if I'm not doing things right :)

I'm going to try it on some platforms which don't have the updated firmware - I know we have plenty of unhappy customers because we haven't been able to backport it everywhere and I'm really hoping this is a good solution there (longer term I also suspect we'll all be better off if we can support DPTF properly)

Let me know if any questions.
Thanks
Mark

@mjg59
Copy link
Contributor Author

mjg59 commented Apr 29, 2020

@mrhpearson You need to run with the --adaptive option, which should then avoid the need to use dptfxtract and any xml config files. If you don't do that it's equivalent to the current upstream.

@mjg59
Copy link
Contributor Author

mjg59 commented Apr 29, 2020

(But also it would be super helpful to know how Lenovo firmware differentiates between Linux and a DPTF-enabled operating system, and what the differences in behaviour are)

@mrhpearson
Copy link

@mjg59 If I run with --adaptive I get:
[1588176598][ERR]Unable to open GDDV data vault
I ran with loglevel=debug but I didn't see anything obvious jump out - any ideas what I'm missing?

I'll check with the firmware team how they determine the difference - afraid you'll have to wait a while for the reply as they're in Japan and its a holiday there this week.

@mjg59
Copy link
Contributor Author

mjg59 commented Apr 29, 2020

@mrhpearson Did you build a kernel with the additional patches in the repo? They're not upstream yet.

@mrhpearson
Copy link

Crud...missed those. Sorry!

@spandruvada
Copy link
Contributor

spandruvada commented Apr 29, 2020 via email

@mrhpearson
Copy link

mrhpearson commented Apr 30, 2020

So some interesting results but not sure they will be what you are looking for. Thanks for the pointer on the patches - I hadn't seen the instructions on your fork and I think I'm now setup correctly.
Running patched 5.7.0-rc3 on a X1Carbon7 (WhiskeyLake CPU). Doing a kernel build with multiple threads to keep it busy

I confirmed with the Lenovo firmware that it was in deskmode with high performance mode enabled (power up to 51W) and then ran thermald --adaptive. At this point the power fluctuated between 10 and 15W - going up and down in small increments (MSR PL1 changing value). Temperature stayed steady around 65 to 70C.

I then picked up the laptop to get it in lap mode (I'm assuming the DPTF tables should handle this as that's their raison d'etre) and interestingly the power then fixed solidly at 15W (temp at 70C).
I waited for it to drop back into deskmode - but it didn't - so for the X1C7 at least that seems broken.

As a minor note - stopping thermald doesn't return everything to 'normal' operation. A reboot is needed at that point (I'm assuming registers different to what our FW uses have been tweaked).

Srinivas - I'll forward you my xml file shortly. The other details are:
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_1_tmax_us 0
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_0_tmin_us 28000000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_0_min_uw 125000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_1_min_uw 51000000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_1_step_uw 500000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_0_tmax_us 32000000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_0_max_uw 51000000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_1_max_uw 51000000
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_1_tmin_us 0
/sys/bus/pci/devices/0000:00:04.0/power_limits/power_limit_0_step_uw 500000
rdmsr 0x1a2
364000

@mjg59
Copy link
Contributor Author

mjg59 commented Apr 30, 2020

@mrhpearson Ok, thanks - I should have an X1v7 showing up in the next few weeks and I'll dig into it. Are you able to open an issue in the github.com/mjg59/thermal_daemon repository and attach the output of the acpidump command when run as root?

@mrhpearson
Copy link

No problem - done.

@bhack
Copy link

bhack commented Apr 30, 2020

/cc @bbian

@spandruvada
Copy link
Contributor

As an open source developer I don't have access to the DPTF/DTT specifications. I will bring in these changes to main line in the next release. It may be modified from what is submitted with due credit given to the author. We need volunteers to test implement workaround if this causes any negative user experience. So initially it will still have some command line option to turn on this feature.
Anyway this require the kernel patches merged in, which I will test first to see if we can catch 5.8.

@bhack
Copy link

bhack commented May 6, 2020

I hoped that some DPTF team member for Chrome OS could comment here but....

@superm1
Copy link

superm1 commented May 6, 2020

Afaict Chromeos only supports passive policies today. so I'm not sure how much their comments help this discussion.

@bhack
Copy link

bhack commented May 6, 2020

Probably not so much if they don't have access to the dynamic specs.
But we don't know this. It could be just that they are under NDA on the dynamic part or that the dynamic part Is in the roadmap. Who knows...

@bhack
Copy link

bhack commented May 14, 2020

@spandruvada Let us know when you will try to submit the kenel patches.

@bhack
Copy link

bhack commented Jun 18, 2020

What is the STOA?

@spandruvada
Copy link
Contributor

kernel patches are merged to 5.8-rc1. So someone can try on top of 5.8-rc1.

@estan
Copy link

estan commented Jun 27, 2020

Great initiative @mjg59.

I tried this out on my Lenovo Thinkpad X1 Carbon 6th (20KH007BMX), running Ubuntu 20.04 with mainline 5.8rc2 kernel, with laptop on desk and plugged into standard 65 W charger, running thermald --no-daemon --adaptive and starting stress -c 8:

bild

Looks like it's still being throttled down to 75-80° C / 15 W under load.

Ambient temperature was 28.5° C (hot in Sweden today and no AC :p).

This is similar to the behavior I see with default Ubuntu 20.04 kernel (5.4.0-39.43) and thermald 1.9.1-1ubuntu0.1 package.

Before I tried this, I completely removed the thermald Ubuntu package (including configuration).

Let me know if you want me to dump some tables out or something that would help.

EDIT: I posted an issue on your repo instead, with an acpidump: mjg59#7

@spandruvada
Copy link
Contributor

spandruvada commented Jun 27, 2020 via email

@spandruvada
Copy link
Contributor

spandruvada commented Jun 27, 2020 via email

@estan
Copy link

estan commented Jun 28, 2020

Please attach output of acpidump.

Thanks

@spandruvada I moved this over to an issue on @mjg59's fork here: mjg59#7. It has an acpidump attached.

@spandruvada
Copy link
Contributor

I merged all changes to v2.3_development branch with minor change for removing my name from authorship. I have to use manual merge for conflicts. I hope merged correctly. I did a diff with mjg59/thermal_daemon to check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants