ns50 v2.3 suspend causes freeze on resume #29

Closed
7 of 49 tasks
commandline-be opened this issue Nov 22, 2023 · 12 comments · Fixed by linuxboot/heads#1561

Comments

@commandline-be

Please identify some basic details to help process the report

After upgrading to Heads v2.3, the machine no longer resumes from suspend. It appears comatose and stays unresponsive.

A. Provide Hardware Details

1. What board are you using (see list of boards here)?

2. Does your computer have a dGPU or is it iGPU-only?

  • dGPU
  • iGPU-only

3. Who installed Heads on this computer?

  • Insurgo
  • Nitrokey
  • Purism
  • Other provider
  • Self-installed

4. What PGP key is being used?

  • Librem Key
  • Nitrokey Pro 2
  • Nitrokey Storage
  • Yubikey
  • Other

5. Are you using the PGP key to provide HOTP verification?

  • Yes
  • No
  • I don't know

B. Identify how the board was flashed

1. Is this problem related to updating heads or flashing it for the first time?

  • First-time flash
  • Updating heads

2. If the problem is related to an update, how did you attempt to apply the update?

  • Using the Heads GUI
  • Flashrom via the Recovery Shell
  • External flashing

3. How was Heads initially flashed

  • External flashing
  • Internal-only / 1vyrain
  • Don't know

4. Was the board flashed with a maximized or non-maximized/legacy rom?

  • Maximized
  • Non-maximized / legacy
  • I don't know

5. If Heads was externally flashed, was IFD unlocked?

  • Yes
  • No
  • Don't know

C. Identify the rom related to this bug report

1. Did you download or build the rom at issue in this bug report?

  • I downloaded it
  • I built it

2. If you downloaded your rom, where did you get it from?

  • Heads CircleCi
  • Purism
  • Nitrokey
  • Somewhere else (please identify)

Please provide the release number or otherwise identify the rom downloaded

3. If you built your rom, which repository:branch did you use?

  • Heads:Master
  • Other (please identify)

4. What version of coreboot did you use in building?

  • 4.8.1 (current default in heads:master)
  • 4.13
  • 4.14
  • 4.15
  • Other (please specify)
  • I don't know

5. In building the rom where did you get the blobs?

  • No blobs required
  • Provided by the company that installed Heads on the device
  • Extracted from a backup rom taken from this device
  • Extracted from another backup rom taken from another device (please identify the board model)
  • Extracted from the online bios using the automated tools provided in Heads
  • I don't know

Please describe the problem

Describe the bug

Machine remains in deep sleep regardless of HID input or pressing the power button

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior

the machine resumes from (deep) sleep and/or hibernation

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
As documented here, this should work well. Tests with pm-hibernate and pm-suspend suggest as much.
https://docs.dasharo.com/variants/novacustom_ns5x_adl/test-matrix/#module-dasharo-security

The workaround being tried is modifying /etc/default/acpi-support so that SUSPEND_METHODS="pm-utils", since pm-utils are reported to be working and the previous config pointed to dbus first.

@commandline-be
Author

changes to SUSPEND_METHODS in /etc/default/acpi-support made no difference

@daringer
Collaborator

daringer commented Nov 22, 2023

are you sure this happens only starting with 2.3 ?
Generally the issue is originating in the different sleep/suspend modes and respective OS support:

  • QubesOS needs S3, so we patched coreboot to make S3 the default; this makes NV41 suspend work for QubesOS + Ubuntu
  • the same patch also changes the default for NS50, but there S3 does not work due to a missing capacitor on the motherboard. Since this patch was already applied in 2.2, I would have expected NS50 suspend not to work in 2.2 either

the Dasharo test results suggest that suspend should work all the way, but I suppose this is also based on setting either S0 (for Ubuntu) and/or S3 (for QubesOS), which is weird because to my knowledge this should not work...

can you share the output of /sys/power/mem_sleep and sudo dmesg | grep ACPI | grep supports ?
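For readers unsure how to interpret that file, here is a minimal Python sketch of reading it. It assumes the usual kernel format in which `/sys/power/mem_sleep` lists the available suspend variants on one line and marks the active one with square brackets (for example `s2idle [deep]`). The helper names `parse_mem_sleep` and `read_mem_sleep` are hypothetical and only serve this illustration.

```python
from pathlib import Path

# Labels the kernel uses in mem_sleep and the sleep variant each one stands for.
MEM_SLEEP_LABELS = {
    "s2idle": "suspend-to-idle (S0)",
    "shallow": "standby (S1)",
    "deep": "suspend-to-RAM (S3)",
}


def parse_mem_sleep(text):
    """Return (available_modes, selected_mode) from the contents of mem_sleep.

    The selected mode is the token wrapped in square brackets; it is None
    when no token is bracketed.
    """
    available = []
    selected = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        available.append(token)
    return available, selected


def read_mem_sleep(path="/sys/power/mem_sleep"):
    """Read and parse the real sysfs file (Linux only)."""
    return parse_mem_sleep(Path(path).read_text())
```

On a machine where only `s2idle` is listed, the firmware is not offering S3 at all, which is the distinction this thread is about. Calling `read_mem_sleep()` on the affected NS50 and looking up the selected mode in `MEM_SLEEP_LABELS` would show which variant the kernel attempts on suspend.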

We'll investigate and discuss this with Dasharo, but currently this leaves the impression for me that we might need two different coreboot versions for NV41 and NS50 to maximize the available combinations, which work to suspend...

currently for me the working suspend looks like this (dasharo v1.6 release, qubes 4.1.2, Nitropad release >=2.2)

|         | NV41        | NS50   |
|---------|-------------|--------|
| QubesOS | S3          | none   |
| Ubuntu  | S3 / (S0 ?) | S0 (?) |

This might have changed with the most recent Dasharo release 1.7, we'll have to check that

@commandline-be
Author

of that i'm sure yes. v2.2 had issues with deep sleep, the fan kept spinning even in suspend/hibernate.

for v2.3 the only real change is the command-line boot parameter intel_iommu=on instead of igfx_off

for ns50 i noticed 4 cstates up to C10
I don't know for the nv41

I'll share output later

this situation just begs to ask if no NS50 hardware is made available to test all this ?

@commandline-be
Author

commandline-be commented Nov 22, 2023

FAIL for running this test https://docs.dasharo.com/unified-test-documentation/dasharo-compatibility/31M-platform-suspend-and-resume/#susp001001-platform-suspend-and-resume-ubuntu-2204-wakeup-flag

I must add: set to 20 seconds, not 60 as shown on this test page
state = freeze
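The comment above does not show the exact command that was run. As a hedged sketch, assuming the test is driven by util-linux `rtcwake` (an assumption, not confirmed in this thread), the reported parameters (mode `freeze`, 20 seconds) would translate into a command line like the one built below. The helper name `build_rtcwake_cmd` and the restricted mode set are hypothetical.

```python
import shlex

# Subset of rtcwake modes relevant to suspend testing (assumed for this sketch).
SUSPEND_MODES = {"freeze", "standby", "mem", "disk"}


def build_rtcwake_cmd(mode, seconds):
    """Build an rtcwake argument list that suspends and wakes after `seconds`."""
    if mode not in SUSPEND_MODES:
        raise ValueError(f"unsupported suspend mode: {mode!r}")
    if seconds <= 0:
        raise ValueError("wake-up delay must be positive")
    # -m selects the suspend mode, -s sets the wake-up delay in seconds.
    return ["rtcwake", "-m", mode, "-s", str(seconds)]


def as_shell(cmd):
    """Render the argument list as a single shell-safe string."""
    return shlex.join(cmd)
```

`as_shell(build_rtcwake_cmd("freeze", 20))` yields `rtcwake -m freeze -s 20`, which would normally be run as root. Swapping `freeze` for `mem` exercises whatever variant is selected in `/sys/power/mem_sleep` instead.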

@daringer
Collaborator

daringer commented Nov 22, 2023

> of that i'm sure yes. v2.2 had issues with deep sleep, the fan kept spinning even in suspend/hibernate.

if the fan keeps spinning, this would suggest that the sleep state is not really reached - although it's weird that its behavior changed for 2.3 ...

> FAIL for running this test https://docs.dasharo.com/unified-test-documentation/dasharo-compatibility/31M-platform-suspend-and-resume/#susp001001-platform-suspend-and-resume-ubuntu-2204-wakeup-flag
>
> must comment, set to 20 seconds not 60 as shown on this test page
> state = freeze

please be aware that these test results refer to Dasharo v1.7.1, which is not integrated into the current firmware. On top of that, these test results are generated with an EDK2 payload and not with Heads, which means sleep states are configurable. Other coreboot/EDK2 configuration details might also easily change the platform behavior here.

> for ns50 i noticed 4 cstates up to C10
> I don't know for the nv41

As far as I understand, Cx states have nothing to do with sleep; those are CPU states for low-power operation under small loads or when idle.

@commandline-be
Author

power.max_cstates doesn't; setting that value wrong actually influences performance
intel_idle.max_cstates most likely does; this value sets a hard limit on the number of available states

in case you wonder what cstates are, this is a nice overview, if anything c-states invoke sleep, i mean what else would ?

https://gist.github.com/wmealing/2dd2b543c4d3cff6cab7

@commandline-be
Author

potentially interesting note here https://github.com/torvalds/linux/blob/v6.2/drivers/idle/intel_idle.c

/*
 * On AlderLake C1 has to be disabled if C1E is enabled, and vice versa.
 * C1E is enabled only if "C1E promotion" bit is set in MSR_IA32_POWER_CTL.
 * But in this case there is effectively no C1, because C1 requests are
 * promoted to C1E. If the "C1E promotion" bit is cleared, then both C1
 * and C1E requests end up with C1, so there is effectively no C1E.
 * By default we enable C1E and disable C1 by marking it with
 * 'CPUIDLE_FLAG_UNUSABLE'.
 */

@daringer
Collaborator

> in case you wonder what cstates are, this is a nice overview, if anything c-states invoke sleep, i mean what else would ?

The S-states, here it is described in some detail: https://unix.stackexchange.com/questions/550731/difference-between-c-state-and-s-state. Power consumption can change during suspend for varying C-state configurations, but the C-State by itself is just a mechanism for the SoC to consume less power (with the trade-off of taking longer until full performance can be achieved again) based on current system load.

You can also see C-State metrics (Idle Stats) in e.g., powertop
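The same idle statistics can be read directly from sysfs. The sketch below assumes the standard Linux cpuidle layout, where each CPU directory such as `/sys/devices/system/cpu/cpu0` contains `cpuidle/stateN/` subdirectories with `name`, `usage` and `time` files. The function name `read_cpuidle_states` is hypothetical.

```python
from pathlib import Path


def read_cpuidle_states(cpu_dir):
    """List the idle states (C-states) the kernel exposes for one CPU.

    `cpu_dir` is a directory such as /sys/devices/system/cpu/cpu0.
    Returns one dict per state, ordered by state index.
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    # Sort numerically so that state10 comes after state2.
    state_dirs = sorted(
        cpuidle.glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = []
    for state_dir in state_dirs:
        states.append({
            "name": (state_dir / "name").read_text().strip(),
            # Number of times this state was entered.
            "usage": int((state_dir / "usage").read_text()),
            # Total time spent in this state, in microseconds.
            "time_us": int((state_dir / "time").read_text()),
        })
    return states
```

These counters keep changing while the system is running normally, which illustrates the point made above: C-states describe what the CPU does when idle during operation, whereas the S-state used for suspend is selected through `/sys/power/state` and `/sys/power/mem_sleep`.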

@commandline-be
Author

commandline-be commented Nov 22, 2023

not sure what to think of this really, given I'm reporting the issue and have no fix myself

c-states set the actual state for the CPU power consumption, p-states set the actual performance profile (MHz), and s-states also involve other components outside of the CPU going into a different energy mode.

best resource i could find
https://metebalci.com/blog/a-minimum-complete-tutorial-of-cpu-power-management-c-states-and-p-states/

@tlaurion

tlaurion commented Nov 22, 2023

@commandline-be
This is where I said some of those things need to be fixed in firmware. I saw lots of things written on the coreboot and Dasharo front, while the coreboot fork used by Nitrokey (see modules/coreboot) points to an older coreboot version hash.

That will need more testing and validation. The first step is to have coreboot point to the new commit of the coreboot fork for NovaCustom Dasharo, then verify which patches are still needed on coreboot 4.21, review the coreboot config for ns50/nv41, and push a PR containing all those changes for willing testers. I'm not sure taking the boot config alone is possible when it comes to S3 and the newer Sx sleep states. The platform needs to support S3 to be compatible with Qubes at the time of writing; QubesOS mentioned about a month as an approximate timeline for having fixes ready on the Xen side.

Other than that, the OS tries to do its best with the latest kernel versions, and fixes are normally applied in systemd etc. But if the firmware doesn't expose things correctly, sleep/resume issues are normally not resolved anywhere other than in firmware, and here that means in coreboot.


Disclosure: I will not make a habit of being active under nitrokey/heads instead of linuxboot/heads. The reason I'm replying to these issues is that they concern me as well. From my perspective, those platforms are not completely usable by users, and in my opinion those users deserve to know.

Hopefully those issues are worked on in collaboration with their upstream source of solution, but that is not in my power.

Heads assumes that the hardware initialization done by coreboot is correct and depends on that correct initialization for the final OS to behave properly.

@commandline-be
Author

commandline-be commented Nov 22, 2023

should that info be of use, my use case is regular Linux, not Qubes.

typically the recommendation with sleep/freeze issues is to use something similar to either of the below, sadly not without drawbacks.

https://unix.stackexchange.com/questions/419456/i915-intel-skylake-system-freeze-after-wake-up-from-hibernate-suspend-to-disk

https://hobo.house/2018/05/18/fix-for-intel-i915-gpu-freeze-on-recent-linux-kernels/

though I feel tempted, I don't think compiling heads/dasharo myself would be of much use here, which is also why I shared the mentioned archlinux patch, which reportedly fixes all suspend/freeze issues.

currently evaluating this page to learn more about configuration and possible erroneous events

https://wiki.archlinux.org/title/Power_management

@commandline-be
Author

confirming v2.4 resolved suspend issue

@daringer daringer closed this as completed Jan 7, 2024

3 participants