microcode-20200609 release, intel-ucode 06-8e-0c/0x806ec revision=0xd6 causes freezes on warm boot #35
Comments
|
Does this only happen when running on battery, that is, without the power cable plugged in? In our case, plugging in the power cable fixes the issue (but starting with |
|
Do you have any complaints about MSR 0x123 in the kernel logs, either when you resume from sleep-to-RAM (S3/suspend) or when you bring CPUs online? |
|
@hmh do you have an example? Didn't see anything containing "MSR" or "0x123". |
|
Reporter of Ubuntu bug 1883002 here... Yes, I have MSR 0x123 errors when CPUs are brought online during boot: Complete kernel log attached. (note: this is with microcode 0xca, which is loaded by the system firmware and currently works fine; if needed I can test with 0xd6) |
|
Upgrading a Dell XPS 13 9360 with Intel i7-7500U from Ubuntu 19.10 to Ubuntu 20.04 (microcode update 0xd6), looking through the logs, I am seeing the MSR messages at least once during the first resume from suspend. (No errors encountered.) @hmh, should I submit a separate bug report in Launchpad, or create one here? |
|
The MSR access is a kernel bug. It might be relatively harmless or not harmless at all, depending on just how much (and what) code is running on the AP before its microcode is updated. I haven't checked. But my 0x806e9 is coping with it well enough as long as I keep everything at the defaults (i.e. what the microcode has as a default for the new MSRs is actually what Linux is using). This bug will not contribute to better stability when microcode updates are required, obviously. So, it is something to be fixed ASAP. |
|
@vicamo: you will see such illegal MSR access splats in the kernel log only when your UEFI/BIOS microcode is old enough to not have such an MSR, the new microcode (updated through Linux) adds the support for the new MSR, and Linux sees a need to try to read/write such MSRs early. The bug is that it is not updating the secondary cores (read: not the core used for boot or to resume) early enough -- or that new code was added that is running too early, same thing. It is easier to show up in the resume-from-suspend path, but the boot path also needs a look just in case. |
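One rough way to look for cores left on an older revision is to compare the per-CPU `microcode` field in `/proc/cpuinfo`. The sketch below assumes the usual x86 Linux layout of that file (a `processor` line followed later by a `microcode` line holding a hex value); the function names are made up for illustration. It only catches a mismatch that persists at the time the file is read, not the short window during resume described above.

```python
def microcode_by_cpu(cpuinfo_text: str) -> dict[int, int]:
    """Map logical CPU number -> microcode revision from /proc/cpuinfo text."""
    revisions: dict[int, int] = {}
    current_cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            current_cpu = int(value)
        elif key == "microcode" and current_cpu is not None:
            # The value is printed as hex, e.g. "0xd6"; base 16 accepts the prefix.
            revisions[current_cpu] = int(value, 16)
    return revisions


def mismatched_cpus(revisions: dict[int, int]) -> dict[int, int]:
    """Return the CPUs whose revision differs from the newest one seen."""
    if not revisions:
        return {}
    newest = max(revisions.values())
    return {cpu: rev for cpu, rev in revisions.items() if rev != newest}
```

To try it on a live system, pass the contents of `/proc/cpuinfo` to `microcode_by_cpu` and feed the result to `mismatched_cpus`; an empty result means every logical CPU reports the same revision.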
|
@hmh: There are definitely other scenarios that cause those illegal accesses, as in the case of my logs above the microcode is not being updated by Linux. |
|
@alyf80: noted... looks like there is more than one bug involved, kernel side. I have seen possibly related fixes in the latest round of stable kernels related to access to MSR 0x123 in situations where it shouldn't be accessed in the new microcode, so the case you described might have been addressed already. But I don't recall any patches related to such accesses being done before AP ucode update in the resume-from-S3 path. In my Dell laptop with microcode revision 0xc6 in UEFI, one can clearly see the touch-MSR-0x123-before-AP-was-updated. |
|
FYI, Dell released system firmware 1.9.1 which includes microcode revision 0xd6. With the new firmware, both the lockups and the MSR 0x123 errors are gone. |
This seems not to be true, for my system, a DELL 5591, at least (respective bug report [here](https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1882943), thanks @mirekingr for that).
So I doubt this has been solved at all. Not even on firmware 0.1.11.1 that the system was using the last couple of days/weeks. Sorry for the rant, but I can't believe it takes weeks to roll back whatever bad decision caused all this nuisance. |
I can confirm the problem is fixed for the Dell Precision 3540.
[…]
Is the second time during resume? It looks very strange that on resume you have revision 0xca, while during boot you already have the newer revision 0xd6. Does the problem only happen after suspend? […] Please contact Dell to check whether the firmware applies microcode updates when resuming. And please open a separate bug report, though I think it has nothing to do with the upstream project. |
There are still problems, though, when booting without the power cable attached. […] |
|
Hello, installed microcode package 👎 mathieu@ZBook15G6:~$ sudo dmesg |grep microcode The bug is highly repeatable and occurs when booting the machine hooked to the HP G2 Thunderbolt 3 dock. When booting the laptop from the battery and hooking up the dock afterwards, it works properly. |
|
06-8e-0c microcode has been updated to revision 0xde in microcode-20201110 release, does the newer microcode revision help? |
|
I've just upgraded microcode and BIOS (1.07.01 rev 1 from https://support.hp.com/fr-fr/drivers/selfservice/hp-zbook-15-g6-mobile-workstation/22892887) mathieu@ZBook15G6:~$ sudo dmesg |grep microcode But no more luck. Booting with the HP G2 Thunderbolt dock hooked up failed several times in a row. bootlog1.txt Some boots reached the login screen, some froze before, some froze after entering login creds. I hope this helps; I can provide additional test results if needed. Regards, Mathieu |
|
From the bootlogs, your machine is not updating the microcode at all: it seems to be already at revision 0xde in UEFI/BIOS. So, any regressions you observed were latent issues that a reboot exposed, but not related to the microcode update. Might have been something in the operating system, or an issue with the HP BIOS update you performed. If your system has a dual-boot BIOS that still has the older version, could you boot with the old BIOS, and check if the microcode update happens? If the issues you observed with your dock were caused by the BIOS update, that might also fix them... |
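The check described here (did the kernel load a newer microcode, or was the firmware's revision already current?) can be sketched as a small log parser. This assumes the message wording used by kernels of that era, namely a `microcode updated early to revision 0x...` line when the kernel applies an update and a `microcode: sig=..., pf=..., revision=...` line reporting the running revision; other kernel versions word these messages differently, and the function name is hypothetical.

```python
import re

# Assumed log formats (5.x-era kernels); adjust for other kernel versions.
EARLY_UPDATE = re.compile(r"microcode updated early to revision (0x[0-9a-fA-F]+)")
REPORTED = re.compile(
    r"microcode: sig=(0x[0-9a-fA-F]+), pf=(0x[0-9a-fA-F]+), revision=(0x[0-9a-fA-F]+)"
)


def summarize_microcode_log(dmesg_text: str) -> dict:
    """Report whether the kernel applied an early update and which revision runs."""
    early = EARLY_UPDATE.search(dmesg_text)
    reported = REPORTED.search(dmesg_text)
    return {
        # True only if the kernel itself loaded a newer revision at boot.
        "updated_by_kernel": early is not None,
        "early_revision": int(early.group(1), 16) if early else None,
        "signature": int(reported.group(1), 16) if reported else None,
        "running_revision": int(reported.group(3), 16) if reported else None,
    }
```

If `updated_by_kernel` is false while `running_revision` already matches the packaged revision, the firmware supplied that microcode and the OS package did not change anything, which is the situation described for this machine.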
|
@hmh: Hello, the issue of not booting when hooked to the HP G2 TB3 dock was already present with the previous BIOS / microcode, so it is not an issue related to this microcode. I was only reporting an issue that is not solved by this microcode / BIOS update. By the way, I'm not sure at all that the issue is microcode related, but the symptoms were close enough to the descriptions by other users, so I posted here. |
|
I see. Anyway, we'd need someone with an outdated BIOS that does not have the current microcode (revision 0xde) and which had issues on reboot with previous microcode updates, to try the new one and report if the freeze-on-reboot is fixed... |
|
I could revert to 1.06.00, but I never had any freeze on reboot on my laptop. Only issues I have are when booting with G2 dock hooked. I need to unhook it before booting, and replug it after boot. |
|
@Matioupi: please don't revert your BIOS, if you never had any freezes, it would not tell us anything... |
|
New revision 0xea of the 06-8e-0c microcode file has been published as part of the microcode-20210608 release; it may be worth trying out. |
Per debian bug 962757, ubuntu bug 1883002, and from internal testing, some systems are seeing freezes, particularly after warm reboots, with the 06-8e-0c/0x806ec revision 0xd6 from the 20200609 microcode release.
The systems from the collected reports are:
Dell Latitude 7400, i5-8265U (debian bug)
Dell Latitude 7300, i7-8665U (ubuntu bug)
Dell Latitude 5410, i5-8365U
Specifically, the reporter of the ubuntu bug indicates that they were initially affected by a similar issue with `0xca`, as reported in #24; a BIOS update from Dell addressed that, but the problem was re-introduced with the 20200609 update that moved from `0xca` to `0xd6`. That user's testing also indicated a much higher frequency of freezes with warm reboots (the freeze seen with the third system was also after a warm reboot). The debian bug reporter stated that their system was also stable with the `0xca` version but not the `0xd6` version.
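For readers matching their own CPU against the `06-8e-0c/0x806ec` pair used in this report: the two forms are the same CPUID signature, once as family-model-stepping and once as the raw value. A minimal sketch of the conversion, following the standard CPUID leaf 1 EAX bit layout (the function names are illustrative only):

```python
def decode_signature(sig: int) -> tuple[int, int, int]:
    """Split a CPUID signature into (display family, display model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used when the base family is 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def ucode_filename(sig: int) -> str:
    """Format a signature as the family-model-stepping name used above."""
    family, model, stepping = decode_signature(sig)
    return f"{family:02x}-{model:02x}-{stepping:02x}"
```

With this, `ucode_filename(0x806ec)` gives `06-8e-0c`, and the `0x806e9` part mentioned in the comments maps to `06-8e-09`.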