-
-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
4.2-RC3: OpenBSD 7.3 ISO doesn't boot anymore #8502
Comments
Any non-default settings for that qube? (memory size? some qvm-features?) |
Sorry, not much to go on here. What's trap 4, because it surely isn't #OF? Any way to figure out what the faulting instruction is? |
No, I also tried with a new qube created from the GUI in the qube manager with the same result. I think it's related to the CPU (Ryzen 5 5600X). On another 4.2-RC3 with an i5-7300U this is working fine. 🤔 |
What Zen uarch is the Ryzen? First thought is maybe the PKS feature |
seems to be Zen 3 |
Huh. So it was a #GP on RDMSR, which is definitely odd |
Could you explain a bit what #GP mean? I'm not familiar with those terms. |
#GP is general protection fault. Error code of 0 is "misc" and a dumping ground for everything. |
In parallel, I'll report this on OpenBSD mailing list. Could you confirm the HVM are using qemu? (when trying to stop a stuck HVM, the qubes applet provide a way to show qemu logs.. so I guess it's using qemu) |
yup, qemu is running as device model (but based on the crash, it's rather irrelevant here) |
This will most likely be a Xen issue, but I have to admit I'm a bit confused. I wonder if there's a weird TSC mode set for the VM, and rdtsc is generally intercepted. When the VM is crashed, can you run |
Can you do the same with |
Sorry, I did mean s. (I'm debugging from a phone in breaks while driving, not in front of my usual dev setup) |
Right - so I'm an idiot so far. That's an RDMSR failing, not RDTSC. Which MSR is being used in |
So I expect it is def = rdmsr(MSR_PSTATEDEF(0)); blowing up in |
@andyhhp does that mean that OpenBSD is peeking at MSRs it has no business looking at, or that Xen is advertising a feature and then injecting #GP when the guest tries to use it? |
Not sure. I don't have the manuals to hand right now, but I do recall being dubious about the HWCR patch in the first place |
HWCR? |
I'm fairly sure it was broken in Xen 4.15 by https://lore.kernel.org/xen-devel/0c8043e3-07aa-6242-19bd-07b04f574b87@suse.com/, a series committed over my objections concerning the correctness of the changes. It appears it was to shut up Linux, which makes different and equally dubious model specific assumptions about the availability of certain MSRs.
|
@rapenne-s: I think the reason this was missed for so long is that OpenBSD is not used on Xen very much. |
does it mean it's an OpenBSD issue? |
From @andyhhp (Xen core developer and x86 expert) it appears that this is a combination of a Xen bug and an OpenBSD bug. Fixing either issue should be enough for OpenBSD to boot, but obviously fixing both would be ideal. |
I don't understand this part:
From what I can tell, none of the architecure PRMs/SDMs say that "an operating system should not perform model specific checks without checking !hv". There are certain MSRs that are gated behind specific CPUID bits; an OS should certainly be checking those bits before blindly assuming such MSRs exist (I just made a fix for this for HW power management on AMD last week in OpenBSD). But making the broad claim that "if CPUID_HV is set then all model specific checks should not be performed" is a bit odd. If that bit is set, what should an OS assume? That it's running on an 80386? Maybe I'm misunderstanding something in that claim; if the statement was intended to say: "It's buggy for OpenBSD to read an MSR if the CPUID bit describing presence of that MSR is not set" then I'm 100% in agreement. But saying "if HV is set, then no model specific checks should be performed" is a bit odd. Can you clarify? |
I’m not @andyhhp, but my understanding is that “model-specific checks” refers to physical CPU models, not to MSRs. A hypervisor is allowed to pick and choose which features it exposes to guests, so a guest cannot assume that a hypervisor is emulating a specific CPU that actually existed. In particular, a guest cannot assume that support for feature X implies support for feature Y unless the architecture specification says so, even if every processor shipped with feature X has also had feature Y. In this case, Xen claims to support TSCFREQSEL. OpenBSD seems to assume that if TSCFREQSEL is present, then so are P-states. However, Xen does not expose P-states, so OpenBSD’s attempt to access the P-state MSRs faults. This is an OpenBSD bug: OpenBSD must not assume that support for TSCFREQSEL implies P-state support, even if every physical processor that ever shipped with TSCFREQSEL did in fact have P-state support. As it happens, Xen should not actually be advertising TSCFREQSEL if live migration is enabled. This doesn’t matter for Qubes OS, though, because Qubes OS does not support live migration and so the limitations of live migration are irrelevant. |
Some MSRs are architectural; you can rely on CPUID bits declaring their presence. In the case of MSR_HWCR, you can rely on the fact that you're on AMD and Long Mode is available. Architectural MSRs are described in the APM, Vol2 System Programming. Some MSRs, including MSR_HWCR sadly, have a mix of architectural and model specific bits. The TSC_FREQ_SEL is excluded from the list of architectural bits in APM Vol2 3.2.10 Hardware Configuration Register (HWCR), so it is model specific. i.e. the meaning of that bit is not (or might not be) the same across different AMD CPUs. The C/Pstate control MSRs are entirely model specific. And this is the problem. When virtualised, you get given a Family/Model/Stepping at boot, but as you migrate around, you will end up on different hardware, and the meaning of those bits do change. Also, as you migrate around, the P0 frequency will change, which is why VMs specifically don't get given that info in the first place (in the general case - if you're willing to never migrate then it's safe to pass through more hardware details). Under virt, if you're using the Family/Model/Stepping for anything more than diagnostic information, you're in for an unwelcome surprise when migrating. |
Thanks Demi and Andrew for the clarifications. When viewed from the perspective of migration, then this makes a lot more sense, and I agree with the explanations given. I'll take a look where OpenBSD is making these assumptions and see if I can clean those up. |
OpenBSD 7.3 will unconditionally access HWCR if the TSC is reported as Invariant, and it will then attempt to also unconditionally access PSTATE0 if HWCR.TscFreqSel is set (currently the case on Xen). The motivation for exposing HWCR.TscFreqSel was to avoid warning messages from Linux. It has been agreed that Linux should be changed instead to not complaint about missing HWCR.TscFreqSel when running virtualized. The relation between HWCR.TscFreqSel and PSTATE0 is not clearly written down in the PPR, but it's natural for OSes to attempt to fetch the P0 frequency if the TSC is stated to increment at the P0 frequency. Exposing PSTATEn (PSTATE0 at least) with all zeroes is not a suitable solution because the PstateEn bit is read-write, and OSes could legitimately attempt to set PstateEn=1 which Xen couldn't handle. Furthermore, the TscFreqSel bit is model specific and was never safe to expose like this in the first place. At a minimum it should have had a toolstack adjustment to know not to migrate such a VM. Therefore, simply remove the bit. Note the HWCR itself is an architectural register, and does need to be accessible by the guest. Since HWCR contains both architectural and non-architectural bits, going forward care must be taken to assert the exposed value is correct on newer CPU families. Reported-by: Solène Rapenne <solene@openbsd.org> Link: QubesOS/qubes-issues#8502 Fixes: 14b95b3 ('x86/AMD: expose HWCR.TscFreqSel to guests') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
@marmarek Fix has gone into upstream Xen: https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e4ca4e261da3fdddd541c3a9842b1e9e2ad00525 |
I updated my dom0 and OpenBSD starts again. Thanks everyone who got involved to solve this! |
OpenBSD 7.3 will unconditionally access HWCR if the TSC is reported as Invariant, and it will then attempt to also unconditionally access PSTATE0 if HWCR.TscFreqSel is set (currently the case on Xen). The motivation for exposing HWCR.TscFreqSel was to avoid warning messages from Linux. It has been agreed that Linux should be changed instead to not complaint about missing HWCR.TscFreqSel when running virtualized. The relation between HWCR.TscFreqSel and PSTATE0 is not clearly written down in the PPR, but it's natural for OSes to attempt to fetch the P0 frequency if the TSC is stated to increment at the P0 frequency. Exposing PSTATEn (PSTATE0 at least) with all zeroes is not a suitable solution because the PstateEn bit is read-write, and OSes could legitimately attempt to set PstateEn=1 which Xen couldn't handle. Furthermore, the TscFreqSel bit is model specific and was never safe to expose like this in the first place. At a minimum it should have had a toolstack adjustment to know not to migrate such a VM. Therefore, simply remove the bit. Note the HWCR itself is an architectural register, and does need to be accessible by the guest. Since HWCR contains both architectural and non-architectural bits, going forward care must be taken to assert the exposed value is correct on newer CPU families. Reported-by: Solène Rapenne <solene@openbsd.org> Link: QubesOS/qubes-issues#8502 Fixes: 14b95b3 ('x86/AMD: expose HWCR.TscFreqSel to guests') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> master commit: e4ca4e2 master date: 2023-09-18 15:07:49 +0200
OpenBSD 7.3 will unconditionally access HWCR if the TSC is reported as Invariant, and it will then attempt to also unconditionally access PSTATE0 if HWCR.TscFreqSel is set (currently the case on Xen). The motivation for exposing HWCR.TscFreqSel was to avoid warning messages from Linux. It has been agreed that Linux should be changed instead to not complaint about missing HWCR.TscFreqSel when running virtualized. The relation between HWCR.TscFreqSel and PSTATE0 is not clearly written down in the PPR, but it's natural for OSes to attempt to fetch the P0 frequency if the TSC is stated to increment at the P0 frequency. Exposing PSTATEn (PSTATE0 at least) with all zeroes is not a suitable solution because the PstateEn bit is read-write, and OSes could legitimately attempt to set PstateEn=1 which Xen couldn't handle. Furthermore, the TscFreqSel bit is model specific and was never safe to expose like this in the first place. At a minimum it should have had a toolstack adjustment to know not to migrate such a VM. Therefore, simply remove the bit. Note the HWCR itself is an architectural register, and does need to be accessible by the guest. Since HWCR contains both architectural and non-architectural bits, going forward care must be taken to assert the exposed value is correct on newer CPU families. Reported-by: Solène Rapenne <solene@openbsd.org> Link: QubesOS/qubes-issues#8502 Fixes: 14b95b3 ('x86/AMD: expose HWCR.TscFreqSel to guests') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
How to file a helpful issue
Qubes OS release
4.2-RC3
Brief summary
The amd64boot iso for OpenBSD doesn't boot anymore since I upgraded to 4.2, the same happened for my OpenBSD HVMs that were working fine just before the upgrade.
I'd say it's not OpenBSD fault here because it was working just before the upgrade, and it's still booting fine on any other virtualization software.
7.3 is the last release to date.
Steps to reproduce
qvm-start OpenBSD --cdrom="web:/home/user/cd73.iso"
= boot the OpenBSD qube using cd73.iso downloaded in qube web (adapt the names)Expected behavior
it boots
Actual behavior
boot is failing
The text was updated successfully, but these errors were encountered: