XEN booting is unstable #109

Open
miczyg1 opened this issue Aug 6, 2018 · 18 comments

@miczyg1
Member

@miczyg1 miczyg1 commented Aug 6, 2018

No description provided.

@pietrushnic pietrushnic self-assigned this Aug 6, 2018
@miczyg1
Member Author

@miczyg1 miczyg1 commented Aug 7, 2018

Sometimes booting hangs at:

(XEN) CPU1: No irq handler for vector e7 (IRQ -2147483648) 
(XEN) CPU2: No irq handler for vector e7 (IRQ -2147483648)
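
When going through many captured boot logs, a small filter for this message can save time. The following is a minimal sketch, assuming the serial console output has been saved as text; the pattern is inferred only from the two lines quoted above, and the function name `find_unhandled_vectors` is hypothetical.

```python
import re

# Pattern for the message quoted above. The field layout is an assumption
# inferred from those two lines only, not taken from the Xen sources.
NO_HANDLER = re.compile(
    r"\(XEN\) CPU(?P<cpu>\d+): No irq handler for vector (?P<vector>[0-9a-fA-F]+) "
    r"\(IRQ (?P<irq>-?\d+)\)"
)


def find_unhandled_vectors(log_text):
    """Return (cpu, vector, irq) tuples for every matching line in a boot log."""
    hits = []
    for line in log_text.splitlines():
        match = NO_HANDLER.search(line)
        if match:
            # The vector is printed in hex, the IRQ as a signed decimal.
            hits.append(
                (int(match["cpu"]), int(match["vector"], 16), int(match["irq"]))
            )
    return hits


sample = """(XEN) CPU1: No irq handler for vector e7 (IRQ -2147483648)
(XEN) CPU2: No irq handler for vector e7 (IRQ -2147483648)"""

hits = find_unhandled_vectors(sample)
for cpu, vector, irq in hits:
    print(f"CPU{cpu}: vector 0x{vector:02x}, IRQ {irq}")
```

Note that the reported IRQ value, -2147483648, equals -2^31, the smallest signed 32-bit integer.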
@pietrushnic
Member

@pietrushnic pietrushnic commented Sep 22, 2019

@artur-rs is that still the case? We have some signs that a newer Xen may fix the problem, and it looks like @miczyg1 added some fixes. Also, the recent regression test results have lost their link to this issue.

@artur-rs
Member

@artur-rs artur-rs commented Sep 26, 2019

@pietrushnic @miczyg1 the link has been fixed. It seems that the stability problems still occur (1/30 boots). More reliable results will be delivered with the regression testing on apu2-5 platforms in the next release.

@miczyg1
Member Author

@miczyg1 miczyg1 commented Oct 1, 2019

Log from 100 reboots with Xen staging: https://cloud.3mdeb.com/index.php/s/iyzYr78KK9BEF3f

@pietrushnic
Member

@pietrushnic pietrushnic commented Oct 1, 2019

@miczyg1 do you mean that the problem was fixed upstream?

@miczyg1
Member Author

@miczyg1 miczyg1 commented Oct 2, 2019

Not 100% sure yet. This log is from a debug build. In case it is a timing problem or something similar, I would also like to run a test round with a non-debug build.

@jpds

@jpds jpds commented Jul 16, 2020

My apu4d4 with Debian testing as of this week with the following software versions:

  • coreboot v4.12.0.2
  • Xen 4.11
  • Linux 5.7.0

...appears to have no trouble booting Xen whereas with Debian stable it would just crash on boot.

@miczyg1
Member Author

@miczyg1 miczyg1 commented Aug 25, 2020

There is a patch in coreboot that could potentially fix this issue: https://review.coreboot.org/c/coreboot/+/42434
I will be testing it soon.

@HRio

@HRio HRio commented Jan 14, 2021

I have an APU4D4 with BIOS v4.12.0.5 and IOMMU enabled (doing NIC pass-through).

I cannot see this problem with Xen 4.13.2.

@miczyg1
Member Author

@miczyg1 miczyg1 commented Jan 15, 2021

@HRio the newer versions of Xen seem to be working better. We are still working on infrastructure to automatically test up-to-date versions of Xen, so we will keep this open in case anybody faces the issue.

@HRio

@HRio HRio commented Jan 15, 2021

@miczyg1 May I take this opportunity to suggest you have a look at Alpine Linux? It's a perfect fit for a device like this, and we aim to keep Xen up to date.

Alpine Version    Xen Version
edge              4.14.1
3.13              4.14.1
3.12              4.13.2
@miczyg1
Member Author

@miczyg1 miczyg1 commented Jan 15, 2021

@HRio sure, we will take a look at that. Thank you.

@mmaney

@mmaney mmaney commented Mar 19, 2021

I've been meaning to find the time to try Xen on apu2 here, and a couple days ago I pulled the 2E4 out and swapped in a scratch 2 1/2" SSD for some clean install fun. BIOS 4.11.0.6, not quite the very latest. Current Buster minimal install (with sshd because that's smoother than using a serial link, and standard system utils). Then the testing...

Installed Xen system (4.11), omitting qemu and his many, many friends and... Just Worked. Rebooted about a dozen times, with a power cycle or two for variety. No domUs were started (or installed). Then I noticed that some possibly useful features were disabled (by default, I assume) in the BIOS, so enabled first iommu, then EHCI, with a couple reboots each time. Never failed to boot for me.

@pietrushnic
Member

@pietrushnic pietrushnic commented Mar 19, 2021

@mmaney I believe the failure we see occurs when we do 100 consecutive reboots in our automated testing environment. Please check our Regression Test Results; maybe you were just lucky :)
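
To put "lucky" into numbers, here is a rough sketch. It assumes every boot fails independently with the same probability, and it reuses the "1/30 boots" figure reported earlier in this thread, which came from a different setup and may not apply to other hardware or Xen versions. The function name `prob_all_boots_succeed` is hypothetical.

```python
def prob_all_boots_succeed(failure_rate, boots):
    """Probability of seeing zero hangs in `boots` independent boots."""
    return (1.0 - failure_rate) ** boots


# Assumption: the "1/30 boots" failure rate quoted earlier in the thread.
rate = 1 / 30

p_dozen = prob_all_boots_succeed(rate, 12)
p_hundred = prob_all_boots_succeed(rate, 100)

print(f"12 clean boots in a row:  {p_dozen:.3f}")
print(f"100 clean boots in a row: {p_hundred:.3f}")
```

Under these assumptions, a dozen clean boots happens roughly two times out of three, so a short manual test says little, while 100 clean boots in a row would be unlikely (about 3%).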

@miczyg1
Member Author

@miczyg1 miczyg1 commented Mar 19, 2021

Also, our testing procedure for Xen is different. We boot from the network and use Xen 4.8 (yes, not so young), so we definitely need newer versions. When Xen 4.11 or newer was installed on a physical drive, we also didn't face the issues detected in the regression tests. We will keep this issue open until we migrate to a newer Xen hypervisor in our infrastructure and ensure the problem is no longer reproducible.

@mmaney

@mmaney mmaney commented Mar 20, 2021

Well, this all left me with a bump of curiosity, so I hacked up a little script and called it from rc.local. It was supposed to stop after it had rebooted 100 times, but I never got back to check on it, and the test to stop rebooting after 100 times never tripped. It reported having rebooted for the 653rd time before I interrupted it (by choosing the recovery boot in grub).

I'd say it works fine with the current Debian Buster kernel and Xen version (on 2e4 hardware at least).
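
For anyone wanting to repeat this kind of test, a reboot-counting step could look roughly like the sketch below. This is not the script used above, which was not posted; the counter file location, the function names, and the use of the `reboot` command are all assumptions.

```python
import subprocess
from pathlib import Path


def next_boot_count(counter_file):
    """Read the persisted boot count, increment it, write it back, return it."""
    path = Path(counter_file)
    try:
        count = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        # No counter yet, or an unreadable one: start from zero.
        count = 0
    count += 1
    path.write_text(f"{count}\n")
    return count


def system_reboot():
    # Assumption: a `reboot` command is on PATH and we run as root.
    subprocess.run(["reboot"], check=True)


def reboot_loop_step(counter_file, limit, reboot=system_reboot):
    """Record this boot and trigger another reboot until `limit` is reached.

    Returns True if a reboot was requested, False once the limit is hit.
    """
    count = next_boot_count(counter_file)
    print(f"boot number {count}")
    if count < limit:
        reboot()
        return True
    return False
```

It would be called once per boot, for example from rc.local, as `reboot_loop_step("/var/local/reboot-count", limit=100)` (the path is hypothetical). The `reboot` parameter exists so the logic can be exercised without actually restarting the machine.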

@pietrushnic
Member

@pietrushnic pietrushnic commented Mar 20, 2021

@mmaney thanks for spending the time on testing it. I'm not sure whether our automated validation environment uses a system reboot, or rather a shutdown/poweroff followed by powering on through the plug. As Michał said, we have to update our infrastructure and clean up this outdated bug report.

@jailbird777

@jailbird777 jailbird777 commented Mar 24, 2021

I've been running Xen (currently 4.13.2) on OpenSuSE Leap 14.x for about 18 months now on both an apu2 and an apu4 without any stability issues at all. I've even had XCP-ng 8.2.0 running without any issues. So I think the Xen stability issues might be resolved :).
