Cannot launch hard RT user VM with Ubuntu 20.04.3 and preempt-RT kernel 5.10.52-rt47 #6950
Comments
@christophsusen thanks! |
Hi @fuzhongl, thanks for your reply! I applied the patch you mentioned to the preempt-RT kernel running in my user VM and set
So far I did not wait long enough to check if the VM starts successfully at all. Do you have further suggestions on how to improve the situation? Best regards, |
Usually, guest_flag isn't needed for post VM:
To
BR. |
I made the changes to the scenario configuration and rebuilt the hypervisor. Unfortunately, the start-up process of the VM is still awfully slow. EDIT: I played around a little bit with the parameters in the launch script for the RTVM. When removing the Thanks for your help! Best regards, |
Since I read at https://projectacrn.github.io/latest/tutorials/rtvm_performance_tips.html?highlight=lapic_pt#mandatory-options-for-an-rtvm that Best regards, |
Could you please disable Hyper-thread in BIOS and re-generate the board.xml? Thanks! BR. |
Hi @fuzhongl, thanks for your answer! I don't think that my CPU even supports hyper-threading. Here is the output that I get from
Best regards, |
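One way to double-check whether hyper-threading is active is to look at the "Thread(s) per core" field that `lscpu` prints. The following is only a minimal sketch, assuming lscpu-style text as input; the helper name `threads_per_core` is hypothetical and the sample text is made up for illustration, it is not the output posted in this thread.

```python
def threads_per_core(lscpu_text):
    """Extract the 'Thread(s) per core' value from lscpu-style text, or None."""
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Thread(s) per core":
            return int(value.strip())
    return None

# Made-up sample for illustration only, not the output from this thread.
sample = "Core(s) per socket:  8\nThread(s) per core:  1\n"
tpc = threads_per_core(sample)
print("hyper-threading active" if tpc and tpc > 1 else "hyper-threading not active")
```

A value of 1 means each core exposes a single hardware thread, so there is nothing to disable in the BIOS.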
Could you please input Thanks! BR. |
This is the output of
Please note that I changed the number of CPUs for the RTVM to 4 while experimenting with the problem. This is the output I get when running
The VM is not completely stuck at that point. There are new messages being printed really slowly. However, I never had the VM start completely even when I waited very long. Best regards, |
@mcao6 Could you please help to check the boot issue for RTVM? BR. |
@christophsusen
From the log, the time seems to be spent in the FS module, but I'm not sure why. |
Hi @mgcao, thanks for your quick answer! I added Afterwards, I removed
|
From the comparison, I suspect two points:
Please attach your RTVM launch script here. |
You can also run: ps -aux | grep dm |
Hi @mgcao, I'm sorry, yes, I added a passthrough NIC while experimenting and also tried using a passthrough SATA disk for the RTVM instead of virtio-blk. However, that did not change anything. The non-RTVM boots fine with passthrough NIC and SATA but the RTVM still has the same problems. Here is the launch script used for the RTVM to generate the
Here is the output of
And, finally, the output of Thanks a lot for helping me! |
To be honest, this is a very weird issue; we need more experiments and log info:
More log info:
In the ACRN console, before you launch: vmexit clear |
I removed the lines you mentioned from the launch script. The new launch script looks like this:
Here is everything that is printed on the ACRN console until the point where the starting process becomes really slow:
Here is the log info from
I will immediately try to apply the patch to my build. Hopefully, I have enough knowledge to do that. Should I execute |
yes, in ACRN shell. |
I applied the patch. I ran
|
VMEXIT/0x28 VM0/vCPU0 VM0/vCPU1 VM0/vCPU2 VM0/vCPU3 VM2/vCPU0 VM2/vCPU1 VM2/vCPU2 VM2/vCPU3
This one should not happen on VM2/vCPU0, and certainly not so many times. Let me check it. |
Happy to hear that you found something! Thanks again for helping me out! |
@christophsusen, please help to confirm the RTVM kernel log: after the RTVM boots up, when it is stuck (go to a), send out the info. I'd like to confirm the data. |
@mgcao here you find the info that you asked for:
|
I just checked the latest data about VMEXIT/0x28: there is no data for VMEXIT/0x28 on the VM2 vCPUs, so that is fine. The former data is from the initial time, so my suspicion was wrong. There is no obvious issue in the ACRN hypervisor now. Maybe we need to check where the kernel goes and which task causes the long stall. |
It could be complex debugging work. I'm not sure if you're familiar with the Linux kernel. Between them, something could be going wrong that causes the CPU to get stuck. |
Okay, thanks for your help! I don't think that I know enough about the Linux kernel to do that kind of debugging. Do you have a suggestion on which guest OS to use to get an ACRN VM with RT capabilities up and running easily? I would rather change to a different guest OS or kernel than doing the kernel debugging. |
Maybe this data: VMEXIT/0x1e took 363129 us. That is less than 1 second, but it is still not normal. |
I'm not sure if you can use ACRNTrace to capture and parse the data; it is a port I/O access. Alternatively, you can run more tests to check whether it takes this long every time; usually it should be < 1 ms. |
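The "usually < 1 ms" check can be sketched in a few lines. This is only a minimal sketch, assuming the per-exit handling times have been copied by hand from the ACRN console into a dictionary; the function name `flag_slow_vmexits`, the data layout, and the threshold constant are illustrative assumptions and not part of ACRN. Only the 0x1e value comes from this thread.

```python
# Hypothetical helper: flag VM-exit reasons whose handling time exceeds a threshold.
THRESHOLD_US = 1_000  # 1 ms, the usual upper bound mentioned in the thread


def flag_slow_vmexits(times_us, threshold_us=THRESHOLD_US):
    """Return {exit_reason: milliseconds} for every entry above threshold_us."""
    return {
        reason: us / 1000.0
        for reason, us in times_us.items()
        if us > threshold_us
    }


# The 0x1e value is taken from the thread; other entries would come from your own logs.
samples = {0x1E: 363129}
slow = flag_slow_vmexits(samples)
for reason, ms in slow.items():
    print(f"VMEXIT/{reason:#04x} took {ms:.3f} ms")
```

Repeating the measurement over several boots and feeding each run into the same check would show whether the long port I/O exit happens every time or only once.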
@christophsusen you can ask @fuzhongl whether he could offer a virtio image for the RTVM to test again. |
@christophsusen BR. |
How did you build your RT kernel by the way (what config are you using)? I see you also use an initrd with it, have you tried to have all drivers compiled in instead and get rid of that initrd? |
@fuzhongl, thanks for the offer! I will try to make a clean user VM image later today or tomorrow and send you a link to download it afterwards. |
Hi @gvancuts, as I started to experiment with ACRN, I first tried to configure a non-RTVM with Ubuntu 20.04. After I got this to work, I tried to use this VM image as a starting point for the RTVM and compiled the preempt RT kernel inside the Ubuntu VM. Here are all the commands that I executed to do this:
The only thing I changed in the user VM from that point are the kernel command line parameters. I think, the last set I used is the following: Regarding your second question: I have not tried to get rid of initrd. If that could potentially solve the issue, I have to figure out how to do this first. I actually never compiled a linux kernel except for this RT kernel for the VM. 😅 |
Thanks @christophsusen! @mgcao, do we have a kernel config we could recommend? Should we have @christophsusen compile and try to use https://github.com/projectacrn/acrn-kernel/tree/5.10/preempt-rt instead (not sure whether we may have other ACRN patches in there)? You do not need an initrd if all the essential drivers are built into the kernel, so if your kernel config is right, you can avoid it. |
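To see whether a kernel config would allow booting without an initrd, one can check that the drivers needed to reach the root filesystem are built in (`=y`) rather than built as modules (`=m`). The following is a minimal sketch under stated assumptions: the `ESSENTIAL` list is only an illustrative guess for a virtio-blk root disk with ext4 and must be adapted to the actual VM setup, and the helper name `not_built_in` is hypothetical.

```python
# Illustrative list only: which drivers are "essential" depends on your VM setup.
ESSENTIAL = ["CONFIG_VIRTIO_PCI", "CONFIG_VIRTIO_BLK", "CONFIG_EXT4_FS"]


def not_built_in(config_lines, required=ESSENTIAL):
    """Return {option: state} for required options that are not set to 'y'."""
    state = {}
    for line in config_lines:
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments and are skipped here.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            state[name] = value
    return {
        opt: state.get(opt, "not set")
        for opt in required
        if state.get(opt) != "y"
    }


# Made-up config fragment for illustration only.
sample = """\
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BLK=m
# CONFIG_EXT4_FS is not set
"""
missing = not_built_in(sample.splitlines())
print(missing)
```

Against a real kernel tree, the same function could be called with the lines of the `.config` file; an empty result would suggest that the listed drivers do not require an initrd.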
@gvancuts, @mgcao, I can't find a config file in https://github.com/projectacrn/acrn-kernel/tree/5.10/preempt-rt. Do I have to generate such a config file by myself from scratch or can you recommend something to get started with? Thanks a lot for taking so much time to help me out, much appreciated! |
We have one for the 4.19 kernel here. But I have not played with this in some time, so I do not know if you can use it with the 5.10 kernel. @fuzhongl, can you share the kernel config file we use to test real-time performance? |
Please try it: rtvm_config.gz |
Thanks, @fuzhongl, @gvancuts and @mgcao! I compiled the kernel with the config file you provided and tried to use it with the non-RTVM and with the RTVM launch script. The former one works fine, as before. Unfortunately, when using the launch script for the RTVM, the starting process gets stuck again. Here is the output that I get on the ACRN console after executing
|
Just in case this might help to narrow down the issue: I tried to launch the Ubuntu user VM with Xenomai kernel 4.19.59 as described on this page in the ACRN documentation. Using this kernel, the RTVM boots without any problems. So it seems like the problem is somewhere in my kernel configuration. Do you have advice for me on how to configure a preempt-RT kernel maybe based on the working configuration for the Xenomai kernel? |
After a long time, I was able to work on this issue again and finally got everything to work. I pursued the following steps. Just in case, this might help someone else.
Thanks a lot for your help, @fuzhongl, @gvancuts and @mgcao! |
Describe the bug
Cannot launch hard RT user VM with Ubuntu 20.04.3 and preempt-RT kernel 5.10.52-rt47. The user VM gets stuck in the starting process when using the launch script for an RT-VM. When a launch script for a non-RT-VM is used with the same image, the VM starts successfully.
Platform
Dell OptiPlex 5070 with Intel Core i7-9700, 8 GB RAM and 256 GB NVME SSD. The board configuration file can be found here.
Codebase
ACRN kernel v2.6
ACRN hypervisor v2.6
SOS Ubuntu 20.04.3
UOS Ubuntu 20.04.3 with preempt-RT kernel 5.10.52-rt47
Scenario
Custom scenario derived from the industry scenario. Scenario contains one non-RT-VM (VM number one) and one RT-VM (VM number two). These VMs are basically identical to VMs three and two from the industry scenario.
To Reproduce
Steps to reproduce the behavior:
Expected Behavior
As far as I understand, my configuration should work and the RT user VM should start. If anything is wrong with my configuration, please let me know.
Additional Context
Thanks in advance for your help!
I changed the file ending to .txt for all uploaded files since GitHub does not accept .xml and .sh files.