-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
arch: x86_64: re-enable KVM_FEATURE_ASYNC_PF_INT in CPUID #5626
Conversation
The commit b92fe64 (vmm: cpu: Disable KVM_FEATURE_ASYNC_PF_INT in CPUID) disabled APF (Asynchronous Page Fault) mechanism to address problem that makes vcpu thread spin 100%. As the actual issue is in KVM, which has been merged in commit 2f15d027c05f (KVM: x86: Properly handle APF vs disabled LAPIC situation) since 2021, so it's okay to re-enable APF now. Signed-off-by: Yi Wang <foxywang@tencent.com>
@up2wing Please add a commit that updates README.md with the minimum released kernel that this applies to. Thanks! |
Done :) |
README.md
Outdated
@@ -79,7 +79,7 @@ The following sections describe how to build and run Cloud Hypervisor. | |||
## Host OS | |||
|
|||
For required KVM functionality the minimum host kernel version is 4.11. For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will CH work with 4.11 after this change? Otherwise this is the version that should be bumped!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, what's the difference of "required KVM functionality" and "adequate performance" please? I am not sure about this.
If we update the "required KVM functionality" version to 5.13, then the "adequate performance" version should update to 5.13 together? I suppose so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the older version was the minimum that would work - lower than that it should bail on a failure to turn on one of the required extensions. The problem is that we need an new version due to a bug in the kernel that we can't easily detect - hence why that CPUID bit was disabled.
What is the impact of running with this CPUID bit turned off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the impact of running with this CPUID bit turned off?
When the guest vcpu access memory swapped out by host its execution is suspended until memory is swapped back. Asynchronous page fault is a way to try and use guest vcpu more efficiently by allowing it to execute other tasks while page is brought back into memory[1]
So the performance of guest may be impacted if the APF is disabled.
Though it's difficult to decide kernel version, how about add a option in --cpus features=
like amx
? Disable APF by default and turn it on if people need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not have the option. If we move the minimum supported version what impact will that have on our users i.e. what host distributions will be excluded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply.
So, do you mean update both of the minimum and recommended host kernel to 5.13?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, lets just do that!
As the commit 2f15d027c05f (KVM: x86: Properly handle APF vs disabled LAPIC situation) has been merged into kernel 5.13, so update the minimum performance host kernel version to that. Signed-off-by: Yi Wang <foxywang@tencent.com>
The commit b92fe64 (vmm: cpu: Disable KVM_FEATURE_ASYNC_PF_INT in CPUID) disabled APF (Asynchronous Page Fault) mechanism to address problem that makes vcpu thread spin 100%. As the actual issue is in KVM, which has been merged in commit 2f15d027c05f (KVM: x86: Properly handle APF vs disabled LAPIC situation) since 2021, so it's okay to re-enable APF now.