Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vmm: allow coredump no need pause firstly #5604

Merged
merged 2 commits into from
Jul 31, 2023

Conversation

up2wing
Copy link
Contributor

@up2wing up2wing commented Jul 18, 2023

It makes sense pause/resume in coredump() method, so api caller doesn't need to pause vm firstly.

Also, add integration test of this modification.

@up2wing up2wing requested a review from a team as a code owner July 18, 2023 02:11
Copy link
Member

@rbradford rbradford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for this? Is pausing the VM first really that much of an inconvenience - are you worried about a race?

Also I would be worried that pausing is actually an async operation - you need to wait for the virtio/CPU threads to spin round their loop and notice that they should be paused. With the separate pause HTTP action there there is less likely to be a race. Possibly we should block the pause until everything is paused but that is quite a lot of work.

}
VmState::Paused => {}
_ => {
return Err(GuestDebuggableError::Coredump(anyhow!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message doesn't make any sense now.

@up2wing
Copy link
Contributor Author

up2wing commented Jul 18, 2023

Thanks for you quick reply, Rob.

What's the motivation for this? Is pausing the VM first really that much of an inconvenience - are you worried about a race?

Yes, it may not be so convenient to do coredump now. One should do pause -> coredump -> resume every time for now, and with this patch, just call coredump one command :)

Also, qemu does stop vm operation in qmp dump command, which may be used for our reference.

Also I would be worried that pausing is actually an async operation - you need to wait for the virtio/CPU threads to spin round their loop and notice that they should be paused. With the separate pause HTTP action there there is less likely to be a race. Possibly we should block the pause until everything is paused but that is quite a lot of work.

Yes, I didn't consider the async situation, thanks for pointing this.

I looked into the pause trait method implemented in each cpu and device, maybe it's not hard to sync pause if I am right.

For cpu, we can use join_thread() in VcpuState to wait vcpu thread loop exit. For virtio device, almost every of them call pause() in VirtioCommon, which calls paused_sync wait() to make sure thread in virtio device exits safely.

@rbradford
Copy link
Member

For cpu, we can use join_thread() in VcpuState to wait vcpu thread loop exit. For virtio device, almost every of them call pause() in VirtioCommon, which calls paused_sync wait() to make sure thread in virtio device exits safely.

Great, please create an issue (or PR) to ensure that pause is synchronised. We can then revisit this PR as I think that the coredump is only really useful if you can be sure that everything is correctly paused.

@up2wing
Copy link
Contributor Author

up2wing commented Jul 18, 2023

For cpu, we can use join_thread() in VcpuState to wait vcpu thread loop exit. For virtio device, almost every of them call pause() in VirtioCommon, which calls paused_sync wait() to make sure thread in virtio device exits safely.

Great, please create an issue (or PR) to ensure that pause is synchronised. We can then revisit this PR as I think that the coredump is only really useful if you can be sure that everything is correctly paused.

Okay, I will have a try.
Thx :)

@up2wing
Copy link
Contributor Author

up2wing commented Jul 26, 2023

I have rebased from latest upstream including the synchronous pause patch, please review again :)

vmm/src/vm.rs Outdated
VmState::Paused => {}
_ => {
return Err(GuestDebuggableError::Coredump(anyhow!(
"Trying to coredump while VM is running"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update this error message accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thx :)

@@ -36,6 +36,8 @@ pub struct DumpState {
pub enum GuestDebuggableError {
Coredump(anyhow::Error),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is a better commit message:

vmm: Automatically pause VM for coredump

If the VMM is not already paused then pause the VM prior to executing the coredump and then resume it after. If the VM is already paused then the original state is maintained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thx :)

If the VMM is not already paused then pause the VM prior to executing
the coredump and then resume it after. If the VM is already paused then
the original state is maintained.

Signed-off-by: Yi Wang <foxywang@tencent.com>
Add integration test of coredump with no need pause.

As file of coredump has been tested in test_coredump(), so this
patch only test vm state after coredump.

Signed-off-by: Yi Wang <foxywang@tencent.com>
@rbradford rbradford merged commit 9da7435 into cloud-hypervisor:main Jul 31, 2023
21 checks passed
@up2wing up2wing deleted the dump branch August 1, 2023 01:57
@peng6662001
Copy link
Contributor

@up2wing Could you please share a way to verify the patches?
I use ./scripts/dev_cli.sh tests --integration -- --test-filter test_coredump_no_pause on aarch64 server.
But it returns:

running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

How to enable the feature of guest_debug in the case?

@up2wing
Copy link
Contributor Author

up2wing commented Sep 21, 2023

Coredump only supports x86_64 for now.
@peng6662001

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants