
Commit 55105db

junjiemao1 authored and lijinxia committed
DM: notify VHM request complete after pausing the VM
It is necessary to notify the VHM and hypervisor on the completion of a VHM request even when the UOS is in suspend or system reset mode, because the VHM and hypervisor rely on that notification to reset their own state for the request. Currently the VHM request state is checked against REQ_STATE_PROCESSING instead of REQ_STATE_COMPLETE when handling system reset or suspend/resume, leaving a completed request unnotified and causing the HV to complain about an occupied VHM request slot when it raises a new one.

This patch fixes the issue by properly notifying completed requests to the VHM and hypervisor.

Some concerns were raised during a discussion about potential races that do not hurt for now but may in the future. These considerations and potential solutions are documented as comments for future reference.

Tracked-On: #895
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Yin Fengwei <fengwei.yin@intel.com>
1 parent 4753da4 · commit 55105db

File tree

1 file changed: +46 −12 lines changed


devicemodel/core/main.c

Lines changed: 46 additions & 12 deletions
@@ -395,16 +395,12 @@ handle_vmexit(struct vmctx *ctx, struct vhm_request *vhm_req, int vcpu)
 	(*handler[exitcode])(ctx, vhm_req, &vcpu);
 	atomic_store(&vhm_req->processed, REQ_STATE_COMPLETE);
 
-	/* If UOS is not in suspend or system reset mode, we don't
-	 * need to notify request done.
-	 *
-	 * But there is little difference between reset and suspend:
-	 * - We don't need to notify request done for reset because reset
-	 *   doesn't care it.
-	 * - We don't need to notify request done for suspend because
-	 *   guest will offline all APs, write PM register to trigger
-	 *   PM. Then the PM register writting will trigger the latest
-	 *   vmexit and it doesn't care request done either.
+	/* We cannot notify the VHM/hypervisor on the request completion at this
+	 * point if the UOS is in suspend or system reset mode, as the VM is
+	 * still not paused and a notification can kick off the vcpu to run
+	 * again. Postpone the notification till vm_system_reset() or
+	 * vm_suspend_resume() for resetting the ioreq states in the VHM and
+	 * hypervisor.
 	 */
 	if ((VM_SUSPEND_SYSTEM_RESET == vm_get_suspend_mode()) ||
 	    (VM_SUSPEND_SUSPEND == vm_get_suspend_mode()))
@@ -526,7 +522,42 @@ vm_system_reset(struct vmctx *ctx)
 		struct vhm_request *vhm_req;
 
 		vhm_req = &vhm_req_buf[vcpu_id];
-		if ((atomic_load(&vhm_req->processed) == REQ_STATE_PROCESSING) &&
+		/*
+		 * The state of the VHM request already assigned to DM can be
+		 * COMPLETE if it has already been processed by the vm_loop, or
+		 * PROCESSING if the request is assigned to DM after vm_loop
+		 * checks the requests but before this point.
+		 *
+		 * Unless under emergency mode, the vcpu writing to the ACPI PM
+		 * CR should be the only vcpu of that VM that is still
+		 * running. In this case there should be only one completed
+		 * request which is the ACPI PM CR write. Notify the completion
+		 * of that request here (after the VM is paused) to reset its
+		 * state.
+		 *
+		 * When handling emergency mode triggered by one vcpu without
+		 * offlining any other vcpus, there can be multiple VHM requests
+		 * with various states. Currently the context of that VM in the
+		 * DM, VHM and hypervisor will be destroyed and recreated,
+		 * causing the states of VHM requests to be dropped.
+		 *
+		 * TODO: If the emergency mode is handled without context
+		 * deletion and recreation, we should be careful about potential
+		 * races when resetting VHM request states. Some considerations
+		 * include:
+		 *
+		 * * Use cmpxchg instead of load+store when distributing
+		 *   requests.
+		 *
+		 * * vm_reset in VHM should clean up the ioreq bitmap, while
+		 *   vm_reset in the hypervisor should clean up the states of
+		 *   VHM requests.
+		 *
+		 * * vm_reset in VHM should hold a mutex to block the
+		 *   request distribution tasklet from assigning more
+		 *   requests before VM reset is done.
+		 */
+		if ((atomic_load(&vhm_req->processed) == REQ_STATE_COMPLETE) &&
 		    (vhm_req->client == ctx->ioreq_client))
 			vm_notify_request_done(ctx, vcpu_id);
 	}
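
As an aside on the first TODO item in the comment above ("use cmpxchg instead of load+store when distributing requests"), the idea can be pictured with a tiny standalone C11 snippet. This is a sketch only, not VHM code: distribute_request and the bare atomic_int slot are hypothetical stand-ins used to show why a single compare-exchange closes the window that a separate load+store would leave open against a concurrent vm_reset.

/* Standalone sketch of the cmpxchg idea from the TODO above; names are
 * illustrative, not ACRN/VHM APIs.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

enum { REQ_STATE_FREE, REQ_STATE_PENDING, REQ_STATE_PROCESSING, REQ_STATE_COMPLETE };

/* With a separate load+store, a concurrent reset could flip the slot back
 * to FREE between the load and the store, and the late PROCESSING store
 * would resurrect a request that no longer exists. A single
 * compare-exchange only claims the slot if it is still PENDING at the
 * moment of the update.
 */
static bool distribute_request(atomic_int *state)
{
	int expected = REQ_STATE_PENDING;

	return atomic_compare_exchange_strong(state, &expected,
					      REQ_STATE_PROCESSING);
}

int main(void)
{
	atomic_int slot = REQ_STATE_PENDING;

	printf("first claim:  %d\n", distribute_request(&slot));  /* 1: claimed */
	printf("second claim: %d\n", distribute_request(&slot));  /* 0: already PROCESSING */
	return 0;
}
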
@@ -558,7 +589,10 @@ vm_suspend_resume(struct vmctx *ctx)
 		struct vhm_request *vhm_req;
 
 		vhm_req = &vhm_req_buf[vcpu_id];
-		if ((atomic_load(&vhm_req->processed) == REQ_STATE_PROCESSING) &&
+		/* See the comments in vm_system_reset() for considerations of
+		 * the notification below.
+		 */
+		if ((atomic_load(&vhm_req->processed) == REQ_STATE_COMPLETE) &&
 		    (vhm_req->client == ctx->ioreq_client))
 			vm_notify_request_done(ctx, vcpu_id);
 	}
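
To make the failure mode described in the commit message concrete, here is a minimal, self-contained model of the request lifecycle (FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE). It is a sketch under assumptions: struct fake_req and the helper functions are simplified stand-ins, not the ACRN shared-page ioreq protocol. It shows how checking a completed request against REQ_STATE_PROCESSING leaves the slot occupied, so the hypervisor's next request finds it busy, while checking against REQ_STATE_COMPLETE frees it as intended.

/* Minimal stand-alone model of the handshake described in the commit
 * message. State names mirror the VHM request states, but the structures
 * and helpers are simplified stand-ins, not ACRN DM/VHM APIs.
 */
#include <stdatomic.h>
#include <stdio.h>

enum { REQ_STATE_FREE, REQ_STATE_PENDING, REQ_STATE_PROCESSING, REQ_STATE_COMPLETE };

struct fake_req {
	atomic_int processed;
};

/* Hypervisor side: a new request can only be raised into a FREE slot. */
static int hv_raise_request(struct fake_req *req)
{
	int expected = REQ_STATE_FREE;

	if (!atomic_compare_exchange_strong(&req->processed, &expected,
					    REQ_STATE_PENDING)) {
		printf("HV: slot still occupied (state %d)\n", expected);
		return -1;
	}
	printf("HV: request raised\n");
	return 0;
}

/* DM side: pick up and emulate the request, leaving it COMPLETE but
 * deliberately not notifying yet (the suspend/system-reset path).
 */
static void dm_handle_request(struct fake_req *req)
{
	atomic_store(&req->processed, REQ_STATE_PROCESSING);
	/* ... emulate the PM register write ... */
	atomic_store(&req->processed, REQ_STATE_COMPLETE);
}

/* DM side after pausing the VM: notify (free) requests in check_state. */
static void dm_reset_path(struct fake_req *req, int check_state)
{
	if (atomic_load(&req->processed) == check_state)
		atomic_store(&req->processed, REQ_STATE_FREE); /* notify done */
}

int main(void)
{
	struct fake_req req = { REQ_STATE_FREE };

	hv_raise_request(&req);
	dm_handle_request(&req);

	/* Old behaviour: nothing matches PROCESSING, the slot stays occupied. */
	dm_reset_path(&req, REQ_STATE_PROCESSING);
	hv_raise_request(&req);		/* fails: slot never freed */

	/* Fixed behaviour: the completed request is notified and freed. */
	dm_reset_path(&req, REQ_STATE_COMPLETE);
	hv_raise_request(&req);		/* succeeds */
	return 0;
}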
