over-eager to destroy VM when device start() fails #1115

@iximeow

device start is a fallible operation! we try to start devices in propolis-server:

    // Send synchronous start commands to all devices.
    for (name, dev) in objects.device_map() {
        info!(self.log, "sending start request to {}", name);
        let res = dev.start();
        if let Err(e) = res {
            error!(
                self.log, "device start() returned an error";
                "device" => %name,
                "error" => %e
            );
            return VmStartOutcome::Failed;
        }
    }

if a device fails to start (or a block backend, below, fails to start), we'll then return VmStartOutcome::Failed and eventually get to HandleEventOutcome::Exit. then we set_rundown() and drop the StateDriver, eventually getting through

    impl Drop for VmObjects {
        fn drop(&mut self) {
            // Signal to these objects' owning VM that rundown has completed and a
            // new VM can be created.
            //
            // It is always safe to complete rundown at this point because the state
            // driver ensures that if it creates VM objects, then it will not drop
            // them without first moving the VM to the Rundown state.
            let parent = self.parent.clone();
            tokio::spawn(async move {
                parent.complete_rundown().await;
            });
        }
    }

at this point we'll have dropped the Machine, uninstalled the guest's memory (and MSI-X handle!) from devices, and dropped everything. some devices will have been started, some will not yet have started. devices that have started will have some parts dropped by Propolis, but spawned threads and spawned tasks may not be cancelled, stopped, joined on, etc.
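
to illustrate the "spawned threads may not be stopped or joined" part, here is a minimal standalone sketch. it is not Propolis code: `Poller`, `spawn`, and `stop_and_join` are hypothetical names, and an `Arc<Vec<u8>>` stands in for guest memory. the point it shows is that dropping a thread's `JoinHandle` only detaches the thread, so whatever the thread holds stays alive until something explicitly tells it to exit and waits for it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a device's poller thread (not a Propolis type).
struct Poller {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<u64>>,
}

impl Poller {
    /// Spawn a thread that keeps touching `mem` until it is told to stop.
    fn spawn(mem: Arc<Vec<u8>>) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            let mut polls = 0u64;
            while !flag.load(Ordering::Acquire) {
                // Touch the "guest memory" the way a real poller would.
                let _ = mem.first();
                polls += 1;
                thread::sleep(Duration::from_millis(1));
            }
            polls
        });
        Self { stop, handle: Some(handle) }
    }

    /// Tell the thread to exit and wait for it. Returns the number of polls
    /// the thread ran, or None if it was already joined.
    fn stop_and_join(&mut self) -> Option<u64> {
        self.stop.store(true, Ordering::Release);
        self.handle.take().map(|h| h.join().expect("poller panicked"))
    }
}

impl Drop for Poller {
    fn drop(&mut self) {
        // Dropping a JoinHandle only detaches the thread, so stop explicitly.
        let _ = self.stop_and_join();
    }
}

fn main() {
    let mem = Arc::new(vec![0u8; 4096]);
    let mut poller = Poller::spawn(Arc::clone(&mem));
    thread::sleep(Duration::from_millis(10));
    let polls = poller.stop_and_join();
    println!("poller exited after {:?} polls", polls);
    // Once the thread is joined, only the caller's reference remains.
    assert_eq!(Arc::strong_count(&mem), 1);
}
```

without the `Drop` impl (or an explicit `stop_and_join()`), the thread in this sketch would keep running and keep its reference to the memory after `Poller` is gone, which is the shape of the problem described above.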

#1110 is, in part, because the vsock poller expects that once it has memory, memory only goes away after the device is paused and the poller thread is told to exit. we probably should expect that Propolis embedders follow the valid state transitions described by Indicator, and that if a device is started it must be paused before dropping the Machine, that a device is halted when dropped, etc.
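
one possible shape for this, as a rough sketch rather than a proposed patch: when a start fails, walk back over the devices that already started and pause and halt them before reporting failure. everything below is hypothetical (the `State` enum, the `Device` struct, `start_all`, and the device names); it only loosely mirrors the transitions mentioned above and is not the real `Indicator` type or the propolis-server start path.

```rust
use std::fmt;

/// Hypothetical lifecycle states; not Propolis's actual `Indicator` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    Init,
    Running,
    Paused,
    Halted,
}

#[derive(Debug)]
struct StartError(String);

impl fmt::Display for StartError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Hypothetical device that records its state and can be made to fail start.
struct Device {
    name: &'static str,
    state: State,
    fail_on_start: bool,
}

impl Device {
    fn new(name: &'static str, fail_on_start: bool) -> Self {
        Self { name, state: State::Init, fail_on_start }
    }

    fn start(&mut self) -> Result<(), StartError> {
        if self.fail_on_start {
            return Err(StartError(format!("{} failed to start", self.name)));
        }
        self.state = State::Running;
        Ok(())
    }

    /// Only a running device can be paused.
    fn pause(&mut self) {
        if self.state == State::Running {
            self.state = State::Paused;
        }
    }

    fn halt(&mut self) {
        self.state = State::Halted;
    }
}

/// Start every device in order. If one fails, pause and then halt the devices
/// that already started, in reverse order, before returning the error.
fn start_all(devices: &mut [Device]) -> Result<(), StartError> {
    for idx in 0..devices.len() {
        if let Err(e) = devices[idx].start() {
            for started in devices[..idx].iter_mut().rev() {
                started.pause();
                started.halt();
            }
            return Err(e);
        }
    }
    Ok(())
}

fn main() {
    // Device names are made up for illustration.
    let mut devices = vec![
        Device::new("nvme0", false),
        Device::new("vsock0", false),
        Device::new("viona0", true),
    ];
    let res = start_all(&mut devices);
    println!("start_all -> {:?}", res.map_err(|e| e.to_string()));
    for d in &devices {
        println!("{}: {:?}", d.name, d.state);
    }
}
```

in this sketch the device that failed to start is left in its initial state and only the already-started devices are unwound; whether a failed device also needs a halt is a separate question the sketch does not try to answer.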
