I was double-checking my understanding of the 1110-relevant bits and happened to notice this. it's "if a guest driver behaves badly enough it'll hang a thread and prevent propolis from stopping in an orderly way" which is annoying but eventually sled-agent will come clean things up, so it's not the worst.
the relevant bit is actually `VsockPollerNotify`. unlike most `pause()`s, `VsockPollerNotify::pause` returns a `Result`. the error cases here are "whatever `port_send(3C)` might do". then over in `PciVirtioSock::pause` we'll call this pause, and then `wait_stopped()`. but if `port_send` fails we might not have actually sent the notification to ask the poller to stop, so we'll never finish `wait_stopped()`, and we'll never finish pausing devices (assuming the typical case of trying to pause all devices).
I always get itchy about `EINTR` but I kind of expect that `EAGAIN` is the more likely case here; imagine a guest that's notified the TX queue 4096 times (or just in a loop) right as we go to stop a VM. I don't think anyone could actually see this in practice.
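
for illustration, here's a rough sketch of one way the send could tolerate those transient errors before anything goes on to wait. this is not the propolis code: `Notify`, `send_pause`, `pause_with_retry` and `Flaky` are all made-up names, and it assumes `EINTR`/`EAGAIN` surface as `ErrorKind::Interrupted`/`ErrorKind::WouldBlock`, which is how Rust's std maps them. the point is only that the caller learns whether the stop request was really queued, so it can avoid waiting on a poller that was never asked to stop.

```rust
use std::io::{self, ErrorKind};

/// Hypothetical stand-in for the poller notification path (not a propolis API).
trait Notify {
    /// Ask the poller to stop; may fail the way port_send(3C) can.
    fn send_pause(&mut self) -> io::Result<()>;
}

/// Retry transient failures a bounded number of times. Returns Ok only once
/// the request was actually sent, so a caller should wait only on Ok.
fn pause_with_retry<N: Notify>(n: &mut N, max_attempts: usize) -> io::Result<()> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        match n.send_pause() {
            Ok(()) => return Ok(()),
            // EINTR / EAGAIN: nothing was queued, but trying again may work.
            Err(e) if matches!(e.kind(), ErrorKind::Interrupted | ErrorKind::WouldBlock) => {
                last_err = Some(e);
                std::thread::yield_now();
            }
            // Anything else is not expected to clear up on its own.
            Err(e) => return Err(e),
        }
    }
    Err(last_err.unwrap_or_else(|| io::Error::new(ErrorKind::Other, "no attempts made")))
}

/// Fake notifier that reports an EAGAIN-like error a fixed number of times,
/// standing in for an event queue kept full by a noisy guest.
struct Flaky {
    failures_left: usize,
    sent: bool,
}

impl Notify for Flaky {
    fn send_pause(&mut self) -> io::Result<()> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            return Err(io::Error::from(ErrorKind::WouldBlock));
        }
        self.sent = true;
        Ok(())
    }
}
```

with a `Flaky` that fails three times, `pause_with_retry(&mut n, 10)` returns `Ok(())` and `sent` is set; with one that keeps failing, the last error comes back to the caller rather than being followed by an unbounded wait. whether a bounded retry is the right policy here (versus, say, propagating the error up through the device pause) is an open question.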