qubesdb sync blocking dom0+domU processes #25
Both mfw and sysfw seem to do a multiread to sync on startup.
But whatever mfw is doing differently there, it should not block dom0 that hard.
Filing this on both the Qubes and Mirage side, since I see room for improvement in both.
qubes side issue here: QubesOS/qubes-issues#4498
The firewall doesn't keep any state between runs, so if starting it a second time is causing trouble, then dom0 must be doing something different. What happens if you kill mirage-firewall and start sys-firewall, or the other way around? Maybe there is some QubesDB shutdown protocol you're supposed to do, but mirage-firewall doesn't bother. It does sound like a bug in dom0 though.
The difference between the runs is that the "second time" is with a handful of downstream VMs already running. This means there are a lot more QubesDB rows to transfer in the initial QubesDB sync.
If this were comms over TCP, my first guess would be "the client is not flushing after ack-write" or similar.
The blocking itself (and its global impact) is surely a Qubes/dom0 fault (DoS potential).
were you able to repro it?
I assume it's this
It doesn't look to me as though this
The only thing the pselect could be waiting for from the firewall is a
I managed to reproduce the problem myself by restarting the firewall while it still had client VMs attached.
The multiread sync worked fine, but then QubesDB started pushing updates two at a time, with a 10-second gap between them.
Here's a bit from the strace:
So it seems that it was waiting for FD 4, which is some kind of local socket connection, not an event channel. So I guess the problem is that whatever is calling QubesDB is taking its time with the updates.
While I was looking at the code, there does also appear to be a race in the vchan path:
However, I do not believe this is the cause of the bug reported, because:
Here's a snapshot of the state of the vchan halfway through the 10s timeout, taken by the firewall:
So, both sides want to be notified of writes (notify bit 1) but neither cares about reads, and both rings are empty (consumer=producer). The last time the firewall signalled an event, the state was:
So even if this isn't completely up-to-date, QubesDB must have seen that there was plenty of free space in the rings.
@marmarek : the above is based on my understanding of the vchan code from reading the source. I've documented what I learnt here: https://github.com/talex5/spec-vchan/blob/master/vchan.tla
Note the 10s delay before the
There is a state diagram for block devices, which is the same for all xenbus devices, including network ones: https://github.com/xen-project/xen/blob/master/xen/include/public/io/blkif.h#L452-L523
I recently debugged a similar issue with a misbehaving backend, and the 10s timeout looks familiar. It's about waiting for the backend to set the XenbusStateInitWait (2) state. You can get more details in
With this, you should get full state transitions log, including what exactly was expected and failed.
Ah, that explained it! When you restart with client VMs attached,
I did, however, end up with a whole stack of
Sadly not true. The connection setup is subtly different between block and net. Block has an extra step in the backend where it tries to open the block device. This is infuriating because it prevents writing a single reference implementation of the state transition logic.
To @talex5's point in his TLA/vchan blog post: yes, it would be lovely if any of this were written down.