---===[ Qubes Security Bulletin #23 ]===---
December 17, 2015
Race condition bugs in Xen code (XSA-155 and XSA-166), other Xen bugs
Quick Summary
The Xen Security Team has announced several bugs affecting the
hypervisor, as well as some of the Xen backends. The most interesting
of these is XSA-155 [1]:
| The compiler can emit optimizations in the PV backend drivers which
| can lead to double fetch vulnerabilities. Specifically the shared
| memory between the frontend and backend can be fetched twice (during
| which time the frontend can alter the contents) possibly leading to
| arbitrary code execution in backend.
As discussed below, the problem also affects some of the Xen
frontends, which might have security implications when Xen is used in
highly decomposed systems, such as Qubes OS.
Discussion (XSA-155)
The problem results from the Xen front- and back-end code operating on
data structures located in Xen shared memory, which allows both of the
parties (located in different Xen VMs) to modify them asynchronously.
This asynchronous nature of the communication would not be a problem
in itself, however, as writing code that avoids potential race
conditions in such situations is generally considered a rather
standard craft, even for C programmers.
What makes this vulnerability tricky is how hard it is to see the
problem when looking at the source code. E.g. it might look as if the
value of some data from the shared page had been already (atomically)
stored in a local variable, and thus could be safely used by the
following code (after being sanitized), while in fact the actual code
generated by the compiler would still be accessing the shared memory
(which is still under the control of the other party) instead of using
the local (sanitized) copy. This creates the classic TOCTOU class of
vulnerabilities. The Xen advisory provides further details.
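The pattern described above can be illustrated with a minimal
user-space sketch. This is not the actual Xen patch; the names
shared_req, read_once_u32, handle_request and MAX_LEN are hypothetical
and exist only for this example. The idea is that a volatile access
forces the compiler to emit exactly one load from the shared location,
so the check and the later use are guaranteed to operate on the same
value (similar in spirit to the Linux kernel's READ_ONCE()).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LEN 64u   /* hypothetical upper bound on the payload size */

/* Hypothetical layout of a request living in memory that another
 * (untrusted) party can modify at any time. */
struct shared_req {
    uint32_t len;
    uint8_t  data[MAX_LEN];
};

/* Perform exactly one load from the shared location. Without the
 * volatile qualifier the compiler would be free to re-read the field
 * after the sanity check instead of using the local copy. */
static inline uint32_t read_once_u32(const volatile uint32_t *p)
{
    return *p;
}

/* Returns the number of bytes copied into 'out', or -1 if the peer
 * supplied a length that does not fit. */
int handle_request(const struct shared_req *shared,
                   uint8_t *out, size_t out_size)
{
    uint32_t len = read_once_u32(&shared->len);   /* single fetch */

    if (len > MAX_LEN || len > out_size)          /* check ...     */
        return -1;

    /* ... and use: only the local, already checked 'len' is used
     * from here on; shared->len is never read again. */
    memcpy(out, shared->data, len);
    return (int)len;
}
```

Note that the sketch only protects the length field. The copied
payload itself is still attacker-controlled data and must be
validated separately, after it has been copied out of shared memory.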
Arguably one could see this vulnerability as a failure of Xen to expose
a safer "API" for development of its backends. Indeed, leaving
developers with a bare shared-memory mechanism might not have been the
most fortunate decision, as we see now. Perhaps Xen should have
introduced something like vchan (but for kernel code) right from the
start?
The vchan library is an inter-VM communication library originally
built by the Qubes OS Project in the early days on top of the "raw"
Xen shared memory, later improved by others and merged into the
upstream Xen a few years ago. The good thing about vchan is that it
does _not_ expose its customers to the raw shared memory, but instead
provides them with local copies of the buffers. While this might sound
like a significant performance cost at first sight, that is not
necessarily the case -- e.g. we were able to implement our very efficient
GUI virtualization on top of vchan, and even managed to make this a
zero-copy protocol.
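To illustrate the general idea of handing the caller a private copy
instead of a pointer into shared memory, here is a minimal sketch of
a ring-buffer reader. It is not libvchan's code or API; shared_ring,
ring_read and RING_SIZE are hypothetical names used only for this
example. The reader snapshots the peer-controlled index once,
sanity-checks it, and then copies the bytes into a buffer that the
other party cannot touch.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u   /* hypothetical; must be a power of two */

/* Hypothetical ring living in memory shared with an untrusted peer. */
struct shared_ring {
    volatile uint32_t prod;             /* advanced by the peer */
    volatile uint32_t cons;             /* advanced by us       */
    volatile uint8_t  buf[RING_SIZE];
};

/* Copy up to 'want' bytes into the caller's private buffer.
 * Returns the number of bytes copied, or -1 if the peer-supplied
 * producer index is inconsistent. */
int ring_read(struct shared_ring *ring, uint8_t *local, size_t want)
{
    uint32_t prod  = ring->prod;    /* snapshot each index once */
    uint32_t cons  = ring->cons;
    uint32_t avail = prod - cons;   /* unsigned wrap-around is fine */
    size_t i;

    if (avail > RING_SIZE)          /* peer claims more than fits */
        return -1;
    if (want > avail)
        want = avail;

    for (i = 0; i < want; i++)
        local[i] = ring->buf[(cons + (uint32_t)i) % RING_SIZE];

    ring->cons = cons + (uint32_t)want;
    return (int)want;
}
```

After ring_read() returns, the caller parses only 'local', so any
later modification of the shared ring by the peer cannot affect data
that has already been validated.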
While it is not obvious that rewriting all the Xen front- and backends
on top of something like vchan would be trivial, it is definitely
worth considering. One thing is certain: whenever we expose an unsafe
API, we can be sure someone will write insecure code based on that
API, sooner or later. This remains
true even for kernel developers.
The significance of frontend bugs for Qubes OS
It's worth stressing that the Xen Security Team decided to consider
only bugs in the backends as security-critical.
| Applying the appropriate attached patches should fix the problem for
| PV backends. Note only that PV backends are fixed; PV frontend
| patches will be developed and released (publicly) after the embargo
| date.
The logic behind this decision is not especially surprising -- most of
the Xen deployments assume all the backends to be part of the system
TCB, as they are almost always located in Dom0.
In Qubes OS, on the other hand, we have been more careful about
considering the backends as part of the TCB. E.g. we have been
distrusting the network backend right from the start with our signature
NetVM(s). Also, while the default block backend (i.e. disk backend) is
located inside Dom0, and so trusted, Qubes allows hosting block
backends for other devices, such as USB mass storage or ISO
images, in other, untrusted VMs. This makes it possible, e.g., to mount
a potentially malicious USB stick and have its (encrypted) filesystem
accessible in another VM, all without trusting the USB-handling code.
The above means that on a system like Qubes OS, i.e. with a
highly decomposed TCB, we should also consider the frontends, not just
the backends, for potential vulnerabilities. This might give an
impression that in this case Qubes offers worse security than a
standard deployment of Xen. That would be a false conclusion, however.
We should stress that a vulnerability in a frontend can only be
exploited by an already compromised backend, which means that a
non-decomposed Xen system must have been already fatally compromised.
In any case, we have analyzed the code for the Xen network and block
frontends (which are part of the Linux Kernel tree), and created
additional patches to address the above problem.
Lack of explicit sanitization of variables controllable by other guests
Incidentally, we have also discovered some bugs that are independent
of the double fetch problem discussed thus far. The problems were
spotted while analyzing the frontends and manifest as a lack
of proper sanitization of some of the variables that are controlled by
the corresponding backend. Again, as discussed above, given that the
majority of Xen deployments assume backends to be trusted, it comes as
no great surprise that the developers were more relaxed when writing
the frontends and thus forgot to sanitize these inputs.
We think only one problem of those we've found might be classified as
a potential code execution vulnerability in the frontend. The
following code is a fragment of the xennet_tx_buf_gc() function from the
Xen network frontend:
    for (cons = queue->tx.rsp_cons; cons != prod; cons++) {
        struct xen_netif_tx_response *txrsp;
        txrsp = RING_GET_RESPONSE(&queue->tx, cons);
        // ...
        id = txrsp->id;
        skb = queue->tx_skbs[id].skb;
        // ...
        queue->grant_tx_ref[id] = GRANT_INVALID_REF;
        queue->grant_tx_page[id] = NULL;
        // ...
We see that txrsp points into the Xen shared memory, which is also
controlled by the network backend, running in some other (untrusted)
VM. The 'id' variable is read from this shared area and is later used
to index an array for two write operations. A malicious backend might
thus trigger quasi-arbitrary write transactions within the kernel
where the associated frontend is executing.
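The missing check can be sketched as follows. This is a simplified,
stand-alone illustration and not the patch that was actually applied
to the Linux kernel: the tx_queue and tx_response structures, the
complete_tx() helper, TX_RING_SIZE and the value chosen for
GRANT_INVALID_REF are all assumptions made for this example. The id
is read from shared memory once and is rejected unless it falls
within the bounds of the arrays it is about to index.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE      256u  /* hypothetical number of TX slots  */
#define GRANT_INVALID_REF 0     /* placeholder value for the sketch */

/* Simplified stand-in for a response slot in the shared ring. */
struct tx_response {
    uint16_t id;
    int16_t  status;
};

/* Simplified stand-in for the frontend's private per-queue state. */
struct tx_queue {
    void *tx_skbs[TX_RING_SIZE];
    int   grant_tx_ref[TX_RING_SIZE];
    void *grant_tx_page[TX_RING_SIZE];
};

/* Returns 0 on success, -1 if the backend supplied an out-of-range id. */
int complete_tx(struct tx_queue *queue,
                const volatile struct tx_response *txrsp)
{
    uint16_t id = txrsp->id;    /* single read from shared memory */

    if (id >= TX_RING_SIZE)     /* sanitize before any indexing */
        return -1;

    queue->tx_skbs[id]       = NULL;
    queue->grant_tx_ref[id]  = GRANT_INVALID_REF;
    queue->grant_tx_page[id] = NULL;
    return 0;
}
```

With such a check in place, a malicious backend can at most make the
frontend drop a response; it can no longer direct the two writes to
memory outside the arrays.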
In Qubes we have been very careful about sanitizing any variables
that come from untrusted parties (such as from another VM) and have
been recommending the use of explicit naming conventions to reflect that
clearly in the code [2][3] for all such variables. Perhaps a similar
convention might be adopted by Xen developers?
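As a rough illustration of such a convention, the sketch below
assumes an "untrusted_" prefix for every value that has not been
validated yet. The message layout, the limits and the handle_resize()
function are hypothetical and chosen only for this example. A value
loses the prefix only at the point where it has passed its checks,
which makes any unchecked use easy to spot in review.

```c
#include <stdint.h>

#define MAX_WIDTH  4096u   /* hypothetical limits for the sketch */
#define MAX_HEIGHT 4096u

/* Hypothetical message received from another VM. */
struct msg_resize {
    uint32_t width;
    uint32_t height;
};

/* Returns 0 and fills in *width and *height on success, or -1 if the
 * message carries out-of-range values. */
int handle_resize(const struct msg_resize *untrusted_msg,
                  uint32_t *width, uint32_t *height)
{
    /* Copy to locals first; the names still mark them as unchecked. */
    uint32_t untrusted_width  = untrusted_msg->width;
    uint32_t untrusted_height = untrusted_msg->height;

    if (untrusted_width == 0 || untrusted_width > MAX_WIDTH)
        return -1;
    if (untrusted_height == 0 || untrusted_height > MAX_HEIGHT)
        return -1;

    /* Sanitized from here on: the prefix is dropped on assignment. */
    *width  = untrusted_width;
    *height = untrusted_height;
    return 0;
}
```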
Other Xen bugs addressed by this bulletin potentially affecting Qubes OS
Apart from XSA-155 discussed above, the Xen Security Team has also
announced several other advisories, all of which seem even more
theoretical, or have been classified as DoS only.
Among these, XSA-166 [4] deserves some attention as it seems to be
another example of the double fetch problem, this time applied to the
hypervisor operating on data exposed by an HVM guest. According to
the analysis performed by the Xen team, however, the attack seems to be
theoretical only (in addition to requiring a chained attack on a stub
domain first).
Other bugs disclosed and patched today include XSA-165 [5], with
a potential leak of the FPU stack and XMM registers, which we haven't
had time to thoroughly analyze, and several vulnerabilities which
might lead to a DoS [6][7] (we generally treat DoS-only
vulnerabilities as normal bugfixes).
The specific packages that resolve the problem discussed in this
bulletin have been uploaded to the security-testing repository:
For Qubes R2:
* New Dom0 kernel packages (kernel-3.12.40-3, kernel-qubes-vm-3.12.40-3)
* New Xen packages (xen-
* New libvchan package (qubes-libvchan-xen-2.2.12)
For Qubes R3.0:
* New Dom0 kernel packages (kernel-3.18.17-8, kernel-qubes-vm-3.18.17-8, kernel-3.19.8-101)
* New Xen packages (xen-4.4.3-11, xen-libs-4.4.3-11)
For Qubes R3.1:
* New Dom0 kernel packages (kernel-4.1.13-7)
* New Xen packages (xen-4.6.0-11, xen-libs-4.6.0-11)
The packages are to be installed in Dom0 via the qubes-dom0-update command
or via the Qubes graphical manager. Also, all the templates need
updating for new xen-libs package (qubes-libvchan-xen in case of Qubes
A system restart will be required afterwards.
If you use Anti Evil Maid, you will need to reseal your secret
passphrase to new PCR values, as PCR18+19 will change because of new
Xen and kernel binaries, and because of the regenerated initramfs.
These packages will be moved to the current repository over the coming
days once they receive more testing from the community.
About Qubes Security Testing Repository
security-testing is a new Qubes package repository that has been
introduced recently. It is disabled by default, and its purpose is to
allow better and wider testing of security-critical updates, before
they make it to the "current" (default) repository.
This allows the users (rather than Qubes developers) to make the
tradeoffs of whether to install security updates early vs. wait until
they get more tested by the community. This accounts for the fact that
Qubes developers have limited ability to perform thorough testing
themselves. To help with the process we provide detailed analysis of
the security problems addressed by each QSB.
These bugs have been made available to us by the Xen Security Team via
the Xen pre-disclosure list.
The bugs in the Xen network frontend have been discovered and patched
by Marek Marczykowski-Górecki of the Qubes OS Project.
The analysis and discussions have been provided by Joanna Rutkowska of
the Qubes OS Project.
The Qubes Security Team