Qubes is unusable on Radeon HD 7xxx+ (and probably nVidia Maxwell+ GPUs) due to Glamor XShmPutImage being extremely slow #1133

Closed
qubesuser opened this Issue Aug 19, 2015 · 8 comments

@qubesuser

Installing Qubes on Radeon HD 7xxx and newer (and nVidia Maxwell and newer) cards results in an unusable desktop.

This is because these cards require the GLAMOR GL-based X11 acceleration path, and that code is very slow with XShmPutImage, at least in the ancient Xorg version you are shipping in the Fedora 20 based dom0.

A very easy way to reproduce it is to boot with the XFCE desktop on any Radeon 7xxx or later card, start a full-screen browser in any AppVM, and then quickly move a smaller dom0 terminal window over it. The dom0 terminal window will "whiten" parts of the underlying AppVM window and redrawing those parts will happen in slow motion and take literally seconds or even minutes.

If you start qubes-guid with debug output, you'll see that it is doing lots of XShmPutImage calls, and they are very slow. This is because old Xorg versions (probably including the one being shipped in Qubes) upload the whole texture on each such call rather than just the damaged rectangle and in general this path has higher overhead in the GLAMOR acceleration architecture.
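To give a rough sense of scale for the whole-image versus damaged-rectangle difference, here is a minimal back-of-the-envelope sketch. It does not talk to X11 at all; the `Rect` type, the 4 bytes per pixel, and the window and damage sizes are illustrative assumptions, not measurements from qubes-guid or Xorg.

```python
from dataclasses import dataclass

BYTES_PER_PIXEL = 4  # assumption: 32-bit pixels


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

    def clipped_to(self, other: "Rect") -> "Rect":
        """Return the part of this rectangle that lies inside `other`."""
        x0 = max(self.x, other.x)
        y0 = max(self.y, other.y)
        x1 = min(self.x + self.width, other.x + other.width)
        y1 = min(self.y + self.height, other.y + other.height)
        return Rect(x0, y0, max(0, x1 - x0), max(0, y1 - y0))

    def upload_bytes(self) -> int:
        """Bytes needed to transfer every pixel of this rectangle."""
        return self.width * self.height * BYTES_PER_PIXEL


def full_upload_bytes(window: Rect) -> int:
    """Cost when the whole window image is uploaded on every call."""
    return window.upload_bytes()


def damage_upload_bytes(window: Rect, damage: Rect) -> int:
    """Cost when only the damaged area (clipped to the window) is uploaded."""
    return damage.clipped_to(window).upload_bytes()


if __name__ == "__main__":
    window = Rect(0, 0, 1920, 1080)   # hypothetical full-screen AppVM window
    damage = Rect(100, 100, 64, 64)   # hypothetical small exposed area
    print("full upload per call:  ", full_upload_bytes(window), "bytes")
    print("damage upload per call:", damage_upload_bytes(window, damage), "bytes")
```

With these assumed sizes, each call moves a few megabytes in the full-upload case versus a few kilobytes in the damage-only case, which is consistent with many small XShmPutImage calls becoming painfully slow.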

Upgrading the Xorg packages in dom0 to the Fedora 23 or Rawhide versions may improve the situation, and in general the OS should be the latest version rather than Fedora 20.

The more general issue is that you are doing the GPU memory mapping wrong: the VMs should be mapping GPU memory exposed by dom0 (in a secure way) rather than the opposite, as is currently done, so that the data is written directly to VRAM, or at least to system RAM that is already set up to be sent to the GPU (proper format, mapped in the IOMMU, and so on).

Also, you should at least optionally use OpenGL in the dom0 guid: newer GPUs don't have any non-GL 2D acceleration, so you are using OpenGL in Xorg via GLAMOR anyway and might as well cut out the middleman (which also removes the need for the shm preload hack).

You should also look at getting rid of Xorg in dom0 and using Wayland instead.

BTW, installing the proprietary AMD or nVidia drivers probably also fixes the issue, but that's of course undesirable from a security and reliability standpoint.

@marmarek

marmarek Aug 26, 2015

Member

It is all really interesting, especially the idea about GPU memory
mapping. The question is: do you want to provide some help in
implementing this? At first sight it looks like it needs a totally new
Xorg/Wayland virtual video driver for a VM, written from scratch. It
would be challenging to do it right. But maybe I'm mistaken? Maybe, for
example, VirtualBox has something like this already implemented?

Anyway, as a short-term solution we will probably simply ship a newer Xorg.
For example, here is a (really early) alpha version with Xorg backported
from Fedora 21:
https://ftp.qubes-os.org/iso/Qubes-R3.1-alpha1.1-x86_64-LIVE.iso
https://ftp.qubes-os.org/iso/Qubes-R3.1-alpha1.1-x86_64-LIVE.iso.asc

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

@qubesuser

qubesuser Aug 29, 2015

Actually, I no longer think GPU mapping is a good idea.

There is in fact a security problem: if you let domains write to on-board GPU memory, they can read it too, which means that buggy GPU hardware or a buggy driver can reveal the image data of one domain to another if an unexpected memory copy is somehow triggered. In the current setup, either copying with the CPU or DMA with a read-only IOMMU mapping can protect against that.

With a GPU that has non-Glamor acceleration, performance seems really good with the current system, so it doesn't really seem worthwhile, although in theory it might improve performance.

QEMU/KVM has the QXL driver, which does basically this, although I'm not sure how much the codebase has been audited for security. There are also systems like VMware that even expose a virtual GPU with 3D acceleration, with the same security worries.

> Anyway as a short term solution we probably will simply ship newer Xorg.

I have switched to another GPU for the moment; I'll see if I can test whether that solves the problem. It's possible that the Fedora 21 Xorg is still too old, though (mesa and libdrm need to be backported too).

@marmarek

marmarek Aug 29, 2015

Member

On Fri, Aug 28, 2015 at 07:19:09PM -0700, qubesuser wrote:

> Actually, I no longer think GPU mapping is a good idea.
>
> There is in fact a security problem: if you let domains write on-board GPU memory they can read it too, which means that a buggy GPU hardware or driver or a malicious GPU can reveal the image data of a domain to another domain. In the current setup either copying with the CPU or DMA with a read-only IOMMU mapping can protect against that.

A malicious GPU can do that anyway, for example using DMA to dom0 memory.
An IOMMU could help here, but I believe currently it is set only on a
per-domain basis. I've recently seen some patches for Xen to improve that
further (allowing a domain to set a stricter IOMMU map, for example to let
the device access only a dedicated buffer area), but they are far from inclusion.

But the point about a buggy GPU still stands. That's why I wrote
that it would be challenging: you need to make sure that GPU memory
access is not only granted when needed, but also revoked.
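As a toy illustration of the "granted when needed, but also revoked" requirement, here is a minimal sketch of scoped access bookkeeping. It is purely a model: `GrantTable`, its methods, and the domain and buffer names are hypothetical and do not correspond to any Xen, IOMMU, or Qubes API. The only point is that revocation is tied to the end of the scope, so it also happens when the work inside fails.

```python
from contextlib import contextmanager


class GrantTable:
    """Toy model: records which domain may currently access which buffer."""

    def __init__(self):
        self._grants = set()

    def grant(self, domain, buffer_id):
        self._grants.add((domain, buffer_id))

    def revoke(self, domain, buffer_id):
        # discard() makes revocation safe to call even if nothing was granted
        self._grants.discard((domain, buffer_id))

    def may_access(self, domain, buffer_id):
        return (domain, buffer_id) in self._grants

    @contextmanager
    def scoped(self, domain, buffer_id):
        """Grant access for the duration of a `with` block, then revoke it."""
        self.grant(domain, buffer_id)
        try:
            yield
        finally:
            # Runs on normal exit and on exceptions alike
            self.revoke(domain, buffer_id)


if __name__ == "__main__":
    table = GrantTable()
    with table.scoped("work-vm", 7):  # hypothetical domain name and buffer id
        print("inside:", table.may_access("work-vm", 7))
    print("after: ", table.may_access("work-vm", 7))
```

Real hardware makes this much harder than the model suggests, since the revoke step has to be enforced against the device and not just recorded in software.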

> With a non-Glamor GPU performance seems really good with the current system so it doesn't really seem worthwhile although in theory it might improve performance.

If we ever think of implementing some kind of 3D acceleration, we'll
need some solution here. But currently we don't have such plans.

> QEMU/KVM has the QXL driver which does basically this, although I'm not sure how much the codebase has been audited for security. There are also systems like VMWare that even expose a virtual GPU with 3D acceleration, with the same security worries.
>
> Anyway as a short term solution we probably will simply ship newer Xorg.
>
> I have switched to another GPU at the moment, will see if I can test if that solves the problem. It's possible that the Fedora 21 Xorg is still too old though (mesa and libdrm need to be backported too).

I'll look into it. I know the newer the better in this case, but I worry
about dependencies... Fedora 20 has mesa 10.3.3, Fedora 21 has 10.4.4, and
Fedora 22 has 10.6.0. For libdrm: Fedora 20 has 2.4.58, 21 has 2.4.60, and
22 has 2.4.62. I didn't check the changelogs, but do you think it is worth
the effort here?

@qubesuser

qubesuser Aug 29, 2015

> for example using DMA to dom0 memory. IOMMU could help here, but I believe currently it is set only do domain basis. I've seen recently some patches for Xen to improve that further

Using KVM instead of Xen would give that for free since "dom0" runs directly on the hardware and apparently qemu has a virtual IOMMU that can be exposed to guests. Now that Qubes mostly uses libvirt it might be worth trying it.

Anyway, this is getting a bit off-topic.

Regarding the original topic, hopefully I'll find the time to test some newer backported Xorg/mesa/libdrm stack on an affected GPU, or perhaps someone else can.

If this is not fixed before release, it definitely needs to be noted in the release notes, and it might even make sense to show a warning popup on boot, since otherwise users will think that Qubes sucks rather than that they need to use another GPU.

@marmarek

marmarek Aug 29, 2015

Member

On Sat, Aug 29, 2015 at 03:14:43AM -0700, qubesuser wrote:

> for example using DMA to dom0 memory. IOMMU could help here, but I believe currently it is set only do domain basis. I've seen recently some patches for Xen to improve that further
>
> Using KVM instead of Xen would give that for free since "dom0" runs directly on the hardware and apparently qemu has a virtual IOMMU that can be exposed to guests. Now that Qubes mostly uses libvirt it might be worth trying it.

This would also mean an obligatory qemu in dom0, without any reasonable way
to sandbox it (no support for stub domains). Check its latest security
advisories to see why we have avoided it all this time...

@qubesuser

qubesuser Aug 29, 2015

I guess it should be possible to implement something similar to stub domains with KVM by adapting qemu to run in the Linux seccomp sandbox, and it's also possible to run an alternative minimal KVM userland. It might be a significant undertaking though, and I'm not sure if it's worth it or whether the Xen or KVM approach is fundamentally better.

@marmarek marmarek added this to the Far in the future milestone Sep 1, 2015

andrewdavidwong added a commit that referenced this issue May 31, 2016

@andrewdavidwong

andrewdavidwong Jun 12, 2016

Member

@qubesuser: Did you ever have a chance to test this? Do you have any plans to work on it in the future? (Asking for tracking purposes.)

andrewdavidwong added a commit that referenced this issue Jun 12, 2016

@andrewdavidwong

andrewdavidwong Mar 30, 2017

Member

Closing and untracking due to lack of response and prolonged inactivity. Feel free to comment or reopen if this is still an issue and anyone is willing to work on it.

andrewdavidwong added a commit that referenced this issue Mar 30, 2017
