linux: add zero-copy screen capture using KMS and EGL #1758

Draft
wants to merge 14 commits into
base: master
from

9 participants

w23 commented Mar 18, 2019

Note: This is still a work-in-progress change. It works reliably, but the code quality is not yet good enough to merge upstream. There are two reasons for opening this pull request now: visibility and discussion. There are things that I'd like feedback on. More context is here: https://obsproject.com/forum/threads/experimental-zero-copy-screen-capture-on-linux.101262/

Approach

The purpose of this change is to add zero-copy screen capture on Linux, avoiding slow XSHM copies and, as a side effect, enabling Wayland screen capture.
It works by importing the KMS (DRM) display framebuffer directly as a GL texture, using the EGL_EXT_image_dma_buf_import extension. Access to that framebuffer requires either root or CAP_SYS_ADMIN, so a helper setuid/setcap binary is used to retrieve the framebuffer's fd and sendmsg() it to the main OBS process.
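
For readers unfamiliar with the extension, here is a rough sketch of the attribute list such an import needs for a single-plane framebuffer. It is not the code from this PR: `struct fb_desc`, `fb_image_attribs` and `FOURCC` are hypothetical names, and the token values are copied from the Khronos headers only so the sketch compiles without EGL installed (real code would include `<EGL/egl.h>` and `<EGL/eglext.h>`).

```c
#include <assert.h>
#include <stdint.h>

/* Token values as published in the Khronos EGL headers. They are repeated
 * here only so this sketch builds without the EGL development headers. */
#ifndef EGL_NONE
#define EGL_NONE 0x3038
#define EGL_HEIGHT 0x3056
#define EGL_WIDTH 0x3057
#endif
#ifndef EGL_LINUX_DMA_BUF_EXT
#define EGL_LINUX_DMA_BUF_EXT 0x3270
#define EGL_LINUX_DRM_FOURCC_EXT 0x3271
#define EGL_DMA_BUF_PLANE0_FD_EXT 0x3272
#define EGL_DMA_BUF_PLANE0_OFFSET_EXT 0x3273
#define EGL_DMA_BUF_PLANE0_PITCH_EXT 0x3274
#endif

/* DRM fourcc codes are four ASCII bytes packed little-endian. */
#define FOURCC(a, b, c, d)                                   \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) |                  \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical description of one single-plane framebuffer. */
struct fb_desc {
    int fd;            /* DMA-BUF fd received from the helper */
    int32_t width;
    int32_t height;
    uint32_t fourcc;   /* e.g. FOURCC('X', 'R', '2', '4') for XRGB8888 */
    uint32_t offset;
    uint32_t pitch;    /* bytes per row */
};

#define FB_ATTR_COUNT 13 /* six key/value pairs plus the EGL_NONE terminator */

/* Fill the EGLint-style attribute array for an EGL_LINUX_DMA_BUF_EXT import. */
static void fb_image_attribs(const struct fb_desc *fb, int32_t out[FB_ATTR_COUNT])
{
    int i = 0;

    out[i++] = EGL_WIDTH;
    out[i++] = fb->width;
    out[i++] = EGL_HEIGHT;
    out[i++] = fb->height;
    out[i++] = EGL_LINUX_DRM_FOURCC_EXT;
    out[i++] = (int32_t)fb->fourcc;
    out[i++] = EGL_DMA_BUF_PLANE0_FD_EXT;
    out[i++] = fb->fd;
    out[i++] = EGL_DMA_BUF_PLANE0_OFFSET_EXT;
    out[i++] = (int32_t)fb->offset;
    out[i++] = EGL_DMA_BUF_PLANE0_PITCH_EXT;
    out[i++] = (int32_t)fb->pitch;
    out[i++] = EGL_NONE;
    assert(i == FB_ATTR_COUNT);
}

/* With an EGL display and a bound GL texture, the import would then be roughly:
 *   img = eglCreateImageKHR(dpy, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT,
 *                           (EGLClientBuffer)NULL, attribs);
 *   glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, img);
 */
```

The two calls in the trailing comment come from EGL_KHR_image_base and GL_OES_EGL_image; whether a given driver accepts a scanout buffer this way still has to be checked at runtime.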

There are two main changes:

  • add an EGL OpenGL context creation path alongside the older GLX one. EGL is newer and has the extensions needed for importing DMA-BUF objects as GL textures. GLX vs. EGL is user-selectable in settings and defaults to GLX.
  • add a dmabuf_source plugin that imports a KMS framebuffer as a GL texture. It is added as part of the linux-capture module.

dmabuf_source

Alongside the main plugin there's obs-drmsend utility that requires either setuid or setcap cap_sys_admin+ep. It works like this:

  1. dmabuf_source plugin opens a unix domain socket and runs the obs-drmsend utility.
  2. The utility enumerates currently available framebuffers, acquires their DMA-BUF fds, sends them along with metadata to this socket (the fds travel as sendmsg() ancillary data of type SCM_RIGHTS), and exits.
  3. dmabuf_source remembers these fds and allows the user to pick from available framebuffers.
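
The fd hand-off in step 2 can be sketched as below. This is a minimal illustration of SCM_RIGHTS passing, not the obs-drmsend code: `send_fd`, `recv_fd` and `union fd_ctrl` are hypothetical names, and only one fd is sent per message, whereas the real utility sends several fds with its own metadata layout.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send `fd` together with `meta_len` bytes of metadata over a connected
 * unix socket. At least one byte of regular data must accompany the
 * ancillary data. Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd, const void *meta, size_t meta_len)
{
    union fd_ctrl ctrl;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(meta_len > 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    iov.iov_base = (void *)meta;
    iov.iov_len = meta_len;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == (ssize_t)meta_len ? 0 : -1;
}

/* Receive metadata into `meta` and return the fd that came with it,
 * or -1 if no fd was attached or the read failed. The returned fd is a
 * new descriptor in the receiving process and must be close()d by it. */
static int recv_fd(int sock, void *meta, size_t meta_len)
{
    union fd_ctrl ctrl;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    iov.iov_base = meta;
    iov.iov_len = meta_len;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The kernel duplicates the descriptor into the receiver, so the helper can exit right after sending, which matches the run-once behaviour described above.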

Issues

  • Linking EGL still requires libEGL.so to be present on the client's machine, even if it is not used. Avoiding the dependency on libEGL requires splitting or programmatically loading (dlopen, dlsym) libEGL in glad and linux-capture.
  • Xcomposite source is not patched to support EGL.
  • Naming is hard. "dmabuf_source" feels weird. Common term for such functionality is "kmsgrab".
  • Running compton as X compositor messes with framebuffers, changing them too often. This breaks this screen capture method. (It is not known whether other compositors do the same)
  • xrandr/mode changes weren't tested. They will definitely break something (e.g. replace the framebuffer, or change its dimensions).
  • The cursor is supported using xcursor-xcb, which relies on framebuffer and X11 coordinates being aligned.
  • It likely doesn't work on nvidia.
  • No synchronization with framebuffer updates is done.

Further possible improvements

  • follow CRTCs (display devices), not framebuffers. Necessary to properly support xrandr/mode changes. Requires running obs-drmsend in background, and more involved IPC.
  • read cursor from planes, not from xcb. Will also support wayland.

I will strive to update the information above as I make changes.

w23 added some commits Feb 25, 2019

add draft EGL context initialization on X11
EGL is a modern replacement for GLX, and it is required for
importing/exporting DMA-BUF fds as GL textures. This is needed for
zero-copy screen grabbing and GPU encoding on Linux.

This implementation is naive and only makes a first step in the right
direction.
- It is a compile-time decision for now. I plan to make it a runtime
decision using a separate libobs-opengl-egl.so module (or something).
- It breaks all current ways to capture screen on X11, because those
assume GLX. It would be possible to fix them, but I don't have any plans
to do that ATM.

add experimental linux dmabuf source input
- requires github.com/w23/drmtoy/drmsend running as root or
cap_sys_admin
- drmsend socket is hardcoded as /home/steaream/tmp/drmsend.sock
- see previous commit comment about EGL

add drmsend socket file chooser
Unfortunately QFileDialog won't show unix sockets, so user has to type
in the final filename manually.

sad

make EGL user-selectable in settings on Linux
- Detect libEGL at build time
- Create special libobs-opengl-egl for EGL
- Show renderer selection in settings on Linux
- Fallback on GLX if EGL failed
- check for GS_DEVICE_OPENGL_EGL in dmabuf_source

linux-capture/XSHM works under EGL context
linux-capture/Xcomposite is disabled under EGL context

fixes #1

dmabuf: add drmsend utility
Basically just a copy from w23/drmtoy repo.
It is built as `obs-drmsend` executable alongside `obs`.
Note that cmake would have to be run as superuser to be able to add
CAP_SYS_ADMIN or set the setuid bit on the executable, so this step is
commented out for now. The builder must perform that step manually, and
also ensure that this binary is run from a fs mounted w/o the nosuid option.

Fix #4

dmabuf: add user settings for choosing framebuffer
Doesn't require drmtoy/enum/drmsend anymore.

Still work-in-progress. Hardcoded paths:
- DRI card: /dev/dri/card0
- obs-drmsend executable: ./obs-drmsend
- unix socket: /tmp/drmsend.sock

Developer still needs to manually `setcap cap_sys_admin+ep obs-drmsend`
on non-nosuid mounted fs.
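
Since a forgotten `setcap` is easy to miss, a helper could report whether it actually holds the capability. The following is only a sketch of one way to check, by parsing the `CapEff` line of `/proc/self/status`; the function names are hypothetical and nothing like this is claimed to exist in obs-drmsend.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bit index of CAP_SYS_ADMIN, as defined in <linux/capability.h>. */
#define CAP_SYS_ADMIN_BIT 21

/* Find the "CapEff:" line in /proc/<pid>/status-style text and parse its
 * hexadecimal mask. Returns 0 on success, -1 if the line is missing. */
static int parse_cap_eff(const char *status_text, unsigned long long *mask)
{
    const char *line = status_text;

    assert(status_text != NULL && mask != NULL);
    while (line && *line) {
        if (strncmp(line, "CapEff:", 7) == 0)
            return sscanf(line + 7, "%llx", mask) == 1 ? 0 : -1;
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Returns 1 if the CAP_SYS_ADMIN bit is set in an effective-capability mask. */
static int has_cap_sys_admin(unsigned long long mask)
{
    return (int)((mask >> CAP_SYS_ADMIN_BIT) & 1ULL);
}

/* Check the current process. Returns 1 or 0, or -1 if /proc is unavailable
 * or the CapEff line could not be read. */
static int self_has_cap_sys_admin(void)
{
    char buf[4096];
    unsigned long long mask = 0;
    size_t n;
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    if (parse_cap_eff(buf, &mask) != 0)
        return -1;
    return has_cap_sys_admin(mask);
}
```

A helper started without the capability (for example from a nosuid mount) could then print a clear hint instead of failing later when it tries to read framebuffers.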

Framebuffers are selectable from properties screen. Framebuffer id is
stored in settings. But these ids are volatile, they can change on mode
setting, xrandr updates, compositor (compton) stuff, and they certainly
don't survive reboots.

libobs-opengl: retry EGL creation w/o debug attr
On some machines EGLContext cannot be created with
EGL_CONTEXT_OPENGL_DEBUG attribute. Retry without.
@jp9000

Member

jp9000 commented Mar 18, 2019

Hi, thanks for making this PR, I've been looking forward to a higher-performance way to capture on Linux.

Just to warn you, we're a bit backlogged on PRs at the moment, so it might be quite some time before we'll be able to review it. Looking forward to it though.

@TheMuso

Contributor

TheMuso commented Mar 18, 2019

@w23

Author

w23 commented Mar 18, 2019

> Just to warn you, we're a bit backlogged on PRs at the moment, so it might be quite some time before we'll be able to review it. Looking forward to it though.

No problem. I think it will take me at least a few more weeks to clean this out of WIP status, given how little free time I have to work on this.
In the meantime I'd still appreciate high-level feedback on whether I'm doing things right and moving the system in the right direction. I'll update the description with more details a bit later to make it easier for possible reviewers to understand this.

> If we really must have a suid binary to fetch the fd, then it should be authenticated via policykit IMO. That way if distros want to lock it down with a password, then they can do so, even if the default is to allow it with no need for a password.

Thanks for the suggestion! I was scratching my head on how to approach this elevated capabilities thing, but had zero knowledge of polkit. I will have to read about it and figure out whether it supports running a binary with just one required capability, instead of full-root.

@TheMuso

Contributor

TheMuso commented Mar 18, 2019

@Sunderland93

Sunderland93 commented Mar 19, 2019

There is no PipeWire support?

@w23

Author

w23 commented Mar 19, 2019

> There is no PipeWire support?

Nope. While I'm fascinated by pipewire and certainly looking forward to it being stabilized, it is out of scope of this relatively conservative change.

There was a separate pipewire-source effort here: https://gitlab.com/petejohanson/obs-pipewire-screen-casting and discussed here: https://obsproject.com/mantis/view.php?id=719

@Sunderland93

Sunderland93 commented Mar 19, 2019

Does it support capture of game's window only?

@w23

Author

w23 commented Mar 19, 2019

> Does it support capture of game's window only?

It doesn't. Moreover, in its current state it breaks existing Xcomposite support :D (only under EGL; GLX mode is not affected), but I will fix that in a subsequent update before marking it as ready for review.

For this particular plugin to support capture of a single game window, there would have to be support for that in either the game itself or the window system (X11 or Wayland). I'm not aware of any robust way around that (it's technically possible to crop a region from the full framebuffer based on the X11 window rect, but that is a bit weird).
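
To make the cropping idea concrete, here is a small sketch of clamping a window rectangle to the framebuffer. It assumes the framebuffer origin coincides with the X11 root window origin (the same assumption the cursor handling makes); `struct rect` and `crop_to_framebuffer` are hypothetical names, and obtaining the window rect from X11 is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Axis-aligned rectangle in pixels. */
struct rect {
    int x, y, w, h;
};

/* Clamp a window rectangle (in root-window coordinates) to a framebuffer of
 * fb_w x fb_h pixels. Writes the visible part to *out and returns 1, or
 * returns 0 if the window lies entirely outside the framebuffer. */
static int crop_to_framebuffer(struct rect win, int fb_w, int fb_h,
                               struct rect *out)
{
    int x0 = win.x < 0 ? 0 : win.x;
    int y0 = win.y < 0 ? 0 : win.y;
    int x1 = win.x + win.w > fb_w ? fb_w : win.x + win.w;
    int y1 = win.y + win.h > fb_h ? fb_h : win.y + win.h;

    assert(out != NULL);
    if (x1 <= x0 || y1 <= y0)
        return 0;

    out->x = x0;
    out->y = y0;
    out->w = x1 - x0;
    out->h = y1 - y0;
    return 1;
}
```

The resulting rectangle would still capture whatever overlaps the window on screen, which is part of why this approach is "a bit weird" compared to real per-window capture.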

I feel the more pressing issue is capturing a particular CRTC/monitor regardless of framebuffer/mode/rotation changes. I haven't thought about it that much.

@Sunderland93

Sunderland93 commented Mar 19, 2019

> there should be support for that in either the game itself, or window system

Ok. Can PipeWire solve that (like Syphon on macOS)?

@kkartaltepe

Contributor

kkartaltepe commented Mar 19, 2019

> > there should be support for that in either the game itself, or window system
>
> Ok. Can PipeWire solve that (like Syphon on macOS)?

PipeWire is just a transport; it doesn't do anything more than send audio/video streams between two programs. Syphon is an OpenGL capture solution, which is a very different beast.

This PR is a wonderful example of how to use dmabufs, which someone could use to inform an implementation of one of the Wayland screen recording protocols that sit on top of PipeWire for zero-copy capture on supporting Wayland compositors (though whether it's actually zero-copy depends on the compositor, as they may choose to serialize the textures in any format PipeWire supports).

@Sunderland93

Sunderland93 commented Mar 19, 2019

> This PR is a wonderful example of how to use dmabuf's which someone could use to inform an implementation of one of the wayland screen recording protocol that sits on top of pipewire for a zero-copy capture on supporting wayland compositors (though if its actually zerocopy depends on the compositor as they may chose to serialize the textures in any format pipewire supports).

What about this? https://github.com/swaywm/wlr-protocols/blob/master/unstable/wlr-export-dmabuf-unstable-v1.xml and this https://github.com/swaywm/wlr-protocols/blob/master/unstable/wlr-screencopy-unstable-v1.xml

@kkartaltepe

Contributor

kkartaltepe commented Mar 20, 2019

> What about this?

One of the Wayland screen capture protocols not based on PipeWire. Basically all the protocols will result in passing dmabufs (with or without PipeWire) if they want to be performant. I didn't mean to derail this PR with Wayland talk, as it's not directly related to this PR.

@phaitonican

phaitonican commented Mar 26, 2019

I'm getting much better performance with this on Wayland and Xorg. I can get it running on Xorg though (for FreeSync), but once I go into a fullscreen game, it seems it doesn't record anymore? The image kinda freezes, maybe it's only a bug for me? Using Xorg from Arch repos... Thanks!

@w23

Author

w23 commented Mar 26, 2019

> once I go into a Fullscreen game, it seems it doesn't record anymore? The Image kinda freezes, maybe it's only a bug for me? Using Xorg from arch repos... Thanks!

This is expected at this stage. Basically, when a game goes fullscreen it changes a framebuffer that is assigned to your monitor. Currently this plugin doesn't monitor for framebuffer changes, so it continues to read the old one, which becomes stale.

I plan to investigate what can be done. We'd need to keep obs-drmsend running, somehow listening for framebuffer changes and notifying the parent obs process. There are also open questions: the framebuffer can be resized, what to do with multi-monitor configurations, how to remap the cursor if it is enabled, etc.

How does performance compare to using XComposite for capturing your game?

@YaLTeR

YaLTeR commented Mar 26, 2019

Trying to test it here on Arch Linux, I have the Intel integrated GPU as card0 (I don't use that) and an AMD RX 580 as card1. It seems that the DMA-BUF source defaults to card0, and upon adding it OBS outputs:

obs-drmsend: Opening card /dev/dri/card0
obs-drmsend: DRM planes 9:
obs-drmsend: 	0: fb_id=0
obs-drmsend: 	1: fb_id=0
obs-drmsend: 	2: fb_id=0
obs-drmsend: 	3: fb_id=0
obs-drmsend: 	4: fb_id=0
obs-drmsend: 	5: fb_id=0
obs-drmsend: 	6: fb_id=0
obs-drmsend: 	7: fb_id=0
obs-drmsend: 	8: fb_id=0
obs-drmsend: sent 392
error: Received fd size mismatch: 80 received, 16 expected
error: Unable to enumerate DRM/KMS framebuffers
error: Failed to create source 'DMA-BUF source'!

and shows this window:

image

Changing the hardcoded card0 to card1 seems to make everything work well!
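
As a possible alternative to a hardcoded path, the available card nodes could be enumerated so the user (or the helper) can choose one. The sketch below only lists `/dev/dri/card*` nodes; the function names are hypothetical, and deciding which card actually has active framebuffers (nonzero fb_id, unlike the card0 log above) would additionally need libdrm, which is not shown.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Build "/dev/dri/card<index>" into buf. Returns 0 on success, or -1 if
 * the buffer is too small. */
static int dri_card_path(int index, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "/dev/dri/card%d", index);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Call cb(path, user) for every /dev/dri/card* node that currently exists.
 * cb may be NULL to only count. Returns the number of nodes found. */
static int for_each_dri_card(void (*cb)(const char *path, void *user),
                             void *user)
{
    glob_t g;
    int count = 0;
    size_t i;

    memset(&g, 0, sizeof(g));
    if (glob("/dev/dri/card*", 0, NULL, &g) != 0)
        return 0; /* no match or error: nothing to enumerate */

    for (i = 0; i < g.gl_pathc; i++) {
        if (cb)
            cb(g.gl_pathv[i], user);
        count++;
    }
    globfree(&g);
    return count;
}
```

The collected paths could populate a drop-down in the source properties, next to the existing framebuffer selection.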

For some reason EGL is having issues keeping up with 60 FPS though. OBS FPS counter always shows 58-59 with EGL and recording 60+ FPS footage results in stuttering, as if resampled to lower FPS.

@foxcpp

foxcpp commented Mar 26, 2019

On wlroots-based Wayland compositors, it is possible to access the framebuffer using the wlr-export-dmabuf extension without any setuid helpers.

@phaitonican

phaitonican commented Mar 26, 2019

> > but once I go into a Fullscreen game, it seems it doesn't record anymore? The Image kinda freezes, maybe it's only a bug for me? Using Xorg from arch repos... Thanks!
>
> I plan to investigate into what can be done. We'd need to keep obs-drmsend running, somehow listening for framebuffer changes, and notifying parent obs process. There are also issues like: framebuffer can be resized, what to do with multi-monitor configurations, how to remap cursor if it is enabled, etc.
>
> How does performance compare to using XComposite for capturing your game?

I tried XComposite on Xorg; to make it work I had to change to GLX like you said. But the recorded picture seems to be missing some colors, which is strange. ONLY for games, the desktop seems fine.
obsphoto
The output file looks the same as in the picture. I think the performance is better.

About dmabuf_source:

> This is expected at this stage. Basically, when a game goes fullscreen it changes a framebuffer that is assigned to your monitor. Currently this plugin doesn't monitor for framebuffer changes, so it continues to read the old one, which becomes stale.

This makes total sense. I find it confusing though that Wayland and Xorg behave so differently there, because on Wayland it seems to get the "right" framebuffer when I open the fullscreen game. Maybe it's only an Xorg thing to reassign the framebuffer once fullscreen. Thanks!
