Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Screencast session support #22

Merged
merged 2 commits into from Apr 16, 2020
Merged

Conversation

danshick
Copy link
Collaborator

@danshick danshick commented Apr 8, 2020

Fixes #13.

This is designed to take pairs of output id and cursor mode and create a single pipewire stream for each unique combo. Until output selection can be controlled at screencast initialization, this doesn't serve much of a purpose, but it is designed with that capability in mind.

At the moment, this gets all the way to a pipewire stream PAUSED state and continues processing wlroots frames like it is supposed to, but the stream is not able to activate. I need to investigate changes made to the pipewire portions of the code to see if I created a regression.

Edit: The pipewire issues are fixed.

@emersion, in order to give all of these event driven functions an understanding of which session they belong to, I've started to pass around "cast" session objects instead of the xdpw_state or xdpw_screencast_context, which are both more global in nature, but some functions still need access to data in the context or global state object (pw_loop comes to mind). To accomplish this, each session object has a pointer back to the screencast_context and the context holds a pointer back up to the state. Wanted to point this out in case you see any issue with this approach.

@danshick danshick force-pushed the session_support branch 5 times, most recently from 1c7e97d to f2c80f3 Compare April 8, 2020 02:41
@danshick danshick self-assigned this Apr 8, 2020
@danshick
Copy link
Collaborator Author

danshick commented Apr 8, 2020

This will likely fix #18 as well before it's completed.

@danshick danshick linked an issue Apr 8, 2020 that may be closed by this pull request
@emersion
Copy link
Owner

emersion commented Apr 8, 2020

I've started to pass around "cast" session objects instead of the xdpw_state or xdpw_screencast_context

Yeah, this sounds perfectly fine

@danshick
Copy link
Collaborator Author

danshick commented Apr 8, 2020

Okay, this is working-ish. It's still a little finicky. If anyone wants to try it out (especially stress test it, open lots of streams, close them, in parallel or in series), that'd be great.

There's something weird going on right now with chromium still. Has to do with it requesting individual window shares from us, even though we advertise that we don't support that. I'll work on cleaning that up.

@alebastr
Copy link
Contributor

alebastr commented Apr 8, 2020

Thanks!
Haven't done any meaningful testing, however ASan already reported some issues:

ERROR: AddressSanitizer: heap-use-after-free on address 0x612000001c2c at pc 0x00000049b312 bp 0x7ffe5a4f7230 sp 0x7ffe5a4f69e0
2020/04/08 14:50:08 [INFO] - wlroots: wl_display fd: 4
2020/04/08 14:50:13 [INFO] - dbus: create session method invoked
2020/04/08 14:50:13 [INFO] - dbus: request_handle: /org/freedesktop/portal/desktop/request/1_217/u1
2020/04/08 14:50:13 [INFO] - dbus: session_handle: /org/freedesktop/portal/desktop/session/1_217/u1
2020/04/08 14:50:13 [INFO] - dbus: app_id:
2020/04/08 14:50:13 [INFO] - dbus: select sources method invoked
2020/04/08 14:50:13 [INFO] - dbus: request_handle: /org/freedesktop/portal/desktop/request/1_217/u2
2020/04/08 14:50:13 [INFO] - dbus: session_handle: /org/freedesktop/portal/desktop/session/1_217/u1
2020/04/08 14:50:13 [INFO] - dbus: app_id:
2020/04/08 14:50:13 [INFO] - dbus: option types:3
2020/04/08 14:50:13 [INFO] - dbus: option multiple: 0
=================================================================
==48441==ERROR: AddressSanitizer: heap-use-after-free on address 0x612000001c2c at pc 0x00000049b312 bp 0x7ffe5a4f7230 sp 0x7ffe5a4f69e0
READ of size 49 at 0x612000001c2c thread T0
    #0 0x49b311 in __interceptor_strcmp.part.0 (/home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/xdg-desktop-portal-wlr+0x49b311)
    #1 0x4fe493 in method_screencast_select_sources /home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/../src/screencast/screencast.c:259:6
    #2 0x7f5903d4ffb5  ../src/libsystemd/sd-bus/bus-objects.c:409:21
    #3 0x7f5903d4ffb5  ../src/libsystemd/sd-bus/bus-objects.c:1332:21
    #4 0x7f5903da3edc  ../src/libsystemd/sd-bus/bus-objects.c:1452:21
    #5 0x7f5903da3edc  ../src/libsystemd/sd-bus/sd-bus.c:2763:13
    #6 0x7f5903da3edc  ../src/libsystemd/sd-bus/sd-bus.c:2805:13
    #7 0x7f5903da3edc  ../src/libsystemd/sd-bus/sd-bus.c:3025:21
    #8 0x4f5eb6 in main /home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/../src/core/main.c:142:11
    #9 0x7f590393e041 in __libc_start_main (/lib64/libc.so.6+0x27041)
    #10 0x41d7fd in _start (/home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/xdg-desktop-portal-wlr+0x41d7fd)

0x612000001c2c is located 236 bytes inside of 304-byte region [0x612000001b40,0x612000001c70)
freed by thread T0 here:
#0 0x4bf317 in free (/home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/xdg-desktop-portal-wlr+0x4bf317)
#1 0x7f5903d36d0b ../src/libsystemd/sd-bus/bus-message.c:131:17

previously allocated by thread T0 here:
#0 0x4bf998 in realloc (/home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/xdg-desktop-portal-wlr+0x4bf998)
#1 0x7f5903d5b704 ../src/libsystemd/sd-bus/bus-socket.c:1179:13

SUMMARY: AddressSanitizer: heap-use-after-free (/home/alebastr/sources/xdg-desktop-portal-wlr/build-asan/xdg-desktop-portal-wlr+0x49b311) in __interceptor_strcmp.part.0

Build command, for reference:

env CC=clang CXX=clang++ meson setup build-asan --buildtype debug -Db_lundef=false -Db_sanitize=address,undefined

Will try normal build once my working day ends.

Edit: normal build with latest commit crashed when I restarted screen sharing in Firefox.
method_screencast_start didn't found session and used uninitialized pointer to xdpw_screencast_instance:

(gdb) p ((struct xdpw_session*) state->xdpw_sessions)[0]
$14 = {
  link = {
    prev = 0x7ffff81ccc60,
    next = 0x7ffff81ccc60
  },
  slot = 0x13a19e0,
  session_handle = 0x1429314 "/freedesktop/portal/desktop/session/1_56/webrtc_session633632397",
  screencast_instance = 0x13a1040
}
(gdb) p session_handle
$15 = 0x1429310 "/org/freedesktop/portal/desktop/session/1_56/webrtc_session633632397"
(gdb) bt
#0  xdpw_wlr_register_cb (cast=0x7fcc4b36d9d8 <sd_bus_message_rewind+136>) at ../src/screencast/wlr_screencast.c:178
#1  0x0000000000403961 in start_screencast (data=0x7fcc4b36d9d8 <sd_bus_message_rewind+136>) at ../src/screencast/screencast.c:91
#2  0x000000000040439a in method_screencast_start (msg=0x13a1500, data=0x7ffff81ccc60, ret_error=0x7ffff81ccaa0) at ../src/screencast/screencast.c:352
#3  0x00007fcc4b375fb6 in method_callbacks_run (found_object=0x7ffff81ccb68, require_fallback=false, c=<optimized out>, m=0x13a1500, bus=0x135a2d0)
    at ../src/libsystemd/sd-bus/bus-objects.c:409
#4  object_find_and_run (bus=0x135a2d0, m=0x13a1500, p=0x1429238 "/org/freedesktop/portal/desktop", require_fallback=false, found_object=0x7ffff81ccb68)
    at ../src/libsystemd/sd-bus/bus-objects.c:1332
#5  0x00007fcc4b3c9edd in bus_process_object (m=0x13a1500, bus=0x135a2d0) at ../src/libsystemd/sd-bus/bus-objects.c:1452
#6  process_message (m=0x13a1500, bus=0x135a2d0) at ../src/libsystemd/sd-bus/sd-bus.c:2763
#7  process_running (hint_priority=<optimized out>, priority=<optimized out>, ret=0x0, bus=0x135a2d0) at ../src/libsystemd/sd-bus/sd-bus.c:2805
#8  bus_process_internal.isra.0 (bus=0x135a2d0, ret=0x0, priority=<optimized out>, hint_priority=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:3025
#9  0x0000000000402a77 in main (argc=5, argv=0x7ffff81cce78) at ../src/core/main.c:142

tl;dr: C is hard and AddressSanitizer saves a lot of time by catching things like that early.

@danshick
Copy link
Collaborator Author

danshick commented Apr 9, 2020

Hey, thanks for testing this PR. It definitely still has some issues where it can get into strange states.

Your first ASan run indicates line number 259 in screencast.c, which in the latest commit is a closing brace. Not sure what to make of that.

I love ASan but sadly xdpw has often crashed when using ASan for an unknown reason. If that's no longer the case, id be very happy. Confused, but happy.

Edit: I'm an idiot. Looking at the wrong branch, but line 259 still doesn't make sense. That's just a call to sd_bus_message_exit_container.

Edit 2: Okay, you're right, ASan is working again (go figure). I am getting the exact same error on a similar line, so I think I know where to look. Valgrind hinted at this error as well, and I can't quite make sense of it yet.

Edit 3: Got it, session_handle was being freed in the create_session method. Just needed to duplicate that string if I want to persist it. This is fixed.

@danshick
Copy link
Collaborator Author

danshick commented Apr 9, 2020

For your second issue, maybe I need to initialize the cast local to NULL so the NULL pointer check right above it triggers. Didn't think i needed to do that.

But the better question is why it didn't find the session. If that's reproducible, could you share a trace log? Might wanna use a paste service as it could be very large.

Edit: I've ensured that cast is a NULL pointer now, so it shouldn't crash, it should just not work if the session can't be found. Firefox is working fine on my end, so could you try this again and report back with a TRACE log if you can?

Also, I've noticed that pipewire has been getting into a bad state with misbehaving plugins. I have no idea how or why, but I've found I occasionally need to stop (and start, if you don't have pipewire.socket enabled) my pipewire user service when xdpw segfaults or stops working.

@danshick
Copy link
Collaborator Author

danshick commented Apr 9, 2020

This PR does, in fact, fix #18, which in turn means that this now correctly supports obs-xdg-portal. Not sure this is in any way better than wlrobs, but it's another option.

@alebastr
Copy link
Contributor

alebastr commented Apr 9, 2020

👍
No more crashes and multiple simultaneous sessions are working. Got the same issue with pipewire though.
I believe the restart crash I reported was another case of use-after-free that you've just fixed.

maybe I need to initialize the cast local to NULL so the NULL pointer check right above it triggers. Didn't think i needed to do that.

I personally prefer to initialize everything, just to be safe. And then remove redundant initializations where it's clear that all branches below are setting value to the variable before using it. Saved me a lot of time in the past.

@danshick
Copy link
Collaborator Author

danshick commented Apr 10, 2020

I haven't been able to figure out what exactly is wrong with pipewire. If you run it manually with the environment variable PIPEWIRE_DEBUG=3 (or even 4) and you manage to get it into a bad state, that log might help @wtay and others to improve its stability. Our plugin shouldn't put it into an unstable state no matter what we're doing. I honestly can't even figure out when or how it gets that way.

I'd also love if someone could help me track down the issues with Chromium now. It's a strange failure case that I can't make sense of.

@nostdm
Copy link

nostdm commented Apr 10, 2020

I tested both chromium-dev-ozone and regular chromium and it seems that the xdg-desktop-portal-wlr no longer crashes and stops/restarts properly. That said I can see output only in the thumbnail when picking the screen to share, but no output after that.

The popup correctly shows a real-time thumbnail for the whole screen:

popup

After sharing the screen there is no output (I tried both https://meet.google.com and https://www.webrtc-experiment.com/Pluginfree-Screen-Sharing). Below screenshot shows no output with a loading wheel:

webrtc

I also occasionally get this error from xdg-desktop-portal:

[E][000054978.592812][core.c:71 core_event_error()] core 0x7f20e403ff00: proxy 0x7f20e403ff00 id:0: seq:5 res:-13 (Permission denied) msg:"no permission to call method 2 on 0 (requires 00000040, have 00000000)"

@danshick
Copy link
Collaborator Author

Thanks, that's helpful, and actually a totally different failure mode than I was experiencing. Pipewire has a permissions model that, until this point, I've been happy to ignore. I think the default is permissive. This looks like it might be related.

My experience with Chromium started similarly with the preview functioning, but my full screen share appeared garbled instead of not at all.

@danshick danshick requested a review from emersion April 11, 2020 19:22
@danshick
Copy link
Collaborator Author

danshick commented Apr 13, 2020

@emersion I'm debating ignoring this Chromium issue for the purposes of this PR. It definitely deserves its own issue and fix, but this is behaving correctly for any number of Firefox, xdp-screen-cast.py, and obs-xdg-portal sessions that I can throw at it.

include/xdpw.h Outdated Show resolved Hide resolved
src/core/main.c Outdated Show resolved Hide resolved
src/core/main.c Outdated Show resolved Hide resolved
src/core/request.c Outdated Show resolved Hide resolved
src/core/session.c Outdated Show resolved Hide resolved
Copy link
Owner

@emersion emersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the whole looks good, thanks a lot for your work!

@danshick
Copy link
Collaborator Author

danshick commented Apr 14, 2020

Thanks for taking the time to review. I agree across the board.

One of these days, I'm going to develop the muscle memory to start including spaces around my conditionals (or strong arm my editor into doing it on save).

Edit:

And after these corrections, I'll rebase this a bit too.

@lsjostro
Copy link

I tested both chromium-dev-ozone and regular chromium and it seems that the xdg-desktop-portal-wlr no longer crashes and stops/restarts properly. That said I can see output only in the thumbnail when picking the screen to share, but no output after that.

The popup correctly shows a real-time thumbnail for the whole screen:

popup

After sharing the screen there is no output (I tried both https://meet.google.com and https://www.webrtc-experiment.com/Pluginfree-Screen-Sharing). Below screenshot shows no output with a loading wheel:

webrtc

I also occasionally get this error from xdg-desktop-portal:

[E][000054978.592812][core.c:71 core_event_error()] core 0x7f20e403ff00: proxy 0x7f20e403ff00 id:0: seq:5 res:-13 (Permission denied) msg:"no permission to call method 2 on 0 (requires 00000040, have 00000000)"

I get the same above with chrome/chromium.
here some more logs from pipewire

systemctl --user status pipewire

[W][000001265.154665][impl-node.c:327 suspend_node()] node 0x559a50dd4c10: error unset format input: Input/output error
[W][000001265.162238][impl-node.c:327 suspend_node()] node 0x559a50dbd580: error unset format input: Input/output error
[W][000001265.190417][module-protocol-native.c:370 client_new()] server 0x559a50cd9a90: no peersec: Protocol not available
[W][000001265.190832][impl-client.c:608 pw_impl_client_update_permissions()] client 0x559a50db24c0: invalid global 104495344
[W][000001265.192569][connection.c:311 prepare_packet()] old version detected
[E][000001265.346362][impl-link.c:118 pw_impl_link_update_state()] link 0x559a50dd8010: update state negotiating -> error (no more input formats)

@danshick
Copy link
Collaborator Author

Hmmm, most of that is safely ignorable, but that last line of pipewire output may be a hint. I'm not sure why, but it looks like we can't find an agreeable format in common? Weird because the WebRTC code is common to Firefox and Chromium. I'll open a new issue once this PR lands and debug it further.

@lsjostro
Copy link

created a gist with full pipewire debug log.
look at the end for relavent error

https://gist.github.com/lsjostro/f08398b120a2309af36d000a0aa3c2c7

Remove libdrm dependency

Remove display from context, add dbus properties

Use random names for shm and pw_stream, init the stream only for new cast instances

Separate cast initialized flag from refcount, cleanup names and comments
Properly report xdp screencast implementation version
@danshick danshick changed the title [WIP] Screencast session support Screencast session support Apr 15, 2020
Copy link
Owner

@emersion emersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@emersion emersion merged commit 55f873d into emersion:master Apr 16, 2020
@abmantis
Copy link

@danshick Just as a note (couldn't find the issue you mentioned you would open): I had the same problem of black screen in chrome, and was able to make it work by specifying the -p option (either RGBx or BGRx work, but the first makes the colors funky).

@danshick
Copy link
Collaborator Author

danshick commented Apr 29, 2020

@abmantis Thanks, this is a known issue, thus the presence of that flag. I didn't create an issue but there is a PR open to document this. A more permanent fix is also forthcoming both here and in WebRTC (we will both start being more flexible with supported formats). The mentioned PR will be adding significantly more documentation in the mean time that will mention this and other common questions. Check out the current open PR to see those new docs and tell me what you think.

Glad you got it working. Hit me up on IRC if you have any issues. #sway on freenode.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

screencast: properly report cursor mode screencast: properly handle sessions
6 participants