
Segfault - While screencasting (firefox nightly WebRTC). #24

Closed
jameswalmsley opened this issue Apr 16, 2020 · 18 comments
Labels
invalid This doesn't seem right

Comments

@jameswalmsley

Hi @danshick great work.

I've got this working on the Mozilla WebRTC demo in firefox nightly, and can see my screen being shared. After a few seconds I get a segfault (backtrace below, after the commands I used).

I ran the tools with the following command lines:

env PIPEWIRE_DEBUG=3 pipewire 2>&1 | tee pipewire.log
gdb --args xdg-desktop-portal-wlr -p BGRx -l TRACE
/usr/local/libexec/xdg-desktop-portal --verbose -r
(gdb) bt
#0  0x0000005400000003 in ?? ()
#1  0x00007ffff7e9b83b in clear_buffers (stream=stream@entry=0x5555555a67a0) at ../src/pipewire/stream.c:570
#2  0x00007ffff7ea3d64 in impl_port_set_param (object=0x5555555a67a0, direction=<optimised out>, port_id=<optimised out>, id=4, flags=<optimised out>, param=<optimised out>) at ../src/pipewire/stream.c:608
#3  0x00007ffff7e9735e in pw_impl_port_set_param (port=port@entry=0x5555555a8c00, id=id@entry=4, flags=flags@entry=0, param=param@entry=0x0) at ../src/pipewire/impl-port.c:1143
#4  0x00007ffff6989a88 in client_node_port_set_param (object=<optimised out>, direction=<optimised out>, port_id=<optimised out>, id=<optimised out>, flags=0, param=0x0) at ../src/modules/module-client-node/remote-node.c:576
#5  0x00007ffff699b55f in client_node_demarshal_port_set_param (object=0x7ffff68a7010, msg=<optimised out>) at ../src/modules/module-client-node/protocol-native.c:451
#6  0x00007ffff69d361d in process_remote (impl=impl@entry=0x555555594440) at ../src/modules/module-protocol-native.c:667
#7  0x00007ffff69d3b78 in on_remote_data (data=0x555555594440, fd=<optimised out>, mask=<optimised out>) at ../src/modules/module-protocol-native.c:707
#8  0x00007ffff7fc3edc in loop_iterate (object=0x55555556ef28, timeout=<optimised out>) at ../spa/plugins/support/loop.c:302
#9  0x0000555555557d8a in main ()
(gdb) 

Let me know if I can get more information or help you debug.
I've pulled everything from the latest master branch and compiled.

pipewire.log

Thanks for your efforts.

James

@soyuka

soyuka commented Apr 16, 2020

Latest master for me with fedora-firefox-wayland-bin (from aur), using https://mozilla.github.io/webrtc-landing/gum_test.html full screen sharing. Pipewire logs:

[E][000017593.677537][impl-link.c:118 pw_impl_link_update_state()] link 0x55b76d7dcd50: update state negotiating -> error (no more input formats)
[E][000017596.515528][private.h:218 pw_core_resource_errorv()] resource 0x55b76d0d5980: id:0 seq:348 res:-22 (Invalid argument) msg:"unknown resource 64 op:4"
[E][000017596.515607][core.c:71 core_event_error()] core 0x559612a37fc0: proxy 0x559612a37fc0 id:0: seq:348 res:-22 (Invalid argument) msg:"unknown resource 64 op:4"
[E][000017596.515614][media-session.c:1629 core_error()] error id:0 seq:348 res:-22 (Invalid argument): unknown resource 64 op:4

The xdg-portal-wlr doesn't segfault though (latest master):

2020/04/16 13:03:34 [TRACE] - dbus: start: found matching session /org/freedesktop/portal/desktop/session/1_211/webrtc_session1021323991
2020/04/16 13:03:34 [TRACE] - event-loop: got wayland event
2020/04/16 13:03:34 [TRACE] - wlroots: flags event handler
2020/04/16 13:03:34 [TRACE] - wlroots: damage event handler
2020/04/16 13:03:34 [TRACE] - wlroots: ready event handler
2020/04/16 13:03:34 [TRACE] - wlroots: frame destroyed
2020/04/16 13:03:34 [TRACE] - xdpw: destroying cast instance
2020/04/16 13:03:34 [TRACE] - pipewire: destroying stream
2020/04/16 13:03:34 [INFO] - pipewire: stream state changed to "unconnected"
2020/04/16 13:03:34 [INFO] - pipewire: node id is -1
2020/04/16 13:03:34 [TRACE] - event-loop: got pipewire event
2020/04/16 13:03:34 [TRACE] - event-loop: got pipewire event

Hope that helps.

@danshick
Collaborator

danshick commented Apr 16, 2020

Have either of you built the changes in xdpw that were merged this morning for session support?

Edit:

And @jameswalmsley, could you provide a more complete backtrace? Maybe run xdpw in gdb and do a bt full after the segfault? I'm hoping to see if there was any additional context as the segfault is occurring pretty deep into the linked pipewire plugin code.
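A full backtrace can also be captured non-interactively. The following is a minimal sketch, assuming gdb is installed: batch mode runs the program and prints `bt full` once it stops on the fault. The function name `capture_backtrace` and the output file name are made up for illustration; the xdpw arguments are the ones already used in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of xdpw): run a command under gdb and save
# a full backtrace once the program stops, for example on SIGSEGV.
capture_backtrace() {
    out="$1"; shift
    # -batch: exit after the listed commands; -ex: execute one gdb command.
    # "bt full" prints every frame together with its local variables.
    gdb -batch -ex run -ex "bt full" --args "$@" > "$out" 2>&1
}

# Usage (not executed here):
#   capture_backtrace xdpw-bt.log xdg-desktop-portal-wlr -p BGRx -l TRACE
```

If the program exits without crashing, gdb has no stack to print, so the file will contain no frames.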

@danshick
Collaborator

Also, @soyuka, did you start xdpw with -p BGRx? Looks like you might just have a pixelformat issue.

@soyuka

soyuka commented Apr 16, 2020

> Have either of you built the changes in xdpw that were merged this morning for session support?

First thing I did in the morning :).

I was pretty sure I did start it with -p BGRx but now I tried again and this works (actually so happy haha, thanks for the work, I can finally share my alacritty :D )!!!
Now I got it to work for some time (full-screen sharing only). Just reporting some errors that occurred in the pipewire log (I didn't get any segfaults):

[E][000001624.926795][private.h:218 pw_core_resource_errorv()] resource 0x555e4fe93980: id:0 seq:254 res:-22 (Invalid argument) msg:"unknown resource 46 op:4"
[E][000001624.926825][core.c:71 core_event_error()] core 0x55fdc8359fc0: proxy 0x55fdc8359fc0 id:0: seq:254 res:-22 (Invalid argument) msg:"unknown resource 46 op:4"
[E][000001624.926831][media-session.c:1629 core_error()] error id:0 seq:254 res:-22 (Invalid argument): unknown resource 46 op:4

When I closed the video:

[I][000001658.713545][module-protocol-native.c:288 connection_data()] protocol-native 0x555e4fe5c6e0: client 0x555e4ff309f0 disconnected
[W][000001658.713660][impl-node.c:337 suspend_node()] node 0x555e4ff35e90: error unset format input: Input/output error
[I][000001658.713711][client-node.c:457 do_update_port()] node 0x555e50387b50: port 0 update 0 params
[E][000001661.715603][private.h:218 pw_core_resource_errorv()] resource 0x555e4fe93980: id:0 seq:268 res:-22 (Invalid argument) msg:"unknown resource 52 op:4"
[E][000001661.715665][core.c:71 core_event_error()] core 0x55fdc8359fc0: proxy 0x55fdc8359fc0 id:0: seq:268 res:-22 (Invalid argument) msg:"unknown resource 52 op:4"
[E][000001661.715673][media-session.c:1629 core_error()] error id:0 seq:268 res:-22 (Invalid argument): unknown resource 52 op:4

Somehow I managed to stream the open firefox window, and the video permissions broke when I stopped the stream (I had to close and reopen firefox to make it work again). I got this error:

[E][000001963.966379][private.h:218 pw_core_resource_errorv()] resource 0x555e4fe93980: id:0 seq:498 res:-22 (Invalid argument) msg:"unknown resource 84 op:4"
[E][000001963.966444][core.c:71 core_event_error()] core 0x55fdc8359fc0: proxy 0x55fdc8359fc0 id:0: seq:498 res:-22 (Invalid argument) msg:"unknown resource 84 op:4"
[E][000001963.966454][media-session.c:1629 core_error()] error id:0 seq:498 res:-22 (Invalid argument): unknown resource 84 op:4
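Each group of three lines above shares one `seq` and `res` value, so they appear to be a single error reported from three places rather than three separate failures. A small sketch for collapsing such a log to one line per distinct error follows; the function name `summarize_unknown_resource` is hypothetical, and the pattern only assumes the line format shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reduce repeated "unknown resource" error lines in a
# PipeWire log to one line per distinct (seq, res, message) combination.
summarize_unknown_resource() {
    # Keep only the seq, res and "unknown resource N op:M" fields,
    # then drop duplicates.
    sed -nE 's/.*(seq:[0-9]+) (res:-?[0-9]+) .*(unknown resource [0-9]+ op:[0-9]+).*/\1 \2 \3/p' "$1" \
        | sort -u
}

# Usage (not executed here):
#   summarize_unknown_resource pipewire.log
```

This only groups the messages; it does not explain what the unknown resource is.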

May I ask what this unknown resource is?

Some logs of xdpw:

XDP: Using wlr.portal for org.freedesktop.impl.portal.Screenshot in sway
XDP: providing portal org.freedesktop.portal.Screenshot
XDP: Using wlr.portal for org.freedesktop.impl.portal.ScreenCast in sway
XDP: providing portal org.freedesktop.portal.ScreenCast
XDP: org.freedesktop.portal.Desktop acquired
XDP: screen cast session owned by ':1.106' created
XDP: screen cast session owned by ':1.106' started
XDP: screen cast session owned by ':1.106' closed
XDP: screen cast session owned by ':1.106' created
XDP: screen cast session owned by ':1.106' started
XDP: screen cast session owned by ':1.106' closed

Everything went fine.

I'm using sway, archlinux, and the master versions of pipewire, xdp, and xdpw, plus fedora-firefox-wayland. Btw, do you know if or when this pipewire patch will become available in firefox?

20200416_18h35m05s_grim
20200416_18h34m56s_grim

@danshick
Collaborator

I haven't looked into the pipewire logs yet in depth. Nothing you're seeing there looks out of the ordinary or problematic to me.

This is the issue preventing Firefox from building with pipewire support: https://bugzilla.mozilla.org/show_bug.cgi?id=1430775

It is a tricky fix because it has to do with the build tool in use, and its inability to set custom flags. For now, patched releases will have to do.

@jameswalmsley
Author

Hi @danshick,

I've pulled everything to master (even wlroots and sway).
I can connect with firefox nightly and it still works, but I still get the segfault. Here's the full trace:

No symbol table info available.
#1  0x00007ffff7e9b83b in clear_buffers (stream=stream@entry=0x5555555a6a50) at ../src/pipewire/stream.c:570
        _f = <optimised out>
        list = 0x5555555a6ab8
        s = 0x5555555a6ab8
        cursor = {link = {next = 0x7fffffffd868, prev = 0x5555555cff18}, cb = {funcs = 0x0, data = 0x0}, 
          removed = 0x0, priv = 0x0}
        ci = <optimised out>
        count = <optimised out>
        b = 0x5555555a6d58
        impl = 0x5555555a6a50
        i = 0
        j = <optimised out>
        __func__ = "clear_buffers"
#2  0x00007ffff7ea3d64 in impl_port_set_param (object=0x5555555a6a50, direction=<optimised out>, 
    port_id=<optimised out>, id=4, flags=<optimised out>, param=<optimised out>)
    at ../src/pipewire/stream.c:608
        impl = 0x5555555a6a50
        stream = 0x5555555a6a50
        res = <optimised out>
        __func__ = "impl_port_set_param"
#3  0x00007ffff7e9735e in pw_impl_port_set_param (port=port@entry=0x5555555a8eb0, id=id@entry=4, 
    flags=flags@entry=0, param=param@entry=0x0) at ../src/pipewire/impl-port.c:1143
        _f = <optimised out>
        _res = -95
        _n = <optimised out>
        res = <optimised out>
        node = 0x5555555a81e0
        __func__ = "pw_impl_port_set_param"
#4  0x00007ffff6989a88 in client_node_port_set_param (object=<optimised out>, direction=<optimised out>, 
    port_id=<optimised out>, id=<optimised out>, flags=0, param=0x0)
    at ../src/modules/module-client-node/remote-node.c:576
        proxy = <optimised out>
        data = <optimised out>
        port = 0x5555555a8eb0
        res = <optimised out>
        __func__ = "client_node_port_set_param"
#5  0x00007ffff699b55f in client_node_demarshal_port_set_param (object=0x7ffff68a7010, msg=<optimised out>)
    at ../src/modules/module-client-node/protocol-native.c:451
        _f = <optimised out>
        list = 0x7ffff68a7068
        s = 0x7ffff68a7068
        cursor = {link = {next = 0x7ffff68a7068, prev = 0x7ffff692f120}, cb = {funcs = 0x0, data = 0x0}, 
          removed = 0x0, priv = 0x0}
        ci = <optimised out>
        count = <optimised out>
        proxy = 0x7ffff68a7010
        prs = {data = 0x55555559e908, size = 80, _padding = 0, state = {offset = 80, flags = 0, frame = 0x0}}
        direction = 1
        port_id = 0
        id = 4
        flags = 0
        param = 0x0
#6  0x00007ffff69d361d in process_remote (impl=impl@entry=0x5555555946f0) at ../src/modules/module-protocol-native.c:667
        proxy = 0x7ffff68a7010
        demarshal = <optimised out>
        marshal = <optimised out>
        msg = 0x5555555957f0
        conn = 0x5555555947a0
        this = 0x5555555937d0
        res = <optimised out>
        __func__ = "process_remote"
#7  0x00007ffff69d3b78 in on_remote_data (data=0x5555555946f0, fd=<optimised out>, mask=<optimised out>) at ../src/modules/module-protocol-native.c:707
        impl = 0x5555555946f0
        this = 0x5555555937d0
        conn = <optimised out>
        context = <optimised out>
        loop = 0x55555556ed50
        res = <optimised out>
        __func__ = "on_remote_data"
#8  0x00007ffff7fc3edc in loop_iterate (object=0x55555556ef28, timeout=<optimised out>) at ../spa/plugins/support/loop.c:302
        s = <optimised out>
        impl = 0x55555556ef28
        loop = 0x55555556ef40
        ep = {{events = 1, data = 0x5555555a6960}, {events = 4158825920, data = 0x55555555f290}, {events = 4294956656, data = 0x37f4f212646e100}, {events = 4294958805, data = 0x7ffff7e2a5c0 <_IO_2_1_stderr_>}, {events = 4158829728, data = 0x7ffff7cc3546 <__GI__IO_fflush+134>}, {events = 0, data = 0x5}, {events = 4158825920, data = 0x5555555582c3 <logprint+483>}, {events = 1587065620, data = 0x3000000010}, {events = 4294957088, data = 0x7fffffffd740}, {events = 808595506, data = 0x33333a3032203631}, {events = 3159098, data = 0x24}, {events = 1431737240, data = 0x555555567388}, {events = 4294956816, data = 0x7ffff7f9de4f <wl_connection_flush+367>}, {events = 1431804948, data = 0x24}, {events = 4294956848, data = 0x2b1455579b30}, {events = 0, data = 0x7fff00000000}, {events = 4294956816, data = 0x1}, {events = 0, data = 0x0}, {events = 0, data = 0x7fffffffd740}, {events = 1431731868, data = 0x24}, {events = 1431804720, data = 0x555555579b60}, {events = 1432009152, data = 0x37f4f212646e100}, {events = 1431724661, data = 0x37f4f212646e100}, {events = 4294967295, data = 0x7ffff7d5396f <__GI___poll+79>}, {events = 0, data = 0x1}, {events = 2, data = 0x7ffff7f9beb8 <dispatch_event+264>}, {events = 1431804768, data = 0x555555566210}, {events = 117, data = 0x37f4f212646e100}, {events = 4294958805, data = 0x555555566210}, {events = 1431724768, data = 0x37f4f212646e100}, {events = 0, data = 0x0}, {events = 1431724560, data = 0x0}}
        i = <optimised out>
        nfds = <optimised out>
#9  0x0000555555557d8a in main ()

Pipewire gives me this when I start it:

pipewire
[E][000038038.851737][pipewire.c:117 open_plugin()] can't load /usr/local/lib/x86_64-linux-gnu/spa-0.2/bluez5/libspa-bluez5.so: /usr/local/lib/x86_64-linux-gnu/spa-0.2/bluez5/libspa-bluez5.so: cannot open shared object file: No such file or directory
[E][000038038.851762][pipewire.c:246 pw_load_spa_handle()] can't load 'bluez5/libspa-bluez5': No such file or directory
[E][000038038.851771][bluez-monitor.c:452 sm_bluez5_monitor_start()] can't load api.bluez5.enum.dbus: No such file or directory
[E][000038038.860984][pipewire.c:117 open_plugin()] can't load /usr/local/lib/x86_64-linux-gnu/spa-0.2/jack/libspa-jack.so: /usr/local/lib/x86_64-linux-gnu/spa-0.2/jack/libspa-jack.so: cannot open shared object file: No such file or directory
[E][000038038.861007][pipewire.c:246 pw_load_spa_handle()] can't load 'jack/libspa-jack': No such file or directory
[E][000038038.861018][spa-device.c:144 pw_spa_device_load()] can't load device handle: No such file or directory
[E][000038038.861025][module-device-factory.c:142 create_object()] can't create device: No such file or directory
[E][000038038.861041][private.h:218 pw_core_resource_errorv()] resource 0x55f28fda8220: id:4 seq:4 res:-2 (No such file or directory) msg:"can't create device: No such file or directory"
[E][000038038.861358][core.c:71 core_event_error()] core 0x562764797310: proxy 0x5627647e4440 id:4: seq:4 res:-2 (No such file or directory) msg:"can't create device: No such file or directory"
[E][000038038.861388][media-session.c:1629 core_error()] error id:4 seq:4 res:-2 (No such file or directory): can't create device: No such file or directory

Let me know how else I can help. (Going to try building chromium-ozone, and check if I need to update my patched firefox).

@danshick
Collaborator

danshick commented Apr 16, 2020

Not worried about the pipewire errors. Those are missing dependencies unrelated to the functionality we're using (bluetooth and audio stuff).
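To confirm which plugin files PipeWire failed to open, the `open_plugin()` lines can be pulled out of a saved log. This is a sketch only: the function name `missing_spa_plugins` is hypothetical, and the pattern assumes the exact message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: list the SPA plugin paths that PipeWire reported it
# could not load, based on the "can't load <path>.so:" messages in a log.
missing_spa_plugins() {
    # Capture the path between "can't load " and the first colon.
    sed -nE "s/.*open_plugin\(\)\] can't load ([^:]+\.so):.*/\1/p" "$1" | sort -u
}

# Usage (not executed here):
#   missing_spa_plugins pipewire.log
```

In the log above this would list the bluez5 and jack plugins, which matches the bluetooth and audio explanation.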

The stacktrace is more detailed, thanks. Still mostly in the linked pipewire SPA code, which makes this tricky.

Let me upgrade everything myself and see if I can replicate this issue.

Edit:

I actually think there are bigger issues with Chromium at the moment. I've had all of my luck with Firefox.

@danshick
Collaborator

I just noticed @jameswalmsley, you're using Firefox Nightly. Where are you getting it (what distro and package)?

@jameswalmsley
Author

Hi @danshick @soyuka

I'm using the flatpak built nightly with built-in support from:
https://firefox-flatpak.mojefedora.cz/

Also I'm on Ubuntu 20.04, but most of the packages in play here are self compiled (AFAIK).

Just did a flatpak update to the latest and get the same result.

Best

James

@jameswalmsley
Author

FYI, I'm using master branch builds of wayland, xwayland and wayland protocols etc.
(as of today).

@danshick
Collaborator

Okay, I've never tested a version of Firefox with these specific patches. I can't promise I'll be able to reproduce this soon, but I'll see what I can do. Flatpak really complicates things.

@emersion
Owner

IMHO, this crash should be reported to PipeWire. If PipeWire puts itself into a state where an event loop dispatch crashes it, it's a PipeWire bug.

@danshick
Collaborator

There definitely is a pipewire bug out there, but I'm not convinced we are behaving correctly either with regards to ways in which sessions can end.

I can even get pipewire into a state where restarting xdpw doesn't fix anything, so it is definitely an issue on their side. I just can't find a reliable way to reproduce it, nor have I seen a smoking gun in the pipewire debug logs.

@jameswalmsley
Author

@danshick Do you know of a binary build for Debian-based systems that I could download and try, just to take flatpak out of the picture?

Otherwise I can look into building firefox directly.

@soyuka

soyuka commented Apr 17, 2020

You should be able to make something work using the aur source https://aur.tuna.tsinghua.edu.cn/packages/fedora-firefox-wayland-bin/. There's a rpm listed there and some changes can be followed in the PKGBUILD (see prepare section). Hope this helps.

danshick added the "invalid" label (This doesn't seem right) on Apr 29, 2020
@danshick
Collaborator

I'm going to close this and mark as invalid. Everything I can test at the moment is working. @jameswalmsley, if you're still having issues, feel free to hit me up at #sway on freenode. I'll be happy to troubleshoot with you.

@jameswalmsley
Author

Hi @danshick many thanks.
I've tried with newer nightlies and your latest code, but now I just get a black screen.

Sorry I've been a bit busy.

I will try setting things up again, and try to find you on freenode if I have issues.

Many thanks for the great work and effort.

J

@danshick
Collaborator

@jameswalmsley
Build our latest, but also be sure to build pipewire master. Wim just helped us fix a segfault in pipewire this morning. Check out the updated documentation as well. It is much more comprehensive now and includes more detailed steps for getting things working. You may need to set XDG_SESSION_TYPE=wayland if it isn't. And I think we've discussed the pixelformat flag workaround already, but make sure you're doing that too. It's all in the compatibility guide in the wiki. Let me know how it goes.
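The session-type check mentioned above could be scripted before launching xdpw. This is a sketch under assumptions: the function name `ensure_wayland_session` is hypothetical, and whether exporting the variable in the launching shell is enough for a given setup is not confirmed here, so the compatibility guide in the wiki remains the reference.

```shell
#!/bin/sh
# Hypothetical pre-flight check: make sure XDG_SESSION_TYPE is "wayland"
# in the environment that will start xdg-desktop-portal-wlr.
ensure_wayland_session() {
    if [ "${XDG_SESSION_TYPE:-}" != "wayland" ]; then
        echo "XDG_SESSION_TYPE is '${XDG_SESSION_TYPE:-unset}', setting it to wayland" >&2
        export XDG_SESSION_TYPE=wayland
    fi
}

# Usage (not executed here):
#   ensure_wayland_session
#   xdg-desktop-portal-wlr -p BGRx -l TRACE
```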
