-
Notifications
You must be signed in to change notification settings - Fork 343
Freezes and eventual crash when copy pasting in Xwayland apps #2425
Comments
This seems to be very similar to swaywm/sway#4007 (possibly the same). |
Looks very similar indeed. I don't see anybody reporting crashes though so maybe there's an extra layer here. A lot of people commenting about QT apps - I use none. |
There are some backtraces in that thread as well, but the cause of the one here is a bit clearer:
The selection has no associated Xwayland instance. I assume this is because when you killed Xwayland, wlroots cleaned it up and freed all associated memory, but didn't unregister the fd from the event loop. It becomes "readable" again (but with 0 bytes to read, oh well), tries accessing the now-freed Moving this over to wlroots, since the crash happens there. |
Fixes swaywm#2425. wlroots can only handle one outgoing transfer at a time, so it keeps a list of pending selections. The head of the list is the currently-active selection, and when that transfer completes and is destroyed, the next one is started. The trouble is when you have a transfer to some app that is misbehaving. fcitx is one such application. With really large transfers, fcitx will hang and never wake up again. So, you can end up with a transfer list that looks like this: | swaywm#1: started | swaywm#2: pending | swaywm#3: pending | swaywm#4: pending | The file descriptor for transfer swaywm#1 is registered in libwayland's epoll loop. The rest are waiting in wlroots' list. As a user, you want your clipboard back, so you `pkill fcitx`. Now Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean up swaywm#4 first. Due to a bug in wlroots code, we register the (fd, transfer data pointer) pair for swaywm#1 with libwayland *again*, despite it already being registered. We do this 2 more times as we remove swaywm#3 and swaywm#2. Finally, we remove swaywm#1 and `free` all the memory associated with it, before `close`-ing its transfer file descriptor. However, we still have 3 copies of swaywm#1's file descriptor left in the epoll loop, since we erroneously added them as part of removing swaywm#2/3/4. When we `close` the file descriptor as part of swaywm#1's teardown, we actually cause the epoll loop to wake up the next time around, saying "this file descriptor has activity!" (it was closed, so `read`-ing would normally return 0 to let us know of EOF). But instead of returning 0, it returns -1 with `EBADF`, because the file descriptor has already been closed. And finally, as part of error-handling this, we access the transfer pointer, which was `free`'d. And we crash.
Fixes swaywm#2425. wlroots can only handle one outgoing transfer at a time, so it keeps a list of pending selections. The head of the list is the currently-active selection, and when that transfer completes and is destroyed, the next one is started. The trouble is when you have a transfer to some app that is misbehaving. fcitx is one such application. With really large transfers, fcitx will hang and never wake up again. So, you can end up with a transfer list that looks like this: | T1: started | T2: pending | T3: pending | T4: pending | The file descriptor for transfer T1 is registered in libwayland's epoll loop. The rest are waiting in wlroots' list. As a user, you want your clipboard back, so you `pkill fcitx`. Now Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean up T4 first. Due to a bug in wlroots code, we register the (fd, transfer data pointer) pair for T1 with libwayland *again*, despite it already being registered. We do this 2 more times as we remove T3 and T2. Finally, we remove T1 and `free` all the memory associated with it, before `close`-ing its transfer file descriptor. However, we still have 3 copies of T1's file descriptor left in the epoll loop, since we erroneously added them as part of removing T2/3/4. When we `close` the file descriptor as part of T1's teardown, we actually cause the epoll loop to wake up the next time around, saying "this file descriptor has activity!" (it was closed, so `read`-ing would normally return 0 to let us know of EOF). But instead of returning 0, it returns -1 with `EBADF`, because the file descriptor has already been closed. And finally, as part of error-handling this, we access the transfer pointer, which was `free`'d. And we crash.
Fixes swaywm#2425. wlroots can only handle one outgoing transfer at a time, so it keeps a list of pending selections. The head of the list is the currently-active selection, and when that transfer completes and is destroyed, the next one is started. The trouble is when you have a transfer to some app that is misbehaving. fcitx is one such application. With really large transfers, fcitx will hang and never wake up again. So, you can end up with a transfer list that looks like this: | T1: started | T2: pending | T3: pending | T4: pending | The file descriptor for transfer T1 is registered in libwayland's epoll loop. The rest are waiting in wlroots' list. As a user, you want your clipboard back, so you `pkill fcitx`. Now Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean up T4 first. Due to a bug in wlroots code, we register the (fd, transfer data pointer) pair for T1 with libwayland *again*, despite it already being registered. We do this 2 more times as we remove T3 and T2. Finally, we remove T1 and `free` all the memory associated with it, before `close`-ing its transfer file descriptor. However, we still have 3 copies of T1's file descriptor left in the epoll loop, since we erroneously added them as part of removing T2/3/4. When we `close` the file descriptor as part of T1's teardown, we actually cause the epoll loop to wake up the next time around, saying "this file descriptor has activity!" (it was closed, so `read`-ing would normally return 0 to let us know of EOF). But instead of returning 0, it returns -1 with `EBADF`, because the file descriptor has already been closed. And finally, as part of error-handling this, we access the transfer pointer, which was `free`'d. And we crash.
Fixes swaywm#2425. wlroots can only handle one outgoing transfer at a time, so it keeps a list of pending selections. The head of the list is the currently-active selection, and when that transfer completes and is destroyed, the next one is started. The trouble is when you have a transfer to some app that is misbehaving. fcitx is one such application. With really large transfers, fcitx will hang and never wake up again. So, you can end up with a transfer list that looks like this: | T1: started | T2: pending | T3: pending | T4: pending | The file descriptor for transfer T1 is registered in libwayland's epoll loop. The rest are waiting in wlroots' list. As a user, you want your clipboard back, so you `pkill fcitx`. Now Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean up T4 first. Due to a bug in wlroots code, we register the (fd, transfer data pointer) pair for T1 with libwayland *again*, despite it already being registered. We do this 2 more times as we remove T3 and T2. Finally, we remove T1 and `free` all the memory associated with it, before `close`-ing its transfer file descriptor. However, we still have 3 copies of T1's file descriptor left in the epoll loop, since we erroneously added them as part of removing T2/3/4. When we `close` the file descriptor as part of T1's teardown, we actually cause the epoll loop to wake up the next time around, saying "this file descriptor has activity!" (it was closed, so `read`-ing would normally return 0 to let us know of EOF). But instead of returning 0, it returns -1 with `EBADF`, because the file descriptor has already been closed. And finally, as part of error-handling this, we access the transfer pointer, which was `free`'d. And we crash.
Fixes #2425. wlroots can only handle one outgoing transfer at a time, so it keeps a list of pending selections. The head of the list is the currently-active selection, and when that transfer completes and is destroyed, the next one is started. The trouble is when you have a transfer to some app that is misbehaving. fcitx is one such application. With really large transfers, fcitx will hang and never wake up again. So, you can end up with a transfer list that looks like this: | T1: started | T2: pending | T3: pending | T4: pending | The file descriptor for transfer T1 is registered in libwayland's epoll loop. The rest are waiting in wlroots' list. As a user, you want your clipboard back, so you `pkill fcitx`. Now Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean up T4 first. Due to a bug in wlroots code, we register the (fd, transfer data pointer) pair for T1 with libwayland *again*, despite it already being registered. We do this 2 more times as we remove T3 and T2. Finally, we remove T1 and `free` all the memory associated with it, before `close`-ing its transfer file descriptor. However, we still have 3 copies of T1's file descriptor left in the epoll loop, since we erroneously added them as part of removing T2/3/4. When we `close` the file descriptor as part of T1's teardown, we actually cause the epoll loop to wake up the next time around, saying "this file descriptor has activity!" (it was closed, so `read`-ing would normally return 0 to let us know of EOF). But instead of returning 0, it returns -1 with `EBADF`, because the file descriptor has already been closed. And finally, as part of error-handling this, we access the transfer pointer, which was `free`'d. And we crash.
Great stuff, thank you 👍 |
Just to report back: I've had a couple of Xwayland copy/paste freezes followed by killing xwayland, and have had no crashes 👍 |
Fixes swaywm#2425. wlroots can only handle one outgoing transfer at a time, so it keeps a list of pending selections. The head of the list is the currently-active selection, and when that transfer completes and is destroyed, the next one is started. The trouble is when you have a transfer to some app that is misbehaving. fcitx is one such application. With really large transfers, fcitx will hang and never wake up again. So, you can end up with a transfer list that looks like this: | T1: started | T2: pending | T3: pending | T4: pending | The file descriptor for transfer T1 is registered in libwayland's epoll loop. The rest are waiting in wlroots' list. As a user, you want your clipboard back, so you `pkill fcitx`. Now Xwayland sends `XCB_DESTROY_NOTIFY` to let us know to give up. We clean up T4 first. Due to a bug in wlroots code, we register the (fd, transfer data pointer) pair for T1 with libwayland *again*, despite it already being registered. We do this 2 more times as we remove T3 and T2. Finally, we remove T1 and `free` all the memory associated with it, before `close`-ing its transfer file descriptor. However, we still have 3 copies of T1's file descriptor left in the epoll loop, since we erroneously added them as part of removing T2/3/4. When we `close` the file descriptor as part of T1's teardown, we actually cause the epoll loop to wake up the next time around, saying "this file descriptor has activity!" (it was closed, so `read`-ing would normally return 0 to let us know of EOF). But instead of returning 0, it returns -1 with `EBADF`, because the file descriptor has already been closed. And finally, as part of error-handling this, we access the transfer pointer, which was `free`'d. And we crash.
I've had another crash today, I've opened a different issue on the sway tracker as I don't know where the issue belongs and whether it's a regression of this earlier fix. New issue at https://github.com/swaywm/sway/issues/5852 |
Please fill out the following:
Sway Version: 1.5 / wlroots 0.11 / xwayland 1.20.8-2ubuntu2.4 (ubuntu 20.04)
Debug Log:
Nothing relevant on the debug logs:
Configuration File: https://github.com/luispabon/sway-dotfiles/tree/master/configs/sway
Stack Trace:
bt
This one is pretty hard to reproduce. It happens when I've been in a sway session for a couple of days - cannot be reliably reproduced on a brand new session.
The text was updated successfully, but these errors were encountered: