[CRASH BUG] Pop Shell adds override-redirect (OR) windows to the GNOME Shell Overview, leading to crashes #1251
Comments
Which child dialogs are you referring to? I'm not aware of creating or adding any windows to the overview anywhere. Pop Shell does not create any windows of its own. It's simply getting them from GNOME Shell.
Also, any help with how to tell whether a window is "override-redirect" would be great.
Hi Michael, thanks for taking a look at this!
The second screenshot above shows how the "Save/Exit" dialog from Text Editor is visible as a second window in the overview. This was mentioned by GNOME devs as being caused by Pop Shell. It might be an OR window (I don't know), and if so it would only bug out on GNOME 41 without the Pop Shell version check fix (without #1249). If it is an OR window, you would also need to be using the patched GNOME Shell to see the issue without getting crashes.
I agree. I don't know, so I asked the experts at GNOME here. Hopefully they'll have a good suggestion, since they know the code base inside and out: https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/4751#note_1319153
@mmstick Alright, I asked them for methods that work on both GNOME 3.38 and 40+. Sebastian Keller has given a fantastically detailed answer about how to solve both issues (OR windows, and child windows such as save dialogs): Detecting OR Windows:
Detecting child dialogs that shouldn't be separately added to overview:
I did not realize how different the two issues were. The OR windows, which should never be added, are easy to exclude with that single API above. But the child dialogs have multiple solutions depending on what you want to achieve. Should I split that issue off into a separate ticket?
PS: Another GNOME developer mentioned that https://discourse.gnome.org/ is usually the best place to ask about interacting with the GNOME Shell programmatically. The relevant category for extensions is https://discourse.gnome.org/c/desktop/6 with a tag of "extensions", and I've seen that they give great help there, if there are ever any other difficult questions about obscure parts of the API. They also have a Matrix chat room at https://gnome.element.io/#/room/#extensions:gnome.org for developers.
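As a rough illustration of one of the child-dialog options (not the exact code quoted by the GNOME developers), the sketch below skips dialogs that are attached to a parent window, since those already appear on that parent in the overview. `DialogWindow` and `makeDialogWindow` are hypothetical stand-ins so the snippet runs outside GNOME Shell; the only method modeled after the real `Meta.Window` is `is_attached_dialog()`.

```typescript
// Hypothetical minimal stand-in for the part of Meta.Window used here.
interface DialogWindow {
    title: string;
    // Modeled after Meta.Window.is_attached_dialog().
    is_attached_dialog(): boolean;
}

// True when the window should NOT get its own overview entry because
// it is already drawn attached to its parent window.
function isAttachedChildDialog(win: DialogWindow): boolean {
    return win.is_attached_dialog();
}

// Stub factory, for illustration only.
function makeDialogWindow(title: string, attached: boolean): DialogWindow {
    return { title, is_attached_dialog: () => attached };
}

const editorWindow = makeDialogWindow("Text Editor", false);
const saveDialogWindow = makeDialogWindow("Save changes?", true);

// Only the parent window survives the filter.
const dialogFilteredTitles = [editorWindow, saveDialogWindow]
    .filter((w) => !isAttachedChildDialog(w))
    .map((w) => w.title);
```

Whether attached dialogs are the only child windows worth excluding is a design choice that depends on what Pop Shell wants to show, so treat this as one candidate check rather than the full answer.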
Yeah I don't know what OR windows they are referring to. Pop Shell doesn't have any. So I guess it's just adding these types of windows to the ignore list.
@mmstick Yeah, the OR windows don't come from inside Pop Shell itself. Instead, the problem is that any time a running application creates an override-redirect window with a skip-taskbar flag, Pop Shell adds it to the overview. That's what your "Show Minimize to Tray Window" option is doing (it's adding skip-taskbar windows). The problem is that a good portion of those skip-taskbar windows are OR windows. One way to create one is with the crash reproducer script I linked to in the initial post. Another way is just to drag and drop a file anywhere, which creates a 1x1 OR window offscreen.

Since such windows don't make sense to add to the overview, GNOME Shell wasn't tracking them and wasn't creating app objects for them. So the crash flow is:

1. Pop Shell inadvertently adds the OR windows to the overview.
2. The user goes to the overview.
3. GNOME tries to retrieve the app object for each window and fails with `app is null` on the OR windows.

As mentioned, they are patching this on the GNOME Shell side to always track all windows no matter what, just to avoid these kinds of crashes. But Pop Shell needs to be revised a bit to not add OR windows at all, to avoid seeing those weird windows (the first embedded image above shows one example of an OR window which becomes visible after GNOME Shell has been patched to not crash from the issue). Pop Shell's fix would be to run an override-redirect check before adding skip-taskbar windows.

Anyway, there are two issues here. OR windows need to be excluded. And child windows need to be excluded. I didn't realize that they would need separate fixes. Should I split the issues into two tickets? :)
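The crash flow described in this comment can be modeled with a small toy. This is only a sketch of the mechanism, not GNOME Shell's actual code: `ToyTracker`, `OrWindow` and `buildToyOverview` are hypothetical names, and the only method modeled after the real `Meta.Window` is `is_override_redirect()`.

```typescript
// Hypothetical stand-in for the part of Meta.Window used here.
interface OrWindow {
    title: string;
    // Modeled after Meta.Window.is_override_redirect().
    is_override_redirect(): boolean;
}

// Toy tracker: like the unpatched shell described above, it never
// records an app for override-redirect windows.
class ToyTracker {
    private apps = new Map<OrWindow, string>();

    track(win: OrWindow, appName: string): void {
        if (win.is_override_redirect()) return; // OR windows are not tracked
        this.apps.set(win, appName);
    }

    getApp(win: OrWindow): string | null {
        const app = this.apps.get(win);
        return app === undefined ? null : app;
    }
}

// Toy overview: needs an app for every window it is handed.
function buildToyOverview(windows: OrWindow[], tracker: ToyTracker): string[] {
    return windows.map((win) => {
        const app = tracker.getApp(win);
        if (app === null) throw new Error("app is null");
        return `${app}: ${win.title}`;
    });
}

const toyTracker = new ToyTracker();
const terminalWindow: OrWindow = { title: "Terminal", is_override_redirect: () => false };
const dndProxyWindow: OrWindow = { title: "dnd proxy", is_override_redirect: () => true };
toyTracker.track(terminalWindow, "Terminal");
toyTracker.track(dndProxyWindow, "unknown");

// Handing the OR window to the overview reproduces the failure.
let unfilteredError = "";
try {
    buildToyOverview([terminalWindow, dndProxyWindow], toyTracker);
} catch (e) {
    unfilteredError = e instanceof Error ? e.message : String(e);
}

// Filtering OR windows out first avoids it.
const filteredOverview = buildToyOverview(
    [terminalWindow, dndProxyWindow].filter((w) => !w.is_override_redirect()),
    toyTracker,
);
```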
If by child windows you're referring to dialogs / transient windows, these have been ignored since day one.
@mmstick Yeah I am referring to those. Here's a Fedora 35 (GNOME 41.1) live environment with zero extensions enabled:

Then I built Pop Shell with the latest git MINUS the "4x" version check patch (merge request #1249) since that patch just covers up the actual issues by disabling skip-tb windows on GNOME 40+ (if the core problems are fixed, that whole "if 40+, disable the 'show skip-tb windows' feature" code block can safely be removed and the feature would work again). I then ran Pop Shell with default settings:

As is seen in the image, Pop Shell is adding the Text Editor's child dialog to the overview. This is something that needs code review, and the GNOME devs provided a few different ways to identify these child dialogs and fully remove them from Pop Shell. I've now also confirmed that the "save dialog" is not an OR window, by the way.

But since this child dialog issue is a separate problem, and isn't related to the OR windows, I'm wondering if I should make a separate ticket for the child dialogs? Since the bugs turned out to be unrelated to each other, it would be best to avoid mixing these two issues? If so, I'll edit all posts above to remove the references to the problematic child windows and move that information to a new ticket.
Hi @Bananaman - if you look at the version 4x check in extension.ts, I added a comment there explaining why the override is skipped on 4.x: https://github.com/pop-os/shell/blob/master/src/extension.ts#L2644-L2647. On GNOME 4x, WindowPreview.js _init() (line 124) does not have a null check. Upstream could add this null check for 4x only as a short-term fix, or, as they said, submit the MR for 4x and hopefully backport it as the long-term fix. Most users might not use the feature (some apps have minimize to tray, which still shows on the desktop but not in the Overview, and users would like those windows to be handled by tiling, which is why the feature is included), so I added a toggle to make sure there's a fallback. Thanks for the research, so the underlying term is an OR (override redirect) for a Meta.Window with skip_taskbar set to true :) but it is not an OR in all cases, as one of the upstream devs explained. So pop-shell needs to add the Meta.Window.is-override-redirect check and prevent GNOME 4x users from seeing a 1x1 icon.
One way to test whether the dialogs come from the same issue is to disable the Show Skip Taskbar option and see if they go away. It looks like upstream also explained how to check for attached dialogs. There is a lot to test for users of Android dev environments or JetBrains apps.
Adding random Currently without that MR this specifically means no OR windows and no windows that are not of these types, otherwise you will get But that's only to fix the stuck overview. There are other types of windows that are tracked, have the skip taskbar hint, but don't make much sense to include in the overview as separate windows, like attached dialogs, which already get shown on the window they are attached to.
Seeing the 1x1 pixel gtk DND IPC windows will only happen once/if the MR is merged. Currently it will result in a crash instead.
@sbstnk - thanks for replying and taking the time to explain the details.
My bad, and I agree with you, @Bananaman and Florian. I was actually trying to override it to add the null check. I am working on a PR. Thank you as well for the break-my-shell script, which was helpful.
@sbstnk - I would assume upstream still needs the MR for users other than the pop-shell extension (some users experience overview freezes even without pop-shell). Once the 1x1 windows appear, and if System76 still ships GNOME 4x or users have updated to 4x with that patch, I will check back on how it looks; an additional fix may be needed.
Hey @jmmaranan, thanks a lot for trying to solve these issues! I'm looking at your #1255 patches now and will comment there. :)
Yeah, there is a rare situation (nobody was able to find the exact condition) where vanilla GNOME without extensions can freeze itself because something other than NORMAL, DIALOG, UTILITY or MODAL_DIALOG was added to the overview. Removing the check for specific window types and tracking every window (except "OR") was the first patch of the GNOME Shell merge request. Later, we found that extensions can also cause crashes, and narrowed it down to override-redirect (OR) windows and Pop Shell. So the choice was made to also track OR windows to protect against extension crashes. This is why the 1x1 pixel windows appear in the overview with patched GNOME Shell and unpatched Pop Shell. The OR windows are most commonly used for popup menus and when people drag-and-drop files (which creates OR windows offscreen).

So, to sum it up, upstream will most likely merge the "track all windows except OR windows" patch to fix the vanilla GNOME crash, but may not merge the "track OR windows" patch since it's a bit dirty (I didn't understand the details of their discussion on the merge request, but it's something about it using "actors" when some other system would have been better).
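To make the window-type condition concrete, here is a minimal sketch of the rule as described in this thread for the unpatched shell: a window is only safe to hand to the overview if it is not override-redirect and its type is one of NORMAL, DIALOG, UTILITY or MODAL_DIALOG. The string type names and the `TypedWindow` interface are hypothetical stand-ins (the real `Meta.Window.get_window_type()` returns a `Meta.WindowType` enum value), so this is an assumption-laden model rather than GNOME Shell's code.

```typescript
// Hypothetical string stand-ins for a few Meta.WindowType values.
type OverviewWindowType =
    | "NORMAL" | "DIALOG" | "MODAL_DIALOG" | "UTILITY"
    | "POPUP_MENU" | "DND";

// Hypothetical stand-in for the parts of Meta.Window used here.
interface TypedWindow {
    title: string;
    is_override_redirect(): boolean;
    get_window_type(): OverviewWindowType;
}

// The four types named in the discussion above.
const TRACKED_TYPES: ReadonlySet<OverviewWindowType> = new Set<OverviewWindowType>([
    "NORMAL", "DIALOG", "MODAL_DIALOG", "UTILITY",
]);

// True when the unpatched shell (as described above) would track the window.
function isTrackedByUnpatchedShell(win: TypedWindow): boolean {
    return !win.is_override_redirect() && TRACKED_TYPES.has(win.get_window_type());
}

// Stub factory, for illustration only.
function makeTypedWindow(title: string, type: OverviewWindowType, or: boolean): TypedWindow {
    return {
        title,
        is_override_redirect: () => or,
        get_window_type: () => type,
    };
}

const typedSamples = [
    makeTypedWindow("Files", "NORMAL", false),
    makeTypedWindow("Tool palette", "UTILITY", false),
    makeTypedWindow("Context menu", "POPUP_MENU", true),
    makeTypedWindow("Drag proxy", "DND", true),
];

const trackedTitles = typedSamples
    .filter(isTrackedByUnpatchedShell)
    .map((w) => w.title);
```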
Related issues: #1233 and #1250
Partially related merge requests: #1249
The previous issue (1233) is too cluttered and doesn't describe the actual reason for the crashes.
This is the result of a few weeks of debugging this with the GNOME developers. They managed to figure out that Pop Shell is inadvertently adding override-redirect ("OR") windows to the GNOME Shell overview. Those are windows such as popup menus, and are not supposed to be in the overview. GNOME therefore doesn't track those windows and doesn't have any associated `app` object for them. This crashes the Activities Overview with `app is null` when it attempts to retrieve app info about the OR windows that Pop Shell has added.
GNOME has created some patches for GNOME 41 and decided to work around it by making GNOME Shell always track all windows, including OR windows, just to prevent crashes if extensions ever add such windows to the tracking. The result is that, if you run with GNOME Shell's latest patches (not merged yet, they're for GNOME 41 and will be backported), then you won't crash with Pop Shell anymore, but will see small 1x1 pixel windows floating in the overview instead.
It was also discovered that this only happens when the Pop Shell "Show Minimize to Tray Windows" option is ENABLED.
The theory is that when Pop Shell is adding "skip-taskbar" windows, it needs to also add a check to NOT add the window if it's an "OR window".
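A minimal sketch of that theory, assuming the decision can be reduced to two window properties plus the user setting. `TrayWindow`, `makeTrayWindow` and `shouldShowInOverview` are hypothetical names and not Pop Shell's actual code; `is_override_redirect()` and `is_skip_taskbar()` are modeled after the `Meta.Window` methods of the same names.

```typescript
// Hypothetical stand-in for the parts of Meta.Window used here.
interface TrayWindow {
    is_override_redirect(): boolean;
    is_skip_taskbar(): boolean;
}

// Decide whether a window may be added to the overview.
// showSkipTaskbar mirrors the "Show Minimize to Tray Windows" option.
function shouldShowInOverview(win: TrayWindow, showSkipTaskbar: boolean): boolean {
    // OR windows (popup menus, drag-and-drop proxies) are never added.
    if (win.is_override_redirect()) return false;
    // Skip-taskbar windows are only added when the user opted in.
    if (win.is_skip_taskbar()) return showSkipTaskbar;
    return true;
}

// Stub factory, for illustration only.
function makeTrayWindow(or: boolean, skipTaskbar: boolean): TrayWindow {
    return {
        is_override_redirect: () => or,
        is_skip_taskbar: () => skipTaskbar,
    };
}

const normalWindow = makeTrayWindow(false, false);
const trayWindow = makeTrayWindow(false, true);
const orSkipWindow = makeTrayWindow(true, true);
```

Checking override-redirect first means the option can stay enabled for genuine minimize-to-tray windows while OR windows are rejected regardless of the setting.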
Here is the GNOME Shell patch where they added more robust protection against this issue (link leads to a comment which demonstrates the tiny "OR" window that Pop Shell has added to the overview):
Here is a crash reproducer. It needs to be run after a fresh login to the session, on GNOME 41, with Pop Shell enabled (without the changes from #1249, which hide the issue but don't solve the core problem). It creates an OR window which Pop Shell will add to the Overview, which will then lead to a crash if you run unpatched GNOME Shell, or a floating 1x1 pixel window if you run the patched GNOME 41:
Here is a very long discussion where this and other crashes were being investigated, but I have extracted all important information so you don't have to read it:
Here are the most relevant, technical quotes describing how Pop Shell is inadvertently adding OR windows, and how it's also showing duplicate dialogs that are already shown in their parent windows:
Suggested fixes: