Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Widget and window reference cleanup #2066

Merged
merged 23 commits into from Oct 18, 2023

Conversation

samschott
Copy link
Member

@samschott samschott commented Aug 6, 2023

Fixes #1215.

This is an alternative solution to #2061 which removes leftover references to a window and its widgets after the window is closed by removing the window's widgets from the toga.App.widgets registry on close.

This is based on changes from #2058, diff against those here.

PR Checklist:

  • All new features have been tested
  • All new features have been documented
  • I have read the CONTRIBUTING.md file
  • I will abide by the code of conduct

@samschott samschott marked this pull request as draft August 6, 2023 19:05
@samschott samschott force-pushed the ownership-cleanup-samschott branch 2 times, most recently from 311431c to 32e1b3d Compare August 6, 2023 19:16
Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming this resolves the window memory issue for macOS in my testing.

However, it's a weird partial fix on GTK - the garbage collection seems to be consistently 1 window behind (i.e., creating and closing the second window causes the first window to be deleted).

It doesn't appear to have any impact on Windows - no Window instances are deleted.

The CI tests also reveal a deeper problem - segfaults on iOS and GTK (The Android segfault is present in #2058). From what I can make out, this is caused by resetting Window.app to None; if I comment that line out of this diff, the macOS cleanup still occurs, but I don't get the segfault. IIRC, windows and widgets are created in an app context, so you can't "unassign" them from an app; however, that's a moot point, because the app is a singleton so there's never any other app you could assign it to. Assigning to "None" could also have some interesting effects.

Although you've presented this as an alternate fix to #1215, it's pretty clearly a necessary change regardless of whether it's a full fix for #1215 - having stray (dead) widgets referenced in app.widgets is clearly a bad idea. I'd suggest dropping the app=None change, .

The only other approach that would make sense in this context would be to make the references held by the registry weak; that would potentially allow some external object to retain a reference to a widget that was attached to a window, and not have that widget disappear from the registry when the window is destroyed. I'm not completely convinced that's worth the effort, though.

@samschott
Copy link
Member Author

samschott commented Aug 7, 2023

The CI tests also reveal a deeper problem - segfaults on iOS and GTK (The Android segfault is present in #2058). From what I can make out, this is caused by resetting Window.app to None; if I comment that line out of this diff, the macOS cleanup still occurs, but I don't get the segfault.

That's strange. IIUC, setting app = None should be a no-op for macOS at the implementation layer:

def set_app(self, app):
pass

The reason why I do want this change: Without it, when replacing the content of a window, the old content and all of its widgets will also remain in the App's widget registry. This is a test case which I actually forgot to add. Added this now in 8bf3bb4.

However, it's a weird partial fix on GTK - the garbage collection seems to be consistently 1 window behind (i.e., creating and closing the second window causes the first window to be deleted).

Hm, that could be an issue with the timing of the GC and when exactly the counter label is updated in the UI. Or there is some additional reference on GTK which only get deleted once a new window is created.

It doesn't appear to have any impact on Windows - no Window instances are deleted.

That is strange. I've never really looked at the Windows backend, maybe its finally time to do so. I'm assuming that #2061 addresses both the Windows and Gtk as well then?

@samschott samschott force-pushed the ownership-cleanup-samschott branch 2 times, most recently from eeb0675 to 9b3f80d Compare August 7, 2023 22:34
@freakboy3742
Copy link
Member

The CI tests also reveal a deeper problem - segfaults on iOS and GTK (The Android segfault is present in #2058). From what I can make out, this is caused by resetting Window.app to None; if I comment that line out of this diff, the macOS cleanup still occurs, but I don't get the segfault.

That's strange. IIUC, setting app = None should be a no-op for macOS at the implementation layer:

Yes - but macOS works fine; It's iOS and GTK that are the problem. iOS also has set_app as a no-op, so I'm not sure what's going on there; but GTK isn't a no-op.

The reason why I do want this change: Without it, when replacing the content of a window, the old content and all of its widgets will also remain in the App's widget registry. This is a test case which I actually forgot to add. Added this now in 8bf3bb4.

Oh - no disagreement that this (or something like it) is desirable - although, as I flagged in my last comment, there's an edge case:

  1. Create widget
  2. Store reference to widget
  3. Add widget to window
  4. Destroy window

(1) Adds the widget to the app registry; (3) adds the widget the the window registry; but (4) removes the widget from both registries. The window registry is being destroyed, so that makes sense - but it could be argued that the app registry reference should be preserved because the widget still exists in the app - it's just not on a window.

One solution to this would be to replace the internals of the registry with a WeakValueDictionary;

However, it's a weird partial fix on GTK - the garbage collection seems to be consistently 1 window behind (i.e., creating and closing the second window causes the first window to be deleted).

Hm, that could be an issue with the timing of the GC and when exactly the counter label is updated in the UI. Or there is some additional reference on GTK which only get deleted once a new window is created.

I can't say I know; my guess is that there might be something internal to PyGObject that isn't being released immediately - but that's a wild stab in the dark guess.

That is strange. I've never really looked at the Windows backend, maybe it's finally time to do so. I'm assuming that #2061 addresses both the Windows and Gtk as well then

Yes, it does.

@samschott
Copy link
Member Author

The window registry is being destroyed, so that makes sense - but it could be argued that the app registry reference should be preserved because the widget still exists in the app - it's just not on a window.

One solution to this would be to replace the internals of the registry with a WeakValueDictionary;

The reason why I like removing widgets from the app registry after they are removed from the window: The app' widget registry can be used as a way to fetch all widget's which are in some way reachable in the UI by the end user.

That being said, I do get your point and would not mind a WeakValueDictionary either. I just suspect that it won't fix the segfaults. I would bet that they are not caused by unassigning the app from the window's content, but by no longer having references to the widgets themselves that keep them alive. Let me try that.

I'll also rebase this CL to main, I get horribly confusing about existing and new test failures otherwise.

@freakboy3742
Copy link
Member

The window registry is being destroyed, so that makes sense - but it could be argued that the app registry reference should be preserved because the widget still exists in the app - it's just not on a window.

One solution to this would be to replace the internals of the registry with a WeakValueDictionary;

The reason why I like removing widgets from the app registry after they are removed from the window: The app' widget registry can be used as a way to fetch all widget's which are in some way reachable in the UI by the end user.

I guess that's true; however, at the other end of the lifecycle, you can create a widget, assign it to an app and not add it to a window, and it will still be in the registry. That doesn't happen in practice, because common usage is to add the content to the main window, and app/window is assigned transitively at essentially the same time; but it is technically possible.

@samschott
Copy link
Member Author

samschott commented Aug 12, 2023

I've rebased this to main and added a single test for the memory leak. This should provide a cleaner picture of test failures and changes. My current understanding:

  1. This does fix memory leaks on macOS and Gtk.
  2. It does not fix memory leaks on Windows because of cyclical references at the CLR level. I've removed the unnecessary ones, but the remaining issues are with bound instance methods which are used as callbacks set on native widgets / windows. Removing those breaks existing functionality but it does fix the GC problems.
  3. The segfault on Gtk still needs to be fixed. This is likely due to widget references no longer staying on forever.

Regarding changing the Windows and Widget registries to weak refs, your reasoning makes sense to me. Done.

@samschott
Copy link
Member Author

FYI, the off-by-one refcount for Gtk in the manual test case appears to be a timing issue. Adding an await asyncio.sleep(1) before the updating the refcount in the UI "fixes" it. I suspect some delay for dereferencing on the Gtk side.

@samschott
Copy link
Member Author

The Gtk segfault is definitely particular to the WebView tests only, but which one of the WebView tests crashes seems to be random. I suspect a timing issue around a callback that is failing because the Python object was already garbage collected. However, I am struggling to reproduce the segfault outside of the testbed app.

@freakboy3742, any ideas as to what might be causing it? Might be in any way related to the tests running in a thread?

@freakboy3742
Copy link
Member

@freakboy3742, any ideas as to what might be causing it? Might be in any way related to the tests running in a thread?

I can confirm I can reproduce the problem; but the cause is a mystery to me.

My best guess is that because the widget is being created as an async fixture, there's something cross-thread that is getting tied in knots about who owns the widget when it comes to disposal.

@samschott samschott force-pushed the ownership-cleanup-samschott branch 3 times, most recently from 559ba71 to d60f839 Compare August 16, 2023 20:41
@samschott
Copy link
Member Author

I've managed to work around the Gtk segfaults, but its very hacky. The workaround forces garbage collection of the widget after each test case. As I've described in the in-line comment, I suspect the underlying issue has something to do with multiple WebViews sharing the same resources and deallocation of one interfering with another in an unsafe manner when the WebViews are not created in the main thread. This is mostly speculation, but based on failing to reproduce the segfaults in a non-test setup.

I've also marked the memory leak test as xFail on Windows. I'd rather not try to fix it in this PR, especially since my knowledge of pythonnet is very limited.

I'll mark the PR as ready for review now, but would understand if you have concerns regarding the Gtk workaround.

@samschott samschott marked this pull request as ready for review August 16, 2023 20:49
Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rebased this against #2058 to get back to a clean diff.

The GTK hack isn't anywhere near as ugly as I was expecting, given your comment about it :-) I won't deny that I'd rather not need it... but in the grand scheme of complications I've had with GTK, this seems like a minor inconvenience, rather than a major concern.

The rest of the patch makes a lot of sense, and seems remarkably unintrusive, given the nature of the problem. I also prefer this approach to #2061.

The only concern I have isn't strictly with this PR, but with the side effect it makes more stark: that once closed, a window is effectively dead, and it can't be re-used.

This has always been mostly true; however, this current state of this PR makes macOS explosive if you try to show() a window that has previously been close()d, and definitely removes any possibility that it might be supported API (since it destroys the content of the window).

I've added a documentation note to this effect; but I wonder if other protections are in order. As it currently stands, it's possible for my app to retain a reference to Window, but there's no way to establish whether that reference is still live. I'm wondering if there is a need to add a closed attribute that is False by default, and set True once close() is invoked. As an extra layer of protection, we could raise an exception if any attribute is accessed once the window has been closed - but I'm not sure whether that's worth it.

@freakboy3742 freakboy3742 force-pushed the ownership-cleanup-samschott branch 2 times, most recently from 1d8b12b to 1ca8a85 Compare August 16, 2023 23:00
@samschott
Copy link
Member Author

I see your concern with the close semantics. I think it is definitely worth having a closed attribute, if not explicit exceptions.

An alternative which I have briefly considered: Actually allow closed windows to be shown again and keep them associated with the app. However, this does not seem to work at all on Winforms.

For macOS, the default behaviour is indeed to release windows on close, making it possible to show() them again. However, this can be overridden by setting isReleasedWhenClosed = False. Gtk, IIUC, keeps an internal reference to closed windows, the default behaviour here seems that it its ok to show them again. Winforms however seems to have no option to allow this. From their documentation, it appears that a "form" is always disposed of when closed, except for dialogs.

self.app.windows -= self
for widget in self.widgets:
self.app.widgets.remove(widget.id)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the widget registry is weak, the intention is to keep widgets associated with the app for their entire life. In this case, we no longer need to explicitly remove widgets from the registry on window close, right? I'll remove this part again.

@samschott samschott force-pushed the ownership-cleanup-samschott branch 2 times, most recently from dfbc5f0 to 5d33ad0 Compare August 17, 2023 14:34
@samschott samschott force-pushed the ownership-cleanup-samschott branch 2 times, most recently from ea6a93f to 95039c4 Compare August 18, 2023 14:22
This makes it clear that we don't collect the WebView from the current test run (which is still referenced by fixtures) but from the previous run.
@samschott samschott force-pushed the ownership-cleanup-samschott branch 3 times, most recently from c23d77f to e87562c Compare August 19, 2023 00:02
@mhsmith mhsmith changed the title Widget and window refernce cleanup Widget and window reference cleanup Oct 9, 2023
@freakboy3742 freakboy3742 merged commit 18137ad into beeware:main Oct 18, 2023
41 checks passed
@samschott samschott deleted the ownership-cleanup-samschott branch March 1, 2024 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Windows are not released when main reference is deleted
2 participants