Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: do not try to help CPython with garbage collection #23712

Merged
merged 1 commit into from Aug 25, 2022

Conversation

tacaswell
Copy link
Member

@tacaswell tacaswell commented Aug 23, 2022

Matplotlib has a large number of circular references (between figure and
manager, between axes and figure, axes and artist, figure and canvas, and ...)
so when the user drops their last reference to a Figure (and clears it from
pyplot's state), the objects will not immediately deleted.

To account for this we have long (goes back to
e34a333 the "reorganize code" commit in 2004
which is the end of history for much of the code) had a gc.collect() in the
close logic in order to promptly clean up after our selves.

However, unconditionally calling gc.collect and be a major performance
issue (see #3044 and
#3045) because if there are a
large number of long-lived user objects Python will spend a lot of time
checking objects that are not going away are never going away.

Instead of doing a full collection we switched to clearing out the lowest two
generations. However this both not doing what we want (as most of our objects
will actually survive) and due to clearing out the first generation opened us
up to having unbounded memory usage.

In cases with a very tight loop between creating the figure and destroying
it (e.g. plt.figure(); plt.close()) the first generation will never grow
large enough for Python to consider running the collection on the higher
generations. This will lead to un-bounded memory usage as the long-lived
objects are never re-considered to look for reference cycles and hence are
never deleted because their reference counts will never go to zero.

closes #23701

I'm not sure how to test this, I do not want to put a maybe memory exhausting test in. There might be something that can be done using gc's logging / debugging / callback hooks.

Matplotlib has a large number of circular references (between figure and
manager, between axes and figure, axes and artist, figure and canvas, and ...)
so when the user drops their last reference to a `Figure` (and clears it from
pyplot's state), the objects will not immediately deleted.

To account for this we have long (goes back to
e34a333 the "reorganize code" commit in 2004
which is the end of history for much of the code) had a `gc.collect()` in the
close logic in order to promptly clean up after our selves.

However, unconditionally calling `gc.collect` and be a major performance
issue (see matplotlib#3044 and
matplotlib#3045) because if there are a
large number of long-lived user objects Python will spend a lot of time
checking objects that are not going away are never going away.

Instead of doing a full collection we switched to clearing out the lowest two
generations.  However this both not doing what we want (as most of our objects
will actually survive) and due to clearing out the first generation opened us
up to having unbounded memory usage.

In cases with a very tight loop between creating the figure and destroying
it (e.g. `plt.figure(); plt.close()`) the first generation will never grow
large enough for Python to consider running the collection on the higher
generations.  This will lead to un-bounded memory usage as the long-lived
objects are never re-considered to look for reference cycles and hence are
never deleted because their reference counts will never go to zero.

closes matplotlib#23701
@tacaswell tacaswell added this to the v3.5.4 milestone Aug 23, 2022
@tacaswell tacaswell marked this pull request as ready for review August 23, 2022 01:40
@tacaswell tacaswell requested a review from QuLogic August 23, 2022 02:45
@nschloe
Copy link
Contributor

nschloe commented Aug 23, 2022

This might also fix #22448.

@anntzer
Copy link
Contributor

anntzer commented Aug 23, 2022

Not to completely nerd-snipe this issue, but I had wondered some time ago (basically when thinking about this gc.collect call, although I certainly didn't go through all the analysis you did!) whether CPython could have an API like gc.try_collect_at_frame_exit(obj) which says, at the end of this frame, after all decrefs for frame locals have been done, check whether obj could be gc'ed (i.e. whether it is only kept alive by an isolated refcycle, and if so, gc that cycle).

@tacaswell
Copy link
Member Author

whether CPython could have an API like...

I'm also not sure. I do not think there is currently enough information to do that but am not clear on how much more you would have to track to get there. My instinct is that this would add more overhead in cases where it is not used than you would ever get back in the cases when it is used.

I think a Python exposed function for "run the full collection, but only if you would have otherwise" is what we really want here.

The other thing that just connected in my brain is that dictionaries (which I assume includes instance dictionaries) are only swept in the oldest generation (which is why our objects all survived the gc.collect(1).

I had a thought about being aggressive about calling fig.cla() when we destroy the manager, but I think that would break the workflow of

fig, ax = plt.subplots()
ax.plot(...)
plt.show(block=True)
fig.savefig(...)

Similarly, any way I can think of putting in weak references is going to break someone terribly.

@tacaswell tacaswell modified the milestones: v3.5.4, v3.6.0 Aug 25, 2022
@tacaswell tacaswell added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Aug 25, 2022
@anntzer
Copy link
Contributor

anntzer commented Aug 25, 2022

Perhaps we could call cla() (and recursively detach artists from the parent figure) when plt.close(fig) is called, but let pressing-on-the-X-to-close just unregister the manager and not bother to call cla()? I think a user that programatically closes the figure is more likely to be running a loop where the memleak may matter, whereas interactive users are more likely to be OK with a delayed GC?

@tacaswell
Copy link
Member Author

That is reasonable. It will be a bunch of work, but something like

  • 3.7
    • add a "and recursively destroy" flag to plt.close
    • in plt.closeif the flag is not set by the user mark the Figure with a fig._zombie flag
    • wrap all of the figure methods to warn if the zombie flag is set
  • 3.8 flip the flag to default to True and remove the zombie warnings

might be a deprecation path to that?

That said, it leaves a weird inconsistency between doing plt.close() and hitting the 'x' if you have done plt.ion() and are working interactively.

@QuLogic
Copy link
Member

QuLogic commented Aug 25, 2022

Doesn't the inline backend auto-close figures? I'm pretty sure many people use plt.savefig after that, so we can't really break that.

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably target 3.6 on this and not 3.5.4.

@greglucas greglucas merged commit 87b801b into matplotlib:main Aug 25, 2022
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Aug 25, 2022
@tacaswell tacaswell deleted the fix_no_gc_collect branch August 25, 2022 19:41
tacaswell added a commit that referenced this pull request Aug 25, 2022
…712-on-v3.6.x

Backport PR #23712 on branch v3.6.x (FIX: do not try to help CPython with garbage collection)
@QuLogic QuLogic mentioned this pull request Sep 9, 2022
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: plt.figure(), plt.close() leaks memory
6 participants