Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of file resources #3577

Open
wants to merge 4 commits into
base: master
from

Conversation

@jdufresne
Copy link
Contributor

commented Jan 13, 2019

Better follow Python's file object semantics. User code is responsible for closing resources (usually through a context manager) in a deterministic way.

To achieve this, remove __del__ functions. These functions used to close open file handlers in an attempt to silence Python ResourceWarnings. However, using __del__ has the following drawbacks:

  • __del__ isn't called until the object's reference count reaches 0. Therefore, resource handlers remain open or in use longer than necessary.

  • The __del__ method isn't guaranteed to execute on system exit. See the Python documentation:

    https://docs.python.org/3/reference/datamodel.html#object.__del__

    It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.

  • Exceptions that occur inside __del__ are ignored instead of raised. This has the potential of hiding bugs. This is also in the Python documentation:

    Warning: Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead.

Instead, always close resource handlers when they are no longer in use. This will close the file handler at a specified point in the user's code and not wait until the interpreter chooses to. It is always guaranteed to run. And, if an exception occurs while closing the file handler, the bug will not be ignored.

Now, when code receives a ResourceWarning, it will highlight an area that is mishandling resources. It should not simply be silenced, but fixed by closing resources with a context manager.

All warnings that were emitted during tests have been cleaned up. To enable warnings, I passed the -Wa CLI option to Python. This exposed some mishandling of resources in ImageFile.__init__() and
SpiderImagePlugin.loadImageSeries(), they too were fixed.

@radarhere

This comment has been minimized.

Copy link
Member

commented Jan 17, 2019

See also #2818

@jdufresne

This comment has been minimized.

Copy link
Contributor Author

commented Jan 17, 2019

Would you like me to add a "Fixes: ..." to the commit message?

@radarhere

This comment has been minimized.

Copy link
Member

commented Jan 17, 2019

No, that's not necessary.

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch 2 times, most recently from 08851cc to 5b232c2 Jan 27, 2019

@jdufresne

This comment has been minimized.

Copy link
Contributor Author

commented Jan 27, 2019

Any idea why Travis fails immediately with the "canceled" state?

@hugovk

This comment has been minimized.

Copy link
Member

commented Jan 27, 2019

We have auto-cancellation turned on:

Auto Cancellation allows you to only run builds for the latest commits in the queue. This setting can be applied to builds for Branch builds and Pull Request builds separately. Builds will only be canceled if they are waiting to run, allowing for any running jobs to finish.

https://travis-ci.org/python-pillow/Pillow/settings

I guess the grey ones were cancelled because they hadn't started building yet and a newer build was queued:

image

https://travis-ci.org/python-pillow/Pillow/pull_requests

@radarhere radarhere referenced this pull request Feb 1, 2019

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch 2 times, most recently from 53fc47b to f5ef94d Feb 1, 2019

@radarhere radarhere added Needs Rebase and removed Needs Rebase labels Feb 13, 2019

@hugovk
hugovk approved these changes Mar 17, 2019

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch from 6464c14 to e93146a Mar 17, 2019

@jdufresne

This comment has been minimized.

Copy link
Contributor Author

commented Mar 17, 2019

Rebased 🙂

@radarhere radarhere removed the Needs Rebase label Mar 17, 2019

@radarhere

This comment has been minimized.

Copy link
Member

commented Mar 18, 2019

#2019 is a memory leak issue that I believe should be currently fixed on master. My concern with this PR is that it would trigger those memory leaks, deliberately causing a reported issue. However, I imagine that you would hold that this is a problem with the end code, rather than with Pillow. Let me know if I've misunderstood.

@jdufresne

This comment has been minimized.

Copy link
Contributor Author

commented Mar 18, 2019

I'm not sure I fully follow how this would produce a memory leak. Sorry, the thread of that issue is a bit hard for me to know exactly where the issue will be. Are you able to provide sample code that you believe will now result in a memory leak? Or is this already captured in a test?

Is it this?

s = 0
for i in range(len(onlyfiles)):
    fname = onlyfiles[i]
    image = PIL.Image.open(fname)
    # Do something
    s += image.size[0]*image.size[1]

If so, then yeah. I would argue that the user isn't using this interface correctly and should instead do the following to close the resource after finished in a deterministic way.

s = 0
for i in range(len(onlyfiles)):
    fname = onlyfiles[i]
    with PIL.Image.open(fname) as image:
        # Do something
        s += image.size[0]*image.size[1]

If that is still memory leak then I can take a deeper look.

If you think this is too much of a backwards incompatible change, maybe it would make sense to wait until the next major release to include this.

@radarhere

This comment has been minimized.

Copy link
Member

commented Mar 18, 2019

Yes, I'm just discussing the first code sample you provided, and agree that the context manager should resolve the situation.

The next release is actually a major release, so if everyone is fine with this PR, then go for it, I just wanted to make sure that people were aware of this consequence.

@hugovk

This comment has been minimized.

Copy link
Member

commented Mar 19, 2019

Let's first confirm that it really is the case that memory leaks.

And if so, we'll need to communicate clearly the recommended way of doing things (we need to update the docs anyway).

Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7).

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch from e93146a to 7139f6d Mar 22, 2019

@jdufresne

This comment has been minimized.

Copy link
Contributor Author

commented Mar 22, 2019

Let's first confirm that it really is the case that memory leaks.

After rereading that full linked thread, it looks like it was concluded that the leak occurred in Python and has since been fixed. Am I understanding this right? If I understand, then no new action needs to be taken for this PR w/r/t a memory leak. Do you agree?

And if so, we'll need to communicate clearly the recommended way of doing things (we need to update the docs anyway).

Good call. I have updated the docs in the latest revision. This includes many changes to open_files.rst, a paragraph in the release notes, and some updates in the turorial to promote best practices.

Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7).

I'm open to this, no preference. Please let me know what you'd like and I'll adjust the PR.

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch 2 times, most recently from 8f30323 to 79b5efd Mar 23, 2019

@radarhere radarhere force-pushed the jdufresne:warn-close branch from 79b5efd to fea17fc Mar 30, 2019

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch 2 times, most recently from 21ce529 to 3edc5f7 May 25, 2019

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch from 3edc5f7 to 89d6623 Jun 11, 2019

@hugovk

This comment has been minimized.

Copy link
Member

commented Jun 25, 2019

Hi, sorry for the delay on this. This obviously didn't make the last Pillow 6.0.0 release.

Would it help to announce the change ahead of making it? If so, a major release is also planned for 2020-01-01 (dropping Python 2.7).

I'm open to this, no preference. Please let me know what you'd like and I'll adjust the PR.

Let's go for this.

Please could you add something to the deprecations and removals and any other relevant docs? Can we add some deprecation warnings in code too?

One thing, that page says:

Deprecated features are only removed in major releases after an appropriate period of deprecation has passed.

If we deprecate now, it's 6 months until Pillow 6.0.0. Is that long enough? I estimate Pillow 7.0.0 will be when Python 3.5 is removed (it's EOL on 2020-09-13) in the 2020-10-01 release.


PS I guess most of the merge conflicts are from formatting with Black, so many can probably be resolved by running Black on this branch.

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch 4 times, most recently from 8f2948f to a6c64ea Jun 26, 2019

@jdufresne

This comment has been minimized.

Copy link
Contributor Author

commented Jun 26, 2019

Great! I have rebased and fixed all newly discovered warnings. I updated the docs given that 6.0.0 has been released.


Can we add some deprecation warnings in code too?

After thinking about this some more, what do you see a deprecation warning providing that isn't also provided by Python 3's resource warning? I guess this isn't a true backwards incompatible change in the sense that the resultant behavior is the same (CPython closes the file when the reference count hits 0) except it now emits a resource warning (which would otherwise be a deprecation warning). So the end result is the same, just it is a resource warning instead of a deprecation warning. WDYT?

@hugovk

This comment has been minimized.

Copy link
Member

commented Jun 28, 2019

Oops, I made some typos! This is what I meant to write (in bold):

If we deprecate now, it's 6 months until Pillow 6.0.0 7.0.0. Is that long enough? I estimate Pillow 7.0.0 8.0.0 will be when Python 3.5 is removed (it's EOL on 2020-09-13) in the 2020-10-01 release.

Let's make a new PR to put the text in the deprecations section saying this is deprecated in 6.1.0, and will be removed in 7.0.0. We merge that new PR for Monday's release.

And then this PR can be merged during Q4.

Sorry for the mixup!


Let's just go for the deprecations in docs. Thank you!

@hugovk

This comment has been minimized.

Copy link
Member

commented Jul 1, 2019

Let's make a new PR to put the text in the deprecations section saying this is deprecated in 6.1.0, and will be removed in 7.0.0. We merge that new PR for Monday's release.

Please see #3929.

@hugovk hugovk referenced this pull request Jul 1, 2019
22 of 22 tasks complete

@hugovk hugovk added this to the 7.0.0 milestone Jul 1, 2019

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch from 20731b0 to 037e2ab Jul 3, 2019

@radarhere radarhere force-pushed the jdufresne:warn-close branch 3 times, most recently from 89ebab6 to 96b71ab Jul 17, 2019

Improve handling of file resources
Follow Python's file object semantics. User code is responsible for
closing resources (usually through a context manager) in a deterministic
way.

To achieve this, remove __del__ functions. These functions used to
closed open file handlers in an attempt to silence Python
ResourceWarnings. However, using __del__ has the following drawbacks:

- __del__ isn't called until the object's reference count reaches 0.
  Therefore, resource handlers remain open or in use longer than
  necessary.

- The __del__ method isn't guaranteed to execute on system exit. See the
  Python documentation:

  https://docs.python.org/3/reference/datamodel.html#object.__del__

  > It is not guaranteed that __del__() methods are called for objects
  > that still exist when the interpreter exits.

- Exceptions that occur inside __del__ are ignored instead of raised.
  This has the potential of hiding bugs. This is also in the Python
  documentation:

  > Warning: Due to the precarious circumstances under which __del__()
  > methods are invoked, exceptions that occur during their execution
  > are ignored, and a warning is printed to sys.stderr instead.

Instead, always close resource handlers when they are no longer in use.
This will close the file handler at a specified point in the user's code
and not wait until the interpreter chooses to. It is always guaranteed
to run. And, if an exception occurs while closing the file handler, the
bug will not be ignored.

Now, when code receives a ResourceWarning, it will highlight an area
that is mishandling resources. It should not simply be silenced, but
fixed by closing resources with a context manager.

All warnings that were emitted during tests have been cleaned up. To
enable warnings, I passed the `-Wa` CLI option to Python. This exposed
some mishandling of resources in ImageFile.__init__() and
SpiderImagePlugin.loadImageSeries(), they too were fixed.

@jdufresne jdufresne force-pushed the jdufresne:warn-close branch from 96b71ab to 9df4e6d Aug 11, 2019

radarhere added 3 commits Sep 6, 2019
@hugovk

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

Any idea why centos_6_amd64 is failing?

@aclark4life aclark4life added this to New Issues in Pillow Sep 12, 2019

@radarhere

This comment has been minimized.

Copy link
Member

commented Sep 14, 2019

The centos failure is an intermittent problem that should be fixed by #4057

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants
You can’t perform that action at this time.