Unexpected memory consumption since release 8.3.0 #5797

Closed
helgeerbe opened this issue Oct 27, 2021 · 29 comments · Fixed by #5844

Comments

@helgeerbe

helgeerbe commented Oct 27, 2021

What did you do?

I'm the owner of picframe. It is a picture frame viewer for the Raspberry Pi, controlled via MQTT and automatically integrated as an MQTT device in Home Assistant.

In an endless loop this program displays one image on the frame, opens the next one and smoothly blends the new image in. PIL is used to extract the image data, so at any time only two images are kept in memory.

What did you expect to happen?

A stable memory consumption over a period of time. This was true up to and including release 8.2.0.
I'm logging the system load to an InfluxDB and found this behavior.
(Screenshot: InfluxDB graph of memory and CPU load, 2021-10-27 10:30)

Memory consumption is stable as expected, until suddenly the sawtooth line appears. (The two leftmost yellow high-CPU-load peaks are the nightly backups.)

What have I done?

  • apt full-upgrade
  • pulled all Python modules required by picframe to their latest releases
pi@picframe:~ $ python3 --version
Python 3.7.3
pi@picframe:~ $ picframe -v
INFO:start.py:starting ['/home/pi/.local/bin/picframe', '-v']
picframe version:  0+untagged.365.gfc64728

Checking required packages......
PIL :  8.4.0
exifread :  2.3.2
pi3d :  2.48
yaml :  6.0
paho.mqtt :  1.5.1
iptcinfo3 :  2.1.4
numpy :  1.21.2
ninepatch : installed, but no version info

Checking optional packages......
pyheif :  0.5.1

What actually happened?

Starting with release 8.3.0, memory consumption is strange. Picframe starts at 6% of total memory. While running, PIL permanently allocates memory, which is freed frequently (the sawtooth line), but eventually memory consumption reaches 100% and picframe crashes.

What are your OS, Python and Pillow versions?

  • OS: Raspbian
    Description: Raspbian GNU/Linux 10 (buster)
    Release: 10
    Codename: buster
    Linux picframe 5.10.63-v7+ 1459 SMP Wed Oct 6 16:41:10 BST 2021 armv7l GNU/Linux
  • Python: 3.7.2
  • Pillow: 8.3.0 - 8.4.0 (earlier releases behave as expected)

Issue in picframe

The corresponding issue for picframe is tracked here.

@radarhere
Member

Hi. Let me ask two questions

  1. Are you installing Pillow from a wheel? I'm curious whether compiling Pillow from source fixes the problem. Or in other words, does the problem lie in our code itself, or does it perhaps lie in our wheel and one of our packaged dependencies?

  2. Are you able to put together a simple self-contained Python script, using just Pillow, that demonstrates the increased memory usage?

@helgeerbe
Author

Hi @radarhere

  1. We use your provided packages. A simple pip3 uninstall Pillow and pip3 install Pillow==<version> is enough to change the behavior.
  2. I will check our code and try to write a simple example. I'm curious myself.

@helgeerbe
Author

helgeerbe commented Oct 29, 2021

Hi @radarhere

I hope I found the root cause in our code. We use the lib pi3d, and deep down in its code I see that a numpy array is created. Here is example code that shows this behavior.

from PIL import Image
import numpy as np

# endless loop
while(True) :
    np.array(Image.open("/home/pi/Pictures/Unu-2819.jpg"))

Using

  • Pillow 5.4.1 => CPU 100%, mem solid around 6%
  • Pillow 8.4.0 => CPU 100%, mem starting from 6% and rapidly growing to 100% and finally the process crashes.

@paddywwoof
Contributor

paddywwoof commented Oct 29, 2021

@helgeerbe are you saying that the garbage collector in Python can't cope with the creation of numpy arrays that use byte arrays generated by PIL.Image.open()? What happens if you create Python variables to "hang" the info on? i.e.

from PIL import Image
import numpy as np

while(True) :
    im = Image.open("/home/pi/Pictures/Unu-2819.jpg")
    np_im = np.array(im)
    # then with
    # im = None
    # np_im = None

pi3d does a little bit of manual 'destruction' of the OpenGL buffers as they slip through the python GC

PS and what happens if you comment out the np_im = .. line?

@helgeerbe
Author

I can run some further tests. Watching top in real time I can see that memory is freed, but memory is consumed faster than it is freed. That explains the sawtooth graph.
Switching Pillow back to an older version, memory seems to be freed immediately. Memory consumption stays between 6 and 8%. No growth, as expected.
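To watch this from inside the loop rather than in top, here is a rough sketch (Linux-only, reading /proc; the image path is just the one from the earlier snippet) that logs the resident set size each iteration:

import numpy as np
from PIL import Image

def rss_mib():
    # Read the resident set size of this process from /proc (Linux only).
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # value is reported in kB
    return 0.0

while True:
    np.array(Image.open("/home/pi/Pictures/Unu-2819.jpg"))
    print(f"RSS: {rss_mib():.1f} MiB")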

@paddywwoof
Contributor

When I try it here (on a Raspberry Pi; it's fine on Ubuntu x64 with lots of memory) I find the installed version of Pillow is often running Image.__del__() when I Ctrl-C to stop. That method was completely removed after 30 Oct 2019 (cc63f66#diff-4805c79264fea07df59058db82ed74bb2f5c5023e212ac678536a534c56e5be2), and it was wrapped in a Python-3-only block before that.

It's generally a bad idea to try to second-guess the GC, but maybe something is needed for small-memory computers.
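One possible stop-gap along those lines - only a sketch, and explicitly second-guessing the GC - is to force a collection inside the loop, which frees the cyclic garbage immediately at the cost of some CPU:

import gc
import numpy as np
from PIL import Image

while True:
    np_im = np.array(Image.open("/home/pi/Pictures/Unu-2819.jpg"))
    # ... use np_im ...
    gc.collect()  # collect cyclic garbage now rather than waiting for the GC thresholds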

@helgeerbe
Author

while(True) :
    Image.open("/home/pi/Pictures/Unu-2819.jpg")

Works fine. No memory consumption.

while(True) :
    np_arr = np.array(Image.open("/home/pi/Pictures/Unu-2819.jpg"))

Does not work.

while(True) :
    im = Image.open("/home/pi/Pictures/Unu-2819.jpg")
    np_arr = np.array(im)
    im = None
    np_arr = None

Does not work.

pip3 uninstall Pillow
pip3 install Pillow==8.2.0

And everything is fine again

@paddywwoof
Contributor

Yes, I've tried all permutations and it always seems to hang onto the memory with v8.3.0, though it never actually runs out for me - probably my image isn't big enough. Using the older __del__ with file pointer checks etc. doesn't make any difference. It's possibly something in the C code to save reloading things to memory if they might be needed later, but I've no idea really.

@radarhere
Member

radarhere commented Nov 8, 2021

Given what you're seeing, I suspect this is due to #5379, where we replaced our Image's __array_interface__ with __array__.

Are you sure this isn't a problem better addressed by the NumPy team?
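To illustrate what changed, here is a simplified toy sketch - not Pillow's actual code - of the two ways an object can hand its pixels to NumPy: exposing __array_interface__ directly versus implementing __array__ with a helper class built on every call:

import numpy as np

class InterfaceStyle:
    # pre-8.3 style (simplified): expose __array_interface__ on the object itself
    @property
    def __array_interface__(self):
        return {"shape": (4, 4, 3), "typestr": "|u1", "version": 3,
                "data": bytes(4 * 4 * 3)}

class ArrayStyle:
    # 8.3/8.4 style (simplified): __array__ builds a throwaway helper class per call
    def __array__(self, dtype=None):
        new = {"shape": (4, 4, 3), "typestr": "|u1", "version": 3,
               "data": bytes(4 * 4 * 3)}

        class ArrayData:
            __array_interface__ = new

        return np.array(ArrayData(), dtype)

print(np.array(InterfaceStyle()).shape)  # (4, 4, 3)
print(np.array(ArrayStyle()).shape)      # (4, 4, 3)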

@radarhere radarhere added the NumPy label Nov 8, 2021
@rlevine

rlevine commented Nov 19, 2021

Hit the same issue - poor 512 MB Heroku dynos didn't know what hit them. "Memory quota vastly exceeded." *chuckle*
36 PNG images, 2967x2992, 8-bit with alpha channel, 33.86 MB each. High-water mark at 1.55 GB, with 1.2 GB of memory that didn't go away.

============================================================
   973    181.9 MiB    181.9 MiB   @profile
   974                             def color_substitute(image, old_color, new_color):
   975    260.9 MiB     79.0 MiB       data = numpy.array(
   976    181.9 MiB      0.0 MiB       image)  # "data" is a height x width x 4 (assuming an alpha channel) numpy array
   977    260.9 MiB      0.0 MiB       red, green, blue, alpha = data.T  # Transpose bands
   978                                         
   979                                 # Replace old color with new color... (leaves alpha values alone...)
   980    278.8 MiB     17.9 MiB       color_areas = (red == old_color[0]) & (green == old_color[1]) & (blue == old_color[2])
   981    278.8 MiB      0.0 MiB       color_areas = color_areas.T  # transpose the array; back in the same orientation as original
   982    279.0 MiB      0.2 MiB       data[..., :-1][color_areas] = new_color  # vegematic
   983                                         
   984    279.0 MiB      0.0 MiB       return Image.fromarray(data)

Python 3.10.0
PIL : 8.4.0
numpy : 1.21.4
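For reference, the per-line numbers above come from the memory_profiler package; a minimal way to get that kind of output (the file names here are only placeholders) looks like this:

# pip3 install memory-profiler
from memory_profiler import profile

import numpy
from PIL import Image

@profile
def convert(path):
    # Decorated functions print line-by-line memory usage when the script runs.
    with Image.open(path) as im:
        return numpy.array(im)

if __name__ == "__main__":
    convert("some_image.png")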

@radarhere
Member

Hi @rlevine. To confirm my theory, would you be able to install https://github.com/radarhere/Pillow/tree/numpy (or just open PIL/Image.py and make the radarhere@2779305 changes) and see if that fixes the problem?

@rlevine

rlevine commented Nov 19, 2021

Chicken dinner. Memory for 36 passes through the function goes from 1488 MiB to 267 MiB, about a 1.28 GB difference.
Is somebody missing a decref on the underlying C struct?

Thanks!

Rick


Pillow==8.4.0 as distributed

984 1488.4 MiB 0.0 MiB 1 return Image.fromarray(data)

8.4.0 with radarhere/Pillow@2779305 applied

984 268.6 MiB 0.0 MiB 1 return Image.fromarray(data)

@homm
Member

homm commented Nov 19, 2021

I'm investigating this. The fix is really simple:

        class ArrayData:
            def __init__(self, new):
                self.__array_interface__ = new

But I still don't understand the nature of the cyclic references here. This is what I see:

import numpy, gc
from PIL import Image
im = Image.new('RGB', (4, 4))

try:
   gc.disable()
   gc.collect()
   gc.set_debug(gc.DEBUG_LEAK | gc.DEBUG_STATS)
   numpy.array(im)
   gc.collect()
   for item in gc.garbage:
      print('>>>', type(item), repr(item))
finally:
   gc.set_debug(0)
   gc.enable()
   gc.garbage.clear()
   gc.collect()

>>> <class 'tuple'> (4, 4, 3)
>>> <class 'dict'> {'shape': (4, 4, 3), 'typestr': '|u1', 'version': 3, 'data': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'}
>>> <class 'tuple'> (<class 'object'>,)
>>> <class 'dict'> {'__module__': 'PIL.Image', '__array_interface__': {'shape': (4, 4, 3), 'typestr': '|u1', 'version': 3, 'data': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'}, '__dict__': <attribute '__dict__' of 'ArrayData' objects>, '__weakref__': <attribute '__weakref__' of 'ArrayData' objects>, '__doc__': None}
>>> <class 'type'> <class 'PIL.Image.Image.__array__.<locals>.ArrayData'>
>>> <class 'getset_descriptor'> <attribute '__dict__' of 'ArrayData' objects>
>>> <class 'getset_descriptor'> <attribute '__weakref__' of 'ArrayData' objects>
>>> <class 'tuple'> (<class 'PIL.Image.Image.__array__.<locals>.ArrayData'>, <class 'object'>)

And from my point of view there couldn't be any circular refs.

@homm
Member

homm commented Nov 19, 2021

Oh, I finally get it. It turns out that every class definition is a circular reference to itself:

In [15]: class ArrayData: 
    ...:     pass 
    ...: ArrayData.__mro__                                                                                                      
Out[15]: (__main__.ArrayData, object)

The smallest case which will grow infinitely:

import gc

gc.disable()
while True:
    class ArrayData:
        pass
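A bounded variant of that loop shows the effect in numbers: with the collector disabled, every class statement leaves a handful of cycle-bound objects behind, and gc.collect() reports how many it had to clean up:

import gc

gc.collect()          # start from a clean slate
gc.disable()
for _ in range(1000):
    class ArrayData:
        pass
print("unreachable objects:", gc.collect())  # several per class definition
gc.enable()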

So we should just move the ArrayData definition into the Image class namespace, for example:

class Image:
    class _ArrayData:
        def __init__(self, new):
            self.__array_interface__ = new

    def __array__(self, dtype=None):
        ...
        return np.array(self._ArrayData(new), dtype)
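A self-contained sketch of the same pattern outside Pillow (names are illustrative) shows why this helps: the nested class is created only once, so the per-call instances die by reference counting and nothing is left for the cyclic collector:

import gc

class Owner:
    class _ArrayData:                 # created once, with the Owner class itself
        def __init__(self, new):
            self.__array_interface__ = new

    def export(self, new):
        return self._ArrayData(new)   # plain instance, no new class per call

gc.collect()
gc.disable()
owner = Owner()
for _ in range(1000):
    owner.export({"version": 3})
print("unreachable objects:", gc.collect())  # expected: 0
gc.enable()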

@homm
Member

homm commented Nov 19, 2021

By the way, if anyone is curious about the performance of Pillow → NumPy conversions, there is an article with a helper function that works significantly faster: https://uploadcare.com/blog/fast-import-of-pillow-images-to-numpy-opencv-arrays/

Unfortunately, this enhancement is barely implementable within the Pillow codebase.

@homm
Member

homm commented Nov 19, 2021

@rlevine

Somebody missing a decref for the underlying cstruct?

Nope, this is purely garbage-collector pressure. We should always be very careful with circular references when working with large objects.
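For context, CPython schedules cyclic collections by allocation counts rather than by bytes, and survivors are promoted to older generations that are collected less often, so a modest number of cycles each pinning a large image buffer can outrun the collector on a small machine:

import gc

# The thresholds are object counts per generation (defaults are typically
# (700, 10, 10)); the collector never looks at how many bytes a cycle holds.
print(gc.get_threshold())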

@radarhere
Member

Thanks @homm. I've created PR #5844 with your suggestion.

@rlevine

rlevine commented Nov 20, 2021

Works! Thanks!

@pcicales

pcicales commented Apr 8, 2022

I appear to be having this same issue with PIL==9.1.0.

I am observing the same sawtooth memory pattern, followed by 100% memory usage, when loading and unloading PIL images in a loop. Additionally, I am passing PIL objects through helper functions, which may have something to do with it; I am not certain yet. The images are loaded using PIL, then converted to np arrays, then unloaded from memory using .close() and = None.

@hugovk @homm could you try to reproduce? Same exact scripts as above.

@radarhere
Member

The original fix to this issue involved making Image.__array__ more efficient. With the upcoming release of NumPy 1.23, we have the opportunity to simplify further by switching back to Image.__array_interface__.

Would you mind testing https://github.com/radarhere/Pillow/tree/numpy and seeing if that solves your problem?

@pcicales

pcicales commented Apr 9, 2022

@radarhere Just installed. Here is the package info. I will be testing this shortly.

Previous:

 Name                    Version                   Build                    Channel
pillow                      9.1.0                     pypi_0                    pypi

 Name                 Version                   Build                    Channel
numpy                   1.21.2                py37h20f2e39_0
numpy-base              1.21.2                py37h79a1101_0

Current:

 Name                    Version                   Build                    Channel
pillow                      9.2.0.dev0               pypi_0                    pypi

 Name                    Version                   Build                    Channel
numpy                    1.21.2                py37h20f2e39_0
numpy-base                1.21.2                py37h79a1101_0

@pcicales

pcicales commented Apr 9, 2022

@radarhere It still seems to have the leak - memory gradually increases to 100%.

Should I revert back to an older version of PIL? Maybe 8.2.0?

@radarhere
Member

8.2.0 is the version before we changed from __array_interface__ to __array__. Trying that version would make sense, except that if it worked, I would also expect https://github.com/radarhere/Pillow/tree/numpy to work.

If there is any older version of Pillow that works for you, that would be useful debugging information to have.

Would you be able to open a new issue, specify your operating system details, and simple code to demonstrate the problem?

@helgeerbe
Author

Just to let you know: I upgraded to 9.1.0 and it worked for me. So it must be something else; it does not seem to be my originally reported issue.

@pcicales

pcicales commented Apr 11, 2022

@radarhere Interesting - I will do so. I am trying to pinpoint where memory is not being released; do you have any recommendations on best practices for doing so? Currently I am just logging cpu memory, but I think it would be helpful for us if I could pinpoint the PIL/np operation that is causing the issue.

Just to let you know: I upgraded to 9.1.0 and it worked for me. So it must be something else; it does not seem to be my originally reported issue.

What is odd is that I am observing the same zig-zag memory pattern, which made me think it must be related to your issue. I guess I may be seeing the same symptoms but from a different bug.

@wiredfool
Member

wiredfool commented Apr 11, 2022

Try running it under Valgrind, using the Massif tool. It's slow, but it gives a really good profile of where and when memory is allocated. Alternatively, tracemalloc in Python 3 works pretty well for narrowing it down to a line of Python.
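A minimal tracemalloc sketch along those lines (the loop body is a placeholder for the suspect code):

import tracemalloc

tracemalloc.start()

# ... run the suspect loop here for a while ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)  # top allocation sites, attributed to source lines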

@Rahul-Matta

Rahul-Matta commented Dec 5, 2022

@wiredfool @pcicales @helgeerbe
Is it good to use Pillow > 9.1.0 to avoid the memory consumption issue?
I am currently using Pillow 8.4.0, and it's causing all four processors to hit 100% usage in the libc library.

@wiredfool
Member

@Rahul-Matta That doesn't sound anything like what's happening here. You're best off posting a complete bug report as a new issue.

@Rahul-Matta

Okay @wiredfool, filed #6781 as a new bug.
