Memory leak with sequences #300

Open
wodim opened this issue Jul 20, 2016 · 20 comments


@wodim commented Jul 20, 2016

The following code eats all your memory in a matter of seconds:

from wand.color import Color
from wand.image import Image

while True:
    ri = Image(width=1000, height=1000, background=Color('red'))
    bi = Image(width=1000, height=1000, background=Color('blue'))
    gi = Image(width=1000, height=1000, background=Color('green'))

    container = Image(width=1000, height=1000)
    container.sequence.append(ri)
    container.sequence.append(bi)
    container.sequence.append(gi)

    container.format = 'gif'
    container.save(filename='/dev/null')

    ri.destroy()
    bi.destroy()
    gi.destroy()
    for frame in container.sequence:
        frame.destroy()
    container.destroy()

    print('saved')

Is this a bug? How can I destroy all the images so the memory is freed? (I'm supposedly destroying them all, but it's not working.)

@dahlia (Collaborator) commented Jul 22, 2016

Color and Image are intended to be used as context managers. Could you try opening these Color and Image objects using a with statement?

@wodim (Author) commented Jul 22, 2016

That's not what the documentation says; it says I can use either with or .destroy(). Given the way my code is laid out (it looks nothing like that small test case), using with statements would make things pretty complicated. Is this a bug?

@dahlia (Collaborator) commented Jul 22, 2016

You can use the .destroy() method instead of a with statement, but the color objects (used for backgrounds) in your code still won't be destroyed.
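
For illustration, a minimal sketch of mixing the two approaches (assuming explicit destroy() on images works as documented; colors are kept in a short with block since they allocate their underlying C resource on entering the context):

from wand.color import Color
from wand.image import Image

# Scope the color tightly so its pixel resource is released immediately;
# the images are released with explicit destroy() calls instead.
with Color('red') as red:
    ri = Image(width=1000, height=1000, background=red)

container = Image(width=1000, height=1000)
container.sequence.append(ri)
container.format = 'gif'
container.save(filename='/dev/null')

ri.destroy()
container.destroy()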

@wodim (Author) commented Jul 23, 2016

It's exactly the same with with: it still eats all the RAM. And that's taking into account that my code is impossible to write with with unless you nest every colour inside a with statement, which is a horribly awful thing to do.

from wand.color import Color
from wand.image import Image

while True:
    with Image(width=1000, height=1000, background=Color('red')) as ri:
        with Image(width=1000, height=1000) as container:
            container.sequence.append(ri)

            container.format = 'gif'
            container.save(filename='/dev/null')

    print('saved')

@dahlia (Collaborator) commented Jul 23, 2016

@wodim You need to destroy the colors you use as well, since they are also resources, e.g.:

from wand.color import Color
from wand.image import Image

while True:
    with Color('red') as red, Image(width=1000, height=1000, background=red) as ri:
        with Image(width=1000, height=1000) as container:
            container.sequence.append(ri)

            container.format = 'gif'
            container.save(filename='/dev/null')

    print('saved')

@dahlia added the bug label on Jul 23, 2016

@wodim (Author) commented Jul 23, 2016

That code also eats all the RAM.

@dahlia (Collaborator) commented Jul 23, 2016

I will try reproducing the phenomenon.

@danxshap commented Jul 26, 2016

@dahlia I'm not sure if this is the same issue or if it should be submitted separately; please let me know.

When I call this function over and over in a loop, it works fine:

def generate_pdf_thumbnail(pdf_page=1):
    with Image(filename='test.pdf') as pdf_image:
        png_data = pdf_image.make_blob('png')

I updated the function so that it can create a PNG from a specific page of the PDF, using the suggestion in #186. When I call this function over and over in a loop, the available memory on the machine gradually goes down until it runs out, and then the function fails because it looks for a temporary file that doesn't exist:

def generate_pdf_thumbnail(pdf_page=1):
    with Image(filename='test.pdf') as pdf_image:
        # Select page
        try:
            with pdf_image.sequence[pdf_page-1].clone() as page_image:
                png_data = page_image.make_blob('png')
        except IndexError:
            return

I'm using Wand 0.4.3 on Ubuntu 14.04 LTS, and I can't figure out how to create a PNG from a specific page of a PDF without this memory leak problem.
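
One possible workaround, not taken from this thread and assuming Wand passes the filename straight through to ImageMagick (which accepts a zero-based [page] suffix on PDF filenames), is to read only the page you need instead of loading the whole document and cloning one frame:

from wand.image import Image

def generate_pdf_thumbnail(pdf_page=1):
    # 'test.pdf[N]' asks ImageMagick to read only page N (zero-based),
    # so the remaining pages never enter the pixel cache.
    with Image(filename='test.pdf[{0}]'.format(pdf_page - 1)) as page_image:
        return page_image.make_blob('png')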

@cballenar commented Aug 10, 2016

Hi everyone,

I'm also experiencing a similar issue with the following code:

# convert array of images to pdf
with Image() as pdf_sequence:
    for image_path in array_of_images:
        with Image(filename=image_path) as page:
            pdf_sequence.sequence.append(page)

    pdf_sequence.format = 'pdf'
    pdf_sequence.save(filename=output_path)

I've tried adding .destroy() with no changes whatsoever.

@mlissner commented Aug 25, 2016

Well, I just upgraded my stack to use Wand instead of IM itself, and I have exactly this problem. Damn!

There's a bunch of us here now. Has anybody made any progress on this? I've never hunted down a leak like this before, but if @dahlia is willing to push out a release once there's a fix, we should work together to track it down.

@voutilad (Contributor) commented Aug 29, 2016

There appears to be a leak due to the way the Sequence class is implemented. I'm digging into it more, but here are two slightly simplified examples:

def this_one_leaks():
    with Image(width=1000, height=1000) as outer:
        with Image(width=1000, height=1000) as inner:
            inner.sequence.append(outer)
            inner.format = 'gif'
            inner.save(filename='/dev/null')


def this_one_is_tight():
    with Image(width=1000, height=1000) as outer:
        with Image(width=1000, height=1000) as inner:
            inner.sequence.append(outer)
            inner.format = 'gif'
            inner.save(filename='/dev/null')
            inner.sequence.pop()
            inner.sequence.pop()
            outer.sequence.pop()

If I figure out a fix, I'll let you know and submit a PR.
dahlia added a commit that referenced this issue on Aug 30, 2016:
Merge pull request #302 from voutilad/issue-300 ("possible short term fix for #300")

@dahlia (Collaborator) commented Aug 30, 2016

#302 fixed it.

@dahlia closed this on Aug 30, 2016

@wodim (Author) commented Aug 30, 2016

Did this fix the .destroy() use case too, or am I forced to use with?

@wodim (Author) commented Aug 30, 2016

Okay, I can confirm this bug is not fixed yet. The code I opened this ticket with still triggers an OOM.

If you don't want to support the .destroy() method, please remove it from the documentation.

@iurisilvio commented Sep 7, 2016

Why not override the destroy method instead of __exit__? __exit__ already calls destroy.
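
A hypothetical sketch of what that suggestion could look like (this is not the actual patch from #302; PatchedImage and the cleanup loop are illustrative):

from wand.image import Image

class PatchedImage(Image):
    def destroy(self):
        # Release the Python-side frame wrappers first, so both the
        # with statement (whose __exit__ calls destroy()) and explicit
        # destroy() calls clean up the sequence.
        while self.sequence:
            self.sequence.pop()
        super(PatchedImage, self).destroy()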

@voutilad (Contributor) commented Sep 7, 2016

From my understanding, there are two areas where memory can be consumed.

The first is in Python. My override of __exit__ was to resolve an issue where references to Python objects (specifically the SingleImage type) were not being properly collected when the parent Image was cleaned up by the context manager. The underlying reason these were dangling in the first place has to do with the logic behind the Sequence class.

The second area is in ImageMagick and is a bit more complicated. IM caches pixels aggressively, and even a simple file such as a 2MB PDF can consume over 1GB of cache. With IM's default limits, it will tend to consume a lot of physical memory first, then spill over to other locations like swap space or files on disk. Because Wand uses IM via a shared library, it will look like Python is consuming the memory, even though it is memory allocated and managed by ImageMagick.

For more details on IM, see the IM environment variables doc.
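
For example (the limit values here are illustrative, not recommendations from this thread), you can cap the pixel cache before starting Python:

$ export MAGICK_MEMORY_LIMIT=256MiB   # cap the in-RAM pixel cache
$ export MAGICK_MAP_LIMIT=512MiB      # cap memory-mapped cache files
$ export MAGICK_DISK_LIMIT=2GiB       # cap spill-over onto disk
$ python my_script.py                 # hypothetical script name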

In the meantime, I think the docs should be updated; ultimately, the root cause of the Python memory leak should be addressed so it won't happen when developers don't use with statements.

As an aside: there are some cases where destroy() doesn't properly get IM to clean up memory, and I have personally seen differing behavior across IM versions and architectures. For instance, destroy() was cleaning up memory and files on disk properly on macOS using IM v6.9.5, but on Ubuntu 14.04 it was leaving lots of 63MB files behind in /tmp. These seem like known IM issues and have less to do with Wand.
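
If you hit the leftover-temp-file variant, one mitigation (my assumption, using ImageMagick's standard MAGICK_TMPDIR variable, not something verified in this thread) is to point IM's scratch space at a dedicated directory so stray cache files are easy to find and remove:

$ mkdir -p /var/tmp/magick
$ export MAGICK_TMPDIR=/var/tmp/magick
$ python my_script.py        # hypothetical script name
$ ls /var/tmp/magick         # any leftover magick-* cache files land here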

You can see how your environment is currently set by using the identify binary. Here's an example from my laptop using the default settings:

$ identify -list resource
Resource limits:
  Width: 214.7MP
  Height: 214.7MP
  Area: 34.36GP
  Memory: 16GiB
  Map: 32GiB
  Disk: unlimited
  File: 3648
  Thread: 1
  Throttle: 0
  Time: unlimited

@wodim (Author) commented Oct 3, 2016

@dahlia can you please reopen this until it's fixed?

@dahlia reopened this on Oct 22, 2016

@wodim (Author) commented Jan 9, 2017

It does not look like it leaks anymore, at least after f97277b. Can someone confirm?

@NotSoSuper commented Mar 2, 2017

Using the context manager to create the image is now yielding me these errors: https://gist.github.com/NotSoSuper/e6e7b828d3b307af0378cf049480c14e

The code:

i = 0
with wand.image.Image() as img:
    for frame in frames:
        with wand.image.Image(file=frame) as im:
            im.transform(resize='800x800>')
            im.liquid_rescale(width=int(im.width*0.5), height=int(im.height*0.5), delta_x=1, rigidity=0)
            im.liquid_rescale(width=int(im.width*1.5), height=int(im.height*1.5), delta_x=2, rigidity=0)
            im.resize(im.width, im.height)
            img.sequence.append(im)
            img.sequence[i].delay = 1
            i += 1
    img.type = 'optimize'
    final = BytesIO()
    img.save(file=final)
final.seek(0)
return final.read()

I've been having memory issues for a while and just saw this issue; hopefully this can get fixed.

@julianopacheco commented Jul 29, 2019

I was also suffering from memory leak issues. After some research and tweaking my implementation, the issues were resolved.
Basically, I got it working correctly by using with and destroy().
In some cases I could use with to open and read the file, as in the example below:

with wi(filename=pdf_file, resolution=300) as pdf:

And in other cases I had to use the destroy() function, preferably inside a try / finally, as below:

try:
    for img in pdfImg.sequence:
        # code
finally:
    img.destroy()

In the second case, I couldn't use with because I had to iterate over the pages through the sequence.
This solution resolved my problems with memory leaks.
