
Memory leak with sequences #300

Closed
wodim opened this issue Jul 20, 2016 · 21 comments

@wodim commented Jul 20, 2016

The following code eats all your memory in a matter of seconds:

from wand.color import Color
from wand.image import Image

while True:
    ri = Image(width=1000, height=1000, background=Color('red'))
    bi = Image(width=1000, height=1000, background=Color('blue'))
    gi = Image(width=1000, height=1000, background=Color('green'))

    container = Image(width=1000, height=1000)
    container.sequence.append(ri)
    container.sequence.append(bi)
    container.sequence.append(gi)

    container.format = 'gif'
    container.save(filename='/dev/null')

    ri.destroy()
    bi.destroy()
    gi.destroy()
    for frame in container.sequence:
        frame.destroy()
    container.destroy()

    print('saved')

Is it a bug? How can I destroy all the images so the memory is freed? (I'm supposedly destroying them all, but it's not working)

@dahlia (Collaborator) commented Jul 22, 2016

Color and Image are intended to be used as context managers. Could you try opening these Color and Image objects using the with statement?

@wodim (Author) commented Jul 22, 2016

That's not what the documentation says: it says I can use with or I can use .destroy(). Given the way my code is laid out (it has nothing to do with that small test case), using with statements would make things pretty complicated. Is this a bug?

@dahlia (Collaborator) commented Jul 22, 2016

You could use the .destroy() method instead of the with statement, but the Color objects (for the backgrounds) in your code still won't be destroyed.

@wodim (Author) commented Jul 23, 2016

It's exactly the same with with: it still eats all the RAM. And that's without considering that my code is impossible to write with with unless you nest every colour inside its own with statement, which would be awful.

from wand.color import Color
from wand.image import Image

while True:
    with Image(width=1000, height=1000, background=Color('red')) as ri:
        with Image(width=1000, height=1000) as container:
            container.sequence.append(ri)

            container.format = 'gif'
            container.save(filename='/dev/null')

    print('saved')

@dahlia (Collaborator) commented Jul 23, 2016

@wodim You need to destroy the used colors as well, since they are resources too, e.g.:

from wand.color import Color
from wand.image import Image

while True:
    with Color('red') as red, Image(width=1000, height=1000, background=red) as ri:
        with Image(width=1000, height=1000) as container:
            container.sequence.append(ri)

            container.format = 'gif'
            container.save(filename='/dev/null')

    print('saved')

@dahlia added the bug label Jul 23, 2016

@wodim (Author) commented Jul 23, 2016

That code also eats all the RAM.

@dahlia (Collaborator) commented Jul 23, 2016

I will try reproducing the phenomenon.

@danxshap commented Jul 26, 2016

@dahlia I'm not sure if this is the same issue or if it should be submitted separately; please let me know.

When I call this function over and over in a loop, it works fine:

def generate_pdf_thumbnail(pdf_page=1):
    with Image(filename='test.pdf') as pdf_image:
        png_data = pdf_image.make_blob('png')
    return png_data

I updated the function so that it can create a PNG from a specific page of the PDF, using the suggestion in #186. When I call this version over and over in a loop, the available memory on the machine gradually goes down until it runs out, and then the function fails because it's looking for a temporary file that doesn't exist:

def generate_pdf_thumbnail(pdf_page=1):
    with Image(filename='test.pdf') as pdf_image:
        # Select the requested page (sequence is zero-based)
        try:
            with pdf_image.sequence[pdf_page - 1].clone() as page_image:
                return page_image.make_blob('png')
        except IndexError:
            return None

I'm using Wand 0.4.3 on Ubuntu 14.04 LTS and I can't figure out how to create a PNG from a specific page of a PDF without this memory leak problem.
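For what it's worth, one alternative sketch (untested against this leak, and pdf_page_filename is a helper name I made up) is to let ImageMagick select the page itself via its [n] filename suffix, so the sequence never needs to be cloned:

```python
def pdf_page_filename(path, page):
    # ImageMagick's "[n]" filename suffix selects a single zero-based page,
    # so only that page is decoded when the file is opened.
    return '%s[%d]' % (path, page - 1)

def generate_pdf_thumbnail_alt(path, page=1):
    from wand.image import Image  # deferred import; helper above is pure string logic
    with Image(filename=pdf_page_filename(path, page)) as page_image:
        return page_image.make_blob('png')
```

Whether this sidesteps the leak on 0.4.3 I haven't verified, but it at least avoids touching Image.sequence.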

@cballenar commented Aug 10, 2016

Hi everyone,

I'm also experiencing a similar issue with the following code:

# convert array of images to pdf
with Image() as pdf_sequence:
    for image_path in array_of_images:
        with Image(filename=image_path) as page:
            pdf_sequence.sequence.append(page)

    pdf_sequence.format = 'pdf'
    pdf_sequence.save(filename=output_path)

I've tried adding .destroy() calls, with no change whatsoever.

@mlissner commented Aug 25, 2016

Well, I just upgraded my stack to use Wand instead of calling ImageMagick directly, and I have exactly this problem. Damn!

There are a bunch of us here now. Has anybody made any progress on this? I've never hunted down a leak like this before, but if @dahlia is willing to push out a release with a fix, we should work together to get this resolved.

@voutilad (Contributor) commented Aug 29, 2016

There appears to be a leak due to the way the Sequence class is implemented. I'm digging into it further, but here are two slightly simplified examples:

def this_one_leaks():
    with Image(width=1000, height=1000) as outer:
        with Image(width=1000, height=1000) as inner:
            inner.sequence.append(outer)
            inner.format = 'gif'
            inner.save(filename='/dev/null')


def this_one_is_tight():
    with Image(width=1000, height=1000) as outer:
        with Image(width=1000, height=1000) as inner:
            inner.sequence.append(outer)
            inner.format = 'gif'
            inner.save(filename='/dev/null')
            inner.sequence.pop()
            inner.sequence.pop()
            outer.sequence.pop()

If I figure out a fix, I'll let you know and submit a PR.
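A quick way to watch for this kind of growth while testing candidate fixes (a generic sketch, not part of Wand; peak_rss_mb is a name I made up) is to sample the process's peak resident set size around the suspect loop:

```python
import resource
import sys

def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss / (1024 * 1024) if sys.platform == 'darwin' else rss / 1024

before = peak_rss_mb()
junk = [bytearray(1024 * 1024) for _ in range(50)]  # allocate roughly 50 MiB
after = peak_rss_mb()
print('peak RSS grew by roughly %.0f MiB' % (after - before))
```

Running the leaking and non-leaking functions under a wrapper like this makes the difference show up as a steadily climbing peak RSS.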

dahlia added a commit that referenced this issue Aug 30, 2016

@dahlia (Collaborator) commented Aug 30, 2016

#302 fixed it.

@dahlia closed this Aug 30, 2016

@wodim (Author) commented Aug 30, 2016

Did this fix the .destroy() use case too, or am I forced to use with?

@wodim (Author) commented Aug 30, 2016

Okay, I confirm this bug is not fixed yet. The code I opened this ticket with still triggers an OOM.

If you don't want to support the .destroy() method, please remove it from the documentation.

@iurisilvio commented Sep 7, 2016

Why not override the destroy method instead of __exit__? __exit__ already calls destroy.

@voutilad (Contributor) commented Sep 7, 2016

From my understanding, there are two areas where memory can be consumed.

The first is in Python. My override of __exit__ was to resolve an issue where references to Python objects (specifically the SingleImage type) were not being properly collected when the parent Image was cleaned up by the context manager. The underlying reason these were dangling in the first place has to do with the logic behind the Sequence class.
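One way to check for that kind of dangling wrapper yourself (a generic sketch; count_instances is a name I made up, and Widget just stands in for something like SingleImage) is to ask the garbage collector what is still alive:

```python
import gc

def count_instances(type_name):
    # Walk every object the collector tracks and count those whose
    # class name matches; handy for spotting leaked wrapper objects.
    return sum(1 for obj in gc.get_objects()
               if type(obj).__name__ == type_name)

class Widget:
    pass

w = Widget()
print(count_instances('Widget'))  # at least 1 while the reference lives
del w
gc.collect()
```

If the count keeps climbing across loop iterations after cleanup, something is holding references.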

The second area is in ImageMagick and is a bit more complicated. IM does a lot of caching for pixels, and even a simple file such as a 2MB PDF can consume over 1GB of cache. With the default limits, IM will tend to first consume a lot of physical memory, then spill over to other locations like swap space or files on disk. Because Wand uses IM via a shared library, it will look like Python is consuming the memory, even when it's memory allocated and managed by ImageMagick.

For more details on IM, see the IM environment variables doc.

In the meantime, I think the docs should be updated; ultimately, the root cause of the Python memory leak should be addressed so it won't bite developers who don't use with statements.

As an aside: there are some cases where destroy() doesn't properly get IM to clean up memory, and I have personally seen differing behavior across IM versions and architectures. For instance, destroy() was cleaning up memory and files on disk properly on macOS with IM 6.9.5, but on Ubuntu 14.04 it was leaving lots of 63MB files behind in /tmp. These seem to be known IM issues and have little to do with Wand.

You can see how your environment is currently configured by using the identify binary. Here's an example from my laptop with the default settings:

$ identify -list resource
Resource limits:
  Width: 214.7MP
  Height: 214.7MP
  Area: 34.36GP
  Memory: 16GiB
  Map: 32GiB
  Disk: unlimited
  File: 3648
  Thread: 1
  Throttle: 0
  Time: unlimited

@wodim (Author) commented Oct 3, 2016

@dahlia can you please reopen this until it's fixed?

@dahlia reopened this Oct 22, 2016

@wodim (Author) commented Jan 9, 2017

It does not look like it leaks anymore, at least after f97277b. Can someone confirm?

@NotSoSuper commented Mar 2, 2017

Using the context manager to create the image now yields these errors: https://gist.github.com/NotSoSuper/e6e7b828d3b307af0378cf049480c14e

The code:

i = 0
with wand.image.Image() as img:
    for frame in frames:
        with wand.image.Image(file=frame) as im:
            im.transform(resize='800x800>')
            im.liquid_rescale(width=int(im.width*0.5), height=int(im.height*0.5), delta_x=1, rigidity=0)
            im.liquid_rescale(width=int(im.width*1.5), height=int(im.height*1.5), delta_x=2, rigidity=0)
            im.resize(im.width, im.height)
            img.sequence.append(im)
            img.sequence[i].delay = 1
            i += 1
    img.type = 'optimize'
    final = BytesIO()
    img.save(file=final)
final.seek(0)
return final.read()

I've been having memory issues for a while and just saw this issue; hopefully this can get fixed.

@julianopacheco commented Jul 29, 2019

I was also suffering from memory leak issues. After some research and some tweaking of my implementation, they were resolved. Basically, I got things working correctly by using both with and destroy().

In some cases I could use with to open and read the file, as in the example below:

with wi(filename=pdf_file, resolution=300) as pdf:
    ...

In other cases I had to use the destroy() function, preferably inside a try/finally, as below:

try:
    for img in pdfImg.sequence:
        # code
        ...
finally:
    img.destroy()

In the second case I can't use with because I have to iterate over the pages through the sequence. This solution resolved my problems with memory leaks.

@emcconville added this to the Wand 0.6.0 milestone Dec 19, 2019
@emcconville mentioned this issue Mar 1, 2020

@emcconville (Owner) commented May 13, 2020

Should be fixed. I've been testing the memory-reference rewrite for the better part of two months, and it seems to be stable.
