
Fix various memory leaks discovered through valgrind #3258

Merged — 3 commits merged into matplotlib:v1.4.x from ft2font-memleak on Jul 16, 2014

Conversation

@mdboom (Member) commented Jul 14, 2014

This fixes what #3207 started. It turns out there were two problems with #3207: the data array ended up with a refcount of 2 at the end of the function, and then, when it was later passed to _image.Image.frombuffer, that method also leaked a reference. Sigh.

Also, this fixes a leak in the collection transform array, and another in the "generic" font I/O used to load fonts from a BytesIO buffer (very rarely used, but the test suite does exercise it).

@mdboom mdboom added this to the v1.4.0 milestone Jul 14, 2014
@mdboom (Member, Author) commented Jul 14, 2014

So, it seems this works, but it still doesn't address the out-of-memory failures on Travis:

https://travis-ci.org/matplotlib/matplotlib/jobs/29920943

@tacaswell (Member):

Maybe we should turn down the number of processes we are using. Is there some way to query the Travis machine we are running on? I would expect that some of their machines are underpowered and that we get the failures when we are randomly assigned to one of those.

If they have a fixed core-to-memory ratio and we get a machine with fewer than 8 cores, this might make sense.

@mdboom (Member, Author) commented Jul 14, 2014

That's a good point. I'm not quite sure how they get provisioned, but we could add lshw (or something similar) to the top of our scripts to collect some output for a while and see if there's any correlation between the failures and the hardware (though I guess if it's a virtual machine, lshw will lie anyway)...

@efiring (Member) commented Jul 15, 2014

@mdboom, is there any reason not to go ahead and merge this now?

@mdboom (Member, Author) commented Jul 15, 2014

Stay tuned... I think I have a solution for the bigger problem in the works. Just running again over all the tests now...

This works by avoiding storing a reference to the figure in the
"do_test" closure in the image_comparison decorator.  nose keeps a
reference to all of its test functions, and thereby holds on to every
figure in every test that uses the image_comparison decorator (which is
most of them).  Instead of using the figure from the outer scope, this
just uses the figure number, which is then looked up when the test
function is actually run.

Also adds gc.collect() so we can ensure there are zero figures in memory
at the start of each test.  This is probably not strictly necessary, but
it makes analysing memory usage easier.
@mdboom (Member, Author) commented Jul 15, 2014

This now (I think) addresses the steady growth of memory usage in the test suite. Fortunately, it is something specific to the test suite itself and does not indicate a deeper issue that would affect user code.

The image_comparison decorator generates a single nose test for each combination of figure and output format that a given test produces. These are managed as function closures, each of which includes a reference to its figure. nose retains all of these function closures for the duration of the run (probably so it can display a test's name if it fails, etc.). The fix here is to pass the figure number, rather than the figure itself, to the closure.

Here are the results from valgrind's massif tool:

BEFORE:

    GB
2.751^                                                              #         
     |                                                             @#         
     |                                                             @# :  @:@:@
     |                                              @::::          @#::@:@:@:@
     |                                              @: :       @:@@@#::@:@:@:@
     |                                         @@   @: : ::::::@ @ @#::@:@:@:@
     |                                      :::@ :::@: : ::: : @ @ @#::@:@:@:@
     |                          ::         ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                       :::: :::::::::::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                   @::::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                   @: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                   @: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                   @: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                  :@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                  :@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |                @::@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |            @:::@::@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |         :@@@: :@::@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |     ::@::@ @: :@::@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
     |  :::::@::@ @: :@::@: ::: : ::: : :: ::: @ :::@: : ::: : @ @ @#::@:@:@:@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   866.7

AFTER:

    MB
810.6^                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                   #                                                    
     |                 ::#                                                    
     |                 : #                                                    
     |                :: #::                                                  
     |                :: #:                                                   
     |                :: #:                                             :: :@:
     |                :: #:                                     ::::::::::::@:
     |               ::: #:                                  :::::: ::: ::::@:
     |    :      @@::::: #:  @@           ::::@@        ::@@:: :::: ::: ::::@:
     | ::::::::@@@ : ::: #: :@ ::::::::::::: :@ @:::::::: @ :: :::: ::: ::::@:
   0 +----------------------------------------------------------------------->Gi
     0                                                                   939.1

There is still obviously one test whose peak usage is quite large (it might be nice to get to the bottom of that), but the monotonically increasing problem is gone.

@efiring (Member) commented Jul 15, 2014

Encouraging! The specgram tests are using much more memory (and time) than I think they should need for testing purposes; they are using rather large data arrays and making dense plots. There might be other culprits as well.

While you are looking at testing: the new option to include or exclude pep8 is good, but at present it is causing all the tests for 2.7 to be run twice.

@mdboom (Member, Author) commented Jul 15, 2014

> Encouraging! The specgram tests are using much more memory (and time) than I think they should need for testing purposes; they are using rather large data arrays and making dense plots. There might be other culprits as well.

Yeah -- now we are down to one-off problems that we can fix incrementally.

> While you are looking at testing: the new option to include or exclude pep8 is good, but at present it is causing all the tests for 2.7 to be run twice.

It is? Looking at Travis, I see a 2.7 run with everything-but-PEP8 and a 2.7 run with only PEP8.

@efiring (Member) commented Jul 15, 2014

Now I see you are right, of course. I think that, looking at earlier Travis results on some other PR, I saw the pep8 build taking about as long as the others, and I mistakenly concluded it was running everything plus pep8 without looking at the output.

@mdboom (Member, Author) commented Jul 16, 2014

The Travis failure appears to be a networking issue downloading "travis-artifacts" (why that's not locally mirrored at Travis, I don't know). In any case, no OOM errors. I'm going to restart the whole suite to see if I can trigger the memory errors again.

@tacaswell (Member):

ERROR:  Error installing travis-artifacts:
    fog-brightbox requires fog-json (>= 0)

It looks like their install scripts are broken, as fog-json does not appear anywhere else in the logs.

mdboom added a commit that referenced this pull request Jul 16, 2014
Fix various memory leaks discovered through valgrind
@mdboom mdboom merged commit c893d56 into matplotlib:v1.4.x Jul 16, 2014
@mdboom mdboom deleted the ft2font-memleak branch August 7, 2014 13:52
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.