creating pngs in background mode leaks memory #1397

Closed
doutriaux1 opened this issue Jun 16, 2015 · 84 comments

@doutriaux1
Contributor

A LOT!!!

for i in range(120):
   x.plot(data,"default","isofill")
   x.png("file_%i.png" % i)
   x.clear()

run and look at memory in top.
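
To quantify the growth from inside the script as well, here is the same loop with a rough per-iteration memory readout (a sketch using the standard-library resource module; ru_maxrss is reported in KB on Linux and bytes on OS X):

import resource

for i in range(120):
    x.plot(data, "default", "isofill")   # add bg=True here to exercise the background mode from the title
    x.png("file_%i.png" % i)
    x.clear()
    print i, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss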

@doutriaux1
Contributor Author

@dlonie any idea why the memory could grow when bg=True? The renWin is obviously different. Could it be VTK-related?
Note: I tried commenting out hideGui/showGUI in def png, but no luck.

@alliepiper
Contributor

I'm a bit confused; it doesn't look like the plot call in the sample code is using background mode?

In any case, it doesn't look like VTK is holding onto anything in offscreen mode:

import gc
gc.enable()
gc.set_debug(gc.DEBUG_LEAK)

def runTest():
    import vtk
    sphere = vtk.vtkSphereSource()

    mapper = vtk.vtkPolyDataMapper()
    mapper.SetInputConnection(sphere.GetOutputPort())

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)

    ren = vtk.vtkRenderer()
    ren.AddActor(actor)

    renWin = vtk.vtkRenderWindow()
    renWin.AddRenderer(ren)

    renWin.OffScreenRenderingOn()
    renWin.SetSize(300, 300)

    renWin.Render()

runTest()
del runTest

gc.collect()

leaks = {}

for i,g in enumerate(gc.garbage):
    t = type(g)
    print "Leak %d: %s @0x%x:\n   %s"%(i, t, id(g), str(g)[:70])
    leaks[t] = leaks.get(t, 0) + 1

for t in leaks:
    print "%7d: %s"%(leaks[t], t)

print len(gc.garbage),"leaks total."
$ python vtk_offscreen.py 
0 leaks total.

Looks like the leaks are on the VCS side of things.

@alliepiper alliepiper removed their assignment Jun 17, 2015
@doutriaux1
Contributor Author

@dlonie I'm not quite sure what's going on. All @durack1 and I noticed is that if you plot with vcs bg=1 then the process mem size keeps growing. Looking at the code, it seems that the only difference is this:

    if self.bg:
        self.renWin.SetOffScreenRendering(True)
        self.renWin.SetSize(self.canvas.bgX,self.canvas.bgY)

But obviously there must be something else.

@durack1
Member

durack1 commented Jun 17, 2015

@doutriaux1 @dlonie for perspective, the loop generating these images has ~1700 steps, and as it passed the 1000 mark yesterday it was using 50GB of memory! So it's not a small leak at all.. Each png being generated is around the 100KB mark, so around 170MB total as output..

@doutriaux1
Contributor Author

@dlonie also it seems to be coming from the png function, not the rendering function.
Also, a few more options are set on our renWin; not sure if it matters, e.g.:

      self.renWin.SetWindowName("VCS Canvas %i" % self.canvas._canvas_id)
      self.renWin.SetAlphaBitPlanes(1)
      ## turning on Stencil for Labels on iso plots
      self.renWin.SetStencilCapable(1)
      ## turning off antialiasing by default
      ## mostly so that pngs are same across platforms
      self.renWin.SetMultiSamples(0)

@alliepiper
Contributor

Those additional settings on the renderwindow shouldn't affect the offscreen rendering mechanism. Well, MultiSampling might, but only if it were non-zero.

What is the memory usage for the following scenarios?

  • bg + plot + png
  • bg + plot
  • plot + png
  • plot

Also, have you printed out the leaking objects and compared the results with a single run? The objects with increased leaks may point you to the issue.
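
One way to collect those four numbers (a sketch, assuming vcs and numpy are available; the scenario names and the mem_scenarios.py filename are made up for illustration, and each scenario should be run as a fresh process so the measurements don't contaminate each other):

import resource
import sys

import numpy
import vcs

scenario = sys.argv[1]              # one of: bg_plot_png, bg_plot, plot_png, plot
bg = scenario.startswith("bg")

data = numpy.random.rand(90, 180)   # stand-in for the real slab
x = vcs.init()

for i in range(100):
    x.plot(data, "default", "isofill", bg=bg)
    if scenario.endswith("png"):
        x.png("scenario_%s_%03d.png" % (scenario, i))
    x.clear()

# ru_maxrss is KB on Linux, bytes on OS X
print scenario, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

For example: python mem_scenarios.py bg_plot_png, then python mem_scenarios.py plot, and compare the printed peaks.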

@alliepiper
Contributor

The 50GB would make sense if the plot map were leaking -- that holds on to all of the plotted datasets as well as the intermediate filter outputs, which would add up very, very quickly.

@alliepiper
Contributor

Oh, and by plot map I mean the map returned by VTKPlots.plot -- I've noticed that it doesn't get freed in between calls to plot.
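
A quick way to check whether something like that is piling up between calls is to diff live-object counts by type with the gc module (a rough, VCS-agnostic sketch):

import gc
from collections import defaultdict

def count_by_type():
    counts = defaultdict(int)
    for obj in gc.get_objects():
        counts[type(obj).__name__] += 1
    return counts

before = count_by_type()
# ... run one plot/png/clear iteration here ...
after = count_by_type()

growth = sorted(after, key=lambda name: after[name] - before.get(name, 0), reverse=True)
for name in growth[:10]:
    print name, after[name] - before.get(name, 0)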

@doutriaux1
Contributor Author

Ok, I'll run your runTest and print the leaked objects in bg vs not in bg; that should point us in the right direction. Thanks.

@doutriaux1
Contributor Author

@dlonie yes but why isn't it cleaned up ONLY when plotting in bg mode? That part is really odd.

@durack1
Member

durack1 commented Jun 17, 2015

@dlonie would a well placed gc.collect() within the loop help here, or will these things persist until the code completes?

@alliepiper
Contributor

When I was having trouble with it I was doing foreground plots, so I don't think fg/bg really matters for that particular leak.

Garbage collection should happen automatically in the interpreter as needed. I doubt it's an issue of delayed collection -- given the plethora of other leaks occurring in vcs, it's more likely a reference counting problem (gc.collect() won't help if the reference counts are off, or if cycles involve objects with __del__ methods).
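
For reference, the classic case where gc.collect() genuinely cannot help in Python 2 is a cycle whose members define __del__; a minimal illustration:

import gc

class Node(object):
    def __del__(self):
        pass

a = Node()
b = Node()
a.other = b
b.other = a   # reference cycle: a -> b -> a
del a, b

gc.collect()
print gc.garbage   # the two Node instances, found but uncollectable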

@doutriaux1
Contributor Author

Actually, bg does matter: we watched the process go for a while with @durack1 and the memory only grows when using bg=1.

@alliepiper
Contributor

Must be a different leak then. I was seeing that dictionary leak in foreground mode.


@durack1
Member

durack1 commented Jun 18, 2015

@dlonie @doutriaux1 just FYI.. I ran this for one year, so 48 steps (I'm looping through a bunch of variables); with bg=True I hit ~2.4GB, with bg=False I hit ~260MB, so it does appear to be an issue with the background operation.

Let me know if you need more info from me regarding my X11/client setup or other info.. Full disclosure here I'm using a VNC client..

As an aside, this plotting, at least using isofill, appears really slow; I'd certainly be keen to know of ways to optimize this to increase the plotting speed, particularly considering I'm looping over a lot of separate panels (~7000)..

@alliepiper
Contributor

@durack1 Thanks for the numbers, that is troubling indeed.

I'm hesitant to say VTK is at fault here, since

  1. I don't see any leaks from a pure VTK application that does background rendering
  2. VTK has had much more testing and development than UV-CDAT over the last 20 years, and is tested nightly for memory leaks with valgrind
  3. UV-CDAT is known to leak lots of memory

So my suspicion is that something in VCS is holding onto application state in-between calls to plot. If you can come up with a script that demonstrates the issue without using VCS (e.g. a pure VTK reproduction of the issue) I'll be happy to track any issues down on our side. Until then, this is most likely a VCS issue.
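
Roughly the shape such a script could take (a sketch): loop the offscreen render plus PNG write and print the process RSS each pass, since a driver-level leak won't show up in gc.garbage after a single run:

import resource
import vtk

sphere = vtk.vtkSphereSource()
mapper = vtk.vtkPolyDataMapper()
mapper.SetInputConnection(sphere.GetOutputPort())
actor = vtk.vtkActor()
actor.SetMapper(mapper)

ren = vtk.vtkRenderer()
ren.AddActor(actor)

renWin = vtk.vtkRenderWindow()
renWin.AddRenderer(ren)
renWin.OffScreenRenderingOn()
renWin.SetSize(300, 300)

for i in range(200):
    renWin.Render()
    w2img = vtk.vtkWindowToImageFilter()
    w2img.SetInput(renWin)
    writer = vtk.vtkPNGWriter()
    writer.SetInputConnection(w2img.GetOutputPort())
    writer.SetFileName("loop_%03d.png" % i)
    writer.Write()
    if i % 50 == 0:
        # ru_maxrss is KB on Linux, bytes on OS X
        print i, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss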

@doutriaux1
Contributor Author

@dlonie can you try adding the vtkPNGWriter to your bit of code above, just to make sure? Thanks. That's where the leak is probably coming from.

@durack1
Member

durack1 commented Jun 18, 2015

@dlonie does this output mean anything to you?

processing: 1883-09 sic bc <- print output from my script
processing: 1883-10 sic bc
processing: 1883-11 sic bc
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  73 (X_GetImage)
  Serial number of failed request:  551681
  Current serial number in output stream:  551681

I'll run this code again with some timing turned on; it seems like things really grind to a halt and slow down remarkably when reusing the same canvas from x = vcs.init().

@aashish24
Contributor

@dlonie let's try adding vtkPNGWriter as @doutriaux1 suggested. That will help. Maybe we can give it to @sankhesh.

@alliepiper
Contributor

@durack1 Nope, haven't seen that before. I remember we were having problems with some template objects growing out of control when reusing a canvas for animations, causing a similar slowdown/memory leak.

@doutriaux1 @aashish24 Rebuilding to rerun w/ png writer now.

@alliepiper
Contributor

import gc
gc.enable()
gc.set_debug(gc.DEBUG_LEAK)

def runTest():
    import vtk
    sphere = vtk.vtkSphereSource()

    mapper = vtk.vtkPolyDataMapper()
    mapper.SetInputConnection(sphere.GetOutputPort())

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)

    ren = vtk.vtkRenderer()
    ren.AddActor(actor)

    renWin = vtk.vtkRenderWindow()
    renWin.AddRenderer(ren)

    renWin.OffScreenRenderingOn()
    renWin.SetSize(300, 300)

    renWin.Render()

    w2img = vtk.vtkWindowToImageFilter()
    w2img.SetInput(renWin)
    w2img.SetInputBufferTypeToRGBA()

    writer = vtk.vtkPNGWriter()
    writer.SetInputConnection(w2img.GetOutputPort())
    writer.SetFileName("test-output.png")
    writer.Write()

runTest()
del runTest

gc.collect()

leaks = {}

for i,g in enumerate(gc.garbage):
    t = type(g)
    print "Leak %d: %s @0x%x:\n   %s"%(i, t, id(g), str(g)[:70])
    leaks[t] = leaks.get(t, 0) + 1

for t in leaks:
    print "%7d: %s"%(leaks[t], t)

print len(gc.garbage),"leaks total."
$ python vtk_offscreen.py 
0 leaks total.

@durack1
Member

durack1 commented Jun 19, 2015

@dlonie @doutriaux1 not sure this info below is totally related to the bg=1 issue title, but just for completeness: I've currently got a long run underway and have added some diagnostics to report the time taken through each loop iteration and the memory footprint of the process. The code looks like this (I've trimmed out the extraneous cdms2.open/close, variable calculation, print and string assignment calls for brevity):

counter = 1
for var in ['sic','sst']:
    ...
    x           = vcs.init()
    bg = False ; # For 1 yr uses ~260MB
    for data in ['obs','bcs']:
        ...
        for count,y in enumerate(range(1870,2013)):
            ...
            for m in range(12):
                startTime = time.time()
                ...
                x.plot(title,bg=bg)
                x.plot(s1,t1,iso,bg=bg); #,ratio="autot"); #,vtk_backend_grid=g)
                x.plot(diff,t2,iso2,bg=bg); #,ratio="autot") ; #,vtk_backend_grid=g)
                x.plot(s2,t3,iso,bg=bg); #,ratio="autot") ; #,vtk_backend_grid=g)
                ...
                x.png(fileName)
                x.clear()
                endTime = time.time()
                ...
                print counterStr,printStr,varName.ljust(6),BC,timeStr,memStr
                counter = counter+1
            ...
            gc.collect() ; # Attempt to force a memory flush
        ...
        x.ffmpeg(os.path.join(outPath,outMP4File),outFiles,rate=5,bitrate=2048); #,options=u'-r 2') ; # Rate is frame per second - 1/2s per month
    x.clear()

So the print statement produces the following. Just as an aside, the x.ffmpeg call occurs every 1716 steps, so the memory growth and slowdown have little to do with it:

00001 processing: 1870-01 sic     Time: 04.093 secs; Max mem: 0.238 GB
00150 processing: 1882-06 sic     Time: 06.176 secs; Max mem: 0.255 GB
00300 processing: 1894-12 sic     Time: 10.063 secs; Max mem: 0.261 GB
00500 processing: 1911-08 sic     Time: 15.334 secs; Max mem: 0.280 GB
01000 processing: 1953-04 sic     Time: 31.086 secs; Max mem: 0.320 GB
01500 processing: 1994-12 sic     Time: 48.595 secs; Max mem: 0.352 GB
02000 processing: 1893-08 sic  bc Time: 68.674 secs; Max mem: 0.389 GB
02500 processing: 1935-04 sic  bc Time: 90.528 secs; Max mem: 0.480 GB
03000 processing: 1976-12 sic  bc Time: 111.397 secs; Max mem: 0.480 GB
03500 processing: 1875-08 tos     Time: 134.129 secs; Max mem: 0.498 GB
04000 processing: 1917-04 tos     Time: 163.216 secs; Max mem: 0.517 GB

@alliepiper
Contributor

You can try profiling the code and looking at where it's spending all that extra time -- that's how we tracked down the template issue. I'll see if I can dig up some resources.
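
A minimal way to do that with the standard-library profiler (a sketch; plotOneFrame() is a hypothetical stand-in for one plot/png/clear pass of your loop, defined at module level):

import cProfile
import pstats

cProfile.run("plotOneFrame()", "plot_profile.out")

stats = pstats.Stats("plot_profile.out")
stats.sort_stats("cumulative").print_stats(20)   # top 20 calls by cumulative time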

@danlipsa
Contributor

@durack1 That is fine. ssh -X/Y will give you the OpenGL on your local machine. If you have a Linux local machine I would be interested in the test results/glxinfo on that machine, even if you don't run uvcdat there.

@durack1
Member

durack1 commented Jun 10, 2016

@danlipsa almost all our local machines are now OS X laptops.. What info do you need from these machines locally? Is there another way I can get the same info - querying the system hardware?

@danlipsa
Contributor

@durack1 @doutriaux1 @aashish24 I merged in a fix for the VTK bug
https://gitlab.kitware.com/vtk/vtk/merge_requests/1556
However, this does not make a noticeable difference for uvcdat on my machine: I still see a leak of about 1MB per iteration for offscreen rendering. This might be different for a different driver. I think Paul is seeing a much bigger leak per iteration.

@durack1
Member

durack1 commented Jun 13, 2016

@danlipsa the script that I am using is exposing a very big memory leak; it creates a number of VCS objects before saving these using the pngwriter.. If I can get your merged PR into a build I can rerun my script and report the memory usage.. It seems to expose the issue very quickly throughout the ~7k iterations..

@durack1
Member

durack1 commented Jun 15, 2016

@danlipsa, @doutriaux1 will pull across your branch for testing locally.. I will run my script against it and see how it compares to 2.4.1. Are your changes in a PR in this repo, or a branch?

@danlipsa
Contributor

@durack1 Don't do that yet. I don't think you'll see any difference. I want to look more into this.

@durack1
Member

durack1 commented Jun 15, 2016

@danlipsa no problem.. Let me know when you want me to kick the tires; it'll take a little time to get the changes into a local build so I can run my script.

@danlipsa
Contributor

danlipsa commented Jun 17, 2016

@aashish24 @durack1 @doutriaux1 I have a new merge request for the VTK bug. The first time the bug wasn't really fixed.
https://gitlab.kitware.com/vtk/vtk/merge_requests/1574
Waiting for the fix to be merged. @aashish24 😉
This does make a big difference for my uvcdat test: the leak is reduced from about 500MB per 500 iterations to about 6MB per 500 iterations.

@durack1
Member

durack1 commented Jun 17, 2016

@danlipsa thanks for the heads-up.. This sounds like it; the two-order-of-magnitude drop in memory usage should certainly flow into better behavior in my example script.

@doutriaux1
Contributor Author

That is great! @durack1 I will build a conda for you to test it with.

@danlipsa
Contributor

@doutriaux1 Note, the change is not merged in VTK yet - it is in the review process.

@doutriaux1
Contributor Author

Can I build against your commit?

@danlipsa
Contributor

@doutriaux1 Yes, you can. It passes all VTK and uvcdat tests.

@durack1
Member

durack1 commented Jun 21, 2016

@doutriaux1 let me know when I can take a test drive.. I'm curious if this will solve the problem that made things unusable with bg=1

@doutriaux1
Contributor Author

@durack1 it seems that @danlipsa fixed the leak! On oceanonly, use the conda environment LEAK and let us know.

@doutriaux1
Contributor Author

@danlipsa it saves 40MB per frame on oceanonly!

@durack1
Member

durack1 commented Jun 23, 2016

@doutriaux1 nice! I'll rerun my script next week and see what gives.. And also outline the same diagnostics included above. An update will also be useful for dealing with #1424 @chaosphere2112

@doutriaux1
Contributor Author

@durack1 you snoozed too long, it's in master 😉

@durack1
Member

durack1 commented Jun 23, 2016

@doutriaux1 excellent! Is it in conda nightly?

@doutriaux1
Contributor Author

yep


@durack1
Member

durack1 commented Jun 28, 2016

@danlipsa it seems that the bg=1 memory leak is no longer an issue.

The code iterates through monthly data from 1870-01 to 2012-12 over 4 variables (so steps 1 through 6864), as you can see below. It seems that while a number of arrays are being filled the memory climbs (this is likely internal python/UV-CDAT behavior), whereas once those variables are reused there is no memory growth - steps 1717 to 3432, 3433 to 5148, and 5149 to 6795 have very stable memory usage:

UV-CDAT version:      v2.5.0-576-ge31e2fd
UV-CDAT prefix:       /export/durack1/anaconda2/envs/uvcdatReleaseAlpha
delFudge:             True
Background graphics:  True
00001 processing: 1870-01 sic     Time: 04.149 secs; Max mem: 1.127 GB PyObj#: 0093219;
00002 processing: 1870-02 sic     Time: 03.249 secs; Max mem: 1.561 GB PyObj#: 0093572;
00003 processing: 1870-03 sic     Time: 03.741 secs; Max mem: 1.561 GB PyObj#: 0093624;
...
00040 processing: 1873-04 sic     Time: 03.676 secs; Max mem: 1.562 GB PyObj#: 0095549;
...
00051 processing: 1874-03 sic     Time: 03.910 secs; Max mem: 1.566 GB PyObj#: 0096122;
...
00099 processing: 1878-03 sic     Time: 04.010 secs; Max mem: 1.567 GB PyObj#: 0098622;
...
00110 processing: 1879-02 sic     Time: 03.070 secs; Max mem: 1.571 GB PyObj#: 0099195;
...
00215 processing: 1887-11 sic     Time: 02.999 secs; Max mem: 1.573 GB PyObj#: 0104663;
...
00266 processing: 1892-02 sic     Time: 03.873 secs; Max mem: 1.574 GB PyObj#: 0107300;
...
01716 processing: 2012-12 sic     Time: 08.706 secs; Max mem: 1.619 GB PyObj#: 0182700;
01717 processing: 1870-01 sic    bc Time: 11.680 secs; Max mem: 2.032 GB PyObj#: 0182728;
...
03432 processing: 2012-12 sic    bc Time: 10.099 secs; Max mem: 2.032 GB PyObj#: 0272098;
03433 processing: 1870-01 tos     Time: 10.549 secs; Max mem: 2.093 GB PyObj#: 0272026;
...
05148 processing: 2012-12 tos     Time: 13.383 secs; Max mem: 2.093 GB PyObj#: 0361207;
05149 processing: 1870-01 tos    bc Time: 11.519 secs; Max mem: 2.175 GB PyObj#: 0361424;
...
06795 processing: 2007-03 tos    bc Time: 10.731 secs; Max mem: 2.175 GB PyObj#: 0447012;
...

@chaosphere2112 the increase in plot times is still a problem, as are the python objects that continue to accumulate, but that can be solved in #1424

@danlipsa @doutriaux1 I think you can close this issue

@aashish24
Contributor

yay! great job @danlipsa, thanks @doutriaux1 @durack1

@danlipsa
Contributor

@durack1 @doutriaux1 Thanks for testing! Great to hear the fix works for your test case. I am closing this ...

@doutriaux1
Contributor Author

@danlipsa @aashish24 we should take a look at this one!
