gcylc: large memory footprint in graph mode - possible leak? #1620
Comments
Example suite to play with:
Increase the number of tasks to make the problem more prominent. |
With your example suite above, I get a very similar gross memory profile with constant task run length regardless of the value (of run length), e.g. for and But for varied lengths, e.g. your sim mode settings above or live mode with In the latter case presumably the suite graph updates more often because fewer things happen near-simultaneously. All this would seem to suggest that graph view memory goes up with each graph update, even though (certainly in this non-cycling case) the graph is not actually growing any bigger. |
On a simpler suite, |
Same result at cylc-5.4.7 |
I've been doing some internal function profiling too - https://pypi.python.org/pypi/memory_profiler - but have not found the problem as yet. It could be in the xdot code. |
Agreed. I think it's definitely related to the updates themselves.
Interesting, that probably explains why the user's suite was so badly affected - lots of very long task names.
I've been coming to that conclusion myself but struggling to narrow down where the problem is occurring in it. Commenting out the:
line under |
For my 100-member suite with 100-char task names (above), line 481 here increments 0.6MiB for every graph update. From further investigation, it seems that text rendering in xdot is the problem: returning immediately from |
I suspect the problem is due to this extant pygtk bug: https://bugzilla.gnome.org/show_bug.cgi?id=599730 |
I have also come to the same conclusion having done various Google searches. |
Not sure what to do about this. Presumably we could provide a pygtk patch ourselves, but having to build pygtk would be a significant complication for cylc installation! |
Fixed in Ubuntu? https://bugs.launchpad.net/ubuntu/+source/pygtk/+bug/981376 |
Given that it is not really a cylc bug, if we can confirm that the bug is fixed in the latest release of GTK and that the fix has been rolled out in the latest releases of major Linux distros such as RHEL, Ubuntu, (SuSE?), etc., then we should close this issue. |
Unfortunately that's not going to work. I took a git clone of pygtk; the last release, 2.24.0, was more than four years ago. It looks like the bug was fixed two years ago, but I'm guessing the fix will never be released because pygtk is being superseded by pygobject:
|
That's unfortunate. Perhaps there is a need to move to |
See #112 |
@matthewrmshin/@hjoliver - as this is a non-cylc issue and will be resolved/irrelevant when we later move to a different GUI technology, can we close this down? |
@arjclark - I think we should keep this open while we're still using the current GUI technology, as a reference for anyone who notices the problem. Also, I remain skeptical that the current GUI can be superseded very quickly, unless perhaps we go to pygobject before developing a web-based GUI. |
Can we test this on an up-to-date distro? |
Solving #1873 will hopefully kill this issue. |
We are now working on the new Web GUI, which will replace the previous Desktop PyGTK GUI in our next major release of Cylc 8. This issue is being closed for not being applicable to this new Web GUI. |
When using the graph view in gcylc the memory footprint grows over time to worryingly large values. My suspicion is that we have some sort of memory leak either in our code or the underlying graphviz library.
Following a report from a user about large memory footprints for their gcylc sessions, of over a gig each (1.4G and 1.9G), I've been investigating this and reproducing signs of a problem with a version of the UM's rose-stem suite running in simulation mode, with each task set to take 60-120s to run. As per the user's report, I've been setting the view to graph mode, ungrouping all families and filtering out succeeded tasks.
What leads me to believe we have a memory leak is that, logically, as the number of tasks in the view decreases the overheads should get smaller, whereas what I'm seeing is the memory footprint increasing with time until the suite completes, at which point (blank graph) it remains at the higher value. Swapping to one of the other views doesn't reduce the footprint, and swapping back to graph sets it increasing again. If I close and re-open the GUI, the footprint starts small again and then keeps increasing, adding to the suggestion of a memory leak (although it is not as bad as before in this case, presumably due to a smaller initial number of graph nodes and edges).
This doesn't appear to be a recent development (had thought it might be down to some of the caching we've started doing) as I've been able to reproduce it from master back to earlier 6.x versions (got as far back as 6.1.0 with this so far before hitting a task naming convention error).
What I'm not sure of is what's going on. Is this a cylc issue or a graphviz one, given that it doesn't affect the other views? We generally wouldn't recommend that users run graph view for suites as large as the one I'm investigating, but the problem presumably still exists for other suites too.
I'll look to produce a simple example suite to investigate with next week.
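To make observations like the ones above repeatable, the footprint of a running GUI process can be sampled at intervals rather than eyeballed. The following is only a rough sketch under some assumptions: it reads `VmRSS` from the Linux `/proc` filesystem, so it returns `None` on systems without `/proc`, and the helper names `rss_kib` and `sample_rss` are made up for illustration. Pass the PID of the gcylc process to watch it from outside; with no PID it samples itself.

```python
import time


def rss_kib(pid=None):
    """Return resident set size in KiB from /proc, or None if unavailable."""
    path = "/proc/%s/status" % ("self" if pid is None else pid)
    try:
        with open(path) as handle:
            for line in handle:
                # The line looks like "VmRSS:     123456 kB".
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        return None
    return None


def sample_rss(pid=None, samples=5, interval=1.0):
    """Collect several RSS readings, pausing `interval` seconds after each."""
    readings = []
    for _ in range(samples):
        readings.append(rss_kib(pid))
        time.sleep(interval)
    return readings


readings = sample_rss(samples=3, interval=0.01)
print(readings)
```

A steadily rising series of readings while the graph shrinks, and a flat series after switching to another view, would match the behaviour described in this report.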