Potential memory leak #1014
I know that there is a tool (that I currently cannot find, since Google is blocked in China) that allows you to list all objects currently in Python memory. There is also functionality to only list those objects that are "new" since some checkpoint. You could use this to see what new stuff is being allocated in each iteration and try to see if there is anything wrong there. |
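A checkpoint-style diff of live objects can also be approximated with nothing but the standard library's `gc` module. This is only a sketch (comparing object ids is approximate, since ids can be reused after objects die), but it is enough to spot per-iteration growth:

```python
import gc

def snapshot_ids():
    # Record the ids of every object the garbage collector currently tracks.
    return {id(o) for o in gc.get_objects()}

def new_objects(baseline):
    # Return objects the GC tracks now but did not at the baseline snapshot.
    return [o for o in gc.get_objects() if id(o) not in baseline]

baseline = snapshot_ids()
allocated = [[i] for i in range(5)]   # stand-in for one iteration's allocations
fresh = new_objects(baseline)
print(sum(1 for o in fresh if isinstance(o, list)))
```

Taking a baseline before the solver loop and diffing inside the loop would show which object types accumulate.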
Btw, that callback is quite useful, would you consider making a pull request with it? |
Finally, is this specific to this example, or could you try reproducing it with something that does not take hours to run? Anyway, obviously looks quite bad and is something we need to fix. |
In the iteration part of chambolle_pock_solver, the lines proximal_dual(sigma)(dual_tmp, out=y) and proximal_primal(tau)(primal_tmp, out=x) are run at each iteration, so new operators are created every time. Hence I guess that would consume new memory at each iteration without releasing the old. |
That is likely not the issue since python garbage collects things like that. The issue is somewhere deeper. Basically there are two ways that we could leak memory (that I know of)
```python
import psutil
import numpy as np

class MyList(list):
    def __init__(self, *args, **kwargs):
        # Allocate something to cause memory growth
        self.large_array = np.random.rand(50000)
        list.__init__(self, *args, **kwargs)

    def __del__(self):
        pass

n = 10000
for i in range(n):
    x = MyList()
    x.append(x)
    if i % 1000 == 0:
        print('RAM usage: {}%'.format(psutil.virtual_memory().percent))
```

which gives
However, this only occurs with Python < 3.4, since PEP 442 fixed this in 3.4, giving something like:
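The PEP 442 behavior is easy to verify directly: on Python >= 3.4, a reference cycle whose objects define `__del__` is still collected, and `gc.garbage` stays empty. A minimal check (the `Node` class is just an illustration, not from the issue):

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

    def __del__(self):
        # Pre-3.4, a finalizer like this made cycles uncollectable.
        pass

# Build a self-referencing cycle and drop the only external reference.
n = Node()
n.ref = n
del n

gc.collect()
print(len(gc.garbage))  # 0 on Python >= 3.4 (PEP 442)
```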
Interestingly, we only implement
So anyway, my debug steps for you would be:
|
This is definitely useful. @chongchenmath would you make a separate pull request just with this callback, giving it some name like |
You're welcome! This was input by @aringh, so I think he will do it then. |
We run the code with Python 3.5.3, so it may not be a Python version problem. |
Frankly, there is a large risk this has to do with astra then. Anyway, we need statistics on new allocations to get further with this. |
Yes, of course I can do a PR with the callback. When it comes to the real issue:
but, it will have to wait until next week ;-) |
No problem! But I think this has to do with 3d somehow, given that we did 20 000 000 projections in https://github.com/adler-j/learned_gradient_tomography without any noticeable memory leaks. |
Any update on this rather interesting issue? |
I am stuck with other things right now... but if I make sufficient progress there I hope to be able to have a look at this late this afternoon. |
That sure looks like it. By looking at diffs between iterates you should be able to see what has been created. You can do that by writing a simple callback. |
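A simple callback along those lines could track how per-type object counts change between iterations. This is only a sketch: it assumes the solver invokes the callback once per iteration with the current iterate (the ODL callback convention), and it gathers type counts from `gc.get_objects()`:

```python
import gc
from collections import Counter

class CallbackObjectDiff:
    """Print how per-type object counts change between iterations."""

    def __init__(self):
        self.prev = None

    def type_counts(self):
        # Count GC-tracked objects grouped by type name.
        return Counter(type(o).__name__ for o in gc.get_objects())

    def __call__(self, x):
        counts = self.type_counts()
        if self.prev is not None:
            diff = {t: c - self.prev[t]
                    for t, c in counts.items() if c != self.prev[t]}
            print('object count changes:', diff)
        self.prev = counts

# Usage, with a plain loop standing in for solver iterations:
cb = CallbackObjectDiff()
for i in range(3):
    cb([i] * 10)
```

Types whose counts grow steadily across iterations are the leak candidates.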
I have started to look at it. All code, output, and findings can be found on this branch. Conclusion so far: I still have no idea. |
VERY nice investigative work, and interesting indeed that it seems to "stagnate" at some point (indicating that this may have something to do with garbage collection). Some suggestions:
P.S. Quite sure this type of work should not be on a branch (since it will never be merged back, given the huge log files). |
Current update on this, when running from Conclusion 1: something has happened between commit 7824dd2 and 3fd52ac that makes the latter allocate much more memory in fewer iterations. |
Update on previous comment: in fact I seem to have a similar behavior as originally reported. Memory use seems to increase in the same way during the first 66 iterations, after which I terminated the code in order to try to run code using P.S. Output files and notes are updated on the branch. |
Did you try this:
Which is a likely culprit |
I have not tried it. But in order to reproduce the exact same behavior as first reported, I need to run from commit I will pause this debugging for now: I think I need some help from @adler-j in person in order to continue this in a meaningful way :-) |
No problem, I'll be back next week. Great job so far! |
This has basically been solved in the ASTRA branch by this commit: astra-toolbox/astra-toolbox@aa32503. We're waiting for a conda release. |
Closing this issue here for now. Solution is to build astra locally. |
There seems to be a potential memory leak somewhere in odl. Running the code from @davlars repo to make reconstructions on simulated 3D data, the used RAM memory is continuously increasing. Trying a TV reconstruction with the Chambolle-Pock solver, we (@aringh and I) get the following memory usage in the iterations:

As can be seen, the percentage of used RAM increases from 59.2 to 72.8 over the iterations.
We do not yet know where the leak is, and it needs further investigation. Can anyone suggest a good way to find it?
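One standard-library option for pinning down the allocation site is `tracemalloc` (available since Python 3.4, so usable with the 3.5.3 setup described here). A minimal sketch, with a plain loop standing in for the solver iterations:

```python
import tracemalloc

# Tracing must be started before the allocations we want to attribute.
tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Stand-in for the solver loop: each iteration allocates and keeps memory.
leak = []
for _ in range(100):
    leak.append(bytearray(10000))

current = tracemalloc.take_snapshot()
top = current.compare_to(baseline, 'lineno')
for stat in top[:3]:
    # Each entry shows a source line and how much new memory it holds.
    print(stat)
```

Wrapping the real reconstruction loop this way would point directly at the file and line responsible for the growth.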
The code used to get the above printouts is below (note, however, that as-is it will take several hours to run 500 iterations).