Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NDDataArray - using weakref for uncertainty and include it in tests #4825

Closed
MSeifert04 opened this issue May 1, 2016 · 17 comments · Fixed by #4919
Closed

NDDataArray - using weakref for uncertainty and include it in tests #4825

MSeifert04 opened this issue May 1, 2016 · 17 comments · Fixed by #4919
Assignees
Labels
Milestone

Comments

@MSeifert04
Copy link
Contributor

MSeifert04 commented May 1, 2016

In #4799 I realized that the current hardlinking between NDData and uncertainty leads to a memory leak. I don't want to update it in that PR but if the PR get's merged I (or someone) else need to edit the uncertainty.setter of NDDataArray too.

Maybe even deprecating NDDataArray if NDDataRef from #4797 is merged. But at least ccdproc is still using it.

Also currently there are few tests concerning NDDataArray. One possibility would be to include ccdproc and other affiliated packages that use NDData-derivatives into the tests or make the Mixin tests also include NDDataArray.

@pllim pllim added the nddata label May 2, 2016
@pllim
Copy link
Member

pllim commented May 2, 2016

@MSeifert04 , do you have more info on the memory leak, say, from a diagnostic tool?

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented May 2, 2016

@pllim Good point I thought the open-files issue from #4799 would be enough but I had done some tests locally, I'll submit them shortly.

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented May 2, 2016

This was the quick test I originally made (slightly adapted though).

For simplicity I've created a function that creates the NDDataArray but doesn't return it. That should immediatly trigger cleanup of any not returned variables. (The results also look the same if I sleep(1) in between.) So in theory the memory consumption in and outside the function should be constant. The memory consumption inside the function offsetted by the amount that is needed to hold the NDData object.

from astropy.nddata import NDDataArray, StdDevUncertainty
import numpy as np
import time
import psutil

# Function that creates NDDataArray-objects and returns the memory usage
def do_analysis():
    mem_usage = []
    mem_usage_func = []
    for i in range(10):
        mem_usage_func.append(create_nddata())
        mem_usage.append(psutil.virtual_memory().percent)
    return mem_usage, mem_usage_func

mem_usage = do_analysis()

import matplotlib.pyplot as plt

plt.plot(mem_usage)
plt.plot(mem_usage_func)
plt.show()

Currently

Without uncertainty (no memory leak)

def create_nddata():
    ndd = NDDataArray(np.ones((5000,5000)))
    return psutil.virtual_memory().percent

figure_1-2

With uncertainty (potential memory leak)

def create_nddata():
    ndd = NDDataArray(np.ones((3000,3000)),
                      uncertainty=StdDevUncertainty(np.ones((3000, 3000))))
    return psutil.virtual_memory().percent

figure_1-1

This clearly shows a linearly increasing memory usage which might indicate that the object cannot be deallocated when the function exits. That the memory consumption is identical inside and outside the function is another hint. The initial failing Travis tests (from #4799) complaining about "open files" yet another indicator.

Using the version from the current PR (memory leak fixed)

def create_nddata():
    ndd = NDDataArray(np.ones((3000,3000)),
                      uncertainty=StdDevUncertainty(np.ones((3000, 3000))))
    return psutil.virtual_memory().percent

figure_1

With the weakref the memory consumption under identical conditions is again constant, indicating that the object is cleaned up properly when the function exits.

@MSeifert04
Copy link
Contributor Author

I hope that's somehow convincing. 😄 If unclear (sorry I was in a hurry when creating these figures) I can update the plots and analysis.

@pllim
Copy link
Member

pllim commented May 2, 2016

Just want to be clear -- You mentioned two different PRs above. Which one actually has the fix?

@MSeifert04
Copy link
Contributor Author

@pllim #4799 has the fix.

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented May 2, 2016

But the PR (#4799) isn't about the fix. The fix was just included to deal with the test-failures.

@MSeifert04
Copy link
Contributor Author

I've edited the plots a bit. Is that sufficient to conclude that there is a memory-leak?

@pllim
Copy link
Member

pllim commented May 2, 2016

Thanks for the plots! @mwcraig ?

@mwcraig
Copy link
Member

mwcraig commented May 2, 2016

@plim I'll get to this and the other nddata PRs on Wednesday, will try to hit this one tomorrow.

@MSeifert04
Copy link
Contributor Author

@mwcraig If it would help you for review I could open another PR including only the fix for that issue. Currently the fix is in #4799 and probably a bit "hidden".

@mwcraig
Copy link
Member

mwcraig commented May 10, 2016

@MSeifert04 wow, sorry it took so long to actually look at this. If you could separate the memory leak (which I think MUST get into 1.2 if at all possible) from the broader IO pR #4799 that would be AWESOME!

@mwcraig mwcraig added this to the v1.2.0 milestone May 10, 2016
@mwcraig mwcraig self-assigned this May 10, 2016
@MSeifert04
Copy link
Contributor Author

MSeifert04 commented May 10, 2016

@mwcraig So not splitting it would mean a higher chance of #4799 getting merged without extra work? That doesn't sound very convincing to me. 😄

I'll try to create a PR only containing the memory-leak and stolen-parent fix.

@mwcraig
Copy link
Member

mwcraig commented May 11, 2016

Hmm, hadn't thought about it that way....

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented May 11, 2016

@mwcraig PR #4799 is now without the fix (failing because of the memory leak) and PR #4862 about fixing the memory leak and associated problems. Wasn't too hard to seperate them after all.

@mwcraig
Copy link
Member

mwcraig commented May 15, 2016

@MSeifert04 -- is the memory leak part of this fixed in #4862? A separate issue has been opened for potential renaming/reorganization of classes.

If this is fixed now, please close (and thanks again for fixing and finding this).

@MSeifert04
Copy link
Contributor Author

@mwcraig I'll submit a PR to make sure it's fixed for NDDataArray too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants