Fix memory leak in cython functions length and set_number_of_points #411

Merged

merged 5 commits into nipy:master from MarcCote:cython_memory_leaks on Sep 11, 2014

Conversation

@MarcCote
Contributor

MarcCote commented Aug 27, 2014

I found a memory leak in the cythonized versions of dipy.tracking.streamline.length and dipy.tracking.streamline.set_number_of_points. Cython automatically converts a list of numpy arrays into a vector of memoryviews, but at the end of the function, when the vector is cleared, the memoryviews' refcounts are not decreased. So I do it manually.

I added a new unit test to expose the memory leak: test_streamline:TestStreamline.test_memory_leak(). Below, I report the difference in RAM usage between successive calls to each function.

Before
length()
['0.13Mo', '0.22Mo', '0.28Mo', '0.50Mo', '0.31Mo', '0.33Mo', '0.08Mo', '0.41Mo', '0.14Mo', '0.06Mo', '0.47Mo', '0.31Mo', '0.32Mo', '0.17Mo', '0.48Mo', '0.29Mo', '0.06Mo', '0.13Mo', '0.20Mo']

set_number_of_points()
['1.50Mo', '0.90Mo', '1.26Mo', '1.11Mo', '1.89Mo', '1.06Mo', '1.02Mo', '1.09Mo', '1.12Mo', '1.17Mo', '1.12Mo', '1.12Mo', '1.12Mo', '1.12Mo', '1.12Mo', '1.12Mo', '1.12Mo', '0.98Mo', '1.09Mo']

Now
length()
['0.15Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo']

set_number_of_points()
['0.06Mo', '0.17Mo', '0.00Mo', '0.25Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo', '0.00Mo']
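The kind of RSS-delta measurement reported above can be sketched with a small helper (assuming a POSIX platform, since the `resource` module is Unix-only; note `ru_maxrss` is in kilobytes on Linux but bytes on macOS, and the helper names below are hypothetical, not dipy code):

```python
import gc
import resource  # POSIX-only; unavailable on Windows

def rss_mb():
    # Peak resident set size of this process, in MB (Linux reports KB).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

def rss_deltas(func, arg, n=20):
    # Call func(arg) n times and record the growth in peak RSS after each
    # call; a leak shows up as persistently positive deltas.
    deltas = []
    last = rss_mb()
    for _ in range(n):
        func(arg)
        gc.collect()
        now = rss_mb()
        deltas.append(now - last)
        last = now
    return deltas
```

A leaking function keeps producing positive deltas like the "Before" lists above; a fixed one stays near zero once the peak stabilizes.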

@MrBago

Contributor

MrBago commented Aug 27, 2014

Why does the vector need to be manually de-referenced? Is this a bug in cython?

cdef void XDEC_vector_of_memview(Streamlines streamlines) nogil:
# Cython automatically convert list of numpy array into vector of memoryviews but

@Garyfallidis

Garyfallidis Aug 27, 2014

Member

Typos. Please rephrase: can convert a list of numpy array into a vector of memoryviews but ...

@Garyfallidis

Member

Garyfallidis commented Aug 27, 2014

Yes, @MrBago this is a bug in Cython. @MarcCote managed to find it by looking into the C++ code generated by Cython. He is also now reporting it to the Cython mailing list.

Great catch @MarcCote !!!

@MrBago

Contributor

MrBago commented Aug 27, 2014

I'm worried that if/when the bug is fixed in cython we'll end up de-referencing the vector twice. Do you expect this patch to play nice with future versions of cython?


@Garyfallidis

Member

Garyfallidis commented Aug 27, 2014

Maybe @MarcCote can add a small test without dereferencing which will start failing when the cython people fix the bug.

Maybe the test he already has can give us the same information. @MarcCote what do you think?

@MrBago

Contributor

MrBago commented Aug 27, 2014

It would be nice if we could figure out a fix that works with multiple versions of cython; otherwise we'll need to force a cython version jump at some point in the future. Is the vector method significantly faster than just declaring a list or tuple of memory views?

Bago


@Garyfallidis

Member

Garyfallidis commented Aug 27, 2014

Yes, the new method gives 90X speedups and the API is much better. It is really worth using what Marc suggests here. He will have another PR next with the new version of QuickBundles and you will be able to see and judge the added value yourself.


@MrBago

Contributor

MrBago commented Aug 27, 2014

Can I suggest that we re-write these functions using python lists instead of c++ vectors for now? I've written a quick prototype to show that this can be done just as fast, if not faster, using lists: https://github.com/MrBago/dipy/compare/length_benchmark.


@MarcCote

Contributor

MarcCote commented Aug 27, 2014

Since I apply the fix at the end of the function, it should not be a problem to decrease the refcount twice. I just tried it (adding many __PYX_XDEC_MEMVIEW(&streamlines[i], 1) calls) and no segfault or exception is raised.

@MrBago

Contributor

MrBago commented Aug 27, 2014

Also my benchmark results:

python length_bench.py
timing double
with list: 1.32256698608
with vector: 1.61416316032

timing float32
with list: 1.46335792542
with vector: 1.62773990631
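A micro-benchmark of this shape can be reproduced with a pure-NumPy stand-in for the Cython kernels (a sketch; the `lengths` helper below is hypothetical and illustrates the list-of-arrays calling convention, not the actual dipy implementation):

```python
import timeit

import numpy as np

def lengths(streamlines):
    # Sum of Euclidean distances between consecutive points, per streamline.
    return [np.sqrt((np.diff(s, axis=0) ** 2).sum(axis=1)).sum()
            for s in streamlines]

rng = np.random.default_rng(0)
streamlines64 = [rng.random((100, 3)) for _ in range(100)]    # double
streamlines32 = [s.astype(np.float32) for s in streamlines64]  # float32

for name, data in [("double", streamlines64), ("float32", streamlines32)]:
    t = timeit.timeit(lambda: lengths(data), number=20)
    print(f"{name}: {t:.4f}s")
```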


@MarcCote

Contributor

MarcCote commented Aug 27, 2014

Lists look nice, I'll have a closer look.

@Garyfallidis

Member

Garyfallidis commented Aug 27, 2014

Just a clarification here: the 90X speedup is with the new QuickBundles, not with length.

What do you think @MarcCote, should we use your length or @MrBago's length? Would you have any problem using QB2 for length clustering with what Bago suggests?

@MarcCote

Contributor

MarcCote commented Aug 29, 2014

@MrBago thanks, you made me realize we don't really need std::vector, so I could remove the dependency on C++. @Garyfallidis it will also work fine for QuickBundles.

@Garyfallidis

Member

Garyfallidis commented Aug 29, 2014

Super! Simplicity always wins! Thx @MarcCote and of course @MrBago for the feedback and the example.

@MrBago can you merge this if you think that it is ready?

@@ -101,11 +93,14 @@ def length(streamlines):

@MrBago

MrBago Aug 29, 2014

Contributor

I'm a little worried that you're only checking the shape of the first array here. Could this lead to a segfault if any of the other arrays do not have enough points? It seems like there is a pretty easy fix though: if you just make _length work for any number of points you will not have this issue, you can drop the TODO, and you can take care of #412 at the same time (#412 is going to conflict with these changes, so we might as well just take care of it here so we don't need to merge them separately).

@MarcCote

MarcCote Aug 29, 2014

Contributor

You are right: if the streamlines have a different number of dimensions it will produce unexpected results. I generalized both _length and _arclengths and merged #412.

if dtype == np.float32:
_length[float2d](streamlines, streamlines_length)
for i in range(len(streamlines)):
streamlines_length[i] = _length[float2d](streamlines[i])

@MrBago

MrBago Aug 29, 2014

Contributor

I suspect declaring streamlines_length as an ndarray or a memory view will give you better performance here.

@MarcCote

MarcCote Aug 29, 2014

Contributor

I tested it and there is no gain in speed. Also, the returned object would be a MemoryView instead of a numpy array and would need to be cast anyway.

for idx in range(streamlines.size()):
streamline = streamlines[idx]
out[idx] = 0.0
int i

@MrBago

MrBago Aug 29, 2014

Contributor

Should this be np.npy_intp instead of int? Not sure it matters.

@MrBago

MrBago Aug 29, 2014

Contributor

Streamlines are not going to be more than 2G so I don't think it really matters here, but it's good to get into the habit of using intp instead of int when you're using it to index into memory. This PR otherwise looks good to me.
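The distinction can be checked from Python: `np.intp` corresponds to C's `npy_intp` and is pointer-sized, while a C `int` is typically 32 bits even on 64-bit platforms (a small illustrative sketch):

```python
import ctypes

import numpy as np

# np.intp mirrors C's npy_intp: an integer wide enough to hold a pointer,
# so it can index any in-memory array without overflow.
assert np.dtype(np.intp).itemsize == ctypes.sizeof(ctypes.c_void_p)

# A C int is usually 4 bytes, so indices past 2**31 - 1 would overflow it;
# np.intp is 8 bytes on 64-bit platforms.
print("intp bytes: ", np.dtype(np.intp).itemsize)
print("c_int bytes:", ctypes.sizeof(ctypes.c_int))
```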

omarocegueda and others added some commits Aug 27, 2014

BF: use simple multiplication instead of using exponentiation and pow, to prevent the generated cpp code from calling pow (visual c gets confused because it doesn't know which version of pow to use)

Conflicts:
	dipy/tracking/streamlinespeed.pyx
@MarcCote

Contributor

MarcCote commented Aug 29, 2014

Now the functions length and set_number_of_points support lists of 2D arrays of any shape.
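In pure NumPy, the semantics of the two generalized functions can be sketched like this (an illustrative reimplementation of the behavior, not dipy's actual Cython code):

```python
import numpy as np

def length(streamline):
    # Arc length: sum of Euclidean distances between consecutive points.
    # Works for a 2D array of any (num_points, num_dims) shape.
    return np.sqrt((np.diff(streamline, axis=0) ** 2).sum(axis=1)).sum()

def set_number_of_points(streamline, n):
    # Resample to n points by linear interpolation along the arc length.
    seg = np.sqrt((np.diff(streamline, axis=0) ** 2).sum(axis=1))
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    new_arc = np.linspace(0.0, arc[-1], n)
    return np.vstack([np.interp(new_arc, arc, streamline[:, d])
                      for d in range(streamline.shape[1])]).T

line = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(length(line))                        # 2.0
print(set_number_of_points(line, 5).shape) # (5, 3)
```

Resampling preserves the endpoints, so lengths stay comparable across streamlines with different point counts.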

@Garyfallidis

Member

Garyfallidis commented Sep 3, 2014

@MrBago this looks ready. Can you merge it?

@MarcCote

Contributor

MarcCote commented Sep 11, 2014

Gentle reminder in case you forgot: is there anything missing for this PR?

MrBago added a commit that referenced this pull request Sep 11, 2014

Merge pull request #411 from MarcCote/cython_memory_leaks
Fix memory leak in cython functions length and set_number_of_points

@MrBago MrBago merged commit c7773a1 into nipy:master Sep 11, 2014

1 check passed: continuous-integration/travis-ci (The Travis CI build passed)
@omarocegueda

Contributor

omarocegueda commented Sep 12, 2014

Hello @MarcCote, @MrBago, @Garyfallidis
I am having the same issue as this:
http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/25/steps/shell_10/logs/stdio
on my Windows machine. Apparently the 'resource' module is not included in Anaconda. Were you aware of this? Is it OK to have more dependencies in test code?
Thanks!

@MarcCote

Contributor

MarcCote commented Sep 12, 2014

My mistake, I was not aware that resource is a Unix-only module! https://docs.python.org/2/library/resource.html

Now that we are no longer using C++ in Cython (which is why we had the leaks in the first place), we could remove the memory_leak test. See PR #416.
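Alternatively, the test could be kept but guarded, detecting `resource` instead of importing it unconditionally at module level (a sketch; the test body is a placeholder, not the real test_memory_leak):

```python
import importlib.util
import unittest

# 'resource' is a POSIX-only stdlib module, so probe for it rather than
# importing it at module level (which raises ImportError on Windows).
HAVE_RESOURCE = importlib.util.find_spec("resource") is not None

class TestStreamline(unittest.TestCase):
    @unittest.skipIf(not HAVE_RESOURCE,
                     "resource module unavailable (non-Unix platform)")
    def test_memory_leak(self):
        import resource
        # Placeholder check: peak RSS is readable and positive.
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        self.assertGreater(peak, 0)
```

With the skip decorator, Windows runs report the test as skipped instead of erroring out at import time.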

@MarcCote MarcCote deleted the MarcCote:cython_memory_leaks branch Oct 9, 2014
