Change the writeable flag back to its original state when finished. #431

Merged
merged 7 commits into nipy:master from MarcCote:fix_streamlinespeed on Oct 9, 2014

Conversation

@MarcCote
Contributor

MarcCote commented Oct 1, 2014

Right now, using the Cython versions of length and set_number_of_points on read-only NumPy arrays (as provided by NiBabel) fails. The reason is that a memoryview can only be used with a writable NumPy array (for more details, see https://mail.python.org/pipermail/cython-devel/2013-February/003394.html).

This PR makes sure those Cython functions set the writeable flag of the NumPy arrays to True before processing them and change it back to its original state once done.

streamlines_length[i] = _length[double2d](streamlines[i])
streamline.setflags(write=writeable)
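As a rough illustration, the save/restore pattern looks like the pure-Python sketch below (run_on and its sum() body are hypothetical stand-ins for the actual Cython kernels). As the discussion later notes, setflags(write=True) can itself raise ValueError for views of read-only arrays, so this only works when the array owns its memory.

```python
import numpy as np

def run_on(arr):
    # Hypothetical stand-in for length/set_number_of_points: remember the
    # original writeable state, enable writing so the Cython memoryview
    # will accept the buffer, then restore the flag when done.
    was_writeable = arr.flags.writeable
    arr.setflags(write=True)
    try:
        total = float(arr.sum())  # placeholder for the real kernel
    finally:
        arr.setflags(write=was_writeable)
    return total

streamline = np.ones((100, 3), dtype=np.float32)
streamline.setflags(write=False)   # e.g. an array marked read-only on load
print(run_on(streamline))          # 300.0
print(streamline.flags.writeable)  # False again afterwards
```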


@MrBago

MrBago Oct 1, 2014

Contributor

Sorry this is probably me just being stupid, but what is streamline here? It doesn't seem like the streamline variable is being updated in the loop.


@MarcCote

MarcCote Oct 1, 2014

Contributor

Woa, no you are not, I am. It should be streamlines[i] instead of streamline.

@MrBago


Contributor

MrBago commented Oct 1, 2014

My issue with this approach is that it doesn't solve the problem in general. Read-only arrays sometimes raise an error if you try to make them writable; in particular, a view of a read-only array cannot be made writeable. Take this example:

import numpy as np
a = np.zeros((10, 3))
a.flags.writeable = False
b = a[:]
b.setflags(write=True)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-58-b316630e639a> in <module>()
----> 1 b.setflags(write=True)

ValueError: cannot set WRITEABLE flag to True of this array

I think the only way to solve this problem in general is to copy the array by doing something like this:

streamline = streamline if streamline.flags.writeable else streamline.copy()

That being said, I think this is an OK solution to the nibabel issue if we make it clear in the comments that that's the problem it's meant to solve.

Another approach to solving the nibabel problem (though it may have its own issues) is to have trackvis.read use np.fromstring instead of np.ndarray(..., buffer=string).
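The difference between the two constructions can be seen in a small sketch (variable names are illustrative; since np.fromstring's binary mode is deprecated in recent NumPy, the copying behaviour is shown via frombuffer().copy()):

```python
import numpy as np

data = np.arange(6, dtype=np.float32).tobytes()  # e.g. raw bytes from a .trk file

# Zero-copy view over an immutable bytes object: read-only.
view = np.ndarray((2, 3), dtype=np.float32, buffer=data)

# An explicit copy owns its memory, so it is writeable.
copied = np.frombuffer(data, dtype=np.float32).copy().reshape(2, 3)

print(view.flags.writeable)    # False
print(copied.flags.writeable)  # True
```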

@Garyfallidis


Member

Garyfallidis commented Oct 1, 2014

Streamlines are usually very large datasets (>1.5 GB). If we start copying them we will definitely get MemoryErrors.

@MrBago


Contributor

MrBago commented Oct 1, 2014

You only need to copy one streamline at a time; each copy is freed as soon as you copy the next one. Even a streamline with thousands of points only takes up a few KB.
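A quick back-of-the-envelope check of the "few KB" claim (the 1000-point streamline size is an assumption for illustration):

```python
import numpy as np

# Peak extra memory when copying one streamline at a time is a single
# streamline's buffer: here, 1000 points x 3 float32 coordinates.
points, dims = 1000, 3
bytes_per_copy = points * dims * np.dtype(np.float32).itemsize
print(bytes_per_copy)  # 12000 bytes, i.e. ~12 KB
```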


@Garyfallidis


Member

Garyfallidis commented Oct 1, 2014

Okay, this was not clear in your previous post, but I see what you mean now. Copying memory in small blocks takes time too, though. So before you change this, please profile it first with a few million streamlines, and let's see how it does.

@MarcCote


Contributor

MarcCote commented Oct 2, 2014

Following the idea proposed by @MrBago, I used astype before calling the inner functions _length and _set_number_of_points, instead of changing the writeable flag, when needed.

I benchmarked the change using 1k streamlines of 100 points for set_number_of_points and 10k for length.

bench_resample ... Timing set_number_of_points() in Cython
New version time: 11.0sec
Old version time: 10.1sec
Speed up of 0.920765027322x

bench_length ... Timing length() in Cython
New version time: 39.2sec
Old version time: 32.9sec
Speed up of 0.838734371013x

What do you think of this slowdown? It is still faster than the Python version though.

Edit: This benchmarking was done on streamlines that are writeable and in float32 (I think the most useful use case for those functions). But it's worth mentioning that the new version is more flexible, as it accepts lists of streamlines with different dtypes and different writeable flag statuses. Basically, before calling the inner Cython function (e.g. _length), I cast each streamline to float32 or float64 (using astype) if its dtype is an integer type (int8/int16/int32/int64) or if its writeable flag is False.

@Garyfallidis


Member

Garyfallidis commented Oct 2, 2014

Can you try it with millions of streamlines please?

@MrBago


Contributor

MrBago commented Oct 2, 2014

Keep in mind the difference should only affect read-only arrays.


@MarcCote


Contributor

MarcCote commented Oct 2, 2014

@MrBago with the modifications I made (called the flexible version), that is not the case anymore. Even for arrays that are not read-only there will be a difference, because I am now doing more comparisons in order to handle lists of streamlines with mixed dtypes/writeable flags. The code is cleaner and shorter this way, but it adds a small cost proportional to the number of streamlines.

@Garyfallidis I tested with a million streamlines; I forgot to mention that I was repeating the functions 100 times in the benchmark. Here are the results for 1M streamlines and repeat set to 1:

Timing length() in Cython
Flexible version time: 7.3sec
MrBago version time: 3.45sec <-- Caching caused this (see next post instead).
Speed up of 0.472602739726x

Timing set_number_of_points() in Cython
Flexible version time: 17.8sec
MrBago version time: 15.5sec
Speed up of 0.874507597074x

It seems the overhead has a bigger impact on the length function.
Edit: I put clearer names for the versions being benchmarked.
Edit2: see the next post for up-to-date benchmarking results.

@MarcCote


Contributor

MarcCote commented Oct 2, 2014

I guess we don't really need those functions to be more flexible. In the end, the normal use case for those functions is a list of streamlines that all share the same dtype and writeable flag status.

I should have started by showing you this benchmark: repeat=10, 1M streamlines, 100 points each, writeable=True, dtype=float32.

Timing length() in Cython
Original version time: 14.8sec
MrBago version time: 15.8sec (0.933x)
Flex version time: 19.7sec (0.748x)
MrBago (read-only) version time: 28.0sec (0.528x)
Flex (read-only) version time: 36.9sec (0.4x)
Timing set_number_of_points() in Cython
Old version time: 62.4sec
MrBago version time: 67.6sec (0.923x)
Flex version time: 72.9sec (0.857x)
MrBago (read-only) version time: 81.7sec (0.764x)
Flex (read-only) version time: 84.0sec (0.743x)

Edit: It appears caching was impacting the benchmark, so I've redone it, making sure new streamlines are recreated before each call to the function.

MarcCote added some commits Oct 2, 2014

Make a new benchmarking function to compare the Original, MrBago and the flexible versions of the functions length and set_number_of_points in Cython
@Garyfallidis


Member

Garyfallidis commented Oct 3, 2014

Okay, I get the same timings as you, @MarcCote. Switch to what @MrBago is suggesting and I will do the merge.

@MarcCote


Contributor

MarcCote commented Oct 8, 2014

Okay @Garyfallidis, I made some modifications to the flexible version. The flexible version (the one included in this PR) still supports mixed dtypes and writeable flag statuses. @MrBago's version supports mixed writeable flag statuses. The original version (the Cythonized one, that is, not the Python ones in dipy.tracking.metrics) supports only writeable arrays with homogeneous dtypes.

The following benchmark was produced on 100k streamlines, each containing 100 points (float32) and all writeable. When (read-only) is specified, it means the streamlines were set read-only before processing them. The slowdown (x factor) is reported against the original version.

Timing length() in Cython
Original version time: 3.48sec
MrBago version time: 3.85sec (0.904x)
Flex version time: 3.83sec (0.909x)
MrBago (read-only) version time: 7.68sec (0.453x)
Flex (read-only) version time: 8.04sec (0.433x)
Timing set_number_of_points() in Cython
Old version time: 12.5sec
MrBago version time: 13.2sec (0.953x)
Flex version time: 13.9sec (0.903x)
MrBago (read-only) version time: 15.7sec (0.798x)
Flex (read-only) version time: 15.8sec (0.793x)
@Garyfallidis


Member

Garyfallidis commented Oct 9, 2014

@MrBago are you good with @MarcCote corrections? Should we go ahead and merge or you still see some issues?

@MrBago


Contributor

MrBago commented Oct 9, 2014

I think this is OK to merge. I haven't reviewed the tests as closely as I probably should have, mainly because there seem to be a lot of changes that aren't directly related to this PR.

The one thing I'd like to add at some point is support for streamlines from a generator (which does not have len), but I don't think it should block this PR. If you guys think that's a good idea make an issue and assign it to me. I'll get around to it eventually (it shouldn't be that hard).
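One possible shape for that generator support is sketched below (lengths_from_generator is a hypothetical name, not dipy API): since a generator has no len(), accumulate per-streamline results in a list instead of preallocating the output array.

```python
import numpy as np

def lengths_from_generator(streamlines):
    # No len() available on a generator, so append each streamline's
    # length (sum of segment norms) and convert to an array at the end.
    out = []
    for s in streamlines:
        pts = np.asarray(s, dtype=np.float64)
        segs = np.diff(pts, axis=0)
        out.append(np.sqrt((segs ** 2).sum(axis=1)).sum())
    return np.asarray(out)

gen = (np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0]], dtype=np.float32)
       for _ in range(3))
print(lengths_from_generator(gen))  # [2. 2. 2.]
```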

@Garyfallidis


Member

Garyfallidis commented Oct 9, 2014

Cool, added the issue as a reminder.

Thx Marc!

Garyfallidis added a commit that referenced this pull request Oct 9, 2014

Merge pull request #431 from MarcCote/fix_streamlinespeed
Change the writeable flag back to its original state when finished.

@Garyfallidis Garyfallidis merged commit 3ed4501 into nipy:master Oct 9, 2014

1 check passed

continuous-integration/travis-ci — The Travis CI build passed
@MarcCote MarcCote deleted the MarcCote:fix_streamlinespeed branch Oct 9, 2014
