New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extract values from an image based on streamline coordinates. #811
Conversation
a956f71
to
bc7bb25
Compare
Fixed up for elderly scipy, rebased and added testing. This is ready for a review. |
|
||
|
||
def vals_from_img(img, streamlines, affine=None): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no initial description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer values_
than vals_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also having input an img rather than data and affine is something rarely used in dipy. Do you really need img as input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... after a bit more thinking it seems the function needs to be renamed and moved to dipy.tracking.streamline
. The name of the function I propose is dipy.tracking.streamline.scalar_values
. This function will be useful also for @martcous who is currently preparing some PRs related to tractometry. @martcous give us a review here please and tell us if you have a different preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - for now, I am moving it into tracking.streamline. I have also addressed the other comments -- I think that you are right about using the data array and an affine, rather than the nibabel image. Have refactored accordingly.
Any more comments here? @martcous -- do you have any input on any of this? |
Sorry for not answering earlier. I think it makes sense for this to be in dipy.tracking.streamline. However, I am not sure about the name of the scalar_values() function. I think this operation would be useful to extract vectors from 4D arrays as well. I am not too familiar with Dipy's nomenclature, but I would name it something along the lines of extract_along_streamlines(). |
Thanks for taking a look! That's a comment that @Garyfallidis made as well. Let me try to implement the n-dimensional version of this (or at least the 4D version), and I will also think about a name that makes sense. |
I've refactored this to work for 4D data as well. I've special-cased the 3D vector case, to use |
Looks good to me, awesome work, thanks a lot! |
else: | ||
vals.append(list(vfu.interpolate_scalar_3d(data, sl)[0])) | ||
|
||
elif isinstance(streamlines, np.ndarray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I don't give it an array, a list or a streamline generator, won't it crash (or do weird stuff) on trying to return vals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. This would've raised a cryptic error ("val undefined" or somesuch), so I've added proper error handling in the else
block and a test for that.
At the same time, this seems to only works on points and not segments, might as well add a scary warning in a note to not run this on compressed datasets or you will most probably get nonsense/very few values. |
OK - I have added that in the |
vals = [] | ||
for sl in streamlines: | ||
if threedvec: | ||
vals.append(list(vfu.interpolate_vector_3d(data, sl)[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interpolate function only works on float64, so it gives errors like
File "/home/samuel/python/dipy/tracking/streamline.py", line 440, in values_from_volume
affine=affine))
File "/home/samuel/python/dipy/tracking/streamline.py", line 375, in _extract_vals
vals.append(list(vfu.interpolate_scalar_3d(data, sl)[0]))
File "dipy/align/vector_fields.pyx", line 434, in dipy.align.vector_fields.interpolate_scalar_3d (dipy/align/vector_fields.c:21841)
ValueError: Buffer dtype mismatch, expected 'double' but got 'float'
just adding a quick n dirty sl=sl.astype(np.float64) makes it work though, so at this point it might be worth to do the typecast in the cython files directly instead of each upper functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Fri, Jan 15, 2016 at 7:15 AM, Samuel St-Jean notifications@github.com
wrote:
In dipy/tracking/streamline.py
#811 (comment):
- Return
- array or list (depending on the input) : values interpolate to each
coordinate along the length of each streamline
- """
- data = data.astype(np.float)
- if affine is not None:
streamlines = ut.move_streamlines(streamlines,
np.linalg.inv(affine))
- if (isinstance(streamlines, list) or
isinstance(streamlines, types.GeneratorType)):
vals = []
for sl in streamlines:
if threedvec:
vals.append(list(vfu.interpolate_vector_3d(data, sl)[0]))
The interpolate function only works on float64, so it gives errors like
File "/home/samuel/python/dipy/tracking/streamline.py", line 440, in
values_from_volume
affine=affine))
File "/home/samuel/python/dipy/tracking/streamline.py", line 375, in
_extract_vals
vals.append(list(vfu.interpolate_scalar_3d(data, sl)[0]))
File "dipy/align/vector_fields.pyx", line 434, in
dipy.align.vector_fields.interpolate_scalar_3d
(dipy/align/vector_fields.c:21841)
ValueError: Buffer dtype mismatch, expected 'double' but got 'float'just adding a quick n dirty sl=sl.astype(np.float64) makes it work though,
so at this point it might be worth to do the typecast in the cython files
directly instead of each upper functions.Sorry - is this happening somewhere in our CI? Or in some test that you've
written? I am not seeing this anywhere. Could you share some code that
triggers this error with this function?—
Reply to this email directly or view it on GitHub
https://github.com/nipy/dipy/pull/811/files#r49863546.
I was testing it on a real use case, but here is one from the fornix dataset. Might be a good idea someday to force output of tracking as float32 since they are just coordinate, and you will save ram for free also. import numpy as np
from dipy.data import get_data
fname = get_data('fornix')
from nibabel import trackvis as tv
streams, hdr = tv.read(fname)
streamlines = [i[0].astype(np.float32) for i in streams]
from dipy.tracking.streamline import set_number_of_points, values_from_volume
extracted = values_from_volume(np.ones((10,10,10)), streamlines) |
Great. Thanks for the example. Will see what I can do to fix. On Fri, Jan 15, 2016 at 7:50 AM, Samuel St-Jean notifications@github.com
|
For the time being, I've put in a test for this (that currently fails). On Fri, Jan 15, 2016 at 9:24 AM, Ariel Rokem arokem@gmail.com wrote:
|
Since cython functions hardcode types, and for testing they have a simple python wrapper, easiest way would be to typecast there and use those. Kind of what I do with my stuff sometimes, then I return the original dtype if needed |
@omarocegueda : what do you think about adding type-casting to a wrapper On Sat, Jan 16, 2016 at 1:31 AM, Samuel St-Jean notifications@github.com
|
Do you mean like: https://github.com/omarocegueda/dipy/blob/cast_float64/dipy/align/vector_fields.pyx#L464 ? That seems to be ok if we want to pass float32 buffers. |
Arent streamlines only point coordinates? I would be very surprised to see a difference between 1e-8 and 1e-16 mm. Even rounding to 5 digits would probably be fine for roi extraction, with all the interpolation going around it can't have that much impact. |
Yep. I think that's the right thing for this. I actually just went ahead On Sat, Jan 16, 2016 at 8:01 AM, Omar Ocegueda notifications@github.com
|
Any thoughts about the loss of precision (see On Sat, Jan 16, 2016 at 10:56 AM, Ariel Rokem arokem@gmail.com wrote:
|
OK - I have armed this against different data-types of coordinates. This did result in some loss of precision in the answers up to approximately 4 decimals in the cases used for testing. Anyone has any more thoughts here? |
|
||
if affine is not None: | ||
inv_affine = np.linalg.inv(affine) | ||
sl_cat = (np.dot(sl_cat, inv_affine[:3, :3]) + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also do that in only one matrix multiplication, that's why homogeneous coordinates are used.
https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - but then you have to augment the coordinates, adding a '1' to every
coordinate. This is a way to get around that.
On Tue, Jan 26, 2016 at 6:28 AM, Samuel St-Jean notifications@github.com
wrote:
In dipy/tracking/streamline.py
#811 (comment):
for sl in streamlines:
if threedvec:
vals.append(list(vfu.interpolate_vector_3d(data,
sl.astype(np.float))[0]))
else:
vals.append(list(vfu.interpolate_scalar_3d(data,
sl.astype(np.float))[0]))
- elif isinstance(streamlines, np.ndarray):
sl_shape = streamlines.shape
sl_cat = streamlines.reshape(sl_shape[0] *
sl_shape[1], 3).astype(np.float)
if affine is not None:
inv_affine = np.linalg.inv(affine)
sl_cat = (np.dot(sl_cat, inv_affine[:3, :3]) +
You can also do that in only one matrix multiplication, that's why
homogeneous coordinates are used.https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations
—
Reply to this email directly or view it on GitHub
https://github.com/nipy/dipy/pull/811/files#r50840662.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that be faster if you have to move a whole brain tractogram? Computer vision/imaging people are using these since you can express all transformations in a single operation instead, maybe they want to pitch in about if its faster to add another dimension or go this way.
I don't think it would be faster, and it's definitely less memory-demanding. FWIW, this is the same computation that is used by move_streamlines: https://github.com/nipy/dipy/blob/master/dipy/tracking/utils.py#L917 |
I tested it a bit and seems to be all good. Is there anything left to check? If not, I can go ahead with the merging it you want. |
Can't see my message from yesterday, it must have gone under a hidden diff comment. But anyway, its a few seconds faster (2.5 vs 3) on a small set of 200 000 streamlines, if you deal with a few millions it might be useful in the long run, just saying. |
Must have been a fluke: https://gist.github.com/arokem/db8396eccc9d138175f3 |
To be clear: I don't see any evidence that transforming streamlines by multiplying an augmented streamline (4D) with the full 4 by 4 matrix is faster than multiplying the 3D streamline with the top-left rotation matrix and adding the 3 coordinates of the translation transform. Without evidence, I don't think we should change anything about how we transform streamlines. Either here, or in |
I meant on a real example, like shifting a whole tractogram, linear algebra would have to be bad to do something on such a small example. aff=np.eye(4)
aff[3,:3] = 10
import nibabel_streamlines as nib
bund = nib.streamlines.load('tracking.trk')
slines = bund.streamlines
In [86]: %timeit for i in slines: np.dot(i,aff[:3,:3]) + aff[:3,3];
1 loops, best of 3: 6.91 s per loop
In [87]: %timeit for i in slines: a = np.ones((i.shape[0],4));a[:,:3]=i;np.dot(a,aff);
1 loops, best of 3: 6.38 s per loop Anyway, it just seems weird to not use something the way it's intended for, that's all. |
Sorry. Still not seeing any substantial difference, even with 200k On Thu, Jan 28, 2016 at 8:22 AM, Samuel St-Jean notifications@github.com
|
Wow, my computer might be bad then or something compared to yours. Or we are not using the same versions/stuff/lib/whatever, not really important at this point. Still, I guess it's more important for realtime stuff than what we are doing. |
You'll be doing more operations with the 4x4 multiplication than by doing the 3x3 multiplication and adding the 3, vector, as long as you can assume that the bottom row of the affine is [0, 0, 0, 1]. First case is 4 * (4 multiplications, 3 additions) per coordinate, second is 3 * (3 multiplications, 3 additions) per coordinate. I think. So, I'd expect second approach to be faster in general, but I guess will depend on your BLAS - it might be that intense cache optimization could overcome the operation count. |
So while we are still discussing this, would you mind adding a nearest neighbour interpolation option? The idea is to extract indices of crossed voxels instead of values (whichever might be easier), this way you can get back the original pathway of peaks followed by the tracking easily. |
Yeah. That wouldn't be too hard. Let me do that here, before this one gets On Fri, Jan 29, 2016 at 1:54 AM, Samuel St-Jean notifications@github.com
|
Good idea @samuelstjean. @arokem no problem. |
That's fine, I actually wanted to do something similar, and you actually
|
Great! Feel free to make a PR against this branch, if you have something On Fri, Jan 29, 2016 at 1:38 PM, Samuel St-Jean notifications@github.com
|
No actually I was about to do it and you had a pr already, so I am just
|
@samuelstjean You don't need to do it, I'll try to put it in in the coming week. |
Hey @jchoude -- thanks for pitching in on this. Would it be easier for you to do this as a follow-up PR, or as a PR against this one? |
@arokem Easier in a separate PR |
No problem. Just to be sure -- the idea would be to introduce a key-word argument in this function (something like |
I'd rather not for this case, his version is segment based instead of point
|
I meant as key-word arguments to the We can do the segment-based thing under the hood. On Sun, Jan 31, 2016 at 11:40 PM, Samuel St-Jean notifications@github.com
|
Well, if I don't see any performance degradation during the integration to Dipy, it could become the default mode. From my tests in standalone code, I was able to reach about the same speed as the density_map (and / or track_counts) functions, which are points-based. |
Any more comments here? It seems that @jchoude will make a separate PR for nearest-neighbors behavior. |
In fact, I'll make a separate PR for the code that works with variable step sizes. |
Great! Anything more to do here? |
Seems all good. I'll merge it then. |
Extract values from an image based on streamline coordinates.
I saw some typo still though |
Still need to test affine transformations