Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Statefull tractogram, robust spatial handling and IO #1812

Merged
merged 21 commits into from Jul 26, 2019

Conversation

frheault
Copy link
Contributor

The aim of this PR is to provide a easy-to-use and robust way to handle spatial transformation for tractogram.

  • The new class use other (existing) IO code, but once created an object remains in a consistent state which make manipulation (transformation, saving, etc.) easier.
  • Operations that would lead to a non-respect of conventions are prohibited (invalid state)
  • For any atypical operations or operations outside of the conventions, using existing IO (such as Nibabel) is recommended

A future PR will be done later to modify tutorials, remove depreciated code, adapt functions to reduce risky behaviour by promoting the use of this new class.

@pep8speaks
Copy link

pep8speaks commented Apr 26, 2019

Hello @frheault, Thank you for updating !

Line 208:81: E501 line too long (82 > 80 characters)
Line 218:81: E501 line too long (82 > 80 characters)
Line 225:81: E501 line too long (82 > 80 characters)
Line 235:81: E501 line too long (82 > 80 characters)

Line 276:81: E501 line too long (82 > 80 characters)
Line 327:81: E501 line too long (82 > 80 characters)

Comment last updated at 2019-07-25 13:31:24 UTC

@skoudoro
Copy link
Member

wow, Thank you for this @frheault! That's a huge PR! I feel like a part of this code should be directly on Nibabel but I am maybe wrong, I need to look deeper into it.

@codecov-io
Copy link

codecov-io commented Apr 26, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@51b6422). Click here to learn what that means.
The diff coverage is 67.99%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1812   +/-   ##
=========================================
  Coverage          ?   84.57%           
=========================================
  Files             ?      119           
  Lines             ?    14749           
  Branches          ?     2335           
=========================================
  Hits              ?    12474           
  Misses            ?     1728           
  Partials          ?      547
Impacted Files Coverage Δ
dipy/data/__init__.py 81.45% <ø> (ø)
dipy/workflows/segment.py 73.52% <100%> (ø)
dipy/data/fetcher.py 34.94% <100%> (ø)
dipy/stats/analysis.py 75% <100%> (ø)
dipy/workflows/align.py 87.87% <100%> (ø)
dipy/workflows/viz.py 81.08% <50%> (ø)
dipy/viz/app.py 52.34% <50%> (ø)
dipy/io/streamline.py 54.36% <53.74%> (ø)
dipy/io/utils.py 67.2% <67.5%> (ø)
dipy/io/stateful_tractogram.py 68.11% <68.11%> (ø)
... and 1 more

@frheault
Copy link
Contributor Author

@skoudoro I realize that most python2 build fail for Travis, this is due to a use of Enum in my code.
(Issue explained here: https://stackoverflow.com/questions/26828206/importerror-no-module-named-enum)

What is the 'vision' for Dipy 1.0, will you still support python2 ? If yes, a new requirement is needed (enum34).

@skoudoro
Copy link
Member

What is the 'vision' for Dipy 1.0, will you still support python2 ? If yes, a new requirement is needed (enum34).

No, Dipy 1.0 will not support python2. This problem should be fixed when we merge PR #1775. After that, you can rebase. So you can keep your code as it is

@jchoude
Copy link
Contributor

jchoude commented Apr 26, 2019

Hi @skoudoro ,

if I can bring my point of view WRT the placement of some parts in Nibabel versus here: this is exactly one of the points we wondered about when discussing this PR with @frheault . Our arguments for moving some things to Nibabel were that, since this relates to streamlines, it would make sense to integrate them in Nibabel, to keep them in sync. For example, utilities functions like those used to get the reference info or validate that the data_per_point elements fit with the number of points per streamline could be moved over there.

However, most of the functions and the new class are wrappers over the pure streamlines API, meant to ease use for Dipy users and developers and to reduce possible mistakes. They might be useful directly in the streamlines API, and it would centralize things, but I don't know if they would add too much overhead for users that just want to directly use the Streamlines API. For example, in @frheault code, most of the time, a reference anatomy / tractogram is needed, to correctly manage the reference frame. In nibabel.streamlines, the user can directly use the affine and manage it as he sees fit. If we integrate all of this to Nibabel, it needs to be correctly engineered to not remove some freedom from end users.

Part of the reason why this PR is still marked as WIP is because we felt we might need to discuss such questions with the team, since this is a PR that can break things (but since 1.0 might be backwards incompatible, that might not be a huge issue). I think that it would be very helpful to get the opinion of @MarcCote (since he's Mr. Streamlines API), @arokem (since he was interested in this) and @effigies (since he maintains Nibabel). We are absolutely open to discussion. This is an important issue that should be fixed once and for all.

Cheers!

Copy link
Contributor

@MarcCote MarcCote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having wrappers over the Streamlines API that forces the users to provide a reference point makes sense. From my quick glance over the PR, it seems a lot could/should be merged in Nibabel. For instance, it would be nice to have VTK support directly in the Streamlines API.

Using overwrite_data without data will erase data (with {})

Subsampling streamlines from ArraySequence should ALWAYS be done with a
deepcopy and pass as a list. ArraySequence views are broken !
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "ArraySequence views are broken!" Is it something that should be fixed in NiBabel?

Copy link
Contributor Author

@frheault frheault May 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the array sequence view. If you slices an ArraySequence (something like sub_streamlines = streamlines[0:10]) sub_streamlines.data contains ALL data from streamlines. I called it broken, maybe that's not the right word, it is more like "surprising behaviour" for a lot of people.

I think that this class is needed for Dipy 1.0 in July, from what I saw at the Workshop handling of spatial transformation, file format header and the knowing the appropriate space for computation and various function call is a huge challenge for most people. Do you think this new class would be acceptable in its current state for Nibabel ?

I am not trying to push anything too fast, but when (if) this is merged there is a lot of work (that I am willing to do) to fix the current handling of spatial transformation and IO.
3D_screenshot
All of these tractograms were computed from the example (and workflows) and they should be aligned (out of 13, 3 display at the right place in MI-Brain, only 2 out of these 3 have a adequate header)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main thing that is "limiting" right now is the support for lazy-loading and lazy-saving. From a "design" POV, it will be hard to maintain a state (like with the current PR) while also supporting lazy operations.

For example, say that we are lazy loading some streamlines. Then, for some reason, in the middle of iterating on the streamlines, some operation is done, like a call to to_vox. Then, what should be the behavior? Invalidate the iterator, I guess. This would make sense since we probably don't expect people to call a state changing function during iteration.

I think it would be doable, but would take longer. In the meantime, as @frheault just showed, most of the tutorials / functions are not consistent in Dipy. I would suggest the following plan, if you agree:

1- Integrate everything that is relevant to Nibabel based on the current design (after reviews and refactors if needed)
2- Fix tutorials, functions and examples in Dipy to be consistent and clear on what they expect.
3- [optional] Refactor StatefullTractogram to support lazy operations, if we can reliably handle it.

What do you think, @MarcCote ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a sobering picture!

Regarding the brokeness in nibabel, this is what I was mentioning here? nipy/nibabel#729

I still owe nibabel a tutorial :-(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frheault I'm all for making it easier for the users. Those are great wrappers 👍 .

@jchoude I agree with that plan. I think it also goes well with what @arokem proposed below (i.e. test everything out in DIPY 1.0 then move what needs to be moved into Nibabel).

@arokem yes you are right about the issue you mentioned. I wasn't expecting people to call .data directly (I don't remember why we didn't make it private in the first place, though) but I guess it is useful after all. That should be addressed in Nibabel, though.

Copy link
Contributor

@arokem arokem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, @frheault! I had some comments, but they're mostly about documentation at this point.

Regarding integration with nibabel: one option would be to include this only in DIPY for the 1.0 release and then move it to nibabel and eventually remove it from DIPY in a subsequent release, once it's stabilized in DIPY. This will give us an opportunity to test it in practice, before integrating it upstream.

match the reference and are effectively in the right space.

Any change to the number of streamlines, data_per_points or
data_per_streamlines requires particular verification.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you maybe say here how you would verify this? What code would you run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I meant to say that changes to these attribute (using the function of the class) will fail if invalid (failing to respect the length mentioned in the description).
What is the optimal way to 'warn' the user that these attributes need to remain consistent with each other ?
(I ask, because I am a bit too much close to the code to know what is clear)

RASMM = 'rasmm'


class StateFullTractogram(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be StatefulTractogram (because 'stateful' is a single word and it ends with only one 'l').



class StateFullTractogram(object):
""" Object designed to be identical no matter the file format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the top level description, I would say something like: "Class for stateful representation of collections of streamlines (tractograms)". I would add this description below that.


space_attribute = get_reference_info(reference)
if space_attribute is None:
raise TypeError('Reference MUST be one of the valid types')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we print out the valid types here?

self._inv_affine = np.linalg.inv(self._affine)

if space not in Space:
raise ValueError('Space MUST be one of the 3 choices (Enum)')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we print out the valid options here?

Parameters
----------
sft : StateFullTractogram
The tractogram to save (must have been generated properly)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you know if it was properly generated?

def load_tractogram(filename, reference, to_space=Space.RASMM,
shifted_origin=False, bbox_valid_check=True,
trk_header_check=True):
""" Applies median filter multiple times on input data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does what now? I think this first line of the docstring must be a copy-paste error (?)

return sft


def robust_split_name(filename):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is more broadly useful and should be in the dipy.io base namespace!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd delegate this to Nibabel. See my other comment about dealing with filename extensions.

dipy/io/vtk.py Outdated
lines += [lines_vertices[line_range]]
current_idx = next_idx
if to_lps:
to_ras = np.eye(4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable name is confusing. You are transforming this into LPS, right?

@@ -242,6 +246,18 @@ def test_io_dpy():
decimal=4)


def test_io_vtk():
if have_vtk:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to decorate the function with a @npt.dec.skipif(not have_vtk)

@effigies
Copy link
Contributor

effigies commented May 2, 2019

Hi all, I haven't read through the code here, but some general thoughts: With regard to nibabel, I'm pretty much in accord with everything @jchoude said. The current streamlines API should continue to be usable, but I don't see any reason that an API for coordinating a reference file would be inappropriate in nibabel. Possibly there are some things that are really dipy-specific, but if you have data structures and algorithms that are usable without the overall dipy framework, nibabel seems a good place for them.

I think probably the biggest question is how you picture the back-and-forth between your development process and nibabel. You'll lose some flexibility by pushing these APIs to an upstream, as you'll be more dependent on our release processes. If you're willing to work on our release schedule (nipy/nibabel#734), then that will work well, but if you need faster iteration, that could grow frustrating for you. So I'd give some consideration to @arokem's suggestion of going a few versions refining the API here before moving upstream. If/when you're confident that it's an API you can use with new features coming only quarterly and bug fixes ~monthly, then that's a good time to move to nibabel.

Copy link
Contributor

@MarcCote MarcCote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also added a few comments. I'd be happy to schedule a video call to discuss it if you want.

logging.warning('Wrong initial space for this function')
return

def _voxmm_to_rasmm(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure you are aware of the get_affine_trackvis_to_rasmm function in
https://github.com/nipy/nibabel/blob/master/nibabel/streamlines/trk.py#L65

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it was clearer in term of code to have every private function on the same pattern.

raise ValueError('Invalid data per streamlines, does not match')

self._data_per_point = dict(data_per_point)
self._data_per_streamline = dict(data_per_streamline)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why aren't you using a Tractogram object internally? It would manage the streamlines, data_per_point and data_per_streamline for you and handle the space transformations given that you provide the correct affine in the constructor (i.e. which brings the streamlines in "RAS+ MM" space).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering the same thing, Why your class does not inherit from Tractogram?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let @frheault answer this, but a quick answer is that using a Tractogram object internally would allow people to access methods such as Tractogram.affine_to_rasmm() and Tractogram.to_world(), which would leave the StatefulTractogram in an inconsistent state.

I guess that's a limitation of Python, since we can't really make anything 100% private.

Inheriting might be a solution, however.

Copy link
Contributor Author

@frheault frheault May 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could use tractogram to help with the consistency of streamlines and its data_per_streamlines and data_per_points. However for the spatial handling I find the attribute and function related to tractogram to be potentially confusing if access (what JC said)

After it get in Dipy 1.0 (and some real-life testing), if this get moved to Nibabel. It would be interesting to inherit from Tractogram and potentially LazyTractogram. That redesign/refactor is probably a good idea, but maybe later ?

But for now I will look into Tractogram for data/streamlines handling and keep the internal consistency and spatial transformation handling for the stateful_tractogram
But if I see that the state is easily made inconsistent by using function such as affine_to_rasmm(), Tractogram.to_world() or even Tractogram.apply_affine(), I am not sure. With the autocomplete on, I am ''scared'' that people would see them and use them and make the stateful_tractogram as risky as the alternative.

is_trk = False
if isinstance(reference, six.string_types):
_, ext = robust_split_name(reference)
if ext == '.nii' or ext == '.nii.gz':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I would avoid checking for specific extensions and delegate that to Nibabel. I'd try to do a nib.load (or if it fails a nib.streamlines.load) and check the type of object that is returned (like you do anyway). I think that makes it more generic since you could provide a BytesIO object and makes it more robust to people messing with extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then you would continue with the check for instance of Nifti1Image, TrkFile, Nifti1Header, etc. ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

return sft


def robust_split_name(filename):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd delegate this to Nibabel. See my other comment about dealing with filename extensions.

@jchoude
Copy link
Contributor

jchoude commented May 3, 2019

I agree with @arokem and @effigies proposal of integrating to Dipy, and then migrating to Nibabel when it has been test ran thoroughly.

Additionally, about the stability vs release cycle of Nibabel: the functions in here are meant to cover all cases of space changes. Nothing is purely proper to Dipy: any project interacting with streamlines might want to use them in any of the spaces defined here. Normally, once everything is tested, this should remain extremely stable going forward.

Also, just a note that @frheault 's work is based on work we have been doing in the lab for the last 6,7 years. François and I have broken our teeth many times on funky cases. I'm pretty confident in what is here, and it is based on code we have been using internally since the Streamlines API has been released. Of course, this doesn't replace getting it in the wild and seeing if issues arise. I just post this here to ease some concerns people might have.

@arokem
Copy link
Contributor

arokem commented May 6, 2019

After it get in Dipy 1.0 (and some real-life testing), if this get moved to Nibabel. It would be interesting to inherit from Tractogram and potentially LazyTractogram. That redesign/refactor is probably a good idea, but maybe later ?

+1 from me for that plan

@skoudoro skoudoro added this to the 1.0 milestone May 7, 2019
@frheault
Copy link
Contributor Author

frheault commented May 8, 2019

I think I address the first "wave" of comments, most were pretty straightforward as they were error on my part, typo or about docstring.

The most complex change is the internal use of a nibabel Tractogram object. I decided to use only a private attribute and reduce as much as possible the visibility of the variable. Interactions with it should always be through the setter/getter of the stateful tractogram (this is to minimize the chance of people seeing or using Tractogram function such as to_word() or apply_affine() that could lead to weird/unexpected behaviour)

As a result, the handle of data_per_point and data_per_streamlines is greatly simplified, so a couple hundred lines of code (and test) were removed. However, it is important that "normal" user do not interact directly with the internal Tractogram.

Copy link
Contributor

@MarcCote MarcCote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another quick pass over the StatefulTractogram. It is still not clear to me why you couldn't just rely on self._tractogram.affine_to_rasmm given you set it properly and make the necessary checks. What unit test (or example) should I be looking at to understand why this is needed?

dipy/io/stateful_tractogram.py Outdated Show resolved Hide resolved
dipy/io/stateful_tractogram.py Outdated Show resolved Hide resolved
dipy/io/stateful_tractogram.py Outdated Show resolved Hide resolved
Copy link
Contributor

@MarcCote MarcCote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid it is still unclear to me what's really needed in addition to the Tractogram object (other than the sanity checks). Is it that you need the original reference point (i.e. space and alignment)?

Can you point me to an example where using Tractogram directly fails? Thanks.

dipy/io/stateful_tractogram.py Outdated Show resolved Hide resolved
old_shift = deepcopy(sft.get_current_shift())

sft.to_rasmm()
sft.to_center()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, here you are bringing the streamlines into rasmm space where the origin (0, 0, 0) is defined as the center of a voxel before saving? That's exactly the same space streamlines from Tractogram object are defined in (provided you don't mess directly with the ArraySequence).

Copy link
Contributor Author

@frheault frheault May 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, to revert to a state that nibabel expect and thus creating a valid file

A situation where Tractogram is super confusing for most user is when you call the function .to_world() and the affine_to_ras variable becomes identity and if you write this in the header of a trk, this isn't valid. (also you cannot go back to vox or voxmm space)

Storing spatial attribute and state is the only way to easily deal with this for most user.

In a future work, all of this could be in Tractogram (if a reference is forced) or a new class that inherit it. (after dipy testing)


return is_valid

def remove_invalid_streamlines(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frheault what is the exact purpose of this method?

Copy link
Contributor Author

@frheault frheault May 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I forgot the docstring on this function, I will add it very soon. Overall here is a quick summary:

Sometime directly after generation, or after a transformation, or after cropping a volume, or even numerical error, streamlines end up having invalid coordinates (above the volume dimension or below 0 in voxel space).
This will cause crash in function such as target, density_map, connectivity_map.

Removing such streamlines yourself can be surprisingly difficult, requires spatial information about the reference, good knowledge of various space/convention and since it is technically an invalid state for any tractogram having this function in the class will help user to save/load valid object.

PS : Loading and Saving will raise an error (which you can disable with an optional parameters) if trying to load/save an invalid stateful_tractogram.

tmp_data_per_point = {}
tmp_data_per_streamline = {}

for key in self._tractogram.data_per_point:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you trying to do here? What is the purpose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering this function removes streamlines that are invalid, I remove their data (per_point, per_streamline) so the new object is still consistent and valid for the nibabel.tractogram object

@arokem
Copy link
Contributor

arokem commented May 22, 2019

Hey @frheault : I think that something went wrong with your recent rebase. I am now seeing a lot of unrelated commits in this PR.

@arokem
Copy link
Contributor

arokem commented May 23, 2019

I'm afraid that you might need to do an interactive rebase. I'd be happy to walk you through this on a video chat if this is your first time doing this.

@frheault
Copy link
Contributor Author

@arokem Sorry I deleted a message almost immediately after posting it...
I think I fixed it using cherry-picking and then force the push. I am up-to-date with master and I see only my 12 commits (and only 6 files changed)

@arokem
Copy link
Contributor

arokem commented May 24, 2019

Yes. Looks like you are back on track now. Thanks!

@frheault
Copy link
Contributor Author

frheault commented Jun 6, 2019

Failling tests are in the test_linear_mixed_models_flow, only for two build.
Is that considered normal ? (I'm up-to-date with the master)

@arokem
Copy link
Contributor

arokem commented Jul 16, 2019

Hey @frheault and others: where does this currently stand? It looks like discussion on several points was adressed. I am wondering what still needs to happen here for us to be able to merge this and start using it.

Also, @frheault : would you mind rebasing here?

@frheault
Copy link
Contributor Author

I have been asking the same questions for 1 month. I think everyone is on vacation. Since there was little comment on the code itself and no external testing I was wondering if it was ready to merge or not. I am currently on vacation, next week I could work on this and finalize it if there is new comments. (and I will rebase as soon as I turn on my laptop, so Monday morning)

@skoudoro
Copy link
Member

Indeed, with the conferences and vacation, This PR slow down, but it will be HIGH priority as soon as you come back. so next week

@skoudoro
Copy link
Member

Hi @frheault,

Thank you for the rebase! Travis is failing because of this error on many tests:
cannot import name 'save_trk'

I advise you to keep these functions save_trk, load_trk, load_tck, etc... They are just small interface to save/load_tractograms and they will need a reference too. Even if they all run save/load_tractogram, they are convenient for the users.

@frheault
Copy link
Contributor Author

frheault commented Jul 24, 2019

Hi @skoudoro, so I fixed everything that was failing involving loading and saving tractogram.

Also, everywhere in the test I am using load_tractogram and save_tractogram (new version) as best as possible considering the context. Right now I updated the tracks300.trk so the header is "valid". BUT it is still an invalid file as the value are actually in no specific space (So I have to turn off the check for validity in voxel_space, but technically the file can be loaded normally anyway)

Then in the workflow such as align (SLR) and segment (RB) I am not using the new stateful_tractogram and instead directly use nibabel as before they were basically doing exactly that and I didn't want to add a new parameters to the flow (as discussed in the video call). The dipy_horizon workflow is the same, but there is a if to support dpy like before.

Finally I added the load_trk, save_trk, etc. version, BUT I added a check so they can load and save only the file format they expect (tiny wrapper instead of an assignation). This is to avoid what was happening very often when people use load_trk in a script and give a filename ending in tck instead and everything load/save normally but crash later. Now it crashs and tell you that load_trk can only load trk, for more generability use load_tractogram instead.
(PS I actually don't know if there is a better way to do that, right it is explicit and quite long/ugly to the eye. But I don't know if there is an equivalent to template in python)

@skoudoro
Copy link
Member

Awesome! Really good job @frheault! Just a quick comment, Can you replace your assert by assert_equal. This is more explicit and respect the convention. I will review completely this PR tomorrow morning. Thank you again

@Garyfallidis Garyfallidis changed the title WIP: ENH: Statefull tractogram, robust spatial handling and IO ENH: Statefull tractogram, robust spatial handling and IO Jul 24, 2019
@Garyfallidis
Copy link
Contributor

@frheault I have removed the WIP from the title of your PR as this is getting really close to be merged.

Copy link
Member

@skoudoro skoudoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below a couple of comment. In general, I would prefer to use @property which I find "more pythonic" than all this get/set method but this is a personal preference.
Otherwise, it looks good and really close to being merged.

If someone else can have a look or an opinion, it will be good!

Thank you

dipy/io/utils.py Outdated
def create_tractogram_header(tractogram_type, affine, dimensions, voxel_sizes,
voxel_order):
""" Write a standard trk/tck header from spatial attribute """
if isinstance(tractogram_type, six.string_types):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace six.string_types by str

voxel_order):
""" Write a standard trk/tck header from spatial attribute """
if isinstance(tractogram_type, six.string_types):
tractogram_type = detect_format(tractogram_type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detect_format functions not found, can you import it


identical_header = True
if not np.allclose(affine_1, affine_2):
logging.error('Affine not equal')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logging module not found (you have to import it). Why are you using logging module and do not raise an exception. I will do `raise IOError("Affine not equal")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is made to test if the header are identical, it returns a boolean and the testing is usually done outside.
It is useful to give some liberty to users when making verification themselves.
Also the logging is useful to see which (and all) difference. Then the user can do what they want with the true/false return.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry, Got it! so that's good

identical_header = False

if not np.array_equal(dimensions_1, dimensions_2):
logging.error('Dimensions not equal')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

identical_header = False

if not np.allclose(voxel_sizes_1, voxel_sizes_2):
logging.error('Voxel_size not equal')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

elif isinstance(reference, dict) and 'magic_number' in reference:
header = reference
is_trk = True

Copy link
Member

@skoudoro skoudoro Jul 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to add an else just in case someone try to use a ndarray as a reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that's not enough, an affine does not contains de the shape/dimensions information which is important for function such as density_map computation.

Copy link
Member

@skoudoro skoudoro Jul 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, Got it! I went through it again and it makes total sense

fornix = [s[0] for s in streams]
data_path = get_fnames('fornix')
fornix = load_tractogram(data_path, 'same',
bbox_valid_check=False).get_streamlines()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small indentation error

@@ -50,6 +50,7 @@ def test_median_otsu_flow():
def test_recobundles_flow():
with TemporaryDirectory() as out_dir:
data_path = get_fnames('fornix')
print(data_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this print

@@ -8,7 +8,7 @@ renamed or are deprecated (not recommended) during different release circles.
DIPY 1.0 changes
----------------
Some of the changes introduced in the 1.0 release will break backwards
compatibility with previous versions.
compatibility with previous versions.s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s ? Can you remove this typo


def get_streamlines_copy(self):
""" Safe getter for streamlines (for slicing) """
return deepcopy(list(self._tractogram.streamlines))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why you convert this to a list ? To keep an ArraySequence, I will do:

self._tractogram.streamlines.copy()

@skoudoro
Copy link
Member

Thank you @frheault for this Huge work! merging

@MarcCote
Copy link
Contributor

MarcCote commented Aug 7, 2019

Great job @frheault

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants