Add save/load tck #1651

Merged
Merged 10 commits from skoudoro:add-tck-io into nipy:master on Oct 26, 2018

Conversation

5 participants
@skoudoro
Member

skoudoro commented Oct 16, 2018

This PR adds:

  • load_tck function
  • save_tck function
  • lazy_load option
  • lazy_save option via LazyTractogram
  • load_dpy function
  • save_dpy function

In the end, this is an implementation of @gabknight's comment.

@@ -20,6 +24,11 @@ def save_trk(fname, streamlines, affine, vox_size=None, shape=None, header=None)
The shape of the reference image (default: None)
header : dict, optional
Metadata associated to the tractogram file(*.trk). (default: None)
lazy_save : {False, True}, optional
If True, save streamlines in a lazy manner i.e. they will not be kept
in memory. Otherwise, load all streamlines in memory.

@jchoude

jchoude Oct 16, 2018

Contributor

load -> keep all streamlines in memory until saving.

@skoudoro

Member

skoudoro commented Oct 16, 2018

Good catch @jchoude, thank you for your feedback! I just updated the code.

@arokem
Member

arokem left a comment

Looks good. I just had a couple of questions.

return trk_file.streamlines, trk_file.header


load_tck = load_tractogram

@arokem

arokem Oct 16, 2018

Member

Very clever. Any plans to deprecate these? Or just leave them here?

@skoudoro

skoudoro Oct 16, 2018

Member

I think we should just leave them here for the moment, but any other opinion is welcome.

@arokem

arokem Oct 20, 2018

Member

Sounds fine to me
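If deprecation were ever chosen over the bare alias, a thin wrapper is one common pattern. A minimal sketch, assuming a simplified loader (the body and signature of `load_tractogram` here are hypothetical stand-ins, not dipy's real API):

```python
import warnings

def load_tractogram(fname):
    # Hypothetical stand-in for the real loader, just for illustration.
    return ("streamlines", "header", fname)

def load_tck(*args, **kwargs):
    # Instead of a bare alias (load_tck = load_tractogram), a wrapper can
    # emit a warning once deprecation is decided on, while still delegating
    # to the canonical function.
    warnings.warn("load_tck is deprecated; use load_tractogram instead",
                  DeprecationWarning, stacklevel=2)
    return load_tractogram(*args, **kwargs)
```

The bare alias costs nothing and keeps both names equally valid; the wrapper only becomes worthwhile if the alias is scheduled for removal.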

@@ -20,6 +24,11 @@ def save_trk(fname, streamlines, affine, vox_size=None, shape=None, header=None)
The shape of the reference image (default: None)
header : dict, optional
Metadata associated to the tractogram file(*.trk). (default: None)
lazy_save : {False, True}, optional

@arokem

arokem Oct 16, 2018

Member

What does this mean in this context? The latter part of the explanation, about loading streamlines into memory, doesn't quite make sense in the context of saving streamlines.

@skoudoro

skoudoro Oct 16, 2018

Member

When you have a generator with a huge number of streamlines, `nib.streamlines.Tractogram(streamlines)` puts all the streamlines in memory before saving them, which leads to a crash if you do not have enough memory.
With `lazy_save`, you use `nib.streamlines.LazyTractogram(streamlines)` instead; this object keeps your generator as it is and saves your streamlines without any problem.
I hope it is clearer @arokem
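The difference being described can be sketched without nibabel at all. A minimal, hypothetical illustration of the eager path (everything materialized first, as `Tractogram` does) versus the lazy path (one item at a time, as `LazyTractogram` does); the function names are made up for the sketch:

```python
def make_streamlines(n):
    """Yield n fake 'streamlines' (small lists of points) one at a time."""
    for i in range(n):
        yield [(i, i + 1), (i + 2, i + 3)]

def save_eager(path, streamlines):
    # Analogue of Tractogram: the whole sequence is pulled into memory
    # before anything is written, so peak memory is O(n).
    all_lines = list(streamlines)
    with open(path, "w") as f:
        for line in all_lines:
            f.write(repr(line) + "\n")
    return len(all_lines)

def save_lazy(path, streamlines):
    # Analogue of LazyTractogram: consume the generator while writing,
    # keeping only one streamline in memory at a time (peak memory O(1)).
    count = 0
    with open(path, "w") as f:
        for line in streamlines:
            f.write(repr(line) + "\n")
            count += 1
    return count
```

Both paths produce identical files; only the peak memory profile differs, which is exactly why the eager path can crash on a huge generator.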

@Garyfallidis

Garyfallidis Oct 16, 2018

Member

Yes, we should rename this variable. It means that it saves a few streamlines at a time, which uses less memory.

@arokem

arokem Oct 20, 2018

Member
Suggested change
lazy_save : {False, True}, optional
save_with_generator : {False, True}, optional
If True (default), use `nib.streamlines.LazyTractogram` so that generator input is
incrementally generated, saving memory.

@skoudoro

skoudoro Oct 20, 2018

Member

I think `lazy_save` or `save_with_generator` are not explicit enough. What do you think of `reduce_memory_usage`?

@@ -20,7 +27,23 @@ def save_trk(fname, streamlines, affine, vox_size=None, shape=None, header=None)
The shape of the reference image (default: None)
header : dict, optional
Metadata associated to the tractogram file(*.trk). (default: None)
reduce_memory_usage : {False, True}, optional

@arokem

arokem Oct 20, 2018

Member

Even if we go with this name (which I think is fine), I'd still want to make sure that users understand that this will only save memory if they provide an input that is a generator (that's right?), so I would add that to the description.

@skoudoro

skoudoro Oct 20, 2018

Member

will only save memory if they provide an input that is a generator (that's right?)

No, whatever the input streamlines object is, we do the job under the hood here, and the user saves memory.

@skoudoro

skoudoro Oct 20, 2018

Member

By saving memory, I mean it will not increase like hell 😄

@arokem

arokem Oct 20, 2018

Member

But if streamlines is a list as input, that wouldn't matter. It would just generate from that list, without saving any memory. Or am I missing something?

@skoudoro

skoudoro Oct 20, 2018

Member

If you have a list as input, `Tractogram` creates a `Streamlines` object, which makes a copy of your list and explodes the memory for big data. The `reduce_memory_usage` variable allows converting this list into a generator, so there is no copy and no extra memory.

@skoudoro

skoudoro Oct 20, 2018

Member

Each time you create a `Streamlines` object from a list, a copy is made because they are not organized the same way in memory. This is not the case for a generator or another `Streamlines` object.
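The conversion being described can be sketched like this. The helper name is hypothetical (the real dipy code wraps the input internally for `LazyTractogram` rather than exposing such a helper):

```python
def as_lazy_streamlines(streamlines):
    # Hypothetical helper: hand downstream code an iterator so a list input
    # is consumed element by element instead of being copied wholesale.
    # Generators pass through unchanged; lists become cheap iterators
    # (iter() makes no copy of the elements).
    if hasattr(streamlines, "__next__"):
        return streamlines
    return iter(streamlines)
```

Downstream code that only iterates once never sees the full list at a single point in time, which is what avoids the duplicate copy.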

@arokem

arokem Oct 20, 2018

Member

Gotcha - that makes sense. Is there any performance trade-off? Why would anyone ever want to not use this? I mean, if there's no reason to set this to False, should we just eliminate this as an option?

@skoudoro

skoudoro Oct 20, 2018

Member

Good question! No idea about the performance trade-off, but in general, generators are slower.
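Beyond raw speed, one concrete trade-off worth noting: generators are single-pass, so any code that needs the streamlines twice must re-create the generator or materialize it. A small illustration:

```python
def make_streamlines():
    # A toy generator standing in for lazily produced streamlines.
    yield from ([0, 1], [2, 3], [4, 5])

g = make_streamlines()
first_pass = list(g)    # consumes the generator entirely
second_pass = list(g)   # a second pass yields nothing: second_pass == []
```

This is one plausible reason to keep the eager path available as an option rather than eliminating it.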

@arokem

Member

arokem commented Oct 20, 2018

@skoudoro skoudoro force-pushed the skoudoro:add-tck-io branch 2 times, most recently from afe1907 to e53f722 Oct 22, 2018

@codecov-io


codecov-io commented Oct 23, 2018

Codecov Report

Merging #1651 into master will increase coverage by 0.02%.
The diff coverage is 98.01%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1651      +/-   ##
==========================================
+ Coverage   87.31%   87.34%   +0.02%     
==========================================
  Files         246      246              
  Lines       32613    32690      +77     
  Branches     3552     3561       +9     
==========================================
+ Hits        28477    28552      +75     
- Misses       3275     3276       +1     
- Partials      861      862       +1
Impacted Files Coverage Δ
dipy/io/tests/test_dpy.py 92.85% <100%> (-0.48%) ⬇️
dipy/io/tests/test_streamline.py 97.8% <100%> (+2.34%) ⬆️
dipy/io/streamline.py 94% <94.87%> (-0.45%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e28942...e53f722.

@arokem arokem merged commit e5199af into nipy:master Oct 26, 2018

4 checks passed

codecov/patch 98.01% of diff hit (target 87.31%)
codecov/project 87.34% (+0.02%) compared to f887b77
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

@skoudoro skoudoro deleted the skoudoro:add-tck-io branch Nov 9, 2018
