[ENH] Efforts towards keeping memory low #807

Merged
merged 21 commits into nipreps:master from fix/memory-issues-revisit on Nov 2, 2017

Conversation


@oesteban oesteban commented Oct 31, 2017

This PR reduces memory with two principal changes:

  • Improved and more granular memory annotations
  • Run confounds in native space:
    • Added a new workflow that resamples the BOLD in native space (will be useful for [WIP] ENH: Add --func-only option to skip anatomical processing #808)
    • Adapted the confounds workflows to operate in native space: masks are converted into ROIs in the original T1w space (meaning, the erosions that Chris balanced are still happening at that resolution). A minor change is that the WM+CSF mask is now calculated from the TPMs (instead of calculating separate ROIs and then merging).
    • Added the CSF ROI to the _confounds.tsv file.

For reviewing this PR, please have a detailed look into the new CompCor masks.
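The new WM+CSF mask construction (summing TPMs first, then thresholding once) could look roughly like this; a minimal numpy sketch with synthetic probability maps, where the threshold value and array shapes are illustrative, not the PR's actual parameters:

```python
import numpy as np

# Synthetic tissue probability maps (in the actual workflow these are
# loaded from NIfTI files with nibabel); values are probabilities in [0, 1].
shape = (4, 4, 4)
rng = np.random.default_rng(0)
wm_tpm = rng.random(shape)
csf_tpm = rng.random(shape)

# Sum the TPMs first and threshold once, instead of thresholding each
# map into a separate ROI and merging afterwards.
combined = wm_tpm + csf_tpm
wm_csf_mask = (combined > 0.95).astype(np.uint8)  # illustrative threshold

print(wm_csf_mask.dtype, int(wm_csf_mask.sum()))
```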

@oesteban oesteban changed the title [ENH] Efforts towards keeping memory low [WIP,ENH] Efforts towards keeping memory low Nov 1, 2017
@oesteban oesteban changed the title [WIP,ENH] Efforts towards keeping memory low [ENH] Efforts towards keeping memory low Nov 2, 2017

oesteban commented Nov 2, 2017

Edited the PR initial message.


@effigies effigies left a comment


A couple of mostly stylistic issues (questions, really), but I think the new ROI resampling strategy still needs to mask in the BOLD space.


for idx in indexes[1:]:
    data += nb.load(in_files[idx]).get_data()
Member


The goal here is to avoid keeping N files in memory?

Member Author


Not really; I was just avoiding the use of fsl.maths. This interface just adds tissue probability maps, which are small files by definition. Maybe I don't see what you're getting at...

Member


Just kind of wondering about the whole goal of this.


@effigies effigies Nov 2, 2017


I see. If you're not trying to save memory and you really want the sum and not the logical OR (the docstring needs fixing, btw), you can replace a lot of this logic (up to this point):

def _run_interface(self, runtime):
    indices = self.inputs.indices or slice(None)
    in_files = np.array(self.inputs.in_files)[indices]

    if len(in_files) == 1:
        self._results['out_file'] = in_files[0]
        return runtime

    im = nb.concat_images(in_files)
    data = im.get_data().sum(axis=3)
    out_file = fname_presuffix(in_files[0], suffix='_tpmsum')
    im.__class__(data, im.affine, im.header).to_filename(out_file)
Member


I believe this is boolean? Might be worth making sure that it's saved as np.uint8.

Member Author


Ok, I'm double-checking. That should be a float: this interface works directly on the TPMs, and it should not change the original dtype. I think it is safe to leave it this way.
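The dtype question can be illustrated quickly: summing floating-point TPM arrays preserves a float dtype, whereas a binarized ROI is compactly stored as np.uint8 (a hedged numpy sketch, not the interface's actual code):

```python
import numpy as np

# Two toy TPM volumes stored as float32, as TPMs typically are.
rng = np.random.default_rng(0)
tpms = [rng.random((3, 3, 3)).astype(np.float32) for _ in range(2)]

# Summing TPMs keeps the floating-point dtype of the inputs.
tpm_sum = sum(tpms)

# A thresholded binary mask, by contrast, only needs one byte per voxel.
roi = (tpm_sum > 0.9).astype(np.uint8)

print(tpm_sum.dtype, roi.dtype)  # float32 uint8
```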

@@ -549,9 +562,60 @@ def init_func_preproc_wf(bold_file, ignore, freesurfer,
('outputnode.bold_mask', 'inputnode.bold_mask')]),
])

# Apply transforms in 1 shot
# Only use uncompressed output if AROMA is to be run
bold_bold_trans_wf = init_bold_preproc_trans_wf(
Member


Why bold_bold on this and next workflow?

Member Author


Well, it's kind of a bold-to-bold resampling, since the others are called bold_mni_ and bold_t1_ ...

Member


Okay, I'm going to need to reread, because I did not catch that this wasn't bold->T1w. So is this so that we have an output if it's --func-only?

Member Author


There are several targets with this PR:

  1. Effectively, working in BOLD space will make it a lot easier to implement this functionality with --func-only. However, as it is now, we still need segmentations from the T1w. It is just easier now to replace them with others (e.g., coming from an EPI atlas).
  2. It targets the memory footprint, since the BOLD files we use to compute the confounds are as small as the original BOLD files. BOLD resampled into T1w space was generally at least one order of magnitude bigger.
  3. With the new reportlet (std before and after resampling), we add some means for the reports to show whether HMC worked well. You were right that the report I showed you mixes up all corrections (STC, HMC and SDC), but I'd expect the largest contributor to differences in that plot to be HMC.

Member


Why would resampled BOLD on T1 be bigger? I thought the reduced FoV was supposed to resolve that? (We don't upsample.)

And I agree that the vast majority of the change will be HMC.

Member Author


That's the thing: reduced FoV optimizes the tilt of the axial plane to minimize the number of slices you acquire, right?

If you bring that reduced FoV to the T1w, which is usually acquired with the AC-PC axis more or less aligned with the axial plane, then the projections of the brain mask onto the planes are about the same in X and Y, but a lot larger in Z. This is because the axial plane of the BOLD will be very tilted w.r.t. the axial plane of the T1w.

As a result, you have an image of the same resolution as the original BOLD, with about the same number of voxels in X and Y, but a lot more in Z. If the time-series is long, the size of the image grows quickly.

Finally, and as I commented below w.r.t. the mask, the problems we were having with the limited FoV derived from mapping those partial FoVs to T1w space and the poor coverage of the T1w brain mask. If you keep things in BOLD space, you don't suffer from that problem.
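The size argument can be made concrete with rough, purely illustrative numbers (the dimensions and tilt below are hypothetical, not from any dataset in this PR): resampling a tilted slab onto a T1w-aligned grid inflates the Z extent of the bounding box.

```python
import numpy as np

# Hypothetical partial-FoV BOLD grid: 64x64 in-plane, 24 slices, 200 volumes.
nx, ny, nz, nt = 64, 64, 24, 200
native_voxels = nx * ny * nz * nt

# If the slice stack is tilted ~30 degrees w.r.t. the T1w axial plane, the
# axis-aligned bounding box in T1w space grows along Z roughly by the
# projection of the in-plane extent onto that axis.
tilt = np.deg2rad(30)
nz_t1 = int(np.ceil(nz * np.cos(tilt) + ny * np.sin(tilt)))
t1_voxels = nx * ny * nz_t1 * nt

print(f"native: {native_voxels:,} voxels; T1w-aligned: {t1_voxels:,} "
      f"(~{t1_voxels / native_voxels:.1f}x larger)")
```

Larger tilts, finer grids, and longer time-series all make the ratio worse, consistent with the point above.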

BOLD series mask in T1w space
t1_bold_xform
Affine matrix that maps the T1w space into alignment with
the native BOLD space
Member


I think you're going to re-introduce issues with partial-FoV BOLD series if you're not masking with the BOLD mask after resampling the masks created in T1w space. (I think it's fine to use this transform. But I don't think it replaces masking.)
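The masking suggested here amounts to an intersection with the BOLD brain mask after resampling; a minimal numpy sketch with hypothetical arrays (the actual workflow operates on NIfTI images after applying the T1w-to-BOLD transform):

```python
import numpy as np

shape = (5, 5, 5)

# ROI resampled from T1w space (hypothetical); it may extend beyond the
# slab actually covered by a partial-FoV BOLD acquisition.
roi_in_bold = np.zeros(shape, dtype=np.uint8)
roi_in_bold[1:4, 1:4, :] = 1

# BOLD brain mask: nonzero only where data were actually acquired.
bold_mask = np.zeros(shape, dtype=np.uint8)
bold_mask[:, :, 2:4] = 1

# Intersecting guarantees the ROI never leaves the acquired FoV.
roi_clipped = roi_in_bold & bold_mask
print(int(roi_clipped.sum()))  # 3x3 voxels per covered slice x 2 slices = 18
```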

Member Author


Oh, I see. I'll make sure masks do not go off limits with partial-FoV in BOLD space.

Member Author


Done

Member Author


Actually, I'm thinking this doesn't make any difference: we work in BOLD space (so it is still partial if it was partial in the first place). Nothing will be considered if it is off the FoV.

Member


Yeah, that seems correct. I suppose I was thinking we were in bold_space-T1w still.


oesteban commented Nov 2, 2017

I checked the resulting _confounds.tsv files and compared them to the former version. Everything is quite close to the originals, except for the vx-stdDVARS, which are very different.

Since we are now working in native space, I'm inclined to think that the new values of the voxelwise standardized DVARS are more trustworthy.
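For reference, the DVARS family compares each timepoint to the previous one; below is a simplified numpy sketch of plain DVARS and a voxelwise-standardized variant (illustrative only; the actual vx-stdDVARS computed by the pipeline follows a more detailed definition):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((100, 50))  # toy series: 100 voxels x 50 timepoints

# Plain DVARS: spatial RMS of the backward temporal difference.
diff = np.diff(data, axis=1)
dvars = np.sqrt(np.mean(diff ** 2, axis=0))

# A simple voxelwise standardization: scale each voxel's difference by that
# voxel's temporal standard deviation before taking the spatial RMS.
vx_sd = data.std(axis=1, ddof=1, keepdims=True)
vx_std_dvars = np.sqrt(np.mean((diff / vx_sd) ** 2, axis=0))

print(dvars.shape, vx_std_dvars.shape)  # one value per timepoint transition
```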


oesteban commented Nov 2, 2017

Confirmed this fixes #776.

in_files = InputMultiPath(File(exists=True), mandatory=True, desc='input list of ROIs')
ref_header = File(exists=True, mandatory=True,
                  desc='reference NIfTI file with desired output header/affine')
indexes = traits.List(traits.Int, desc='select specific maps')
Member


Grammar nitpick: indices



@oesteban oesteban merged commit 3166647 into nipreps:master Nov 2, 2017
@oesteban oesteban deleted the fix/memory-issues-revisit branch November 2, 2017 22:08