Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No clear way to ignore known dummy frames for ICA-AROMA, etc... #13

Closed
alexlicohen opened this issue May 17, 2018 · 23 comments
Closed

No clear way to ignore known dummy frames for ICA-AROMA, etc... #13

alexlicohen opened this issue May 17, 2018 · 23 comments

Comments

@alexlicohen
Copy link

We are processing the GSP 1000 dataset, which does have dummy frames in the fMRI data, but we want to use the ICA-AROMA output. It would seem more logical to remove these frames up front before MELODIC is run. Informal comparison finds distinct differences in fcMaps between leaving them in vs truncating the file before fmriprep is run. What should be the appropriate sequence of steps here?

@oesteban
Copy link
Member

It seems to me like a very reasonable feature that we can add and set on by default. WDYT @chrisfilo @jdkent ?

@effigies
Copy link
Member

So we already detect non-steady-state volumes. Would we be able to reuse that? And is that a sensible default behavior?

@alexlicohen
Copy link
Author

alexlicohen commented May 17, 2018 via email

@effigies
Copy link
Member

That seems reasonable. My main concern in asking is whether we can avoid adding more options, when we don't need need to. Come to think of it, I wonder if there's anything in BIDS indicating the first non-dummy scan. It seems like the place to encode information like that is in the dataset, not in the command line arguments.

@jdkent
Copy link

jdkent commented May 17, 2018

Interesting problem!

My intuition was that ICA-AROMA would generate noise component(s) that were associated with the dummy frames and wouldn't (significantly) impact the other signal components, but this wouldn't be the first (nor the last) time my intuition was incorrect.

@alexlicohen When you compared the truncated versus non-truncated ICA-AROMA methods, you truncated the non-truncated data after ICA-AROMA, correct? I assume so, but I want to be explicit. I think I have some data that can play with truncating a different number of volumes and see if the components change.

As for the general process of truncation (outside the context of MELODIC/ICA-AROMA), I understand that this can be accomplished with adding columns to the design matrix with a "1" in the row that represents the volume you wish to ignore and "0"s for every other row. Given that is correct, you can effectively remove that volume without manipulating the data.

My concern for changing the number of volumes that are output by fmriprep is that it may change the interpretation of the events.tsv file if the fMRI data was constructed for a task. I like @effigies suggestion to incorporate the # of dummy volumes in the dataset itself

Thus if we decide that dummy volumes do negatively impact the generation of MELODIC components, then we could remove the volumes before MELODIC, run ICA-AROMA, and then either:

  • keep the volumes removed (and hope the events.tsv accounts for this)
  • add the unaltered dummy volumes back in and pad the aroma confounds in the confounds.tsv with the average of the component (or 0?)

@alexlicohen
Copy link
Author

alexlicohen commented May 17, 2018 via email

@chrisgorgo
Copy link

We could do what we are doing already for compcor - see nipreps/fmriprep#361

@effigies
Copy link
Member

Yeah, and I think @alexlicohen was saying that would be a reasonable default (speaking of which, is there a citation for AROMA or CompCor or similar algorithms performing more reliably with dummy scans removed?), but there should be a way of specifying the known number of dummy scans. I agree that's useful, but think it makes more sense to be encoded in BIDS metadata that we read. Is there such a field, or would you have any objection to proposing such a field?

@alexlicohen
Copy link
Author

alexlicohen commented May 18, 2018

Yes, we've got two fields, albeit now I see there is a typo in spec 1.1.0 (the first one should be "...discarded by the scanner (as opposed to...)

NumberOfVolumesDiscardedByScanner: Number of volumes ("dummy scans")
discarded by the user (as opposed to those discarded by the user post hoc) before saving the
imaging file. For example, a sequence that automatically discards the first 4 volumes before saving
would have this field as 4. A sequence that doesn't discard dummy scans would have this set to 0.
Please note that the onsets recorded in the event.tsv file should always refer to the beginning of 
the acquisition of the first volume in the corresponding imaging file - independent of the value of 
NumberOfVolumesDiscardedByScanner field.
 

NumberOfVolumesDiscardedByUser: Number of volumes ("dummy scans") discarded
by the user before including the file in the dataset. If possible, including all of the volumes is
strongly recommended. Please note that the onsets recorded in the event.tsv file should always 
refer to the beginning of the acquisition of the first volume in the corresponding imaging file - 
independent of the value of NumberOfVolumesDiscardedByUser field.
 

@chrisgorgo
Copy link

Thanks for reporting the typo - I fixed it in the current draft.

Some questions/comments:

  1. Do you have any examples where non-steady state detector did work correctly?
  2. Have you tried adding the NonSteadyStateOutlierXX regressors from confounds.tsv to your temporal regression/cleaning model or using it for censoring data prior to estimating connectivity? This could be the easiest way to clean up fcMaps.
  3. The NumberOfVolumesDiscardedByScanner and NumberOfVolumesDiscardedByUser BIDS fields describe how many volumes were removed (past tense) from data prior bidsification. This does not prescribe how many volumes should be ignored in preprocessing.

@alexlicohen
Copy link
Author

alexlicohen commented May 18, 2018 via email

@chrisgorgo
Copy link

Just to make it clear - I don't oppose adding this feature, but I would like to explore if what we already provide (via non-steady state detection) is sufficient. We iterated over the heuristic a couple of times, but if it is not robust enough there is indeed a need for manual option.

@alexlicohen
Copy link
Author

I will look through our tsv files, to be clear, this is what we've tried so far:

(Using a subject from the GSP which all retain the nonsteadystate frames: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/25833)

GSP -> fmriprep, use ICA-AROMA output -> cut 4 frames -> correlation maps
Vs
GSP -> cut 4 frames -> fmriprep, use ICA-AROMA output -> correlation maps

There were obvious differences spatially and strength-wise in the maps.

We haven't yet tried (but I'm having our RA do this now):
GSP -> fmriprep, use ICA-AROMA output -> GLM with tsv confounds (including the nonsteadystate column) -> correlation maps

We have also done:
GSP -> fmriprep, use regular output -> cut 4 frames -> GLM with (whole brain, csf, WM, 6motion) or (aCompCor) tsv confounds -> correlation maps
Vs
GSP -> cut 4 frames -> fmriprep, use regular output -> GLM with (whole brain, csf, WM, 6motion) or (aCompCor)tsv confounds -> correlation maps
I can't recall if these were different when we looked at it earlier this week, given all of the variations, I'm asking now

but not::
GSP -> fmriprep, use regular output -> GLM with tsv confounds (including the nonsteadystate column) -> correlation maps

@chrisgorgo
Copy link

Just to clarify. There are many potential confounds in the confounds.tsv file (including ones necessary for performing aggressive ICA denoising). For your comparison I would only include the columns related to non-steady state.

How many non-steady state volumes are detected in your data? Are any of the ICA components picking on them? Are those components classified as noise?

@alexlicohen
Copy link
Author

alexlicohen commented May 18, 2018

We usually get two columns:

  • NonSteadyStateOutlier00 and NonSteadyStateOutlier01, but sometime one or both are missing, the 00 column has a 1 for the first timepoint only, the 01 column has a 1 for the second timepoint only.
  • The aCompCor and tCompCor columns are zeros for the first two timepoints
  • The AROMAAggrComp columns are high values for those timepoints, then settle down.
  • It shows up in the CSF, WM, Global, and DVARs columns too:

I.e.:

NonSteadyStateOutlier00 NonSteadyStateOutlier01 aCompCor00 aCompCor01
1 0 0.000… 0.000…
0 1 0.000… 0.000…
0 0 0.076… -0.094…
0 0 0.034… -0.173…
0 0 -0.031… -0.111…
NonSteadyStateOutlier00 NonSteadyStateOutlier01 AROMAAggrComp01 AROMAAggrComp02
1 0 11.020… 10.795…
0 1 0.387… 1.941…
0 0 -0.038… 0.411…
0 0 -0.125… 0.201…
0 0 -0.105… 0.036…
NonSteadyStateOutlier00 NonSteadyStateOutlier01 CSF WhiteMatter GlobalSignal stdDVARS FramewiseDisplacement
1 0 669.692… 61.204… 163.751…
0 1 14.917… -2.558… 2.423… 11.573… 0.304…
0 0 -13.262… -2.348… -3.984… 1.422… 0.074…
0 0 -18.783… -2.356… -5.059… 0.963… 0.073…
0 0 -18.926… -1.407… -5.939… 0.898… 0.079…

@chrisgorgo
Copy link

These findings are consistent with my expectations:

  1. outliers are one hot encoded to avoid any assumptions of relations between them
  2. outliers are removed from data used to estimate compcor (hence zeros)
  3. ICA is picking up on the outliers (but it will stop if we denoise the data the same way we do for compcor)

I think the main question is if there are cases where you have evidence of more non-steady state volumes than detected in data which would warrant the manual specification. If not we can just implement the pre ICA denoising using non-steady state detector outputs.

@alexlicohen
Copy link
Author

One motivation is to try and replicate the output of an older (no longer supported) pipeline using fmriprep. In that pipeline, the first 4 frames were removed. Other than duplicating the entire dataset sans the first four frames, there is not a current way to do that in fmriprep...

I agree that it likely does the detection correctly, and steady state is achieved by frame three in our dataset, but it is not consistent across subjects and I think adding the option of specifically dropping a set numbers is a valid one.

(as a general default, I agree with your idea for using the nonsteadystate flags going into the ICA processing.)

@alexlicohen
Copy link
Author

alexlicohen commented May 23, 2018

If not we can just implement the pre ICA denoising using non-steady state detector outputs.

@chrisfilo Can we go ahead and add this for now for testing purposes?

Per the original AROMA paper, they did this too:

Preprocessing of the training set was carried out using tools from the FMRIB Software Library (FSL; http://www.fmrib.ox.ac.uk/fsl; Smith et al. (2004), Woolrich et al. (2009), Jenkinson et al. (2012)) and involved (1) removal of the first five volumes to allow for signal equilibration, (2) head movement correction by volume-realignment to the middle volume using MCFLIRT, (3) global 4D mean intensity normalization and (4) spatial smoothing (6 mm FWHM). Importantly, no temporal filtering was applied at this stage of processing. Next, we applied ICA to the preprocessed participant-level data, using automatic dimensionality estimation as implemented in FSL MELODIC.

So the intended use of AROMA is for data that is already all steady state...

@chrisgorgo
Copy link

chrisgorgo commented May 23, 2018 via email

@alexlicohen
Copy link
Author

Chris, I wish I could, but I'm not familiar enough with the fmriprep code to know where to change this. Can you point me in the right direction, and/or is someone else available to do this?

-Alex

@chrisgorgo
Copy link

I think the best way would be to add a regression step just before MELODIC
https://github.com/poldracklab/fmriprep/blob/5f55944ad78512f9e5a86c2c65eda23bfdba23e1/fmriprep/workflows/bold/confounds.py#L484

You can reuse some of the filtering code from compcor

https://github.com/nipy/nipype/blob/a01d4b4e127f45e9a8e514c22de95c17aeb3bf6c/nipype/algorithms/confounds.py#L558

and

https://github.com/nipy/nipype/blob/a01d4b4e127f45e9a8e514c22de95c17aeb3bf6c/nipype/algorithms/confounds.py#L1128

All of this could be controlled by a command line switch which will alternate between removing detected non-steady volumes, a fixed number or none.

I hope this helps! If you open a PR early (as a work in progress) we can guide you more accurately.

@oesteban
Copy link
Member

Hi @rciric, could you have a look into this discussion and make a suggestion as regards what we can do in this regard in the context of nipreps/fmriprep#1540 ?

@tsalo
Copy link
Collaborator

tsalo commented May 14, 2024

This is now built into the fMRIPost-AROMA CLI.

@tsalo tsalo closed this as completed Aug 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants