
[ENH] integrate load_confounds into first_level_from_bids #4103

Merged: 21 commits merged into nilearn:main from bids_glm_confounds on Dec 18, 2023

Conversation

@Remi-Gau (Collaborator) commented Nov 9, 2023:

Changes proposed in this pull request:

  • use load_confounds to load a specific subset of confounds when extra confounds_* arguments are passed
  • update fake BIDS data generation so the generated datasets contain more realistic confounds

TODO

  • update changelog
  • investigate conflicts between confounds strategy and GLM detrending
  • add in example?

github-actions bot (Contributor) commented Nov 9, 2023:

👋 @Remi-Gau Thanks for creating a PR!

Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a GitHub draft.

Please make sure it is compliant with our contributing guidelines. In particular, be sure it checks the boxes listed below.

  • PR has an interpretable title.
  • PR links to a GitHub issue with the mention Closes #XXXX (see our documentation on PR structure)
  • Code is PEP8-compliant (see our documentation on coding style)
  • Changelog or what's new entry in doc/changes/latest.rst (see our documentation on PR structure)

For new features:

  • There is at least one unit test per new function / class (see our documentation on testing)
  • The new feature is demoed in at least one relevant example.

For bug fixes:

  • There is at least one test that would fail under the original bug conditions.

We will review it as quickly as possible; feel free to ping us with questions if needed.

Remi-Gau (Collaborator Author):

Had to move create_bids_filename to a different module to help avoid circular imports.

Remi-Gau (Collaborator Author):

Some of the changes in here are black-related.

They may conflict with #3285.

Comment on lines 1873 to 1885
models, m_imgs, m_events, m_confounds = first_level_from_bids(
    dataset_path=bids_path,
    task_label="main",
    space_label="MNI",
    img_filters=[("desc", "preproc")],
    slice_time_ref=None,
    confounds_strategy=("motion", "wm_csf", "scrub"),
    confounds_motion="full",
    confounds_wm_csf="basic",
    confounds_scrub=1,
    confounds_fd_threshold=0.2,
    confounds_std_dvars_threshold=3,
)
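
Editorial aside (not part of the quoted test): a minimal sketch of how the four returned lists are typically used together, assuming one entry per subject and the standard FirstLevelModel.fit arguments.

# Sketch only: fit each subject's model with its run images, events and
# the confounds selected above.
for model, imgs, events, confounds in zip(
    models, m_imgs, m_events, m_confounds
):
    model.fit(imgs, events=events, confounds=confounds)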
Remi-Gau (Collaborator Author), Nov 9, 2023:

This is what the "API" could look like to select some confounds: does it look sensible?

Remi-Gau (Collaborator Author):

@htwangtw @ymzayek @bthirion
keeping this as a draft so you can discuss API, implementation...

Will update the doc once the dust settles.

Member:

I guess this could be made lighter if we expect to have predefined configurations for confounds, but I don't think that we're at that point.
At least I find the current API quite explicit.

Member:

Actually, we might get rid of confounds_strategy, because the next three arguments are redundant with the provided list?

Remi-Gau (Collaborator Author):

Note that this test just shows a subset of the possible arguments that can be passed to load_confounds.

> I guess this could be made lighter if we expect to have predefined configurations for confounds, but I don't think that we're at that point.

> Actually, we might get rid of confounds_strategy, because the next three arguments are redundant with the provided list?

Actually I am just reusing the API from load_confounds, so it can almost be passed as-is and we can let load_confounds do the argument validation:

https://nilearn.github.io/dev/modules/generated/nilearn.interfaces.fmriprep.load_confounds.html

In short, strategy defines what types of confounds to include, and all the other parameters give more details on how to include them.
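
For reference, a minimal sketch of calling load_confounds directly (the image path below is a placeholder): each confounds_* kwarg in the snippet above maps onto one of these parameters once the confounds_ prefix is stripped.

from nilearn.interfaces.fmriprep import load_confounds

confounds, sample_mask = load_confounds(
    "sub-01_task-main_space-MNI_desc-preproc_bold.nii.gz",  # placeholder path
    strategy=("motion", "wm_csf", "scrub"),  # what to include
    motion="full",     # how: full expansion of the motion parameters
    wm_csf="basic",    # how: mean WM and CSF signals
    scrub=1,
    fd_threshold=0.2,
    std_dvars_threshold=3,
)
# confounds is a DataFrame; sample_mask indexes the volumes to keep.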

Remi-Gau (Collaborator Author):

Will update the docstring to try to explain this so we can see if it makes sense.

@Remi-Gau Remi-Gau marked this pull request as draft November 9, 2023 11:56
@Remi-Gau (Collaborator Author) commented Nov 9, 2023:

One thing we may want to check and warn about: the compatibility between some confound strategies and first-level arguments, especially regarding high-pass filters.

Comment on lines +1363 to +1368
confounds, metadata = get_legal_confound()
confounds.to_csv(
    confounds_path, sep="\t", index=None, encoding="utf-8"
)
with open(confounds_path.with_suffix(".json"), "w") as f:
    json.dump(metadata, f)
Remi-Gau (Collaborator Author):

One issue with this approach is that the "legal_confounds" have a set number of time points, so when creating a fake BIDS dataset we end up with images whose number of time points does not match the number of time points in the confounds.

This does not affect any tests AFAICT, but it may lead to confusing errors when testing down the line.

Member:

You mean, because of scrubbing? Sorry if I am missing something obvious.

Remi-Gau (Collaborator Author):

Nope, let me try to rephrase.

The way we generate "fake confounds" for the fake fMRIPrep datasets used in testing would only create 6 confounds for the realignment parameters, filled with random data, for a specified number of time points.

To allow testing load_confounds we need more realistic confounds, with more columns whose names match what is in an actual fMRIPrep dataset.

To do this I reuse the strategy used to test the load_confounds functions: take an actual confounds file from an fMRIPrep dataset and copy its content every time it is needed in the fake fMRIPrep dataset.

But this "template" confounds file has only a limited number of time points, so we end up with fake fMRIPrep datasets whose NIfTI images have 100 volumes but whose confounds have only 30 time points.

Possible solutions:

  • easy: set the number of volumes to match the number of time points in the confounds
  • hard(er): adapt the content of the confounds to the number of time points

Hope this is clearer.

For now I will go for the easy solution, but we may have to implement the harder option in the future if we want to test more "exotic" stuff.
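
Editorial sketch of the "easy" option (the file name is hypothetical, not the PR's actual code): derive the number of volumes from the template confounds instead of choosing it independently.

import pandas as pd

# Template confounds copied from a real fMRIPrep dataset (hypothetical path).
template_confounds = pd.read_csv("confounds_template.tsv", sep="\t")

# Easy fix: give the fake BOLD images exactly as many volumes as the
# template confounds have time points, so images and confounds stay in sync.
N_VOLUMES = len(template_confounds)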

Member:

Let's go for the easy one. The number of volumes should be a parameter of the data simulation function anyhow?

Remi-Gau (Collaborator Author):

For now it is not: it is hard-coded. I would keep it that way until we need more flexibility during testing.

But I will change the place where it is hard-coded so it is easier to adapt in the future, and will also add a comment to explain why this value was chosen.

Comment on lines +1368 to +1369
with open(confounds_path.with_suffix(".json"), "w") as f:
    json.dump(metadata, f)
Remi-Gau (Collaborator Author):

Minor change: using legal_confounds allows adding metadata files for the confounds in the fake BIDS derivatives. Some tests had to be changed to account for this.

codecov bot commented Nov 9, 2023:

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (a1810e3) 91.85% compared to head (d3588e1) 91.86%.

Files Patch % Lines
nilearn/_utils/bids.py 86.66% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4103      +/-   ##
==========================================
+ Coverage   91.85%   91.86%   +0.01%     
==========================================
  Files         145      146       +1     
  Lines       16360    16384      +24     
  Branches     3424     3432       +8     
==========================================
+ Hits        15027    15051      +24     
+ Misses        792      788       -4     
- Partials      541      545       +4     
Flag Coverage Δ
macos-latest_3.10_test_plotting 91.72% <96.49%> (+0.01%) ⬆️
macos-latest_3.11_test_plotting ?
macos-latest_3.12_test_plotting 91.72% <96.49%> (+0.01%) ⬆️
macos-latest_3.8_test_plotting 91.68% <96.49%> (+0.01%) ⬆️
macos-latest_3.9_test_plotting 91.69% <96.49%> (+0.01%) ⬆️
ubuntu-latest_3.10_test_plotting ?
ubuntu-latest_3.11_test_plotting ?
ubuntu-latest_3.12_test_plotting 91.72% <96.49%> (+0.01%) ⬆️
ubuntu-latest_3.12_test_pre 91.72% <96.49%> (+0.01%) ⬆️
ubuntu-latest_3.8_test_min 68.95% <96.49%> (?)
ubuntu-latest_3.8_test_plot_min ?
ubuntu-latest_3.8_test_plotting ?
ubuntu-latest_3.9_test_plotting ?
windows-latest_3.8_test_plotting 91.66% <96.49%> (+0.01%) ⬆️
windows-latest_3.9_test_plotting 91.66% <96.49%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

@Remi-Gau (Collaborator Author) commented Nov 9, 2023:

Thought: should this functionality be demoed in the examples?

Comment on lines 1020 to 1032
kwargs: :obj:`dict`

    .. added:: 0.11.0

    Keyword arguments to be passed to functions called within this function.

    Kwargs prefixed with ``confounds_``
    will be passed to :func:`~nilearn.interfaces.fmriprep.load_confounds`.
    This allows ``first_level_from_bids`` to return
    a specific set of confounds by relying on the confound loading strategies
    defined in :func:`~nilearn.interfaces.fmriprep.load_confounds`.
    If no kwargs are passed, ``first_level_from_bids`` will return
    all the confounds available in the confounds TSV files.
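
Editorial illustration of the forwarding mechanism this docstring describes (the helper name is hypothetical, not the PR's actual implementation): kwargs carry the ``confounds_`` prefix on the ``first_level_from_bids`` side and lose it before being handed to ``load_confounds``.

def _split_confounds_kwargs(**kwargs):
    """Hypothetical helper: strip the 'confounds_' prefix so the remainder
    can be passed to nilearn.interfaces.fmriprep.load_confounds."""
    prefix = "confounds_"
    return {
        key[len(prefix):]: value
        for key, value in kwargs.items()
        if key.startswith(prefix)
    }

print(_split_confounds_kwargs(confounds_strategy=("motion", "wm_csf"),
                              confounds_motion="full"))
# {'strategy': ('motion', 'wm_csf'), 'motion': 'full'}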
Remi-Gau (Collaborator Author):

@bthirion let me know if this helps clarify how to use this.

I prefer to add examples here and refer users to the load_confounds doc for the details, to avoid duplication.

Member:

Agreed!

@bthirion (Member):

Is this ready for review?

@Remi-Gau Remi-Gau marked this pull request as ready for review November 14, 2023 09:33
@Remi-Gau (Collaborator Author):

I would say yes, though I need to better check how to handle the second TODO mentioned in the top post of the PR:

  • investigate conflicts between confounds strategy and GLM detrending

@bthirion (Member) left a comment:

LGTM overall!


.. code-block:: python

models, m_imgs, m_events, m_confounds = first_level_from_bids(
Member:

Suggested change:

- models, m_imgs, m_events, m_confounds = first_level_from_bids(
+ models, imgs, events, confounds = first_level_from_bids(

(not sure what the `m_` means)

Remi-Gau (Collaborator Author):

That was a copy-paste from the code we have in the tests; we could probably change it there too.

Remi-Gau (Collaborator Author):

FYI, those were shorter forms of the names of the returned arguments:

  • models_run_imgs
  • models_events
  • models_confounds

Remi-Gau (Collaborator Author):

So maybe, even if this makes the doc more verbose, I should use their full names to be internally consistent in the docstring?

Member:

I would prefer models, imgs, events, confounds = first_level_from_bids

Remi-Gau (Collaborator Author):

@bthirion I renamed those variables in the docstrings.

They also appear like this in a few tests; I could do a bit of renaming there too, just for internal consistency.

nilearn/glm/first_level/first_level.py (outdated review comment, resolved)
@ymzayek (Member) left a comment:

Looking good!

nilearn/glm/first_level/first_level.py (3 outdated review comments, resolved)
@ymzayek (Member) commented Nov 15, 2023:

I'm not against demoing in an example. I would just always consider the amount of additional build time, whether it is already well documented (which the added docstring does very well), and whether parts of an existing example can be replaced or minimally tweaked to improve them using this new functionality.

@Remi-Gau (Collaborator Author):

> I'm not against demoing in an example. I would just always consider the amount of additional build time, whether it is already well documented (which the added docstring does very well), and whether parts of an existing example can be replaced or minimally tweaked to improve them using this new functionality.

I will do a draft in a separate PR, but I was considering a minimal tweak to an already existing example.

Remi-Gau and others added 2 commits November 16, 2023 10:27
Co-authored-by: Yasmin <63292494+ymzayek@users.noreply.github.com>
@Remi-Gau (Collaborator Author) commented Dec 7, 2023:

"Conflict" to resolve:

load_confounds

  • load_confounds has the possibility to add a high pass filter to the confounds
- "high_pass" adds discrete cosines transformation basis regressors to handle low-frequency signal drifts.

and when the compcor strategy is requested "high_pass" must be as well

- "compcor" confounds derived from CompCor :footcite:`Behzadi2007`.
  When using this noise component, "high_pass" must also be applied.
  Associated parameter: `compcor`, `n_compcor`

GLM first level

drift_model can be cosine or polynomial or none

    drift_model : string, default='cosine'
        This parameter specifies the desired drift model for the design
        matrices. It can be 'polynomial', 'cosine' or None.

    high_pass : float, default=0.01
        This parameter specifies the cut frequency of the high-pass filter in
        Hz for the design matrices. Used only if drift_model is 'cosine'.

    drift_order : int, default=1
        This parameter specifies the order of the drift model (in case it is
        polynomial) for the design matrices.

There are some combinations of the above that could lead to "strange" design matrices:

  • design matrices with high pass filter defined twice (with possibly different types of high pass filters)

To keep things simple, I would say that we let the GLM machinery handle the high-pass filtering and we ignore, in load_confounds, anything that has to do with high-pass filtering.

This means that we should also ignore (for now) the compcor strategy.
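
To make the potential clash concrete, here is an editorial sketch (not code from the PR; the confound column names only mimic fMRIPrep's cosine regressors): passing confounds that already contain cosine regressors to a design matrix built with drift_model="cosine" yields two sets of high-pass regressors.

import numpy as np
import pandas as pd
from nilearn.glm.first_level import make_first_level_design_matrix

t_r, n_scans = 2.0, 100
frame_times = np.arange(n_scans) * t_r

# Pretend these came from load_confounds with the "high_pass" strategy
# (fMRIPrep names such columns cosine00, cosine01, ...; values are made up).
confounds = pd.DataFrame({
    "cosine00": np.cos(np.pi * np.arange(n_scans) / n_scans),
    "cosine01": np.cos(2 * np.pi * np.arange(n_scans) / n_scans),
})

# drift_model="cosine" makes the GLM add its own drift_* regressors, so the
# design matrix now carries two flavours of high-pass filtering.
design = make_first_level_design_matrix(
    frame_times,
    drift_model="cosine",
    high_pass=0.01,
    add_regs=confounds,
)
print(design.columns)  # both cosine* and drift_* columns are present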

@Remi-Gau (Collaborator Author) commented Dec 7, 2023:

@bthirion before I start on this, can you tell me if the brief description above roughly makes sense as to what the problem is?

@bthirion (Member) commented Dec 7, 2023:

You mean, the problem of duplicating the high-pass filtering when using CompCor? My view on this is that it is the user's responsibility. You can't prevent people from doing wrong things. The point is to help them diagnose it easily using visualization of the design matrix they created. Does that answer your concern?

@Remi-Gau (Collaborator Author):

You can't prevent people from doing wrong things but you can add more friction to make it harder for them to do so. 😋

I think my views may be a bit more "paternalistic" (more tainted by automated pipelines and avoiding options that would be considered wrong in most cases), but you have way more experience than me in the "nilearn philosophy", so I will gladly follow your lead on this.

I will at least add some explicit warnings to tell them they have chosen options that may be redundant or clash with each other.

@bthirion (Member):

Agreed. My impression is that when we impose constraints, we may not foresee all practical consequences.
This is why I think it is important to have good examples or even tutorials that promote good patterns.

@htwangtw (Member):

Currently we do safeguard the compcor and high_pass combination in load_confounds. I have actually already seen people use other high-pass approaches, rather than the cosine regressor approach, with fMRIPrep's compcor, even though that is not recommended in their official documentation. Raising a warning might be the way to go.

@nilearn nilearn deleted a comment from D3njo Dec 12, 2023
- models, m_imgs, m_events, m_confounds = first_level_from_bids(
+ models, imgs, events, confounds = first_level_from_bids(
Remi-Gau (Collaborator Author):

Renamed the variables as mentioned in the PR discussion.

Comment on lines +1137 to +1148
if drift_model is not None and kwargs_load_confounds is not None:
    if "high_pass" in kwargs_load_confounds.get("strategy"):
        if drift_model == "cosine":
            verb = "duplicate"
        if drift_model == "polynomial":
            verb = "conflict with"

        warn(
            f"""Confounds will contain a high pass filter,
that may {verb} the {drift_model} one used in the model.
Remember to visualize your design matrix before fitting your model
to check that your model is not overspecified.""",
Remi-Gau (Collaborator Author):

Not sure about the phrasing of the warning.

Member:

Looks reasonable to me.

Member:

To me too.

@bthirion (Member) left a comment:

LGTM, thx.


@Remi-Gau (Collaborator Author):

If I don't hear back from anyone I will merge this on Monday.

@htwangtw (Member) left a comment:

LGTM, thanks for implementing this!

@Remi-Gau Remi-Gau merged commit dfe2d54 into nilearn:main Dec 18, 2023
32 checks passed
@Remi-Gau Remi-Gau deleted the bids_glm_confounds branch December 18, 2023 07:12