-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Improve _get_series_len
performance
#2406
[ENH] Improve _get_series_len
performance
#2406
Conversation
- When dummy_scans is provided in the configuration, we do not need to estimate the number of non-steady-state volumes
fmriprep/workflows/bold/base.py
Outdated
if isinstance(config.workflow.dummy_scans, int): | ||
skip_vols = config.workflow.dummy_scans | ||
else: | ||
skip_vols = _get_vols_to_discard(img) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config
object is a hulking beast, and I'd prefer to keep its tentacles confined to init_*_wf()
functions, and pass arguments to standalone functions, to make them easier to reason about.
def _get_series_len(bold_fname, skip_vols):
from niworkflows.interfaces.registration import _get_vols_to_discard
img = nb.load(bold_fname)
if len(img.shape) < 4:
return 1
if skip_vols is None:
skip_vols = _get_vols_to_discard(img)
return img.shape[3] - skip_vols
And then, above, pass config.workflow.dummy_scans
to _get_series_len()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to make an overhaul of the documentation, on the code guidelines section) and:
- mention this (please do not use config deep in the workflow)
- recommend black for everything except the
workflow.connect
function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't actually looked at the code guidelines in a while, so I read them again just now.
I realized that I probably also should have considered the branch naming/pull request naming, and added [ENH]
.
Code review by @effigies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I appreciate the attention you pay to details like this that improve performance.
Thank you :-) I actually only noticed this by accident while profiling something else. |
_get_series_len
performance_get_series_len
performance
Fixes #2405
Changes proposed in this pull request
When
dummy_scans
is provided in the configuration, we do not need to estimate the number of non-steady-state volumes, thus increasing performance of workflow creation.This effectively halves the time needed for
![2021-04-28_14-45](https://user-images.githubusercontent.com/789054/116406085-9a90a780-a830-11eb-8fdf-256198942b94.png)
init_func_preproc_wf
in my test dataset.Documentation that should be reviewed
Not applicable