Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Improve _get_series_len performance #2406

Merged
merged 2 commits into from
Apr 28, 2021

Conversation

HippocampusGirl
Copy link
Contributor

@HippocampusGirl HippocampusGirl commented Apr 28, 2021

Fixes #2405

Changes proposed in this pull request

When dummy_scans is provided in the configuration, we do not need to estimate the number of non-steady-state volumes, thus increasing performance of workflow creation.

This effectively halves the time needed for init_func_preproc_wf in my test dataset.
2021-04-28_14-45

Documentation that should be reviewed

Not applicable

- When dummy_scans is provided in the configuration, we do not need to
  estimate the number of non-steady-state volumes
Comment on lines 883 to 886
if isinstance(config.workflow.dummy_scans, int):
skip_vols = config.workflow.dummy_scans
else:
skip_vols = _get_vols_to_discard(img)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config object is a hulking beast, and I'd prefer to keep its tentacles confined to init_*_wf() functions, and pass arguments to standalone functions, to make them easier to reason about.

def _get_series_len(bold_fname, skip_vols):
    from niworkflows.interfaces.registration import _get_vols_to_discard
    img = nb.load(bold_fname)
    if len(img.shape) < 4:
        return 1

    if skip_vols is None:
        skip_vols = _get_vols_to_discard(img)

    return img.shape[3] - skip_vols

And then, above, pass config.workflow.dummy_scans to _get_series_len().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to make an overhaul of the documentation, on the code guidelines section) and:

  • mention this (please do not use config deep in the workflow)
  • recommend black for everything except the workflow.connect function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't actually looked at the code guidelines in a while, so I read them again just now.

I realized that I probably also should have considered the branch naming/pull request naming, and added [ENH].

Copy link
Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I appreciate the attention you pay to details like this that improve performance.

@HippocampusGirl
Copy link
Contributor Author

Thank you :-) I actually only noticed this by accident while profiling something else.

@HippocampusGirl HippocampusGirl changed the title Improve _get_series_len performance [ENH] Improve _get_series_len performance Apr 28, 2021
@effigies effigies merged commit 84a6005 into nipreps:maint/20.2.x Apr 28, 2021
@HippocampusGirl HippocampusGirl deleted the series-len-performance branch May 13, 2021 07:49
@HippocampusGirl HippocampusGirl restored the series-len-performance branch October 19, 2021 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants