[ENH] Enable the clinica_file_reader to work with run numbers #943
Conversation
The linked data PR proposes to add some multi-run data for T1Linear and FLAIRLinear. Assuming the input for T1Linear:
The user should obtain:
Without the code from this PR, the user should face the following kind of error:
E clinica.utils.exceptions.ClinicaBIDSError: Clinica faced error(s) while trying to read files in your BIDS directory.
E Clinica encountered 1 problem(s) while getting T1w MRI:
E * (sub-02 | ses-M000): More than 1 file found:
E /Users/ci-aramis-clinica/data/clinica/clinica_data_ci/data_ci/T1Linear/in/bids/sub-02/ses-M000/anat/sub-02_ses-M000_run-03_T1w.nii.gz
E /Users/ci-aramis-clinica/data/clinica/clinica_data_ci/data_ci/T1Linear/in/bids/sub-02/ses-M000/anat/sub-02_ses-M000_run-01_T1w.nii.gz
E /Users/ci-aramis-clinica/data/clinica/clinica_data_ci/data_ci/T1Linear/in/bids/sub-02/ses-M000/anat/sub-02_ses-M000_run-02_T1w.nii.gz
I think this is ready for reviews.
LGTM, with a small suggestion in regex parsing.
Co-authored-by: Ghislain Vaillant <ghisvail@users.noreply.github.com>
Thanks for the review @ghisvail!
Description
The issue being addressed
This is a work-in-progress PR to analyze how we could support run numbers in Clinica.
Some of our converters, like GENFI2BIDS or OASIS32BIDS, already output files with run numbers, and pipelines behave differently when they encounter this unsupported entity.
The proposed solution
Most of the related logic lies around the clinica_file_reader function, which relies on _read_files_parallel or _read_files_sequential. These two functions take as input a BIDS/CAPS folder, a subject, a session, and a query, and they expect to get a single file based on these inputs. If the input BIDS dataset contains some files with run numbers and if the query captures the different runs of a given acquisition, then the file reader will raise an error saying that too many files were found.
This PR investigates how we could allow the file reader to handle the fact that a query returns multiple files. If this happens, we need to verify that the files found only differ through their run numbers. Otherwise, this is indeed an error and we raise as before.
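As an illustration of the check described above, here is a minimal sketch of how one might verify that a set of matched files differ only through their run numbers. The helper names (strip_run_entity, differ_only_by_run) and the regular expression are assumptions made for this example; they are not taken from the Clinica code base or from this PR's actual implementation.

```python
import re
from pathlib import Path

# Assumption: the run entity appears as "_run-<digits>" and is followed
# by "_" or "." in the file name (e.g. sub-02_ses-M000_run-03_T1w.nii.gz).
RUN_ENTITY = re.compile(r"_run-\d+(?=[_.])")


def strip_run_entity(filename: str) -> str:
    """Return the file name with its run entity removed (hypothetical helper)."""
    return RUN_ENTITY.sub("", filename, count=1)


def differ_only_by_run(files) -> bool:
    """Return True if all files are distinct runs of the same acquisition.

    Hypothetical helper: every file must carry a run entity, the files must
    be distinct, and removing the run entity must yield one identical path.
    """
    paths = [Path(f) for f in files]
    if len(paths) < 2 or len(set(paths)) != len(paths):
        return False
    if not all(RUN_ENTITY.search(p.name) for p in paths):
        return False
    stripped = {p.parent / strip_run_entity(p.name) for p in paths}
    return len(stripped) == 1
```

With such a check, the three T1w files from the error message in the conversation would be recognized as runs of the same acquisition, whereas a T1w file and a FLAIR file matched by the same query would still be treated as an error.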
If the files are really different runs, then I believe it makes sense to select only one of the runs to proceed (otherwise I'm not sure what would happen. I suspect the pipelines will crash with unexpected number of files in the input nodes, but we could try to see for real...).
This selection could be done in multiple ways:
Example
Assuming a BIDS input dataset like this:
This is what the file reader does with the proposed implementation.
Thoughts, comments, suggestions, or ideas are more than welcome!