Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

restrict find_matching_paths to only look outside of derivatives/ #1127

Closed
kaare-mikkelsen opened this issue Mar 8, 2023 · 7 comments · Fixed by #1281
Closed

restrict find_matching_paths to only look outside of derivatives/ #1127

kaare-mikkelsen opened this issue Mar 8, 2023 · 7 comments · Fixed by #1281
Labels
Nice First Contribution Opportunity This issue is good for new contributors to MNE-BIDS

Comments

@kaare-mikkelsen
Copy link
Contributor

kaare-mikkelsen commented Mar 8, 2023

Description of the problem

This command is supposed to find 4 .set files for the given subject, from 4 different subjects:

setFilePaths = mb.find_matching_paths(root=root,extensions='.set',datatypes='eeg')
However, it finds 8. Inspecting these 8, it turns out that 4 of them are 'bastard-files' with directories matching the subejct/session directories of the correct files, but basenames matching .set files in a derivatives folder. Looking in path.py, it seems that fpaths = _return_root_paths(root, datatype=datatypes, ignore_json=False) looks in all subfolders beneath root, including 'derivatives', after which _filter_fnames does not check whether anything has been collected accidentally. Finally, _fnames_to_bidspaths creates bidspaths using the inferred subject/session folder based on the basename, but with root specifically set to the root command passed to `find_matching_paths which leads to these 'bastard' bidspaths.

Steps to reproduce

search for bidsfiles in a root directory which also contains a derivatives folder with similar bids-formatted files inside.

Expected results

I suggest ensuring that "find_matching_paths" ignores 'sourcedata' and 'derivatives' when walking the folder tree.

Actual results

find_matching_paths returns too many bidspaths.

Additional information

I don't remember whether the BIDS standard requires subject folders be called 'sub-X', if so, it would be easy to only search folders matching that pattern. Otherwise, 'derivatives', 'sourcedata', 'code' and 'stimuli' might be good folders to ignore when searching for matching files.

@hoechenberger
Copy link
Member

I don't remember whether the BIDS standard requires subject folders be called 'sub-X', if so, it would be easy to only search folders matching that pattern

It does, so your proposal sounds like a good solution!

@sappelhoff
Copy link
Member

I don't remember whether the BIDS standard requires subject folders be called 'sub-X', if so, it would be easy to only search folders matching that pattern.

it does, but such a folder sub-X could also be nested in a derivatives folder. I think it'd be a reasonable solution to add a parameter that controls whether or not the derivatives folder is also considered for the search, or not.

@kaare-mikkelsen
Copy link
Contributor Author

My idea was to only allow sub-X folders in the top level. If people wish to search the derivates folder, I think it would make more sense to change the 'root'-parameter in the call?

@hoechenberger
Copy link
Member

I agree. I would also vote for ignoring derivatives by default. We wouldn't even know if the derivatives are BIDS-compliant

@sappelhoff
Copy link
Member

okay, sounds reasonable to me, ignoring derivatives by default ... and if someone is really keen on getting matches there, they can pass the derivatives directory as the root.

@hoechenberger hoechenberger added the Nice First Contribution Opportunity This issue is good for new contributors to MNE-BIDS label Sep 27, 2023
@sappelhoff
Copy link
Member

@kaare-mikkelsen do you want to make a PR for this?

@kaare-mikkelsen
Copy link
Contributor Author

@sappelhoff yes and no. It's on my to-do list (I agree with @hoechenberger that it's a nice first contribution opportunity), but unfortunately that list is always growing with more urgent stuff. If someone is more able/eager to solve it, they should just do it.

@sappelhoff sappelhoff removed the bug label Oct 24, 2023
@sappelhoff sappelhoff changed the title find_matching_paths looks both in the subject folders and the derivatives restrict find_matching_paths to only look outside of derivatives/ Oct 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Nice First Contribution Opportunity This issue is good for new contributors to MNE-BIDS
3 participants