-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add expert option to load BIDS layouts from database file #187
Conversation
You're going to want to add it to these places, as well: Line 75 in 1e45226
fitlins/fitlins/interfaces/bids.py Line 107 in 4fea54e
fitlins/fitlins/interfaces/bids.py Lines 230 to 231 in 4fea54e
fitlins/fitlins/interfaces/bids.py Line 421 in 4fea54e
And this layout should be built with derivatives. |
Codecov Report
@@ Coverage Diff @@
## master #187 +/- ##
==========================================
- Coverage 76.51% 76.48% -0.03%
==========================================
Files 18 18
Lines 1026 1029 +3
Branches 179 181 +2
==========================================
+ Hits 785 787 +2
- Misses 149 150 +1
Partials 92 92
Continue to review full report at Codecov.
|
Alright, trying to add the database_file option to those interfaces. Is there a preferred way to specify an optional input that should be an existing file path if provide, but be None otherwise? |
This or similar: if database_file is not None:
interface.inputs.database_file = database_file |
Something like this should work for specifying it in the input spec? class ModelSpecLoaderInputSpec(BaseInterfaceInputSpec):
bids_dir = Directory(exists=True,
mandatory=True,
desc='BIDS dataset root directory')
database_file = traits.File(exists=True,
desc='Optional path to bids database file.')
model = traits.Either('default', InputMultiPath(File(exists=True)),
desc='Model filename')
selectors = traits.Dict(desc='Limit models to those with matching inputs') |
Yes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks reasonable.
If this works for you, I'm good to merge it. It would be good to add tests at some point, but most of the lines we're currently missing require iterating over BIDS datasets with different properties, which makes it a finicky proposition. |
I've currently only tested that the code still works if you don't pass a database file, I'll try to test functionality when passed a database file sometime today and add it to the CircleCI config. I've got an abstract to write today though, so I may not get to it till later on in the week. |
Hello @Shotgunosine, Thank you for updating! Cheers! There are no style issues detected in this Pull Request. 🍻 To test for issues locally, Comment last updated at 2019-12-06 20:14:23 UTC |
@Shotgunosine Heads up that I resolved merge conflicts, so you'll want to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor style changes.
Co-Authored-By: Chris Markiewicz <effigies@gmail.com>
Requires #197... |
@effigies This is pretty much good to go. I think there's an issue with adding the code coverage file for the test I added. Do you think you could take a look at that? |
Yup. I'll get to this ASAP. |
fitlins/cli/run.py
Outdated
g_bids.add_argument('--force-index', action='store', default=None, | ||
help='regex pattern or string to include files') | ||
g_bids.add_argument('--ignore', action='store', default=None, | ||
help='regex pattern or string to ignore files') | ||
g_bids.add_argument('--desc-label', action='store', default='preproc', | ||
help="use BOLD files with the provided description label") | ||
g_bids.add_argument('--database-path', action='store', default=None, | ||
help="Caution, this is an Expert level option subject to change! " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this caution is necessary given how few people use this, and how it's overall an early stage project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is turning into less of a one-off hack. I guess we can remove the warning.
fitlins/cli/run.py
Outdated
reset_database = True | ||
make_layout = True | ||
elif Path(opts.database_path).exists(): | ||
layout = BIDSLayout.load(opts.database_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to let pybids handle some of this logic.
That is, is path exists, let pybids decide to load the database from it.
The problem with the current approach is that if someone passes a valid db path, but then changes the CLIs arguments (so that they would actually want to re-index), no error will be thrown. So their CLI options are being silently ignored in favor for the indexing options in the db_path.
I rather let pybids crash if the options mismatch. Also simplifies this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
# Go ahead and initialize the layout database | ||
if opts.database_path is None: | ||
database_path = Path(work_dir) / 'dbcache' | ||
reset_database = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to manually ask pybids to reset_database
? It will do it on its own if it doesn't file a valid sqlite
file inside the database_path
.
I suppose this is safer though, as it could be a mismatching dbcache
from another run.
An alternative here is to clean up the dbcache
folder on fitlins teardown. (and put it somewhre more hidden, at least make it .dbcache
). Or use a tmpdir, although @effigies seemed to think that would be a problem (not guaranted to be on disk).
I think the right solution is to use tempfile.TemporaryDirectory
but set prefix
to work_dir/.dbcache
. That way its cleaned up automatically. Or can also be cleaned up manually with ``cleanup` function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this is safer though, as it could be a mismatching
dbcache
from another run.
Yes, that's the purpose here. If somebody passes a db file, then we assume they know that it's valid. If we're creating it, we don't want to take any chances, and will always clear it.
I think the right solution is to use
tempfile.TemporaryDirectory
but setprefix
towork_dir/.dbcache
. That way its cleaned up automatically.
That would be fine, but I tend to think of anything in the scratch directory as disposable anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. Maybe I'm overthinking this then, and it might actually be useful to have the index there for debugging.
Maybe I'm over complicating things these, but I could imagine it being useful to not force re-index, when you re-run fitlins several times. I guess you could just build it yourself and pass the path in, but its nice to not have to manage that.
Feel free to ignore this suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deffo want it for debugging, and this way, if they want to reuse a layout saved in a working directory, they can pass the path for it explicitly, otherwise it gets reset in order to avoid nasty surprises.
fitlins/cli/run.py
Outdated
g_bids.add_argument('--force-index', action='store', default=None, | ||
help='regex pattern or string to include files') | ||
g_bids.add_argument('--ignore', action='store', default=None, | ||
help='regex pattern or string to ignore files') | ||
g_bids.add_argument('--desc-label', action='store', default='preproc', | ||
help="use BOLD files with the provided description label") | ||
g_bids.add_argument('--database-path', action='store', default=None, | ||
help="Caution, this is an Expert level option subject to change! " | ||
"Path to directory containing SQLite database indicies " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Path to directory containing SQLite database indicies " | |
"Path to directory containing SQLite database indices " |
fitlins/cli/run.py
Outdated
g_bids.add_argument('--force-index', action='store', default=None, | ||
help='regex pattern or string to include files') | ||
g_bids.add_argument('--ignore', action='store', default=None, | ||
help='regex pattern or string to ignore files') | ||
g_bids.add_argument('--desc-label', action='store', default='preproc', | ||
help="use BOLD files with the provided description label") | ||
g_bids.add_argument('--database-path', action='store', default=None, | ||
help="Caution, this is an Expert level option subject to change! " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is turning into less of a one-off hack. I guess we can remove the warning.
# Go ahead and initialize the layout database | ||
if opts.database_path is None: | ||
database_path = Path(work_dir) / 'dbcache' | ||
reset_database = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this is safer though, as it could be a mismatching
dbcache
from another run.
Yes, that's the purpose here. If somebody passes a db file, then we assume they know that it's valid. If we're creating it, we don't want to take any chances, and will always clear it.
I think the right solution is to use
tempfile.TemporaryDirectory
but setprefix
towork_dir/.dbcache
. That way its cleaned up automatically.
That would be fine, but I tend to think of anything in the scratch directory as disposable anyway.
fitlins/cli/run.py
Outdated
reset_database = True | ||
make_layout = True | ||
elif Path(opts.database_path).exists(): | ||
layout = BIDSLayout.load(opts.database_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
fitlins/cli/run.py
Outdated
if opts.participant_label is not None: | ||
subject_list = bids.collect_participants( | ||
opts.bids_dir, participant_label=opts.participant_label, | ||
database_path=database_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just pass the layout, rather than instantiating another one in the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it
fitlins/cli/run.py
Outdated
@@ -196,7 +229,7 @@ def run_fitlins(argv=None): | |||
except Exception: | |||
retcode = 1 | |||
|
|||
layout = BIDSLayout(opts.bids_dir, derivatives=derivatives) | |||
layout = BIDSLayout.load(database_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that layout is guaranteed to exist, I think we can just reuse it?
layout = BIDSLayout.load(database_path) |
fitlins/interfaces/bids.py
Outdated
@@ -142,6 +142,8 @@ class LoadBIDSModelInputSpec(BaseInterfaceInputSpec): | |||
desc='BIDS dataset root directory') | |||
derivatives = traits.Either(traits.Bool, InputMultiPath(Directory(exists=True)), | |||
desc='Derivative folders') | |||
database_path = Directory(exists=False, | |||
desc='Optional path to bids database directory.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, you're setting this up as an alternative to bids_dir/derivatives, but in _run_interface
, you're replacing them entirely. We should be consistent about which one we're doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you prefer it to be optional or to have LoadBIDSModel
and BIDSSelect
only work if passed a database_path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional at the CLI level for sure. But we could set up a default internal path (don't we already do that?).
fitlins/interfaces/bids.py
Outdated
@@ -395,6 +382,8 @@ class BIDSSelectInputSpec(BaseInterfaceInputSpec): | |||
desc='BIDS dataset root directories') | |||
derivatives = traits.Either(True, InputMultiPath(Directory(exists=True)), | |||
desc='Derivative folders') | |||
database_path = Directory(exists=False, | |||
desc='Optional path to bids database path.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again. Inputs say optional, runtime says replace.
@Shotgunosine Just a reminder that this one's waiting on you. |
Yeah, thanks, have not had time, but will try to get to it soon. |
Co-Authored-By: Chris Markiewicz <effigies@gmail.com>
Alright, changes made. Let me know if there's anything else to tweak. |
Thanks for pushing on this. Today was slow (sick kid), but I'll try to get to this on Monday. |
LGTM. Thanks for your patience. |
Closes #186