ENH (sugaring): -c | --cfg-proc option for rev-create to specify cfg_ procedure(s) to run #3353
ATM there is no easy way to specify procedures to run on a freshly created dataset. But in my case it is a common use case to quickly create datasets where I know what type they should be - should not
Alternative implementation could be adding --proc-post to the rev-create command itself. Pros then would be:
I also added a
So now it would look like
    $> datalad rev-create -c text2git -f /tmp/newds2
    [INFO ] Creating a new annex repo at /tmp/newds2
    create(ok): /tmp/newds2 (dataset)
    [INFO ] Running procedure cfg_text2git
    [INFO ] == Command start (output follows) =====
    [INFO ] == Command exit (modification check follows) =====
With something like this I think I wouldn't miss any other option of current
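To make the proposed sugar concrete, here is a minimal sketch of how a repeatable `-c`/`--cfg-proc` option could map short names onto `cfg_`-prefixed procedure names. It uses a stand-alone `argparse` parser as a stand-in and is not DataLad's actual argument handling; `build_parser` and `procedures_to_run` are hypothetical names introduced only for illustration.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; DataLad builds its parsers differently.
    parser = argparse.ArgumentParser(prog="rev-create-sketch")
    parser.add_argument(
        "-c", "--cfg-proc", action="append", default=None, metavar="PROC",
        help="short name of a cfg_ procedure to run after creation (repeatable)")
    parser.add_argument("-f", "--force", action="store_true")
    parser.add_argument("path")
    return parser


def procedures_to_run(cfg_proc):
    # Assumption: the option takes the short name and the "cfg_" prefix is added here.
    return ["cfg_" + name for name in (cfg_proc or [])]


args = build_parser().parse_args(["-c", "text2git", "-f", "/tmp/newds2"])
planned = procedures_to_run(args.cfg_proc)
print(planned)
```

Under these assumptions, `-c text2git` resolves to the `cfg_text2git` procedure seen in the log output above, and giving `-c` several times would queue several procedures in order.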
    @@            Coverage Diff             @@
    ##           master    #3353      +/-   ##
    ==========================================
    - Coverage   91.15%   91.12%   -0.03%
    ==========================================
      Files         263      263
      Lines       34162    34170       +8
    ==========================================
    - Hits        31139    31138       -1
    - Misses       3023     3032       +9
as I have expressed myself hooks specification doesn't allow for quick/concise option to achieve trivial dataset customization upon `create`.
Overall, I do not think there is need to decide on hooks before starting to provide this convenience `-c` option. Or am I missing something?
My concern with this PR is necessity to agree on the
    $> datalad run-procedure --discover
    setup_bids_dataset (datalad_hirni/resources/procedures/setup_bids_dataset.py) [python_script]
    setup_hirni_dataset (datalad_hirni/resources/procedures/setup_hirni_dataset.py) [python_script]
    cfg_metadatatypes (../datalad-master/datalad/resources/procedures/cfg_metadatatypes.py) [python_script]
    cfg_text2git (../datalad-master/datalad/resources/procedures/cfg_text2git.py) [python_script]
    setup_yoda_dataset (../datalad-master/datalad/resources/procedures/setup_yoda_dataset.py) [python_script]
Do you think we could rename
[quoting out of order] @yarikoptic:
> Overall, I do not think there is need to decide on hooks before starting to provide this convenience `-c` option. Or am I missing something?

I think you're right that this shouldn't be held up by command hooks. If we end up adding hooks (and specifically a `create-done` one), this should be easy enough to map onto that without any API change.

> as [I have expressed myself](#3264 (comment)) hooks specification doesn't allow for quick/concise option to achieve trivial dataset customization upon `create`.

My question wasn't implying that the syntax there was an alternative to sugaring.

> Although might be more generic and useful (although besides this `create` use case I do not have immediate others yet), it would probably largely be complementary.

AFAICT it calls `run_procedure` at the spot where the `create-done` hook would, but without being hooked up to the configuration and while being restricted to "cfg_*" procedures. So...

> Maybe even this sugaring functionality could be refactored later to (ab)use hooks for some reason instead of invoking `run_procedure` directly.

I'd guess that, if we do add hooks, we should refactor this to work with the `run_procedure` loop for the `create-done` spot.
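A rough sketch of what that refactoring could look like, assuming hooks were stored as a mapping from a hook name such as `create-done` to a list of procedure names. No such hook mechanism exists in this PR; `run_create_done`, `configured_hooks`, and `fake_run_procedure` are hypothetical and only illustrate that the `-c` sugar could feed the same loop that configured hooks would use.

```python
def run_create_done(configured_hooks, cfg_proc, run_procedure):
    """Run configured `create-done` procedures, then those requested via -c."""
    # Hypothetical hook storage: {"create-done": ["cfg_...", ...]}
    specs = list(configured_hooks.get("create-done", []))
    # The -c sugar only contributes "cfg_"-prefixed procedures.
    specs.extend("cfg_" + name for name in cfg_proc)
    return [run_procedure(spec) for spec in specs]


calls = []


def fake_run_procedure(spec):
    # Stand-in that records the call instead of running a real procedure.
    calls.append(spec)
    return {"action": "run_procedure", "spec": spec, "status": "ok"}


hooks = {"create-done": ["cfg_metadatatypes"]}
results = run_create_done(hooks, ["text2git"], fake_run_procedure)
print(calls)
```

With this shape the command-line option stays as it is, and only the place where the procedure list is assembled would change if hooks were added later.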
Thank you @mih for all the tune-ups! I am OK to merge it as is, or we could maybe discuss the possible need to allow parametrization, which I initially brought up in datalad/datalad-neuroimaging#60 (comment). Maybe we had better decide on that right away so that the API does not need to change later on:
... if we figure out how to "parametrize" additional actions/parametrizations scalably/flexibly. E.g. here
What do you think?