-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REF] Split main workflow from CLI for Python access #442
Conversation
Codecov Report
@@ Coverage Diff @@
## master #442 +/- ##
==========================================
+ Coverage 74.99% 76.12% +1.13%
==========================================
Files 35 37 +2
Lines 2863 2961 +98
==========================================
+ Hits 2147 2254 +107
+ Misses 716 707 -9
Continue to review full report at Codecov.
|
Thank you @tsalo! It is a worthwhile refactoring. I will review within next days. Cheers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a "heavy" PR, hard grasp what was moved vs modified even with git diff -w --color-moved=zebra -W master...
. But tests passing is a good sign! ;)
My other concern (besides other comments in a few spots) is: I do not like heudiconv.workflows
and then even more heudiconv_workflow
names. My thinking:
- With "workflow" I got to immediate get a picture of something like nipype or other workflow engine. I do understand that it is my limited view on the world, and here it could indeed be considered a "workflow" but then every function calling other functions in a sequence could.
- I do not see other "workflows" to be born under
workflows/
so why bother with it.
So, why not to just have heudiconv/main.py
with workflow
(or run
to parallel heudiconv.cli.run
; or even heudiconv
to parallel all those odd from glob import glob
;) )... ?
heudiconv/workflows/run.py
Outdated
@@ -0,0 +1,343 @@ | |||
#!/usr/bin/env python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file is not intended to be executed directly, so better not bother with shebang. That might interfere with some "smart" setups of IDEs/editors making those files executable for no good reason etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
heudiconv/workflows/run.py
Outdated
from ..utils import load_heuristic, anonymize_sid, treat_infofile, SeqInfo | ||
from ..convert import prep_conversion | ||
from ..bids import populate_bids_templates, tuneup_bids_json_files | ||
from ..queue import queue_conversion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while doing such large RFings anyways -- unless order is required due to some import order effects, best to sort those lines (within a group, e.g. here within from ..
and separately above among system libraries) -- that would simplify later finding the correct imports etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good now.
heudiconv/workflows/run.py
Outdated
queue_args : str or None, optional | ||
Additional queue arguments passed as single string of space-separated | ||
Argument=Value pairs. Default is None. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know an "easy" way around it ATM, may be you would find some, but such duplication of docstrings in here and in cmdline argparse setup eventually would diverge. I don't know an "easy" way since most of the time then people indeed either duplicate like you have done, or come up with some evolved way to generate both cmdline and python interfaces from the same specification (like we did in datalad, and I saw similar ones elsewhere).
In principle it should be possible for many (if not all -- didn't do matching which one is pertinent only here and not cmdline) to take argparse specification and use it to populate a docstring here, and while at it may be even verify consistency of the function. It feels like a common problem, so may be someone already came up with a similar solution? I was looking for the one for click
based interfaces but found none... could you check @tsalo around to see if smth like that was already done, or may be just identify a showstopper which would forbid it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried looking around for tools to synchronize argument parsers with associated functions' docstrings, but couldn't find anything. If it's any consolation, we have the same separation in tedana
and, while there's no automated control, keeping the function docstring up-to-date with the CLI is something we've been able to do. I could also add a line in the function description telling users to refer to the command-line help for the definitive argument descriptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't the default click behavior work. see the top example here: https://click.palletsprojects.com/en/7.x/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for any developer it should be clear from looking at the decorators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that that would take whatever's in the function docstring and make it available in the command line help, but when you try help(function)
in Python, you get a very long and unrelated string for click.Command
instead of the function. I don't think click would help with within-Python documentation.
I've moved |
this looks good to me, thanks! i'll wait for @yarikoptic's blessing before merging |
heudiconv/main.py
Outdated
% ', '.join(get_known_heuristic_names())) | ||
|
||
|
||
def workflow(dicom_dir_template=None, files=None, subjs=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no sane person should invoke workflow
providing any of those as positional args, I will add a leading *,
;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen that once, but never used it. It's awesome! I also added a numpydoc Notes section to the docstring to clarify usage for those who, like me, might be confused by the way it's defined. I can remove it if you want though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notes are good, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please review the use of former args.
and their possible overlap with locally defined variables. (and add that leading *,
to the signature of the workflow
so nobody burns themselves ;))
heudiconv/main.py
Outdated
if session is not None: | ||
session = session | ||
if locator is not None: | ||
locator = locator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smart pycharm greyed out locator
in the signature of this function which raised my concern. Here code like
if args.session is not None:
session = args.session
if args.locator is not None:
locator = args.locator
was, so args.
was providing defaults, whenever now loop variables mask away function arguments. This needs to be fixed up. It is pity that apparently we have no tests which would have triggered either of those cases!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a head-scratcher. I combed through the workflow with an eye out for this kind of pattern, but didn't find any other cases.
anon_outdir = args.conv_outdir or outdir | ||
anon_study_outdir = op.join(anon_outdir, locator or '') | ||
|
||
# TODO: --datalad cmdline option, which would take care about initiating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a datalad cmdline option now. Can this be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess -- good find
Also kind of hoping that the test failures were random and that they’ll pass now.
4 failures in travis now, e.g.
otherwise looks good. After failures addressed, I will try to comb through it locally a bit more and then merge |
I finally fixed the testing, so I think it's ready for your review. |
I think that this PR could be helpful for the BIDS Curation GUI BrainHack project (ohbm/hackathon2020#188), so I thought I'd ping it. Is there anything I can do to move this forward or will it have to hold off until after the hackathon? |
FWIW: reviewing it now... gimme some minutes |
I guess it was a mental side effect from using lgr.error initially or smth like that
heudiconv/main.py
Outdated
from .utils import get_heuristic_description | ||
print(get_heuristic_description(heuristic, full=True)) | ||
else: | ||
raise ValueError("Unknown command %s", command) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix this (present in original version too)
lgr.warning( | ||
"We cannot setup exception hook since not in interactive mode") | ||
|
||
sys.excepthook = _pdb_excepthook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: we have this function not covered by tests -- tested manually seems to still work )
heudiconv/main.py
Outdated
iterarg, iterables = ("files", len(files)) if files else \ | ||
("subjects", len(subjs)) | ||
queue_conversion(queue, iterarg, iterables, queue_args) | ||
sys.exit(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we must not sys.exit(0)
here now -- I will RF so we exit in the CLI invocation
ok, pushed those few tune ups. as far as I see it -- good to go. Let's wait for green and merge! |
Thanks @yarikoptic! |
Various improvements and compatibility/support (dcm2niix, datalad, duecredit) changes. Major change is placement of output files to the target output directory during conversion. - #454 zenodo referencing in README.rst and support for ducredit for heudiconv and reproin heuristic - #445 more tutorial references in README.md - [#485][] placed files during conversion right away into the target directory (with a `_heudiconv???` suffix, renamed into ultimate target name later on), which avoids hitting file size limits of /tmp ([#481][]) and helped to avoid a regression in dcm2nixx 1.0.20201102 - #477 replaced `rec-<magnitude|phase>` with `part-<mag|phase>` now that BIDS supports the part entity - #473 made default for CogAtlasID to be a TODO URL - #459 made AcquisitionTime used for acq_time scans file field - #451 retained sub-second resolution in scans files - #442 refactored code so there is now heudiconv.main.workflow for more convenient use as a Python module - minimal version of nipype set to 1.2.3 to guarantee correct handling of DWI files ([#480][]) - `heudiconvDCM*` temporary directories are removed now ([#462][]) - compatibility with DataLad 0.13 ([#464][]) - #443 pathlib as a dependency (we are Python3 only now) * tag 'v0.9.0': Add a helper rule to upload to pypi update changelog reference as part of prep release [DATALAD RUNCMD] prepare the release CHANGELOG entry for 0.9.0
Various improvements and compatibility/support (dcm2niix, datalad, duecredit) changes. Major change is placement of output files to the target output directory during conversion. - #454 zenodo referencing in README.rst and support for ducredit for heudiconv and reproin heuristic - #445 more tutorial references in README.md - [#485][] placed files during conversion right away into the target directory (with a `_heudiconv???` suffix, renamed into ultimate target name later on), which avoids hitting file size limits of /tmp ([#481][]) and helped to avoid a regression in dcm2nixx 1.0.20201102 - #477 replaced `rec-<magnitude|phase>` with `part-<mag|phase>` now that BIDS supports the part entity - #473 made default for CogAtlasID to be a TODO URL - #459 made AcquisitionTime used for acq_time scans file field - #451 retained sub-second resolution in scans files - #442 refactored code so there is now heudiconv.main.workflow for more convenient use as a Python module - minimal version of nipype set to 1.2.3 to guarantee correct handling of DWI files ([#480][]) - `heudiconvDCM*` temporary directories are removed now ([#462][]) - compatibility with DataLad 0.13 ([#464][]) - #443 pathlib as a dependency (we are Python3 only now) * tag 'v0.9.0': Do no bother ensuring that version changed - should be no changes
Closes #440.
This refactor might be more aggressive than the devs might like, so I'm happy to scale it back if necessary.
Changes proposed:
heudiconv.cli.run.main()
intoheudiconv.cli.run.process_args()
and move toheudiconv.workflows.run.heudiconv_workflow()
.heudiconv.cli.run.main()
now does as little as possible (i.e., parses the arguments and does a single check that requires the parser's help message.heudiconv.cli.run
toheudiconv.workflows.run
:args
structure into individual arguments when passed toheudiconv.workflows.run.heudiconv_workflow()
args
.