[REF] Split main workflow from CLI for Python access #442

tsalo · 2020-04-18T17:05:43Z

Closes #440.

This refactor might be more aggressive than the devs might like, so I'm happy to scale it back if necessary.

Changes proposed:

Make a new workflows module.
Merge parts of heudiconv.cli.run.main() into heudiconv.cli.run.process_args() and move to heudiconv.workflows.run.heudiconv_workflow().
- heudiconv.cli.run.main() now does as little as possible (i.e., parses the arguments and does a single check that requires the parser's help message.
Move the following functions from heudiconv.cli.run to heudiconv.workflows.run:
- process_extra_commands
- setup_exceptionhook
- is_interactive
- ensure_heuristic_arg
Split args structure into individual arguments when passed to heudiconv.workflows.run.heudiconv_workflow()
- This makes it easier for users to call the workflow directly in Python, as they won't need to build args.
- It should also make it easier to track updates to the parameters in the parser vs. those used in the workflow.
- The drawback is that it means updating both the CLI and the workflow function in tandem when changing parameters.
Reformat the parser a little bit. I can change this back, of course, but I think it's more readable this way.

codecov · 2020-04-18T21:21:42Z

Codecov Report

Merging #442 into master will increase coverage by 1.13%.
The diff coverage is 80.89%.

@@            Coverage Diff             @@
##           master     #442      +/-   ##
==========================================
+ Coverage   74.99%   76.12%   +1.13%     
==========================================
  Files          35       37       +2     
  Lines        2863     2961      +98     
==========================================
+ Hits         2147     2254     +107     
+ Misses        716      707       -9

Impacted Files	Coverage Δ
heudiconv/main.py	`76.19% <76.19%> (ø)`
heudiconv/cli/run.py	`91.48% <100.00%> (+11.84%)`	⬆️
heudiconv/tests/test_main.py	`100.00% <100.00%> (ø)`
heudiconv/dicoms.py	`81.14% <0.00%> (-0.41%)`	⬇️
heudiconv/bids.py	`82.97% <0.00%> (ø)`
heudiconv/info.py	`100.00% <0.00%> (ø)`
heudiconv/tests/test_utils.py	`100.00% <0.00%> (ø)`
heudiconv/tests/test_convert.py	`100.00% <0.00%> (ø)`
heudiconv/utils.py	`90.45% <0.00%> (+0.37%)`	⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a1a7cb5...c49134e. Read the comment docs.

yarikoptic · 2020-04-19T04:52:06Z

Thank you @tsalo! It is a worthwhile refactoring. I will review within next days. Cheers

yarikoptic

It is a "heavy" PR, hard grasp what was moved vs modified even with git diff -w --color-moved=zebra -W master... . But tests passing is a good sign! ;)

My other concern (besides other comments in a few spots) is: I do not like heudiconv.workflows and then even more heudiconv_workflow names. My thinking:

With "workflow" I got to immediate get a picture of something like nipype or other workflow engine. I do understand that it is my limited view on the world, and here it could indeed be considered a "workflow" but then every function calling other functions in a sequence could.
I do not see other "workflows" to be born under workflows/ so why bother with it.

So, why not to just have heudiconv/main.py with workflow (or run to parallel heudiconv.cli.run; or even heudiconv to parallel all those odd from glob import glob ;) )... ?

yarikoptic · 2020-04-20T16:57:52Z

heudiconv/workflows/run.py

@@ -0,0 +1,343 @@
+#!/usr/bin/env python


this file is not intended to be executed directly, so better not bother with shebang. That might interfere with some "smart" setups of IDEs/editors making those files executable for no good reason etc.

yarikoptic · 2020-04-20T16:59:03Z

heudiconv/workflows/run.py

+from ..utils import load_heuristic, anonymize_sid, treat_infofile, SeqInfo
+from ..convert import prep_conversion
+from ..bids import populate_bids_templates, tuneup_bids_json_files
+from ..queue import queue_conversion


while doing such large RFings anyways -- unless order is required due to some import order effects, best to sort those lines (within a group, e.g. here within from .. and separately above among system libraries) -- that would simplify later finding the correct imports etc

Should be good now.

yarikoptic · 2020-04-20T17:25:16Z

heudiconv/workflows/run.py

+    queue_args : str or None, optional
+        Additional queue arguments passed as single string of space-separated
+        Argument=Value pairs. Default is None.
+    """


I don't know an "easy" way around it ATM, may be you would find some, but such duplication of docstrings in here and in cmdline argparse setup eventually would diverge. I don't know an "easy" way since most of the time then people indeed either duplicate like you have done, or come up with some evolved way to generate both cmdline and python interfaces from the same specification (like we did in datalad, and I saw similar ones elsewhere).

In principle it should be possible for many (if not all -- didn't do matching which one is pertinent only here and not cmdline) to take argparse specification and use it to populate a docstring here, and while at it may be even verify consistency of the function. It feels like a common problem, so may be someone already came up with a similar solution? I was looking for the one for click based interfaces but found none... could you check @tsalo around to see if smth like that was already done, or may be just identify a showstopper which would forbid it?

I tried looking around for tools to synchronize argument parsers with associated functions' docstrings, but couldn't find anything. If it's any consolation, we have the same separation in tedana and, while there's no automated control, keeping the function docstring up-to-date with the CLI is something we've been able to do. I could also add a line in the function description telling users to refer to the command-line help for the definitive argument descriptions.

wouldn't the default click behavior work. see the top example here: https://click.palletsprojects.com/en/7.x/

for any developer it should be clear from looking at the decorators.

I think that that would take whatever's in the function docstring and make it available in the command line help, but when you try help(function) in Python, you get a very long and unrelated string for click.Command instead of the function. I don't think click would help with within-Python documentation.

tsalo · 2020-04-24T20:00:54Z

I've moved heudiconv.workflows.heudiconv_workflow() to heudiconv.main.workflow() and eliminated the workflows module. The reason I originally had it was that I assumed that heudiconv.cli.monitor.monitor() would also be converted to a workflow at some point, but I hadn't actually looked at it at the time.

mgxd · 2020-05-11T17:08:39Z

this looks good to me, thanks!

i'll wait for @yarikoptic's blessing before merging

yarikoptic · 2020-05-13T01:32:38Z

heudiconv/main.py

+                         % ', '.join(get_known_heuristic_names()))
+
+
+def workflow(dicom_dir_template=None, files=None, subjs=None,


no sane person should invoke workflow providing any of those as positional args, I will add a leading *, ;)

I've seen that once, but never used it. It's awesome! I also added a numpydoc Notes section to the docstring to clarify usage for those who, like me, might be confused by the way it's defined. I can remove it if you want though.

Notes are good, thank you!

yarikoptic

Please review the use of former args. and their possible overlap with locally defined variables. (and add that leading *, to the signature of the workflow so nobody burns themselves ;))

yarikoptic · 2020-05-13T01:38:31Z

heudiconv/main.py

+        if session is not None:
+            session = session
+        if locator is not None:
+            locator = locator


Smart pycharm greyed out locator in the signature of this function which raised my concern. Here code like

if args.session is not None: session = args.session if args.locator is not None: locator = args.locator

was, so args. was providing defaults, whenever now loop variables mask away function arguments. This needs to be fixed up. It is pity that apparently we have no tests which would have triggered either of those cases!

That's a head-scratcher. I combed through the workflow with an eye out for this kind of pattern, but didn't find any other cases.

tsalo · 2020-05-13T02:36:45Z

heudiconv/cli/run.py

-        anon_outdir = args.conv_outdir or outdir
-        anon_study_outdir = op.join(anon_outdir, locator or '')
-
-        # TODO: --datalad  cmdline option, which would take care about initiating


There is a datalad cmdline option now. Can this be removed?

I guess -- good find

Also kind of hoping that the test failures were random and that they’ll pass now.

yarikoptic · 2020-05-14T14:16:25Z

4 failures in travis now, e.g.

___________________________ test_notop[bidsoptions1] ___________________________
heudiconv/tests/test_heuristics.py:167: in test_notop
    assert op.exists(pjoin(tmppath, 'Halchenko/Yarik/950_bids_test4'))
E   AssertionError: assert False
E    +  where False = <function exists at 0x7fc5bc715bf8>('/tmp/pytest-of-travis/pytest-0/test_notop_bidsoptions1_0/Halchenko/Yarik/950_bids_test4')
E    +    where <function exists at 0x7fc5bc715bf8> = op.exists
E    +    and   '/tmp/pytest-of-travis/pytest-0/test_notop_bidsoptions1_0/Halchenko/Yarik/950_bids_test4' = pjoin('/tmp/pytest-of-travis/pytest-0/test_notop_bidsoptions1_0', 'Halchenko/Yarik/950_bids_test4')

otherwise looks good. After failures addressed, I will try to comb through it locally a bit more and then merge

My local testing in Docker isn’t working for some reason.

tsalo · 2020-05-15T18:32:37Z

I finally fixed the testing, so I think it's ready for your review.

tsalo · 2020-06-16T17:09:42Z

I think that this PR could be helpful for the BIDS Curation GUI BrainHack project (ohbm/hackathon2020#188), so I thought I'd ping it. Is there anything I can do to move this forward or will it have to hold off until after the hackathon?

yarikoptic · 2020-06-16T18:17:53Z

FWIW: reviewing it now... gimme some minutes

I guess it was a mental side effect from using lgr.error initially or smth like that

yarikoptic · 2020-06-16T18:38:39Z

heudiconv/main.py

+        from .utils import get_heuristic_description
+        print(get_heuristic_description(heuristic, full=True))
+    else:
+        raise ValueError("Unknown command %s", command)


I will fix this (present in original version too)

yarikoptic · 2020-06-16T18:43:47Z

heudiconv/main.py

+            lgr.warning(
+              "We cannot setup exception hook since not in interactive mode")
+
+    sys.excepthook = _pdb_excepthook


FWIW: we have this function not covered by tests -- tested manually seems to still work )

yarikoptic · 2020-06-16T18:46:52Z

heudiconv/main.py

+        iterarg, iterables = ("files", len(files)) if files else \
+                             ("subjects", len(subjs))
+        queue_conversion(queue, iterarg, iterables, queue_args)
+        sys.exit(0)


I think we must not sys.exit(0) here now -- I will RF so we exit in the CLI invocation

…rsion

yarikoptic · 2020-06-16T19:08:38Z

ok, pushed those few tune ups. as far as I see it -- good to go. Let's wait for green and merge!

tsalo · 2020-06-16T19:18:33Z

Thanks @yarikoptic!

Various improvements and compatibility/support (dcm2niix, datalad, duecredit) changes. Major change is placement of output files to the target output directory during conversion. - #454 zenodo referencing in README.rst and support for ducredit for heudiconv and reproin heuristic - #445 more tutorial references in README.md - [#485][] placed files during conversion right away into the target directory (with a `_heudiconv???` suffix, renamed into ultimate target name later on), which avoids hitting file size limits of /tmp ([#481][]) and helped to avoid a regression in dcm2nixx 1.0.20201102 - #477 replaced `rec-<magnitude|phase>` with `part-<mag|phase>` now that BIDS supports the part entity - #473 made default for CogAtlasID to be a TODO URL - #459 made AcquisitionTime used for acq_time scans file field - #451 retained sub-second resolution in scans files - #442 refactored code so there is now heudiconv.main.workflow for more convenient use as a Python module - minimal version of nipype set to 1.2.3 to guarantee correct handling of DWI files ([#480][]) - `heudiconvDCM*` temporary directories are removed now ([#462][]) - compatibility with DataLad 0.13 ([#464][]) - #443 pathlib as a dependency (we are Python3 only now) * tag 'v0.9.0': Add a helper rule to upload to pypi update changelog reference as part of prep release [DATALAD RUNCMD] prepare the release CHANGELOG entry for 0.9.0

Various improvements and compatibility/support (dcm2niix, datalad, duecredit) changes. Major change is placement of output files to the target output directory during conversion. - #454 zenodo referencing in README.rst and support for ducredit for heudiconv and reproin heuristic - #445 more tutorial references in README.md - [#485][] placed files during conversion right away into the target directory (with a `_heudiconv???` suffix, renamed into ultimate target name later on), which avoids hitting file size limits of /tmp ([#481][]) and helped to avoid a regression in dcm2nixx 1.0.20201102 - #477 replaced `rec-<magnitude|phase>` with `part-<mag|phase>` now that BIDS supports the part entity - #473 made default for CogAtlasID to be a TODO URL - #459 made AcquisitionTime used for acq_time scans file field - #451 retained sub-second resolution in scans files - #442 refactored code so there is now heudiconv.main.workflow for more convenient use as a Python module - minimal version of nipype set to 1.2.3 to guarantee correct handling of DWI files ([#480][]) - `heudiconvDCM*` temporary directories are removed now ([#462][]) - compatibility with DataLad 0.13 ([#464][]) - #443 pathlib as a dependency (we are Python3 only now) * tag 'v0.9.0': Do no bother ensuring that version changed - should be no changes

tsalo added 10 commits April 16, 2020 16:53

Initial work.

7388d3d

Merge remote-tracking branch 'nipy/master' into ref/python-access

a74716b

Reorder arguments and rename function.

5ab426b

Document workflow function.

b7d19a1

Fix spacing.

61f7d48

Move workflow into new workflows module.

38f323e

Add init file for workflows module.

f773516

Fix test.

31b586a

Fix test.

8376051

Drops "args" and just use parameters.

c60b50f

yarikoptic requested changes Apr 20, 2020

View reviewed changes

tsalo added 3 commits April 23, 2020 12:33

Remove shebang.

86d9864

Sort imports.

330404a

Eliminate workflows module.

893d008

tsalo added 2 commits April 24, 2020 16:03

Fix imports after refactor.

b7340be

Fix many more imports.

3a25458

tsalo mentioned this pull request May 6, 2020

[ENH] Adds extraction of physio signals from DICOMs #446

Draft

yarikoptic reviewed May 13, 2020

View reviewed changes

yarikoptic requested changes May 13, 2020

View reviewed changes

tsalo commented May 13, 2020

View reviewed changes

tsalo added 3 commits May 12, 2020 22:42

Disable positional arguments in workflow.

7070c49

Fix circular variable assignment.

4950336

Add docstrings.

f9cf569

Also kind of hoping that the test failures were random and that they’ll pass now.

tsalo added 2 commits May 15, 2020 13:58

Drop kwarg-only to test CI.

59430c0

My local testing in Docker isn’t working for some reason.

Revert 59430c0.

54da90b

tsalo added 3 commits May 15, 2020 14:05

Drop circularity fix to test CI.

ea2aea7

Try a different approach for the circularity issue.

f8d8ec1

Remove datalad comment.

c530d29

BF: do interpolate the string msg in the exception

e79df64

I guess it was a mental side effect from using lgr.error initially or smth like that

yarikoptic reviewed Jun 16, 2020

View reviewed changes

RF: do not sys.exit(0) - just return from main upon queueing up conve…

c49134e

…rsion

yarikoptic merged commit f4c2dd2 into nipy:master Jun 16, 2020

yarikoptic added this to the 0.9.0 milestone Dec 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REF] Split main workflow from CLI for Python access #442

[REF] Split main workflow from CLI for Python access #442

tsalo commented Apr 18, 2020

codecov bot commented Apr 18, 2020 •

edited

Loading

yarikoptic commented Apr 19, 2020

yarikoptic left a comment

yarikoptic Apr 20, 2020

tsalo Apr 24, 2020

yarikoptic Apr 20, 2020

tsalo Apr 24, 2020

yarikoptic Apr 20, 2020

tsalo Apr 24, 2020

satra Apr 24, 2020

satra Apr 24, 2020

tsalo Apr 24, 2020

tsalo commented Apr 24, 2020

mgxd commented May 11, 2020

yarikoptic May 13, 2020

tsalo May 13, 2020

yarikoptic May 14, 2020

yarikoptic left a comment

yarikoptic May 13, 2020

tsalo May 13, 2020

tsalo May 13, 2020

yarikoptic May 14, 2020

yarikoptic commented May 14, 2020

tsalo commented May 15, 2020

tsalo commented Jun 16, 2020

yarikoptic commented Jun 16, 2020

yarikoptic Jun 16, 2020

yarikoptic Jun 16, 2020

yarikoptic Jun 16, 2020

yarikoptic commented Jun 16, 2020

tsalo commented Jun 16, 2020

		% ', '.join(get_known_heuristic_names()))


		def workflow(dicom_dir_template=None, files=None, subjs=None,

[REF] Split main workflow from CLI for Python access #442

[REF] Split main workflow from CLI for Python access #442

Conversation

tsalo commented Apr 18, 2020

codecov bot commented Apr 18, 2020 • edited Loading

Codecov Report

yarikoptic commented Apr 19, 2020

yarikoptic left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tsalo commented Apr 24, 2020

mgxd commented May 11, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yarikoptic left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yarikoptic commented May 14, 2020

tsalo commented May 15, 2020

tsalo commented Jun 16, 2020

yarikoptic commented Jun 16, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yarikoptic commented Jun 16, 2020

tsalo commented Jun 16, 2020

codecov bot commented Apr 18, 2020 •

edited

Loading