Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --skiptop flag to prevent writing top level bids files #344

Merged
merged 7 commits into from
May 21, 2019

Conversation

stilley2
Copy link
Contributor

This is useful in batch processing to prevent potential race
conditions. The batch processing example has also been
modified to use this flag.

See #340

This is useful in batch processing to prevent potential race
conditions. The batch processing example has also been
modified to use this flag.
@stilley2 stilley2 mentioned this pull request May 14, 2019
2 tasks
@codecov-io
Copy link

codecov-io commented May 14, 2019

Codecov Report

Merging #344 into master will increase coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #344      +/-   ##
==========================================
+ Coverage   74.24%   74.28%   +0.04%     
==========================================
  Files          35       35              
  Lines        2671     2687      +16     
==========================================
+ Hits         1983     1996      +13     
- Misses        688      691       +3
Impacted Files Coverage Δ
heudiconv/dicoms.py 82.07% <100%> (ø) ⬆️
heudiconv/cli/run.py 77.56% <100%> (ø) ⬆️
heudiconv/convert.py 80.3% <100%> (ø) ⬆️
heudiconv/tests/test_heuristics.py 100% <100%> (ø) ⬆️
heudiconv/utils.py 90.95% <0%> (-1.18%) ⬇️
heudiconv/heuristics/test_reproin.py 100% <0%> (ø) ⬆️
heudiconv/heuristics/reproin.py 82% <0%> (+0.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8cdf06d...92856e3. Read the comment docs.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We better avoid extending the cmdline list of options. I suggested a possible solution for that

parser.add_argument('--skiptop', action='store_true',
help='Skip creating of top-level bids files. '
'Useful when running in batch mode to prevent '
'possible race conditions.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our --help is already LONG enough and I expect that only more options would come to alter BIDS handling behavior. What if we "convert" --bids option into something more useful?
Especially since I think we are no longer using positional args, I think it should be easy to do:

Here is a prototype showing that we could use `args='*'` mode:
~/proj/repronim/reproin > cat /tmp/argparse_.py          
__version__ = '1'
from argparse import ArgumentParser
docstr = ("""Example:
         heudiconv -d rawdata/{subject} -o . -f heuristic.py -s s1 s2 s3""")
parser = ArgumentParser(description=docstr)
parser.add_argument('--version', action='version', version=__version__)

parser.add_argument('-b', '--bids', nargs='*',
                    choices=['notop', 'nostriptimes'],
                    help="""\
                    flag for output into BIDS structure.  Additional options which
                    could be specified:  notop - no top level metadata-files, 
                    nostriptimes - do not strip times from side-car files""")

out = parser.parse_args()
print(out)


~/proj/repronim/reproin > python /tmp/argparse_.py --help
usage: argparse_.py [-h] [--version]
                    [-b [{notop,nostriptimes} [{notop,nostriptimes} ...]]]

Example: heudiconv -d rawdata/{subject} -o . -f heuristic.py -s s1 s2 s3

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -b [{notop,nostriptimes} [{notop,nostriptimes} ...]], --bids [{notop,nostriptimes} [{notop,nostriptimes} ...]]
                        flag for output into BIDS structure. Additional
                        options which could be specified: notop - no top level
                        metadata-files, nostriptimes - do not strip times from
                        side-car files

~/proj/repronim/reproin > python /tmp/argparse_.py       
Namespace(bids=None)                               

~/proj/repronim/reproin > python /tmp/argparse_.py -b
Namespace(bids=[])

~/proj/repronim/reproin > python /tmp/argparse_.py -b notop
Namespace(bids=['notop'])

~/proj/repronim/reproin > python /tmp/argparse_.py -b notop -b nostriptimes
Namespace(bids=['nostriptimes'])

~/proj/repronim/reproin > python /tmp/argparse_.py -b notop nostriptimes 
Namespace(bids=['notop', 'nostriptimes'])

~/proj/repronim/reproin > python /tmp/argparse_.py -b notop nostriptimes nonsense
usage: argparse_.py [-h] [--version]
                    [-b [{notop,nostriptimes} [{notop,nostriptimes} ...]]]
argparse_.py: error: argument -b/--bids: invalid choice: 'nonsense' (choose from 'notop', 'nostriptimes')

so we can discriminate when it is not specified (None -- current False) for when specified but without options ([] -- current True). So, if instead of carrying around a bool bids we just carry bids a list of options with an empty list for False, ['no-options'] if no options were provided, and otherwise just a full list of options, we could later add other options and remain compatible while testing if bids: and check for options if bids and 'notop' not in bids: where needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I restructured it with your suggestion. Internally, I set args.bids to either True of False and add args.bids_options, but from the point of view of the user it works as you suggested.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THANK YOU!

  • what about --bids help instead of yet another option?

@@ -222,6 +253,8 @@ def get_parser():
help='Additional queue arguments passed as '
'single string of Argument=Value pairs space '
'separated.')
parser.add_argument('--help-bids', action='store_true', dest='help_bids',
help='Display bids specific help.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not to just add "help" option to --bids and react on it? part of the goal in my suggestion was not to expand the cmdline API options list more ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes that's a good ideal. Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored as per @mgxd's suggestion and just included all the help in the argparse help field

@@ -80,7 +80,7 @@ def conversion_info(subject, outdir, info, filegroup, ses):

def prep_conversion(sid, dicoms, outdir, heuristic, converter, anon_sid,
anon_outdir, with_prov, ses, bids, seqinfo, min_meta,
overwrite, dcmconfig):
overwrite, dcmconfig, bids_options):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is what I was hoping to avoid -- needing to eventually add/adjust signature of every function which now gets bids but might eventually need to get access to bids_options...

but may be making it more obvious at the cost of some duplication would be better than my more obscure suggestion... Let's see what others thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I see what you're saying. I wanted to keep bids as a boolean because I think it's more obvious to people and prevents us having to check every usage of args.bids. (e.g., if bids is an empty list, we would want to calculate bids stuff, but if args.bids: would no longer work.) Therefore a change is required in either case, and I feel the bids + bids_options method is more explicit and clear. I'm curious what others thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd like to avoid additional parameters to the signatures whenever possible. Though if we're changing the variable type + intention, it may be worthwhile to rename it to something more readable - I like bids_options or bids_opts. Most of the current checks can be ported rather easily

if args.bids_opts is not None:  # if args.bids
   if ['notopdirs'] in args.bids_opts:  # special case
       ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored to use bids_options internally

Copy link
Member

@mgxd mgxd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on @stilley2, we get excited whenever a new contributor submits a PR! I like @yarikoptic's suggestion of extending the --bids argument to handle these BIDS-specific cases

If you run into any problems/issues, don't hesitate to ask either of us!

@@ -80,7 +80,7 @@ def conversion_info(subject, outdir, info, filegroup, ses):

def prep_conversion(sid, dicoms, outdir, heuristic, converter, anon_sid,
anon_outdir, with_prov, ses, bids, seqinfo, min_meta,
overwrite, dcmconfig):
overwrite, dcmconfig, bids_options):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd like to avoid additional parameters to the signatures whenever possible. Though if we're changing the variable type + intention, it may be worthwhile to rename it to something more readable - I like bids_options or bids_opts. Most of the current checks can be ported rather easily

if args.bids_opts is not None:  # if args.bids
   if ['notopdirs'] in args.bids_opts:  # special case
       ...

@@ -98,9 +103,20 @@ def process_extra_commands(outdir, args):
return


def help_bids():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be added to the argument's desc field instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added bids options information to the help field (as far as I can tell desc is only parser level).

@@ -112,6 +128,19 @@ def main(argv=None):
random.seed(args.random_seed)
import numpy
numpy.random.seed(args.random_seed)
# Ensure only supported bids options are passed
allowed_options = [option for option, _ in BIDS_OPTIONS]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argparse's choices will check if each input is valid, so this isn't needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Internal functions now check whether bids_options
is None instead of checking the boolean bids.
Also place bids options related help in the help
field of the argparse "bids" argument.
Copy link
Member

@mgxd mgxd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! This is almost ready, we just need to add a test confirming the intended notop behavior

@yarikoptic
Copy link
Member

a possible idea to followup with #349

Copy link
Member

@mgxd mgxd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me - @yarikoptic any other comments?

@yarikoptic
Copy link
Member

I think not

@yarikoptic yarikoptic merged commit 75a4c82 into nipy:master May 21, 2019
@yarikoptic yarikoptic added this to the 0.6 milestone Nov 22, 2019
yarikoptic added a commit that referenced this pull request Dec 16, 2019
This is largely a bug fix.  Metadata and order of `_key-value` fields in BIDS
could change from the result of converting using previous versions, thus minor
version boost.
14 people contributed to this release -- thanks
[everyone](https://github.com/nipy/heudiconv/graphs/contributors)!

 Enhancement

- Use [etelemetry](https://pypi.org/project/etelemetry) to inform about most
  recent available version of heudiconv. Please set `NO_ET` environment variable
  if you want to disable it ([#369][])
- BIDS:
  - `--bids` flag became an option. It can (optionally) accept `notop` value
    to avoid creation of top level files (`CHANGES`, `dataset_description.json`,
    etc) as a workaround during parallel execution to avoid race conditions etc.
    ([#344][])
  - Generate basic `.json` files with descriptions of the fields for
    `participants.tsv` and `_scans.tsv` files ([#376][])
  - Use `filelock` while writing top level files. Use
    `HEUDICONV_FILELOCK_TIMEOUT` environment to change the default timeout value
    ([#348][])
  - `_PDT2` was added as a suffix for multi-echo (really "multi-modal")
    sequences ([#345][])
- Calls to `dcm2niix` would include full output path to make it easier to
  discern in the logs what file it is working on ([#351][])
- With recent [datalad]() (>= 0.10), created DataLad dataset will use
  `--fake-dates` functionality of DataLad to not leak data conversion dates,
  which might be close to actual data acquisition/patient visit ([#352][])
- Support multi-echo EPI `_phase` data ([#373][] fixes [#368][])
- Log location of a bad .json file to ease troubleshooting ([#379][])
- Add basic pypi classifiers for the package ([#380][])

 Fixed
- Sorting `_scans.tsv` files lacking valid dates field should not cause a crash
  ([#337][])
- Multi-echo files detection based number of echos ([#339][])
- BIDS
  - Use `EchoTimes` from the associated multi-echo files if `EchoNumber` tag is
    missing ([#366][] fixes [#347][])
  - Tolerate empty ContentTime and/or ContentDate in DICOMs ([#372][]) and place
    "n/a" if value is missing ([#390][])
  - Do not crash and store original .json file is "JSON pretification" fails
    ([#342][])
- ReproIn heuristic
  - tollerate WIP prefix on Philips scanners ([#343][])
  - allow for use of `(...)` instead of `{...}` since `{}` are not allowed
    ([#343][])
  - Support pipolar fieldmaps by providing them with `_epi` not `_magnitude`.
    "Loose" BIDS `_key-value` pairs might come now after `_dir-` even if they
    came first before ([#358][] fixes [#357][])
- All heuristics saved under `.heudiconv/` under `heuristic.py` name, to avoid
  discrepancy during reconversion ([#354][] fixes [#353][])
- Do not crash (with TypeError) while trying to sort absent file list ([#360][])
- heudiconv requires nipype >= 1.0.0 ([#364][]) and blacklists `1.2.[12]` ([#375][])

* tag 'v0.6.0': (60 commits)
  Version boost to 0.6.0
  DOC: populate detailed changelog for 0.6.0 and tune up formatting in previous one
  Fix miscellaneous typos in ReproIn heuristic file.
  BF: fix check for the sbatch (SLURM) not being available
  ENH: make test-compare-two-versions take any two worktrees, and just show diff if results already known
  Update heudiconv/convert.py
  apply @mgxd 's suggestions, adding a warning and a timeout environment variable
  need str typecast
  Use empty string not None
  Empty acq_time results in empty cell not 'n/a'
  DOC: Clarify tarball session handling
  remove repetitive import statement
  respond to review - add explicit py2 check - change file saving strategy - use logger instead of print
  fix remaning py2 errors
  MNT: Add Python support metadata to package
  fix some python2/3 incompatibilities
  add return data (accidently removed return)
  make content unicode
  test that load_json provides filename if invalid
  explicitly name invalid json
  ...
yarikoptic added a commit that referenced this pull request Dec 16, 2019
[0.6.0] - 2019-12-16

This is largely a bug fix.  Metadata and order of `_key-value` fields in BIDS
could change from the result of converting using previous versions, thus minor
version boost.
14 people contributed to this release -- thanks
[everyone](https://github.com/nipy/heudiconv/graphs/contributors)!

Enhancement

- Use [etelemetry](https://pypi.org/project/etelemetry) to inform about most
  recent available version of heudiconv. Please set `NO_ET` environment variable
  if you want to disable it ([#369][])
- BIDS:
  - `--bids` flag became an option. It can (optionally) accept `notop` value
    to avoid creation of top level files (`CHANGES`, `dataset_description.json`,
    etc) as a workaround during parallel execution to avoid race conditions etc.
    ([#344][])
  - Generate basic `.json` files with descriptions of the fields for
    `participants.tsv` and `_scans.tsv` files ([#376][])
  - Use `filelock` while writing top level files. Use
    `HEUDICONV_FILELOCK_TIMEOUT` environment to change the default timeout value
    ([#348][])
  - `_PDT2` was added as a suffix for multi-echo (really "multi-modal")
    sequences ([#345][])
- Calls to `dcm2niix` would include full output path to make it easier to
  discern in the logs what file it is working on ([#351][])
- With recent [datalad]() (>= 0.10), created DataLad dataset will use
  `--fake-dates` functionality of DataLad to not leak data conversion dates,
  which might be close to actual data acquisition/patient visit ([#352][])
- Support multi-echo EPI `_phase` data ([#373][] fixes [#368][])
- Log location of a bad .json file to ease troubleshooting ([#379][])
- Add basic pypi classifiers for the package ([#380][])

Fixed

- Sorting `_scans.tsv` files lacking valid dates field should not cause a crash
  ([#337][])
- Multi-echo files detection based number of echos ([#339][])
- BIDS
  - Use `EchoTimes` from the associated multi-echo files if `EchoNumber` tag is
    missing ([#366][] fixes [#347][])
  - Tolerate empty ContentTime and/or ContentDate in DICOMs ([#372][]) and place
    "n/a" if value is missing ([#390][])
  - Do not crash and store original .json file is "JSON pretification" fails
    ([#342][])
- ReproIn heuristic
  - tolerate WIP prefix on Philips scanners ([#343][])
  - allow for use of `(...)` instead of `{...}` since `{}` are not allowed
    ([#343][])
  - Support pipolar fieldmaps by providing them with `_epi` not `_magnitude`.
    "Loose" BIDS `_key-value` pairs might come now after `_dir-` even if they
    came first before ([#358][] fixes [#357][])
- All heuristics saved under `.heudiconv/` under `heuristic.py` name, to avoid
  discrepancy during reconversion ([#354][] fixes [#353][])
- Do not crash (with TypeError) while trying to sort absent file list ([#360][])
- heudiconv requires nipype >= 1.0.0 ([#364][]) and blacklists `1.2.[12]` ([#375][])

* tag 'v0.6.0':
  Boost perspective release date in changelog to today
  ENH(TST): Fix version to older pytest to ease backward compatibility testing
  RF: use tmpdir not tmp_path fixture
  FIX: minor typo in CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants