ENH: run_command: Support injecting a detached command #2937

kyleam · 2018-10-22T17:21:55Z

Main changes:

New function prepare_inputs to help datalad-htcondor to avoid some duplicated code.
format_command now handles custom substitutions. This fixes rerun --script for commands that include substitutions and helps (a little bit) with code duplication in datalad-htcondor.
run_command accepts a new argument extra_info that allows arbitrary information to be added to the run record.
run_command accepts a new argument inject that means "take the current working tree state as the result of the command" (i.e., don't prepare inputs or outputs before the command and don't execute the command) (run helper to fake the actual execution part? #2934).

@mih, will inject=True work for you wrt datalad-htcondor?

kyleam · 2018-10-22T19:05:48Z

Note for @yarikoptic, who I know likes to review PRs from github. This is the sort of PR that is much easier to view from your terminal, making use of options like --color-moved and --ignore-space-change.

kyleam · 2018-10-23T14:35:42Z

The buildbot failure is the previously seen certs import error, and one of the appveyor builds failed with "failed to assign job to 'gce' build environment".

mih · 2018-10-23T14:57:22Z

@kyleam FTR: I saw your PR, but I will need a sec or two before I can work the changes in and test them. Still chewing on the symlink issues, but making good progress.

gpaths is a single GlobbedPaths instance not a list of GlobbedPaths.

Otherwise 'rerun --script' will fail if it encounters a command that uses custom placeholders. This also somewhat addresses the code duplication in datalad-htcondor. Getting rid of the "KeyError to result" duplication is trickier. Re: <datalad/datalad-htcondor#6 (comment)>

... for use in datalad-htcondor. Re: <datalad/datalad-htcondor#6 (comment)>

Document how run_command differs from Run.__call__(), especially because upcoming commits add more parameters that are unique to run_command().

This should help outside callers fit run_command to their needs. For example, datalad-htcondor may want to store extra information about where the command was actually executed.

This probably won't be used elsewhere, but it makes run_command() a bit more digestible, especially because this code would be indented one more level when we add the 'inject' parameter.

run_command() doesn't rely on detecting whether cmd_exitcode is None, so we might as well handle the None->0 conversion inside _execute_command().

... to group code that will be moved under a conditional in the next commit.

This allows a record to be created for a command that is executed remotely. Note that most of the code change in run.py corresponds to pure indentation. Closes datalad#2934.

codecov · 2018-10-24T20:08:50Z

Codecov Report

Merging #2937 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2937      +/-   ##
==========================================
+ Coverage   90.32%   90.34%   +0.02%     
==========================================
  Files         246      246              
  Lines       32029    32060      +31     
==========================================
+ Hits        28930    28966      +36     
+ Misses       3099     3094       -5

Impacted Files	Coverage Δ
datalad/interface/rerun.py	`96.26% <ø> (ø)`	⬆️
datalad/interface/run.py	`100% <100%> (ø)`	⬆️
datalad/interface/tests/test_run.py	`99.83% <100%> (ø)`	⬆️
datalad/support/external_versions.py	`96.89% <0%> (ø)`	⬆️
datalad/support/gitrepo.py	`88.84% <0%> (+0.03%)`	⬆️
datalad/cmdline/main.py	`78.63% <0%> (+0.09%)`	⬆️
datalad/downloaders/credentials.py	`88.81% <0%> (+3.49%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f935535...869d26b. Read the comment docs.

yarikoptic

Just left a minor comment, otherwise seems to be legitimate. If fits @mih needs, we should merge

yarikoptic · 2018-10-29T19:03:01Z

datalad/interface/run.py

+        Additional information to dump with the json run record. Any value
+        given here will take precedence over the standard run key. Warning:
+        Callers should try to use fairly specific key names to avoid collisions
+        with future keys added by `run`.


minor: I wonder if we should even encourage adding additional information in subsections (e.g. niceman to contain any relevant/added field, ...) instead of flooding this record at the top level?

Sounds like a good idea, yes.

Re: datalad#2937 (review)

datalad/datalad#2937 Full RF yet to come.

## 0.11.1 (Nov 25, 2018) -- v7-better-than-v6 Rushed out bugfix release to stay fully compatible with recent [git-annex] which introduced v7 to replace v6. ### Fixes - [install]: be able to install recursively into a dataset ([#2982]) - [save]: be able to commit/save changes whenever files potentially could have swapped their storage between git and annex ([#1651]) ([#2752]) ([#3009]) - [aggregate-metadata]: - dataset's itself is now not "aggregated" if specific paths are provided for aggregation ([#3002]). That resolves the issue of `-r` invocation aggregating all subdatasets of the specified dataset as well - also compare/verify the actual content checksum of aggregated metadata while considering subdataset metadata for re-aggregation ([#3007]) - `annex` commands are now chunked assuming 50% "safety margin" on the maximal command line length. Should resolve crashes while operating ot too many files at ones ([#3001]) - `run` sidecar config processing ([#2991]) - no double trailing period in docs ([#2984]) - correct identification of the repository with symlinks in the paths in the tests ([#2972]) - re-evaluation of dataset properties in case of dataset changes ([#2946]) - [text2git] procedure to use `ds.repo.set_gitattributes` ([#2974]) ([#2954]) - Switch to use plain `os.getcwd()` if inconsistency with env var `$PWD` is detected ([#2914]) - Make sure that credential defined in env var takes precedence ([#2960]) ([#2950]) ### Enhancements and new features - [shub://datalad/datalad:git-annex-dev](https://singularity-hub.org/containers/5663/view) provides a Debian buster Singularity image with build environment for [git-annex]. [tools/bisect-git-annex]() provides a helper for running `git bisect` on git-annex using that Singularity container ([#2995]) - Added [.zenodo.json]() for better integration with Zenodo for citation - [run-procedure] now provides names and help messages with a custom renderer for ([#2993]) - Documentation: point to [datalad-revolution] extension (prototype of the greater DataLad future) - [run] - support injecting of a detached command ([#2937]) - `annex` metadata extractor now extracts `annex.key` metadata record. Should allow now to identify uses of specific files etc ([#2952]) - Test that we can install from http://datasets.datalad.org - Proper rendering of `CommandError` (e.g. in case of "out of space" error) ([#2958]) * tag '0.11.1': Adjust the date -- 25th fell through due to __version__ fiasco BF+ENH(TST): boost hardcoded version + provide a test to guarantee consistency in the future This (expensive) approach is not needed in v6+ small tuneup to changelog

kyleam added 9 commits October 24, 2018 16:05

DOC: run: Correct a parameter's description

d6a6673

gpaths is a single GlobbedPaths instance not a list of GlobbedPaths.

RF: run: Expose "install and get inputs" sequence as a function

4ac7b44

... for use in datalad-htcondor. Re: <datalad/datalad-htcondor#6 (comment)>

DOC: run: Add a docstring to run_command()

f6fd37e

Document how run_command differs from Run.__call__(), especially because upcoming commits add more parameters that are unique to run_command().

ENH: run_command: Allow arbitrary information to be put in record

29eb451

This should help outside callers fit run_command to their needs. For example, datalad-htcondor may want to store extra information about where the command was actually executed.

RF: run: Move "execute command" logic into a separate function

621bc2e

This probably won't be used elsewhere, but it makes run_command() a bit more digestible, especially because this code would be indented one more level when we add the 'inject' parameter.

RF: run: Move exit code default into _execute_command()

48116cf

run_command() doesn't rely on detecting whether cmd_exitcode is None, so we might as well handle the None->0 conversion inside _execute_command().

RF: run: Rearrange some lines

af2367e

... to group code that will be moved under a conditional in the next commit.

ENH: run_command: Support injecting a detached command

36394d1

This allows a record to be created for a command that is executed remotely. Note that most of the code change in run.py corresponds to pure indentation. Closes datalad#2934.

kyleam force-pushed the enh-run-inject branch from 711cf4d to 36394d1 Compare October 24, 2018 20:08

yarikoptic approved these changes Oct 29, 2018

View reviewed changes

DOC: run: Tweak extra_info description to encourage using sub-keys

869d26b

Re: datalad#2937 (review)

yarikoptic merged commit 7a74773 into datalad:master Nov 1, 2018

mih added a commit to datalad/datalad-htcondor that referenced this pull request Nov 1, 2018

BF: Minimal adjustment to work after change in run

23896a5

datalad/datalad#2937 Full RF yet to come.

kyleam deleted the enh-run-inject branch November 1, 2018 15:45

yarikoptic added this to the Release 0.11.1 milestone Nov 24, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: run_command: Support injecting a detached command #2937

ENH: run_command: Support injecting a detached command #2937

kyleam commented Oct 22, 2018

kyleam commented Oct 22, 2018

kyleam commented Oct 23, 2018

mih commented Oct 23, 2018

codecov bot commented Oct 24, 2018 •

edited

yarikoptic left a comment •

edited

yarikoptic Oct 29, 2018

kyleam Oct 29, 2018

ENH: run_command: Support injecting a detached command #2937

ENH: run_command: Support injecting a detached command #2937

Conversation

kyleam commented Oct 22, 2018

kyleam commented Oct 22, 2018

kyleam commented Oct 23, 2018

mih commented Oct 23, 2018

codecov bot commented Oct 24, 2018 • edited

Codecov Report

yarikoptic left a comment • edited

Choose a reason for hiding this comment

yarikoptic Oct 29, 2018

Choose a reason for hiding this comment

kyleam Oct 29, 2018

Choose a reason for hiding this comment

codecov bot commented Oct 24, 2018 •

edited

yarikoptic left a comment •

edited