Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

annexrepo: Try to avoid silent bugs when processing json output #3516

merged 3 commits into from
Jul 8, 2019


Copy link

@kyleam kyleam commented Jul 8, 2019

Report non-json output from --json commands to replace silent failures, in particular #3513, with more informative messages.

We have a scenario where some annex functionality is broken by the
latest git-annex.  Add an appropriate exception for this situation.
When processing --json output of annex commands, we silently drop
lines that don't start with "{".  If we receive unexpected annex
output, this can lead to hard-to-troubleshoot bugs, such as the one
described in the next commit.

Based on reviewing commit messages, it's not clear if this is just to
drop empty lines or whether some commands are known to put non-json
output on stdout.  If non-json lines are detected, let's raise an
exception if no other valid json output was received and issue a
warning otherwise.  If it turns out that this warning happens in
common scenarios, we should demote it to debug().
On git-annex 7.20190626, 'find --json' produces regular output rather
than json.  In addition to being broken, this is problematic because,
before the previous commit, we silently dropped lines that didn't
start with "{".  In the case of 'datalad get', this led to us
incorrectly claiming that an absent file was present.

We now fail with a RuntimeError in this situation, but let's raise a
more informative error when 'find --json' is used with git-annex

This has been fixed in b8ef1bf3b (Fix find --json to output json once
more, 2019-07-05).

Re: datalad#3510 (comment)
Closes datalad#3513.
Copy link
Contributor Author

kyleam commented Jul 8, 2019

The travis build isn't showing up as a check. It's here.

Copy link

Thanks! Looks LGTM - if no travis errors let's merge! Thought to mention that "others" could be too long, but since this is our first incident like this, may be ok as is, without shortening to give some summary (X lines) with some sample (beginning of 100 characters + ... when too long)

Copy link
Contributor Author

kyleam commented Jul 8, 2019

@yarikoptic Thanks for having a look.

Thought to mention that "others" could be too long, but since this is our first incident like this, may be ok as is

Yeah, "wait and see if it's too noisy" was my thinking with this, along the same lines of deciding to go with warning() rather than debug() for now as mentioned in 0863ca8.

@kyleam kyleam merged commit 0863ca8 into datalad:0.11.x Jul 8, 2019
kyleam added a commit that referenced this pull request Jul 8, 2019
@kyleam kyleam deleted the annex-bad-json branch July 8, 2019 17:11
kyleam added a commit to kyleam/datalad that referenced this pull request Jul 8, 2019
kyleam added a commit to kyleam/datalad that referenced this pull request Jul 8, 2019
@kyleam kyleam mentioned this pull request Jul 8, 2019
@yarikoptic yarikoptic added this to the Release 0.11.6 milestone Jul 26, 2019
yarikoptic added a commit that referenced this pull request Jul 31, 2019
0.11.6 (Jul 30, 2019) -- am I the last of 0.11.x?

Primarily bug fixes to achieve more robust performance


- Our tests needed various adjustments to keep up with upstream
  changes in Travis and Git. ([#3479][]) ([#3492][]) ([#3493][])

- `AnnexRepo.is_special_annex_remote` was too selective in what it
  considered to be a special remote.  ([#3499][])

- We now provide information about unexpected output when git-annex is
  called with `--json`.  ([#3516][])

- Exception logging in the `__del__` method of `GitRepo` and
  `AnnexRepo` no longer fails if the names it needs are no longer
  bound.  ([#3527][])

- [addurls][] botched the construction of subdataset paths that were
  more than two levels deep and failed to create datasets in a
  reliable, breadth-first order.  ([#3561][])

- Cloning a `type=git` special remote showed a spurious warning about
  the remote not being enabled.  ([#3547][])

Enhancements and new features

- For calls to git and git-annex, we disable automatic garbage
  collection due to past issues with GitPython's state becoming stale,
  but doing so results in a larger .git/objects/ directory that isn't
  cleaned up until garbage collection is triggered outside of DataLad.
  Tests with the latest GitPython didn't reveal any state issues, so
  we've re-enabled automatic garbage collection.  ([#3458][])

- [rerun][] learned an `--explicit` flag, which it relays to its calls
  to [run][[]].  This makes it possible to call `rerun` in a dirty
  working tree ([#3498][]).

- The [metadata][] command aborts earlier if a metadata extractor is
  unavailable.  ([#3525][])

* tag '0.11.6': (56 commits)
  [DATALAD RUNCMD] make update-changelog
  finalize entry and boost version
  BF(DOC): close [create] with [] to not cause WARNING by md-strict pandoc Link entry from b3e8adb Add entry for gh-3547 Add entry for gh-3561 Add link for addurls
  RF: inform about special remotes based on autoenable config Second batch for 0.11.6
  BF: addurls: Process datasets in a stable, breadth-first order
  BF: addurls: Fix construction of nested subpaths
  TST: addurls: Don't hard-code path separator
  BF(TST): skip test_v7_detached_get in direct mode - fails to annex upgrade
  TST: benchmark-travis-pr: Swap 'pip install' and 'git show'
  TST: benchmark-travis-pr: Move repeated logic to run_asv()
  TST: benchmark-travis-pr: Support other bases
  TST: benchmark-travis-pr: Tweak message about current HEAD
  TST: benchmark-travis-pr: Simplify two git commands into one
  TST: benchmark-travis-pr: Reorder and break up lines
  TST: benchmark-travis-pr: Move command for running asv into function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet

Successfully merging this pull request may close these issues.

None yet

2 participants