New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: channel output from annex init into our logger #2881

Merged
merged 1 commit into from Sep 30, 2018

Conversation

Projects
None yet
3 participants
@yarikoptic
Member

yarikoptic commented Sep 28, 2018

This pull request partially fixes #2844 and replaces #2868

Partially, because whenever we do consume/redirect stderr, git no longer outputs progress information. Here is the function it decides upon https://git.kernel.org/pub/scm/git/git.git/tree/progress.c#n75

To force it to provide progress output, checkout/pull/push have additional --progress option to force it. there is no config setting (according to IRC chat). But just from a few days back, https://public-inbox.org/git/20180921222454.GD11177@sigill.intra.peff.net/ points to a proposal to be able to pipe progress information to any process to provide custom progress rendering without needing to parse etc. Might come handy in the future

So, for now I am offering to get following:

$> tools/eval_under_testloopfs DATALAD_REPO_VERSION=6 /usr/bin/time datalad install -s https://github.com/psychoinformatics-de/studyforrest-data-structural.git studyforrest-data-structural_alt                            I: vfat of 10:  /home/yoh/.tmp/datalad-fs-6wyWm
10+0 records in
10+0 records out
10321920 bytes (10 MB, 9.8 MiB) copied, 0.00474188 s, 2.2 GB/s
mkfs.fat 4.1 (2017-01-24)
I: running DATALAD_REPO_VERSION=6 /usr/bin/time datalad install -s https://github.com/psychoinformatics-de/studyforrest-data-structural.git studyforrest-data-structural_alt
[INFO   ] Cloning https://github.com/psychoinformatics-de/studyforrest-data-structural.git [1 other candidates] into '/home/yoh/.tmp/datalad-fs-6wyWm/studyforrest-data-structural_alt' 
[INFO   ]   Detected a filesystem without fifo support. 
[INFO   ]   Disabling ssh connection caching. 
[INFO   ]   Detected a crippled filesystem. 
[INFO   ]   Entering an adjusted branch where files are unlocked as this filesystem does not support locked files. 
[INFO   ] Switched to branch 'adjusted/master(unlocked)' 
[INFO   ] download failed: Not Found 
[INFO   ]   Remote origin not usable by git-annex; setting annex-ignore 
install(ok): /home/yoh/.tmp/datalad-fs-6wyWm/studyforrest-data-structural_alt (dataset)
6.74user 3.51system 0:15.83elapsed 64%CPU (0avgtext+0avgdata 54752maxresident)k
8inputs+18768outputs (0major+847758minor)pagefaults 0swaps
/home/yoh/.tmp/datalad-fs-6wyWm
tools/eval_under_testloopfs DATALAD_REPO_VERSION=6 /usr/bin/time datalad  -s   6.79s user 3.55s system 64% cpu 16.003 total

IMHO it is already good enough to inform user about ongoing activities... not yet sure about download failed: Not Found message there.

As for progress, which as some of you already discovered first hand is not really "logging/CI friendly" if just dumped to stderr filling up the logs, I would wait for us to be able to get and handle it.

@codecov

This comment has been minimized.

codecov bot commented Sep 28, 2018

Codecov Report

Merging #2881 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2881   +/-   ##
=======================================
  Coverage   90.29%   90.29%           
=======================================
  Files         246      246           
  Lines       31879    31879           
=======================================
  Hits        28786    28786           
  Misses       3093     3093
Impacted Files Coverage Δ
datalad/support/annexrepo.py 88.21% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 006dbd2...f746c25. Read the comment docs.

@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 28, 2018

for clarity -- the "Download failed" is about annex trying to fetch the /config from the remote .git repo:

Switched to branch 'adjusted/master(unlocked)'
[2018-09-28 12:50:27.639680713] process done ExitSuccess
[2018-09-28 12:50:27.640106105] read: uname ["-n"]
[2018-09-28 12:50:27.643498338] process done ExitSuccess
[2018-09-28 12:50:27.661445928] Request {
  host                 = "github.com"
  port                 = 443
  secure               = True
  requestHeaders       = [("Range","bytes=0-"),("Accept-Encoding","identity")]
  path                 = "/psychoinformatics-de/studyforrest-data-structural.git/config"
  queryString          = ""
  method               = "GET"
  proxy                = Nothing
  rawBody              = False
  redirectCount        = 10
  responseTimeout      = ResponseTimeoutDefault
  requestVersion       = HTTP/1.1
}

download failed: Not Found

channeled to Joey since IMHO annex shouldn't spit it out: http://git-annex.branchable.com/todo/prevent_directly_printing_to_stderr_error_from_an_attempt_to_download_remote___47__config_file/?updated

@bpoldrack

This comment has been minimized.

Member

bpoldrack commented Sep 29, 2018

Well, although I'd just let stderr through until git/annex allows for a better handling, this PR still is an improvement and the least common denominator. So, I'm fine with merging.

@mih

This comment has been minimized.

Member

mih commented Sep 30, 2018

LGTM, thx!

Commented on #2844 regarding the impact of this PR.

@mih mih merged commit 687cd9b into datalad:master Sep 30, 2018

10 checks passed

WIP ready for review
Details
codecov/patch 100% of diff hit (target 90.29%)
Details
codecov/project 90.29% (+0%) compared to 006dbd2
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
datalad-pr-dl-osx-64 DEV build done.
Details
datalad-pr-docker-dl-nd14_04 DEV build done.
Details
datalad-pr-docker-dl-nd16_04 DEV build done.
Details
datalad-pr-docker-dl-nd80 DEV build done.
Details
datalad-pr-docker-dl-nd90 DEV build done.
Details

@yarikoptic yarikoptic modified the milestone: Release 0.10.4 Oct 22, 2018

yarikoptic added a commit that referenced this pull request Oct 24, 2018

Merge tag '0.11.0' into debian
 Upgrade of [git-annex] to the most recent available to your release is
 advisable since a number of issues were resolved at that level.

 ### Major refactoring and deprecations

 - `datalad.consts.LOCAL_CENTRAL_PATH` constant was deprecated in favor
   of `datalad.locations.default-dataset` [configuration] variable
   ([#2835])

 ### Minor refactoring

 - `"notneeded"` messages are no longer reported by default results
   renderer
 - [run] no longer shows commit instructions upon command failure when
   `explicit` is true and no outputs are specified ([#2922])
 - `get_git_dir` moved into GitRepo ([#2886])
 - `_gitpy_custom_call` removed from GitRepo ([#2894])
 - `GitRepo.get_merge_base` argument is now called `commitishes` instead
   of `treeishes` ([#2903])

 ### Fixes

 - [update] should not leave the dataset in non-clean state ([#2858])
   and some other enhancements ([#2859])
 - Fixed chunking of the long command lines to account for decorators
   and other arguments ([#2864])
 - Progress bar should not crash the process on some missing progress
   information ([#2891])
 - Default value for `jobs` set to be `"auto"` (not `None`) to take
   advantage of possible parallel get if in `-g` mode ([#2861])
 - [wtf] must not crash if `git-annex` is not installed etc ([#2865]),
   ([#2865]), ([#2918]), ([#2917])
 - Fixed paths (with spaces etc) handling while reporting annex error
   output ([#2892]), ([#2893])
 - `__del__` should not access `.repo` but `._repo` to avoid attempts
   for reinstantiation etc ([#2901])
 - Fix up submodule `.git` right in `GitRepo.add_submodule` to avoid
   added submodules being non git-annex friendly ([#2909]), ([#2904])
 - [run-procedure] ([#2905])
   - now will provide dataset into the procedure if called within dataset
   - will not crash if procedure is an executable without `.py` or `.sh`
     suffixes
 - Use centralized `.gitattributes` handling while setting annex backend
   ([#2912])
 - `GlobbedPaths.expand(..., full=True)` incorrectly returned relative
    paths when called more than once ([#2921])

 ### Enhancements and new features

 - Report progress on [clone] when installing from "smart" git servers
   ([#2876])
 - Stale/unused `sth_like_file_has_content` was removed ([#2860])
 - Enhancements to [search] to operate on "improved" metadata layouts
   ([#2878])
 - Output of `git annex init` operation is now logged ([#2881])
 - New
   - `GitRepo.cherry_pick` ([#2900])
   - `GitRepo.format_commit` ([#2902])
 - [run-procedure] ([#2905])
   - procedures can now recursively be discovered in subdatasets as well.
     The uppermost has highest priority
   - Procedures in user and system locations now take precedence over
     those in datasets.

* tag '0.11.0':
  Make it a 0.11.0 release since there were some API RFings
  REL: slight tune up to Changelog following the advices
  DOC: v0.10.4: Mention change in procedure precedence (a0cbcba)
  DOC: v0.10.4: Fix description of db715b7
  DOC: v0.10.4: Improve description of 6f615a4
  DOC: v0.10.4: Remove duplicate word

yarikoptic added a commit that referenced this pull request Oct 24, 2018

Merge tag '0.11.0' into debian
 ## 0.11.0 (Oct 23, 2018) -- Soon-to-be-perfect

 [git-annex] 6.20180913 (or later) is now required - provides a number of
 fixes for v6 mode operations etc.

 ### Major refactoring and deprecations

 - `datalad.consts.LOCAL_CENTRAL_PATH` constant was deprecated in favor
   of `datalad.locations.default-dataset` [configuration] variable
   ([#2835])

 ### Minor refactoring

 - `"notneeded"` messages are no longer reported by default results
   renderer
 - [run] no longer shows commit instructions upon command failure when
   `explicit` is true and no outputs are specified ([#2922])
 - `get_git_dir` moved into GitRepo ([#2886])
 - `_gitpy_custom_call` removed from GitRepo ([#2894])
 - `GitRepo.get_merge_base` argument is now called `commitishes` instead
   of `treeishes` ([#2903])

 ### Fixes

 - [update] should not leave the dataset in non-clean state ([#2858])
   and some other enhancements ([#2859])
 - Fixed chunking of the long command lines to account for decorators
   and other arguments ([#2864])
 - Progress bar should not crash the process on some missing progress
   information ([#2891])
 - Default value for `jobs` set to be `"auto"` (not `None`) to take
   advantage of possible parallel get if in `-g` mode ([#2861])
 - [wtf] must not crash if `git-annex` is not installed etc ([#2865]),
   ([#2865]), ([#2918]), ([#2917])
 - Fixed paths (with spaces etc) handling while reporting annex error
   output ([#2892]), ([#2893])
 - `__del__` should not access `.repo` but `._repo` to avoid attempts
   for reinstantiation etc ([#2901])
 - Fix up submodule `.git` right in `GitRepo.add_submodule` to avoid
   added submodules being non git-annex friendly ([#2909]), ([#2904])
 - [run-procedure] ([#2905])
   - now will provide dataset into the procedure if called within dataset
   - will not crash if procedure is an executable without `.py` or `.sh`
     suffixes
 - Use centralized `.gitattributes` handling while setting annex backend
   ([#2912])
 - `GlobbedPaths.expand(..., full=True)` incorrectly returned relative
    paths when called more than once ([#2921])

 ### Enhancements and new features

 - Report progress on [clone] when installing from "smart" git servers
   ([#2876])
 - Stale/unused `sth_like_file_has_content` was removed ([#2860])
 - Enhancements to [search] to operate on "improved" metadata layouts
   ([#2878])
 - Output of `git annex init` operation is now logged ([#2881])
 - New
   - `GitRepo.cherry_pick` ([#2900])
   - `GitRepo.format_commit` ([#2902])
 - [run-procedure] ([#2905])
   - procedures can now recursively be discovered in subdatasets as well.
     The uppermost has highest priority
   - Procedures in user and system locations now take precedence over
     those in datasets.

* tag '0.11.0':
  CHANGELOG adjusted to reflect new minimal version of git-annex
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment