New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH(UI): interface our progress bars into GitPython for clone/push/pull #2876

Merged
merged 8 commits into from Oct 2, 2018

Conversation

Projects
None yet
3 participants
@yarikoptic
Member

yarikoptic commented Sep 26, 2018

This pull request fixes #1564

This pull request proposes to provide GitPython with our progress reporting for clone/push/pull. May be there is other command where it could be provided - didn't find with git grep

Note:

  • I think either git or GitPython doesn't flush some buffers somewhere so typically in my cases progress bar just blinks on the screen rapidly just screwing display while leaving a blank line. But please try it out - would it show you the progress? we might need to check with/in GitPython about buffering of inputs it gets from git calls etc.

Changes

  • This change is complete
@kyleam

This comment has been minimized.

Member

kyleam commented Sep 26, 2018

I think either git or GitPython doesn't flush some buffers somewhere so typically in my cases progress bar just blinks on the screen rapidly just screwing display while leaving a blank line. But please try it out - would it show you the progress?

Hmm, I tried datalad clone https://github.com/datalad/datalad, and the progress bars look nice on my end -- no blinking.

@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 26, 2018

Which is great! Thanks . Haven't tried GitHub yet

@codecov

This comment has been minimized.

codecov bot commented Sep 26, 2018

Codecov Report

Merging #2876 into master will decrease coverage by <.01%.
The diff coverage is 89.87%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2876      +/-   ##
==========================================
- Coverage    90.3%   90.29%   -0.01%     
==========================================
  Files         246      246              
  Lines       31872    31929      +57     
==========================================
+ Hits        28782    28831      +49     
- Misses       3090     3098       +8
Impacted Files Coverage Δ
datalad/distribution/tests/test_clone.py 99.51% <100%> (ø) ⬆️
datalad/distribution/clone.py 98.3% <86.66%> (-1.7%) ⬇️
datalad/support/gitrepo.py 88.34% <90.32%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bfaf586...7e4dde5. Read the comment docs.

yarikoptic added some commits Sep 27, 2018

RF: adjusted error message to not include "data" while talking about …
…failed clone

Clone is never used when referring to data AFAIK, so it was a bit misleading msg
@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 27, 2018

ok -- the reason for a test failure all around is our good old reliance on parsing stderr from git. By enabling progress reporting, we make GitPython to parse stderr output from git, so if something goes kaboom, we end up without stderr being recorded as part of the exception's .stderr. I do feel that it is an issue within GitPython since it overall could still somehow record the entirety of stderr and assign it to the exception in such cases (yet to investigate/report) BUT in this particular case we should might better just explicitly check if that path could be created/modified even before invoking clone if we want to provide such a custom error message.

@kyleam

This comment has been minimized.

Member

kyleam commented Sep 27, 2018

You mentioned in the GitPython issue that it may be py2/py3. Indeed, if I try on python2, I don't see nice/smooth progress bars. I see a few jumpy ones that I can't quite make out, but they look to start mid-way and quickly disappear.

@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 27, 2018

heh heh, with all the RFing and bugs workarounding seems have managed to kill progress bar reporting somehow in 9afc60a

@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 27, 2018

ha ha . I bet it is this fix which is needed ;)

     def __enter__(self):
-        return self.close()
+        return self

@yarikoptic yarikoptic force-pushed the yarikoptic:enh-clone-progress branch from 052a4b5 to 247332f Sep 27, 2018

@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 27, 2018

indeed, in python3 it looks much better! Also lacks that ugly empty line after progress bar is done, which happens to me with python2

BF: match at the end of the string/line (MULTILINE), not only at \n
Also robustified it, so if matching fails, we do not blow but just include the
full stderr
@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Sep 28, 2018

coverage is reduced due to code which hopefully would never be triggered but is there to provide robust handling of those cases without overall blow up on a progress bar error

@kyleam

kyleam approved these changes Sep 30, 2018

The progress reporting looks nice on my end (py3). I looked over the code and don't have any major comments.

I'm still confused about the py2 issues. It's not clear to me based on your comments whether you figured out what the issue is. I'm guessing you haven't because, based on quick testing under python 2, things still look off to me. In that case, it would have been nice to see some mention/discussion of this in the commit messages or code comments so that it's clear that it is a known issue to someone investigating it later.

@mih

This comment has been minimized.

Member

mih commented Sep 30, 2018

My 2ct: I am OK with merging based on the PY3 performance alone, but I find @kyleam comments reasonable. I will leave the merge to you guys.

@mih mih self-requested a review Sep 30, 2018

@mih

mih approved these changes Sep 30, 2018

@kyleam kyleam merged commit 698e7ff into datalad:master Oct 2, 2018

8 of 10 checks passed

codecov/patch 89.87% of diff hit (target 90.3%)
Details
codecov/project 90.29% (-0.01%) compared to bfaf586
Details
WIP ready for review
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
datalad-pr-dl-osx-64 DEV build done.
Details
datalad-pr-docker-dl-nd14_04 DEV build done.
Details
datalad-pr-docker-dl-nd16_04 DEV build done.
Details
datalad-pr-docker-dl-nd80 DEV build done.
Details
datalad-pr-docker-dl-nd90 DEV build done.
Details
@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Oct 2, 2018

py2 issue, as relating to GitPython, was backlinked above: gitpython-developers/GitPython#799 . I do not think it is something we could/should "fix" on our end.

@kyleam

This comment has been minimized.

Member

kyleam commented Oct 2, 2018

py2 issue, as relating to GitPython, was backlinked above: gitpython-developers/GitPython#799 . I do not think it is something we could/should "fix" on our end.

The main point of my review was: "it would have been nice to see some mention/discussion of this in the commit messages or code comments so that it's clear that it is a known issue to someone investigating it later."

@yarikoptic

This comment has been minimized.

Member

yarikoptic commented Oct 2, 2018

Yeah.. sorry I haven't done it. Feel welcome to push a commit to master with such comment added, will be appreciated

@yarikoptic yarikoptic modified the milestone: Release 0.10.4 Oct 22, 2018

yarikoptic added a commit that referenced this pull request Oct 24, 2018

Merge tag '0.11.0' into debian
 Upgrade of [git-annex] to the most recent available to your release is
 advisable since a number of issues were resolved at that level.

 ### Major refactoring and deprecations

 - `datalad.consts.LOCAL_CENTRAL_PATH` constant was deprecated in favor
   of `datalad.locations.default-dataset` [configuration] variable
   ([#2835])

 ### Minor refactoring

 - `"notneeded"` messages are no longer reported by default results
   renderer
 - [run] no longer shows commit instructions upon command failure when
   `explicit` is true and no outputs are specified ([#2922])
 - `get_git_dir` moved into GitRepo ([#2886])
 - `_gitpy_custom_call` removed from GitRepo ([#2894])
 - `GitRepo.get_merge_base` argument is now called `commitishes` instead
   of `treeishes` ([#2903])

 ### Fixes

 - [update] should not leave the dataset in non-clean state ([#2858])
   and some other enhancements ([#2859])
 - Fixed chunking of the long command lines to account for decorators
   and other arguments ([#2864])
 - Progress bar should not crash the process on some missing progress
   information ([#2891])
 - Default value for `jobs` set to be `"auto"` (not `None`) to take
   advantage of possible parallel get if in `-g` mode ([#2861])
 - [wtf] must not crash if `git-annex` is not installed etc ([#2865]),
   ([#2865]), ([#2918]), ([#2917])
 - Fixed paths (with spaces etc) handling while reporting annex error
   output ([#2892]), ([#2893])
 - `__del__` should not access `.repo` but `._repo` to avoid attempts
   for reinstantiation etc ([#2901])
 - Fix up submodule `.git` right in `GitRepo.add_submodule` to avoid
   added submodules being non git-annex friendly ([#2909]), ([#2904])
 - [run-procedure] ([#2905])
   - now will provide dataset into the procedure if called within dataset
   - will not crash if procedure is an executable without `.py` or `.sh`
     suffixes
 - Use centralized `.gitattributes` handling while setting annex backend
   ([#2912])
 - `GlobbedPaths.expand(..., full=True)` incorrectly returned relative
    paths when called more than once ([#2921])

 ### Enhancements and new features

 - Report progress on [clone] when installing from "smart" git servers
   ([#2876])
 - Stale/unused `sth_like_file_has_content` was removed ([#2860])
 - Enhancements to [search] to operate on "improved" metadata layouts
   ([#2878])
 - Output of `git annex init` operation is now logged ([#2881])
 - New
   - `GitRepo.cherry_pick` ([#2900])
   - `GitRepo.format_commit` ([#2902])
 - [run-procedure] ([#2905])
   - procedures can now recursively be discovered in subdatasets as well.
     The uppermost has highest priority
   - Procedures in user and system locations now take precedence over
     those in datasets.

* tag '0.11.0':
  Make it a 0.11.0 release since there were some API RFings
  REL: slight tune up to Changelog following the advices
  DOC: v0.10.4: Mention change in procedure precedence (a0cbcba)
  DOC: v0.10.4: Fix description of db715b7
  DOC: v0.10.4: Improve description of 6f615a4
  DOC: v0.10.4: Remove duplicate word

yarikoptic added a commit that referenced this pull request Oct 24, 2018

Merge tag '0.11.0' into debian
 ## 0.11.0 (Oct 23, 2018) -- Soon-to-be-perfect

 [git-annex] 6.20180913 (or later) is now required - provides a number of
 fixes for v6 mode operations etc.

 ### Major refactoring and deprecations

 - `datalad.consts.LOCAL_CENTRAL_PATH` constant was deprecated in favor
   of `datalad.locations.default-dataset` [configuration] variable
   ([#2835])

 ### Minor refactoring

 - `"notneeded"` messages are no longer reported by default results
   renderer
 - [run] no longer shows commit instructions upon command failure when
   `explicit` is true and no outputs are specified ([#2922])
 - `get_git_dir` moved into GitRepo ([#2886])
 - `_gitpy_custom_call` removed from GitRepo ([#2894])
 - `GitRepo.get_merge_base` argument is now called `commitishes` instead
   of `treeishes` ([#2903])

 ### Fixes

 - [update] should not leave the dataset in non-clean state ([#2858])
   and some other enhancements ([#2859])
 - Fixed chunking of the long command lines to account for decorators
   and other arguments ([#2864])
 - Progress bar should not crash the process on some missing progress
   information ([#2891])
 - Default value for `jobs` set to be `"auto"` (not `None`) to take
   advantage of possible parallel get if in `-g` mode ([#2861])
 - [wtf] must not crash if `git-annex` is not installed etc ([#2865]),
   ([#2865]), ([#2918]), ([#2917])
 - Fixed paths (with spaces etc) handling while reporting annex error
   output ([#2892]), ([#2893])
 - `__del__` should not access `.repo` but `._repo` to avoid attempts
   for reinstantiation etc ([#2901])
 - Fix up submodule `.git` right in `GitRepo.add_submodule` to avoid
   added submodules being non git-annex friendly ([#2909]), ([#2904])
 - [run-procedure] ([#2905])
   - now will provide dataset into the procedure if called within dataset
   - will not crash if procedure is an executable without `.py` or `.sh`
     suffixes
 - Use centralized `.gitattributes` handling while setting annex backend
   ([#2912])
 - `GlobbedPaths.expand(..., full=True)` incorrectly returned relative
    paths when called more than once ([#2921])

 ### Enhancements and new features

 - Report progress on [clone] when installing from "smart" git servers
   ([#2876])
 - Stale/unused `sth_like_file_has_content` was removed ([#2860])
 - Enhancements to [search] to operate on "improved" metadata layouts
   ([#2878])
 - Output of `git annex init` operation is now logged ([#2881])
 - New
   - `GitRepo.cherry_pick` ([#2900])
   - `GitRepo.format_commit` ([#2902])
 - [run-procedure] ([#2905])
   - procedures can now recursively be discovered in subdatasets as well.
     The uppermost has highest priority
   - Procedures in user and system locations now take precedence over
     those in datasets.

* tag '0.11.0':
  CHANGELOG adjusted to reflect new minimal version of git-annex
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment