Skip to content

Preserve final trailing slash in call_git() output #6754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 28, 2022

Conversation

adswa
Copy link
Member

@adswa adswa commented Jun 8, 2022

A failure in datalad-ukbiobank (datalad/datalad-ukbiobank#82 and a misguided fix in datalad/datalad-ukbiobank#85 has revealed that a change introduced in #6278 caused final trailing newlines to be removed from the final stdout of our call_git calls. In datalad-ukb, unit tests started to fail as the contents of files that were written with git cat-file -p <sourcebranch>:<sourcefile> lacked the trailing newlines that the source files had.

I'm poking in the dark and PRing a WIP PR in hopes of improvements and feedback to fix this by switching back to _call_git() from call_git_items(). The PR also contains half of a docstring fix @yarikoptic idenitified - input on the description of the env parameter (see commit) would be appreciated.

@yarikoptic
Copy link
Member

In principle IMHO it is a bug and worth submitting against maint , but given subtle nature and rarely be an issue, master might indeed be more appropriate.

@adswa
Copy link
Member Author

adswa commented Jun 8, 2022

In principle IMHO it is a bug and worth submitting against maint , but given subtle nature and rarely be an issue, master might indeed be more appropriate.

eh FML I keep forgetting to adjust the PR target. Sorry :(

@yarikoptic yarikoptic force-pushed the trailing-slash-git-output branch from 30b384c to e36c190 Compare June 10, 2022 14:47
@yarikoptic yarikoptic changed the base branch from master to maint June 10, 2022 14:47
@yarikoptic yarikoptic added the semver-patch Increment the patch version when merged label Jun 10, 2022
@yarikoptic
Copy link
Member

I have rebased onto maint, extended tuneup of the docstring, force pushed and changed the base to be maint here.

If no failures and no objections are stated (inviting @christian-monch for review), I think we should merge it and take ukbiobank out of hostage situation.

@codeclimate
Copy link

codeclimate bot commented Jun 10, 2022

Code Climate has analyzed commit e36c190 and detected 0 issues on this pull request.

View more on Code Climate.

Copy link
Member

@bpoldrack bpoldrack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the way to go.

This changes the underlying call from _generator_call_git to _call_git if I'm not mistaken. Hence, the change does something substantially different from its actual aim.

How about call_git_items_ simply getting an option to not swallow an empty line at the end?

Alternatively, one could think about a more specialized helper for cat-file, since it's a somewhat different usecase to read a file content from getting other kinds of results from git calls.

@adswa
Copy link
Member Author

adswa commented Jun 13, 2022

I guess we really need insights from @christian-monch

@yarikoptic yarikoptic force-pushed the trailing-slash-git-output branch from e36c190 to 9a25d23 Compare June 14, 2022 02:59
@yarikoptic
Copy link
Member

I think it didn't survive change of the base neatly... I have force pushed rewritten last commit to get CI redo it

@yarikoptic
Copy link
Member

====================================================================
FAIL: datalad.tests.test_config.test_bare
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/dl-miniconda-d53tixnp/lib/python3.9/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/travis/build/datalad/datalad/datalad/tests/utils.py", line 874, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/travis/build/datalad/datalad/datalad/tests/utils.py", line 874, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/travis/build/datalad/datalad/datalad/tests/test_config.py", line 660, in test_bare
    assert_equal(
AssertionError: '725ae47b529a6627b987523e19b3c3981ba44264\n' != '725ae47b529a6627b987523e19b3c3981ba44264'
- 725ae47b529a6627b987523e19b3c3981ba44264
?                                         -
+ 725ae47b529a6627b987523e19b3c3981ba44264

seems like legit/intended fail here since now we would have a trailing new line

@christian-monch
Copy link
Contributor

christian-monch commented Jun 14, 2022

This changes the underlying call from _generator_call_git to _call_git if I'm not mistaken. Hence, the change does something substantially different from its actual aim.

I think @adswa s PR is fine. The previous state was:

  • Inside call_git we call call_git_items_ (which does call _generator_call_git) in call_git and unwind the generator in call_git

The current state is:

  • Inside call_git we call _call_git (which does call _generator_call_git) and unwinds the generator

Both are almost equivalent besides the fact that _call_git assembles the stderr-output which is ignored in call_git.

Edit: I think call_git_items_ should probably be modified to faithfully recreate the output of git, if the results are joined by "".join(call_git_items_(....)). Will create an issue for that.
I should verify that the semantic of call_git_items_ did not change between 0.15.6 and 0.16.0. Will create an issue for that.

Edit: issue #6766 created

Copy link
Contributor

@christian-monch christian-monch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is a good solution for the problem. The only fly in the ointment is that stderr-output is recorded and ignored, but correctness and PR simplicity should trump performance considerations.

The tests should be modified to reflect the updated behavior. I.e. something like this:

diff --git a/datalad/tests/test_config.py b/datalad/tests/test_config.py
index b5bc4f3ae..a9e171557 100644
--- a/datalad/tests/test_config.py
+++ b/datalad/tests/test_config.py
@@ -641,7 +641,7 @@ def test_global_config():
 def test_bare(src, path):
     # create a proper datalad dataset with all bells and whistles
     ds = Dataset(src).create()
-    dlconfig_sha = ds.repo.call_git(['rev-parse', 'HEAD:.datalad/config'])
+    dlconfig_sha = ds.repo.call_git(['rev-parse', 'HEAD:.datalad/config']).strip()
     # can we handle a bare repo version of it?
     gr = AnnexRepo.clone(
         src, path, clone_options=['--bare', '-b', DEFAULT_BRANCH])

This commit adapts a test that did not expect a
trailing newline on the result of 'GitRepo.call_git()'
because the previous version would not return a
trailing newline, but the current version does.
@christian-monch
Copy link
Contributor

Pushed a fix for the tests

@christian-monch christian-monch changed the title WIP: Preserve final trailing slash in call_git() output Preserve final trailing slash in call_git() output Jun 14, 2022
@christian-monch
Copy link
Contributor

Thanks @adswa for the nice fix.

@christian-monch
Copy link
Contributor

seems like legit/intended fail here since now we would have a trailing new line

Indeed, pushed an updated test

@codecov
Copy link

codecov bot commented Jun 14, 2022

Codecov Report

Merging #6754 (70f4582) into maint (e6eda7c) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            maint    #6754      +/-   ##
==========================================
+ Coverage   91.18%   91.20%   +0.01%     
==========================================
  Files         353      354       +1     
  Lines       44528    44616      +88     
==========================================
+ Hits        40604    40691      +87     
- Misses       3924     3925       +1     
Impacted Files Coverage Δ
datalad/support/tests/test_external_versions.py 93.63% <ø> (ø)
datalad/dataset/gitrepo.py 96.75% <100.00%> (ø)
datalad/support/external_versions.py 95.03% <100.00%> (+0.15%) ⬆️
datalad/support/gitrepo.py 90.82% <100.00%> (ø)
datalad/support/tests/test_annexrepo.py 97.87% <100.00%> (+<0.01%) ⬆️
datalad/tests/test_config.py 99.72% <100.00%> (ø)
datalad/downloaders/credentials.py 84.18% <0.00%> (-0.72%) ⬇️
datalad/runner/tests/test_threadsafety.py 100.00% <0.00%> (ø)
datalad/core/distributed/tests/test_clone.py 97.59% <0.00%> (+<0.01%) ⬆️
datalad/runner/nonasyncrunner.py 99.63% <0.00%> (+0.01%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 98eb875...70f4582. Read the comment docs.

@github-actions
Copy link

github-actions bot commented Jul 6, 2022

🚀 PR was released in 0.16.7 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
released semver-patch Increment the patch version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants