-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter Protocol Support #1617
Filter Protocol Support #1617
Conversation
46a8ee5
to
d874af9
Compare
filter_stream: Integration tests
// instead of a pointer. | ||
// | ||
// TODO(taylor): figure out if there is more data on the reader, | ||
// and buffer that as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I understand that 100% but I think according to the filter protocol there can't be more data on the reader. Because we need to acknowledge the data with a status
packet before Git sends more data, no?
Do you want to remove or address this TODO before merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing in meat-space, the consensus was there will be no more data on the io.Reader coming from Git, since we will have decoded the entire pointer (or at least, tried to) by this point, so I went ahead and removed the TODO.
@larsxschneider can clarify anything that I left out.
EDIT: Went ahead and removed the TODO
note in 480ec1b.
} | ||
|
||
func filterCommand(cmd *cobra.Command, args []string) { | ||
requireStdin("This command should be run by the Git filter process") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: This seems not to work on Windows (same is true for the legacy clean/smudge functions). However, I don't think that should be addressed in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome catch. I agree that this is out of scope of the current PR, so I opened #1639 to address.
|
||
if err := s.Err(); err != nil && err != io.EOF { | ||
ExitWithError(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s.Err() == io.EOF
is a legitimate signal from Git telling GitLFS that it should shut down. Maybe we should document/make this more clear here?
More importantly: If we get an error during w.Flush()
in L101 then we wouldn't exit with error here, right?
if err != nil { | ||
ptr.Encode(to) | ||
// Download declined error is ok to skip if we weren't requesting download | ||
if !(errors.IsDownloadDeclinedError(err) && !download) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we get a download error if we don't request a download?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. This happens since we treat downloading a file excluded with the -X
flag as optional. If we tried to download the file (i.e., repository was public), we just won't do the smudge in order to speed up the checkout. If the repo was private and we tried to download, we could get an error due to missing credentials, but that's safe to ignore, since we don't need the contents of the file anyway.
@@ -49,42 +97,18 @@ func smudgeCommand(cmd *cobra.Command, args []string) { | |||
Exit(err.Error()) | |||
} | |||
|
|||
stat, err := os.Stat(localPath) | |||
if err != nil { | |||
if stat, err := os.Stat(localPath); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: What is the advantage of this style over the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things: It's 1 less line of code, and the err
value is scoped to the if stat, err
block.
"smudge": []string{"git-lfs smudge %f"}, | ||
"clean": []string{"git-lfs clean %f"}, | ||
"smudge": []string{"git-lfs smudge %f"}, | ||
"process": []string{"git-lfs filter"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important: Shouldn't that be filter-process
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. This is in the Upgradeables
section of the setup code, meaning it's legit to overwrite this value so long as it matches one of the values in that slice. Since there was code publicly available during the course of this PR that referred to git-lfs filter
instead of git-lfs filter-process
, I'm going to keep this.
"smudge": []string{"git-lfs smudge --skip %f"}, | ||
"clean": []string{"git-lfs clean %f"}, | ||
"smudge": []string{"git-lfs smudge --skip %f"}, | ||
"process": []string{"git-lfs filter --skip"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important: Shouldn't that be filter-process
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #1617 (comment).
# Once 2.11 is released, replace this with: | ||
# | ||
# ``` | ||
# ensure_git_version_isnt $VERSION_LOWER "2.11.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should teach ensure_git_version
the g
format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good thought. Since adding support for the git-describe
format would complicate this function (and I'm not planning on gating any more LFS features behind Git versions other than those released), I'm going to forgo this change for now, unless we need something like this in the future.
|
||
diff -u <(echo "$expected") <(echo "$got") | ||
) | ||
end_test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would prefer it if we set clean and smudge to false
here to ensure they are not used.
|
||
printf "$smudge" | grep "git-lfs smudge" | ||
printf "$clean" | grep "git-lfs clean" | ||
printf "$filter" | grep "git-lfs filter" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/lfs filter/lfs filter-process/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, ha! Fixed it right up in: 4ae8afc.
The PR looks really good from my point of view. During my review I found 2 important problems that I think need to be addressed before merge and a couple of nits that could be/should be addressed. (Yeah, I should have used the "Request changes" button instead of "Approve" ... but well 😄 ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work 🤘
// Truncate "p" such that it no longer includes the data that we | ||
// have in the internal buffer. | ||
p = p[m:] | ||
w.buf = append(w.buf, p[n:n+m]...) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know what clean/smudge operations triggered this bug with your special repo? Can we extract that case and add it to the integration tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. The bug was actually not here, but in the io.Reader, as patched in: 021511a. I'm not sure an integration test is appropriate for that level of detail, but the unit tests (specifically this one) do assert for that behavior.
As for the change in the io.Writer
implementation, this is due to a contract that I failed to uphold when originally implementing this type:
Write must not modify the slice data, even temporarily.
Implementations must not retain p.
(see: https://golang.org/pkg/io#Writer).
That behavior is tested in this unit test, which would fail if the underlying dataums b1
and b2
were modified.
Thanks everyone! 🎉 |
Tremendous work! |
During sync, repo runs `git read-tree --reset -u -v HEAD` which causes git-lfs's smudge filter to run, which fails because git-lfs does not work with bare repositories. This was fixed in I091ff37998131e2e6bbc59aa37ee352fe12d7fcd to automatically disable this smudge filter. However, later versions of Git (2.11.0) introduced a new filter protocol [1], to avoid spawning a new command for each filtered file. This was implemented in Git-LFS 1.5.0 [2]. This patch fixes the issue by setting the git lfs process filter, in addition to the smudge filter. For any projects that have LFS objects, `git lfs pull` must still be executed manually afterwards. [1] git/git@edcc858 [2] git-lfs/git-lfs#1617 Bug: https://crbug.com/gerrit/10911 Change-Id: I277fc68fdefc91514a2412b3887e3be9106cab48
This pull-request is a continuation of #1382, originally authored by @larsxschneider.
It should serve as a place to merge in all of PRs in the filter-stream series (see description below), and then be merged into master as one huge thing. I'm suggesting this since the PR as of 46a8ee5 is incomplete, and not in a state that I would feel comfortable landing into master.
In terms of the PR series, here's what I'm thinking:
io.Reader
implementation for reading data off of the filter stream, and wiring thatio.Reader
up throughout the codeio.Writer
implementation for writing streams of data to the filter stream, and wiring that up[pending]:A cleanup PR to rename types, document anything that I missed, and generally apply some 💅 EDIT: this was done in place/cc @larsxschneider @technoweenie @sinbad @rubyist @sschuberth