Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dluksza include ref in upload request #1036

Closed
wants to merge 10 commits into from

Conversation

technoweenie
Copy link
Contributor

Re-applied #969 against the current master, fixing some merge conflicts.

dluksza and others added 10 commits February 3, 2016 12:17
Adds "ref" parameter extracted from git pre-push hook into the upload
request. This way servers with branch access permissions can enforce
them before large objects are transfered.

Issue: #845
In push hook refs comes from user and points to local
branches. Transporting them to server doesn't make sense.
Don't add empty "ref" parameter to the lfs-request object when its value
is empty.
Instead of passing around additional string parameter with destination
ref, we pass special struct called UploadMetadata.

For now UploadMetadata contains only ref field but in the future more
metadata like file name (or path) can be added.
Move "ref" property of batch request into "meta" object, this way all
meta data have a common place to live instead of being spread all around
the batch request.
Since we also want to have metadata included in fetch request the
UploadMetadata struct is renamed to TransferMetadata.

Patch that adds metadata to fetch request will follow.
Include meta.ref property in download request object, same as it is in
the upload request.

In case of download meta.ref contain name of currently check out local
branch and can be only a suggestion to server implementation.
Normalize the value of meta.ref parameter of batch request
object. Previously ref parameter format differs between "download" and
"upload" objects.

The "upload" always was using the fully qualified branch
name (eg. refs/heads/master), when the "download" used short
name (eg. master).

Right now meta.ref will have consistent values. When "refs/heads/"
prefix will be appended when it is not present. Special "HEAD" reference
will be replaced with empty string, which will result in meta.ref
parameter not being included in the request object.
@technoweenie
Copy link
Contributor Author

Echoing @sinbad's comments in #969 (comment):

I'm not crazy about how this makes the LFS push invoked by the pre-push hook semantically different to using git lfs push.

I'm sure it's not intentional. The fetch command sends it unless the --all flag is used. scanAll() only returns a slice of lfs pointers with no ref information.

None of the git lfs push calls because of how uploadPointers() is used.

I think scanAll() would have to change to return LFS pointers along with the ref that they're in. But I don't think git rev-list returns that info. If it did, it could loop through the refs, starting transfer queues for each one, like the lfs pre-push command.

@technoweenie
Copy link
Contributor Author

Instead of using git rev-list --all, any thoughts on getting the refs first with git show-ref --heads --tags? At least for pushes. Fetches probably do want the remote refs too (but not stash).

@sinbad
Copy link
Contributor

sinbad commented Feb 24, 2016

If it did, it could loop through the refs, starting transfer queues for each one, like the lfs pre-push command.

I assume you're talking about git lfs push --all only? The problem I can see with the above is that the same LFS object can be reached by multiple refs, so you're going to get upload requests for the same oid duplicated in different queues. You'd really have to serialise the pushing of each ref to avoid that - it probably would only result in duplicated uploads where the one that finishes last is thrown away (same as if 2 people uploaded simultaneously) but it becomes so much more likely it's worth avoiding.

@technoweenie
Copy link
Contributor Author

I assume you're talking about git lfs push --all only?

No, all code paths just build up a list of lfs pointers first, before uploading.

https://github.com/github/git-lfs/blob/master/commands/command_push.go#L55-L74

The pre-push command relies on the transfer queue's Wait() function.

https://github.com/github/git-lfs/blob/6b00ebc996a7cd2b28fdcd027fe527c48975c113/commands/command_pre_push.go#L59-L74

You're right that it does send the OIDs to the batch api multiple times, potentially. But it doesn't transfer the same one multiple times in a single command. We could probably solve this by tracking the successful uploads, and preventing them from being added to another ref's transfer queue.

@technoweenie technoweenie mentioned this pull request Feb 24, 2016
@technoweenie
Copy link
Contributor Author

I made a few code spikes that illustrate how we could refactor those commands. #1038 changes lfs push --all so it runs git show-ref to get the list of local refs. #1040 updates lfs pre-push so it doesn't attempt to upload duplicate objects that have been uploaded in the same process (from another ref).

@technoweenie
Copy link
Contributor Author

Wow, this is a major yak shave :/ Redoing this in #1128....

@technoweenie technoweenie deleted the dluksza-include-ref-in-upload-request branch April 7, 2016 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants