Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ref information to upload request #969

Closed
wants to merge 9 commits into from

Conversation

dluksza
Copy link

@dluksza dluksza commented Feb 3, 2016

Adds "ref" parameter extracted from git pre-push hook into the upload
request. This way servers with branch access permissions can enforce
them before large objects are transfered.

Issue: #845

Adds "ref" parameter extracted from git pre-push hook into the upload
request. This way servers with branch access permissions can enforce
them before large objects are transfered.

Issue: git-lfs#845
@dluksza dluksza force-pushed the include-ref-in-upload-request branch from db53621 to 2b38b2a Compare February 3, 2016 11:17
@@ -45,7 +46,7 @@ func uploadsBetweenRefAndRemote(remote string, refs []string) *lfs.TransferQueue
if len(refs) == 0 {
pointers := scanAll()
Print("Pushing objects...")
return uploadPointers(pointers)
return uploadPointers(pointers, strings.Join(refs, ","))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since pre-push is reading from the git hook, it's probably getting a full remote ref like refs/heads/master. However, the push command is receiving user input, and is likely a relative name (which could be a branch, tag, etc). Do we care? Should the server just do its best to validate it?

Also, instead of joining the refs array with commas, why not just pass a JSON array to the server.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, on the server side we are not interested in the name of local branch, we just want to know where you want to push (put) your changes.

Will remove my modifications from the command_push.go

@technoweenie
Copy link
Contributor

Thanks for resurrecting #850. That slipped through the cracks during the holiday break. I'm trying to put out a v1.1.1 bug fix release this week, and then I want to start looking at new things like this.

Compared to #850, I like that this gets the ref from existing data, instead of another git call.

I'm not thrilled about passing an extra string argument to functions like NewUploadQueue(). Some kind of meta struct like in #850 would be preferable. However, it does seem kind of silly if Refs is the only property. But, it would give us a place to potentially add more metadata properties in the future.

I think the refs given to the fetch command should be passed to the DownloadQueue as well. But I don't think there's any way to support this in the standard LFS workflow that uses the git smudge filter. Git runs the smudge filters during checkout, which is after the git pull (if there even was one). Therefore it doesn't know the remote ref.

In push hook refs comes from user and points to local
branches. Transporting them to server doesn't make sense.
Don't add empty "ref" parameter to the lfs-request object when its value
is empty.
Instead of passing around additional string parameter with destination
ref, we pass special struct called UploadMetadata.

For now UploadMetadata contains only ref field but in the future more
metadata like file name (or path) can be added.
Move "ref" property of batch request into "meta" object, this way all
meta data have a common place to live instead of being spread all around
the batch request.
@dluksza
Copy link
Author

dluksza commented Feb 4, 2016

I've added UploadMetadata struct as you suggested. Also move "ref" property from the root of upload request object to the "meta" property. The "meta" (and "ref") will be only appended when there is something set.

@technoweenie
Copy link
Contributor

Why keep it specific to uploads? You should be able to pass the same info from a git lfs fetch call, right? If the lfs server is going to look at the ref on uploads to verify authorization, it seems like it should for downloads too. Though I suppose the server could maintain a mapping of oids to remote refs.

@dluksza
Copy link
Author

dluksza commented Feb 5, 2016

I can add this to the fetch as well, but IMO this should be also verified on the server. Relaying only on the client data is not secure, since it is easy to craft malicious payload, through guessing the OID is hard.

One can say that that keeping branch-oid mapping is the implementation detail of server and LFS client should not impose that.

Since we also want to have metadata included in fetch request the
UploadMetadata struct is renamed to TransferMetadata.

Patch that adds metadata to fetch request will follow.
Include meta.ref property in download request object, same as it is in
the upload request.

In case of download meta.ref contain name of currently check out local
branch and can be only a suggestion to server implementation.
Normalize the value of meta.ref parameter of batch request
object. Previously ref parameter format differs between "download" and
"upload" objects.

The "upload" always was using the fully qualified branch
name (eg. refs/heads/master), when the "download" used short
name (eg. master).

Right now meta.ref will have consistent values. When "refs/heads/"
prefix will be appended when it is not present. Special "HEAD" reference
will be replaced with empty string, which will result in meta.ref
parameter not being included in the request object.
@technoweenie technoweenie mentioned this pull request Feb 5, 2016
17 tasks
@technoweenie
Copy link
Contributor

One can say that that keeping branch-oid mapping is the implementation detail of server and LFS client should not impose that.

100% agreed 👍. Also, LFS won't be collecting that info when called through the smudge filter, so it may not be there anyway.

Thanks for the patch. This should make it into v1.2.0: #844

@dluksza
Copy link
Author

dluksza commented Feb 10, 2016

Thanks, that is awesome! 💃

When v1.2.0 is planned to be released?

@technoweenie
Copy link
Contributor

I don't have a strict schedule. #844 has what I'd like to do for that release. Though if it's taking too long, I may decide to cut a release with what I have at the time :)

@sinbad
Copy link
Contributor

sinbad commented Feb 23, 2016

I'm not crazy about how this makes the LFS push invoked by the pre-push hook semantically different to using git lfs push. Seems that a perfectly valid git lfs push would now fail if the server requires ref information, while a pre-push hook invoked version that does the same thing would succeed. Could lead to some odd support edge-cases?

Also the argument against passing refs for fetch was that client-side data can't be trusted, but that applies to pre-push too; I could easily craft my local branch state to include any branches I wanted and the pre-push hook will happily supply that to the server while uploading the LFS objects, even if the actual git push would be rejected, because by nature it happens before any other validation. If it's just to avoid wasting upload time on mistakes then fair enough but I hope no-one interprets it as a security feature.

@technoweenie
Copy link
Contributor

I'm not crazy about how this makes the LFS push invoked by the pre-push hook semantically different to using git lfs push.

Ah yeah, we should be able to send the same data for a git lfs push.

I could easily craft my local branch state to include any branches I wanted and the pre-push hook will happily supply that to the server while uploading the LFS objects, even if the actual git push would be rejected, because by nature it happens before any other validation.

I think the idea is that their git server has special privileges for certain branches. It should authenticate based on the user AND the fetched ref. Though I imagine most LFS servers will ignore it.

@dluksza
Copy link
Author

dluksza commented Feb 23, 2016

I think the idea is that their git server has special privileges for certain branches. It should authenticate based on the user AND the fetched ref. Though I imagine most LFS servers will ignore it.

We are mostly concerned about push operations when we wold like to prevent users from pushing large files "to branches" that they don't have access to. This is a optimization that people won't waste time waiting for the huge blob to upload only just to know that they have access for particular branch.

@sinbad
Copy link
Contributor

sinbad commented Feb 23, 2016

It should authenticate based on the user AND the fetched ref.

Yeah, my point is that the user can basically make up any ref they like, with a bit of local branch fudging & the correct git push call, and the the pre-push hook will just pass it in the request. So it's not really valid as an authorisation test. But @dluksza has confirmed it's really just to avoid them wasting their time with the LFS upload associated with a git push that subsequently gets rejected.

@technoweenie
Copy link
Contributor

@dluksza: Yo, we didn't forget about this :) I really wanted this to work with git lfs push --all, so I had to update the scanner code: #1040. #1071 is a PR on top of that which includes this PR.

It's still kind of a mess now, but should be ready next week :)

@dluksza
Copy link
Author

dluksza commented Mar 14, 2016

Good to know that this change was not forgotten :) Looking forward for this change being merged.

technoweenie added a commit that referenced this pull request Apr 6, 2016
@technoweenie
Copy link
Contributor

@dluksza and @sinbad: would love your feedback on this:

So, this PR has no tests, and I caught a nasty bug in here. Now, testing this is pretty tricky (and I didn't even mention it in this PR), so I won't hold that against you. Here's the first pass of tests:

620c94b

These tests are a little weird. Normally, the test lfs server doesn't even bother checking the ref. So, I added failsMetaCheck() that only fails if both are true:

  • An object OID matching the content meta-test exists. This tells the test server to verify the given ref.
  • An object OID matching the content meta-test: {full-ref} (example: meta-test: refs/heads/master) does not exist.

Nice thing is that these tests caught the fact that the strings.HasPrefix args are swapped. I noticed the tests were failing with:

Fails meta check with: {refs/heads/refs/heads/master}

I fixed it by changing it to strings.HasPrefix(ref, "refs/"), which also allows full tag refs too.

The tests pass on top of this PR, after it's been applied to #1128. I tested all the scenarios I could think of that work. Two that didn't work that I want to follow up on:

  • git lfs push with a tag
  • git lfs fetch with a tag

I'm not positive how we'd fix this, but that can come in another commit. I think we'd just have to attempt to expand ref names (master => refs/heads/master, v1.0 => refs/tags/v1.0) based on the local refs, erroring if there are any ambiguities. Maybe look into branch tracking stuff too.

@sinbad
Copy link
Contributor

sinbad commented Apr 7, 2016

I'm not positive how we'd fix this, but that can come in another commit. I think we'd just have to attempt to expand ref names (master => refs/heads/master, v1.0 => refs/tags/v1.0) based on the local refs, erroring if there are any ambiguities.

I'm still a bit uncomfortable with this feature being somewhat unreliable and locally exploitable. Leaving aside wilful exploitation, it also adds context in some cases but not all (e.g. pushing oids), and the context it sends is somewhat inconsistent currently - when invoked via pre-push it will send the remote branch name (because its passed the destRef), when invoked via git lfs push it will send the local branch name because it only uses the local ref. Sure most people use identical names for local/remote branches, but some don't.

Maybe look into branch tracking stuff too.

This would address that last point, but remember the answer can be different depending on the context (pull vs push). We only have code for the pull case right now because that's all we've needed, we haven't needed to derive a remote branch for push cases before. That's a bit more complex, you need to check the push.default config because it's only the same as pull when that's upstream or simple. It also depends on whether you're pushing to the remote you're tracking too.

I dunno, obviously this feature is useful in specific cases right now, I'm just a bit uncomfortable with things that are only reliable on the 'happy path'. In this case the bad outcome is only that the ref information given to the server might not be correct in odd cases, so I guess so long as we put a big red flag on this to say it's advisory only, can be manipulated by a smart user if they decide to, and may have some rare edge cases. Puts my teeth on edge a little but I can live with it ;)

@dluksza
Copy link
Author

dluksza commented Apr 7, 2016

+1 for expanded refs

For me it is all client facing REST-like API that one can try to exploit. The lfs-server URL and protocol are know, one can easily craft upload and download request with wrong oid and size. Same goes for the meta-data.

Server implementors must be aware of that. But wouldn't say that meta-data are less trusted in the worse case one would upload blob's and git data would be rejected (same as it is right now).

This is just a simple optimization for the end users so that they don't need to wait for the blob's being uploaded and then got access deny from the git server.

@technoweenie
Copy link
Contributor

This PR is sorely in need of an update. I'll push an updated PR that fixes the merge conflicts. I have a few items I'd like to see before shipping in v1.4 or v1.5:

  • Ensure we're always sending full server side refs: refs/tags/v1.0, refs/heads/master. Never client ref names.

  • Integration tests that test git lfs push, git lfs fetch on repos where the local and remote branches don't match.

  • Allow additional properties in the json schema :)

    "meta": {
      "type": "object",
      "properties": {
        "ref": {
          "type": "string"
        }
      },
      "required": [],
      "additionalProperties": true
    }

@technoweenie technoweenie mentioned this pull request Aug 16, 2016
2 tasks
@technoweenie
Copy link
Contributor

Closing this, it's too far behind to merge. I've got an updated version though: #1456

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants