
[Proposal] layers federation #1825

Open
runcom opened this issue Jul 8, 2016 · 37 comments

@runcom (Contributor) commented Jul 8, 2016

As discussed in moby/moby#23014, I'm proposing to add proper layer federation support to the open source Docker registry in order to support this feature more widely across operating systems (currently only Windows makes use of it).

This proposal targets only docker/distribution. We'll iterate on the Docker client in docker/docker after this is finished.

I'm explicitly taking for granted that we're all familiar with #1725.

What changes are required in order to have an MVP?

TODO

1) backward compatibility

We need to ensure that if an older Docker daemon does not support fetching foreign layers, the registry provides a mechanism to download the layer itself and transparently serve it.
A concern here is caching, to avoid retrieving the foreign layer every time (TTLs?), but I don't think that's a strong requirement for a first implementation.
This point should take care of one of the points listed in moby/moby#23014 (comment).

docker pull imagewithforeignlayers
  -> client: get manifest
  -> registry: rewrite the manifest to remove urls
  -> client: start foreign layer blob fetch (as if it were a normal one)
  -> registry: fetch the foreign layer
  -> registry: serve the foreign layer as if it were a normal one
  -> remember somewhere that the layer diffID is a foreign one (?)

docker push imagewithforeignlayers
  -> registry: inspect manifest looking for previously remembered foreign layers
  -> registry: skip foreign layer upload (make this transparent wrt UI)
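The manifest-rewrite step in the pull flow above can be sketched as follows. This is a minimal illustration assuming a schema2-style manifest dict; the helper name is invented for this sketch, though the two media type strings are the real schema2 values. Note that dropping `urls` changes the serialized manifest bytes, and therefore its digest.

```python
# Hypothetical sketch of the registry-side rewrite: strip the "urls"
# field and reset the media type so an older client sees foreign layers
# as ordinary blobs it can request from the registry.
FOREIGN_MEDIA_TYPE = "application/vnd.docker.image.rootfs.foreign.diff.tar.gzip"
LAYER_MEDIA_TYPE = "application/vnd.docker.image.rootfs.diff.tar.gzip"

def rewrite_manifest_for_old_client(manifest: dict) -> dict:
    rewritten = dict(manifest)
    layers = []
    for layer in manifest.get("layers", []):
        layer = dict(layer)  # copy so the stored manifest stays intact
        if layer.get("mediaType") == FOREIGN_MEDIA_TYPE:
            layer.pop("urls", None)          # drop the foreign source URLs
            layer["mediaType"] = LAYER_MEDIA_TYPE
        layers.append(layer)
    rewritten["layers"] = layers
    return rewritten
```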
2) foreign layer unavailability

There should be an expressive way of notifying clients about foreign layer unavailability. The UX I'm thinking about is simple: just show an error saying that the remote blob isn't available.
However, I don't feel strongly about that. Another option would be to transparently tell the client the blob isn't available, as we do today when a blob is missing from layer storage. How does this sound? This way we won't leak this aspect of the manifest.

Also refer to @stevvooe's grid in moby/moby#23014.

3) provide a way to opt-in for such a feature at the registry level (PR #1829)

Since some deployments may need to defer this support, we can have an explicit setting to opt in to the new behavior.

4) foreign source whitelist (PR #1829)

The implementation would provide a way to whitelist foreign layer hosts, letting registry operators select which domains layers are allowed to be fetched from.
The UX will be consistent with this and provide feedback when the requirement isn't met.
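A minimal sketch of what such a whitelist check could look like, assuming the operator configures a plain set of allowed hostnames. The hostnames and option shape here are illustrative only, not the actual PR #1829 configuration:

```python
from urllib.parse import urlparse

# Illustrative allowlist; a real deployment would load this from the
# registry configuration file.
ALLOWED_FOREIGN_HOSTS = {"mcr.example.com", "layers.example.org"}

def foreign_url_allowed(url: str, allowed=ALLOWED_FOREIGN_HOSTS) -> bool:
    """Accept only HTTPS URLs whose hostname is explicitly whitelisted."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in allowed
```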

5) private images

This is a tricky point. I think as long as the manifest remains private, the external layers remain hidden as well. I understand @dmp42's concerns in moby/moby#23014 (comment) but I don't have an immediate solution. Happy to hear what you think.

I believe this is what we can start with; I'll be happy to start playing around with the code.
/cc @RichardScothern @aaronlehmann @dmcgowan @stevvooe @dmp42

What is this proposal not covering for an MVP?

  • foreign layer URL authorization
@runcom (Contributor, Author) commented Jul 8, 2016

/cc @aweiteka @vbatts

@aaronlehmann (Contributor) commented:

docker pull imagewithforeignlayers
  -> client: get manifest
  -> registry: rewrite the manifest to remove urls
  -> client: start foreign layer blob fetch (as if it were a normal one)
  -> registry: fetch the foreign layer
  -> registry: serve the foreign layer as if it were a normal one
  -> remember somewhere that the layer diffID is a foreign one (?)

I'm a little concerned about breaking pull-by-digest use cases by rewriting manifests. It's true that we've done this once before (with the schema1 to schema2 transition), but I hoped we'd never do it again.

Another issue is that I'm not sure serving the foreign layer from the registry will always be acceptable. Some of the use cases for foreign layers involved legal restrictions on where the layer could be served from. I would think that we should limit foreign layers to cases like this, and focus on providing a useful error message to older engines that don't support foreign layers. Perhaps we could detect when an engine requests a foreign layer blob from the registry instead of from the provided foreign source URL, and serve an error telling the user to upgrade. I'm not sure Docker engine surfaces these error messages to the user, though. There may also be some complexity in keeping track of which blobs represent foreign layers.

@runcom (Contributor, Author) commented Jul 9, 2016

@aaronlehmann I agree with your points. I thought there would be some sort of issue with older clients, but I'm OK with just providing a clear error instead of implementing the above, which could be tricky and, yes, would break pull by digest again.
I believe the backward compatibility point can be replaced by letting registry operators opt in to this feature when they're ready. This way their service won't suffer any issues.

@dmp42 (Contributor) commented Jul 11, 2016

@runcom thanks for taking this on.

Couple of notes:

  1. the external layer remain hidden as well

It's hidden as long as the domain + digest / URL is not disclosed. This is a very weak guarantee and has bitten us pretty badly in the past (back before content-addressability, when layer IDs were randomly generated), e.g. these IDs/URLs may appear in logs, in CI system outputs, in... YouTube broadcasts, and other weakly secured places.

Unless there is an ACL mechanism on the foreign source, it should be assumed that foreign layers in an access restricted / "private" image are public.

  2. foreign content availability at push time

I would strongly suggest that the registry verifies on PUSH that any foreign layer in the pushed manifest is actually there (either HEAD-ing it, or actually retrieving it and storing it locally / in a cache).

That would at least prevent broken images from being pushed.
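A sketch of that push-time check, with the HTTP HEAD call injected as a callable so the logic can be exercised without network access. All names here are hypothetical:

```python
# On manifest PUSH, probe every foreign source URL before accepting the
# manifest; reject the push if any source is unreachable.
def verify_foreign_layers(manifest: dict, head) -> list:
    """Return the list of unreachable foreign URLs (empty means OK).

    `head` is a callable taking a URL and returning an HTTP status code,
    e.g. a thin wrapper around an HTTP client's HEAD request.
    """
    broken = []
    for layer in manifest.get("layers", []):
        for url in layer.get("urls", []):
            if head(url) != 200:
                broken.append(url)
    return broken
```

Injecting the HEAD call also leaves room for the timeout/retry policy the SLA concern below calls for.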

  3. backward compat

@aaronlehmann what happens if layer with digest XYZ is listed in the manifest both as a foreign layer, and as a regular layer?

Would the engine fetch the remote layer, then just bypass the regular one? (since they have the same digest)

Wouldn't that provide a backward compatibility path without manifest rewriting (hence not breaking DCT)? (assuming the registry would be able to forward the foreign one to older engines)

@RichardScothern (Contributor) commented:

hi @runcom . Thanks for taking this on.

1) I agree with @aaronlehmann. Breaking pull by digest is very undesirable, but saving some foreign layers to a registry will violate somebody's terms of service.

  4) is definitely required, and we are expecting a PR from the hub team (@nwt) for it. 3) can be rolled into whatever configuration change that brings.

@dmp42 (Contributor) commented Jul 11, 2016

but saving some foreign layers to a registry will violate somebody's terms of service.

Even if it's technically caching and not hosting? (now we need a lawyer :p)

@runcom (Contributor, Author) commented Jul 11, 2016

Even if it's technically caching and not hosting? (now we need a lawyer :p)

ah, that's what I meant by "hosting it while just serving"

@runcom (Contributor, Author) commented Jul 11, 2016

I would strongly suggest that the registry verifies on PUSH that any foreign layer in the pushed manifest is actually there (either HEAD-ing it, or actually retrieving it and storing it locally / in a cache).

That would at least prevent broken images from being pushed.

this is tricky: we don't know anything about external services' uptime/SLAs, so a layer might be unavailable at that moment. Do we stop people from pushing entirely?

@RichardScothern (Contributor) commented:

@dmcgowan and I independently thought about serving clients redirects to the foreign URLs. The Go HTTP client follows them by default, and this functionality is already used to redirect clients to storage locations for blobs (see the storage middleware for more details). This would avoid hosting and lawyers.
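As a rough sketch of the redirect idea, the blob endpoint could answer with a redirect instead of proxying the bytes. This is a pure function for clarity; the names and the choice of 307 are illustrative, not how the registry's handler plumbing actually looks:

```python
# Answer a blob GET: redirect to the foreign source when the digest is
# known to be foreign, otherwise fall back to the usual "blob unknown".
def blob_response(digest: str, foreign_urls: dict):
    """Return an (status, headers, body) triple for a blob request."""
    url = foreign_urls.get(digest)
    if url is not None:
        # Temporary redirect: the client re-issues the GET against the
        # foreign host, so the registry never serves the bytes itself.
        return 307, {"Location": url}, b""
    return 404, {}, b"blob unknown"
```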

@RichardScothern (Contributor) commented:

#1829 (the PR for foreign layer whitelisting) is in as well.

@aaronlehmann (Contributor) commented:

backward compat
@aaronlehmann what happens if layer with digest XYZ is listed in the manifest both as a foreign layer, and as a regular layer?

Would the engine fetch the remote layer, then just bypass the regular one? (since they have the same digest)

Wouldn't that provide a backward compatibility path? (assuming the registry would be able to forward the foreign one to older engines)

It's an interesting idea. The problem is that the list of layers in the manifest is ordered, and the engine verifies each layer in the manifest against the corresponding one in the image configuration. So duplicating a layer in the manifest would break things unless the layer is really used twice in the image (which would probably technically work, but doing this feels like a hack).

@runcom (Contributor, Author) commented Jul 12, 2016

@dmp42 notes about point 5)

I can understand the appeal of letting foreign layers be hosted raw (as in, wget https://host.com/layerdigest.tar returns the layer's data). That said, a better option to support 5) would be an API which foreign layer hosts should implement, on top of which we could build some sort of authentication/authorization; let's start simple, if at all.
Another approach would be for foreign layers to be hosted on docker/registry themselves, leveraging the registry authn/z mechanisms as part of fetching a foreign layer (I mean, foreign layers would themselves be hosted on a docker/distribution registry, or anything implementing the API).

@sergeyfd (Contributor) commented:

Why does the registry have to be responsible for serving foreign layers, and not just the manifests that include them? IMHO the client should be responsible for retrieving those layers. In that case you don't need to worry about authentication, TTLs, caching, and many other things.

@dmp42 (Contributor) commented Jul 28, 2016

@runcom interesting - though that would then require the user to log in multiple times during the pull (possibly even to know beforehand which foreign sources are required, if you want to pull non-interactively):

  • docker login A
  • docker login B
  • manifest on registry A
  • layer 1 on registry A
  • layer 2 on registry A
  • layer 3 on registry B

The user experience will likely not be great.

I tend to think private images (from registry A) should forbid foreign layers (up to the registry implementation to inspect the manifest and prevent making it private).

@sergeyfd see above: from a registry operator perspective, a private image can no longer be guaranteed to be private if it contains foreign layers and/or the UX will be pretty crappy.

TTL & caching is likely irrelevant for that part of the discussion.

@runcom an alternative to all this would be to sign the foreign layer url like we do for CloudFront/Fastly/delegated delivery <- @aaronlehmann wdyt?

@aaronlehmann (Contributor) commented:

@runcom an alternative to all this would be to sign the foreign layer url like we do for CloudFront/Fastly/delegated delivery <- @aaronlehmann wdyt?

It's an interesting thought. I guess the question we should ask is what we want to require on the part of the server that hosts the foreign layer. If it needs to verify signatures in URLs, we'd be pushing a lot of logic and requirements to that host. I guess they would have to run software we provide to do this, or implement a custom version. I don't know whether that's realistic or not.

@aaronlehmann (Contributor) commented:

If the signed URL had the signature in a query string (as I believe CloudFront signed URLs do), supporting it would be optional. For hosting a foreign layer that doesn't require authentication, you could use a standard HTTP server that would just ignore the query string. But for a setup where authentication matters, you would have something that checks the signature.

So one thing that's really nice about the signed URL idea is that it's unobtrusive for setups where auth is not involved.
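A minimal sketch of such query-string signing, assuming HMAC-SHA256 over the URL path. The actual CloudFront scheme differs; this only illustrates the "optional verification" property: a plain HTTP server ignores the query string, while an auth-aware host checks it.

```python
import hashlib
import hmac
from urllib.parse import urlencode, urlparse, parse_qs

def sign_url(url: str, key: bytes) -> str:
    """Append an HMAC signature of the URL path as a query parameter."""
    path = urlparse(url).path
    sig = hmac.new(key, path.encode(), hashlib.sha256).hexdigest()
    return url + "?" + urlencode({"sig": sig})

def verify_url(signed: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    parsed = urlparse(signed)
    sig = parse_qs(parsed.query).get("sig", [""])[0]
    expected = hmac.new(key, parsed.path.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

A production scheme would also sign an expiry timestamp, which is exactly the "content eventually becomes unreachable" objection raised below.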

@dmcgowan (Collaborator) commented:

@dmp42 the multiple login scenario would require significant changes to the way docker pull works today. Currently the pull process assumes interaction with a single registry and all authorization is determined through that registry. If a layer needed to be pulled from another registry with its own credentials, the client would be forced to send up all known registry credentials to be available to the daemon (docker build does this and it is cringeworthy). To keep the flow change less intrusive, it is probably best to limit the foreign layers to publicly accessible layers or signed URLs.

@stevvooe (Collaborator) commented Aug 1, 2016

I'm worried this discussion is getting mired in authorization support. The urls field was never meant to be hidden behind authorization and adding support for it isn't in the spirit of the feature. It is meant for resources that are available but must come from a particular source.

The main issue with signed urls is that the content will at some point become unreachable, which breaks the whole idea.

We could have an authorization callback for urls, but this is somewhat the user experience we wanted to avoid with manifest urls.

@stevvooe (Collaborator) commented Sep 9, 2016

@runcom Any progress on burning down the open items on this?

@runcom (Contributor, Author) commented Sep 9, 2016

  1. backward compatibility

On the docker/distribution side, this has been taken care of by having the opt-in feature on the registry side. That means the only thing remaining for this one is to provide an error message from the registry (bubbled up to the Docker CLI) which informs clients to upgrade because their version doesn't support foreign layers.
Other possibilities require us to break things like pull-by-digest, which is a no-go (to me as well), or to cache and serve the foreign layer from the registry itself, which may break legal terms (since the spirit of these foreign layers is to be served from external sites only).
So, if you're OK with the outlined solution, I can go ahead and create a patch for docker/distribution.

  2. foreign layer unavailability

Based on moby/moby#23014 (comment) we would provide appropriate error messages. So there's no option to have a cache of layers either.

  5. private images

It seems this is going out of scope as per #1825 (comment) and shouldn't be an issue?

@stevvooe (Collaborator) commented Sep 9, 2016

@runcom Let's go ahead with the "good error message approach".

Could you add a checklist to the issue description with the PRs that are left to do? I really want to get these items closed out so we can get this in.

@runcom (Contributor, Author) commented Sep 9, 2016

@stevvooe great, I guess 5) isn't an issue anymore? Though, what about the Docker Hub?

@stevvooe (Collaborator) commented Sep 9, 2016

  5) isn't an issue anymore?

We do need to ensure that the media type is preserved when pushed to a private registry.

Though, what about the Docker Hub?

Docker Hub will have ACLs that will only allow urls from certain providers vetted to ensure that they will keep the URL up. docker/distribution has support for this, as well. We really want to avoid having broken images getting pushed around everywhere.

@runcom (Contributor, Author) commented Sep 10, 2016

@stevvooe added a checklist with the remaining work to be done.

I'll go ahead and tackle 1), though I'm not sure what your plan is there.

It seems to me that, to provide a useful error to old clients (or, for that matter, API users in general), we'll need some kind of option which says "give me the manifest with foreign blobs", or otherwise fail with a useful error, and another to tell the API not to care about serving manifests with foreign blobs.
I haven't thought of other possibilities; this seems rather ugly to me, but if you have something else in mind please share it.
IOW:

GET /v2/<name>/manifests/<reference> # name/reference contains foreign blobs
# FAIL: saying the manifest contains foreign urls
GET /v2/<name>/manifests/<reference>?foreign=true # override and get the manifest
# SUCCESS: the query string is completely arbitrary, could be a header

It's not the best API out there, but it ensures manifests with foreign blobs are fetched only when the user/implementer knows about them, and since it's a new feature anyone can opt in by adding an option (older Docker clients will fail with that useful error, newer ones will work if supported, and API users can easily adjust their code accordingly).

@stevvooe (Collaborator) commented:

@runcom I am not sure this is worth an API change but a header declaring support isn't out of the range of possibility.

I think we can probably return a specific error message for layers that are marked as foreign for a given repository. We can do this without a specification change.

That leaves two options here (reiterating both of the suggestions):

  1. Have newer clients declare support for foreign layers when interacting with a registry. Serve up an error message when the client doesn't have this header on manifest fetch.
  2. Mark layers as foreign (if only we had a media type for this!) and serve up an error message when clients try to fetch them.

@runcom (Contributor, Author) commented Sep 13, 2016

Seems to me that point 2 requires remembering somewhere which digests are foreign (since we don't actually have a tarball), probably at push time; I'm not sure it's good to also maintain a database of foreign digests. I prefer point 1, though I get that it requires an API change, which could be unwanted.

@stevvooe (Collaborator) commented:

@runcom We already keep track of media types in the registry.

@runcom (Contributor, Author) commented Sep 13, 2016

@stevvooe alright, I didn't know that, so the question is would you fail on layer or manifest retrieval?

@stevvooe (Collaborator) commented:

@stevvooe alright, I didn't know that, so the question is would you fail on layer or manifest retrieval?

If we go with the marked layer approach, we'd fail on layer retrieval. I'm not sure what this would look like on the older clients.

@runcom (Contributor, Author) commented Sep 13, 2016

If we go with the marked layer approach, we'd fail on layer retrieval.

I get that.

I'm not sure what this would look like on the older clients.

and because of the above I would still prefer to fail on manifest retrieval, or leave it to anyone on the team to decide.

@runcom (Contributor, Author) commented Sep 24, 2016

and because of the above I would still prefer to fail on manifest retrieval, or leave it to anyone on the team to decide.

@stevvooe @aaronlehmann any decision on this? I'd like to start working on it.

@stevvooe (Collaborator) commented:

@runcom Marking layers as foreign is more in line with the design of the registry and doesn't require protocol changes. Specifically, layers that are uploaded with the foreign media type would fall into this category.

@runcom (Contributor, Author) commented Oct 20, 2016

Specifically, layers that are uploaded with the foreign media type would fall into this category.

I'm sorry, one of us is confused here... layers with a foreign media type can't be uploaded. How is this supposed to work otherwise? You can't mark a layer foreign on a registry at upload time, because nobody uploads that layer to the registry.

On manifest upload we can inspect it and, for every foreign layer, create a fake blob which just stores a descriptor, so when people ask for it we can match descriptor.MediaType and return an error if it's foreign (maybe we should always store a descriptor along with a blob).
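A sketch of that record-and-match flow, with an in-memory dict standing in for the registry's per-repository metadata store. The foreign media type string is the real schema2 value; every other name here is invented for illustration:

```python
# At manifest push, record a descriptor for every layer; at blob fetch,
# match on the stored media type and fail with a descriptive error for
# foreign layers instead of a bare "blob unknown".
FOREIGN_MEDIA_TYPE = "application/vnd.docker.image.rootfs.foreign.diff.tar.gzip"

descriptors = {}  # digest -> descriptor; stands in for per-repo metadata

def record_manifest(manifest: dict):
    for layer in manifest.get("layers", []):
        descriptors[layer["digest"]] = {
            "mediaType": layer["mediaType"],
            "urls": layer.get("urls", []),
        }

def get_blob(digest: str) -> bytes:
    desc = descriptors.get(digest)
    if desc is None:
        raise KeyError("blob unknown")
    if desc["mediaType"] == FOREIGN_MEDIA_TYPE:
        raise ValueError("layer %s is foreign; fetch it from %s "
                         "(or upgrade your client)" % (digest, desc["urls"]))
    return b"<blob bytes>"  # placeholder for real blob storage
```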

@stevvooe (Collaborator) commented:

@runcom Yes, a layer would be marked as foreign by a manifest. This would have to be per repository, which is pretty much how we manage it now. I am not sure that we need any registry changes to adopt the original PR if we understand the UX issues.

@runcom (Contributor, Author) commented Oct 20, 2016

@stevvooe so docker/distribution doesn't require any changes? And we just need to handle the errors in Docker? A fetch of a foreign layer right now just returns "blob unknown"; if we're OK with this, I'll reopen the Docker PR and we can move forward with that.

@stevvooe (Collaborator) commented:

@runcom We mostly need a plan that covers all the failure modes for new and old clients. Let's re-open the docker PR and get that into the best possible state.

@stevvooe (Collaborator) commented:

@runcom Make sure that the proposal meets the criteria set out in moby/moby#23014 (comment) before re-opening. Sorry if that wasn't clear above.
