spec: support for passing client image name for mirroring use case #12

Open
dmcgowan opened this issue Apr 16, 2018 · 15 comments · May be fixed by #66

@dmcgowan
Member

There is an assumption today that a server implementation of the distribution specification will either not care about the name used by the client or that all requests will have a known common namespace. An example of this is the Docker Hub assuming that all requests are prefixed with docker.io even though the registry hostname is registry-1.docker.io. However, this has always caused difficulty when a client wants to mirror content: localhost:5000/library/ubuntu could proxy to registry-1.docker.io/library/ubuntu, but localhost:5000 could never proxy to anything else. Complicated registry configurations have been proposed to remedy this, as well as a backwards incompatible approach of requesting as localhost:5000/docker.io/library/ubuntu. However, a goal of this specification should be simplicity and backwards compatibility. I believe that a solution does belong in the specification to unlock the mirroring use cases without complicated configuration or DNS setup.

My proposal is to add a way to pass the name resolved by the client (e.g. docker.io/library/ubuntu) up to the registry. So if a request is going to localhost:5000/library/ubuntu, the registry could mirror both docker.io/library/ubuntu and quay.io/library/ubuntu and switch based on request parameters. There are 2 possible ways to achieve this: one is adding an HTTP request header (e.g. OCI-REF-NAME: docker.io/library/ubuntu), the other is adding a query parameter (e.g. ?oci-ref-name=docker.io/library/ubuntu). The first is clean but the second may be more useful for static mirroring. I am not suggesting one over the other yet, just stating the problem and solutions to discuss.
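
As a rough illustration of the two options in the comment above, the sketch below builds the same mirror request both ways. The header name `OCI-REF-NAME` and the parameter name `oci-ref-name` are taken from the examples in the comment; the function name `build_mirror_request` and the manifest URL layout are assumptions made only for this sketch.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Names follow the examples in the comment above; neither is part of
# any published specification.
REF_HEADER = "OCI-REF-NAME"
REF_PARAM = "oci-ref-name"


def build_mirror_request(mirror_url, ref_name, use_header=True):
    """Return (url, headers) for a request sent to a mirror.

    ref_name is the name as resolved by the client,
    e.g. "docker.io/library/ubuntu".
    """
    if use_header:
        # Option 1: the URL is unchanged, the name travels in a header.
        return mirror_url, {REF_HEADER: ref_name}

    # Option 2: append the name as a query parameter, keeping any
    # query string that is already present.
    parts = urlsplit(mirror_url)
    extra = urlencode({REF_PARAM: ref_name}, safe="/")
    query = f"{parts.query}&{extra}" if parts.query else extra
    url = urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
    return url, {}
```

With the header option the request URL is identical for every upstream, while the query parameter option makes the upstream visible in the URL itself, which is what makes it usable with static mirrors.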

@wking
Contributor

wking commented Apr 16, 2018 via email

@dmcgowan
Member Author

@wking the number of components has no relevance here. The specification does not define anything about the path components. The backwards incompatibility comes from existing clients and servers. If a client is upgraded and now starts requesting localhost:5000/docker.io/library/ubuntu, the registry would have to be configured to treat docker.io the same as previous requests it had seen. If it was an older registry, then it would just not understand the request, forcing the client to resend the request without docker.io. This sort of feature probing is a huge pain to implement for clients and this kind of configuration is really messy on the server. Headers or query parameters can be safely ignored by older registries or omitted by older clients.
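
To make the feature-probing cost concrete, here is a minimal sketch of what a client would have to do under the path-based approach. Everything in it is hypothetical: `send` stands in for an HTTP call that returns a status code, `old_registry` simulates a registry that only knows the old-style path, and the URL layout is assumed for illustration.

```python
def probe_path_based(send, mirror, name, reference="latest"):
    """Try the new-style path first, then fall back to the old-style path.

    send is a hypothetical callable taking a URL and returning an HTTP
    status code. Returns the list of URLs that had to be requested.
    """
    host, _, repo = name.partition("/")
    candidates = [
        f"http://{mirror}/v2/{host}/{repo}/manifests/{reference}",  # new style
        f"http://{mirror}/v2/{repo}/manifests/{reference}",  # old style
    ]
    tried = []
    for url in candidates:
        tried.append(url)
        if send(url) == 200:
            break
    return tried


def old_registry(url):
    """Stand-in for an older registry that only knows library/ubuntu."""
    return 200 if url.endswith("/v2/library/ubuntu/manifests/latest") else 404
```

Against `old_registry`, the client needs two round trips for every image, whereas an extra header or query parameter would be ignored and the first request would succeed.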

@wking
Contributor

wking commented Apr 17, 2018 via email

@xiaods

xiaods commented May 10, 2018

This looks like a mirror-proxy function; in my mind it is not in the scope of the spec.

@dmcgowan dmcgowan added this to the v1.0 RC1 milestone Jan 16, 2019
@atlaskerr
Contributor

There hasn't been much talk about this issue. Is this something we want to put on the agenda for Wednesday's call or can we push this to a later release?

@dmcgowan
Member Author

I am going to open up a PR for it this week. We can discuss the design further there. I think this is important to properly implement the mirroring use case in a less opinionated manner (currently a mirror can only mirror a single upstream registry).

@atlaskerr
Contributor

atlaskerr commented Jan 28, 2019

How about implementing a /v2/mirror/<repo> or /v2/<repo>/mirror endpoint and have the client use the Host header to let the registry know where to pull from?

@dmcgowan
Member Author

Mirrors should be mostly transparent to the client, kind of like setting an HTTP proxy. Also the issue with the current situation is the repository name used by registries does not contain the host name which could lead to namespace collision in the registry implementation in the mirroring case. Using the HOST header in this case would not give enough indication of what the upstream HOST is, only the HOST for the mirror. HTTP proxying already covers the case where HOST does not need to point to the mirror, but this doesn't solve the case of having a single proxy/cache that can be used for multiple upstream registries.
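
The namespace collision mentioned above can be sketched with a toy store. This is only an illustration under assumed names (`MirrorStore`, `put`, `get`), not how any existing registry stores content: keying on the repository name alone would make docker.io/library/ubuntu and quay.io/library/ubuntu overwrite each other, so the upstream host is made part of the key.

```python
class MirrorStore:
    """Toy manifest store keyed by (authority, repository, reference)."""

    def __init__(self):
        self._manifests = {}

    def put(self, authority, repo, reference, manifest):
        # Including the upstream host in the key keeps identical
        # repository names from different upstreams apart.
        self._manifests[(authority, repo, reference)] = manifest

    def get(self, authority, repo, reference):
        # Returns None when nothing is stored for that upstream.
        return self._manifests.get((authority, repo, reference))
```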

@dmcgowan
Member Author

I am working on a PR for this now. I will add a section under Use Cases that describes how it is used; please comment on the design. The PR will update the individual requests.

Mirroring and Proxy Caching

Company X sets up an internal registry which is capable of storing local copies of images from any upstream registry.
Registry clients are configured to send all requests to retrieve registry data to the internal registry.
Each client attaches the OCI-Repository-Authority HTTP header to every registry request, indicating the original registry host name.
The original registry host name is the authority for the given repository and used by the internal registry to fetch content and authentication parameters.
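
A minimal sketch of how an internal registry might act on the steps above, assuming the header value is a bare host name. The function name `upstream_manifest_url`, the `default_authority` fallback, and the `https://<authority>/v2/...` URL layout are assumptions for illustration; how an authority maps to an actual upstream endpoint is left to the registry implementation.

```python
def upstream_manifest_url(headers, repo, reference, default_authority=None):
    """Pick the upstream for a manifest request based on the client's header.

    headers is a plain dict of request headers. HTTP header names are
    case-insensitive, so the lookup is done on lowercased names.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    authority = lowered.get("oci-repository-authority", default_authority)
    if authority is None:
        # An older client sent no header and no default upstream is configured.
        raise ValueError("no authority header and no default upstream")
    return f"https://{authority}/v2/{repo}/manifests/{reference}"
```

Older clients that omit the header fall through to `default_authority`, which mirrors the single-upstream behaviour that exists today.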

@atlaskerr
Contributor

I think X-Proxy-Registry or OCI-Proxy-Registry is cleaner. When I think of authority, I think TLS certificates :P

@atlaskerr
Contributor

Also, is it possible for clients to use separate creds for the local and authority?

@dmcgowan
Member Author

Using the term "authority" here because a proxy is really required to delegate authority over content and access to that content to somewhere else. Whether it does that delegation by proxying is an implementation detail by the registry, same as how it constructs any proxy requests.

One thing to consider though is the use of an HTTP header vs a query parameter. A query parameter gives better cacheability in cases where there could be an even less sophisticated HTTP cache in between. A query parameter would prevent identical requests returning different content based solely on a non-standard HTTP header. In that case we would have something like /v2/dmcgowan/myrepo/manifests/latest?authority=docker.io as the path. This is slightly more visible to registries that do not implement it; however, it may be a better solution. Note this query parameter would only show up when the client knows it is going through a mirror, because the HTTP Host header does not match the intended authority.
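
The rule in the last sentence above, sketched as a client-side helper. The parameter name `authority` comes from the example path in the comment; the function name `manifest_url`, the `https` scheme, and the host `mirror.example.com` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def manifest_url(request_host, authority, repo, reference):
    """Build a manifest URL, adding ?authority= only when going through a mirror."""
    path = f"/v2/{repo}/manifests/{reference}"
    if request_host != authority:
        # The request is not going to the authority itself, so tell the
        # mirror which upstream the repository name belongs to.
        path += "?" + urlencode({"authority": authority})
    return f"https://{request_host}{path}"
```

A direct request such as `manifest_url("quay.io", "quay.io", "dmcgowan/myrepo", "latest")` carries no query string, so registries that are contacted directly never see the parameter.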

@vbatts vbatts modified the milestones: v1.0.0-rc.0, v1.0.0-rc1 Feb 14, 2019
@justincormack

Yes I think a query parameter is better, otherwise for caching you need to set vary-by on a nonstandard header.
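
A simplified model of the caching concern, assuming an intermediate HTTP cache that keys entries on the URL plus any request headers listed in the response's Vary header. The function `cache_key` is hypothetical, and header lookup here is case-sensitive to keep the sketch short.

```python
def cache_key(url, headers, vary=()):
    """Key an intermediate cache would use for a request.

    vary lists the request header names from the response's Vary header.
    """
    return (url,) + tuple(headers.get(name, "") for name in vary)
```

With the header approach, two requests for different upstreams share a URL and collide unless the cache varies on the non-standard header; with a query parameter the URLs already differ, so no Vary handling is needed.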

@dmcgowan
Member Author

@justincormack my plan here is to PoC it in containerd, then open a PR for the spec here. I am not sure we have used a consistent naming scheme for what we call this; in containerd we usually call this part the namespace. Sometimes it is referred to as domain, registry, or host.

@stevvooe
Contributor

I think this is a good approach and the first step in separating registry location from the "authority". The eventual goal should be to encode the authority in the image name, but this will allow for cases where it is not.

Do registries currently ignore this parameter?
