This repository has been archived by the owner on Sep 12, 2018. It is now read-only.

NG: Image Federation #662

Closed
aweiteka opened this issue Oct 30, 2014 · 10 comments

@aweiteka

What is image layer federation?

Image federation is where dependent image layers are served from different servers. For example, an ISV builds on a Red Hat base image. The ISV layers are served from cdn.isv.com and the Red Hat layers are served from cdn.redhat.com.

The content-addressable v2 image format and registry make this an ideal time to consider this model.

Why is it important? Who cares?

Many companies need to host their own bits. It's their control point, and an important legal and provenance issue for them.

How does it work?

A simple example:

$ docker pull isv/app
bef54b8f8a2f <- served from cdn.redhat.com
8da983e1fdd5 <- served from cdn.isv.com

What might an implementation look like?

Support pushing of metadata only. This assumes the image has landed on the CDN by other means.

$ docker push rhel7 --redirect-url https://cdn.redhat.com/registry/images/
bef54b8f8a2f <- pushing metadata only 

When an image based on the above is pushed, the upload of the base layer is skipped.

$ docker push isv/app
bef54b8f8a2f <- skipped, metadata already uploaded
8da983e1fdd5 <- layer pushed to registry
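
As a sketch, the registry side of such a push might record something like the following (the handler shape and field names are assumptions, loosely modeled on the Crane metadata shown further below):

# Hypothetical registry-side handling of the proposed metadata-only push
# (illustrative sketch; field names follow the Crane/Pulp files below).
def handle_push(repo_id, image_ids, redirect_url=None):
    record = {
        "repo-registry-id": repo_id,
        "images": [{"id": image_id} for image_id in image_ids],
        "version": 1,
    }
    if redirect_url is not None:
        # --redirect-url given: store metadata only and point layer
        # requests at the CDN; the bits are assumed to already be there.
        record["type"] = "pulp-docker-redirect"
        record["url"] = redirect_url
    # Layer upload is skipped whenever a redirect URL is on record.
    return record, redirect_url is not None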

Example Implementation

This has been implemented in Crane, a component of Pulp. Red Hat uses this as its production registry. Crane is a read-only implementation of the docker registry protocol. Registry metadata (JSON) is created by the Pulp server.

Crane serves calls to /v1/repositories/<namespace>/<repository>/images|tags directly and then redirects (302) any calls to /v1/images/<image_id>/*.
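
As a rough sketch, that dispatch could look like this (a Flask-style toy, not Crane's actual code; the in-memory metadata below is an assumption standing in for the Pulp-generated files):

# Toy Crane-style dispatcher: repository metadata is served directly,
# image/layer requests are 302-redirected to the owning CDN.
from flask import Flask, abort, jsonify, redirect

app = Flask(__name__)

# Stand-in for the Pulp-generated redirect files (see examples below).
REPOS = {
    "rhel7": {
        "images": {"bef54b8f8a2f"},
        "tags": {"latest": "bef54b8f8a2f"},
        "url": "https://cdn.redhat.com/images/registry/",
    },
    "isv/app": {
        "images": {"8da983e1fdd5"},
        "tags": {"latest": "8da983e1fdd5"},
        "url": "https://cdn.isv.example.com/images/registry/",
    },
}

@app.route("/v1/repositories/<path:repo>/tags")
def tags(repo):
    # Served directly from registry metadata.
    if repo not in REPOS:
        abort(404)
    return jsonify(REPOS[repo]["tags"])

@app.route("/v1/images/<image_id>/<resource>")
def image(image_id, resource):
    # Redirect (302) to whichever CDN owns this layer.
    for meta in REPOS.values():
        if image_id in meta["images"]:
            return redirect(meta["url"] + image_id + "/" + resource, code=302)
    abort(404)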

In the following example note the two URL values.

Red Hat base image

{
    "images": [
        {
            "id": "bef54b8f8a2fdd221734f1da404d4c0a7d07ee9169b1443a338ab54236c8c91a"
        }
    ],
    "protected": true,
    "repo-registry-id": "rhel7",
    "repository": "redhat-rhel7",
    "tags": {
        "0-23": "bef54b8f8a2fdd221734f1da404d4c0a7d07ee9169b1443a338ab54236c8c91a",
        "latest": "bef54b8f8a2fdd221734f1da404d4c0a7d07ee9169b1443a338ab54236c8c91a"
    },
    "type": "pulp-docker-redirect",
    "url": "https://cdn.redhat.com/images/registry/",
    "version": 1
}

A child ISV image file redirects to another URL.

ISV image

{
    "images": [
        {
            "id": "8da983e1fdd58a2fdd221734f1da404d4c0a7d07ee9169b1443a338b8f8a2fdd"
        }
    ],
    "protected": true,
    "repo-registry-id": "isv/app",
    "repository": "isv-app",
    "tags": {
        "latest": "8da983e1fdd58a2fdd221734f1da404d4c0a7d07ee9169b1443a338b8f8a2fdd"
    },
    "type": "pulp-docker-redirect",
    "url": "https://cdn.isv.example.com/images/registry/",
    "version": 1
}

@wking
Contributor

wking commented Oct 30, 2014

On Thu, Oct 30, 2014 at 05:59:11AM -0700, Aaron Weitekamp wrote:

> $ docker push rhel7 --redirect-url https://cdn.redhat.com/registry/images/
> bef54b8f8a2f <- pushing metadata only
>
> $ docker push isv/app
> bef54b8f8a2f <- skipped, metadata already uploaded
> 8da983e1fdd5 <- layer pushed to registry

So docker pushes to the push-time registry by default, but skips any
layers for which --redirect-url has added a URL to a different
repository? When does the cdn.isv.… URL get injected into the
8da983e1fdd5 metadata? And you'll have to re-sign after each metadata
change with the proposed signature framework (moby/moby#8093).

Personally, I'd rather keep this out of the image metadata. Instead,
how about a check-down list of registries for pulling, which you set
up in your ~/.dockercfg? Then for pushing you have:

$ docker push rhel7 https://cdn.redhat.com/
$ docker push rhel7..isv/app https://cdn.isv.example.com/

to only push layers in isv/app but not rhel7 to cdn.isv.example.com.

@dmp42
Contributor

dmp42 commented Oct 30, 2014

Thanks a lot @aweiteka

A couple of questions, and some info about what's going on with v2:

> Many companies need to host their own bits. It's their control point, and an important legal and provenance issue for them.

Does image signing (coming with v2) change that situation for them?
My point being that image signing allows bits to be served from (any) untrusted source while still ensuring the bits are untampered.

Also, if I understand correctly, crane does a 302 to where the actual bits are. So the company (content owner) has to trust the ISV's registry (/crane) to do the right thing here, which to me kind of weakens the "control point": I'm not sure I see a difference then between 302, proxy-pass, and mirroring, from a control standpoint.

About the v2 protocol: it's quite likely that layer URLs are going to be namespaced, e.g.:

image:
/v2/manifest/redhat/rhel7/latest
/v2/manifest/isv/foo/latest

layers:
/v2/blob/redhat/rhel7/A_RHEL_LAYER_ID
/v2/blob/isv/foo/A_RHEL_LAYER_ID
/v2/blob/isv/foo/A_FOO_LAYER_ID

The reason for that change is simpler access control (the flat namespace for layers that we currently have doesn't work well).

That doesn't mean content is actually duplicated on the registry, but that inside the registry mechanics there are "mount points" for layers into namespaced URLs.
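
A toy sketch of what such mount points might look like inside the registry (names illustrative, not the actual v2 implementation):

# One content-addressed copy of each layer...
blobs = {
    "A_RHEL_LAYER_ID": b"...layer bytes...",
    "A_FOO_LAYER_ID": b"...layer bytes...",
}

# ...linked ("mounted") into any number of namespaced URLs.
mounts = {
    ("redhat/rhel7", "A_RHEL_LAYER_ID"),
    ("isv/foo", "A_RHEL_LAYER_ID"),  # shared base layer, not duplicated
    ("isv/foo", "A_FOO_LAYER_ID"),
}

def get_blob(namespace, layer_id):
    # Serve /v2/blob/<namespace>/<layer_id> only via a mount point.
    if (namespace, layer_id) not in mounts:
        raise KeyError("layer not mounted in this namespace")
    return blobs[layer_id]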

Right now, the way I see it, it should be pretty easy to:

  • ensure docker engine behaves correctly regarding 302
  • have some mechanism inside the registry to redirect specific layers

... the part where I'm not that confident is the engine bits for selectively instructing a registry that a given layer is to be found "elsewhere". Actually, I'm wondering if this is an engine decision to make at all.

@wking
Contributor

wking commented Oct 30, 2014

On Thu, Oct 30, 2014 at 11:15:00AM -0700, Olivier Gambier wrote:

> Also, if I understand correctly, crane does a 302 to where the actual
> bits are. So the company (content owner) has to trust the ISV's
> registry (/crane) to do the right thing here, which to me kind of
> weakens the "control point": I'm not sure I see a difference then
> between 302, proxy-pass, and mirroring, from a control standpoint.

But the ISV can put their own auth in front of their registry. That
means they can use their own auth there, but leave access to the Red
Hat images up to the folks running the Red Hat registry. If you used
proxy-pass from Red Hat, clients would have to give Red Hat their ISV
credentials. You could use proxy-pass from the ISV (assuming the ISV
has read-access to the Red Hat repository), but then the ISV has to
handle the extra load of distributing the Red Hat images.

> About the v2 protocol: it's quite likely that layer URLs are going
> to be namespaced, e.g.:
>
> image:
> /v2/manifest/redhat/rhel7/latest
> /v2/manifest/isv/foo/latest
>
> layers:
> /v2/blob/redhat/rhel7/A_RHEL_LAYER_ID
> /v2/blob/isv/foo/A_RHEL_LAYER_ID
> /v2/blob/isv/foo/A_FOO_LAYER_ID
>
> The reason for that change is simpler access control (the flat
> namespace for layers that we currently have doesn't work well).

Hmm. GitHub seems to do fairly well with a flat namespace. Is the
goal to make this easy to use with Basic Auth and a few reverse-proxy
configs? I think it would be easy for a more intelligent auth service
to say “ok, you've authenticated as ‘alice’, and that user does have
read (or write) access to the ‘rhel7’ namespace.” You'd just have to
store (user, namespace(/repository), permission) information in a
database for the auth service. That seems both simpler and more
flexible than trying to sort all the repositories into a single
hierarchy.

> That doesn't mean content is actually duplicated on the registry,
> but that inside the registry mechanics there are "mount points" for
> layers into namespaced URLs.

Ah, maybe you intend to have multiple hierarchies? That would be as
flexible as an auth service mapping users to namespace/repository
permissions, but I don't see a point to exposing it outside of the
auth service.

> • have some mechanism inside the registry to redirect specific layers

Not inside the client to check several repositories for a given layer?

> ... the part where I'm not that confident is the engine bits for
> selectively instructing a registry that a given layer is to be found
> "elsewhere". Actually, I'm wondering if this is an engine decision
> to make at all.

I don't see why image hosting would be anyone's decision to make.
If Alice has an image, she should be able to host it wherever she
likes (modulo copyright/licensing/…), without needing to inform
another registry about her choice. And if Bob wants access to Alice's
images (modulo Alice's auth service), he should be able to have his
client check her registry (both implicitly via a registry preference
list, and explicitly via a command line option).

@dmp42
Contributor

dmp42 commented Oct 30, 2014

> But the ISV can put their own auth in front of their registry. That
> means they can use their own auth there, but leave access to the Red
> Hat images up to the folks running the Red Hat registry.

Do you suggest there may be several layers of authentication (authenticate -> 302 -> authenticate again)?

Either way, the point still stands: is there actually a strong "control-point" in trusting a downstream registry to do a redirect?

> If you used proxy-pass from Red Hat, clients would have to give Red
> Hat their ISV credentials.

I don't think we are talking about anything requiring authentication, but rather publicly available content.
Otherwise, you might end up requiring multiple different authentications for one image (e.g. one for each layer), and that doesn't sound reasonable.

> You could use proxy-pass from the ISV (assuming the ISV has read-access to the Red Hat repository), but then the ISV has to handle the extra load of distributing the Red Hat images.

With caching, that's exactly the kind of stuff people want.

> Hmm. GitHub seems to do fairly well with a flat namespace. Is the goal to make this easy to use with [...]

This is exactly what I am saying. Layers must be accessed under a namespace (foo/myimage), like the manifest itself. Doing otherwise (e.g. like we have now, a flat namespace for layers) is a mess to get right as far as authorization is concerned.

> Ah, maybe you intend to have multiple hierarchies? That would be as flexible as an auth service mapping users to namespace/repository permissions, but I don't see a point to exposing it outside of the auth service.

Here:

v1/layers/SOMEID

means when wking wants to fetch this, the registry/auth has to decide whether wking is entitled to read SOMEID.

And this has to be done for every layer.

v2/blob/foo/bar/SOMEID

means we need to verify that wking has access to foo/bar - and this authorization is the same for all layers.

Whether SOMEID was "authorized" to be made available under foo/bar in the first place is a one-time operation, at push.
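
As a sketch of the difference (the ACL tables here are hypothetical):

# Hypothetical ACL tables; a real registry would back these with a database.
LAYER_ACL = {("wking", "SOMEID")}   # v1: per-layer grants
REPO_ACL = {("wking", "foo/bar")}   # v2: per-namespace grants

def authorize_v1(user, layer_id):
    # Flat namespace: this check is repeated for every layer of an image.
    return (user, layer_id) in LAYER_ACL

def authorize_v2(user, namespace_repo):
    # Namespaced blobs: one decision covers all layers; whether SOMEID
    # may appear under foo/bar was settled once, at push time.
    return (user, namespace_repo) in REPO_ACL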

> Not inside the client to check several repositories for a given layer?

This sounds messy: having to configure your client with multiple registry endpoints in order to be able to pull a single image, ending up in situations where you try to figure out why you are missing some layers.
Client-side configuration doesn't fly.
And cramming registry URLs into the manifest for every layer doesn't sound good either.

> I don't see why image hosting would be anyone's decision to make. If Alice has an image, she should be able to host it wherever she likes (modulo copyright/licensing/…), without needing to inform
> another registry about her choice. And if Bob wants access to Alice's images (modulo Alice's auth service), he should be able to have his client check her registry (both implicitly via a registry preference
> list, and explicitly via a command line option).

You are missing the point.
People will be allowed to push their content wherever they want, and pull things from wherever they want. Still, so far, this is envisioned as a single unit:

docker pull foo/bar --from https://whatever is expected to work as... it sounds... and retrieve all needed content without the need for some acrobatic client-configuration fiddling.

Now the question raised here is how to allow certain registries to delegate the responsibility of delivering specific layers to other registries.

@wking
Contributor

wking commented Oct 30, 2014

On Thu, Oct 30, 2014 at 01:00:19PM -0700, Olivier Gambier wrote:

> > But the ISV can put their own auth in front of their registry.
> > That means they can use their own auth there, but leave access to
> > the Red Hat images up to the folks running the Red Hat registry.
>
> Do you suggest there may be several layers of authentication
> (authenticate -> 302 -> authenticate again)?

I'm suggesting we skip the 302 entirely. The client would (see the sketch after this list):

  1. Check the config. The primary registry is Registry-A.
  2. Ask Registry-A for the image
    • Registry-A says, 401, please auth
  3. Ask Registry-A for the image with the asked-for auth
    • Registry-A says, 404, I don't have that
  4. Check the config. The first fallback registry is Registry-B.
  5. Ask Registry-B for the image
    • Registry-B says, 401, please auth
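
A minimal sketch of that check-down flow, assuming a hypothetical registry list in the client config and v1-style endpoints (all names illustrative):

import requests

# Hypothetical check-down list from ~/.dockercfg (not a real field today).
REGISTRIES = ["https://registry-a.example.com", "https://registry-b.example.com"]

def fetch_image(name, auth_for):
    # auth_for maps a registry URL to (user, password) or None.
    for registry in REGISTRIES:
        url = "%s/v1/repositories/%s/images" % (registry, name)
        resp = requests.get(url)
        if resp.status_code == 401:
            creds = auth_for(registry)
            if creds is None:
                continue  # can't auth here; try the next registry
            resp = requests.get(url, auth=creds)
        if resp.status_code == 200:
            return registry, resp.json()
        # 404 or other failure: fall through to the next registry.
    raise LookupError("image %r not found in any configured registry" % name)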

> Either way, the point still stands: is there actually a strong
> "control-point" in trusting a downstream registry to do a redirect?

I'm not trusting anyone to redirect.

> I don't think we are talking about anything requiring
> authentication, but rather publicly available content. Otherwise,
> you might end up requiring multiple different authentications for
> one image (e.g. one for each layer), and that doesn't sound
> reasonable.

I don't think a few extra auth attempts are going to sink the service.
If you don't want your client doing that, just put the public
registries first in your list (e.g. check Red Hat for an image,
falling back to the ISV's registry). That's less auth, but you'll be
leaking the fact that you're trying to download an image and the id of
that image to Red Hat (instead of leaking it to your ISV when you ask
for an image that is hosted by Red Hat).

> > You could use proxy-pass from the ISV (assuming the ISV has
> > read-access to the Red Hat repository), but then the ISV has to
> > handle the extra load of distributing the Red Hat images.
>
> With caching, that's exactly the kind of stuff people want.

If the ISV wants to handle the extra bandwidth (with my scheme), they
just have to serve a local copy of the image layer. No need for any
fancy redirects ;). If they want to do that via proxy-pass and ask
you for your ISV credentials (so it's transparent to you), then great.
If they can assume the Red Hat layers are world readable, then they
don't need to ask for credentials at all.

> > Hmm. GitHub seems to do fairly well with a flat namespace. Is
> > the goal to make this easy to use with [...]
>
> This is exactly what I am saying. Layers must be accessed under a
> namespace (foo/myimage), like the manifest itself. Doing otherwise
> (e.g. like we have now, a flat namespace for layers) is a mess to
> get right as far as authorization is concerned.

Ah, I see. If we kept [1]:

PUT /v1/images/(image_id)/layer

or some such, then the auth decision would be something like (for a
single PUT request):

  1. User claims to be ‘alice’.
    • Query credential store and authenticate user.
  2. User asks for write access to image layer 8gz4tQt5.
    • Query registry's atomic storage, layer 8gz4tQt5 is part of the
      ancestry path for bob/foo:latest and charlie/bar:1.2.3.
    • Query credential store, ‘alice’ has write access to ‘bob/foo’.
  3. Grant access for the PUT.

With cheap image-id → repository lookup, I think that should be fairly
straightforward. There's no need to get the registry itself serving
8gz4tQt5 under ‘alice/’, ‘bob/’, …. In fact, doing so would basically
just reproduce my proposed image-id → repository lookup (which I'd
store in the registry's atomic storage [2]).
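
A toy sketch of that decision, assuming a hypothetical image-id → repository index kept in the registry's storage:

# Hypothetical indexes (the ancestry map would live in the registry's
# atomic storage under this proposal).
ANCESTRY = {"8gz4tQt5": {"bob/foo", "charlie/bar"}}  # image id -> repos
WRITE_ACL = {("alice", "bob/foo")}                   # (user, repo) grants

def may_put_layer(user, image_id):
    # Grant the PUT if the user can write to any repository whose
    # ancestry contains this layer.
    return any((user, repo) in WRITE_ACL
               for repo in ANCESTRY.get(image_id, ()))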

> > Not inside the client to check several repositories for a given
> > layer?
>
> This sounds messy: having to configure your client with multiple
> registry endpoints in order to be able to pull a single image,
> ending up in situations where you try to figure out why you are
> missing some layers.

Registries that want to support stand-alone usage for their users are
free to host all their ancestor images locally (although maybe they
use proxy-pass behind the scenes to do this).

> And cramming registry URLs into the manifest for every layer doesn't
> sound good either.

This we agree on ;).

> People will be allowed to push their content wherever they want, and
> pull things from wherever they want. Still, so far, this is
> envisioned as a single unit:
>
> docker pull foo/bar --from https://whatever is expected to work
> as... it sounds... and retrieve all needed content without the need
> for some acrobatic client-configuration fiddling.
>
> Now the question raised here is how to allow certain registries to
> delegate the responsibility of delivering specific layers to other
> registries.

They can't do that with 302s, or proxy-pass, or whatever they like?
If the registry wants to cache (outside of the layer metadata) a list
of possible mirror registries for a given image, that sounds great to
me. You could have a set of default mirrors for “Never heard of that
one, you might want to check with…”. Then folks could make their
primary registry whoever they trust the most.

@dmp42
Contributor

dmp42 commented Oct 30, 2014

> I'm suggesting we skip the 302 entirely. The client would: [...]

Simple answer: no. :-)

Blindly trying a bunch of services one after the other for every layer is nonsense.

> I'm not trusting anyone to redirect.

How about we let @aweiteka speak for himself? This is what they do with crane currently, if I understand correctly.

> [the rest]

This discussion is getting largely off-topic.
I would ask you to keep these long chitchat-style exchanges for IRC sessions, and keep tickets focused on what's relevant to them.

All this is just muddying the water here. @aweiteka has a use case, let's try to see it clearly instead of defending your opinionated opinions on things like atomic storages, Mr. @wking :-)

To sum it up:

  • we are NOT going with a complex system where you have to know beforehand where to find the various pieces of a single image (read: a list of registries to try and fail, where layers are scattered, all with possibly different authentications). This idea is simply broken, sorry to be harsh.
  • the image federation idea here needs to be studied since it's being suggested, especially in light of what's coming next (e.g. image signing), but the first thing is to listen to and understand the use case. We can argue about the nitty-gritty-techy afterwards. E.g., "I'm not trusting anyone to redirect." isn't helping :-)

Hang on ;)

@aweiteka
Author

> Does image signing (coming with v2) change that situation for them?

@dmp42 Signing helps, but I am already assuming image signing here. Third-party distribution agreements may exist, but I suspect they present a large legal barrier that would kill partnerships and innovation. My understanding is that leveraging the layered image format removes this barrier. That said, I'm not a lawyer. ;)

> Also, if I understand correctly, crane does a 302 to where the actual bits are. So the company (content owner)
> has to trust the ISV's registry (/crane) to do the right thing here, which to me kind of weakens the
> "control point": I'm not sure I see a difference then between 302, proxy-pass, and mirroring, from a
> control standpoint.

Yes, this is how it works. The content owners control and ensure access to the actual bits. If a third party's redirect service fails, that's the third party's problem. This assumes the content owners also have a redirect service for their direct customers.

> I don't think we are talking about anything requiring authentication, but rather publicly available content.
> Otherwise, you might end up requiring multiple different authentications for one image (e.g. one for
> each layer), and that doesn't sound reasonable.

This may be a separate but related discussion. It does get tricky. We have an X.509 scheme passed to the client that provides access to specific paths on our CDN (Akamai). It would be good to think through and discuss this a bit more.

> About the v2 protocol: it's quite likely that layer URLs are going to be namespaced

@dmp42 You rightly picked up that Crane has a flat namespace. We're assuming world-readable access at the application level and controlling authn/authz using the above-mentioned X.509 scheme. The proposed namespace has a lot of benefits, so I wouldn't want to discourage that. I don't know if you can have it both ways.

I'm not convinced 302 redirects are ideal, but they work and are flexible. Open to other ideas.

> That doesn't mean content is actually duplicated on the registry, but that inside the registry mechanics,
> there are "mount points" for layers into namespaced URLs.

Pulp does the same thing using symlinks on traditional block storage. Copies are cheap and content is never duplicated.

> When does the cdn.isv.… URL get injected into the 8da983e1fdd5 metadata?

@wking I'm assuming URL information is never in image metadata. This is registry metadata only. It doesn't go with the layer.

> have some mechanism inside the registry to redirect specific layers

@dmp42 Right, "some mechanism." I don't have strong opinions on the specific way users manage registry metadata. Ultimately we're talking about a method to sync distributed metadata efficiently, reliably, and securely. Pulse, IPFS, bittorrent, etc. all look interesting. @vbatts may have some thoughts here. That may be for v2.$LATER, but I suggest we start v2.0 with some fundamental support.

@wking
Contributor

wking commented Oct 31, 2014

On Fri, Oct 31, 2014 at 07:25:26AM -0700, Aaron Weitekamp wrote:

> > When does the cdn.isv.… URL get injected into the 8da983e1fdd5 metadata?
>
> @wking I'm assuming URL information is never in image metadata. This
> is registry metadata only. It doesn't go with the layer.

Ah, good, that means it won't conflict with image signing :). When
does the URL get injected into 8da983e1fdd5's registry metadata? Your
example push for isv/app didn't have a --redirect-url flag, so I'm
wondering how that (default?) was set in the JSON.

@aweiteka
Author

> When does the URL get injected into 8da983e1fdd5's registry metadata? Your example push for isv/app
> didn't have a --redirect-url flag, so I'm wondering how that (default?) was set in the JSON.

@wking Good catch. That part of the example suggests the ISV image layers are hosted on Docker Hub. That's a valid use case, but it doesn't match my "pull" example, where ISV layers come from cdn.isv.com.

@stevvooe
Contributor

Superseded by distribution/distribution#88.
