This repository has been archived by the owner on Sep 12, 2018. It is now read-only.

Registry next generation #612

Closed
dmp42 opened this issue Oct 7, 2014 · 28 comments

@dmp42
Contributor

dmp42 commented Oct 7, 2014

Dear community (off the top of my head: @wking @bacongobbler @noxiouz @ncdc @proppy @vbatts and many others - also @shin- @samalba @jlhawn @dmcgowan ),

In a nutshell

Work is starting to design an entirely new "registry" - meaning new storage driver API, new image format, new http API, new architecture (eg: relation to other services), new docker engine code, and finally new technology for the service.

If you haven't seen it yet, there is a proposal for the new image format allowing "signing" there: moby/moby#8093 that triggered and fuels this desire for change.

Reading it will give you a hint on the envisioned new image format from an engine perspective.

Below, I'll try to cover all bases in a Q&A fashion. Please comment if you have more questions. If you have ideas and suggestions, you can open tickets with a title like "NG: Fantastic Idea".

Holy c! What will happen to the registry as we know it?

As a part of the docker infrastructure, the existing "V1" registry will continue to be used on production servers for the foreseeable future, delivering V1 images to V1-only docker engines (< 1.4) and "both-ways" engines (>=1.4,<2?). It might eventually be replaced by a V2 registry with a reimplementation of V1 endpoints, but that remains to be seen, since that would be a rather dull task.

As an open-source project, I'll continue to steward it and will merge interesting work and fixes from the community, and we will continue to provide security releases if need be, but it's unlikely major new features or changes will happen.

I feel that it has now reached its full "maturity" (for better or worse), and that the new extension mechanism we merged in 0.9 leaves enough room for everyone to keep doing interesting things with it while its core enters "maintenance" mode.

Thus registry 1.0 will be the last (and final, IMO) major release of "V1", which will likely be maintained (like I said) for at least a full year.

You said, "new technology"?

Yes. The new registry will be developed in go instead of python.

The reasons for that are:

  • reduce the "gap" inside the community and build on a common technology, using common libraries (libtrust and @dmcgowan, I'm looking at you)
  • thus easing integration test with the rest of the platform, etc
  • start with a clean slate
  • bet on a language that has a good concurrency model from the get go - no pun - https://golang.org/doc/effective_go.html#concurrency
  • while python is a robust, mature and well established technology (stack), it really starts smelling funny in a number of places - some young blood / fresh air will do us all good :)

Starting from scratch sure has its downsides, and I can't say I'm happy ditching the accumulated experience with V1/python (especially all the good work done on drivers), but in the end it's a reasoned choice, and I believe the benefits outweigh the downsides.

Why oh why change? ... the storage format

We want image signing capability. We believe we can't have it without an image format change (content addressable ids for a start).

Furthermore, the current storage format has terrible shortcomings:

  • it's hard to garbage collect
  • it has a long history of security issues
  • it's awkward to understand and use
  • it consumes space
  • it breaks too easily
  • it's not versioned, or extensible
  • it's impossible to map that format to a purely "static" delivery service
  • hence it's not possible to envision radically different distribution channels (bittorrent, filesystem, etc)

The new image format drastically simplifies the concepts:

  • an image is a json file, with a mandatory, namespaced name, a list of tarsums (i.e. content-addressable layer ids), some opaque metadata, a signature
  • a layer is a binary blob, mapping to a tarsum

Exit "ancestry" (now implicit from the order of layers inside the image "manifest").
Exit "layers are images are layers".
Exit "layer json" etc.
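
A minimal sketch of what such a manifest could look like on the Go side. The field names, the JSON keys, the `namespace/name` convention and the "at least one layer" rule are assumptions made for illustration; the actual format is the one under discussion in moby/moby#8093.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Manifest is a hypothetical shape for the image "manifest" described above.
// Field names and JSON keys are illustrative, not the final V2 format.
type Manifest struct {
	Name      string          `json:"name"`      // mandatory, namespaced
	Layers    []string        `json:"layers"`    // ordered tarsums; the order replaces "ancestry"
	Metadata  json.RawMessage `json:"metadata"`  // opaque to the registry
	Signature string          `json:"signature"` // placeholder; real signing is out of scope here
}

// Validate checks that the name is namespaced (assumed to mean it contains
// a "/") and that at least one layer is listed (also an assumption).
func (m Manifest) Validate() error {
	if !strings.Contains(m.Name, "/") {
		return errors.New("manifest name must be namespaced")
	}
	if len(m.Layers) == 0 {
		return errors.New("manifest must list at least one layer")
	}
	return nil
}

func main() {
	raw := []byte(`{
		"name": "example/app",
		"layers": ["tarsum+sha256:aaaa", "tarsum+sha256:bbbb"],
		"metadata": {"exposed_ports": [8080]},
		"signature": "not-a-real-signature"
	}`)
	var m Manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		panic(err)
	}
	fmt.Println(m.Name, len(m.Layers), m.Validate() == nil)
}
```

Keeping the metadata as `json.RawMessage` means the registry can store and return it byte for byte without needing to understand it.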

Backward compatibility is a requirement, so, it's likely the V2 registry will be able to "generate" V1 content as well on the backend storage. Generating V2 content from V1 datastores should also be possible (might be provided by third-party scripts).

Why oh why change? ... the rest API

The current API is tied to the format, and shares most of its defects (awkward, needlessly complex, not "static-able").

Also, I consider the authentication model and its relation to other bricks "broken" (given how difficult it is to use/implement for most people).

The new API will be much simpler, with only a handful of endpoints:

  • GET/PUT image manifest
  • PUT link layer into image
  • PUT layer
  • GET layer from image
  • GET list tags

And the GET part will make it super-easy to deliver the payload through a simple "static" http server.

We hence expect cache mirroring (for example) to be much simpler.

As far as authentication is concerned, the plan should be to standardize on JWT/OAuth.

Why oh why change? ... the technology - I mean, man, that really sucks, python is so cool and I barely started understanding the codebase

Change is good, man.

New things, new adventures! Be a part of it!

Why oh why change? ... the drivers API

The drivers API was never really "designed".

There was an initial interface that eventually grew organically, and was then ripped out of the registry to provide a basis for third-party driver implementors.

It does bear the scars of its history (eg: it's butt-ugly for one thing).

The new interface will likely be way more concise and clean.

What I can think of for now is something like:

write_stream
read_stream
put
get
list
mv

Given Go's nature, we need to figure out the best way to make drivers standalone (i.e. usable without recompiling the registry).

Also, I definitely want push-resume support in there (S3 does support that, though we don't exploit it right now).

These are the two challenges that face us.

Any other cool ideas on the driver side of things, please jump in (thinking specifically about you @noxiouz and @bacongobbler).

New extensions model?

It took us a year to finally come up with a decent extension mechanism for the V1 registry (on top of signals).

I strongly believe that good extensibility is what will make the new registry cool, and I would love to have it, well thought out, from the very first version of registry V2.

Again, given Go's nature, we can't have dynamically (runtime) loaded standalone extensions, so we need to figure something out there as well.

HTTP based communication is fine by me (in a micro-services world), and also elegantly solves scalability and delegation problems.

Here as well, ideas are welcome :).

Do you say the previous registry was just crap entirely?

No. It did serve its purpose well, parts of it are really cool, I enjoyed stewarding it a lot, and I really think the most awesome part of it is the nascent community around it.

Now, it's not ready for the future, which is why we need to move on.

Wait! You have it all figured out?

No, not yet.

The vision is there.

We know the shortcomings.

And we made all the mistakes.

But it remains to be designed and built, and I want this process to happen with the community, capitalizing on the good vibe we had these past months.

So, how does this work?

I'll start a V2 (or next-gen) branch soon, so that development happens in the open and PRs can be merged, and will bring in more manpower to contribute the "foundations" (research is going on for S3 and filesystem drivers).

The plan is to figure out ASAP:

  • the drivers interface and model soon enough so that driver authors can jump on it and dogfood it
  • the extension model
  • the HTTP API

so that we can move on to the actual implementation and let the community get crazy with extensions.

Also, if you have desires, wishes, ideas, please submit a ticket here, starting with "NG: " in the title. I don't think we need this to be too formal to start with - so let's see how this goes.

If you want to be more involved than that, then you can definitely help with answering / triaging said tickets, or go ahead with fully-fleshed proposals and PRs (proposals can be PRs themselves I guess? do we need to be formal on that?).

Thanks again community, for it has been a very good journey so far, and I'm confident the next one will be even more awesome!

@wking
Contributor

wking commented Oct 7, 2014

On Tue, Oct 07, 2014 at 12:14:09PM -0700, Olivier Gambier wrote:

Yes. The new registry will be developed in go instead of python.

The reasons for that are:

  • reduce the "gap" inside the community and build on a common
    technology, using common libraries (libtrust and @dmcgowan, I'm
    looking at you)
  • thus easing integration test with the rest of the platform, etc

I think this would make more sense if there was going to be more
sharing of code between the registry and the daemon/client. However,
I don't think we need any brains in the registry, since I see
provenance as a contract between the builder and signer, and
completely separate from the registry [1,2].

  • start with a clean slate

This is a benefit?

If this means we get transactional backends for free, then great :).
Otherwise, I think the current implementation scales well (just add as
many threads as you need), since there's no need to communicate
between threads.

  • while python is a robust, mature and well established technology
    (stack), it really starts smelling funny in a number of places -
    some young blood / fresh air will do us all good :)

Where are the funny smells?

@noxiouz
Contributor

noxiouz commented Oct 7, 2014

HTTP based communication is fine by me (in a micro-services world), and also elegantly solve scalability and delegation problems.

What about some kind of a binary protocol with multiplexing of read/write streams? The HTTP/2 draft looks good as a concept.
We can take a look at some common binary serialization libraries (for example msgpack) and use one of them to communicate between core and plugins over a TCP/Unix domain socket. It allows us to implement a fast, flexible, easy-to-extend protocol. This protocol should be bidirectional to provide full control over communication.

@dmp42
Contributor Author

dmp42 commented Oct 7, 2014

I think this would make more sense if there was going to be more sharing of code between the registry and the daemon/client. However, I don't think we need any brains in the registry, since I see provenance as a contract between the builder and signer, and completely separate from the registry [1,2].

Tarsum verification will also have to occur on the registry side. And I would expect the registry to verify image signatures as well.

These are the areas of "shared" code I'm thinking about (so, libtrust, some bits for tarsum, and probably some other "engine" code related to manipulation of the image format).

Common tooling is a plus as well. Common development guidelines, etc.

This is a benefit?

I do believe there is benefit there - yes, I know http://www.joelonsoftware.com/articles/fog0000000069.html

If this means we get transactional backends for free, then great :).

Why not? :)

Otherwise, I think the current implementation scales well (just add as many threads as you need)

It does scale.

Now, what about things breaking in not so subtle ways because of libevent minor version differences?

Or the need to call "magical" monkey patching "before" any other code, that doesn't always seem to fully do the job?

since there's no need to communicate between threads.

There is a need to communicate between threads, right now, or be bitten by eventual consistency (which is one of the reasons we use redis for that). But then one could argue this will disappear with the new drivers.

Where are the funny smells?

  • namespaces
  • packaging
  • gevent

Other things are rather a matter of taste - don't get me wrong on this though, I do like python.

@wking
Contributor

wking commented Oct 7, 2014

On Tue, Oct 07, 2014 at 02:16:58PM -0700, Olivier Gambier wrote:

I think this would make more sense if there was going to be more
sharing of code between the registry and the daemon/client.
However, I don't think we need any brains in the registry, since I
see provenance as a contract between the builder and signer, and
completely separate from the registry [1,2].

Tarsum verification will also have to occur on the registry
side. And I would expect the registry to verify image signatures as
well.

Why? It's going to have to happen in local Docker daemons after
downloads, so I don't see much benefit in checking on the registry
side too. And with clients doing verification, I doubt anyone will
bother uploading broken signatures to the registry.

Common tooling is a plus as well. Common development guidelines,
etc.

This makes sense, although I don't really see the need for much more
development since the existing registry code is fairly stable.

Otherwise, I think the current implementation scales well (just
add as many threads as you need)

It does scale.

Now, what about things breaking in not so subtle ways because of
libevent minor version differences?

Or the need to call "magical" monkey patching "before" any other
code, that doesn't always seem to fully do the job?

Can you link to the issues where these came up? gevent is not my
favorite package, but I'd probably just pick a different Gunicorn
worker (e.g. gaiohttp [1]) instead of rewriting this whole project
from scratch ;).

since there's no need to communicate between threads.

There is a need to communicate between threads, right now, or be
bitten by eventual consistency (which is one of the reasons we use
redis for that). But then one could argue this will disappear with
the new drivers.

“But when we rewrite it, we'll do a better job” is less convincing to
me than “but when we use $TOOL, $PROBLEM will no longer be an issue
because of $FEATURE [$LINK]” ;).

Where are the funny smells?

  • namespaces
  • packaging
  • gevent

I'm happy to drop our current Setuptools stuff (and gevent, see above)
;). What sort of packaging problems are you talking about? I'm still
not sold on the whole docker-registry-core pull-out. Anyhow, I think
these are things best solved incrementally.

@dmp42
Contributor Author

dmp42 commented Oct 7, 2014

And with clients doing verification, I doubt anyone will bother uploading broken signatures to the registry.

They will (broken tarsums).

And that will result in a DOS, at best (content-addressability comes at a cost).

Can you link to the issues where these came up? gevent is not my favorite package, but I'd probably just pick a different Gunicorn worker (e.g. gaiohttp [1]) instead of rewriting this whole project from scratch ;).

The most baffling ones are here:

Now, I'm not stating that go is a magic bug-free shiny new thing and solves everything - merely stating that concurrency is not a python core feature and that I expect a better situation on that front.

Finally, we are not rewriting from scratch because "X is so superior to Y" - we are rewriting from scratch because we need to break things - the fact that we are going to change language as well is a different issue IMO.

And oh, ultimately, I would love to see multiple diverse implementations of the V2 protocol (and this is what should be cool with it, in allowing to do that more easily).

I wouldn't be shocked if someone would do a nodejs registry, or... a python one.

The one I want to focus on though is this one here in go, along with this community :-)

“But when we rewrite it, we'll do a better job” is less convincing to me than “but when we use $TOOL, $PROBLEM will no longer be an issue because of $FEATURE [$LINK]” ;).

I expect the following features:

  • go concurrency model
  • go typing
  • depart from python unicode mess (this one is so obviously a python shortcoming that I forgot to even mention it - and if you remember, clearing that up was one of my first contributions, and I can't say I enjoyed it - and it seems we still have one of these lurking around unfortunately)
  • easier to integration-test with the other pieces (engine)
  • more shared code, shared tools

... to give us a more resistant codebase, easier to maintain, easier to contribute to.

I also expect our "design (a bit) more" and "think (a bit) before" approach to prove more fruitful and also easier to maintain than our previous organisation.

I'm still not sold on the whole docker-registry-core pull-out.

Well, it was ugly, but it did benefit us a lot:

  • remove large chunks of code from the registry (yes, I'm lazy :-))
  • allow third-parties to maintain their own thing all by themselves (yes, my excuse for laziness is "empowering others" :-))
  • foster the community

Anyhow, I think these are things best solved incrementally.

I hear you. I do think there are indeed strong benefits in solving things incrementally, and I think we did a lot on that front already, from 0.6 up to 0.9.

But then defining what is an "increment" and what is "disruptive" is a matter of "scale".

And I do believe the important parts to preserve here are:

  • the community
  • the spirit
  • the backend driver design + good extensibility approach

The rest is not so much if you ask me.

And I still enjoy chitchatting with you, and I hope you will keep that voice up during that new journey :-)

@dmp42
Contributor Author

dmp42 commented Oct 7, 2014

@noxiouz #613 for specific extension model discussion (your ideas seem close to @dmcgowan 's)

@wking
Contributor

wking commented Oct 7, 2014

On Tue, Oct 07, 2014 at 03:21:09PM -0700, Olivier Gambier wrote:

And with clients doing verification, I doubt anyone will bother
uploading broken signatures to the registry.

They will.

And that will result in a DOS, at best (content-addressability comes
at a cost).

You can't handle this the same way that you already presumably handle
folks uploading other objectionable content?

And oh, ultimately, I would love to see multiple diverse
implementations of the V2 protocol (and this is what should be cool
with it, in allowing to do that more easily).

I was coming from more of a “I don't have time to learn idiomatic Go
or work on untangling Gentoo's Go packaging” perspective, which is
probably not compatible with “I'm going to chip in to an unfunded
Python v2 registry without help from Docker, Inc. folks” ;).

I also expect our "design (a bit) more" and "think (a bit) before"
approach to prove more fruitful and also easier to maintain than our
previous organisation.

But I can still help with the design and thinking without needing to
figure out a sane Go development system locally ;).

@bacongobbler
Contributor

Here's my two cents on the proposal:

I can totally understand the desire to completely start from scratch. Having the ability to share packages from docker is a huge benefit to the registry. Speaking from Deis's perspective, this will require us to migrate from the old registry to the newer version which may take some time, so hopefully docker/docker will be able to maintain the v1 endpoints from an older registry for the foreseeable future or we may be stuck on an older version until we update our infrastructure. That shouldn't affect the decision made here, but I wanted to give you a bit of insight on one of my concerns with this change. This is a big change, but I personally support the decision if you feel like it is a step in the right direction.

On that front...

The new image format drastically simplifies the concepts:

an image is a json file, with a mandatory, namespaced name, a list of tarsums (eg: content-addressable layers ids), some opaque metadata, a signature
a layer is a binary blob, mapping to a tarsum
Exit "ancestry" (now implicit from the order of layers inside the image "manifest").
Exit "layers are images are layers".
Exit "layer json" etc.

So what happens to config/exposed ports/etc? Is that all going into the "some opaque metadata" format? Isn't this just an aggregation of all the concepts in the v1 API and just slapping on the v2 sticker?

Starting from scratch sure has its downsides, and I can't say I'm happy ditching the accumulated experience with V1/python (especially all the good work done on drivers), but in the end it's a reasoned choice, and I believe the benefits outweigh the downsides.

Most certainly. This change does not only affect the registry; it also kills off all of the current python storage driver implementations, which may be affected substantially. For example, support for obscure drivers that only a small subset of users rely on may be completely gone in a year or two because the maintainer has no time or experience to maintain a second implementation of the same driver in Go. If there's a way we can somehow make the drivers easy to develop and maintain, I'm happy with that.

Regarding storage driver changes, I propose that these drivers be maintained separately from core docker-registry. But instead of being split into their own separate packages like last time, they should be maintained together as a single cloud-agnostic driver package that acts as a key/value store on certain providers. That way, if someone wants support for an OpenStack Swift driver, a local filesystem driver, an in-memory key/value store like https://github.com/kelseyhightower/memkv, or an S3-based driver, for example, they only have to go to one repository that supports these drivers. This means that the registry has a hard dependency on this package, but it keeps maintenance of the package outside the context of docker-registry, and other users can benefit from it (e.g. someone else who needs a cloud-agnostic filesystem driver for their backend). Thoughts on that?

All I can say is "specs, specs, specs". For my own needs, I'd like a way to contribute to the project in a way that would support this change. To facilitate that, we need a document so that external maintainers (i.e. contributors to docker/docker-registry who are not affiliated with Docker Inc.) like myself can contribute in some way, whether that be with the core API, the storage drivers, etc. This includes possibly some political discussion on the technology involved (do we handle API requests with something similar to Flask like Martini, or do we want to re-implement the world with the net/http package? Do we handle dependency management from third party libraries natively or do we use something like Godep?). I'm happy and comfortable with Go, so I'd definitely like to be in the loop if at all possible with the ongoing development of the v2 API. I assume that this issue is more of a "hey, we're doing this regardless but I wanted to give you a heads-up" more than an actual proposal. ;)

To note, the above point about getting into a discussion about the technologies/frameworks used to implement registry v2 is completely optional if we don't want to open that can of worms. It could potentially end up in a flamewar over which practice is better. Still, some kind of heads-up, or a day on which we can discuss these changes in greater detail, would be very much appreciated :)

All I can say is... Docker Registry hack day? :D

@smarterclayton

Regarding:

PUT link layer into image

is this to mutate an image into a new image? I.e. given image A, PUT link B -> Image B with new signature? While useful for simple clients, it also makes the registry a bit more complex to implement - might there be an advantage in only having GET/PUT images, GET layer, GET tags?

@wking
Contributor

wking commented Oct 8, 2014

On Tue, Oct 07, 2014 at 09:21:44PM -0700, Matthew Fisher wrote:

All I can say is "specs, specs, specs". For my own needs, I'd like a
way to contribute to the project in a way that would support this
change. To facilitate that, we need a document so that external
maintainers (i.e. contributors to docker/docker-registry who are not
affiliated with Docker Inc.) like myself can contribute in some way,
whether that be with the core API, the storage drivers, etc.

+1. In fact, I'd be tempted to put the specs in a separate repository
from the implementation, to encourage folks to not get bogged down in
a particular implementation. Of course, linking out to an external
implementation would be fine.

@dmp42
Contributor Author

dmp42 commented Oct 8, 2014

@wking

You can't handle this the same way that you already presumably handle folks uploading other objectionable content?

Right now, for most DOS or security issues, we get away with ownership verifications.

Now, content-addressable ids (vs. random ids) make the question of "ownership" more difficult.

Also, reducing the coupling to the auth. component is something I want.

Believe me, I hate the idea of computing tarsums on the server side - but for now I can't figure a way out...

About the dev env, I want to make this easy/easier to set up for contributors, so efforts on that front are definitely worth it.

Now, about using gentoo, who am I to lecture you? :-))) http://www.motivationals.org/demotivational-posters/demotivational-poster-16518.jpg

@dmp42
Contributor Author

dmp42 commented Oct 8, 2014

@bacongobbler

So what happens to config/exposed ports/etc? Is that all going into the "some opaque metadata" format? Isn't this just an aggregation of all the concepts in the v1 API and just slapping on the v2 sticker?

Well, maybe it is :-)

Layers are still layers.

Images on the other hand are no longer "a specific layer". They are a chunk of json listing layers.

Content-addressability is a major change as well.

Indeed the per-layer config ends up in the opaque part.

And yes, the engine itself will keep working as is - it's a transport-level format change - not (yet) an engine level change.

This change does not only affect the registry; it also kills off all of the current python storage driver implementations, which may be affected substantially.

This is the one thing that really bugs me.
Now, maybe we can get creative on this? maybe some "special compat driver" that would let you use old drivers through a combination of (http?) socket communication magical wrap? -> let's move that discussion over here #616

I assume that this issue is more of a "hey, we're doing this regardless but I wanted to give you a heads-up" more than an actual proposal. ;)

Makes me think I should clarify things here.

I won't lie to you: in the end, I'm the one with write-powers on the repo :-) - and I will have to make some calls, veto some things, take the blame and suffer the insults :-)

Now, what I want to try here is not some BS open-source parody where I would just dump source code and tell you guys "live with it".

I want to build an open-design process that works for all of us:

  • efficient: I don't want things to languish for months before we can reach an agreement and move onto implementation
  • usage focused: I think usage should come first, technology / tool / technical-beauty second
  • concise: I would really love to see a small, extensible core - a basic set of flexible enough ideas that would let the community go crazy with custom stuff on top of it

I don't know how much we will succeed in making that open-design process mesh with the need to deliver and ship a usable product with strong time constraints, but I really, genuinely want to try to pull this off and end up with a stronger, better, more satisfied community (and less work for me :-)).

Any help here, I can definitely use.

I think the idea of committing proposals and architecture notes as PRs is a good one and will help manage the discussion.

@dmp42
Contributor Author

dmp42 commented Oct 8, 2014

@smarterclayton

is this to mutate an image into a new image? I.e. given image A, PUT link B -> Image B with new signature? While useful for simple clients, it also makes the registry a bit more complex to implement - might there be an advantage in only having GET/PUT images, GET layer, GET tags?

Ah, no.

Here it goes: since (layer) ids will now be content-addressable instead of random, there will no longer be clear ownership of a given layer (you AND I can legitimately generate it).
Also, I want access control to be simpler and be "set" at push time rather than at pull time (right now, layers live flat in a non specific namespace, making auth lookup mandatory for every layer).

So, the idea would be to allow NOT pushing again something you already had access to and "linked" into another repository you have access to.
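
A toy model of that idea, to show the bookkeeping involved: blobs are stored once by id, each repository keeps its own set of links, and reads go through a repository. The `Registry` type and its methods are hypothetical, and the caller is assumed to have already been authorized on the repositories it names.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoAccess means the layer is not linked into the named repository.
var ErrNoAccess = errors.New("layer is not linked into this repository")

// Registry stores each layer once and tracks per-repository links to it.
type Registry struct {
	blobs map[string][]byte          // layer id -> content, stored once
	links map[string]map[string]bool // repository -> set of linked layer ids
}

func NewRegistry() *Registry {
	return &Registry{blobs: map[string][]byte{}, links: map[string]map[string]bool{}}
}

func (r *Registry) link(repo, id string) {
	if r.links[repo] == nil {
		r.links[repo] = make(map[string]bool)
	}
	r.links[repo][id] = true
}

// PutLayer uploads content under id and links it into repo.
func (r *Registry) PutLayer(repo, id string, content []byte) {
	r.blobs[id] = content
	r.link(repo, id)
}

// LinkLayer makes a layer already linked into srcRepo visible in dstRepo
// without uploading it again.
func (r *Registry) LinkLayer(srcRepo, dstRepo, id string) error {
	if !r.links[srcRepo][id] {
		return ErrNoAccess
	}
	r.link(dstRepo, id)
	return nil
}

// GetLayer serves a layer only through a repository it is linked into,
// so access is decided per repository rather than per layer.
func (r *Registry) GetLayer(repo, id string) ([]byte, error) {
	if !r.links[repo][id] {
		return nil, ErrNoAccess
	}
	return r.blobs[id], nil
}

func main() {
	r := NewRegistry()
	r.PutLayer("alice/base", "abc123", []byte("layer bytes"))
	fmt.Println(r.LinkLayer("alice/base", "alice/app", "abc123"))
	_, err := r.GetLayer("bob/other", "abc123")
	fmt.Println(err)
}
```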

@dmp42
Contributor Author

dmp42 commented Oct 8, 2014

@wking @bacongobbler (and others) do you want to try an IRC hack session / meetup / gathering thing?

Or even a hangout?

@bacongobbler
Contributor

please feel free to contact me on IRC/email and get things started. I'm online from 8-4PST :)

@wking
Contributor

wking commented Oct 9, 2014

On Thu, Oct 09, 2014 at 12:11:42AM -0700, Matthew Fisher wrote:

please feel free to contact me on IRC/email and get things
started. I'm online from 8-4PST :)

I'll likely be around for those hours as well, but I personally prefer
planning this sort of thing via something less synchronous, since I
usually have better ideas after sleeping on something overnight than I
do five minutes after prompting ;).

@proppy
Contributor

proppy commented Oct 15, 2014

/cc @govidiupl from Google.

@shykes

shykes commented Oct 15, 2014

What about some kind of a binary protocol with multiplexing of read/write streams? The HTTP/2 draft looks good as a concept.
We can take a look at some common binary serialization libraries (for example msgpack) and use one of them to communicate between core and plugins over a TCP/Unix domain socket. It allows us to implement a fast, flexible, easy-to-extend protocol. This protocol should be bidirectional to provide full control over communication.

@noxiouz you are describing libchan :) It uses msgpack for serialization and implements multi-plexing over http2. https://github.com/docker/libchan. @dmp42 @dmcgowan for communication with extensions, I strongly recommend using libchan, since that is the direction we're going for Docker extensions also. If one of the goals is cohesion with the rest of the Docker platform, this one is a no-brainer.

@dmp42
Contributor Author

dmp42 commented Oct 15, 2014

Thanks @shykes

We have a preliminary backend driver implementation using libchan here: #630

and some discussion going on extensions there:
#613

and on drivers there:
#616

Drivers and extensions have different targets though, and different speed/reliability/deployment strategy requirements, so we might end up with different solutions here - libchan is definitely a strong lead.

@visualphoenix
Contributor

+1 on a docker-registry ng hackday. Would love to help.

@ovidiupl-g

Greetings from Google Kirkland! Not sure this is the best place for a minor suggestion, but I'd be happy to write up a more detailed proposal separately. It might be worth adding an extra knob to the client-server protocol for handling overload and transient failures.

I believe the current client-side logic uses linear retries up to a max number of failures. First, I and several others would be really happy to see that logic evolve into exponential back-off with jitter for timeouts, disconnects and other transient failures. Ideally, that logic should be "no back-off on 302, use exponential back-off on 500, 502 and 503".

Second, we'd be really happy if the registry responded with a Retry-After header on 503 (and possibly on 3XX, if it chooses so), and if the client honored the value given in the header.

The goal is to avoid self-inflicted denial-of-service states, where Docker clients in a large-scale deployment synchronize their retries after a transient issue (e.g. temporary network failure) and slam registries at the same time. With friendly clients, randomized exponential back-off is the first line of defense, and server-controlled retry delays are the bigger hammer. (With unfriendly clients, there's always the other first line of defense :) ).

@dmp42
Contributor Author

dmp42 commented Oct 16, 2014

@visualphoenix nice to have you in ;)
@govidiupl definitely welcome!

IRC meetings every monday 10AM PST. Otherwise, have a look around at tickets with the next generation label.

@shreyaskarnik
Contributor

@dmp42 and other contributors to the project: I was curious whether the docker-registry next generation implementation will have an event stream similar to the docker daemon's, listing events like pull, push, delete, create tags and so on and so forth. This kind of event stream would be useful for monitoring events in the registry, and opens up the possibility of having loosely connected consumers that monitor the events to create a summary of what is occurring in the registry. If this thread is not the right place for these kinds of requests/discussions I can open a new proposal as well; I wanted to do a temperature check first before opening a detailed proposal.

@dmp42
Contributor Author

dmp42 commented Oct 22, 2014

@shreyu86 that should certainly be part of the new extensions model: #613

@shreyaskarnik
Contributor

Thanks @dmp42

@nathanleclaire

Just a random thought I've had lately: whatever form the v2 registry and mechanics around it take, it'd be really lovely to get rid of the round-tripping messages for "image layer already pushed, skipping". This seems to slow down pushes a ton and, though I'm sure the decision to do it that way was probably made for the right reasons at the time, baffles every single person that experiences it for the first time ("why can't it check them all at once?").
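
For what it's worth, "checking them all at once" could be as small as the client sending the full list of layer ids in one request and the registry answering with the ones it lacks. A sketch of the server-side part, where `missingLayers` is a hypothetical helper:

```go
package main

import "fmt"

// missingLayers returns the wanted ids that are not present yet,
// in the order the client listed them.
func missingLayers(present map[string]bool, wanted []string) []string {
	var missing []string
	for _, id := range wanted {
		if !present[id] {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	present := map[string]bool{"aaa": true, "bbb": true}
	fmt.Println(missingLayers(present, []string{"aaa", "ccc", "bbb", "ddd"}))
}
```

The client would then push only the returned ids, turning one round trip per layer into a single one.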

@coolbrg

coolbrg commented Nov 20, 2014

👍 from my side for next generation docker registry 😄
Looking forward to it.

@stevvooe
Contributor

Closing this, citing the existence of docker/distribution.
