Better auth support for swarm-mode services running images in private repos #24940
Comments
If I've understood it correctly, running |
Most of the rolling update issues should be addressed with secrets, but I am thin on the details. The root problem here is that we are sending hub passwords. If this was a token, the hub could change the credentials without affecting the token. |
@borjaburgos @fermayo is this something that you can help with? I'm reading up on the OAuth registry spec. I guess it would be something like:
This should work for DTR too I suppose |
@friism at the moment, tokens given by the registry authentication service only live for 5 minutes and cannot be refreshed. However, users can generate an API key in Docker Cloud that can be used to authenticate against our registry using it as a password. That API key is valid even after password refreshes, and can be revoked from Docker Cloud. |
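A minimal sketch of logging in with such an API key in place of the account password, assuming the standard `--username` and `--password-stdin` options of `docker login`. The function name and the `DOCKER` override (only there to allow a dry run) are illustrative and not part of Docker.

```shell
#!/bin/sh
# Log in to the registry with an API key instead of the account password.
# DOCKER can be overridden to substitute another command for a dry run.
login_with_api_key() {
    user="$1"
    api_key="$2"
    if [ -z "$user" ] || [ -z "$api_key" ]; then
        echo "usage: login_with_api_key <user> <api-key>" >&2
        return 2
    fi
    # --password-stdin keeps the key off the docker command line.
    printf '%s' "$api_key" | ${DOCKER:-docker} login --username "$user" --password-stdin
}
```

Because the key is independent of the password, a password reset would not invalidate what was stored by this login; revoking the key would.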
@fermayo There exists the concept of identity tokens. Such a key could also defer access to refresh the authorization. Let's think outside the box on what we can do with garant tokens. The current thinking that they are limited to just the registry or are limited to 5 minutes is preventing us from implementing further workflows. We haven't even scratched the surface of what is possible with the access model provided by garant. Let's build up on this solid base rather than continuing to re-invent the wheel. Registries don't care either way. cc @dmcgowan |
@fermayo can a user provide that API key base64 encoded in a way that Docker recognizes? If so, we could add that token, and change |
... or would it make sense for Docker Engine to auto-provision an API key (if possible) during |
Right now it's the full password. |
@diogomonica the user can provide that API key as a replacement of their password (i.e. providing it on |
/cc @simonferquel Also see the discussion on #31878 (comment), which contains information about (e.g.) the AWS credentials helper |
Oh, and my comment here #31632 (comment) (interested on thoughts on that one) |
Is there any way we can change how credential stores and helpers work, so that they are not just a client-side feature and we can use them server side as well? If so, we could let the swarm service pick which credential store and helpers it wants to use to manage its registry credentials. It would be really cool if there was a swarm secrets credstore, where we can store the registry password as a swarm secret; if you need to change your password, you just change the swarm secret, and ideally this doesn't trigger a rolling update of your service. This would also work well with registries that use short-lived tokens, like ECR with the AWS ECR credential helper. Whenever Swarm needs to authenticate, it would reach out via API to get a valid auth token, and the token would never need to be manually rotated. Just an idea; are there any other current ideas that haven't been posted to this issue yet? |
@kencochrane That sounds right on target. Secrets should be used here and we should introduce "external secrets". I am not sure what the current plans for this are. |
@kencochrane we're thinking about it. Currently creating "secret types" for our external stores. I believe "registry auth" will probably be one of those types. /cc @cyli @aaronlehmann |
Ok, thanks. If I remember correctly you can't change a swarm secret, once it is made. You need to rotate it out. Would these different secret types be able to change? If not, it wouldn't work for auth that uses short lived tokens. Since the registry auth secrets don't need to have access inside of a container, hopefully we can treat them differently and allow updating, or some other way for them to be dynamic. Like the way the ECR cred helper uses IAM roles to get a short lived auth token via an API. |
Ideally we would have some kind of plugin API that lets the manager request short-lived tokens for every task that's created, and provide individual tokens along with those tasks, rather than providing the actual credentials to all worker nodes. I'd really like to move away from the model where workers handle the credentials and request tokens for themselves. |
@aaronlehmann that works for me too and it's somewhat like what I proposed here: #24940 (comment) From the standpoint of Docker for AWS this is also ideal, since only managers would require the privileged registry read-only role. |
Cross-posting what I think is the current path forward (cc @diogomonica):
|
Hi. Is this implemented yet? Is there any workaround to keep swarm deploying containers over time with short lived tokens? |
Still waiting for this feature. Managing 20+ services with a private registry is not easy when you rotate credentials or scale the node count |
Given all these issues, I am thinking of just setting up a private registry that I can control that does not have the short lived tokens issue perhaps with I just want to use |
I ended up just paying for Docker Hub. |
I decided last week to use a credential manager with our Swarm. Been using Swarm for quite some time on Ubuntu 16.04 with about 6 nodes. It was set up by going through each node and using I decided to use the https://docs.docker.com/engine/reference/commandline/login/ I followed the instructions here: docker/docker-credential-helpers#102 Using latest 0.6.0 with gpg2. It tested fine as the instructions indicated. Did a simple test of pulling and pushing images to the private registry. When I removed one of my services for testing on the node using I have been reading about the shortcomings of credential managers with the Swarm. Is there one that works with the Swarm? I'd hate to keep pressing on only to hit another wall. I'd appreciate your input on this. |
@RAKedz note that the credential-helper on the node is not used in a swarm setup (unless that node is the manager from which the service is created, or updated); In a swarm, credentials are distributed through the RAFT store; when you create a service, and add the
The service's tasks are now scheduled, and the node on which a task is deployed gets the credentials that are stored in the raft store; the node will pull the image using those credentials. It's important to know that credentials may expire (depending on the authentication mechanism used by the registry you're using); if a task is re-deployed to a different node (for example, a task failed, and a new one is started to replace it), and credentials have expired, pulling the image will fail, because the service only has the credentials that were stored at the time the service is created. If you update a service, passing |
@thaJeztah thanks so much for describing how this works. I would basically just install the credential helper on the managers and place I also read about this credential helper See the two links below: https://hackernoon.com/getting-rid-of-docker-plain-text-credentials-88309e07640d I will now go back to my credential helper journey. Will keep you posted. |
I am trying to migrate my docker swarm from using images on docker hub to images inside ECR. In summary it seems to say that it is not possible to use docker swarm with ECR repositories due to the credential token expiring. The best solution I can see is https://medium.com/@MahmoudGaballah/ecr-for-docker-swarm-fdea3a9b01b1 - creating an ECR Proxy, but this is complicated. Is it correct that, as of Mar 2021, swarm is incompatible with ECR? Is it a work in progress? |
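One workaround that follows from the thread is to fetch a fresh ECR token on a manager at regular intervals and re-send it to the service. The sketch below assumes the AWS CLI's `aws ecr get-login-password` command and the `--with-registry-auth` flag of `docker service update`. The function name, the argument values, and the `AWS`/`DOCKER` overrides (for dry runs) are hypothetical.

```shell
#!/bin/sh
# Fetch a fresh ECR token, log in with it, and re-send it to one service.
# AWS and DOCKER can be overridden to substitute other commands for a dry run.
refresh_ecr_auth() {
    region="$1"      # AWS region of the registry
    registry="$2"    # registry host, <account-id>.dkr.ecr.<region>.amazonaws.com
    service="$3"     # swarm service that pulls from that registry
    if [ -z "$region" ] || [ -z "$registry" ] || [ -z "$service" ]; then
        echo "usage: refresh_ecr_auth <region> <registry> <service>" >&2
        return 2
    fi
    # get-login-password prints a temporary token; the ECR username is AWS.
    ${AWS:-aws} ecr get-login-password --region "$region" |
        ${DOCKER:-docker} login --username AWS --password-stdin "$registry" || return 1
    # Send the new credentials to swarm along with the service update.
    ${DOCKER:-docker} service update --with-registry-auth "$service"
}
```

ECR authorization tokens are valid for 12 hours, so something like this would have to run on a schedule shorter than that (a cron job on a manager, for example). It does not remove the underlying problem: a task rescheduled between refreshes can still carry stale credentials.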
@rmetcalf9 I would recommend NOT using ECR, instead just run your own registry like Sonatype Nexus. This allows you to decouple yourself from AWS if needed. |
@trajano Thanks for the response. I think you are right. I am in the process of setting up a repo. Frustrating though, this should be one of the simple use cases! |
@rmetcalf9 if you want to be parsimonious initially you can use sonatype nexus Here's my setup (note I am still on Traefik 1.7 at work, been too busy to move onto 2.x).
Mind you, you WILL get sticker shock when your team grows and you want to get the Pro version of Nexus. So another alternative that you can have would be JFrog but I don't think they have a good free tier. |
This is a follow-up to #24372
The current flow to deploy service with image in private repo is something like:
docker login
docker service create --registry-auth <private-repo-url>
With --registry-auth, registry authentication is passed to swarm and swarm passes it to the worker nodes when tasks are created so they can successfully pull images from the private repo.
When this flag is used with Docker Hub / Cloud, resetting the Hub password causes credentials to be rolled (and similarly when creds are rolled with other authenticated registry implementations). That will then cause running apps relying on that registry/auth to randomly start failing. This is a bad failure mode, because a password update or cred roll (potentially even by a different user) will cause seemingly unrelated apps to slowly start failing. Because of the tight coupling, the only correct way to update credentials is to carefully coordinate the update with service update for all services that rely on a private repo.
I don't know what's a better design, creating this issue for tracking.
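The coordinated update described in the issue text can be sketched roughly as follows. It assumes the `--with-registry-auth` flag of `docker service update` (the name the flag has in the Docker CLI) and that `docker login` has already been repeated with the new credentials on the manager where it runs. The function name and the `DOCKER` override (for dry runs) are illustrative only.

```shell
#!/bin/sh
# Re-send the current registry credentials to every service in the swarm.
# DOCKER can be overridden to substitute another command for a dry run.
refresh_all_services() {
    # List service names, one per line, and update each in turn.
    ${DOCKER:-docker} service ls --format '{{.Name}}' |
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # </dev/null stops the update command from consuming the list on stdin.
        ${DOCKER:-docker} service update --with-registry-auth "$name" </dev/null ||
            echo "failed to update $name" >&2
    done
}
```

This touches every service, not only those that pull from the private repo; filtering by image is left out to keep the sketch short.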
cc @stevvooe