This repository has been archived by the owner on Feb 10, 2022. It is now read-only.

Support logging in to a private registry before launch #104

Closed
wants to merge 1 commit

Conversation

DRuggeri commented Feb 2, 2018

Simple patch that adds a few attributes to the job's properties to support logging in to a private registry.

cfdreddbot commented

Hey DRuggeri!

Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you and the commit authors have already signed the CLA.

cf-gitbot commented

We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.

The labels on this github issue will be updated when the story is started.

drnic (Contributor) commented Feb 8, 2018

@DRuggeri I have two thoughts - 1) perhaps move this into the docker job so it's of benefit to all users, not just people using the containers job; 2) perhaps change login_endpoint to docker_endpoint, etc?

DRuggeri (Author) commented Feb 8, 2018

Thanks for the feedback, @drnic. No problem with naming it docker_endpoint. For the job location, I guess it could go either way. I placed it in the containers job just to be closer to the containers that get launched and to make it obvious that this login is related to this container.

I have no problem moving it to the docker job - I think the following would work well there:

properties:
  logins:
  - endpoint: https://foo
    username: usr1
    password: pass1
  - endpoint: https://bar
    username: usr2
    password: pass2
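
For reference, the matching declaration in the docker job's spec file might look like this (a sketch; the description text and empty default are assumptions, though the diff further down does add a logins property):

properties:
  logins:
    description: "Registries to log in to before launching containers, each with endpoint, username, and password"
    default: []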

The one thing I wonder... with it being in the containers job, we would do the login pretty much right before the image would be pulled. This gives us a high degree of confidence that the login token is valid. I think it may be possible that a token obtained right after docker daemon start might expire when the containers job attempts to use it at some later time.

For example...
When the instance is first laid out at hour 0, the docker login occurs in the ctl script for the docker job, and the containers job starts and uses the logged-in accounts. Life is good.
Say that at hour expiry+1, an update to the containers job occurs and a new container must start. If the authentication token has expired, this container would fail to pull and start.

... does that sound like a realistic issue?

drnic (Contributor) commented Feb 8, 2018

Gotcha on the issue of docker login timing out after expiry + 1.

Thought: I'm assuming that every occasion on which the containers job (or any other job) starts and assumes the docker job has very recently run docker login would come from monit restarting all jobs sequentially. In that case, the docker job will always have been recently restarted and docker login will have been run.

Is there any other occasion on which a bosh job would want to use docker, outside of monit restart/bosh restart/bosh deploy/resurrector?

DRuggeri (Author) commented Feb 8, 2018

I'm not sure if any other job would want to use docker (job or command? I assume you mean command). At the least, we know that the containers job uses it which is why a login in the docker job would be enough to permit a pull of the image from the containers job. However, it's probably not safe to assume the containers job is the only consumer of the docker daemon as laid down by the docker job.

Thanks for explaining your assumptions - this is where I'm seeking a bit of guidance. My assumption is that a change only to the containers job would cause bosh to update only that job, which would cause monit to trigger the ctl script for the containers job only. If the token from the docker job's login has expired by that time, such a restart would fail to pull from the registry.

So, I guess that's the only question that's plaguing me - is it safe to assume that the docker job will have been started or restarted just before the container job? If so, we're golden and I'll go ahead and move it. If not, maybe a sane response would be to add login capabilities to both jobs to cover all bases?

drnic (Contributor) commented Feb 8, 2018 via email

DRuggeri (Author) commented Feb 12, 2018

Thanks, @drnic. Try as I did, I couldn't get it to fail the way I thought it would. I've modified the PR as suggested to do the login in the docker job instead of the containers job.

Any additional feedback is greatly welcomed.

done

# Log in to any registries
<% p('logins', []).each do |login| %>
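
Presumably the loop body continues along these lines (a sketch, not the exact diff; the flags are the standard docker login CLI options):

docker login --username '<%= login['username'] %>' --password '<%= login['password'] %>' '<%= login['endpoint'] %>'
<% end %>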
drnic (Contributor) commented:

Sorry for not thinking of this before as we were discussing different aspects of the PR - I think this docker login setup might go well in a pre-start hook: https://bosh.io/docs/post-start.html

As a bonus, we then don't need to change the exec dockerd as above.

drnic (Contributor) commented:

BTW, I'm sorry for suggesting subsequent changes. I know it annoys me when someone doesn't just press the "Merge PR" button ;)

DRuggeri (Author) commented:

No problem - what's right is right so we should just aim for that.

I can do that. I hope to have an update in later today.

DRuggeri (Author) commented:

So things turned out a little funky when moving this to post-start. When running in combination with the containers job, the start of containers fails because it "depends" on the post-start of the docker job having succeeded. Bosh won't move on to post-start hooks if any of the start hooks fail, so we never get to the point where the login happens. Basically, a circular dependency...

I think we may be stuck with this logic in the start hook of the docker job after all.

@@ -132,3 +132,10 @@ properties:
   store_dir:
     description: "Path to use as the root of the Docker runtime"
     default: "/var/vcap/store"
 
+  logins:
drnic (Contributor) commented:

This looks simple and nice.

drnic (Contributor) commented Feb 13, 2018 via email

glestaris commented

hey @DRuggeri, sorry for taking so long to respond. The team is uncomfortable with this change, particularly the nohup part of it. We realised that testing it in CI would be fairly complicated, so we are not going to pursue this feature currently.

Closing for now but we can keep the conversation going and re-open if there is another potential solution.

@glestaris glestaris closed this May 15, 2018
bgandon commented May 15, 2018

I'm watching this repo and I'm jumping in because I was curious about this nohup question.

After doing some testing around nohup and what happens to a background child process when its parent exits, I agree that the nohup is useless. Indeed, there is no way a SIGHUP would be sent to the dockerd child process by the time the ctl script ends its execution.

There is no reason why we would need to protect the dockerd daemon from receiving SIGHUP signals, either.

And when the ctl script ends its execution, its background dockerd child just keeps running. monit is happy anyway because of the --pidfile ${DOCKER_PID_FILE} argument passed to dockerd; monit just needs a PID file to track.

So, I confirm the nohup is useless, but for the rest of this PR, I'm comfortable with the change. monit does not require daemons to be exec-ed. They can be forked as background jobs as well, provided that the PID file is properly created. monit doesn't mind whether the ctl script ends or not; it supports both cases.
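
To illustrate the two ctl-script patterns being compared (a sketch only, not the release's actual script; only the --pidfile argument is taken from the discussion above):

# Pattern 1: exec -- the ctl script's process becomes dockerd, so
# nothing else (such as a docker login) can run after this line:
#   exec dockerd --pidfile ${DOCKER_PID_FILE}

# Pattern 2: fork -- dockerd runs in the background and writes the
# PID file that monit tracks; the ctl script can keep going (e.g. to
# run docker login) and then exit without taking dockerd down:
dockerd --pidfile ${DOCKER_PID_FILE} &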

But there's one problem left.

When @DRuggeri asks:

is it safe to assume that the docker job will have been started or restarted just before the container job?

He's right, the question is relevant. Co-located jobs are started in no particular order by BOSH, so no one should assume that any job starts before the others.

That's why this “login” feature should be implemented in the containers job. And there the pre-start script would be a good fit.

@DRuggeri, are you OK with moving the docker login code to the pre-start script of the containers job?
In this pre-start script you'll also need to wait for the dockerd daemon to be up and running. In this wait loop, you'll need to implement a timeout so as not to wait forever, and just bail out with an error (i.e. a non-zero exit code) when no Docker daemon shows up within a reasonable time. You'll need this timeout to be configurable, in order to support slow infrastructures.
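
A minimal sketch of such a pre-start script, assuming a hypothetical configurable login_timeout property (the property name and its default are assumptions):

#!/bin/bash
set -e

# Hypothetical property controlling how long to wait for dockerd.
timeout=<%= p('login_timeout', 300) %>

# Wait for the Docker daemon, bailing out with a non-zero exit code
# if it does not show up within the timeout.
elapsed=0
until docker info >/dev/null 2>&1; do
  if [ "${elapsed}" -ge "${timeout}" ]; then
    echo "dockerd did not come up within ${timeout}s" >&2
    exit 1
  fi
  sleep 1
  elapsed=$((elapsed + 1))
done

# ...then run the docker login loop sketched earlier.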

DRuggeri (Author) commented

Sorry for the super slow reply. It's been hectic.

@glestaris, I'm good with removing nohup and doing some testing around that based on the reply from @bgandon. No problem... so long as it scratches the itch of using a private registry in the long run. If you can confirm the nohup weirdness is the only concern, I can give it a whirl.

Regarding @bgandon's comments about flipping this over to pre-start, I'm not sure that would work, since we cannot log in to the registry until after the docker job starts. The job lifecycle docs point out that all of the pre-start scripts are run and waited on before moving to the next phase (calling the start scripts).

I think a simple, happy medium could be to move the login logic to the containers job's start script, implementing the same wait logic there for the daemon to come up. An added side benefit is that doing so avoids some of the noise in the logs when the containers job tries to start things but the daemon is not yet up. The downside, as @drnic points out, is that only the containers job benefits.

Thoughts? I am very motivated to see this make it upstream... if for no other reason than so I don't have to maintain a custom patch 😁.

owwweiha commented

@DRuggeri it would be really great to have this feature upstream... I would really need it ;)

DRuggeri (Author) commented

@Onke, agreed, but it doesn't seem like the maintainers are too interested. Still awaiting feedback.
