Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIXED JENKINS-41316] Switch 'inside' back to CMD, detect if entrypoint was badly designed #116

Merged
merged 6 commits into from
Jan 25, 2018

Conversation

ndeloof
Copy link
Contributor

@ndeloof ndeloof commented Sep 29, 2017

See discussion on #85

…d is well executed and warn user if this is not the case

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
jglick
jglick previously requested changes Sep 29, 2017
* @return The container ID.
*/
public String run(@Nonnull EnvVars launchEnv, @Nonnull String image, @CheckForNull String args, @CheckForNull String workdir, @Nonnull Map<String, String> volumes, @Nonnull Collection<String> volumesFromContainers, @Nonnull EnvVars containerEnv, @Nonnull String user, @Nonnull String entrypoint) throws IOException, InterruptedException {
public String run(@Nonnull EnvVars launchEnv, @Nonnull String image, @CheckForNull String args, @CheckForNull String workdir, @Nonnull Map<String, String> volumes, @Nonnull Collection<String> volumesFromContainers, @Nonnull EnvVars containerEnv, @Nonnull String user, @CheckForNull String... command) throws IOException, InterruptedException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CheckForNull on varargs?! Please no…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side effect from git revert

@@ -139,6 +146,30 @@ public String run(@Nonnull EnvVars launchEnv, @Nonnull String image, @CheckForNu
}
}

public List<String> listProcess(@Nonnull EnvVars launchEnv, @Nonnull String containerId) throws IOException, InterruptedException {
LaunchResult result = launch(launchEnv, false, "top", containerId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

top is even less likely to be there. What about ps ax? Or pgrep?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is docker top command

Copy link
Contributor Author

@ndeloof ndeloof Sep 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actual implementation for docker top (here) is to run ps on host and filter result to restrict to container's processes.

assumeDocker();
WorkflowJob p = story.j.jenkins.createProject(WorkflowJob.class, "p");
p.setDefinition(new CpsFlowDefinition(
"docker.image('maven:latest').inside {\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, what is the behavior of maven:latest today? Simply breaking it does not seem like a viable option, as this image is widely used. Is there an updated version which works well? Or an option which can be passed to image?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maven just works, official docker image validation process do include check for this recommendation for entrypoint to accept an alternative command.
(https://github.com/carlossg/docker-maven/blob/master/jdk-8/mvn-entrypoint.sh#L39)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maven just works

Then why do you need to delete this test case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test case was removed as I reverted commit 95df0ee. It can be re-introduced.

final List<String> ps = dockerClient.listProcess(env, container);
if (!ps.contains("cat")) {
listener.error("The container started but didn't ran the expected command. Please double check your Entrypoint does execute the command passed as docker run argument. See https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#entrypoint for entrypoint best practices.");
throw new IOException("Failed to run container with CMD `cat`");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to print the warning but let it continue, in case the user knows something we do not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure about this, good point

@@ -177,6 +177,12 @@ private static void destroy(String container, Launcher launcher, Node node, EnvV
}

container = dockerClient.run(env, step.image, step.args, ws, volumes, volumesFromContainers, envReduced, dockerClient.whoAmI(), /* expected to hang until killed */ "cat");
final List<String> ps = dockerClient.listProcess(env, container);
if (!ps.contains("cat")) {
listener.error("The container started but didn't ran the expected command. Please double check your Entrypoint does execute the command passed as docker run argument. See https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#entrypoint for entrypoint best practices.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ran/run

@ndeloof
Copy link
Contributor Author

ndeloof commented Sep 29, 2017

I'm not sure about this build failure, test fail on my workstation with Error response from daemon: Mounts denied: - As /var on OSX is actually /private/var I can't add this path to my docker4mac configuration

@ndeloof
Copy link
Contributor Author

ndeloof commented Sep 29, 2017

I'm not sure what initially caused JENKINS-37987.

@ndeloof
Copy link
Contributor Author

ndeloof commented Sep 30, 2017

Asked Alvaro about the initial error in JENKINS-37987 but can't get info to reproduce and investigate further

@@ -177,6 +177,11 @@ private static void destroy(String container, Launcher launcher, Node node, EnvV
}

container = dockerClient.run(env, step.image, step.args, ws, volumes, volumesFromContainers, envReduced, dockerClient.whoAmI(), /* expected to hang until killed */ "cat");
final List<String> ps = dockerClient.listProcess(env, container);
if (!ps.contains("cat")) {
listener.error("The container started but didn't run the expected command. Please double check your ENETRYPOINT does execute the command passed as docker run argument. See https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#entrypoint for entrypoint best practices.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo here: ENETRYPOINT should be ENTRYPOINT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this needs to be fixed. =)

Copy link
Member

@abayer abayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo fix needed, otherwise looks good to me.

@@ -177,6 +177,11 @@ private static void destroy(String container, Launcher launcher, Node node, EnvV
}

container = dockerClient.run(env, step.image, step.args, ws, volumes, volumesFromContainers, envReduced, dockerClient.whoAmI(), /* expected to hang until killed */ "cat");
final List<String> ps = dockerClient.listProcess(env, container);
if (!ps.contains("cat")) {
listener.error("The container started but didn't run the expected command. Please double check your ENETRYPOINT does execute the command passed as docker run argument. See https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#entrypoint for entrypoint best practices.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this needs to be fixed. =)

@ndeloof
Copy link
Contributor Author

ndeloof commented Oct 24, 2017

@cyrille-leclerc yes indeed.

@abayer
Copy link
Member

abayer commented Nov 1, 2017

@jglick - are you still a -1 on this?

@abayer abayer requested a review from jglick November 9, 2017 16:01
@jglick jglick dismissed their stale review November 13, 2017 15:20

subsequent discussion & clarifications

@@ -129,7 +134,9 @@ public String run(@Nonnull EnvVars launchEnv, @Nonnull String image, @CheckForNu
argb.add("-e");
argb.addMasked(variable.getKey()+"="+variable.getValue());
}
argb.add("--entrypoint").add(entrypoint).add(image);
if (command != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

command should be @Nonnull. If you wish to permit no specified commands (which I do not recommend, since AFAIK there is only one caller of this method, and it passes a command), then you would check command.length > 0, which would anyway be gratuitous since add(String...) would just be a no-op in that case.

assumeDocker();
WorkflowJob p = story.j.jenkins.createProject(WorkflowJob.class, "p");
p.setDefinition(new CpsFlowDefinition(
"docker.image('maven:latest').inside {\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maven just works

Then why do you need to delete this test case?

@nervo
Copy link

nervo commented Nov 21, 2017

Can i point out that forcing "cat" in entrypoint, or in cmd (and check processes if "cat" is running) breaks this example in the doc, as we have no ways to start services: https://jenkins.io/doc/book/pipeline/docker/#running-sidecar-containers

@ndeloof
Copy link
Contributor Author

ndeloof commented Nov 23, 2017

@nervo as long as you can run entrypoint you can start services. Just need to follow docker best practice to eventually execute the command passed as docker run argument.
I also don't understand your note on breaking the sidecar pattern. This only applies to docker.inside, where we need to start the container in some "infinite waiting" state

@nervo
Copy link

nervo commented Nov 23, 2017

@ndeloof thanks for your answer :)
I think i got your point: by forcing the cmd (and only the cmd) to "cat", you allow a custom entrypoint to be played. However, this custom entrypoint MUST follow best practice, and execute the cmd, and that's exactly what you check. Clever.

You are also right, i did'nt realize that th sidecar pattern use "withRun" and not "inside"...

Sorry for the polution, and thanks for your hard work.

@jchauncey
Copy link

Do you have any idea when this fix might be released? For some reason we are starting to see this problem after a recent upgrade of jenkins (without any plugin upgrades) and its causing us some issues.

@ndeloof ndeloof merged commit 6df7676 into jenkinsci:master Jan 25, 2018
@dbingham
Copy link

Help me out here @ndeloof ... where do you get the idea that a cmd must be executed by the entrypoint? In fact, the docker docs linked from the new error message say just the opposite! i.e.

The best use for ENTRYPOINT is to set the image’s main command, allowing that image to be run as though it was that command (and then use CMD as the default flags).

The example given is s3cmd for the ENTRYPOINT and --help for the CMD. In that case, forcing CMD to cat would cause the container to exit with an error code because cat isn't a valid argument to s3cmd. This is the situation I find myself in now that docker-pipeline 1.15 has broken all my builds.

@dbingham
Copy link

dbingham commented Jan 31, 2018

FWIW, I'm not doing anything crazy here... just using a pipeline that looks roughly like this:

pipeline {
  agent any

  stages{
    stage('Build') {
      agent {
        docker {
          reuseNode true
          image 'myImage'
          args '-v ... -v ... -e ...'
        }
      }
      steps{
        sh 'compile-command'
      }
    }
  }
}

The image in question here is the same one developers use in our org. So, allowing the ENTRYPOINT to run as if the image were the command (like the s3cmd example above) is what we want. So, if I'm understanding the change correctly, I either have to add an arg to override the entrypoint with cat myself to make your check happy (which seems like leaking an implementation detail... why should I have to know that you are going to check for a cat process?), or I have to build a new image specifically for Jenkin's use that behaves differently (which violates the tenet that developers do their local builds/testing using the exact same environment that Jenkins uses... which is the whole value that Docker provides here for us).

What am I missing?

@ndeloof
Copy link
Contributor Author

ndeloof commented Jan 31, 2018

read the doc further 👍

The ENTRYPOINT instruction can also be used in combination with a helper script, allowing it to function in a similar way to the command above, even when starting the tool may require more than one step.
For example, the Postgres Official Image uses the following script as its ENTRYPOINT:
(...) exec "$@"
Note: This script uses the exec Bash command so that the final running application becomes the container’s PID 1. This allows the application to receive any Unix signals sent to the container. See the ENTRYPOINT help for more details.

Actually, all official docker images are REQUIRED by Docker review to allow passing an alternate command, typically you can run docker run maven bash to inspect container filesystem, while you still can run docker run maven install to just pass a lifecycle command to maven.
This is up to your entrypoint script to detect this scenario.

Please also note bypassing entrypoint was a major regressions. Many images do rely on entrypoint script to run some initial setup, or start background services (typically, selenium image to bootstrap a X11 server).

@dbingham
Copy link

Where do you see this?

Actually, all official docker images are REQUIRED by Docker review to allow passing an alternate command

Seems hard to justify calling it “best practice” based on linked documentation that starts with “The best use for ENTRYPOINT is...” and then proceeds with an example that conflicts with your expected usage.

Yes, the other pattern is there too. But in no way does that negate the first example which is identified as the “best use”.

@ndeloof
Copy link
Contributor Author

ndeloof commented Jan 31, 2018

I'm not sure it's documented, but I have the experience to maintain official jenkins docker image and this was a requirement.
Anyway even if documentation might look unclear, just check other images, and you'll notice this entrypoint design.

@dbingham
Copy link

dbingham commented Jan 31, 2018

Fair enough. And I respect your experience. But to force a behavior that directly conflicts with public documentation for Docker based on non-public internal guidelines somewhere seems like a bug to me. How can you reasonably expect image authors to follow practices that aren’t documented clearly and which are in direct conflict with part of what is documented clearly?

Either Docker docs need fixing or this plugin does. But insisting on undocumented “standards” isn’t a solution. It is trading one regression for another.

@ndeloof
Copy link
Contributor Author

ndeloof commented Jan 31, 2018

Then maybe https://github.com/docker-library/official-images#consistency would be a better link.
Anyway docker-pipeline plugin do use docker a non standard way. As there's no way to exec in a non-running container we need to hack the standard lifecycle, so there will always be many corner cases to not work out of the box.
Consider this a requirement for docker images so they can be used by docker.inside.

@dbingham
Copy link

I think using that link and referring to it as "plugin requirements" or "plugin expectations" would be more correct than "best practices" given the public Docker docs. Maybe something like this:

ERROR: The container started but didn't run the expected command. Please double check your ENTRYPOINT does execute the command passed as docker run argument. See https://github.com/docker-library/official-images#consistency for Docker's description of the ENTRYPOINT behavior required by this plugin.

However, it is worth mentioning that your requirements are actually more robust than just a specific ENTRYPOINT behavior. If I'm not mistaken, you also require a shell available in the image (probably a specific one? bash maybe?). Perhaps you have other requirements too. It would be ideal if you would document those somewhere and then link THAT doc from the error message (instead of Docker's docs).

In that documentation, given that you require a shell, it should also be trivial to recommend a workaround for image authors that, for whatever reason, cannot or will not make their entrypoints behave as you expect. Particularly, you could specify that if your ENTRYPOINT does not follow that convention, you will need to override it with a --entrypoint flag that does. The most trivial example being something like sh -c 'eval "$@"' (I didn't test that, so obviously you'd want a tested, known good simple case like this).

The combination of better documentation around this behavior, clear expectations, and simple workarounds for when your expectations cannot be met directly would setup users for success in a much greater way than relying on a failure to happen first, and then the user to read and interpret this cryptic, misleading, and incorrect message: "ERROR: The container started but didn't run the expected command. Please double check your ENTRYPOINT does execute the command passed as docker run argument. See https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#entrypoint for entrypoint best practices."

@ndeloof
Copy link
Contributor Author

ndeloof commented Jan 31, 2018

no, if the entrypoint does not follow this recommendation we should not override it, or the image will run in some unexpected state both by the plugin and image author.
What make you think we expect a specific shell ?

@dbingham
Copy link

I'm not suggesting that you automatically override it. I'm suggesting that you could document for users how they can easily override their non-conformant ENTRYPOINT if they want to.

What make you think we expect a specific shell ?

I thought I remembered reading that in one of the JIRA threads or PR threads related to this issue and/or preceding ones. But I can't find it now, so I don't recall where I thought I saw it.

But in any case, you have to run steps inside the container with something right? I assumed it is a shell. In any case, that isn't super relevant. I was just speculating as to possible other expectations that the plugin might have. Whatever they are, it would be helpful to have them documented.

I've filed a JIRA for the documentation issues and linked back to this thread for background.

As you'll see in my JIRA, it turns out there is an easier workaround than the shell based snippet I pasted above. Docker allows simply unsetting the ENTRYPOINT via a run arg of --entrypoint "". As the most trivial possible workaround, that is sufficient. If a pipeline/image author wants more advanced behavior, obviously they could override ENTRYPOINT with whatever command they wanted. Provided that their substitute command honors the plugin's expectations (of eventually executing unexpected CMD values as a command rather than an arg).

@ndeloof
Copy link
Contributor Author

ndeloof commented Jan 31, 2018

plugin indeed run cat to force the container to stay in some pending IO wait state. This is required as docker don't let us exec in a stopped container (sic). This indeed should be documented and better reported when something goes wrong (for sample, some weird container not having this common command in PATH)

@myoung34
Copy link

myoung34 commented Feb 1, 2018

Our environment rebuilds are broken due to this. They use the terraform container and no longer work.

@myoung34
Copy link

myoung34 commented Feb 1, 2018

11:53:41 $ docker run -t -d -u 10000:10000 -u 0 -w /var/jenkins_home/workspace/Environment_Rebuild_master-H4NNFES5KCILFZZIYUFIHNUT7P6FIWBPBEZA4TEKUDL2E2ILLNRA/terraform --volumes-from 381af2f7bed3bd8749f916422f7776367afc2537d86429fb8408a47172523d7b -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** hashicorp/terraform:0.11.2 cat
11:53:42 $ docker top 2463f11ecc8add34688abd78af438e3c65a657ab242b15d3ccb16edece928f13
11:53:43 ERROR: The container started but didn't run the expected command. Please double check your ENTRYPOINT does execute the command passed as docker run argument. See https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#entrypoint for entrypoint best practices.

@chrono
Copy link

chrono commented Feb 1, 2018

This seems to break a whole bunch of scripted docker.inside and declarative docker pipelines for us. What's the recommended upgrade path? Clearly the plugin changed its behaviour and is no longer working for us. Was there a deprecation that we missed?

@ndeloof
Copy link
Contributor Author

ndeloof commented Feb 1, 2018

This is pretty annoying.
Replacing entrypoint introduced a regression, and we introduced another one here. And we can't get both :'(

@dbingham
Copy link

dbingham commented Feb 1, 2018

@chrono I'm not in a position to test extensively at the moment, but if you can add a docker arg of --entrypoint="" to your calls, it should simulate the behavior before this change went in.

nervo added a commit to manala/jenkins-pipeline-library that referenced this pull request Feb 5, 2018
nervo added a commit to manala/jenkins-pipeline-library that referenced this pull request Feb 5, 2018
@nervo
Copy link

nervo commented Feb 6, 2018

I wonder if the problem is not related to forcing --user parameter. Entrypoints often need privileged access (aka. root) to works (such as starting services).
@ndeloof, is it the purpose of #130 ?

@@ -177,6 +177,11 @@ private static void destroy(String container, Launcher launcher, Node node, EnvV
}

container = dockerClient.run(env, step.image, step.args, ws, volumes, volumesFromContainers, envReduced, dockerClient.whoAmI(), /* expected to hang until killed */ "cat");
final List<String> ps = dockerClient.listProcess(env, container);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hesco on #jenkins reports that this line can itself fail, breaking the entire step for the sake of a diagnostic. Root cause unknown.

@Arshdeep88
Copy link

This is pretty annoying.
Replacing entrypoint introduced a regression, and we introduced another one here. And we can't get both :'(

Based on this comment I am assuming that the issue is not fixed? Also I am running into this issue now? Can anyone point me to some useful documentation regarding this?

@brightshine1111
Copy link

So this was merged, but Jenkins is back to overriding the docker entrypoint again somehow. I can't find where the issue is in the source.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.