Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ExitLifecycle when running in a container #1268

Merged
merged 1 commit into from
Jan 11, 2022

Conversation

basil
Copy link
Member

@basil basil commented Jan 11, 2022

When running in a container, the default UnixLifecycle (with all its fragile low-level systems logic) is unnecessary. The container orchestrator effectively functions as the service manager, so allow the process to exit and be restarted by the container orchestrator rather than trying to do this ourselves in UnixLifecycle. Similar changes should be made to the non-container-based packaging, but this is a start.

I tested this by running Jenkins in Docker with this change and applying a Docker restart policy of always to the container. (I assume Kubernetes has its own equivalent for this, but I don't use Kubernetes.) I verified that by default -Dhudson.lifecycle=hudson.lifecycle.ExitLifecycle was set and that after going to /safeRestart the process exited and Docker restarted the container automatically, resulting in the same behavior as with UnixLifecycle. I also verified that when manually adjusting JAVA_OPTS to override -Dhudson.lifecycle that my overridden choice was respected.

Fixes jenkinsci/helm-charts#529

@timja timja merged commit 0ab3a31 into jenkinsci:master Jan 11, 2022
@MarkEWaite
Copy link
Contributor

MarkEWaite commented Jan 11, 2022

When I add that setting to my Docker image run script, Jenkins does not restart after I've installed a new plugin and clicked the "Restart Jenkins when installation is complete and no jobs are running" checkbox.

That behavior happens when I run the script in the foreground or use the docker --detach argument to detach and run in the background.

It restarted previously when I used the default value for the hudson.lifecycle. I think that we need more explanation for users or more guidance for those that run with a docker command like I do.

If I use the /restart URL in my configuration, the docker process exits. Previously, it would restart the process and continue operating. I think this will be perceived as a regression by users running the docker command.

If I add the --restart command to my Docker command line, then I can use the /restart URL to restart Jenkins. Unfortunately, I can't use the "Restart Jenkins when installation is complete and no jobs are running" checkbox even with the --restart argument.

@basil basil deleted the lifecycle branch January 11, 2022 20:24
@timja
Copy link
Member

timja commented Jan 11, 2022

Right it would do that, might be better to move this up to the orchestrator level instead

@basil
Copy link
Member Author

basil commented Jan 11, 2022

You just need to add --restart on-failure to your docker run command, or modify your existing container restart policy with docker update --restart on-failure <container-id>. I agree that this should be called out as a breaking change in the release notes with the above advice suggested for any users that use Docker without a container orchestrator like Kubernetes.

@MarkEWaite
Copy link
Contributor

I think we should reconsider this change so that we are not introducing a breaking change to Docker behavior in the same release that delivers security fixes. Jenkins 2.330 and Jenkins 2.319.2 releases tomorrow are security releases. They will use this Docker image packaging.

We can bring the change back right after the security release is delivered. I think it is the right change, but introducing it at the same time as a security release is too disruptive for users.

@MarkEWaite
Copy link
Contributor

You just need to add --restart on-failure to your docker run command, or modify your existing container restart policy with docker update --restart on-failure <container-id>. I agree that this should be called out as a breaking change in the release notes with the above advice suggested for any users that use Docker without a container orchestrator like Kubernetes.

I used --restart always and that restored the behavior of the /restart URL. It did not restore the behavior of the plugin manager checkbox "Restart Jenkins when installation is complete and no jobs are running"

@timja
Copy link
Member

timja commented Jan 11, 2022

yeah my bad, will back it out.

@basil
Copy link
Member Author

basil commented Jan 11, 2022

I don't feel strongly that this change be delivered as part of the next security release, but I do feel that this is the right layer for the change (rather than the Helm chart). Even when Docker is used without Kubernetes, I believe that using its restart policy to manage the container lifecycle is preferable to managing this ourselves with exec.

timja added a commit to timja/jenkins-ci.org-docker that referenced this pull request Jan 11, 2022
@basil
Copy link
Member Author

basil commented Jan 11, 2022

I used --restart always and that restored the behavior of the /restart URL. It did not restore the behavior of the plugin manager checkbox "Restart Jenkins when installation is complete and no jobs are running"

I suspect user error. I tested this scenario and it works fine for me.

timja added a commit that referenced this pull request Jan 11, 2022
timja added a commit that referenced this pull request Jan 11, 2022
@MarkEWaite
Copy link
Contributor

I suspect user error. I tested this scenario and it works fine for me.

When I use unmodified jenkins/jenkins:2.319.1, the "Restart Jenkins when installation is complete and no jobs are running" checkbox works. When I use my configuration from lts-with-plugins, the "Restart Jenkins when installation is complete and no jobs are running" checkbox is always enabled and has no effect. I'll continue exploring the differences between my initial configuration and the default initial configuration.

@basil
Copy link
Member Author

basil commented Jan 11, 2022

The scenario I tested was:

  • jenkins/jenkins:latest (90afe7999758) with jenkins.sh patched with the changes in this PR
  • Container started with restart-policy of on-failure
  • Out-of-date plugins

In this scenario I verified that:

  • ExitLifecycle was passed in via JVM arguments
  • /safeRestart correctly resulted in the JVM process exiting and Docker restarting the existing container (verified by monitoring docker ps)
  • Upgrading the plugins gave me the option to check "Restart Jenkins when installation is complete and no jobs are running", and after checking this box I get the same results as /safeRestart

Furthermore looking at the code I see no difference between the /safeRestart and "Restart Jenkins when installation is complete and no jobs are running" code paths, so these results make sense.

@basil
Copy link
Member Author

basil commented Apr 7, 2022

We can bring the change back right after the security release is delivered. I think it is the right change, but introducing it at the same time as a security release is too disruptive for users.

But three months later?

@MarkEWaite
Copy link
Contributor

We can bring the change back right after the security release is delivered. I think it is the right change, but introducing it at the same time as a security release is too disruptive for users.

But three months later?

I think we've had enough time that we should bring this back. The minor change in the use case that I have found (restart causes my docker process that appears to be running in the foreground to switch to running in the background) is easy enough to document and not important enough to offset the benefits of moving lifecycle management to Docker.

@basil
Copy link
Member Author

basil commented Apr 7, 2022

We already have a PR open at #1270 so my understanding is that it just needs to be approved and merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set lifecycle to exit
3 participants