Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-46515] exit the Launcher process on 4XX errors #193

Merged

Conversation

witokondoria
Copy link
Member

while keep trying connecting on 5XX.

manually tested via deletion of a nodes/agentId folder at a jenkins installation and restart of the Jenkins service: the JNLP process stops.

@witokondoria witokondoria changed the title [JENKINS-46515] exit the process on 4XX errors [JENKINS-46515] exit the Launcher process on 4XX errors Aug 31, 2017
@oleg-nenashev oleg-nenashev self-assigned this Aug 31, 2017
@oleg-nenashev oleg-nenashev self-requested a review August 31, 2017 13:27
Copy link
Member

@oleg-nenashev oleg-nenashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns about this PR, but I need to do a deeper review of the code. It will be delayed a bit due to the conferences.

@jenkinsci/code-reviewers , some feedback would be really useful

@@ -539,6 +543,11 @@ private SSLSocketFactory getSSLSocketFactory()
throw x;
} else
throw e;
} catch (FileNotFoundException e) {
System.err.println("Failing to obtain "+slaveJnlpURL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there is direct usage of System I/O below as well :( Maybe needs refactoring (not in this PR)

System.err.println("Failing to obtain "+slaveJnlpURL);
e.printStackTrace(System.err);
System.err.println("Will silently exit without errors");
System.exit(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that Exiting without errors && with the zero error code is fine here To be reviewed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea after this zero exit code is to take advantage of KostyaSha/yet-another-docker-plugin#186: Defining a docker template with a restart policy (on-failure) would keep the agent alive while it is registered at the Jenknins master.
Once (If ever) it gets de-registered, this modified code would stop retrying and let the container die (wont be deleted)

Nevertheless, if anyone finds an objection, the exit code could be setup via a new flag. I would keep in mind that this PR tries to cover a case where agents are being erased after a Jenkins restart, while the agent itself is/will be back alive (leaving zombie agents trying to reconnect). As the real issue is almost unreproducible, this PR is about a defensive behaviour.

@jeffret-b
Copy link
Contributor

I share Oleg's concern here. It is weird to exit without warnings and with a zero exit code. This may help in some limited situations, but it introduces unusual and dramatic behavior in all situations. As this is intended to introduce a defensive behavior for certain, hard-to-reproduce situations the proposed fix seems too broad.

I would need to understand the situations and how this proposal resolves them better before approving this approach. I would want to see about alternatives.

@witokondoria
Copy link
Member Author

Forgot about this PR...

@jeffret-b: true, it is weird exiting with a zero exit code but the current situation involves keeping forever a remoting connection attempt if the Computer at the Jenkins master has been erased.

Terminating the connection reattempts on a 404 seems safe to me (disregarding the exit status value)

Copy link
Contributor

@jeffret-b jeffret-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to accept this PR. It looks like it may help in the specific scenario intended and probably won't harm in others.

I'd like to get a second approval before proceeding, though. @oleg-nenashev or @jvz, what do you think?

@jeffret-b
Copy link
Contributor

@oleg-nenashev @jvz What do you think?

@jvz
Copy link
Member

jvz commented Mar 4, 2019

No strong opinion from me on this. I'd probably be more at ease with a feature flag added.

@jeffret-b jeffret-b merged commit 21e59c5 into jenkinsci:master Apr 30, 2019
@jtnord
Copy link
Member

jtnord commented May 28, 2019

noting that this introduced a regression in a cloud plugin.
https://issues.jenkins-ci.org/browse/JENKINS-57713

jeffret-b added a commit to jeffret-b/remoting that referenced this pull request Jun 11, 2019
…entErrorCodesHandling"

This reverts commit 21e59c5, reversing
changes made to ab254bb.
@jeffret-b
Copy link
Contributor

Reverting in #328, with the intention of creation Remoting 3.31 without it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants