-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not swallow IOException in case it is not recoverable #41
Conversation
@jenkinsci/code-reviewers, pretty please. I would like to get this released soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear on what IOException
s we'd expect to be recovering from after this change - the point of the retry seems to be to overcome any network issues which this change now seems to be preventing, so why still have the retry?
// Premature connection close | ||
private boolean isRecoverable(String message) { | ||
List<String> knownFailures = Arrays.asList( | ||
"Connection refused", "Connection reset", "Connection timed out", "No route to host", "Premature connection close" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this list not be made a static field in this class rather than re-generating it for every failed connection attempt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be
I do not follow. It is supposed to retry on exception that is considered recoverable and throws those that are not right away. |
6ed4ffd
to
43c463d
Compare
LGTM |
…low IOException in case it is not recoverable.
43c463d
to
44d7e2f
Compare
Sorry, I'd read the logic the wrong way round! However, we're being very specific about the exceptions we're allowing to try and recover from. Are any exceptions outside of the list you've specified likely to be transient conditions that the next connection attempt may overcome? The IllegalArgumentException that's being shown in the logs is misleading ('This may be a bug in Jenkins'), and should be handled better, but I'm not convinced that white-listing certain messages is a good way to go. |
What you are referring to is probably:
This is no longer possible with this patch. The old code retried iff it knew it was recoverable and continued execution if it was not considered recoverable causing vague exception of first connection use ( |
I'd probably prefer to see a flag set inside the I'm not clear on what errors we think we're going to fail-fast with (you've covered off a large number of the normal network failures on your recoverable list), or the impact of re-attempting connection if they occur, so I'd just log them and continue trying unless there's a security or performance impact on doing so. This means we're consistent on re-trying for all network failures, but give the user some indication of what they are, and then don't print the |
That is not necessary. The only way for that retry loop to ever finish is by successful connection or throwing an exception.
See above. That is what this fix does exactly.
I do not know where to find an authority to provide the necessary guidance here. The retry code was in effect for a while so I see this as a step to more complete list that will likely be a subject of further refinement. I have been using this for testing and in production for a couple of days and it work better. |
I will go an and release this later this week unless there are objections. |
We tried building this patch and updating our Jenkins to use ssh-slaves plugin with this patch included but still get failure as below. The key is available on the server though and I'm able to manually login to the system through my own SSH client. ERROR: Server rejected the 1 private key(s) for jenkins (credentialId:test-credential/method:publickey) |
@zxiiro, that seems unrelated to this patch. The |
@olivergondza it does seem to be a different issue then. I'll continue to investigate and open a new issue. Did something change in OpenStack plugin around authentication? because this works if we revert OpenStack plugin to 2.11. |
This is a change on openstack side of things: jenkinsci/openstack-cloud-plugin#137 |
There are 2 changes here:
https://issues.jenkins-ci.org/browse/JENKINS-41163
https://issues.jenkins-ci.org/browse/JENKINS-26379