[JENKINS-58900] Improve error messages for agent disconnections inside of node #115

Merged
merged 5 commits into from Sep 5, 2019

dwnusbaum
Member

See JENKINS-58900.

The problem is that you get a MissingContextVariableException where you would expect some kind of connection-related exception. This was seen on some builds of Jenkins core on ci.jenkins.io, where the Windows agent apparently disconnected but the error in the logs was a MissingContextVariableException. I think this is a side effect of #101 (JENKINS-41854), but I haven't investigated enough to know for sure. If so, it may be fixable by throwing some kind of exception instead of just logging here:

LOGGER.log(Level.FINE, "failing to serve {0}:{1}", new Object[] {r.slave, r.path});

CC @oleg-nenashev
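
To illustrate the suggestion above (throw instead of only logging when the path cannot be served), here is a minimal sketch. It is not the plugin's actual code: `WorkspaceLookupSketch`, `AgentOfflineException`, `setOnline`, `resolve`, and the exception message are all hypothetical names invented for this example, and the "is the agent online" check is reduced to a map lookup.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the lookup discussed above; not the plugin's real code. */
class WorkspaceLookupSketch {
    private static final Logger LOGGER = Logger.getLogger(WorkspaceLookupSketch.class.getName());

    /** Hypothetical exception type that names the agent and path that could not be served. */
    static class AgentOfflineException extends IOException {
        AgentOfflineException(String agent, String path) {
            super("Unable to serve " + agent + ":" + path + " (agent appears to be offline)");
        }
    }

    // Assumption: connectivity is modeled as a simple agent-name -> online flag map.
    private final Map<String, Boolean> online = new HashMap<>();

    void setOnline(String agent, boolean isOnline) {
        online.put(agent, isOnline);
    }

    /** Returns a handle for the path, or throws instead of silently returning null. */
    String resolve(String agent, String path) throws AgentOfflineException {
        if (Boolean.TRUE.equals(online.get(agent))) {
            LOGGER.log(Level.FINE, "serving {0}:{1}", new Object[] {agent, path});
            return agent + ":" + path;
        }
        LOGGER.log(Level.FINE, "failing to serve {0}:{1}", new Object[] {agent, path});
        // Surfacing the failure here gives callers a connection-related error
        // rather than a later, unrelated-looking "missing context" error.
        throw new AgentOfflineException(agent, path);
    }

    public static void main(String[] args) {
        WorkspaceLookupSketch lookup = new WorkspaceLookupSketch();
        lookup.setOnline("agent-1", false);
        try {
            lookup.resolve("agent-1", "/workspace");
        } catch (AgentOfflineException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether throwing is safe for every caller of the real method is exactly the open question raised in this PR; the sketch only shows the shape of the change.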

@dwnusbaum
Member Author

java.nio.file.FileSystemException: /home/jenkins/.jenkins/cache/jars/3F: No space left on device

I filed #116 to hopefully make the build a bit more robust.

@dwnusbaum dwnusbaum closed this Aug 12, 2019
@dwnusbaum dwnusbaum reopened this Aug 12, 2019
@dwnusbaum
Member Author

win2012-4df5c0 was marked offline: Connection was broken: java.util.concurrent.TimeoutException: Ping started at 1565643975690 hasn't completed by 1565644215691

😩

@dwnusbaum dwnusbaum closed this Aug 12, 2019
@dwnusbaum dwnusbaum reopened this Aug 12, 2019
@@ -57,7 +59,14 @@
if (f != null) {
LOGGER.log(Level.FINE, "serving {0}:{1}", new Object[] {r.slave, r.path});
} else {
LOGGER.log(Level.FINE, "failing to serve {0}:{1}", new Object[] {r.slave, r.path});
Member Author

It is not obvious to me whether we were returning null here for any particular reason. ExecutorStepTest.contextualizeFreshFilePathAfterAgentReconnection passes with the change, but I am not sure whether there are other scenarios related to FilePathDynamicContext where the new behavior would be problematic.

@dwnusbaum dwnusbaum changed the title from "[JENKINS-58900] Add test for agent disconnections inside of node" to "[JENKINS-58900] Improve error messages for agent disconnections inside of node" Aug 23, 2019
@dwnusbaum dwnusbaum marked this pull request as ready for review August 23, 2019 18:42
dwnusbaum and others added 2 commits September 4, 2019 09:34
Co-Authored-By: Jesse Glick <jglick@cloudbees.com>
if (listener != null) {
OfflineCause oc = c.getOfflineCause();
if (oc != null) {
listener.getLogger().println(c.getDisplayName() + " was marked offline: " + oc);
Member

Note that this assumes OfflineCause.toString has been meaningfully overridden. That is definitely true of SimpleOfflineCause, and seems to be done for other subtypes as well, though really it ought to have been defined as abstract to ensure that it is implemented.
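
A small sketch of the point about declaring `toString` abstract, so that every subtype is forced to supply a meaningful message. The types here (`CauseSketch`, `PingTimeoutCauseSketch`, `OfflineMessageSketch`) are hypothetical stand-ins written for this example and do not mirror the real `OfflineCause` hierarchy.

```java
/** Hypothetical base type: re-declaring toString as abstract forces subclasses to override it. */
abstract class CauseSketch {
    @Override
    public abstract String toString();
}

/** Hypothetical subtype; it would not compile without its own toString. */
class PingTimeoutCauseSketch extends CauseSketch {
    private final String detail;

    PingTimeoutCauseSketch(String detail) {
        this.detail = detail;
    }

    @Override
    public String toString() {
        return "Connection was broken: " + detail;
    }
}

class OfflineMessageSketch {
    /** Builds the same style of message as the snippet under review; relies on cause.toString(). */
    static String describe(String displayName, CauseSketch cause) {
        return displayName + " was marked offline: " + cause;
    }

    public static void main(String[] args) {
        System.out.println(describe("agent-1", new PingTimeoutCauseSketch("ping timed out")));
    }
}
```

With a non-abstract `toString`, a subtype that forgets to override it would still compile and print something like `ClassName@1b6d3586`, which is the failure mode the comment above is warning about.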
