Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-55287] Improve shutdown-related logging #354

Merged
merged 3 commits into from
Aug 11, 2020

Conversation

dwnusbaum
Copy link
Member

@dwnusbaum dwnusbaum commented Mar 30, 2020

See JENKINS-55287.

While looking through some potentially-relevant code for that issue, I noticed that the logging when saving CpsFlowExecutions using the PERFORMANCE_OPTIMIZED durability level was a bit inconsistent, didn't really have a single clear message of whether everything was successful or not, didn't log anything for a few conditions that should never happen, used a default impl of toString (e.g. MyClass@hash) in some cases that were intending to print the name of the flow execution, and mentioned shutdown in some cases where Jenkins isn't shutting down.

Opening as a draft PR for now, because I want to look through the logs of the tests before and after the change to see if anything interesting stands out, and maybe add some assertions on specific log messages to existing tests. I did check the output of some tests in PersistenceProblemsTest just as an overall sanity check.

More detailed analysis of JENKINS-55287 is in #363 (which incorporates this PR). I think this PR is still useful, so I am marking it as ready for review so it can be reviewed and merged independently from #363

LOGGER.log(Level.FINE, "Successfully saved program for: {0}", this);
} else {
persistOk = false;
LOGGER.log(Level.WARNING, "Unable to persist program because it was never loaded for: {0}", this);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should ever happen, but definitely worth a log message if it does!

}

} else {
persistOk = false;
LOGGER.log(Level.WARNING, "No FlowNode storage for: {0}", this);
Copy link
Member Author

@dwnusbaum dwnusbaum Mar 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, should never happen, but if it does, I definitely want to know.

@@ -1978,66 +1978,80 @@ private void checkpoint() {
// Nothing to persist OR we've already persisted it along the way.
return;
}
LOGGER.log(Level.INFO, "Attempting to save a checkpoint of {0} before shutdown", this);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message and the one at the end of CpsFlowExecution.checkpoint are now the only messages that are logged at INFO level if everything is successful. Any errors are logged at WARNING level.

}

// Try to ensure we've saved the appropriate things -- the program is the last stumbling block.
try {
final SettableFuture<Void> myOutcome = SettableFuture.create();
LOGGER.log(Level.INFO, "About to try to checkpoint the program for build"+this);
LOGGER.log(Level.FINE, "About to try to checkpoint the program for: {0}", this);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we only care about the overall failure/success, so this was moved to FINE.

} catch (ExecutionException | InterruptedException ex) {
persistOk = false;
LOGGER.log(Level.FINE, "Error saving program, that should be handled elsewhere.", ex);
LOGGER.log(Level.WARNING, "Error saving program for: " + this, ex);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved up from FINE to WARNING since the logging for exceptions in onSuccess and onFailure was moved down to FINE.

if (programPromise != null && programPromise.isDone()) {
runInCpsVmThread(new FutureCallback<CpsThreadGroup>() {
@Override
public void onSuccess(CpsThreadGroup result) {
try {
LOGGER.log(Level.INFO, "Trying to save program before shutdown "+this);
LOGGER.log(Level.FINE, "Trying to save program for: {0}", CpsFlowExecution.this);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to CpsFlowExecution.this so that the log messages have a string like CpsFlowExecution[Owner[JobName/BuildNumber:JobName #BuildNumber]] instead of CpsFlowExecution$8@RandomHash.

if (persistOk) {
LOGGER.log(Level.INFO, "Successfully saved a checkpoint of {0} before shutdown", this);
} else {
LOGGER.log(Level.WARNING, "Unable to save a checkpoint of {0} before shutdown. The build will probably fail when Jenkins restarts.", this);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have one message to look for per build that will show up if any of the program, flow nodes, or build itself were unable to be saved successfully.

@@ -245,7 +244,6 @@ public void inProgressNormal() throws Exception {
}

@Test
@Ignore
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests have been ignored since they were added in #223. They were unignored when the tests were duplicated over in jenkinsci/workflow-job-plugin#104 (see jenkinsci/workflow-job-plugin@35b435d), so I think the fact that they are still ignored here was just an accident. Since I was running these tests anyway to test the logging changes, I went ahead and unignored them.

@dwnusbaum dwnusbaum marked this pull request as ready for review August 10, 2020 20:59
@dwnusbaum dwnusbaum changed the title [JENKINS-55287] Improve shutdown-related logging to help investigate JENKINS-55287 [JENKINS-55287] Improve shutdown-related logging Aug 10, 2020
@bitwiseman bitwiseman closed this Aug 11, 2020
@bitwiseman bitwiseman reopened this Aug 11, 2020
@dwnusbaum dwnusbaum merged commit a86a2c4 into jenkinsci:master Aug 11, 2020
@dwnusbaum dwnusbaum deleted the JENKINS-55287-logging branch August 11, 2020 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants