The elastic-agent upgrade command can fail even though the upgrade succeeds #3890
Comments
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
New occurrence in https://buildkite.com/elastic/elastic-agent/builds/6981

    // start the control listener
    if err := control.Start(); err != nil {
        return err
    }
    defer control.Stop()

According to grpc documentation:
I will try to change to a |
another occurrence: |
The simplest option here is probably to stop treating EOF from the Upgrade RPC call happening here as an error. We can print out a message to check |
@cmacknz I'll see what I can do next week. |
just for the record, another instance of it: |
It looks like it's still failing when we upgrade from a version that does not have the fix (https://buildkite.com/elastic/elastic-agent/builds/8162#018eb48e-5cdf-4daf-bf77-561e91e642ee), despite the check I introduced in the upgrader: elastic-agent/testing/upgradetest/upgrader.go, lines 333 to 336 in bdff582
Looks like I need to check for a substring in the command output, not the error message. Will open another PR. |
First observed in https://buildkite.com/elastic/elastic-agent/builds/5460#018c4990-62ec-4dc1-a9c0-81953d923d80. One of the package version tests failed because the elastic-agent upgrade command exited with an EOF error.

However, looking at the agent logs, the upgrade had actually started:

If we look at the log timestamps in Buildkite for when the failure occurred:

We see the agent re-exec-ed at 2023-12-08T13:47:58.914 and the elastic-agent upgrade error occurred later at 2023-12-08 08:58:19 (the timezones don't match, but the minutes and seconds are accurate).

It seems as if the agent re-exec-ed before responding to the elastic-agent upgrade command. We can either fix that or make elastic-agent upgrade asynchronous and expect users and our tests to watch upgrade progress through the elastic-agent status command.