Closed
elastic/elastic-agent
#10070
Labels
Team:Ingest (Issues owned by the Ingest Docs Team)
Description:
A regression introduced in #9634 can cause Elastic Agent upgrades to get stuck if an upgrade attempt fails early. This happens because the coordinator’s overrideState remains set, leaving the agent in a state that appears to be upgrading.
Affected Versions:
- Versions 8.18.7, 9.0.7
- Fixed in #9992, which will be included in 9.1.4 (raised blocker), 8.19.4 (raised blocker), 9.0.8 (next patch release), and 8.18.8 (next patch release)
Conditions:
This issue is triggered if the upgrade fails during one of the early checks inside Coordinator.Upgrade, for example:
- The agent is not upgradeable
- Capabilities check denies the upgrade
- Most commonly: Elastic Agent is tamper-protected and Endpoint returns an error during action proxying, for example because the upgrade action signature is invalid, missing, or fails verification
Symptoms:
- Upgrade remains stuck (Fleet shows the upgrade action in progress)
- No further upgrade attempts succeed
- elastic-agent status shows an override state indicating upgrade
Workaround:
Restarting the Elastic Agent clears the coordinator’s overrideState and allows new upgrade attempts to proceed.
Resolution:
This bug is fixed by #9992, which ensures that the coordinator clears its override state whenever an early failure occurs.
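The pattern behind the fix can be sketched in Go as follows. This is a simplified model, not the code from #9992: the `coordinator` type, the `override` field, the `upgrade` method, and `errNotUpgradeable` are all hypothetical names chosen for illustration. It only shows the idea that every early-failure path clears the override state before returning.

```go
package main

import (
	"errors"
	"fmt"
)

// overrideState is a hypothetical stand-in for the coordinator's override
// state; a nil pointer means no override is active.
type overrideState struct {
	message string
}

// coordinator is a simplified model, not the real Elastic Agent type.
type coordinator struct {
	override *overrideState
}

// errNotUpgradeable models one of the early-check failures listed above.
var errNotUpgradeable = errors.New("agent is not upgradeable")

// upgrade marks the coordinator as upgrading, runs the early checks in
// order, and clears the override again if any check fails.
func (c *coordinator) upgrade(checks ...func() error) (err error) {
	c.override = &overrideState{message: "upgrading"}

	// Clear the override on every error path so a failed attempt cannot
	// leave the coordinator looking like it is still upgrading.
	defer func() {
		if err != nil {
			c.override = nil
		}
	}()

	for _, check := range checks {
		if err = check(); err != nil {
			return fmt.Errorf("early upgrade check failed: %w", err)
		}
	}

	// On success the override stays set while the upgrade proceeds.
	return nil
}

func main() {
	c := &coordinator{}
	err := c.upgrade(func() error { return errNotUpgradeable })
	fmt.Println("error:", err)
	fmt.Println("override cleared:", c.override == nil)
}
```

Using a deferred cleanup keyed on the named return value means that a new early check added later cannot reintroduce the stuck state by forgetting to clear the override on its own error path.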
Action:
- Please add a Known Issue entry to the release notes for affected versions.
- Include the workaround (manual restart of Elastic Agent).
cc @lucabelluccini for visibility so that Support is aware
cc @ebeahan @cmacknz