[Agent] Improve error log messages when agent binary cannot be downloaded #104

Closed
Tracked by #26930 ...
joshdover opened this issue Dec 13, 2021 · 5 comments · Fixed by #308
Assignees
Labels
debugging Team:Elastic-Agent Label for the Agent team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team v8.3.0

Comments

@joshdover
Contributor

When a user initiates remote Agent upgrades from the Fleet UI, upgrades can fail because a firewall or similar prevents the agent binary from being downloaded. In this case, the error in Agent logs will look something like:

13:39:33.586 elastic_agent [elastic_agent][info] 2021-12-08T13:39:33+01:00 - message: Application: [24abba90-a8e6-430d-a889-33ec756b3574]: State changed to UPDATING: Update to version '7.16.0' started - type: 'STATE' - sub_type: 'UPDATING'
13:41:35.011 elastic_agent [elastic_agent][error] 2021-12-08T13:41:35+01:00 - message: Application: [24abba90-a8e6-430d-a889-33ec756b3574]: State changed to FAILED: failed upgrade of agent binary: 2 errors occurred:
	* package '/Library/Elastic/Agent/data/elastic-agent-fd322d/downloads/elastic-agent-7.16.0-darwin-x86_64.tar.gz.sha512' not found: open /Library/Elastic/Agent/data/elastic-agent-fd322d/downloads/elastic-agent-7.16.0-darwin-x86_64.tar.gz.sha512: no such file or directory
	* fetching package failed: context deadline exceeded (Client.Timeout or context cancellation while reading body)

 - type: 'ERROR' - sub_type: 'FAILED'
13:41:35.011 elastic_agent [elastic_agent][error] failed to dispatch actions, error: failed upgrade of agent binary: 2 errors occurred:
	* package '/Library/Elastic/Agent/data/elastic-agent-fd322d/downloads/elastic-agent-7.16.0-darwin-x86_64.tar.gz.sha512' not found: open /Library/Elastic/Agent/data/elastic-agent-fd322d/downloads/elastic-agent-7.16.0-darwin-x86_64.tar.gz.sha512: no such file or directory
	* fetching package failed: context deadline exceeded (Client.Timeout or context cancellation while reading body)

13:41:35.011 elastic_agent [elastic_agent][error] Elastic Agent status changed to: 'error'
13:46:10.847 elastic_agent [elastic_agent][info] Elastic Agent status changed to: 'online'

It's not obvious exactly what the issue is because:

  • The no such file or directory error message seems to indicate a missing file, and it is not obvious that the missing file is the result of a download issue
  • The context deadline exceeded (Client.Timeout or context cancellation while reading body) error message is a little obscure and could be improved with a more explicit message like Agent binary could not be retrieved.
@joshdover joshdover added Team:Elastic-Agent Label for the Agent team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team labels Dec 13, 2021
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@nimarezainia
Contributor

nimarezainia commented Jan 7, 2022

Documenting what was discussed earlier as a small modification to the download timeout and making it a bit more robust:

  • Default agent.download.timeout to be extended to 10 minutes
  • Every minute, log an informational alert that at a minimum identifies the agent and logs the transmission rate achieved, and x of y bytes downloaded.
  • When the timeout is reached there should again be an informational alert (if need be this can be a warning)
  • Timeout and the alert progression interval to be configurable via UI as part of [Fleet] Give the ability to set up elastic agent download timeout kibana#121069

[fyi @blakerouse @michel-laterman @jlind23 ]

@jlind23
Contributor

jlind23 commented Mar 29, 2022

@blakerouse any update on this one?

@jlind23 jlind23 added v8.3.0 and removed v8.2.0 labels Mar 29, 2022
@jlind23
Contributor

jlind23 commented Mar 29, 2022

As discussed with @blakerouse, postponing this to 8.3
cc @nimarezainia @ph

@jlind23
Contributor

jlind23 commented Apr 13, 2022

@blakerouse we discussed this yesterday with @ph and @nimarezainia. The outcome was that we shouldn't backport it, and should instead keep it for 8.3 and all later versions.
