Add retry logic to early package installs and GPG key download #46

@Oddly

The main service package installs (elasticsearch, kibana, logstash, beats) all have `retries`/`until`/`delay`, but several early-stage package installs and the GPG key download do not. These run before any Elastic packages are installed and are vulnerable to transient network issues, DNS hiccups, or package manager lock contention.

Missing retries:

  1. `roles/repos/tasks/debian.yml:10` — "Ensure Elastic Stack key is available" (`get_url` to download GPG key, no retries)
  2. `roles/repos/tasks/debian.yml:2` — "Ensure gpg exists" (`apt` install of gpg/gpg-agent, no retries)
  3. `roles/repos/tasks/redhat.yml:7` — "Ensure gpg exists" (`package` install of gnupg, no retries)
  4. `roles/repos/tasks/redhat.yml:28` — "Install crypto-policies-scripts" (`package`, no retries)
  5. `roles/elasticsearch/tasks/main.yml:126` — "Install openssl if security is activated" (`package`, no retries)
  6. `roles/elasticstack/tasks/packages.yml:17` — "Install packages for security tasks" (`package` install of unzip, python3-cryptography, etc., no retries)

Suggested fix:

Add the same `retries: 3` / `delay: 10` / `until: result is success` pattern already used by the main service installs, together with the `register: result` that the `until` condition depends on. The GPG key download (`get_url`) needs retry logic as well, since it is a single point of failure for the entire repo setup on Debian.
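
As a rough sketch of what this could look like for two of the Debian tasks listed above: the task names come from this issue, but the key URL, the `dest` path, the file mode, and the registered variable names are assumptions for illustration and are not copied from the repository.

```yaml
# Sketch only: url, dest, mode, and the registered variable names are
# assumptions, not the values used in roles/repos/tasks/debian.yml.
- name: Ensure gpg exists
  ansible.builtin.apt:
    name:
      - gpg
      - gpg-agent
    state: present
  register: gpg_install_result            # hypothetical variable name
  until: gpg_install_result is success    # retry until the install succeeds
  retries: 3
  delay: 10

- name: Ensure Elastic Stack key is available
  ansible.builtin.get_url:
    url: https://artifacts.elastic.co/GPG-KEY-elasticsearch  # assumed URL
    dest: /usr/share/keyrings/elasticsearch.asc              # assumed path
    mode: "0644"
  register: gpg_key_result                # hypothetical variable name
  until: gpg_key_result is success        # retry on transient download errors
  retries: 3
  delay: 10
```

The `package` tasks in the list would presumably take the same four keys (`register`, `until`, `retries`, `delay`) without any change to their module arguments.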

All `ansible.builtin.uri` tasks in the codebase already have retry logic — this gap is limited to package/get_url tasks in the early setup phase.

Labels: enhancement
