In `roles/elasticsearch/tasks/elasticsearch-security.yml` around line 697, the `until` condition reads `elasticsearch_cluster_status_bootstrap.json.status` to poll for cluster health. When the URI module returns an error (connection refused, 401, etc.) instead of valid JSON, the registered result has no `.json` attribute, so the task fails with `dict object has no attribute 'json'` instead of retrying.
This is the retry loop that is supposed to wait for ES to become healthy, so transient errors during startup are exactly what it should tolerate. Adding a `| default({})` guard on the `.json` access (or checking `is defined` first) would make the condition evaluate to false on those attempts, letting the retry loop absorb connection errors gracefully.
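A minimal sketch of what the guarded condition could look like. Only the registered variable name comes from the actual task; the task name, URL, accepted status values, and the `retries`/`delay` numbers are placeholders chosen for illustration, and authentication/TLS options are omitted.

```yaml
# Hypothetical task: only the register name matches the real role.
- name: Wait for cluster health (illustrative)
  ansible.builtin.uri:
    url: "https://localhost:9200/_cluster/health"  # assumed endpoint
    return_content: true
  register: elasticsearch_cluster_status_bootstrap
  # default({}) covers results with no .json key (connection refused, 401, ...);
  # default('') covers a JSON body with no status key.
  # The accepted status values are an assumption, not taken from the role.
  until: "(elasticsearch_cluster_status_bootstrap.json | default({})).status | default('') in ['green', 'yellow']"
  retries: 30  # assumed
  delay: 10    # assumed
```

With the guard in place, an attempt that returns no JSON simply makes the condition false, so Ansible moves on to the next retry; the task still fails once `retries` is exhausted. The `is defined` variant would instead test `elasticsearch_cluster_status_bootstrap.json is defined` before reading `.status`.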
Found during integration testing of #39 (scenario C1: full-stack, ES 9).