The entire Elastic Fleet went Unhealthy after changing Fleet settings. Any ideas how I can make it work again? #15100

Lastautumnleaf · 2025-10-02T10:21:58Z

Version
2.4.180

Installation Method
Security Onion ISO image

Description
configuration

Installation Type
Standalone

Location
on-prem with Internet access

Hardware Specs
Exceeds minimum requirements

CPU
32

RAM
34

Storage for /
480G

Storage for /nsm
6000G

Network Traffic Collection
other (please provide detail below)

Network Traffic Speeds
1Gbps to 10Gbps

Status
Yes, all services on all nodes are running OK

Salt Status
No, there are no failures

Logs
Running elastic-agent status reports:
┌─ fleet
│  └─ status: (HEALTHY) Connected
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a failed state
   ├─ audit/file_integrity-so-manager_logstash
   │  ├─ status: (HEALTHY) Healthy: communicating with pid '2007'
   │  ├─ audit/file_integrity-so-manager_logstash
   │  │  └─ status: (FAILED) could not start output: failed to reload output: could not setup output certificates reloader: unpacking 'ssl' config: key file not configured accessing 'logstash.ssl'

I made a terrible mistake: I changed the hosts under Fleet server hosts -> grid-default and under Outputs -> grid-logstash and so-manager_elasticsearch. Now my whole Elastic Fleet is in an Unhealthy state: FleetServer-sec-on, sec-on (so-grid-nodes_general), and 18 Elastic Agents (endpoints initial). Any ideas how I can fix this? I reverted all the settings, but it didn't help.
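With 18 agents and many units per agent, the status tree gets long. A minimal sketch for pulling out only the unhealthy lines, assuming the status text uses the (FAILED) and (DEGRADED) markers seen in the output above; the function name failed_units is hypothetical and not part of elastic-agent:

```shell
#!/bin/sh
# Hypothetical helper: read status text on stdin and keep only the
# lines that carry a (FAILED) or (DEGRADED) marker.
failed_units() {
    grep -E '\((FAILED|DEGRADED)\)'
}
```

It could be used as: sudo elastic-agent status | failed_units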

Answered by reyesj2

reyesj2 · 2025-10-02T14:58:55Z

reyesj2
Oct 2, 2025
Maintainer

Went ahead and created an issue #15101

To fix this, you should be able to take the certificate key from your standalone and paste it back into the Fleet output policy. From your manager, run sudo cat /etc/pki/elasticfleet-logstash.key, then copy the entire output. It should look something like:

-----BEGIN PRIVATE KEY-----
....
....
-----END PRIVATE KEY-----

Go back into the Logstash Fleet output policy and paste that into the 'Client SSL certificate key' section. Save the policy and check that your agents are coming back into a healthy state.
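Before pasting, it may help to confirm that the key file is a complete PEM block, since a truncated copy would presumably leave the output broken. A minimal sketch, assuming the key is PEM-encoded as in the sample above; the function name check_pem_key is hypothetical and not part of Security Onion:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file's first non-blank line is a
# PEM "BEGIN ... PRIVATE KEY" header and its last non-blank line is the
# matching "END ... PRIVATE KEY" footer. It does not validate the key itself.
check_pem_key() {
    key_file="$1"
    [ -r "$key_file" ] || return 1
    first_line=$(grep -v '^[[:space:]]*$' "$key_file" | head -n 1)
    last_line=$(grep -v '^[[:space:]]*$' "$key_file" | tail -n 1)
    case "$first_line" in
        "-----BEGIN "*"PRIVATE KEY-----") ;;
        *) return 1 ;;
    esac
    case "$last_line" in
        "-----END "*"PRIVATE KEY-----") ;;
        *) return 1 ;;
    esac
    return 0
}
```

Run as a user who can read the file, for example: check_pem_key /etc/pki/elasticfleet-logstash.key && echo "key block looks complete"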

If this happens again, you can repeat the cert steps and disable automatic updates to the Fleet output policy:

Within SOC -> Administration -> config (hit options and toggle on advanced settings), filter for
elasticfleet.config.server.enable_auto_configuration

Set that to false and save.

Where did the need to modify the policy hosts directly come from? On a standalone you only have one host to receive the logs (the standalone itself). If you wanted to add an FQDN for the standalone, the proper way would be https://docs.securityonion.net/en/2.4/elastic-fleet.html#custom-fqdn-url because the Fleet certs need to be updated. They update automatically in the background for you; just make sure the endpoints' DNS can resolve the custom FQDN.
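The DNS requirement can be checked from a Linux endpoint before changing anything in Fleet. A minimal sketch, assuming getent is available on the endpoint (it is not on Windows); the function name check_fqdns and the hostname in the usage line are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print OK or FAIL for each hostname depending on
# whether this machine's resolver can look it up. Returns non-zero if
# any name failed to resolve.
check_fqdns() {
    status=0
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            status=1
        fi
    done
    return "$status"
}
```

For example: check_fqdns so-standalone.example.internal, with your own custom FQDN substituted for the placeholder name.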

1 reply

Lastautumnleaf Oct 3, 2025
Author

Hello, reyesj2.
Big thanks for your reply and explanation!

To answer your question: I had to change one of the FQDNs in order to fix an issue where my Elastic Agents weren’t sending system logs due to a “DNS lookup failure.” I added the Fleet server name to the local DNS and updated it in the Fleet settings. That solved the issue for a couple of minutes, until everything collapsed with this SSL cert error.
