[9.4](backport #50674) [Metricbeat] Add elasticsearch/security_stats metricset #50883
Merged
Adds a per-node security_stats metricset that scrapes the new GET /_security/stats endpoint introduced in Elasticsearch 9.2. The first metric exposed is the Document Level Security cache (entries, memory, hits, misses, evictions, hit/miss latency), giving Stack Monitoring fleet-wide visibility into DLS cache health for spotting cache thrash, oversized working sets, and unhealthy hit/miss ratios.

Each event is enriched with node name, roles, and stack version via a single filter-path-scoped /_nodes call per scrape, shared across all per-node events emitted in that scrape. This logic lives on the module's MetricSet as the new NodeEnrichment helper so future per-node metricsets can reuse it. node.version is also declared at the module level alongside id, name, roles, master, and mlockall.

The shared metricbeat/docker-compose.yml elasticsearch service now runs with xpack.security.enabled=true plus an anonymous superuser, since /_security/stats is only registered when security is enabled. Anonymous superuser keeps the rest of the elasticsearch integration test suite working without threading credentials through every metricset's setup.

* docs: register security_stats metricset page in toc.yml

  mage update regenerates per-metricset markdown but doesn't touch the navigation toc.yml. Add the missing entry so docs-build can locate the security_stats page in the Elasticsearch module section.

* docs: replace "e.g." with "for example" per Vale style guide

  Elastic.Latinisms forbids Latin abbreviations in docs. Replace the lone "e.g." in the new node.version field description and regenerate the affected files.

* metricbeat/elasticsearch: clean up pre-existing lint issues

  Two pre-existing lint findings in elasticsearch_integration_test.go became blocking once this branch touched the file (golangci-lint runs with --whole-files). Both fixes are mechanical:

  - Replace math/rand with math/rand/v2 in randString and drop the redundant per-call seeded local Rand.
  - Add the comma-ok form to the version.number type assertion in getElasticsearchVersion so errcheck (with check-type-assertions) is satisfied.

* metricbeat/elasticsearch: dedupe node.version field declaration

  The new module-level node.version added for security_stats collided with a pre-existing node.version in the node metricset's local fields.yml, breaking `metricbeat export index-pattern` with "field <elasticsearch.node.version> is duplicated". Drop the metricset-local declaration in favor of the shared module-level one, which carries a richer description and is the right scope for a field emitted by multiple per-node metricsets.

* metricbeat: provision file-realm users for secured ES test stack

  Enabling xpack.security on the shared elasticsearch service for security_stats coverage broke Kibana boot: Kibana 9.x's interactive setup plugin holds preboot when ES has security on without ELASTICSEARCH_USERNAME, and the existing Kibana healthchecks (curl -u beats:testing, curl -u myelastic:changeme) started actually validating against ES instead of being silently ignored. Provision the named users that the existing healthchecks expect via elasticsearch-users useradd in the startup command, and give Kibana real ES credentials. Anonymous=superuser is preserved so the integration tests' credential-less HTTP probes keep working without threading credentials through every metricset's setup.

* x-pack/metricbeat: give kibana credentials to secured ES

  The previous commit enabled xpack.security on the shared Elasticsearch service and gave the OSS metricbeat kibana service real credentials, but x-pack/metricbeat hand-copies its kibana stanza (depends_on can't be extended) so the env didn't propagate. With no ELASTICSEARCH_USERNAME, Kibana entered interactive setup, the Dockerfile healthcheck (curl -u myelastic:changeme /api/stats) never reached green, and the proxy_dep busybox blocked all integration tests from starting. Mirror the env vars into the x-pack kibana stanza and note the duplication contract in a comment so future secured-ES changes are applied in both places.

* metricbeat/elasticsearch: gate security_stats on xpack feature flag

  CI exposed that the previous PR commits enabled xpack.security on the shared metricbeat docker-compose stack to exercise /_security/stats, but that change rippled wider than fits in this PR: Kibana boot, healthcheck users, OTel test framework default credentials, and the Python `get_version` helper all assume an open ES. Revert both metricbeat and x-pack/metricbeat docker-compose.yml to their upstream/main shape and address the underlying problem in the metricset itself.

  `security_stats.checkAvailability` now mirrors the pattern used by ccr and ml_job: a free in-memory version comparison short-circuits old clusters first, then a proactive `GET /_xpack` probe checks `features.security.enabled` so we can emit a specific operator-facing log message and avoid hitting an endpoint we know would return 400. A new `Security` field is added to the shared `elasticsearch.XPack` struct to support the check.

  The elasticsearch_integration_test.go suite skips security_stats unconditionally for now, with a TODO pointing at a focused follow-up PR that migrates the metricbeat compose stack to an x-pack-security-enabled posture (file-realm users, Kibana credentials, test fixture auth). At that point the skip becomes vacuous and the metricset is exercised against a real /_security/stats response.

---------

Co-authored-by: Visha Angelova <91186315+vishaangelova@users.noreply.github.com>
(cherry picked from commit 7119b64)
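The per-scrape enrichment described in the commit message can be illustrated with a minimal Go sketch. It is not the PR's actual `NodeEnrichment` helper: the names `nodeInfo`, `parseNodes`, `enrichEvents`, and the `nodesPath` constant are hypothetical, and the event shape is simplified to a plain map keyed by node ID. It assumes only the standard `/_nodes` response layout (`nodes.<id>.name`, `.version`, `.roles`) and shows one lookup table shared across all events, with lookup failures aggregated through `errors.Join`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// nodesPath is a hypothetical filter_path-scoped request that keeps only
// the three fields needed for enrichment.
const nodesPath = "/_nodes?filter_path=nodes.*.name,nodes.*.version,nodes.*.roles"

// nodeInfo holds the fields retained from the /_nodes response.
type nodeInfo struct {
	Name    string   `json:"name"`
	Version string   `json:"version"`
	Roles   []string `json:"roles"`
}

// parseNodes decodes a /_nodes response body into a map keyed by node ID.
func parseNodes(body []byte) (map[string]nodeInfo, error) {
	var resp struct {
		Nodes map[string]nodeInfo `json:"nodes"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decoding /_nodes response: %w", err)
	}
	return resp.Nodes, nil
}

// enrichEvents adds node metadata to every per-node event using a single
// shared lookup table. Nodes missing from the table are left unenriched and
// reported together through errors.Join, which returns nil when no lookup
// failed.
func enrichEvents(events map[string]map[string]any, nodes map[string]nodeInfo) error {
	var errs []error
	for id, event := range events {
		info, ok := nodes[id]
		if !ok {
			errs = append(errs, fmt.Errorf("no /_nodes entry for node %q", id))
			continue
		}
		event["node"] = map[string]any{
			"id":      id,
			"name":    info.Name,
			"version": info.Version,
			"roles":   info.Roles,
		}
	}
	return errors.Join(errs...)
}
```

Returning the joined error instead of logging each miss lets the caller report the failures once per scrape, which is the behaviour the description attributes to `node_stats`.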
Vale Linting Results

Summary: 1 warning, 2 suggestions found
| File | Line | Rule | Message |
|---|---|---|---|
| docs/reference/metricbeat/metricbeat-metricset-elasticsearch-security_stats.md | 2 | Elastic.MappedPages | mapped_pages should only be added or updated in rare scenarios. Talk with your local technical writer before pushing changes to this key. |
💡 Suggestions (2)
| File | Line | Rule | Message |
|---|---|---|---|
| docs/reference/metricbeat/metricbeat-metricset-elasticsearch-security_stats.md | 13 | Elastic.WordChoice | Consider using 'select, press, visits' instead of 'hit', unless the term is in the UI. |
| docs/reference/metricbeat/metricbeat-metricset-elasticsearch-security_stats.md | 17 | Elastic.WordChoice | Consider using 'deactivated, deselected, hidden, turned off, unavailable' instead of 'disabled', unless the term is in the UI. |
The Vale linter checks documentation changes against the Elastic Docs style guide.
To use Vale locally or report issues, refer to Elastic style guide for Vale.
pickypg approved these changes on May 22, 2026
Type of change: Enhancement
Proposed commit message
[Metricbeat] Add elasticsearch/security_stats metricset
WHAT:

- Adds a `security_stats` metricset to the Elasticsearch module that scrapes the per-node `GET /_security/stats` endpoint introduced in Elasticsearch 9.2.
- The first metric exposed is the Document Level Security (DLS) cache: entries, memory, hits, misses, evictions, hit/miss latency.
- Enriches each event with `node.{name,roles,version}` via a single filter-path-scoped `/_nodes` call per scrape, exposed as a reusable `NodeEnrichment` helper on the module's MetricSet so future per-node metricsets can adopt it.
- Adds the metricset to `xpackEnabledMetricSets` so monitoring deployments using `xpack.enabled: true` route events to `.monitoring-*` indices.
- Gates the metricset on `xpack.Features.Security.Enabled` (proactive `GET /_xpack` probe), mirroring the conditional-availability pattern in `ccr` and `ml_job`. Aggregates `/_nodes` enrichment failures via `errors.Join` rather than logging at Debug level, so they surface in self-monitoring (matches `node_stats`).

WHY:
- Visibility into DLS cache health (cache thrash, oversized working sets, unhealthy hit/miss ratios), which is currently invisible across the fleet.
- [...] endpoint may expose in future ES releases (additive sibling paths under `roles.*`).

Checklist
- [...] `./changelog/fragments` using the changelog tool.

Disruptive User Impact
None for end users — the metricset is opt-in via configuration.
Shared test infrastructure (`metricbeat/docker-compose.yml` and `x-pack/metricbeat/docker-compose.yml`) is unchanged from `main`: the `elasticsearch` test service still runs with `xpack.security.enabled=false`, so `/_security/stats` is not registered in CI and `TestFetch/security_stats` is skipped unconditionally with a TODO pointing at a follow-up PR that migrates the metricbeat compose stack to an x-pack-security-enabled posture (file-realm users, Kibana credentials, test fixture auth). At that point the skip becomes vacuous and the metricset is exercised against a real `/_security/stats` response.

In production, the metricset checks `GET /_xpack` on each scrape (the same proactive feature-availability pattern used by `ccr` and `ml_job`) and short-circuits with a throttled debug log when `features.security.enabled` is false, so clusters running without security get no events and no error spam.
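The two-stage availability check described above (version comparison first, then the `GET /_xpack` probe) might look roughly like the following Go sketch. It is illustrative only, not the PR's `checkAvailability`: the function signatures, the `xpackInfo` struct, and the injected `fetchXPack` callback are assumptions made to keep the example self-contained. The `features.security.enabled` path is the one named in the description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// xpackInfo models only the part of a GET /_xpack response used here.
type xpackInfo struct {
	Features struct {
		Security struct {
			Enabled bool `json:"enabled"`
		} `json:"security"`
	} `json:"features"`
}

// atLeast reports whether a "major.minor[.patch...]" version string is at
// least major.minor.
func atLeast(version string, major, minor int) (bool, error) {
	parts := strings.SplitN(version, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("malformed version %q", version)
	}
	gotMajor, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("malformed major version in %q: %w", version, err)
	}
	gotMinor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("malformed minor version in %q: %w", version, err)
	}
	return gotMajor > major || (gotMajor == major && gotMinor >= minor), nil
}

// checkAvailability returns whether the metricset should fetch, plus a
// human-readable reason when it should not. The in-memory version check
// runs first, so fetchXPack is never called for clusters older than 9.2.
func checkAvailability(version string, fetchXPack func() ([]byte, error)) (bool, string, error) {
	supported, err := atLeast(version, 9, 2)
	if err != nil {
		return false, "", err
	}
	if !supported {
		return false, "Elasticsearch " + version + " predates /_security/stats (9.2)", nil
	}
	body, err := fetchXPack()
	if err != nil {
		return false, "", fmt.Errorf("probing /_xpack: %w", err)
	}
	var info xpackInfo
	if err := json.Unmarshal(body, &info); err != nil {
		return false, "", fmt.Errorf("decoding /_xpack response: %w", err)
	}
	if !info.Features.Security.Enabled {
		return false, "security is not enabled on this cluster", nil
	}
	return true, "", nil
}
```

In a real metricset the returned reason would feed the throttled debug log, and a `false` result would make `Fetch` return nil without emitting events.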
How to test this PR locally
From `metricbeat/`, run the unit tests. Expect `TestMapper`, `TestEmpty`, `TestSkipsNodesWithoutDLSStats`, `TestRejectsMalformedResponse`, and `TestEnrichmentMissingForNode` to pass.
Spin up ES 9.2+ via the module's docker-compose and confirm the integration suite still passes: `TestFetch/security_stats` skips under the unmodified compose stack, and all other elasticsearch metricsets pass. Expect every other `TestFetch/*` subtest to PASS, with `TestFetch/security_stats` reporting SKIP and the message `/_security/stats requires xpack.security.enabled=true on the test cluster (deferred to a follow-up compose change).`

To exercise the metricset end-to-end against a real `/_security/stats` payload, point a metricbeat instance at any local ES 9.2+ cluster that has `xpack.security.enabled: true` and configure it with `metricsets: [security_stats]`. The next iteration of this work (the compose-migration follow-up) will fold this into the standard integration suite.
Confirm version gating: point the metricset at an ES < 9.2 host and verify `Fetch` returns nil with a (throttled) debug log.

Related issues
This is an automatic backport of pull request #50674 done by [Mergify](https://mergify.com).