[Kubernetes] Add support for TSDB - all metrics datastreams except the state ones #5464
@@ -14,7 +14,6 @@
type: group
fields:
- name: pod.name
dimension: true
Why do we need to remove those? I think we should also keep them.
The pod.uid is enough, as it is unique across the whole cluster. The name is only unique within a namespace, so we would have to set more fields as dimensions. Since we also have a limit on dimensions (default is 16, and 8 for all non-ecs fields), I think it is better if we only use the necessary ones. @gizas
@@ -34,7 +34,6 @@
Kubernetes namespace
- name: node.name
dimension: true
Same as above. I think we should keep this
For this case specifically, service.address (always present) and orchestrator.cluster.url (not always present) should be enough, as they are unique.
Should we include the
We can't do that because that enables TSDB by default. However, we know that TSDB doesn't work in Kibana for 8.7. @gizas
Package kubernetes - 1.34.1 containing this change is available at https://epr.elastic.co/search?package=kubernetes
What does this PR do?
Sets the dimension fields for each metric datastream, except the state_* ones (for now), so they can migrate to TSDB.
Why is this important?
The dimension fields as currently set are not sufficient to migrate to TSDB. We want this migration so we can create "a space efficient, easy to use, and faster solution for storing metrics data generated by Elastic’s solutions and 3rd party data providers." (More details on the goal of this change can be found here.)
Details on this PR
Important considerations:
- Each document has a _tsid that distinguishes it from the other documents. This ID is generated based on the timestamp and all dimension fields of the document.
- There is a limit on the number of dimensions (default is 16, and 8 for all non-ecs fields). It is possible to change this default (check here).
- index.routing_path: every document is only "accepted" in a TSDB index if there is at least one dimension field present.
Main changes:
- service.address and orchestrator.cluster.url set to dimensions.
- pod.uid, as it is unique across the whole cluster.
- pod.uid (same reason as for Pod).
- kubernetes.system.container, since it is a metric label and documents may be split on it.
- kubernetes.volume.name (metric label), kubernetes.pod.name and kubernetes.namespace, as the pod name is unique per namespace.
Important warning: TSDB is not enabled by default (see Screenshots below for how to enable it). Visualizations with counter metric fields are unavailable as of the creation of this PR; the problem is described here.
Example
The _tsid for a TSDB Kubernetes system datastream is obtained through the combination of the timestamp and the dimension fields kubernetes.system.container, service.address and orchestrator.cluster.url. In an example consisting of an Elastic Agent in a two-node cluster, we could have the following documents:
In these documents, every field has the same value except for kubernetes.system.container. If this field were not set as a dimension, the _tsid generated would be the same for the two documents and a conflict would happen.
For a more hands-on example, check the section "TSDB very simple example" in #4618.
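The conflict described in this example can be sketched in Python. This is a simplified model, assuming the identity is just the timestamp combined with the dimension values; the real _tsid encoding in Elasticsearch is different. The helper name and all field values below are hypothetical.

```python
# Simplified model of the conflict described in the example.
# The identity here is a plain tuple, not Elasticsearch's real _tsid encoding.


def series_identity(document, dimensions):
    """Build an identity from the timestamp plus the dimension field values."""
    return (document["@timestamp"],) + tuple(document.get(f) for f in dimensions)


# Two hypothetical documents that differ only in kubernetes.system.container.
base = {
    "@timestamp": "2023-03-01T10:00:00Z",
    "service.address": "https://node-1:10250/stats/summary",
    "orchestrator.cluster.url": "https://cluster.example:6443",
}
doc_kubelet = {**base, "kubernetes.system.container": "kubelet"}
doc_runtime = {**base, "kubernetes.system.container": "runtime"}

without_container = ["service.address", "orchestrator.cluster.url"]
with_container = ["kubernetes.system.container"] + without_container

# Same identity: the second document would conflict with the first.
conflict = series_identity(doc_kubelet, without_container) == series_identity(
    doc_runtime, without_container
)
# Distinct identities once the container field is a dimension.
still_conflict = series_identity(doc_kubelet, with_container) == series_identity(
    doc_runtime, with_container
)
print(conflict, still_conflict)
```

Under these assumptions, the two documents share an identity when kubernetes.system.container is left out of the dimensions and get distinct identities once it is included.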
Checklist
changelog.yml file.
How to test this PR locally
elastic-package build inside Kubernetes package.
Related issues
Screenshots
General view of the documents of all these datastreams on Discover:
From this screenshot, there was no change in the number of documents before and after enabling TSDB.
To enable TSDB, go to the agent policy, open the Kubernetes integration, and enable TSDB through the toggle available when clicking "Advanced options" under a datastream: