[Kubernetes] Add support for TSDB - all metrics datastreams except the state ones #5464

Merged
merged 5 commits into from
Mar 22, 2023
constanca-m
Contributor

@constanca-m constanca-m commented Mar 7, 2023

What does this PR do?

Sets the dimension fields for each metric datastream, except the state_* ones (for now), so they can migrate to TSDB.

Why is this important?

The fields as they are currently set are not sufficient to migrate to TSDB. We want this migration so we can create "a space efficient, easy to use, and faster solution for storing metrics data generated by Elastic’s solutions and 3rd party data providers." (more details on the goal of this change can be found here).

Details on this PR

Important considerations:

  1. Each document in a TSDB index belongs to a time series identified by its _tsid, which is generated from all dimension fields of the document. The _tsid together with the timestamp distinguishes a document from the others.
  2. Each index has a default limit of 16 dimension fields, but 8 of these are reserved for ECS fields. It is possible to change this default (check here).
  3. Dimension values cannot exceed 1024 bytes.
  4. Not all field types qualify as dimensions (check which ones here).
  5. Each dimension field is part of the index.routing_path. A document is only accepted into a TSDB index if at least one dimension field is present.
  6. If a metric is split by labels, then the labels should be set as dimensions. Otherwise, only one of two (or more) documents that carry the same metric for different labels will be stored.
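
For consideration 2, a minimal sketch of how the dimension limit could be raised for a single data stream. It assumes the override goes in that data stream's manifest.yml under the index template settings. The value 32 is a hypothetical example and this PR does not change the limit.

```yaml
# Hypothetical excerpt of a data stream manifest.yml (not part of this PR).
elasticsearch:
  index_template:
    settings:
      # Elasticsearch index setting that caps the number of dimension fields.
      # 32 is an illustrative value, not a recommendation.
      index.mapping.dimension_fields.limit: 32
```

Keeping the dimension set small, as this PR does, avoids the need for such an override.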

Main changes:

  1. Every datastream has the keyword fields service.address and orchestrator.cluster.url set as dimensions.
  2. In addition to those:
    1. Pod: pod.uid, as it is unique across the whole cluster.
    2. Node: no additional dimensions are necessary.
    3. Container: pod.uid (same reason as for Pod).
    4. System: kubernetes.system.container, since it is a metric label and documents may be split on it.
    5. Volume: kubernetes.volume.name (metric label), kubernetes.pod.name and kubernetes.namespace, as the pod name is unique per namespace.
    6. Controller manager, API server, proxy and scheduler: all labels are set as dimensions.
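
As an illustration of the volume case above, a field definition sketch with the dimensions marked. The layout is flattened for brevity and is an assumption: the actual files in the package nest these fields under groups and include descriptions.

```yaml
# Illustrative, flattened sketch of the volume datastream dimensions.
# The real fields files use nested groups; only `dimension: true` matters here.
- name: service.address
  external: ecs
  dimension: true
- name: orchestrator.cluster.url
  external: ecs
  dimension: true
- name: kubernetes.volume.name   # metric label
  type: keyword
  dimension: true
- name: kubernetes.pod.name      # unique only within a namespace
  type: keyword
  dimension: true
- name: kubernetes.namespace     # disambiguates the pod name
  type: keyword
  dimension: true
```

The pod name and the namespace are both marked because neither is unique across the cluster on its own.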

Important warning: TSDB is not enabled by default (see Screenshots below for how to enable it). Visualizations with counter metric fields are unavailable at the time this PR was created; the problem is described here.

Example

The _tsid for a TSDB Kubernetes system datastream is derived from the dimension fields kubernetes.system.container, service.address and orchestrator.cluster.url, and it is combined with the timestamp to identify each document. In an example consisting of an Elastic Agent in a two-node cluster, we could have the following documents:
image
In these documents, every field has the same value except for kubernetes.system.container. If this field were not set as a dimension, then the _tsid generated would be the same for the two documents and a conflict would happen.

For a more hands on example, check the section TSDB very simple example in #4618.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

  1. Clone this repository.
  2. Run elastic-package build inside the Kubernetes package.
  3. Deploy the Elastic stack.
  4. Check the number of documents in each index before and after enabling TSDB (see Screenshots for how to enable it).

Related issues

Screenshots

General view of the documents of all these datastreams on Discover:

Screenshot from 2023-03-07 15-03-33

The screenshot shows that the number of documents did not change after enabling TSDB.

To enable TSDB, go to the agent policy, open the Kubernetes integration, and turn on the TSDB toggle that appears after clicking "Advanced options" under a datastream:
image

@constanca-m constanca-m added the enhancement (New feature or request) and Team:Cloudnative-Monitoring (Label for the Cloud Native Monitoring team) labels Mar 7, 2023
@constanca-m constanca-m requested a review from a team March 7, 2023 14:41
@constanca-m constanca-m self-assigned this Mar 7, 2023
@constanca-m constanca-m requested review from gsantoro and devamanv and removed request for a team March 7, 2023 14:41
@constanca-m constanca-m requested a review from a team as a code owner March 7, 2023 14:43
@elasticmachine

elasticmachine commented Mar 7, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-03-22T14:14:23.883+0000

  • Duration: 29 min 15 sec

Test stats 🧪

Test Results
Failed 0
Passed 92
Skipped 0
Total 92

@elasticmachine

elasticmachine commented Mar 7, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (0/0) 💚
Files 100.0% (0/0) 💚
Classes 100.0% (0/0) 💚
Methods 96.154% (75/78) 👍
Lines 100.0% (0/0) 💚
Conditionals 100.0% (0/0) 💚

@@ -14,7 +14,6 @@
type: group
fields:
- name: pod.name
dimension: true
Contributor

Why do we need to remove those? I think we should also keep them.

Contributor Author

@constanca-m constanca-m Mar 21, 2023

The pod.uid is enough, as it is unique across the whole cluster. The name is only unique within a namespace, so we would have to set more fields as dimensions. Since there is also a limit on dimensions (16 by default, of which only 8 are available for non-ECS fields), I think it is better to use only the necessary ones. @gizas

@@ -34,7 +34,6 @@
Kubernetes namespace

- name: node.name
dimension: true
Contributor

Same as above. I think we should keep this.

Contributor Author

For this case specifically, service.address (always present) and orchestrator.cluster.url (not always present) should be enough as they are unique.

@gizas
Contributor

gizas commented Mar 22, 2023

@constanca-m
Contributor Author

constanca-m commented Mar 22, 2023

We should include the index_mode: "time_series" as well:

https://github.com/elastic/integrations/pull/4966/files#diff-8530c1396ab7551f52b1442ac68871d2cc966130c6ab2d6b6d02dea24b4f0a7bR26

We can't do that because that enables TSDB by default. However, we know that TSDB doesn't work in Kibana for 8.7. @gizas
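
For reference, a sketch of the setting under discussion, assuming it is placed in a data stream's manifest.yml as in the linked PR. It is intentionally not part of this PR, since adding it would turn TSDB on by default.

```yaml
# Hypothetical excerpt of a data stream manifest.yml (NOT added by this PR).
elasticsearch:
  # Makes the backing indices time series indices, i.e. enables TSDB by default.
  index_mode: "time_series"
```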

@constanca-m constanca-m merged commit cfe89dd into elastic:main Mar 22, 2023
@constanca-m constanca-m deleted the check-tsdb-fields branch March 22, 2023 15:09
@elasticmachine

Package kubernetes - 1.34.1 containing this change is available at https://epr.elastic.co/search?package=kubernetes

Labels
enhancement New feature or request Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
3 participants