Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kubernetes] Add support for TSDB - set dimension fields for state data streams #5621

Merged
merged 5 commits into from
Mar 22, 2023
Merged

[Kubernetes] Add support for TSDB - set dimension fields for state data streams #5621

merged 5 commits into from
Mar 22, 2023

Conversation

constanca-m
Copy link
Contributor

@constanca-m constanca-m commented Mar 21, 2023

What does this PR do?

Set the dimension fields for each state metric datastream. This is a follow up to the PR #5464.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Changes

The general reasons for this change can be found in #5464, as well as more information such as testing or ecs fields set as dimension.

Specifically, for each data stream:

  • State container: kubernetes.pod.uid is a dimension, since it is unique across the whole cluster, as well as kubernetes.container.id. Every metric label is set as dimension (phase and reason) to avoid overlap on documents.
  • State cronjob: kubernetes.cronjob.name is set as dimension, along with kubernetes.namespace_uid as the name is unique per namespace.
  • State daemonset: for the same reason, kubernetes.daemonset.name andkubernetes.namespace_uid are dimensions.
  • State deployment: kubernetes.deployment.name and kubernetes.namespace_uid are dimensions.
  • State job: kubernetes.job.name and kubernetes.namespace_uid are dimensions. Note: other keyword fields inside fields.yaml are not set as dimension, because the job.name is enough to uniquely identify the document.
  • State node: kubernetes.node.name set as dimension, since it is always present and is unique across a cluster. Note: none of the keywords inside fields.yml is set as dimension. This is because they were obtained through the method label metric that only stores values that are set to 1, so no overlap will happen.
  • State persistent volume: kubernetes.persistentvolume.name set as dimension
  • State persistent volume claim: kubernetes.persistentvolumeclaim.name set as dimension and kubernetes.namespace_uid as well.
  • State pod: kubernetes.pod.uid set as dimension since it is unique. Note: none of the keywords inside fields.yml is set as dimension, for the same reason as stated for state node.
  • State replica set: kubernetes.namespace_uid and kubernetes.replicaset.name are dimensions.
  • State resource quota: All label metrics and kubernetes.namespace (for the unique combination name + namespace) are set as dimension.
  • State service: kubernetes.namespace_uid and kubernetes.service.name are dimensions.
  • State statefulset: kubernetes.namespace_uid and kubernetes.statefulset.name are dimensions.
  • State storageclass: kubernetes.storageclass.name is set as dimension.

Screenshots

Before and after enabling TSDB, there was no change in the number of docs:
image

@elasticmachine
Copy link

elasticmachine commented Mar 21, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-03-22T13:41:36.008+0000

  • Duration: 29 min 47 sec

Test stats 🧪

Test Results
Failed 0
Passed 92
Skipped 0
Total 92

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented Mar 21, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (0/0) 💚
Files 100.0% (0/0) 💚
Classes 100.0% (0/0) 💚
Methods 96.154% (75/78) 👍
Lines 100.0% (0/0) 💚
Conditionals 100.0% (0/0) 💚

@MichaelKatsoulis
Copy link
Contributor

State container: kubernetes.pod.uid is a dimension, since it is unique across the whole cluster. Every metric label is set as dimension (phase and reason) to avoid overlap on documents.

@constanca-m why not the container.id? Isn't it unique across the cluster?

- description: Add support for TSDB on all state metric data streams
type: enhancement
link: https://github.com/elastic/integrations/pull/5621
- version: "1.33.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, you should go with the 1.33.0 version. If the other commit gets merged first you will rebase/merge on this one and increment the version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, I chose it assuming the other one was being merged first. But I will adjust this if it is not case.

@constanca-m
Copy link
Contributor Author

@constanca-m why not the container.id? Isn't it unique across the cluster?

We just needed one set as dimension. I think they are both good and always available, there was no special reason to choose one over another. @MichaelKatsoulis

@MichaelKatsoulis
Copy link
Contributor

@constanca-m why not the container.id? Isn't it unique across the cluster?

We just needed one set as dimension. I think they are both good and always available, there was no special reason to choose one over another. @MichaelKatsoulis

I asked mainly because container.id had already dimension:true and you removed it. If we go with pod.uid then what happens if a pod has multiple containers. They will all have same pod.uid but different container.id

@constanca-m
Copy link
Contributor Author

I asked mainly because container.id had already dimension:true and you removed it. If we go with pod.uid then what happens if a pod has multiple containers. They will all have same pod.uid but different container.id

You're right, I hadn't remember that. I checked, and kubernetes.container.id is set as dimension in fields.yaml for state container. I haven't removed it. I will update the description to include that. @MichaelKatsoulis

@constanca-m constanca-m merged commit 9af5d34 into elastic:main Mar 22, 2023
@constanca-m constanca-m deleted the update-tsdb-states branch March 22, 2023 14:12
@elasticmachine
Copy link

Package kubernetes - 1.34.0 containing this change is available at https://epr.elastic.co/search?package=kubernetes

@constanca-m constanca-m self-assigned this May 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants