Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MON-1708: Enforce label scrape limits in UWM #1350

Merged

Conversation

slashpai
Copy link
Member

@slashpai slashpai commented Aug 30, 2021

MON-1708 -> Enforce label scrape limits. This feature is available in prometheus-operator 0.50 only

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 30, 2021
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 30, 2021
@slashpai
Copy link
Member Author

cc: @simonpasquier @dgrisonnet

@slashpai slashpai changed the title Enforce label scrape limits MON-1708: Enforce label scrape limits Aug 30, 2021
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 31, 2021
jsonnet/jsonnetfile.json Outdated Show resolved Hide resolved
pkg/manifests/config.go Outdated Show resolved Hide resolved
test/e2e/user_workload_monitoring_test.go Outdated Show resolved Hide resolved
@slashpai
Copy link
Member Author

@dgrisonnet @simonpasquier I have created #1374 to update dependencies. Once that is merged I will remove the conflicting commits from this PR

@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 13, 2021
@slashpai
Copy link
Member Author

@dgrisonnet I updated the PR after syncing dependencies in another PR. Also added assertion

@slashpai slashpai changed the title MON-1708: Enforce label scrape limits MON-1708: Enforce label scrape limits in UWM Sep 13, 2021
@slashpai
Copy link
Member Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2021
Copy link
Member

@dgrisonnet dgrisonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me so far, but let's also look into setting sane defaults in this PR.

test/e2e/config_test.go Outdated Show resolved Hide resolved
@slashpai
Copy link
Member Author

/retest-required

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 29, 2021
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 22, 2021
@slashpai
Copy link
Member Author

This PR was on hold till we get the changes to downstream which removed a label with a large label value in kubelet metric. We thought of initially enforcing limits in platform monitoring first and then proceed with uwm but starting with UWM looks to be good since we will already the metrics we are using incase of platform monitoring and we can enforce values in there after testing a little bit more since scrape can fail if we set a lower value for label limits

@slashpai
Copy link
Member Author

cc: @simonpasquier

@slashpai slashpai force-pushed the enforce_label_scrape_limits branch 2 times, most recently from 1588397 to ecf0e02 Compare December 22, 2021 07:20
@slashpai
Copy link
Member Author

alert is already here

- alert: PrometheusLabelLimitHit
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has dropped
{{ printf "%.0f" $value }} targets because some samples exceeded the configured
label_limit, label_name_length_limit or label_value_length_limit.
summary: Prometheus has dropped targets because some scrape configs have exceeded
the labels limit.
expr: |
increase(prometheus_target_scrape_pool_exceeded_label_limits_total{job=~"prometheus-k8s|prometheus-user-workload"}[5m]) > 0
for: 15m
labels:
severity: warning

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 18, 2022
@slashpai slashpai force-pushed the enforce_label_scrape_limits branch from d2e007c to 7b2815e Compare March 18, 2022 10:06
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2022
@arajkumar
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Mar 18, 2022
@arajkumar
Copy link
Contributor

/lgtm

@slashpai
Copy link
Member Author

/skip

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@slashpai
Copy link
Member Author

/hold till docs approved on epic

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 18, 2022
This commit adds feature to support label scrape limits in user workload monitoring by
allowing cluster admin set enforcedLabelLimit, enforcedLabelNameLengthLimit and
enforcedLabelValueLengthLimit in config map

Signed-off-by: Jayapriya Pai <janantha@redhat.com>
@slashpai slashpai force-pushed the enforce_label_scrape_limits branch from 7b2815e to e6f53b8 Compare April 3, 2022 05:39
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 3, 2022
@slashpai
Copy link
Member Author

slashpai commented Apr 3, 2022

/unhold since docs-approved on epic https://issues.redhat.com/browse/MON-2195

@jan--f /@arajkumar Can you review again, I had to rebase due to changelog conflict

@slashpai
Copy link
Member Author

slashpai commented Apr 3, 2022

/retest

2 similar comments
@slashpai
Copy link
Member Author

slashpai commented Apr 3, 2022

/retest

@slashpai
Copy link
Member Author

slashpai commented Apr 4, 2022

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 4, 2022

@slashpai: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-single-node e6f53b8 link false /test e2e-aws-single-node

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jan--f
Copy link
Contributor

jan--f commented Apr 4, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 4, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 4, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arajkumar, jan--f, raptorsun, simonpasquier, slashpai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [arajkumar,jan--f,raptorsun,simonpasquier,slashpai]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@slashpai
Copy link
Member Author

slashpai commented Apr 4, 2022

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 4, 2022
@openshift-merge-robot openshift-merge-robot merged commit a539c9d into openshift:master Apr 4, 2022
@slashpai slashpai deleted the enforce_label_scrape_limits branch October 13, 2022 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. docs-approved Signifies that Docs has signed off on this PR lgtm Indicates that a PR is ready to be merged. px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants