Fix admission metrics bucket sizes #78608

Merged
merged 1 commit into from
Jun 6, 2019

Conversation

jpbetz
Contributor

@jpbetz jpbetz commented May 31, 2019

When #72343 fixed admission metrics to use seconds instead of microseconds as the measured unit (which was totally my fault), the bucket sizes didn't get updated and so are off by 6 orders of magnitude. This fixes them.
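As a rough illustration of the mismatch described above, here is a minimal Go sketch. The old boundaries are not quoted in this thread, so `microBounds` is a hypothetical stand-in: the same ladder as the post-fix one, but expressed in microseconds. The helper name `firstBucket` is also made up for this example. With observations recorded in seconds, every realistic latency lands in the first microsecond-scale bucket, which makes the histogram useless.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Hypothetical microsecond-scale boundaries (assumed, not taken from the PR).
var microBounds = []float64{25000, 62500, 156250, 390625, 976562.5}

// The same ladder expressed in seconds.
var secondBounds = []float64{0.025, 0.0625, 0.15625, 0.390625, 0.9765625}

// firstBucket returns the index of the first upper bound >= v, or
// len(bounds) when v only fits the implicit +Inf bucket.
func firstBucket(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

func main() {
	latencies := []time.Duration{
		30 * time.Millisecond,
		200 * time.Millisecond,
		2 * time.Second,
	}
	for _, d := range latencies {
		v := d.Seconds() // the metric observes seconds
		fmt.Printf("%v: microsecond-scale bucket %d, second-scale bucket %d\n",
			d, firstBucket(microBounds, v), firstBucket(secondBounds, v))
	}
}
```

Under these assumptions, only the second-scale boundaries separate fast calls from slow ones.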

/kind bug

Fix admission metrics histogram bucket sizes to cover 25ms to ~2.5 seconds.

/sig api-machinery
/cc @logicalhan @danielqsj @brancz @sttts @liggitt @cheftako

@k8s-ci-robot k8s-ci-robot requested a review from brancz May 31, 2019 23:25
@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label May 31, 2019
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver labels May 31, 2019
@fedebongio
Contributor

/assign @logicalhan

@liggitt
Member

liggitt commented Jun 3, 2019

/priority important-soon
/milestone v1.15

we would cherry-pick this into the 1.15 release even if it merged post 1.15

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jun 3, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.15 milestone Jun 3, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jun 3, 2019
Member

@logicalhan logicalhan left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2019
@logicalhan
Member

/retest

@logicalhan
Member

/cc @brancz

What do you think about these bucket sizes?

@@ -33,7 +33,7 @@ const (

var (
// Use buckets ranging from 25 ms to ~2.5 seconds.
Member

@logicalhan logicalhan Jun 4, 2019

The current buckets generated by the expression underneath are: [0.025 0.0625 0.15625 0.390625 0.9765625] so the comment is actually still inaccurate.
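The list above can be reproduced with a short Go sketch. The parameters (start 0.025, factor 2.5, count 5) are inferred from the printed values rather than quoted from the diff, and `exponentialBuckets` is a local, dependency-free stand-in for an exponential bucket generator, not the actual code under review.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds: the first is start and
// each following bound is the previous one multiplied by factor.
func exponentialBuckets(start, factor float64, count int) []float64 {
	bounds := make([]float64, count)
	for i := range bounds {
		bounds[i] = start
		start *= factor
	}
	return bounds
}

func main() {
	// Assumed parameters, inferred from the values quoted in the review comment.
	bounds := exponentialBuckets(0.025, 2.5, 5)
	fmt.Println(bounds)

	// The largest finite bound stays just under one second, which is why
	// a "25 ms to ~2.5 seconds" comment would not match these buckets.
	fmt.Printf("largest finite bound: %.3f s\n", bounds[len(bounds)-1])
}
```

With five buckets and a factor of 2.5, the top boundary is 0.025 × 2.5⁴, just under 1 second, so reaching 2.5 seconds would need either a sixth bucket or a larger factor.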

Member

@logicalhan logicalhan Jun 4, 2019

If we want to stay within the constraints of five buckets then my suggestion is {0.025, 0.05, 0.25, 1.0, 5.0}. We should also update the comment to match the buckets.

@cheftako
Member

cheftako commented Jun 4, 2019

I realize trace threshold is configurable.
However we do currently have it configured at 100 and 500 milliseconds (and 10 seconds for watch).
If we set the start to .02 rather than .025 it would better align with the trace thresholds, which might be useful.

Not sure exponential buckets are great, as they seem to be causing some confusion. Maybe something like {0.025, 0.1, 0.25, 0.5, 1, 5}

@jpbetz
Contributor Author

jpbetz commented Jun 4, 2019

> I realize trace threshold is configurable.
> However we do currently have it configured at 100 and 500 milliseconds (and 10 seconds for watch).
> If we set the start to .02 rather than .025 it would better align with the trace thresholds, which might be useful.
>
> Not sure exponential buckets are great, as they seem to be causing some confusion. Maybe something like {0.025, 0.1, 0.25, 0.5, 1, 5}

I like this idea. Since the admission webhook default (and max allowed) timeout is 30s, maybe:

{0.025, 0.1, 0.5, 1, 5, 30}

?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2019
@cheftako
Member

cheftako commented Jun 4, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2019
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2019
@jpbetz
Contributor Author

jpbetz commented Jun 4, 2019

Going with buckets: {0.005, 0.025, 0.1, 0.5, 2.5}. This keeps our bucket count the same, but gives us a better fast bucket and aligns with the 100 and 500ms default trace thresholds.
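To make the alignment with the trace thresholds concrete, here is a small Go sketch of how observations map onto these boundaries. It assumes the usual cumulative-histogram convention that an observation belongs to the first bucket whose upper bound is greater than or equal to it. The names `admissionBuckets` and `bucketFor` are invented for this example and are not taken from the Kubernetes source.

```go
package main

import (
	"fmt"
	"sort"
)

// Boundaries in seconds, as listed in the comment above.
var admissionBuckets = []float64{0.005, 0.025, 0.1, 0.5, 2.5}

// bucketFor returns the upper bound (as a label) of the first bucket that
// contains the observation, or "+Inf" if it exceeds every finite bound.
func bucketFor(seconds float64) string {
	i := sort.SearchFloat64s(admissionBuckets, seconds)
	if i == len(admissionBuckets) {
		return "+Inf"
	}
	return fmt.Sprintf("%g", admissionBuckets[i])
}

func main() {
	for _, s := range []float64{0.003, 0.08, 0.1, 0.3, 0.5, 4} {
		fmt.Printf("%gs -> le=%s\n", s, bucketFor(s))
	}
}
```

Because 0.1 and 0.5 are themselves boundaries, the counts for those buckets answer directly how many admission calls finished within the 100 ms and 500 ms trace thresholds.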

@logicalhan
Member

/lgtm

Thanks Joe!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2019
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2019
Member

@cheftako cheftako left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2019
Contributor

@danielqsj danielqsj left a comment

/lgtm

@brancz
Member

brancz commented Jun 5, 2019

/lgtm

@jpbetz
Contributor Author

jpbetz commented Jun 5, 2019

@liggitt or @deads2k Could we get approval?

@logicalhan
Member

/retest

@jpbetz
Contributor Author

jpbetz commented Jun 5, 2019

rebased

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2019
@liggitt
Member

liggitt commented Jun 5, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2019
@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit ca12f11 into kubernetes:master Jun 6, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
9 participants