Fix admission metrics bucket sizes #78608

Merged
merged 1 commit into from Jun 6, 2019

Conversation

@jpbetz
Contributor

commented May 31, 2019

When #72343 fixed admission metrics to use seconds instead of microseconds as the measured unit (which was totally my fault), the bucket sizes didn't get updated and so are off by 6 orders of magnitude. This fixes them.
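To illustrate the mismatch described above, here is a minimal Go sketch. It assumes the stale bounds started at 25000 (25 ms expressed in microseconds) and grew by a factor of 2.5; those exact values are an assumption, since the thread only says the bounds were off by six orders of magnitude. `staleBounds` and `bucketIndex` are hypothetical names, not identifiers from the Kubernetes source.

```go
package main

import (
	"fmt"
	"sort"
)

// staleBounds are hypothetical upper bounds that were sized for
// microsecond observations (25 ms written as 25000 µs, and so on).
var staleBounds = []float64{25000, 62500, 156250, 390625, 976562.5}

// bucketIndex returns the index of the first bound >= v, or len(bounds)
// when v exceeds every bound (the implicit +Inf bucket).
func bucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

func main() {
	// Latencies are now recorded in seconds: 10 ms, 300 ms, 2 s, 30 s.
	for _, seconds := range []float64{0.010, 0.300, 2.0, 30.0} {
		fmt.Printf("%.3fs -> bucket %d\n", seconds, bucketIndex(staleBounds, seconds))
	}
}
```

Under these assumptions every observation, fast or slow, falls into the first bucket, so the histogram cannot distinguish a 10 ms admission call from a 30 s one.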

/kind bug

Fix admission metrics histogram bucket sizes to cover 25ms to ~2.5 seconds.

/sig api-machinery
/cc @logicalhan @danielqsj @brancz @sttts @liggitt @cheftako

@fedebongio

Contributor

commented Jun 3, 2019

/assign @logicalhan

@liggitt

Member

commented Jun 3, 2019

/priority important-soon
/milestone v1.15

we would cherry-pick this into the 1.15 release even if it merged post 1.15

@k8s-ci-robot k8s-ci-robot added this to the v1.15 milestone Jun 3, 2019

@jpbetz jpbetz force-pushed the jpbetz:admission-histogram-fix branch from 879f8e9 to 2f81a70 Jun 4, 2019

@logicalhan
Contributor

left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 4, 2019

@logicalhan

Contributor

commented Jun 4, 2019

/retest

@logicalhan

Contributor

commented Jun 4, 2019

/cc @brancz

What do you think about these bucket sizes?

@@ -33,7 +33,7 @@ const (

var (
// Use buckets ranging from 25 ms to ~2.5 seconds.

@logicalhan

logicalhan Jun 4, 2019

Contributor

The buckets currently generated by the expression underneath are [0.025 0.0625 0.15625 0.390625 0.9765625], so the comment is still inaccurate.
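For reference, the listed values can be reproduced with a small Go sketch. The expression itself is not shown in this thread, so the start of 0.025, factor of 2.5, and count of 5 are inferred from the values quoted above. `exponentialBuckets` is a local stand-in written to mirror the documented behavior of `prometheus.ExponentialBuckets(start, factor, count)`, so the example runs without external dependencies.

```go
package main

import "fmt"

// exponentialBuckets is a local stand-in for prometheus.ExponentialBuckets:
// it returns count upper bounds, the first equal to start and each
// subsequent one multiplied by factor.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	// Inferred parameters: start 0.025 s, factor 2.5, five buckets.
	fmt.Println(exponentialBuckets(0.025, 2.5, 5))
}
```

With five buckets the largest bound stays just under 1 second. A sixth bucket (0.9765625 × 2.5 ≈ 2.44) would be needed to reach the "~2.5 seconds" that the code comment claims.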

@logicalhan

logicalhan Jun 4, 2019

Contributor

If we want to stay within the constraints of five buckets then my suggestion is {0.025, 0.05, 0.25, 1.0, 5.0}. We should also update the comment to match the buckets.

@cheftako

Member

commented Jun 4, 2019

I realize trace threshold is configurable.
However we do currently have it configured at 100 and 500 milliseconds (and 10 seconds for watch).
If we set the start to .02 rather than .025 it would better align with the trace thresholds, which might be useful.

Not sure exponential buckets are great, as they seem to be causing some confusion. Maybe something like {0.025, 0.1, 0.25, 0.5, 1, 5}

@jpbetz

Contributor Author

commented Jun 4, 2019

> I realize trace threshold is configurable.
> However we do currently have it configured at 100 and 500 milliseconds (and 10 seconds for watch).
> If we set the start to .02 rather than .025 it would better align with the trace thresholds, which might be useful.
>
> Not sure exponential buckets are great, as they seem to be causing some confusion. Maybe something like {0.025, 0.1, 0.25, 0.5, 1, 5}

I like this idea. Since the admission webhook default (and max allowed) timeout is 30s, maybe:

{0.025, 0.1, 0.5, 1, 5, 30}

?

@jpbetz jpbetz force-pushed the jpbetz:admission-histogram-fix branch from 2f81a70 to 0e098cb Jun 4, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jun 4, 2019

@cheftako

Member

commented Jun 4, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 4, 2019

@jpbetz jpbetz force-pushed the jpbetz:admission-histogram-fix branch from 0e098cb to 8fc51b5 Jun 4, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jun 4, 2019

@jpbetz

Contributor Author

commented Jun 4, 2019

Going with buckets: {.005, .025, 0.1, 0.5, 2.5}. This keeps our bucket count the same, but gives us a better fast bucket and aligns with the 100 and 500ms default trace thresholds.
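A rough sketch of how observations would map onto these bounds, assuming the usual Prometheus convention that each bucket is labeled by its inclusive upper bound (`le`). The names `latencyBuckets` and `upperBound` are hypothetical and chosen for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// latencyBuckets holds the upper bounds, in seconds, proposed in the comment above.
var latencyBuckets = []float64{0.005, 0.025, 0.1, 0.5, 2.5}

// upperBound returns the "le" label of the smallest bucket whose bound is
// >= seconds, or "+Inf" when the observation exceeds every explicit bound.
func upperBound(bounds []float64, seconds float64) string {
	i := sort.SearchFloat64s(bounds, seconds)
	if i == len(bounds) {
		return "+Inf"
	}
	return fmt.Sprintf("%g", bounds[i])
}

func main() {
	// Sample latencies around the 100 ms and 500 ms trace thresholds.
	for _, s := range []float64{0.003, 0.1, 0.101, 0.5, 30} {
		fmt.Printf("%gs -> le=%s\n", s, upperBound(latencyBuckets, s))
	}
}
```

Because 0.1 and 0.5 are themselves bucket bounds, a request at or below a trace threshold lands in the bucket labeled with that threshold, while anything slower than 2.5 s, up to the 30 s webhook timeout mentioned earlier, is only counted in the `+Inf` bucket.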

@logicalhan

Contributor

commented Jun 4, 2019

/lgtm

Thanks Joe!

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 4, 2019

@jpbetz jpbetz force-pushed the jpbetz:admission-histogram-fix branch from 8fc51b5 to a4f0487 Jun 5, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jun 5, 2019

@cheftako
Member

left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 5, 2019

@danielqsj
Member

left a comment

/lgtm

@brancz

Member

commented Jun 5, 2019

/lgtm

@jpbetz

Contributor Author

commented Jun 5, 2019

@liggitt or @deads2k Could we get approval?

@logicalhan

Contributor

commented Jun 5, 2019

/retest

@jpbetz

Contributor Author

commented Jun 5, 2019

rebased

@jpbetz jpbetz force-pushed the jpbetz:admission-histogram-fix branch from a4f0487 to 084c525 Jun 5, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jun 5, 2019

@liggitt

Member

commented Jun 5, 2019

/lgtm
/approve

@k8s-ci-robot

Contributor

commented Jun 5, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fejta-bot

commented Jun 5, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit ca12f11 into kubernetes:master Jun 6, 2019

21 checks passed

cla/linuxfoundation jpbetz authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-100-performance Job succeeded.
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-node-e2e-containerd Job succeeded.
pull-kubernetes-typecheck Job succeeded.
pull-kubernetes-verify Job succeeded.
pull-publishing-bot-validate Skipped.
tide In merge pool.