ValidatingAdmissionPolicy: fixes to metrics #124064

Open
3 tasks
jiahuif opened this issue Mar 26, 2024 · 5 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@jiahuif
Member

jiahuif commented Mar 26, 2024

What would you like to be added?

As of 1.30, ValidatingAdmissionPolicy metrics are at alpha stability [1]. The metrics currently have the following problems:

  • there is no way to count total errors: the current implementation counts only errors that are ignored under failurePolicy=Ignore, not denials caused by errors;
  • there is no way to differentiate between kinds of errors, in particular between transient errors such as "params not ready" and persistent errors such as runtime errors;
  • the current implementation does not record the evaluation duration for acceptances or denials caused by errors, which makes any analysis of policy performance impossible.

This issue proposes the following enhancements to the metrics:

  • add an error label to check_total and check_duration_seconds
    • the value is an enum covering each possible error type, namely RUNTIME, PARAMS, and INTERNAL. The label is empty when there is no error;
  • add error handling to ObserveRejection, to record any denials caused by errors;
  • add error reporting at every place where ObserveRejection is called.
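
A rough sketch of how the proposed error label could behave, for illustration only. It uses an in-memory map as a stand-in for the real counter and histogram, not the actual metrics library. Everything beyond the names in the list above (check_total, check_duration_seconds, ObserveRejection, RUNTIME, PARAMS, INTERNAL) is hypothetical: the ErrorType type, the Recorder type, the method signatures, the "allow"/"deny" action values, and the exact label set are assumptions, as is the mapping of each enum value to a cause.

```go
package main

import (
	"fmt"
	"time"
)

// ErrorType is a hypothetical enum for the proposed "error" label.
type ErrorType string

const (
	ErrorNone     ErrorType = ""         // evaluation finished without error
	ErrorRuntime  ErrorType = "RUNTIME"  // assumed: persistent runtime errors
	ErrorParams   ErrorType = "PARAMS"   // assumed: transient, e.g. params not ready
	ErrorInternal ErrorType = "INTERNAL" // assumed: unexpected internal failures
)

// key is the label set of one series. The real metrics may carry other labels.
type key struct {
	policy, binding, action string
	errType                 ErrorType
}

// Recorder is an in-memory stand-in for check_total and
// check_duration_seconds. It is not safe for concurrent use.
type Recorder struct {
	total    map[key]int
	duration map[key]time.Duration
}

func NewRecorder() *Recorder {
	return &Recorder{total: map[key]int{}, duration: map[key]time.Duration{}}
}

// observe records the count and the elapsed time for every outcome,
// so error paths also contribute to the duration data.
func (r *Recorder) observe(policy, binding, action string, errType ErrorType, elapsed time.Duration) {
	k := key{policy, binding, action, errType}
	r.total[k]++
	r.duration[k] += elapsed
}

// ObserveAdmission records an accepted request, including one accepted
// despite an error (failurePolicy=Ignore).
func (r *Recorder) ObserveAdmission(policy, binding string, errType ErrorType, elapsed time.Duration) {
	r.observe(policy, binding, "allow", errType, elapsed)
}

// ObserveRejection records a denial, whether it comes from a failed
// validation (ErrorNone) or from an error.
func (r *Recorder) ObserveRejection(policy, binding string, errType ErrorType, elapsed time.Duration) {
	r.observe(policy, binding, "deny", errType, elapsed)
}

// CheckTotal returns the count for one exact label set.
func (r *Recorder) CheckTotal(policy, binding, action string, errType ErrorType) int {
	return r.total[key{policy, binding, action, errType}]
}

// CheckDuration returns the summed evaluation time for one exact label set.
func (r *Recorder) CheckDuration(policy, binding, action string, errType ErrorType) time.Duration {
	return r.duration[key{policy, binding, action, errType}]
}

// TotalErrors sums every series whose error label is non-empty,
// regardless of whether the request was allowed or denied.
func (r *Recorder) TotalErrors() int {
	n := 0
	for k, c := range r.total {
		if k.errType != ErrorNone {
			n += c
		}
	}
	return n
}

func main() {
	r := NewRecorder()
	r.ObserveRejection("demo-policy", "demo-binding", ErrorNone, 2*time.Millisecond)
	r.ObserveRejection("demo-policy", "demo-binding", ErrorRuntime, 5*time.Millisecond)
	r.ObserveAdmission("demo-policy", "demo-binding", ErrorParams, time.Millisecond)
	fmt.Println("total errors:", r.TotalErrors())
	fmt.Println("denials caused by runtime errors:",
		r.CheckTotal("demo-policy", "demo-binding", "deny", ErrorRuntime))
}
```

With the error type carried as a label, total errors become a sum over all series with a non-empty label, and transient PARAMS errors can be separated from persistent RUNTIME errors without adding new metrics.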

References:

  1. https://github.com/kubernetes/kubernetes/blob/v1.30.0-beta.0/staging/src/k8s.io/apiserver/pkg/admission/plugin/policy/validating/metrics/metrics.go

Why is this needed?

To resolve the issues mentioned above.

Let's discuss the beta graduation of the metrics in another issue.

@jiahuif jiahuif added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 26, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 26, 2024
@jiahuif
Member Author

jiahuif commented Mar 26, 2024

/assign
/sig api-machinery
cc @cici37

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 26, 2024
@jiahuif
Member Author

jiahuif commented Mar 26, 2024

/sig instrumentation

@k8s-ci-robot k8s-ci-robot added the sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. label Mar 26, 2024
@seans3
Contributor

seans3 commented Mar 26, 2024

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 26, 2024
@jiahuif jiahuif changed the title from "ValidatingAdmissionPolicy: metrics enhancements and graduation to beta" to "ValidatingAdmissionPolicy: fixes to metrics" Mar 26, 2024
@dgrisonnet
Member

cc @logicalhan

@jiahuif
Member Author

jiahuif commented Apr 16, 2024

cc @fedebongio
