More 1.10 docs (#852)
* rework deny docs

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* rewrite preconditions

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* add message to verifyImages section

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* notes to variables

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* add Playground menu shortcut

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* foreach notes

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* update mutate existing

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* fix misspelling

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* generate updates

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* security updates

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* refresh and extend troubleshooting

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* tweak description

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* polish autogen

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* polish cleanup

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* polish jmespath

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* add more operations in match-exclude

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* minor tweak to mutate

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* new tip

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* fix

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

* render policies

Signed-off-by: Chip Zoller <chipzoller@gmail.com>

---------

Signed-off-by: Chip Zoller <chipzoller@gmail.com>
chipzoller committed May 24, 2023
1 parent 10a39fb commit 5cb02c4
Showing 22 changed files with 701 additions and 196 deletions.
7 changes: 6 additions & 1 deletion config/_default/menus/menu.en.toml
@@ -21,8 +21,13 @@
url = "/resources"

[[main]]
name = "Community"
name = "Playground"
weight = -98
url = "https://playground.kyverno.io/"

[[main]]
name = "Community"
weight = -97
url = "/community"

[[main]]
47 changes: 26 additions & 21 deletions content/en/docs/Troubleshooting/_index.md
@@ -8,9 +8,11 @@ Although Kyverno's goal is to make policy simple, sometimes trouble still strikes

## API server is blocked

**Symptom**: Kyverno Pods are not running and the API server is timing out due to webhook timeouts. This can happen if the Kyverno Pods are not gracefully terminated, or if there is a cluster outage, and policies were configure to [fail-closed](/docs/writing-policies/policy-settings/).
**Symptom**: Kyverno Pods are not running and the API server is timing out due to webhook timeouts. My cluster appears "broken".

**Solution**: Delete the Kyverno validating and mutating webhook configurations and then restart Kyverno.
**Cause**: This can happen if all Kyverno Pods are down, typically due to a cluster outage or improper scaling/killing of full node groups, and policies were configured to [fail-closed](/docs/writing-policies/policy-settings/) while matching on Pods. This is usually only the case when the Kyverno Namespace has not been excluded (exclusion is the default behavior) or when system Namespaces with cluster-critical components, such as `kube-system`, have not been excluded.

**Solution**: Delete the Kyverno validating and mutating webhook configurations. When Kyverno recovers, check your Namespace exclusions. Follow the steps below. Also consider running the admission controller component with 3 replicas.

1. Delete the validating and mutating webhook configurations that instruct the API server to forward requests to Kyverno:

@@ -19,22 +21,22 @@
kubectl delete validatingwebhookconfiguration kyverno-resource-validating-webhook-cfg
kubectl delete mutatingwebhookconfiguration kyverno-resource-mutating-webhook-cfg
```

Note that these two webhook configurations are used for resources. Other Kyverno webhooks are for internal operations and typically do not need to be deleted.
Note that these two webhook configurations are used for resources. Other Kyverno webhooks are for internal operations and typically do not need to be deleted. When Kyverno recovers, its webhooks will be recreated based on the currently-installed policies.

2. Restart Kyverno

Either delete the Kyverno Pods or scale the Deployment down to zero and then up. For example, for an installation with three replicas in the default Namespace use:
This step is typically not necessary. If it is, either delete the Kyverno Pods or scale the Deployment down to zero and then back up. For example, for an installation with three replicas in the default Namespace use:

```sh
kubectl scale deploy kyverno -n kyverno --replicas 0
kubectl scale deploy kyverno -n kyverno --replicas 3
kubectl scale deploy kyverno-admission-controller -n kyverno --replicas 0
kubectl scale deploy kyverno-admission-controller -n kyverno --replicas 3
```

3. Consider excluding namespaces

Use [Namespace selectors](/docs/installation/customization/#namespace-selectors) to filter requests to system Namespaces. Note that this configuration bypasses all policy checks on select Namespaces and may violate security best practices. When excluding Namespaces, it is the user's responsibility to ensure other controls such as Kubernetes RBAC are configured since Kyverno cannot apply any policies to objects therein. For more information, see the [Security vs Operability](/docs/installation/#security-vs-operability) section. The Kyverno Namespace is excluded by default.
Use [Namespace selectors](/docs/installation/customization/#namespace-selectors) to filter requests to system Namespaces. Note that this configuration bypasses all policy checks on select Namespaces and may violate security best practices. When excluding Namespaces, it is your responsibility to ensure other controls such as Kubernetes RBAC are configured since Kyverno cannot apply any policies to objects therein. For more information, see the [Security vs Operability](/docs/installation/#security-vs-operability) section. The Kyverno Namespace is excluded by default. If running Kyverno on certain PaaS platforms, additional Namespaces such as `kube-system` may need to be excluded as well; a sketch of such an exclusion is shown below.
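
As an illustration only, a minimal sketch of such an exclusion expressed as Helm values. The `config.webhooks` key and the label used here are assumptions based on a default 1.10 chart; verify against the customization documentation linked above.

```yaml
# Hedged sketch: exclude the kyverno and kube-system Namespaces from the resource webhooks
config:
  webhooks:
    - namespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: NotIn
            values:
              - kyverno
              - kube-system
```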

## Policies not applied
## Policies are not applied

**Symptom**: My policies are created but nothing seems to happen when I create a resource that should trigger them.

@@ -45,8 +47,8 @@

```sh
$ kubectl get cpol,pol -A
NAME                                           BACKGROUND   VALIDATE ACTION   READY   AGE
clusterpolicy.kyverno.io/check-image-keyless   true         Enforce           true    116s
NAME                BACKGROUND   VALIDATE ACTION   READY   AGE   MESSAGE
inject-entrypoint   true         Audit             True    15s   Ready
```

3. Kyverno registers as two types of webhooks with Kubernetes. Check the status of registered webhooks to ensure Kyverno is among them.
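
As an illustration only, a hedged one-liner for listing the registered webhooks mentioned in this step (resource names assume a default installation):

```sh
# List the registered webhook configurations and confirm the Kyverno entries are present
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations | grep kyverno
```
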
@@ -84,27 +86,30 @@

6. Check and ensure you aren't creating a resource that is either excluded from Kyverno's processing by default or created in an excluded Namespace. Kyverno uses a ConfigMap, by default called `kyverno` in the Kyverno Namespace, to filter out some of these things. The key name is `resourceFilters` and more details can be found [here](/docs/installation/customization/#resource-filters).
## Kyverno consumes a lot of resources
7. Check the same ConfigMap and ensure that the user/principal or group responsible for submission of your resource is not being excluded. Check the `excludeGroups` and `excludeUsernames` fields, among others, if they exist; a command sketch for inspecting this ConfigMap follows this list.
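
As an illustration only, a hedged sketch for inspecting the filters and exclusions discussed in the last two items (the ConfigMap name and keys assume a default installation):

```sh
# Dump the Kyverno ConfigMap and inspect resourceFilters, excludeGroups, and excludeUsernames
kubectl -n kyverno get cm kyverno -o yaml
```
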
## Kyverno consumes a lot of resources or I see OOMKills
**Symptom**: Kyverno is using too much memory or CPU. How can I understand what is causing this?
**Solution**: Follow the steps on the [Kyverno wiki](https://github.com/kyverno/kyverno/wiki/Profiling-Kyverno-on-Kubernetes) for enabling memory and CPU profiling. Additionally, gather how many ConfigMap and Secret resources exist in your cluster by running the following command:
**Solution**: It is important to understand how Kyverno receives and processes work in order to know whether what you deem "too much" is, in fact, too much. Kyverno dynamically configures its webhooks (by default, but this is configurable) according to the policies which are loaded and the resources on which they match. There is no easy rubric to follow where resource requirements are directly proportional to, for example, the number of Pods or Nodes in a cluster. The following questions need to be asked and answered to build a full picture of the resources consumed by Kyverno.
```sh
kubectl get cm,secret -A | wc -l
```
1. What policies are in the cluster and on what types of resources do they match? Policies which match on wildcards (`"*"`) cause a tremendous load on Kyverno and should be avoided if possible as they instruct the Kubernetes API server to send to Kyverno _every action on every resource_ in the cluster. Even if Kyverno does not have matching policies for most of these resources, it is _required_ to respond to every single one. If even one policy matches on a wildcard, expect the resources needed by Kyverno to easily double, triple, or more.
2. Which controller is experiencing the load? Each Kyverno controller has different responsibilities. See the [controller guide](/docs/high-availability/#controllers-in-kyverno) for more details. Each controller can be independently scaled, but before immediately scaling in any direction take the time to study the load.
3. Are the default requests and limits still in effect? It is possible the amount of load Kyverno (or any of its controllers) is experiencing is beyond the capabilities of the default requests and limits. These defaults have been selected based on a good mix of real-world usage and feedback but **may not suit everyone**. In clusters which are extremely large and active from Kyverno's perspective, you may need to increase these.
4. What do your monitoring metrics say? Kyverno is a critical piece of cluster infrastructure and must be monitored effectively just like other pieces. There are several metrics which give a sense of how active Kyverno is, the most important being [admission request count](/docs/monitoring/admission-requests/). Others include consumed memory and CPU utilization. Sizing should always be done based on peak consumption and not averages.
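
Related to questions 3 and 4 above, a hedged sketch for observing consumption. It assumes the [Kubernetes metrics-server](https://github.com/kubernetes-sigs/metrics-server) is installed and a default `kyverno` Namespace; remember to size on peaks, not averages.

```sh
# Show current CPU and memory usage per container in the Kyverno Namespace
kubectl top pods -n kyverno --containers
```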

After gathering this information, [create an issue](https://github.com/kyverno/kyverno/issues/new/choose) in the Kyverno GitHub repository and reference it.
You can also follow the steps on the [Kyverno wiki](https://github.com/kyverno/kyverno/wiki/Profiling-Kyverno-on-Kubernetes) for enabling memory and CPU profiling.

**Symptom**: I'm using AKS and Kyverno is using too much memory or CPU
**Symptom**: I'm using AKS and Kyverno is using too much memory or CPU or produces many audit logs
**Solution**: On AKS the kyverno webhooks will be mutated by the AKS [Admissions Enforcer](https://learn.microsoft.com/en-us/azure/aks/faq#can-admission-controller-webhooks-impact-kube-system-and-internal-aks-namespaces) Plugin, that can lead to an endless update loop. To prevent that behaviour, you can set the annotation `"admissions.enforcer/disabled": true` to all kyverno webhooks. When installing via Helm, you can add the annotation with `config.webhookAnnotations`.
**Solution**: On AKS the Kyverno webhooks will be mutated by the AKS [Admissions Enforcer](https://learn.microsoft.com/en-us/azure/aks/faq#can-admission-controller-webhooks-impact-kube-system-and-internal-aks-namespaces) plugin, which can lead to an endless update loop. To prevent that behavior, set the annotation `"admissions.enforcer/disabled": true` on all Kyverno webhooks. When installing via Helm, the annotation can be added with `config.webhookAnnotations`; a sketch is shown below.
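
As an illustration only, a minimal Helm values sketch of the annotation just described (`config.webhookAnnotations` comes from the text above; the exact quoting of the value is an assumption):

```yaml
# Hedged sketch: annotate Kyverno's webhooks so the AKS Admissions Enforcer leaves them alone
config:
  webhookAnnotations:
    admissions.enforcer/disabled: "true"
```
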
## Kyverno is slow to respond
**Symptom**: Kyverno's operation seems slow in either mutating resources or validating them, causing additional time to create resources in the Kubernetes cluster.

**Solution**: Check the Kyverno logs for messages about throttling. If many are found, this indicates Kyverno is making too many API calls in too rapid a succession which the Kubernetes API server will throttle. Increase the values, or set the [flags](/docs/installation/customization/#container-flags), `--clientRateLimitQPS` and `--clientRateLimitBurst`. Try values `100` for each and increase as needed.
**Solution**: Check the Kyverno logs for messages about throttling. If many are found, this indicates Kyverno is making too many API calls in too rapid a succession, which the Kubernetes API server will throttle. Increase the values of, or set, the [flags](/docs/installation/customization/#container-flags) `--clientRateLimitQPS` and `--clientRateLimitBurst`. While these flags have sensible default values arrived at through extensive field trials, in some cases they may need to be increased.
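
As an illustration only, a hedged sketch of how these flags might appear among the container `args` of a Kyverno controller Deployment (the values shown are arbitrary examples, not recommendations):

```yaml
# Hedged sketch: client-side rate limit flags among a Kyverno controller's container args
args:
  - --clientRateLimitQPS=100
  - --clientRateLimitBurst=200
```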

## Policies are partially applied

@@ -114,15 +119,15 @@

1. Check the Pod logs from Kyverno. Assuming Kyverno was installed into the default Namespace called `kyverno` use the command `kubectl -n kyverno logs <kyverno_pod_name>` to show the logs. To watch the logs live, add the `-f` switch for the "follow" option.

2. If no helpful information is being displayed at the default logging level, increase the level of verbosity by editing the Kyverno Deployment. To edit the Deployment, assuming Kyverno was installed into the default Namespace, use the command `kubectl -n kyverno edit deploy kyverno`. Find the `args` section for the container named `kyverno` and either add the `-v` switch or increase to a higher level. The flag `-v=6` will increase the logging level to its highest. Take care to revert this change once troubleshooting steps are concluded.
2. If no helpful information is being displayed at the default logging level, increase the level of verbosity by editing the Kyverno Deployment. To edit the Deployment, assuming Kyverno was installed into the default Namespace, use the command `kubectl -n kyverno edit deploy kyverno-<controller_type>-controller`. Find the `args` section for the container named `kyverno` and either add the `-v` switch or increase to a higher level. The flag `-v=6` will increase the logging level to its highest. Take care to revert this change once troubleshooting steps are concluded.
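
A hedged recap of the two steps above as commands (the Deployment and container names assume a default 1.10 installation and may differ in yours):

```sh
# Follow logs live from the admission controller
kubectl -n kyverno logs deploy/kyverno-admission-controller -f

# Edit the Deployment to add or raise the -v flag (e.g., -v=6) on the kyverno container
kubectl -n kyverno edit deploy kyverno-admission-controller
```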

## Kyverno exits

**Symptom**: I have a large cluster with many objects and many Kyverno policies. Kyverno is seen to sometimes crash.

**Solution**: In cases of very large scale, it may be required to increase the memory limit of the Kyverno Pod so it can keep track of these objects.

1. Edit the Kyverno Deployment and increase the memory limit on the `kyverno` container by using the command `kubectl -n kyverno edit deploy kyverno`. Change the `resources.limits.memory` field to a larger value. Continue to monitor the memory usage by using something like the [Kubernetes metrics-server](https://github.com/kubernetes-sigs/metrics-server#installation).
1. First, see the [above troubleshooting section](#kyverno-consumes-a-lot-of-resources-or-i-see-oomkills). If changes are required, edit the necessary Kyverno Deployment and increase the memory limit on the container. Change the `resources.limits.memory` field to a larger value. Continue to monitor the memory usage by using something like the [Kubernetes metrics-server](https://github.com/kubernetes-sigs/metrics-server#installation).
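
As one possible approach, a hedged sketch of raising the memory limit imperatively rather than by editing the manifest (the Deployment and container names are assumptions for a default installation; pick a value informed by your profiling):

```sh
# Raise the memory limit on the admission controller's kyverno container
kubectl -n kyverno set resources deploy/kyverno-admission-controller -c kyverno --limits=memory=1Gi
```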

## Kyverno fails on GKE

2 changes: 1 addition & 1 deletion content/en/docs/Writing policies/_index.md
@@ -3,5 +3,5 @@ title: "Writing Policies"
linkTitle: "Writing Policies"
weight: 50
description: >
Create policies which can validate, mutate, and generate resources as well as perform verification of container images.
Create policies which can validate, mutate, generate, and clean up resources as well as perform verification of container images.
---
16 changes: 6 additions & 10 deletions content/en/docs/Writing policies/autogen.md
@@ -7,10 +7,6 @@ weight: 110

Pods are one of the most common object types in Kubernetes and as such are the focus of most types of validation rules. But creation of Pods directly is almost never done as it is considered an anti-pattern. Instead, Kubernetes has many higher-level controllers that directly or indirectly manage Pods, namely the Deployment, DaemonSet, StatefulSet, Job, and CronJob resources. Writing policy that targets Pods but must be written for every one of these controllers would be tedious and inefficient. Kyverno solves this issue by supporting automatic generation of policy rules for higher-level controllers from a rule written exclusively for a Pod. For rules which match on Pods in addition to other kinds, auto-generation is not activated.

{{% alert title="Note" color="info" %}}
Kyverno 1.9 adds support for including ReplicaSets and ReplicationControllers to auto-gen rules. These two intermediary controllers share the same Pod template schema as DaemonSets, Deployments, StatefulSets, and Jobs. Although these intermediary controllers have rules auto-generated, the Kyverno ConfigMap may need to be updated to remove default [resource filters](/docs/installation/customization/#resource-filters) for them.
{{% /alert %}}

For example, when creating a validation policy like below which checks that all images come from an internal, trusted registry, the policy applies to all resources capable of generating Pods.

```yaml
@@ -92,9 +88,11 @@ status:
- image: registry.domain.com/*
```

This auto-generation behavior is controlled by the `pod-policies.kyverno.io/autogen-controllers` annotation.
{{% alert title="Note" color="info" %}}
Auto-gen rules also cover ReplicaSets and ReplicationControllers. These two intermediary controllers share the same Pod template schema as DaemonSets, Deployments, StatefulSets, and Jobs. Although these intermediary controllers have rules auto-generated, the Kyverno ConfigMap may need to be updated to remove default [resource filters](/docs/installation/customization/#resource-filters) for them.
{{% /alert %}}

You can change the annotation `pod-policies.kyverno.io/autogen-controllers` to customize the target Pod controllers for the auto-generated rules. For example, Kyverno generates a rule for a `Deployment` if the annotation of policy is defined as `pod-policies.kyverno.io/autogen-controllers=Deployment`.
Rule auto-generation behavior is controlled by the policy annotation `pod-policies.kyverno.io/autogen-controllers`. You can change the value of the annotation to customize the target Pod controllers for the auto-generated rules. For example, Kyverno generates rules for a `Deployment` and `Job` if the annotation is defined as `pod-policies.kyverno.io/autogen-controllers=Deployment,Job`. To disable auto-generating rules for Pod controllers entirely, set it to the value `none`.
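
As an illustration only, a hedged sketch of where this annotation sits on a policy (the policy name is hypothetical; the annotation key and value come from the text above):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries   # hypothetical policy name
  annotations:
    # generate matching rules only for Deployments and Jobs in addition to Pods
    pod-policies.kyverno.io/autogen-controllers: Deployment,Job
```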

Kyverno skips generating Pod controller rules whenever the following `resources` fields/objects are specified in a `match` or `exclude` block as these filters may not be applicable to Pod controllers:

@@ -104,9 +102,7 @@

Additionally, Kyverno only auto-generates rules when the resource kind specified in a combination of `match` and `exclude` is no more than `Pod`. Mutate rules which match on `Pod` and use a JSON patch are also excluded from rule auto-generation as noted [here](/docs/writing-policies/mutate/#rfc-6902-jsonpatch).

To disable auto-generating rules for Pod controllers set `pod-policies.kyverno.io/autogen-controllers` to the value `none`.

When disabling auto-generation rules for select Pod controllers, Kyverno still applies policy matching on Pods to those spawned by those controllers. To exempt these Pods, use [preconditions](/docs/writing-policies/preconditions/) with an expression similar to the below which may allow Pods created by a Job controller to pass.
When disabling auto-generation rules for select Pod controllers, or when auto-generation does not apply, Kyverno still applies policy matching on Pods to those spawned by those controllers. To exempt these Pods, use [preconditions](/docs/writing-policies/preconditions/) with an expression similar to the below which may allow Pods created by a Job controller to pass.

```yaml
- key: Job
@@ -116,7 +112,7 @@

## Exclusion by Metadata

In some cases it may be desirable to use an `exclude` block applied to Pods that uses either labels or annotations. For example, the following `match` and `exclude` statement may be written, the purpose of which would be to match any Pods except those that have the annotation `policy.test/require-requests-limits=skip`.
In some cases, it may be desirable to use an `exclude` block applied to Pods that uses either labels or annotations. For example, the following `match` and `exclude` statement may be written, the purpose of which would be to match any Pods except those that have the annotation `policy.test/require-requests-limits=skip`.

```yaml
rules:
