Description
Hello,

Issue description: In an EKS cluster we are deploying opa-gatekeeper with the following MutatingWebhookConfiguration:
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  annotations:
    meta.helm.sh/release-name: gatekeeper
    meta.helm.sh/release-namespace: gatekeeper-system
  creationTimestamp: "2024-10-09T05:14:31Z"
  generation: 18
  labels:
    app: gatekeeper
    app.kubernetes.io/managed-by: Helm
    chart: gatekeeper
    gatekeeper.sh/system: "yes"
    heritage: Helm
    release: gatekeeper
  name: gatekeeper-mutating-webhook-configuration
  resourceVersion: "919873785"
  uid: 6654cbb4-73e7-4858-8182-409db2a597ef
webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    caBundle: **************************
    service:
      name: gatekeeper-webhook-service
      namespace: gatekeeper-system
      path: /v1/mutate
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: mutation.gatekeeper.sh
  namespaceSelector:
    matchExpressions:
    - key: admission.gatekeeper.sh/ignore
      operator: DoesNotExist
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - '*'
    apiVersions:
    - '*'
    operations:
    - CREATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - '*'
    apiVersions:
    - '*'
    operations:
    - UPDATE
    resources:
    - services
    scope: '*'
  sideEffects: None
  timeoutSeconds: 1
```
The `gatekeeper-system` namespace has the label `admission.gatekeeper.sh/ignore: no-self-managing`.
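To double-check that the self-exclusion is actually in effect, it can help to compare the namespace label with the webhook's `namespaceSelector` (a diagnostic sketch; assumes `kubectl` access to the affected cluster, output not shown):

```shell
# Show the labels on the gatekeeper namespace; the
# admission.gatekeeper.sh/ignore label should be present
kubectl get namespace gatekeeper-system --show-labels

# Show the selector the webhook uses to skip labelled namespaces
kubectl get mutatingwebhookconfiguration gatekeeper-mutating-webhook-configuration \
  -o jsonpath='{.webhooks[0].namespaceSelector}'
```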
The opa-gatekeeper Fargate profile is deployed using Terraform with the following configuration:
```hcl
opa-gatekeeper = {
  name = "opa-gatekeeper"
  selectors = [
    {
      namespace = "gatekeeper-system"
      labels = {
        "app.kubernetes.io/name" = "gatekeeper"
        "control-plane"          = "controller-manager"
      }
    }
  ]
  timeouts = {
    create = "20m"
    delete = "20m"
  }
  tags = local.common_tags
}
```
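Fargate only schedules pods whose labels match the profile selectors exactly, so it may also be worth confirming that the controller-manager pods carry both labels listed in the profile above (a diagnostic sketch; assumes `kubectl` access to the affected cluster):

```shell
# List the controller-manager pods with their labels; both
# app.kubernetes.io/name=gatekeeper and control-plane=controller-manager
# must be present for the Fargate profile selector to match
kubectl get pods -n gatekeeper-system --show-labels
```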
Problem: When gatekeeper-controller-manager runs as a Fargate pod with a single replica, it cannot be provisioned again after the pod is deleted. A Fargate node is nominated, but the pod hangs in the Pending state:
```console
$ kubectl get pods -n gatekeeper-system -w -o wide
NAME                                             READY   STATUS    RESTARTS   AGE    IP       NODE     NOMINATED NODE                                READINESS GATES
gatekeeper-controller-manager-6c5c99fb84-tnrc2   0/1     Pending   0          5m4s   <none>   <none>   ee1851c444-15e3d73c35244fc4a90fd176a93f9f8a   <none>
```
I see the following error in the Kubernetes events:

```text
102s  Warning  FailedScheduling  pod/gatekeeper-controller-manager-6c5c99fb84-tnrc2  Pod provisioning timed out (will retry) for pod: gatekeeper-system/gatekeeper-controller-manager-6c5c99fb84-tnrc2
```
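One possible explanation (an assumption on my part, not confirmed from the report): a `namespaceSelector` only filters namespaced objects, so with rules matching `resources: ['*']` and `scope: '*'` the webhook also intercepts cluster-scoped objects created during Fargate provisioning, and with `failurePolicy: Fail` and the only webhook backend stuck in Pending this can deadlock. If that turns out to be the cause, narrowing the rules to namespaced resources is one way to keep `failurePolicy: Fail` while breaking the cycle (a sketch, not the chart's shipped configuration):

```yaml
# Hypothetical narrowing of the webhook rules: intercept only
# namespaced objects, so cluster-scoped creations (e.g. Node
# registration) are never blocked by an unavailable backend.
rules:
- apiGroups: ['*']
  apiVersions: ['*']
  operations: [CREATE]
  resources: ['*']
  scope: 'Namespaced'
- apiGroups: ['*']
  apiVersions: ['*']
  operations: [UPDATE]
  resources: ['services']
  scope: 'Namespaced'
```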
PS1: The issue does not occur when the application is deployed on EC2 instances.
PS2: The issue also occurs with other applications provisioned on Fargate (tested with Karpenter).
PS3: failurePolicy has been set to Fail on purpose; it is a business requirement.