
Controller keeps crashing with a panic error (v0.2.6) #463

Closed
black-mirror-1 opened this issue Jun 18, 2021 · 3 comments
@black-mirror-1

I have encountered this error with v0.2.6: the controller keeps crashing with an invalid memory address error. Please find the logs below.

Karpenter image used: public.ecr.aws/karpenter/controller:v0.2.6@sha256:e5e41d5dcb6597cb3cba3c09451a1c6de1c4c6ac7f6216c0c2a0ed788fa8c362

kubectl get pods -n karpenter
NAME                                    READY   STATUS             RESTARTS   AGE
karpenter-controller-649c856cbb-n97dp   0/1     CrashLoopBackOff   8          26m
karpenter-webhook-866cdb7865-kw7hm      1/1     Running

E0618 20:55:19.602094 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 321 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1d1d5e0, 0x30ea700)
k8s.io/apimachinery@v0.19.7/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
k8s.io/apimachinery@v0.19.7/pkg/util/runtime/runtime.go:48 +0x89
panic(0x1d1d5e0, 0x30ea700)
runtime/panic.go:969 +0x1b9
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation.(*Utilization).markUnderutilized(0xc0001297c0, 0x23354c0, 0xc001c8fa40, 0xc000223c00, 0xc000477020, 0x1)
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation/utilization.go:84 +0x688
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation.(*Controller).Reconcile(0xc0014323a0, 0x23354c0, 0xc001c8fa40, 0x235e640, 0xc000223c00, 0x200c786, 0x5, 0xc000477020, 0x1)
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation/controller.go:72 +0x65
github.com/awslabs/karpenter/pkg/controllers.(*GenericController).Reconcile(0xc0014325e0, 0x23354c0, 0xc001c8fa40, 0xc001b55ae0, 0x7, 0xc001b55ad0, 0x7, 0xc001c8fa40, 0x7fdedb5e0db8, 0xc000411800, ...)
github.com/awslabs/karpenter/pkg/controllers/controller.go:58 +0x22b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000700c80, 0x2335400, 0xc001936c00, 0x1dc0420, 0xc001c89fa0)
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:263 +0x2f5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000700c80, 0x2335400, 0xc001936c00, 0x0)
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x2335400, 0xc001936c00)
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:198 +0x4a
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:185 +0x37
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0005f5750)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001ceff50, 0x22f4900, 0xc001c8f980, 0xc001936c01, 0xc000739260)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:156 +0xad
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0005f5750, 0x3b9aca00, 0x0, 0x1, 0xc000739260)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2335400, 0xc001936c00, 0xc000476f90, 0x3b9aca00, 0x0, 0x20d5201)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:185 +0xa6
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x2335400, 0xc001936c00, 0xc000476f90, 0x3b9aca00)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:99 +0x57
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:195 +0x4e7
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1a7c9e8]
goroutine 321 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
k8s.io/apimachinery@v0.19.7/pkg/util/runtime/runtime.go:55 +0x10c
panic(0x1d1d5e0, 0x30ea700)
runtime/panic.go:969 +0x1b9
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation.(*Utilization).markUnderutilized(0xc0001297c0, 0x23354c0, 0xc001c8fa40, 0xc000223c00, 0xc000477020, 0x1)
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation/utilization.go:84 +0x688
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation.(*Controller).Reconcile(0xc0014323a0, 0x23354c0, 0xc001c8fa40, 0x235e640, 0xc000223c00, 0x200c786, 0x5, 0xc000477020, 0x1)
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/reallocation/controller.go:72 +0x65
github.com/awslabs/karpenter/pkg/controllers.(*GenericController).Reconcile(0xc0014325e0, 0x23354c0, 0xc001c8fa40, 0xc001b55ae0, 0x7, 0xc001b55ad0, 0x7, 0xc001c8fa40, 0x7fdedb5e0db8, 0xc000411800, ...)
github.com/awslabs/karpenter/pkg/controllers/controller.go:58 +0x22b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000700c80, 0x2335400, 0xc001936c00, 0x1dc0420, 0xc001c89fa0)
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:263 +0x2f5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000700c80, 0x2335400, 0xc001936c00, 0x0)
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:235 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x2335400, 0xc001936c00)
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:198 +0x4a
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:185 +0x37
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0005f5750)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001ceff50, 0x22f4900, 0xc001c8f980, 0xc001936c01, 0xc000739260)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:156 +0xad
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0005f5750, 0x3b9aca00, 0x0, 0x1, 0xc000739260)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2335400, 0xc001936c00, 0xc000476f90, 0x3b9aca00, 0x0, 0x20d5201)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:185 +0xa6
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x2335400, 0xc001936c00, 0xc000476f90, 0x3b9aca00)
k8s.io/apimachinery@v0.19.7/pkg/util/wait/wait.go:99 +0x57
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
sigs.k8s.io/controller-runtime@v0.7.0-alpha.3/pkg/internal/controller/controller.go:195 +0x4e7

@black-mirror-1 black-mirror-1 changed the title Controller keeps crashing with a panic error (v1.26) Controller keeps crashing with a panic error (v0.26) Jun 18, 2021
@black-mirror-1 black-mirror-1 changed the title Controller keeps crashing with a panic error (v0.26) Controller keeps crashing with a panic error (v0.2.6) Jun 18, 2021
@ellistarn (Contributor) commented Jun 18, 2021

Thanks for the bug report! It looks like your defaulting webhook (the component that applies defaults to resources) failed to apply TTLSeconds to your provisioner, which results in a panic.
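To make the failure mode concrete, here is a minimal sketch (not Karpenter's actual types; `ProvisionerSpec` and `readTTL` are hypothetical stand-ins) of how an optional field left unset by a skipped defaulting webhook produces exactly this nil pointer dereference:

```go
package main

import "fmt"

// Hypothetical stand-in for the provisioner spec: TTLSeconds is a
// pointer so "unset" is distinguishable from zero. If the defaulting
// webhook never runs, the field stays nil.
type ProvisionerSpec struct {
	TTLSeconds *int64
}

// readTTL dereferences the field the way a controller implicitly
// would; on a nil pointer this panics, which we recover and report.
func readTTL(spec ProvisionerSpec) (ttl int64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return *spec.TTLSeconds, nil
}

func main() {
	// Webhook defaulting was skipped, so TTLSeconds is nil.
	_, err := readTTL(ProvisionerSpec{})
	fmt.Println(err) // runtime error: invalid memory address or nil pointer dereference
}
```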

My webhooks look like this:

kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io defaulting.webhook.provisioners.karpenter.sh -ojson | jq -r ".webhooks[0].rules"
[
  {
    "apiGroups": [
      "provisioning.karpenter.sh"
    ],
    "apiVersions": [
      "v1alpha1"
    ],
    "operations": [
      "CREATE",
      "UPDATE"
    ],
    "resources": [
      "provisioners",
      "provisioners/status"
    ],
    "scope": "*"
  }
]

You can see that my defaulting webhook applies to provisioners on CREATE and UPDATE. If these settings don't exist, then the API Server won't execute defaulting logic. Our karpenter-webhook process injects these values on startup. I think a good path forward is to:

  1. Protect the karpenter controller by defaulting the in-memory provisioner object (i.e. enforce invariants) before each reconcile loop.
  2. Statically include provisioner in our webhooks YAML rather than relying on our karpenter-webhook process to inject them.
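Step 1 above can be sketched as follows. This is a minimal illustration under assumed names (`ProvisionerSpec`, `applyDefaults`, and the 300-second fallback are hypothetical, not Karpenter's real API): enforce invariants on the in-memory object before reconciling, so a skipped webhook can no longer cause a nil dereference:

```go
package main

import "fmt"

// Hypothetical stand-in for the provisioner spec.
type ProvisionerSpec struct {
	TTLSeconds *int64 // nil means the defaulting webhook never ran
}

// defaultTTLSeconds is a hypothetical fallback value.
const defaultTTLSeconds int64 = 300

// applyDefaults fills in any field the webhook should have set,
// making it safe to dereference pointers later in the reconcile loop.
func applyDefaults(spec *ProvisionerSpec) {
	if spec.TTLSeconds == nil {
		ttl := defaultTTLSeconds
		spec.TTLSeconds = &ttl
	}
}

func main() {
	spec := &ProvisionerSpec{} // object arrived without defaults
	applyDefaults(spec)        // run at the top of every Reconcile
	fmt.Println(*spec.TTLSeconds)
}
```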

@njtran (Contributor) commented Jun 22, 2021

Both related PRs have now been merged. Closing this issue.

@njtran njtran closed this as completed Jun 22, 2021
@ellistarn (Contributor)
This will be released in v0.2.7.

@ellistarn ellistarn added the v0.3 label Jun 22, 2021
gfcroft pushed a commit to gfcroft/karpenter-provider-aws that referenced this issue Nov 25, 2023