@zhiying-lin
Collaborator

Description of your changes

In the rollout controller, check whether the master resource snapshot has been created before using it. This fixes the nil pointer panic seen in the e2e tests.

test failure

2025-11-28T00:03:22Z    ERROR   Observed a panic    {"controller": "resource-placement-rollout-controller", "namespace": "application-1", "name": "rp-1", "reconcileID": "cc323a34-c4f5-42b2-9d75-ac33caefb1ce", "panic": "runtime error: invalid memory address or nil pointer dereference", "panicGoValue": "\"invalid memory address or nil pointer dereference\"", "stacktrace": "goroutine 1907 [running]:\nk8s.io/apimachinery/pkg/util/runtime.logPanic({0x3718818, 0xc0032e4990}, {0x2fdcce0, 0x4b07c10})\n\t/go/pkg/mod/k8s.io/apimachinery@v0.34.1/pkg/util/runtime/runtime.go:132 +0xbc\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile.func1()\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:108 +0x112\npanic({0x2fdcce0?, 0x4b07c10?})\n\t/usr/local/go/src/runtime/panic.go:792 +0x132\ngithub.com/kubefleet-dev/kubefleet/pkg/controllers/rollout.createUpdateInfo({0x3745198, 0xc00206c008}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0})\n\t/workspace/pkg/controllers/rollout/controller.go:316 +0xde\ngithub.com/kubefleet-dev/kubefleet/pkg/controllers/rollout.(*Reconciler).pickBindingsToRoll(0xc0004fe4b0, {0x3718818, 0xc0032e4990}, {0xc000d148a0, 0x2, 0xc00287c960?}, {0x0, 0x0}, {0x3744f08, 0xc00131a3c0}, ...)\n\t/workspace/pkg/controllers/rollout/controller.go:424 +0x1fa5\ngithub.com/kubefleet-dev/kubefleet/pkg/controllers/rollout.(*Reconciler).Reconcile(0xc0004fe4b0, {0x3718818, 0xc0032e4990}, {{{0xc002696130?, 0x33944c5?}, {0xc002696110?, 0x100?}}})\n\t/workspace/pkg/controllers/rollout/controller.go:168 +0x18eb\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile(0xc0032e4900?, {0x3718818?, 0xc0032e4990?}, {{{0xc002696130?, 0x0?}, {0xc002696110?, 0x0?}}})\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:119 +0xbf\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler(0x3743040, {0x3718850, 
0xc0004ffa90}, {{{0xc002696130, 0xd}, {0xc002696110, 0x4}}}, 0x0)\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:340 +0x3ad\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem(0x3743040, {0x3718850, 0xc0004ffa90})\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:300 +0x21b\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.1()\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:202 +0x85\ncreated by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2 in goroutine 295\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:198 +0x28f\n"}
k8s.io/apimachinery/pkg/util/runtime.logPanic
    /go/pkg/mod/k8s.io/apimachinery@v0.34.1/pkg/util/runtime/runtime.go:142
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile.func1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:108
runtime.gopanic
    /usr/local/go/src/runtime/panic.go:792
runtime.panicmem
    /usr/local/go/src/runtime/panic.go:262
runtime.sigpanic
    /usr/local/go/src/runtime/signal_unix.go:925
github.com/kubefleet-dev/kubefleet/pkg/controllers/rollout.createUpdateInfo
    /workspace/pkg/controllers/rollout/controller.go:316
github.com/kubefleet-dev/kubefleet/pkg/controllers/rollout.(*Reconciler).pickBindingsToRoll
    /workspace/pkg/controllers/rollout/controller.go:424
github.com/kubefleet-dev/kubefleet/pkg/controllers/rollout.(*Reconciler).Reconcile
    /workspace/pkg/controllers/rollout/controller.go:168
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:340
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:300
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:202
2025-11-28T00:03:22Z    ERROR   Reconciler error    {"controller": "resource-placement-rollout-controller", "namespace": "application-1", "name": "rp-1", "reconcileID": "cc323a34-c4f5-42b2-9d75-ac33caefb1ce", "error": "panic: runtime error: invalid memory address or nil pointer dereference [recovered]"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:353
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:300
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.21.0/pkg/internal/controller/controller.go:202

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

Signed-off-by: Zhiying Lin <zhiyingl456@gmail.com>
@codecov

codecov bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@weng271190436 weng271190436 merged commit b2198c0 into kubefleet-dev:main Dec 1, 2025
25 of 27 checks passed
"placement", placementObjRef)
return runtime.Result{}, err
}
if masterResourceSnapshot == nil {
Contributor

When would this happen?

Collaborator Author

We create the policy snapshot first and then the resource snapshots.
The scheduler then creates the binding, and the binding triggers the rollout controller.
Because creating the resource snapshots for large resources takes some time, the rollout controller may run before they exist and fail to find them.
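
To make the race concrete, here is a minimal, self-contained sketch of the guard pattern. It is not the code in pkg/controllers/rollout/controller.go: the types `snapshot` and `result`, the function `reconcile`, the `lookupFunc` signature, and the `requeueDelay` value are all hypothetical stand-ins, and requeueing when the snapshot is missing is an assumption about how such a case could be handled, not a statement of what this PR does.

```go
package main

import (
	"fmt"
	"time"
)

// snapshot is a hypothetical stand-in for the master resource snapshot.
type snapshot struct {
	name string
}

// result is a simplified, hypothetical stand-in for a reconcile result.
type result struct {
	requeueAfter time.Duration
}

// requeueDelay is an assumed value chosen only for illustration.
const requeueDelay = 5 * time.Second

// lookupFunc is a hypothetical lookup that returns (nil, nil) when the
// master resource snapshot has not been created yet.
type lookupFunc func(placement string) (*snapshot, error)

// reconcile shows the guard: a nil snapshot is treated as "not ready yet"
// and the request is retried later instead of dereferencing a nil pointer.
func reconcile(placement string, lookup lookupFunc) (result, error) {
	master, err := lookup(placement)
	if err != nil {
		return result{}, err
	}
	if master == nil {
		// The binding can trigger this controller before the resource
		// snapshots exist, so back off rather than continue.
		return result{requeueAfter: requeueDelay}, nil
	}
	// Safe to use the snapshot from here on.
	_ = master.name
	return result{}, nil
}

func main() {
	// Snapshot not created yet: expect a requeue and no error.
	notReady := func(string) (*snapshot, error) { return nil, nil }
	res, err := reconcile("rp-1", notReady)
	fmt.Println(res.requeueAfter, err)

	// Snapshot present: the rollout logic can proceed.
	ready := func(string) (*snapshot, error) {
		return &snapshot{name: "example-snapshot"}, nil
	}
	res, err = reconcile("rp-1", ready)
	fmt.Println(res.requeueAfter, err)
}
```

Whether to requeue, return without error, or wait for a later event is a design choice; the only point illustrated here is that the nil case is handled before the snapshot is dereferenced.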

Contributor

Should we reverse the order then?
