fix(recipe): deep-copy validation config during merge#992
Conversation
|
Welcome to AICR, @kiwigitops! Thanks for your first pull request. Before review, please ensure:
A maintainer will review this soon. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Enterprise Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds private helpers — cloneValidationConfig, cloneValidationPhase, and cloneNodeSelection — that deep-copy ValidationConfig and its nested fields (constraints, checks, node-selection maps/slices). RecipeMetadataSpec.Merge now uses these helpers instead of pointer-assignment for validation phases. A new table-driven test mutates the overlay after merging and asserts the merged result is unchanged, confirming there is no aliasing of slices/maps between overlay and merged copies. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
mchmarny
left a comment
There was a problem hiding this comment.
Thanks for digging into #984. The branch was opened before the fix landed on main as 05b6ae37, which already solves the same root cause: it adds cloneValidationConfig / cloneValidationPhase in pkg/recipe/validation.go and wires them into Merge. After a rebase onto main, the additions to metadata.go in this PR become duplicate declarations and the package won't compile (verified locally with go build ./pkg/recipe/...).
The PR description mentioned go wasn't on PATH in the Windows environment — that's why this slipped through; CI didn't run either because the gate workflow appears to require maintainer approval for new external contributors.
Recommendation: rebase onto main, remove the helper additions in metadata.go (they're already in validation.go), and keep only the per-phase aliasing test in metadata_test.go — that subtest is the one genuinely useful improvement here, since current main only exercises two of the four phases. Inline comments mark each location.
If a rebase ends up trivial enough that nothing remains beyond the test, closing this PR in favor of a small follow-up scoped to just the test addition would also work.
| func cloneValidationConfig(in *ValidationConfig) *ValidationConfig { | ||
| if in == nil { | ||
| return nil | ||
| } | ||
| return &ValidationConfig{ | ||
| Readiness: cloneValidationPhase(in.Readiness), | ||
| Deployment: cloneValidationPhase(in.Deployment), | ||
| Performance: cloneValidationPhase(in.Performance), | ||
| Conformance: cloneValidationPhase(in.Conformance), | ||
| } | ||
| } | ||
|
|
||
| func cloneValidationPhase(in *ValidationPhase) *ValidationPhase { | ||
| if in == nil { | ||
| return nil | ||
| } | ||
| out := *in | ||
| out.Constraints = append([]Constraint(nil), in.Constraints...) | ||
| out.Checks = append([]string(nil), in.Checks...) | ||
| out.NodeSelection = cloneNodeSelection(in.NodeSelection) | ||
| return &out | ||
| } |
There was a problem hiding this comment.
Both cloneValidationConfig and cloneValidationPhase already exist on main since 05b6ae37 (in pkg/recipe/validation.go, lines 69 and 84), which fixed #984 directly. Adding them here is a redeclaration — go build ./pkg/recipe/... on this branch fails:
pkg/recipe/validation.go:69:6: cloneValidationConfig redeclared in this block
pkg/recipe/metadata.go:620:6: other declaration of cloneValidationConfig
pkg/recipe/validation.go:84:6: cloneValidationPhase redeclared in this block
pkg/recipe/metadata.go:632:6: other declaration of cloneValidationPhase
The reason CI didn't catch this is that the gate workflow likely requires maintainer approval for first-time external contributors — the only checks that ran here were Label PRs and the skipped auto-merge job.
Please rebase onto main; once you do, you'll see the helpers (and the s.Validation = cloneValidationConfig(...) call sites in Merge) are already in place. Both helpers in this hunk should then be removed.
| func cloneNodeSelection(in *NodeSelection) *NodeSelection { | ||
| if in == nil { | ||
| return nil | ||
| } | ||
| out := *in | ||
| if in.Selector != nil { | ||
| out.Selector = make(map[string]string, len(in.Selector)) | ||
| for key, value := range in.Selector { | ||
| out.Selector[key] = value | ||
| } | ||
| } | ||
| out.ExcludeNodes = append([]string(nil), in.ExcludeNodes...) | ||
| return &out | ||
| } |
There was a problem hiding this comment.
cloneNodeSelection is a new helper, but main's cloneValidationPhase (validation.go:84) already clones NodeSelection inline — both the Selector map and ExcludeNodes slice. Extracting it as a named function isn't a bad idea on its own, but it should live in validation.go next to its siblings, not metadata.go. After rebasing, if you still want this extraction, open it as a small refactor PR against the already-fixed code.
| t.Run("merged validation does not alias overlay", func(t *testing.T) { | ||
| phases := []struct { | ||
| name string | ||
| set func(*ValidationConfig, *ValidationPhase) | ||
| get func(*ValidationConfig) *ValidationPhase | ||
| }{ | ||
| { | ||
| name: "readiness", | ||
| set: func(config *ValidationConfig, phase *ValidationPhase) { config.Readiness = phase }, | ||
| get: func(config *ValidationConfig) *ValidationPhase { return config.Readiness }, | ||
| }, | ||
| { | ||
| name: "deployment", | ||
| set: func(config *ValidationConfig, phase *ValidationPhase) { config.Deployment = phase }, | ||
| get: func(config *ValidationConfig) *ValidationPhase { return config.Deployment }, | ||
| }, | ||
| { | ||
| name: "performance", | ||
| set: func(config *ValidationConfig, phase *ValidationPhase) { config.Performance = phase }, | ||
| get: func(config *ValidationConfig) *ValidationPhase { return config.Performance }, | ||
| }, | ||
| { | ||
| name: "conformance", | ||
| set: func(config *ValidationConfig, phase *ValidationPhase) { config.Conformance = phase }, | ||
| get: func(config *ValidationConfig) *ValidationPhase { return config.Conformance }, | ||
| }, | ||
| } | ||
|
|
||
| for _, tt := range phases { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| base := RecipeMetadataSpec{} | ||
| overlay := RecipeMetadataSpec{Validation: &ValidationConfig{}} | ||
| tt.set(overlay.Validation, &ValidationPhase{ | ||
| Constraints: []Constraint{{Name: testK8sVersionConstant, Value: ">= 1.30"}}, | ||
| Checks: []string{"expected-resources"}, | ||
| NodeSelection: &NodeSelection{ | ||
| Selector: map[string]string{"accelerator": "h100"}, | ||
| MaxNodes: 2, | ||
| ExcludeNodes: []string{"node-a"}, | ||
| }, | ||
| }) | ||
| base.Merge(&overlay) | ||
|
|
||
| overlayPhase := tt.get(overlay.Validation) | ||
| overlayPhase.Constraints[0].Value = ">= 1.31" | ||
| overlayPhase.Checks[0] = "check-nvidia-smi" | ||
| overlayPhase.NodeSelection.Selector["accelerator"] = "b200" | ||
| overlayPhase.NodeSelection.ExcludeNodes[0] = "node-b" | ||
|
|
||
| mergedPhase := tt.get(base.Validation) | ||
| if mergedPhase.Constraints[0].Value != ">= 1.30" { | ||
| t.Error("constraints should not alias overlay") | ||
| } | ||
| if mergedPhase.Checks[0] != "expected-resources" { | ||
| t.Error("checks should not alias overlay") | ||
| } | ||
| if mergedPhase.NodeSelection.Selector["accelerator"] != "h100" { | ||
| t.Error("node selector should not alias overlay") | ||
| } | ||
| if mergedPhase.NodeSelection.ExcludeNodes[0] != "node-a" { | ||
| t.Error("excluded nodes should not alias overlay") | ||
| } | ||
| }) | ||
| } |
There was a problem hiding this comment.
This is the salvageable part of the PR — the per-phase aliasing matrix (readiness/deployment/performance/conformance) is genuinely better coverage than what landed with 05b6ae37, which only exercises the deployment+conformance paths in t.Run("merge does not alias source validation", ...) at the bottom of TestMergeValidationConfig. After dropping the duplicate helpers in metadata.go, this table-driven subtest is worth keeping. Consider trimming the focused PR to just this test addition.
|
Closing this because main now includes the same validation deep-copy fix via #989. That should cover the production aliasing issue without keeping duplicate PRs open. |
Summary
Closes #984.
Testing
git diff --checkI could not run
go test ./pkg/recipelocally because this Windows environment does not havegoon PATH.