Conversation
📝 Walkthrough

Added a KVM failover evacuation weigher to multiple Nova pipeline configs, increased the default failover reservation requirement and reconciliation cadence/throughput, and added multi-context Kubernetes client support to the reservations visualization tool.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Tool as visualize-reservations
    participant K8sHV as K8s (Hypervisors)
    participant K8sRes as K8s (Reservations)
    participant K8sPGSecret as K8s (Postgres Secret)
    participant Postgres
    User->>Tool: run CLI with --hypervisor-context, --reservation-context, --postgres-context
    Tool->>K8sHV: getClientForContext(hypervisor-context) then List Hypervisors
    Tool->>K8sRes: getClientForContext(reservation-context) then List Reservations
    Tool->>K8sPGSecret: getClientForContext(postgres-context) then Get Secret
    K8sPGSecret-->>Tool: return secret (or not found)
    Tool->>Postgres: connect using secret (if present) and query server records
    Postgres-->>Tool: return server map
    Tool->>Tool: reconcile hypervisors vs reservations vs postgres, compute statuses
    Tool-->>User: print summary with resolved context names and statuses
```
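The per-resource context wiring in the diagram can be sketched with a stdlib-only stub. Here `getClientForContext` is a placeholder that returns a label instead of a real Kubernetes client, and only the flag names come from the review; everything else is illustrative:

```go
package main

import (
	"flag"
	"fmt"
)

// getClientForContext is a stub standing in for the helper named in the
// walkthrough: the real tool builds a Kubernetes client per kubeconfig
// context, while this sketch just returns a label so the wiring is visible.
func getClientForContext(name string) string {
	if name == "" {
		return "client[current-context]"
	}
	return "client[" + name + "]"
}

func main() {
	// Flag names match the CLI call in the sequence diagram; an empty value
	// falls back to the kubeconfig's current context.
	hvCtx := flag.String("hypervisor-context", "", "context for listing hypervisors")
	resCtx := flag.String("reservation-context", "", "context for listing reservations")
	pgCtx := flag.String("postgres-context", "", "context for reading the postgres secret")
	flag.Parse()

	fmt.Println(getClientForContext(*hvCtx))
	fmt.Println(getClientForContext(*resCtx))
	fmt.Println(getClientForContext(*pgCtx))
}
```

Keeping three independent flags (rather than one shared `--context`) is what lets the tool read hypervisors, reservations, and the Postgres secret from different clusters in one run.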
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
🧹 Nitpick comments (1)
helm/bundles/cortex-nova/templates/pipelines_kvm.yaml (1)
540-547: This weigher is effectively a no-op here.

The pipeline description says acknowledgement validates only the reservation host, so host ranking never changes the outcome. If that contract is accurate, keep `weighers: []`; otherwise, update the description to explain which additional candidates can reach this pipeline.

♻️ Suggested cleanup
```diff
-      weighers:
-        - name: kvm_failover_evacuation
-          description: |
-            This weigher prefers hosts with active failover reservations during
-            evacuation requests. Hosts matching a failover reservation where the
-            VM is allocated get a higher weight, encouraging placement on
-            pre-reserved failover capacity. For non-evacuation requests, this
-            weigher has no effect.
+      weighers: []
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@helm/bundles/cortex-nova/templates/pipelines_kvm.yaml` around lines 540 - 547, The kvm_failover_evacuation weigher is a no-op given the pipeline's acknowledgement semantics: either remove it by setting weighers: [] or update the description for the weigher (kvm_failover_evacuation) to clearly state which additional candidate hosts can be considered by this pipeline and how its weighting affects ranking; choose one: replace the current weigher entry with weighers: [] if acknowledgements truly short-circuit host selection, or revise the description text for kvm_failover_evacuation to explain the non-acknowledgement cases and which candidates can reach this pipeline so the weigher has visible effect.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2298b546-e814-45f3-b2f1-93199ff0d4c6
📒 Files selected for processing (3)
- helm/bundles/cortex-nova/templates/pipelines_kvm.yaml
- helm/bundles/cortex-nova/values.yaml
- tools/visualize-reservations/main.go
Test Coverage Report

Test Coverage 📊: 68.3%
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/visualize-reservations/main.go`:
- Around line 292-293: The error output for hvClient.List (and similar early
failures) must include the active cluster/context so failures aren’t ambiguous
in multi-context mode; update the fmt.Fprintf(os.Stderr, ...) calls around
hvClient.List and the Postgres ping recovery hint to append the selected context
name (the value passed via --postgres-context or the resolved context variable
used to build ctx) or use a context-aware logger that injects that context;
search for hvClient.List, fmt.Fprintf(os.Stderr, ...), and the Postgres ping
recovery hint blocks (also at the other mentioned locations) and modify their
error messages to include the context identifier variable (e.g.,
postgresContext/selectedContext) so port-forward/rerun instructions and error
lines clearly reference the correct cluster.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a7bee820-3045-4f6d-b92a-32f2bfcdff2b
📒 Files selected for processing (1)
tools/visualize-reservations/main.go
```go
if err := hvClient.List(ctx, &allHypervisors); err != nil {
	fmt.Fprintf(os.Stderr, "Error listing hypervisors: %v\n", err)
```
Thread the selected context through all failure output.
The new summary/secret-read messages are context-aware, but the early List failures and the Postgres ping recovery hint still are not. In multi-context mode that leaves the failing cluster ambiguous, and the port-forward/rerun instructions can point people at the wrong cluster when --postgres-context is set.
🛠️ Suggested fix

```diff
+	contextLabel := func(name string) string {
+		if name == "" {
+			return "(current context)"
+		}
+		return name
+	}
+
 	// Get all hypervisors to find all VMs (use hvClient)
 	var allHypervisors hv1.HypervisorList
 	if err := hvClient.List(ctx, &allHypervisors); err != nil {
-		fmt.Fprintf(os.Stderr, "Error listing hypervisors: %v\n", err)
+		fmt.Fprintf(os.Stderr, "Error listing hypervisors from %s: %v\n", contextLabel(*hypervisorContext), err)
 		return
 	}
@@
 	// Get all reservations (both failover and committed) (use resClient)
 	var allReservations v1alpha1.ReservationList
 	if err := resClient.List(ctx, &allReservations); err != nil {
-		fmt.Fprintf(os.Stderr, "Error listing reservations: %v\n", err)
+		fmt.Fprintf(os.Stderr, "Error listing reservations from %s: %v\n", contextLabel(*reservationContext), err)
 		return
 	}
@@
 	if err := db.PingContext(ctx); err != nil {
 		fmt.Fprintf(os.Stderr, "Warning: Could not ping postgres at %s:%s: %v\n", host, port, err)
 		fmt.Fprintf(os.Stderr, "  If running locally, use kubectl port-forward:\n")
-		fmt.Fprintf(os.Stderr, "    kubectl port-forward svc/%s %s:%s -n %s\n", host, port, port, namespace)
-		fmt.Fprintf(os.Stderr, "    ./visualize-reservations --postgres-host=localhost --postgres-port=%s\n\n", port)
+		if contextName != "" {
+			fmt.Fprintf(os.Stderr, "    kubectl --context=%s port-forward svc/%s %s:%s -n %s\n", contextName, host, port, port, namespace)
+			fmt.Fprintf(os.Stderr, "    ./visualize-reservations --postgres-context=%s --postgres-host=localhost --postgres-port=%s\n\n", contextName, port)
+		} else {
+			fmt.Fprintf(os.Stderr, "    kubectl port-forward svc/%s %s:%s -n %s\n", host, port, port, namespace)
+			fmt.Fprintf(os.Stderr, "    ./visualize-reservations --postgres-host=localhost --postgres-port=%s\n\n", port)
+		}
 		db.Close()
 		return nil, nil, nil
 	}
```

Also applies to: 338-339, 1349-1365, 1416-1420
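Taken on its own, the `contextLabel` helper from the suggested fix is easy to sanity-check; the `main` below is only a demonstration harness, not part of the tool:

```go
package main

import "fmt"

// contextLabel mirrors the helper in the suggested fix: an empty context name
// means the kubeconfig's current context, so it is rendered as a readable
// placeholder instead of an empty string in error output.
func contextLabel(name string) string {
	if name == "" {
		return "(current context)"
	}
	return name
}

func main() {
	// Sample error lines in the shape the fix proposes; the context names
	// here are made up for illustration.
	fmt.Printf("Error listing hypervisors from %s: connection refused\n", contextLabel(""))
	fmt.Printf("Error listing hypervisors from %s: connection refused\n", contextLabel("prod-eu-1"))
}
```

Wrapping the name once keeps every error path unambiguous without threading a logger through the call sites.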
No description provided.