fix(bundler): synthesize GKE ResourceQuota for critical-priority pods by mchmarny · Pull Request #921 · NVIDIA/aicr

mchmarny · 2026-05-15T14:35:18Z

Summary

GKE Standard ships a kube-system ResourceQuota scoped to system-*-critical PriorityClasses. Per the Kubernetes spec, once any cluster-wide quota scopes by PriorityClass for those values, pods that request a matching priority class can only be created in namespaces that have a matching quota. AICR bundles emit gpu-operator (and any chart that defaults priorityClassName to system-node-critical) into per-component namespaces, so admission blocks every pod and helmfile apply --wait times out after ~10 minutes with an opaque Replicas: 0/1 error.

Fixes: #915
Related: N/A

Type of Change

Bug fix (non-breaking change that fixes an issue)
Documentation update (demo workaround removed)

Component(s) Affected

Bundlers (pkg/bundler, pkg/component/*)
Recipe engine / data (pkg/recipe)
Docs/examples (docs/, examples/)

Implementation Notes

Schema: new gkeCriticalPriority field on registry component entries (pkg/recipe/components.go). gpu-operator is marked today; future components that emit system-*-critical pods opt in by setting the same field.
Synthesis seam: pkg/bundler/bundler.go::injectGKECriticalPriorityQuotas runs inside collectComponentPreManifests, so every deployer (helm, helmfile, argocd, argocd-helm, flux) benefits without per-deployer branching. The synthesized YAML is added under filename aicr/synthesized/gke-critical-pods-quota.yaml in the per-component pre-manifest map; the directory prefix prevents collision with any real PreManifestFiles path declared by an overlay.
Pods cap formula: max(criteria.Nodes × 32, 32). The 32-per-node multiplier covers gpu-operator's ~8–10 critical-priority DaemonSets per GPU node (driver, toolkit, device-plugin, GFD, DCGM, DCGM exporter, MIG manager, validator) plus the controller Deployment, with headroom for rolling-update churn. The 32-pod floor handles recipes that did not declare --nodes (e.g., demo configs). For a 2,000-node cluster the cap is 64,000 — still an admission allowlist, not a real capacity cap.
Phase choice: PreManifestFiles, not ManifestFiles. The chart's pods are rejected by admission until the quota exists, so the quota must land first. (The prior recipes/components/gpu-operator/manifests/gke-resource-quota.yaml was in ManifestFiles, applied AFTER the chart — that's why the demo workaround had users kubectl apply the quota manually before helmfile apply.)
Non-GKE recipes are unchanged: injectGKECriticalPriorityQuotas short-circuits when criteria.Service \!= gke, so EKS / AKS / OKE / bare-metal / kind bundles never see the synthesized manifest.
Removed superseded artifacts: recipes/components/gpu-operator/manifests/gke-resource-quota.yaml (file) and the corresponding manifestFiles entry in recipes/overlays/gke-cos.yaml. The bundler now handles this for every GKE overlay (including future gke-ubuntu etc.), not just gke-cos.

Testing

make qualify

New tests in pkg/bundler/bundler_gke_quota_test.go:

TestComputeGKECriticalPriorityQuotaPods (6 sub-cases): floor for 0/negative, identity at 1 node, computed for 8/100/2000 nodes.
TestRenderGKECriticalPriorityQuota: shape pin on apiVersion/kind/metadata/spec/scopeSelector.
TestInjectGKECriticalPriorityQuotas (6 sub-cases): gke+marked+nodes → cap; gke+marked+zero → floor; non-gke+marked → no-op; gke+unmarked → no-op; gke+marked+empty-namespace → skipped with warning; gke+mixed → only marked component synthesized.
TestInjectGKECriticalPriorityQuotas_CoexistsWithExistingPreManifests: additive merge with existing PreManifestFiles.
TestInjectGKECriticalPriorityQuotas_NilInputs: nil-tolerant contract.

Coverage on pkg/bundler: stable, full project coverage 76.4% (over 75% threshold).

Risk Assessment

Low — Isolated change, well-tested, easy to revert.

Rollout notes: Additive on non-GKE recipes (short-circuits). For GKE recipes, the synthesized quota is applied before the chart, fixing the previously-broken default flow. No new CLI flags, no API surface changes. Existing GKE-COS overlays no longer need the manual manifestFiles entry — removed in the same PR.

Checklist

Tests pass locally (make test with -race)
Linter passes (make lint)
I did not skip/disable tests to make CI green
I added/updated tests for new functionality
I updated docs if user-facing behavior changed
Changes follow existing patterns in the codebase
Commits are cryptographically signed (git commit -S)

GKE Standard ships a kube-system ResourceQuota scoped to system-*-critical PriorityClasses. Per the Kubernetes spec, once any cluster-wide quota scopes by PriorityClass for those values, pods that request a matching priority class can only be created in namespaces that have a matching quota. AICR bundles emit gpu-operator (and any other chart that defaults priorityClassName to system-node-critical) into per-component namespaces, so admission blocks every pod and `helmfile apply --wait` times out after ~10 minutes with an opaque "Replicas: 0/1" error. Fix: add a `gkeCriticalPriority` field to the registry schema. When the recipe's criteria.service is "gke" and a referenced component declares it, the bundler synthesizes a permissive ResourceQuota (pods = max(criteria.Nodes × 32, 32)) into the component's namespace via PreManifestFiles so the quota exists before the chart's pods attempt admission. Marks gpu-operator today; supersedes the overlay-driven manifestFiles entry in gke-cos.yaml (deleted). Demo cuj1-gke-config.md: drops the manual kubectl apply workaround. Fixes #915

coderabbitai · 2026-05-15T14:39:03Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 86eb8345-716a-4a1e-9093-725286a7faaa

📥 Commits

Reviewing files that changed from the base of the PR and between fccadce and abfb291.

📒 Files selected for processing (2)

pkg/bundler/bundler.go
pkg/bundler/bundler_gke_quota_test.go

📝 Walkthrough

Walkthrough

This PR automates GKE ResourceQuota injection for components that declare system-*-critical priority classes. The bundler now synthesizes per-namespace ResourceQuota manifests when a recipe targets GKE and components are marked gkeCriticalPriority: true; pod limits are computed from node count with a floor of 32. YAML rendering is deterministic. The static gke-resource-quota manifest and the demo workaround instructions were removed. Tests cover computation, rendering, injection gating, coexistence, and nil-handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

NVIDIA/aicr#917: Main PR’s bundler now synthesizes/injects the GKE “critical priority” ResourceQuota automatically and the updated demos/cuj1-gke-config.md removes the manual ResourceQuota workaround that retrieved PR #917 documented as workaround #915.

Suggested labels

area/tests, size/L

Suggested reviewers

lalitadithya

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main change: synthesizing GKE ResourceQuota for critical-priority pods, which is the primary objective of this PR.
Description check	✅ Passed	The description is detailed and directly related to the changeset, covering the problem statement, implementation approach, testing, and risks in a way that clearly ties to the code changes.
Linked Issues check	✅ Passed	The PR fully addresses issue `#915` by implementing option A (bundler-side synthesis). It adds the gkeCriticalPriority field, synthesizes ResourceQuota via injectGKECriticalPriorityQuotas, uses the specified pods formula, and ensures GKE-only application with comprehensive test coverage.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to the linked issue `#915`. The removal of gke-resource-quota.yaml, updates to registry and overlays, new bundler logic, schema additions, and comprehensive tests are all necessary to implement the synthesized quota solution.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/915-gke-critical-priority-quota

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/bundler/bundler.go`:
- Around line 1305-1330: The current renderGKECriticalPriorityQuota builds the
quota map and calls yaml.Marshal(quota) which yields non-deterministic YAML;
replace that call with serializer.MarshalYAMLDeterministic(quota) and update
imports to include the serializer package used elsewhere in the repo (add the
serializer import to the file's import block). Keep the same quota variable and
return signature so only the call site and imports change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: b43d9ccd-bb65-4a69-bb8d-a48cfb386836

📥 Commits

Reviewing files that changed from the base of the PR and between 9776d9f and fccadce.

📒 Files selected for processing (7)

demos/cuj1-gke-config.md
pkg/bundler/bundler.go
pkg/bundler/bundler_gke_quota_test.go
pkg/recipe/components.go
recipes/components/gpu-operator/manifests/gke-resource-quota.yaml
recipes/overlays/gke-cos.yaml
recipes/registry.yaml

💤 Files with no reviewable changes (2)

recipes/components/gpu-operator/manifests/gke-resource-quota.yaml
demos/cuj1-gke-config.md

github-actions · 2026-05-15T14:44:57Z

Coverage Report ✅

Metric	Value
Coverage	76.3%
Threshold	75%
Status	Pass

Coverage Badge

![Coverage](https://img.shields.io/badge/coverage-76.3%25-green)

Merging this branch will increase overall coverage

Impacted Packages	Coverage Δ	🤖
github.com/NVIDIA/aicr/pkg/bundler	63.21% (+1.26%)	👍
github.com/NVIDIA/aicr/pkg/recipe	90.23% (ø)

Coverage by file

Changed files (no unit tests)

Changed File	Coverage Δ	Total	Covered	Missed	🤖
github.com/NVIDIA/aicr/pkg/bundler/bundler.go	59.87% (+1.93%)	461 (+33)	276 (+28)	185 (+5)	👍
github.com/NVIDIA/aicr/pkg/recipe/components.go	86.54% (ø)	104	90	14

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Per CodeRabbit review on #921. Switches the synthesized ResourceQuota from yaml.Marshal to serializer.MarshalYAMLDeterministic so the bytes are stable across runs — the manifest lands in the bundle artifact (checksummed and optionally attested), and yaml.v3 walks randomized Go map order. Adds TestRenderGKECriticalPriorityQuota_Deterministic to guard against accidental regression.

…al-aware dependsOn The flux deployer never consumed `ComponentPreManifests`, so the synthesized GKE critical-priority `ResourceQuota` from PR NVIDIA#921 was silently dropped on `--deployer flux` bundles and the original symptom from NVIDIA#915 reproduced. The same gap also blocked the os-talos mixin from working on flux. Add `ComponentPreManifests` to `flux.Generator`, wire it through `bundler.buildDeployer`, and emit a `<name>-pre` HelmRelease ahead of the primary when pre-manifests exist. Rewire the primary's `dependsOn` to point at `<name>-pre` so the chain is `previous → <name>-pre → <name> → <name>-post → next`. Also fix a pre-existing ordering bug: the next component used to depend on the previous component's primary name, not its terminal (`<prev>-post` for mixed components). New `terminalReleaseNameFor` helper resolves the correct tail; `buildPrimaryDependsOn` and the README renderer both use it. Without this fix, Flux could reconcile the next component in parallel with the previous component's post manifests. Add a `<name>-pre` collision guard mirroring the rule in `pkg/bundler/deployer/localformat/writer.go`. Tests: pre-only, pre+post with terminal-aware next-component dependsOn (regression guard), collision rejection, and a refreshed `TestBuildPrimaryDependsOn` covering mixed, manifest-only, and chart-only previous components. Docs: `pkg/bundler/deployer/flux/doc.go` describes the chain and the example tree now includes `<name>-pre/`; `docs/user/cli-reference.md` updates the Flux Deployment Order bullet. Fixes: NVIDIA#923

mchmarny requested review from a team as code owners May 15, 2026 14:35

github-actions Bot added area/recipes area/bundler size/XL labels May 15, 2026

mchmarny added the bug label May 15, 2026

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Comment thread pkg/bundler/bundler.go Outdated

mchmarny self-assigned this May 15, 2026

lalitadithya previously approved these changes May 15, 2026

View reviewed changes

mchmarny dismissed lalitadithya’s stale review via abfb291 May 15, 2026 14:46

lalitadithya approved these changes May 15, 2026

View reviewed changes

mchmarny enabled auto-merge (squash) May 15, 2026 14:52

mchmarny merged commit 6ffbd88 into main May 15, 2026
89 checks passed

mchmarny deleted the fix/915-gke-critical-priority-quota branch May 15, 2026 14:57

This was referenced May 15, 2026

[Bug]: --deployer flux drops PreManifestFiles, so GKE ResourceQuota fix (#921) does not reach flux bundles #923

Closed

fix(bundler): wire PreManifestFiles through flux deployer with terminal-aware dependsOn #924

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(bundler): synthesize GKE ResourceQuota for critical-priority pods#921

fix(bundler): synthesize GKE ResourceQuota for critical-priority pods#921
mchmarny merged 2 commits into
mainfrom
fix/915-gke-critical-priority-quota

mchmarny commented May 15, 2026

Uh oh!

coderabbitai Bot commented May 15, 2026 •

edited

Loading

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

github-actions Bot commented May 15, 2026 •

edited

Loading

Changed files (no unit tests)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mchmarny commented May 15, 2026

Summary

Type of Change

Component(s) Affected

Implementation Notes

Testing

Risk Assessment

Checklist

Uh oh!

coderabbitai Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage Report ✅

Merging this branch will increase overall coverage

Changed files (no unit tests)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented May 15, 2026 •

edited

Loading

github-actions Bot commented May 15, 2026 •

edited

Loading