Skip to content

fix(bundler): tolerate trailing slash on repoURL; quote Chart.yaml version#1038

Merged
mchmarny merged 2 commits into
NVIDIA:mainfrom
yuanchen8911:followup/argocd-helm-nits
May 26, 2026
Merged

fix(bundler): tolerate trailing slash on repoURL; quote Chart.yaml version#1038
mchmarny merged 2 commits into
NVIDIA:mainfrom
yuanchen8911:followup/argocd-helm-nits

Conversation

@yuanchen8911
Copy link
Copy Markdown
Contributor

Summary

Follow-up to PR #1035 (merged at d638d009) addressing two non-blocking review nits from @mchmarny that weren't included in the squash:

  1. injectValuesIntoSingleSource now pipes .Values.repoURL through trimSuffix "/" before appending /{{ .Chart.Name }}, so --set repoURL=oci://reg/org/ (easy copy-paste from a registry UI) renders as oci://reg/org/aicr-bundle instead of a silently-broken oci://reg/org//aicr-bundle.
  2. TestWriteChartYAML_QuotesYAMLReservedScalarsAsName now takes an explicit (chartName, chartVersion) pair and exercises the version branch with "1.0" and "123" rows — proves the %q wrap on version is load-bearing (without it "1.0" reparses as the YAML float 1), not just cosmetic.

Motivation / Context

Mark's inline review on PR #1035 flagged both as non-blocking nits; they're small but worth the durability. The original #1035 was approved and merged before these landed, so they need a separate PR.

Fixes: N/A (cosmetic / defense-in-depth)
Related: #1034 (the bug #1035 fixed), #1035 (where the nits were raised)

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Component(s) Affected

  • Bundlers (pkg/bundler, pkg/component/*) — pkg/bundler/deployer/argocdhelm/argocdhelm.go

Implementation Notes

Nit 1 — trailing-slash robustness. Single one-line template change. Goldens (nodewright-customizations.yaml, gpu-operator-post.yaml) regenerated to reflect the new | trimSuffix "/" }}/{{ .Chart.Name }} shape.

Nit 2 — exercise the version branch. Restructured the table from (name, chartName, wantUnmarsh) (which assumed wantUnmarsh == chartName and hardcoded version "1.0.0") to (name, chartName, chartVersion) with the assertion checking each input round-trips back to its exact string via yaml.Unmarshal. Added two version-side rows ("1.0", "123").

Drive-by. Relaxed two pre-existing assertions in TestInjectValuesIntoSingleSource that hardcoded .Values.repoURL }} (the closing brace) to just .Values.repoURL so they survive the new trimSuffix pipeline insertion. The load-bearing intent — the directive references the caller's value — is preserved; the exact pipeline shape is documented elsewhere.

Testing

unset GITLAB_TOKEN ; make qualify
# All 22 chainsaw tests pass.
# Coverage: 76.9% > 75% floor.
# Vulnerability scan: clean.

GOFLAGS="-mod=vendor" go test ./pkg/bundler/deployer/argocdhelm/...
# ok

Risk Assessment

  • Low — Single-line template change + test refactor. No behavioral change for clean --set repoURL values (existing renders pass through trimSuffix "/" unchanged); only operator-supplied trailing slashes are normalized. Bundles generated before this PR are unaffected; bundles generated after produce slightly more forgiving rendered output.

Rollout notes: None — the change is additive within the template pipeline and goldens.

Checklist

  • Tests pass locally (make test with -race) — via make qualify
  • Linter passes (make lint) — via make qualify
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed — no user-facing surface change
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

…rsion

Follow-up to PR NVIDIA#1035 (merged) addressing two non-blocking nits from
the review that weren't included in the squash:

1. Trailing-slash robustness on path-based child Applications

   --set repoURL=oci://reg/org/ (an easy copy-paste from a registry
   UI) used to render as oci://reg/org//aicr-bundle, which Argo CD
   silently fails to sync against — the double slash hits the OCI
   resolver as an empty path segment with no operator-facing hint.

   Pipe .Values.repoURL through trimSuffix "/" before appending
   /{{ .Chart.Name }} in injectValuesIntoSingleSource so the rendered
   value is always a clean single-slash artifact reference.

   Golden fixtures regenerated.

2. Exercise the version branch of writeChartYAML's %q quoting

   The TestWriteChartYAML_QuotesYAMLReservedScalarsAsName table only
   covered chart-name reserved scalars. A user-supplied version like
   "1.0" reparses as the YAML float 1 without the %q wrap on version,
   so the version-side quoting is load-bearing, not cosmetic.

   Restructured the table to take an explicit (chartName,
   chartVersion) pair and added float-looking ("1.0") and
   numeric-looking ("123") version rows alongside the existing
   reserved-scalar name rows. Each row now round-trips both fields
   through yaml.Unmarshal and asserts the exact input string is
   recovered, so writeChartYAML's version branch is exercised by the
   same test that documents the rationale.

Drive-by: relaxed two pre-existing substring assertions in
TestInjectValuesIntoSingleSource that hardcoded `.Values.repoURL }}`
to just `.Values.repoURL` so they survive the new trimSuffix pipeline
insertion without losing the load-bearing intent (they verify the
directive references the caller's value; the exact pipeline shape is
documented elsewhere).

Related: NVIDIA#1034 (the bug the original NVIDIA#1035 fixed), NVIDIA#1035 (where
Mark's review suggested these nits)
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 3f14a388-9b73-425c-a2a9-44294df48d15

📥 Commits

Reviewing files that changed from the base of the PR and between 1924c07 and 5d33800.

📒 Files selected for processing (2)
  • pkg/bundler/deployer/argocdhelm/argocdhelm.go
  • pkg/bundler/deployer/argocdhelm/argocdhelm_test.go

📝 Walkthrough

Walkthrough

This PR trims trailing slashes from user-supplied .Values.repoURL values in Helm templates for path-based child Applications. The core change adds trimSuffix "/" to .Values.repoURL before concatenating with /.Chart.Name in the injectValuesIntoSingleSource function's generated Helm template string. The same pattern is applied to two Helm template fixtures in testdata/. Unit tests are updated to match the revised template output, including loosened pattern matching for the repoURL assertion and expanded chart YAML roundtrip test cases with improved unmarshal failure diagnostics.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/aicr#1035: Modifies the same injectValuesIntoSingleSource Helm templating for spec.source.repoURL in path-based child applications, changing how {{ .Chart.Name }} is appended to .Values.repoURL.

Suggested labels

size/M

Suggested reviewers

  • lockwobr
  • mchmarny
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the two main changes: trailing slash handling for repoURL and version quoting in Chart.yaml.
Description check ✅ Passed The description is detailed and directly related to the changeset, explaining both nits from PR #1035 review and their implementation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@mchmarny mchmarny enabled auto-merge (squash) May 26, 2026 19:58
@mchmarny mchmarny merged commit 91a4552 into NVIDIA:main May 26, 2026
30 checks passed
yuanchen8911 added a commit to yuanchen8911/aicr that referenced this pull request May 26, 2026
The argocd-helm-oci wrapper script was passing the FULL bundle URL to
`helm install --set repoURL=…` (including the per-recipe chart name
at the end). That matched the pre-PR-NVIDIA#1032 contract where the parent
Application's `source.chart` was hardcoded to `aicr-bundle`.

PR NVIDIA#1032 (and NVIDIA#1035's reinforcement) changed the parent App template
to expect the parent-namespace-only repoURL and to append .Chart.Name
itself via the separate `source.chart` field. The wrapper script
wasn't updated to match. Result on every PR with argocd-helm-oci
Tier-1 KWOK coverage: the parent App resolves to
`oci://registry.aicr-registry.svc.cluster.local:5000/aicr/<recipe>/<recipe>:<tag>`,
the OCI artifact lookup 404s, gpu-operator-post's Application can
never sync, and the whole stack times out on
`GitOps sync timeout strike 1/3`.

The failure was masked on `main` because the most-recent KWOK
Cluster Validation run on `main` (#26469449378 at 0d3e62d, success)
ran *before* PR NVIDIA#1035 merged. After NVIDIA#1035 / NVIDIA#1036 / NVIDIA#1038 all landed
on main, no fresh KWOK run has triggered on `main` yet — but the
next one will fail the same way every open PR's argocd-helm-oci
Tier-1 jobs are currently failing.

Fix is a one-line drop of the per-recipe suffix from
OCI_IN_CLUSTER_REF in the argocd-helm-oci branch of generate_bundle.
The flux branch keeps the per-recipe suffix because flux's
OCIRepository CR consumes the FULL artifact URL (recipe segment
included). Updated the surrounding comment to point at the
post-NVIDIA#1032 contract so the next reader understands the asymmetry.

End-to-end check (verified from PR NVIDIA#1030's debug artifact at
b3f2296): repo-server log shows
`registry.aicr-registry.svc.cluster.local:5000/aicr/<recipe>/<recipe>:<tag>: not found`,
caused by the same double-append. With the recipe suffix dropped,
Argo's resolution `<repoURL>/<chart>:<tag>` aligns with the pushed
artifact at `oci://…/aicr/<recipe>:<tag>`.

Refs PR NVIDIA#1030 (where this surfaced), PR NVIDIA#1032 (contract change),
PR NVIDIA#1035 (parent App template enforcement).
yuanchen8911 added a commit to yuanchen8911/aicr that referenced this pull request May 26, 2026
The argocd-helm-oci wrapper script was passing the FULL bundle URL to
`helm install --set repoURL=…` (including the per-recipe chart name
at the end). That matched the pre-PR-NVIDIA#1032 contract where the parent
Application's `source.chart` was hardcoded to `aicr-bundle`.

PR NVIDIA#1032 (and NVIDIA#1035's reinforcement) changed the parent App template
to expect the parent-namespace-only repoURL and to append .Chart.Name
itself via the separate `source.chart` field. The wrapper script
wasn't updated to match. Result on every PR with argocd-helm-oci
Tier-1 KWOK coverage: the parent App resolves to
`oci://registry.aicr-registry.svc.cluster.local:5000/aicr/<recipe>/<recipe>:<tag>`,
the OCI artifact lookup 404s, gpu-operator-post's Application can
never sync, and the whole stack times out on
`GitOps sync timeout strike 1/3`.

The failure was masked on `main` because the most-recent KWOK
Cluster Validation run on `main` (#26469449378 at 0d3e62d, success)
ran *before* PR NVIDIA#1035 merged. After NVIDIA#1035 / NVIDIA#1036 / NVIDIA#1038 all landed
on main, no fresh KWOK run has triggered on `main` yet — but the
next one will fail the same way every open PR's argocd-helm-oci
Tier-1 jobs are currently failing.

Fix is a one-line drop of the per-recipe suffix from
OCI_IN_CLUSTER_REF in the argocd-helm-oci branch of generate_bundle.
The flux branch keeps the per-recipe suffix because flux's
OCIRepository CR consumes the FULL artifact URL (recipe segment
included). Updated the surrounding comment to point at the
post-NVIDIA#1032 contract so the next reader understands the asymmetry.

End-to-end check (verified from PR NVIDIA#1030's debug artifact at
b3f2296): repo-server log shows
`registry.aicr-registry.svc.cluster.local:5000/aicr/<recipe>/<recipe>:<tag>: not found`,
caused by the same double-append. With the recipe suffix dropped,
Argo's resolution `<repoURL>/<chart>:<tag>` aligns with the pushed
artifact at `oci://…/aicr/<recipe>:<tag>`.

Refs PR NVIDIA#1030 (where this surfaced), PR NVIDIA#1032 (contract change),
PR NVIDIA#1035 (parent App template enforcement).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants