Skip to content

feat: Add OCI Helm chart registry support and upgrade to Helm v4#147

Open
zacharyblasczyk wants to merge 22 commits intomainfrom
feat/oci-helm-chart-support
Open

feat: Add OCI Helm chart registry support and upgrade to Helm v4#147
zacharyblasczyk wants to merge 22 commits intomainfrom
feat/oci-helm-chart-support

Conversation

@zacharyblasczyk
Copy link
Copy Markdown
Contributor

@zacharyblasczyk zacharyblasczyk commented Mar 30, 2026

Summary

  • Upgrade Helm SDK from v3.19.2 to v4.1.3, adapting to all breaking API changes (import path restructuring, Config.Init signature, Entry field rename, ContentCache)
  • Disable Server-Side Apply (SSA) to preserve Helm v3 client-side apply semantics. Helm v4 defaults to SSA (HIP-0023), but SSA cannot remove API-server-defaulted fields (e.g., strategy.rollingUpdate) when chart upgrades change discriminator fields (e.g., strategy.type: Recreate), causing validation failures. See commit 7e19a96 for the full analysis.
  • Use HookOnlyStrategy for Install/Upgrade/Rollback wait behavior. The operator's own reconcile loop monitors resource readiness, so Helm doesn't need to block. StatusWatcherStrategy caused "another operation is in progress" errors when the reconciler re-entered during a long Helm wait.
  • Add OCIRelease chart type that pulls Helm charts from OCI-based registries using Helm's native registry.Client.Pull() with in-memory chart loading via loader.LoadArchive()
  • Use release.Accessor interface in isInstalled() instead of type-asserting to *releasev1.Release, making it resilient to future Helm release type versions
  • Remove dead code (DownloadChart() function that was never called)

Users can now configure OCI chart sources:

spec:
  chart:
    url: oci://ghcr.io/wandb/helm-charts/operator-wandb
    version: "1.0.0"
    credentialSecret:
      name: oci-registry-creds

Helm v4 decisions

Decision Rationale
ServerSideApply = false SSA field-merge can't remove API-server-defaulted fields (e.g., rollingUpdate), breaking chart upgrades that change deployment strategy. CSA does full object replacement.
HookOnlyStrategy Operator reconcile loop already checks readiness. Helm blocking on StatusWatcherStrategy conflicts with the reconciler's requeue interval.
ContentCache New required field in Helm v4's ChartDownloader.

Test plan

  • 61 Ginkgo specs pass in pkg/wandb/spec/charts/ (40 existing + 21 new OCI)
  • All unit test packages pass
  • Fresh install of chart v0.41.3 on clean Orbstack cluster via Tilt
  • Validated SSA=true breaks on strategy change (RollingUpdate→Recreate), SSA=false works
  • go build ./... clean
  • go vet ./... clean

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Support for pulling and deploying Helm charts from OCI registries; OCI-based manifests supported.
  • Chores

    • Upgraded Helm/Kubernetes libraries and refreshed core dependencies.
    • Added the "orbstack" Kubernetes context to Tilt settings and README.
    • Updated operator-wandb chart reference to v0.41.3.
    • Added local OCI registry helpers for testing and publishing charts.
  • Tests

    • Added tests for OCI chart handling, credential scenarios, apply/prune flows.

Enable the operator to pull Helm charts from OCI-based registries via
a new OCIRelease chart type, and upgrade the Helm SDK from v3.19.2 to
v4.1.3 to align with the current Helm ecosystem.

Helm v4 upgrade:
- Update all import paths (chart/v2, release/v1, repo/v1, etc.)
- Adapt to v4 API changes: Config.Init log param removed,
  Uninstall.Wait replaced by WaitStrategy, Entry field rename
- Remove dead DownloadChart() code
- Regenerate counterfeiter fakes

OCI support:
- New OCIRelease type in pkg/wandb/spec/charts/oci.go using Helm's
  native registry.Client.Pull() with in-memory chart loading
- Custom ociurl validator using registry.IsOCI()
- Supports authentication via direct credentials or K8s Secrets
- PlainHTTP option for insecure/local registries
- Registered in chart dispatcher between LocalRelease and RepoRelease
- 21 new Ginkgo test specs covering validation, dispatcher routing,
  credential secrets, and error paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds Helm v4 migration, broad dependency upgrades, OCI-backed Helm chart support (new OCIRelease with pull/apply/prune and tests), local OCI registry test tooling and manifests, and permits the "orbstack" Kubernetes context in Tilt configuration.

Changes

Cohort / File(s) Summary
Tilt & sample
Tiltfile, tilt-settings.sample.json, README.md
Added "orbstack" to allowed Tilt contexts and updated documentation formatting.
Testing manifests
hack/testing-manifests/wandb/default.yaml, hack/testing-manifests/wandb/oci.yaml
Bumped chart version, fixed YAML indentation, added new OCI-based WeightsAndBiases manifest referencing registry credentials and resource values.
Go module deps
go.mod
Large dependency bump including Helm v3→v4, Kubernetes/controller-runtime, validator, logging, and many transitive updates/removals.
Helm core integration
pkg/helm/chart.go
Migrated to helm.sh/helm/v4 APIs, changed return types from concrete *release.Release to release.Releaser, removed DownloadChart and noop logger, and adjusted action client WaitStrategy/ServerSideApply semantics.
Chart API & helpers
pkg/wandb/spec/spec.go, pkg/wandb/spec/config.go, pkg/wandb/spec/specfakes/fake_chart.go
Switched chart/chartutil/strvals imports to Helm v4 v2 and updated fake chart mock signatures to v4 types.
Local/Repo chart handling
pkg/wandb/spec/charts/local.go, pkg/wandb/spec/charts/repo.go, pkg/wandb/spec/charts/repo_test.go
Ported loader/repo/downloader code to Helm v4; added nonociurl validator to prevent OCI URLs for repo downloader; propagated TLS skip-verify flag and content cache to downloader.
OCI registry feature & tests
pkg/wandb/spec/charts/oci.go, pkg/wandb/spec/charts/oci_test.go, pkg/wandb/spec/charts/charts.go
Added exported OCIRelease type with Validate/Chart/Apply/Prune (pulling charts from OCI registries with credential handling, plainHTTP/debug), inserted OCI candidate into dispatcher ordering, and added comprehensive tests.
Makefile local registry tooling
Makefile
Added variables and phony targets to run a local authenticated OCI registry, push charts, create in-cluster registry secret, and stop the registry.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant OCI as OCIRelease
    participant K8s as Kubernetes
    participant Reg as OCI_Registry
    participant Helm as Helm_Action

    Client->>OCI: Apply(ctx, k8sClient, wandb, scheme, values)
    OCI->>K8s: Get Secret (CredentialSecret?) 
    K8s-->>OCI: Secret data / not found
    OCI->>Reg: Build registry client (auth/plainHTTP/debug)
    OCI->>Reg: Pull chart (strip oci://, append :version)
    Reg-->>OCI: Chart archive bytes / error
    OCI->>OCI: Load chart from bytes
    OCI->>Helm: Create actionable chart (name/namespace)
    Helm->>K8s: Install/Upgrade/Uninstall release
    K8s-->>Helm: Release status
    Helm-->>OCI: Result / error
    OCI-->>Client: Return result/error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • danielpanzella
  • j7m4

"I’m a rabbit with a quiet hop,
I fetched OCI charts from a local shop.
Helm jumped to v4, I nudged the creds,
Tilt let Orbstack join our devled threads.
Secrets, pushes, tests — I did a happy hop!" 🐰

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the two main changes: adding OCI Helm chart registry support and upgrading to Helm v4, both of which are substantive changes throughout the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/oci-helm-chart-support

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@w-b-hivemind
Copy link
Copy Markdown

w-b-hivemind Bot commented Mar 30, 2026

HiveMind Sessions

8 sessions · 3h 4m · $99

Session Agent Duration Tokens Cost Lines
Fix Failing Ci Tests Before Committing
fa12ba6b-0761-49d0-bc6e-8555e2755032
claude 3m 2.4K $0.76 +0 -0
Github Pull Request 147 Code Review
019d6e05-7f21-7280-9846-e90d1494c85d
codex 5m 240.4K $1.27 +0 -0
Oci Chart Pr Code Review And Fix
c91a8baa-456f-42e2-824f-9da7cfcd987e
claude 10m 18.3K $2.92 +37 -10
Fixing Ginkgo Test Environment Variable Scoping
019d4f37-879c-7cb0-a208-03963248f108
codex 1m 41.9K $0.23 +0 -0
Fix Parallel Test Environment Variable Safety
22a29ceb-5854-414d-a627-93565dcc301b
claude 1h 36m 98.7K $78 +235 -90
Fix Oci Chart Pull And Nil Dereference Bugs
019d4e62-a363-74a2-bf38-e067c44d8f39
codex 5m 125.0K $0.68 +0 -0
Pr Accuracy Review And Report
019d4e63-752e-79c0-9393-d1a56b9d7841
codex 10m 288.7K $1.32 +0 -0
Enable Operator Pull OCI Helm
4527dbe4-f9ad-4ae1-91a5-9b4ee33f8119
claude 51m 46.0K $14 +525 -88
Total 3h 4m 861.4K $99 +797 -188
Screenshots

View all sessions in HiveMind →

Run claude --resume fa12ba6b-0761-49d0-bc6e-8555e2755032 or codex resume 019d6e05-7f21-7280-9846-e90d1494c85d to pickup where you left off.

@socket-security
Copy link
Copy Markdown

socket-security Bot commented Mar 30, 2026

@socket-security
Copy link
Copy Markdown

socket-security Bot commented Mar 30, 2026

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action Severity Alert  (click "▶" to expand/collapse)
Warn High
License policy violation: golang github.com/cyphar/filepath-securejoin under BSD-3-Clause AND MPL-2.0

Location: Package overview

From: ?golang/helm.sh/helm/v4@v4.1.4golang/github.com/cyphar/filepath-securejoin@v0.6.1

ℹ Read more on: This package | This alert | What is a license policy violation?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Find a package that does not violate your license policy or adjust your policy to allow this package's license.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore golang/github.com/cyphar/filepath-securejoin@v0.6.1. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

zacharyblasczyk and others added 2 commits April 2, 2026 08:44
Reject tagless OCI refs at validation time instead of letting them fail
at runtime on every reconcile. Also add a nil check on the pull result
to prevent a panic on corrupted registry responses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lease

- Move result.Chart nil guard above the debug log that dereferences it
  to prevent a nil-pointer panic when Debug is enabled.
- Add nonociurl validator to RepoRelease.URL so oci:// URLs are
  explicitly rejected at validation time, not just by dispatcher ordering.
- Fix misleading test: include name field so the test actually verifies
  scheme-level rejection rather than passing due to missing required field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
zacharyblasczyk and others added 8 commits April 2, 2026 12:39
…ests

Use release.NewAccessor() instead of type-asserting to *releasev1.Release
in isInstalled(), making it resilient to future Helm release types.
Replace os.Setenv with GinkgoT().Setenv() for parallel-safe test execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Helm v4's ChartDownloader requires the ContentCache field to be set,
separate from RepositoryCache. Without it, DownloadTo() fails with
"content cache must be set". Pass settings.ContentCache through from
the cli.EnvSettings, which defaults to $HELM_CONTENT_CACHE or
~/.cache/helm/content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Helm v4 defaults to Server-Side Apply (SSA) for new installs (HIP-0023).
SSA uses field-level merge instead of full object replacement, which
breaks when upgrading charts that change a Deployment's strategy.

Specifically: when a chart creates a Deployment without specifying
strategy (defaulting to RollingUpdate), the API server auto-populates
strategy.rollingUpdate with defaults that no field manager owns. A
subsequent SSA upgrade setting strategy.type=Recreate cannot remove the
unowned rollingUpdate field, and the API server rejects the result:
"spec.strategy.rollingUpdate: Forbidden: may not be specified when
strategy type is 'Recreate'"

Disable SSA (ServerSideApply=false) on Install, Upgrade, and Rollback
to use client-side apply (full object replacement), which cleans up
stale fields and matches Helm v3 behavior.

Use HookOnlyStrategy instead of StatusWatcherStrategy/LegacyStrategy.
The operator's own reconcile loop already monitors resource readiness.
Having Helm also wait via StatusWatcherStrategy causes the reconciler
to re-enter while a Helm wait is in progress, producing "another
operation (install/upgrade/rollback) is in progress" errors.
HookOnlyStrategy waits only for pre/post hooks, then returns control
to the operator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix app.resources.memory indentation (was sibling of requests, not
child). Bump chart to v0.41.3 (v0.32.4 has duplicate env keys rejected
by Helm v4). Add glue strategy and resource limits. Disable
settingsMigrationJob.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update controller-runtime v0.23.1→v0.23.3, validator v10.26.0→v10.30.2,
gomega v1.39.0→v1.39.1, and transitive dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HookOnlyStrategy's WaitForDelete() is a no-op, which means Uninstall
returns success while resources are still terminating. This could cause
Prune() to report success prematurely. Use LegacyStrategy for Uninstall
to actually wait for resources to be deleted, while keeping
HookOnlyStrategy for Install/Upgrade/Rollback where the operator's
reconcile loop handles readiness.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zacharyblasczyk zacharyblasczyk marked this pull request as ready for review April 8, 2026 17:54
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
pkg/wandb/spec/charts/oci_test.go (2)

188-190: Use per-spec temp directories for Helm homes.

Line 188–190 currently use shared os.TempDir() paths; this can cause cross-spec interference in parallel runs. Prefer GinkgoT().TempDir() (or unique subdirs) per spec.

♻️ Suggested change
- GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(os.TempDir(), "oci-test-cache"))
- GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(os.TempDir(), "oci-test-config"))
- GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(os.TempDir(), "oci-test-data"))
+ helmHome := GinkgoT().TempDir()
+ GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(helmHome, "cache"))
+ GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(helmHome, "config"))
+ GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(helmHome, "data"))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci_test.go` around lines 188 - 190, The test currently
sets HELM_CACHE_HOME, HELM_CONFIG_HOME, and HELM_DATA_HOME to shared
os.TempDir() locations which can collide across specs; change the calls that use
GinkgoT().Setenv to instead create per-spec temp directories via
GinkgoT().TempDir() (or GinkgoT().TempDir()+unique subpaths) and assign those
paths to the three env vars so each spec gets isolated Helm home directories;
update the three Setenv invocations that reference HELM_CACHE_HOME,
HELM_CONFIG_HOME, and HELM_DATA_HOME accordingly.

218-223: Strengthen credential-path assertions to avoid false positives.

These specs only assert that one credential error string is absent. Assert the expected downstream failure (e.g., pull/connect failure) so the tests prove credentials were accepted before pull.

Also applies to: 247-251, 281-286

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci_test.go` around lines 218 - 223, The tests
currently only assert the absence of the credential error string after calling
ociRelease.Apply, which can produce false positives; update the It blocks that
call ociRelease.Apply (the "should retrieve credentials from secret before pull"
tests) to also assert that the returned error indicates the expected downstream
failure (e.g., a pull or connect failure message) and still assert it does not
contain "Failed to get credentials from secret"; in other words, after err :=
ociRelease.Apply(...), Expect(err).To(HaveOccurred()),
Expect(err.Error()).To(ContainSubstring("<expected pull/connect failure text>"))
and Expect(err.Error()).NotTo(ContainSubstring("Failed to get credentials from
secret")) so the spec proves credentials were accepted prior to the pull; apply
the same change to the other similar test blocks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 145: Update the vulnerable grpc dependency in go.mod by bumping
google.golang.org/grpc from v1.78.0 to v1.79.3 or later; run `go get
google.golang.org/grpc@v1.79.3` (or a newer patch), then run `go mod tidy` and
rebuild/run tests to ensure the new version is applied and go.sum updated so the
authorization bypass vulnerability (GO-2026-4762 / GHSA-p77j-4mvh-x3m3) is
mitigated.

In `@pkg/helm/chart.go`:
- Around line 126-130: The Rollback(version int) method on ActionableChart
ignores the incoming version; update the code in Rollback to set the Helm
rollback target by assigning the parameter to the action.Rollback client's
Version field (i.e., after creating client := action.NewRollback(c.config) set
client.Version = version) before calling client.Run(c.releaseName); this ensures
callers can target a specific release revision.

In `@pkg/wandb/spec/charts/oci.go`:
- Around line 109-113: Add explicit nil guards to prevent panics: in
OCIRelease.getActionableChart check wandb for nil and return a descriptive error
if nil before calling wandb.GetNamespace/GetName; likewise, wherever the
Kubernetes client variable c is used (particularly the code path that calls
c.Get when CredentialSecret is set), validate that c is not nil and return an
error if it is rather than calling c.Get. Ensure error messages reference the
function/field (e.g., getActionableChart, wandb, CredentialSecret, c.Get) so
callers can debug easily.
- Around line 136-143: When loading credentials after c.Get(...) for
r.CredentialSecret, verify that secret.Data contains both
r.CredentialSecret.UsernameKey and r.CredentialSecret.PasswordKey before
assigning r.Username and r.Password; if either key is missing, return a clear
error (include the missing key name and the secret reference) instead of
allowing empty strings to be set on r.Username/r.Password so credential loading
fails fast and provides a helpful message.

In `@pkg/wandb/spec/charts/repo.go`:
- Around line 174-175: The code currently hard-codes
getter.WithInsecureSkipVerifyTLS(true) which unconditionally disables TLS
verification; change this so the TLS skip flag is derived from the final
resolved chart URL instead of always true. After chartURL is resolved (the
variable named chartURL used when resolving relative paths), compute a boolean
like insecure := parsedResolvedURL.Scheme == "http" (or use the resolved URL's
Scheme) and pass getter.WithInsecureSkipVerifyTLS(insecure) to the chart
downloader; also keep entry.InsecureSkipTLSVerify in sync with that same
resolved URL decision and remove the unconditional
WithInsecureSkipVerifyTLS(true) usage. Ensure you reference and update the
places that set entry.InsecureSkipTLSVerify and the getter configuration so both
use the resolved chartURL's scheme.

---

Nitpick comments:
In `@pkg/wandb/spec/charts/oci_test.go`:
- Around line 188-190: The test currently sets HELM_CACHE_HOME,
HELM_CONFIG_HOME, and HELM_DATA_HOME to shared os.TempDir() locations which can
collide across specs; change the calls that use GinkgoT().Setenv to instead
create per-spec temp directories via GinkgoT().TempDir() (or
GinkgoT().TempDir()+unique subpaths) and assign those paths to the three env
vars so each spec gets isolated Helm home directories; update the three Setenv
invocations that reference HELM_CACHE_HOME, HELM_CONFIG_HOME, and HELM_DATA_HOME
accordingly.
- Around line 218-223: The tests currently only assert the absence of the
credential error string after calling ociRelease.Apply, which can produce false
positives; update the It blocks that call ociRelease.Apply (the "should retrieve
credentials from secret before pull" tests) to also assert that the returned
error indicates the expected downstream failure (e.g., a pull or connect failure
message) and still assert it does not contain "Failed to get credentials from
secret"; in other words, after err := ociRelease.Apply(...),
Expect(err).To(HaveOccurred()),
Expect(err.Error()).To(ContainSubstring("<expected pull/connect failure text>"))
and Expect(err.Error()).NotTo(ContainSubstring("Failed to get credentials from
secret")) so the spec proves credentials were accepted prior to the pull; apply
the same change to the other similar test blocks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d69359c1-8fa7-4c8f-b737-f63c3d453fe5

📥 Commits

Reviewing files that changed from the base of the PR and between f3abfed and 02f43aa.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
  • Tiltfile
  • go.mod
  • hack/testing-manifests/wandb/default.yaml
  • pkg/helm/chart.go
  • pkg/wandb/spec/charts/charts.go
  • pkg/wandb/spec/charts/local.go
  • pkg/wandb/spec/charts/oci.go
  • pkg/wandb/spec/charts/oci_test.go
  • pkg/wandb/spec/charts/repo.go
  • pkg/wandb/spec/charts/repo_test.go
  • pkg/wandb/spec/config.go
  • pkg/wandb/spec/spec.go
  • pkg/wandb/spec/specfakes/fake_chart.go

Comment thread go.mod Outdated
Comment thread pkg/helm/chart.go
Comment on lines 126 to 130
func (c *ActionableChart) Rollback(version int) error {
client := action.NewRollback(c.config)
client.WaitStrategy = kube.HookOnlyStrategy
client.ServerSideApply = "false"
return client.Run(c.releaseName)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check the imports in the file to understand the Helm version
head -50 pkg/helm/chart.go | grep -E "import|helm"

Repository: wandb/operator

Length of output: 414


🏁 Script executed:

# Look for the Rollback function context and any version parameter usage
cat -n pkg/helm/chart.go | sed -n '120,135p'

Repository: wandb/operator

Length of output: 686


🏁 Script executed:

# Search for other uses of action.Rollback in the codebase to understand the API
rg "action\.Rollback|action\.NewRollback" -A 5

Repository: wandb/operator

Length of output: 321


🏁 Script executed:

# Check if there are any tests for the Rollback function that might show expected behavior
rg "Rollback" pkg/helm/ -B 2 -A 5

Repository: wandb/operator

Length of output: 874


🏁 Script executed:

# Check go.mod to see the exact Helm v4 version
grep "helm.sh/helm" go.mod

Repository: wandb/operator

Length of output: 81


🏁 Script executed:

# Look for any other files that use action.Rollback to see if there's a pattern
find . -name "*.go" -type f -exec grep -l "action.Rollback" {} \;

Repository: wandb/operator

Length of output: 40


🏁 Script executed:

# Check if Version field is used anywhere else in the codebase with Helm actions
rg "client\.Version\s*=" -B 2 -A 2

Repository: wandb/operator

Length of output: 40


🏁 Script executed:

# Look at the NewRollback and Rollback struct definition by checking imports and usage patterns
rg "Rollback" pkg/helm/chart.go -B 5 -A 10

Repository: wandb/operator

Length of output: 1272


🏁 Script executed:

# Check Helm v4 API documentation or package info for action.Rollback
# First, let's see if there's vendor directory or go.sum that might give us more info
find . -path "*/vendor/helm.sh/helm/v4/pkg/action*" -name "*.go" 2>/dev/null | head -5

Repository: wandb/operator

Length of output: 40


🏁 Script executed:

# Let's search for any documentation or comments about Rollback in the codebase
rg -i "rollback.*version|version.*rollback" --type md --type txt

Repository: wandb/operator

Length of output: 40


🏁 Script executed:

# Check if there's a go.sum entry that might help us understand the module structure
grep -A 2 "helm.sh/helm/v4" go.sum

Repository: wandb/operator

Length of output: 347


🏁 Script executed:

# Let's check the Helm v4 source code by looking for the action package structure
# Since vendor might not exist, let's check if we can access go package info
python3 << 'EOF'
import subprocess
import json

# Try to get package info from go
try:
    result = subprocess.run(
        ["go", "list", "-m", "-json", "helm.sh/helm/v4"],
        capture_output=True,
        text=True,
        cwd="."
    )
    if result.returncode == 0:
        data = json.loads(result.stdout)
        print("Helm v4 Module Info:")
        print(f"Path: {data.get('Path')}")
        print(f"Version: {data.get('Version')}")
        print(f"Dir: {data.get('Dir')}")
except Exception as e:
    print(f"Error: {e}")
EOF

Repository: wandb/operator

Length of output: 170


🏁 Script executed:

# Let's look at the Helm v4 repository online or check if go can tell us the module path
# Looking for clues in how other Helm actions are used in the codebase
rg "action\.New" pkg/helm/ -A 3

Repository: wandb/operator

Length of output: 968


🏁 Script executed:

# Check the Rollback struct definition in the Helm v4 module
cat /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go | head -100

Repository: wandb/operator

Length of output: 3321


🏁 Script executed:

# Let's look for the NewRollback function and Rollback struct definition
grep -n "type Rollback struct\|func NewRollback\|Version" /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go | head -20

Repository: wandb/operator

Length of output: 1028


🏁 Script executed:

# Get the full rollback.go file if it's not too large
wc -l /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go

Repository: wandb/operator

Length of output: 133


🏁 Script executed:

# Let's look at the prepareRollback method more carefully to understand Version usage
sed -n '110,160p' /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go

Repository: wandb/operator

Length of output: 1534


Assign the requested version to the Helm rollback action.

The Rollback(version int) method receives a version parameter but never assigns it to the rollback client. This causes the action to ignore the parameter and always use the default behavior (rolling back to the previous version). The Version field in Helm v4's action.Rollback struct controls the target revision, and it must be set for callers to target a specific release.

🛠️ Proposed fix
 func (c *ActionableChart) Rollback(version int) error {
 	client := action.NewRollback(c.config)
+	client.Version = version
 	client.WaitStrategy = kube.HookOnlyStrategy
 	client.ServerSideApply = "false"
 	return client.Run(c.releaseName)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (c *ActionableChart) Rollback(version int) error {
client := action.NewRollback(c.config)
client.WaitStrategy = kube.HookOnlyStrategy
client.ServerSideApply = "false"
return client.Run(c.releaseName)
func (c *ActionableChart) Rollback(version int) error {
client := action.NewRollback(c.config)
client.Version = version
client.WaitStrategy = kube.HookOnlyStrategy
client.ServerSideApply = "false"
return client.Run(c.releaseName)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helm/chart.go` around lines 126 - 130, The Rollback(version int) method
on ActionableChart ignores the incoming version; update the code in Rollback to
set the Helm rollback target by assigning the parameter to the action.Rollback
client's Version field (i.e., after creating client :=
action.NewRollback(c.config) set client.Version = version) before calling
client.Run(c.releaseName); this ensures callers can target a specific release
revision.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing issue — Rollback() is never called anywhere in the codebase:

$ grep -r '\.Rollback(' pkg/ internal/ --include='*.go'
(no results)

The version parameter is part of the ActionableChartInterface but has no callers. Not fixing in this PR to avoid scope creep on dead code.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zacharyblasczyk, understood — since Rollback() has no callers, fixing the unused version parameter is indeed out of scope here. Would you like me to open a follow-up GitHub issue to track this so it doesn't get lost?

Comment thread pkg/wandb/spec/charts/oci.go
Comment thread pkg/wandb/spec/charts/oci.go
Comment thread pkg/wandb/spec/charts/repo.go
Add Makefile targets for local OCI registry with basic auth:
- make local-registry: starts registry:2 on port 5050 with htpasswd auth
- make local-registry-push: pulls chart from charts.wandb.ai and pushes
  to local registry (version read from oci.yaml manifest)
- make local-registry-secret: creates K8s secret with registry creds
- make local-registry-stop: tears down the registry

Add hack/testing-manifests/wandb/oci.yaml for testing OCI chart pulls
with credential secret authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zacharyblasczyk
Copy link
Copy Markdown
Contributor Author

OCI Auth E2E Testing

Tested the full OCI chart pull path with authenticated registry on a local Orbstack cluster via Tilt.

Setup

tilt-settings.json:

{
  "allowedContexts": [
    "docker-desktop",
    "minikube",
    "kind-kind",
    "orbstack"
  ],
  "installMinio": true,
  "installWandb": true,
  "wandbCRD": "oci"
}

Start the local OCI registry and push the chart:

make local-registry-push    # starts auth registry on :5050, pushes chart
make local-registry-secret  # creates K8s secret with creds
tilt up                     # deploys operator + wandb CR using oci.yaml

What was tested

  1. OCI pull with basic auth — operator reads oci-registry-creds secret, authenticates to host.internal:5050, pulls operator-wandb:0.41.3
  2. Plain HTTP registryplainHTTP: true in the CR spec
  3. Credential secret defaultingcredentialSecret.name without explicit key overrides defaults to HELM_USERNAME/HELM_PASSWORD
  4. Full install lifecycle — chart pulled → Helm install → 20 pods running, release status deployed

Results

  • Registry correctly rejects unauthenticated requests (401 UNAUTHORIZED)
  • Operator retrieves credentials from secret and authenticates successfully
  • Chart installs cleanly with ServerSideApply=false and HookOnlyStrategy
  • No errors in operator logs

Cleanup

make local-registry-stop

Fixes GO-2026-4762 / GHSA-p77j-4mvh-x3m3: authorization bypass in
gRPC-Go caused by improper validation of the HTTP/2 :path pseudo-header.
All versions before v1.79.3 are affected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
Makefile (2)

101-101: Make CHART_VERSION parsing robust.

Current parsing can pick the wrong version: key if the manifest gains additional version fields, causing pushes with the wrong chart tag.

Suggested diff
-CHART_VERSION ?= $(shell grep 'version:' $(OCI_MANIFEST) | head -1 | sed 's/.*"\(.*\)"/\1/')
+CHART_VERSION ?= $(shell awk '\
+  /^[[:space:]]*chart:[[:space:]]*$$/{in_chart=1; next} \
+  in_chart && /^[[:space:]]*version:[[:space:]]*/ { \
+    gsub(/"/,"",$$2); print $$2; exit \
+  }' $(OCI_MANIFEST))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 101, The CHART_VERSION assignment is brittle and can pick
the wrong "version:" key from $(OCI_MANIFEST); update the Makefile so
CHART_VERSION reads the chart version using a proper YAML-aware extractor (e.g.,
use yq to eval the top-level .version field from $(OCI_MANIFEST)) or at minimum
change the shell parsing to match only a top-level "version:" line (anchor the
regex to line start) to ensure the correct chart version is retrieved. Modify
the CHART_VERSION variable assignment to use the chosen YAML parser or anchored
pattern while still referencing $(OCI_MANIFEST).

123-127: Ensure temp directory cleanup on failure paths.

rm -rf $$tmpdir only executes on success. A failing helm pull/helm push leaves temp dirs behind.

Suggested diff
 local-registry-push: local-registry ## Push a chart from the wandb helm repo to the local OCI registry.
-	`@tmpdir`=$$(mktemp -d) && \
-	helm pull $(CHART_NAME) --repo $(CHART_REPO) --version $(CHART_VERSION) -d $$tmpdir && \
-	helm push $$tmpdir/$(CHART_NAME)-$(CHART_VERSION).tgz oci://localhost:$(REGISTRY_PORT)/wandb \
-		--plain-http --username $(REGISTRY_USER) --password $(REGISTRY_PASS) && \
-	rm -rf $$tmpdir
+	`@tmpdir`=$$(mktemp -d); trap 'rm -rf "$$tmpdir"' EXIT; \
+	helm pull $(CHART_NAME) --repo $(CHART_REPO) --version $(CHART_VERSION) -d $$tmpdir; \
+	helm push $$tmpdir/$(CHART_NAME)-$(CHART_VERSION).tgz oci://localhost:$(REGISTRY_PORT)/wandb \
+		--plain-http --username $(REGISTRY_USER) --password $(REGISTRY_PASS)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 123 - 127, The current Makefile snippet creates a
temporary directory in a shell line (tmpdir=$$(mktemp -d)) but only removes it
on the success path, leaving temp dirs behind if helm pull or helm push fail;
fix by registering a shell trap immediately after creating the temp dir to
always remove it on exit (e.g., after tmpdir=$$(mktemp -d) add trap 'rm -rf
"$$tmpdir"' EXIT) and then run the existing helm pull/helm push commands (ensure
the trap uses the same $$tmpdir variable and that the trap is set before any
command that can fail); reference the tmpdir variable and the helm pull/helm
push commands in your change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 105-113: The existing check only looks for running containers and
will fail if a stopped container named $(REGISTRY_NAME) exists; update the
Makefile logic around the block that uses $(CONTAINER_TOOL) ps --filter
name=$(REGISTRY_NAME) to also detect stopped containers (use ps -a) and, if a
container with that name exists but is not running, remove it before invoking
$(CONTAINER_TOOL) run --name $(REGISTRY_NAME); specifically, add a branch to
call $(CONTAINER_TOOL) rm -f $(REGISTRY_NAME) (or the equivalent remove command)
when a non-running container is found so the subsequent $(CONTAINER_TOOL) run -d
--name $(REGISTRY_NAME) will succeed.

---

Nitpick comments:
In `@Makefile`:
- Line 101: The CHART_VERSION assignment is brittle and can pick the wrong
"version:" key from $(OCI_MANIFEST); update the Makefile so CHART_VERSION reads
the chart version using a proper YAML-aware extractor (e.g., use yq to eval the
top-level .version field from $(OCI_MANIFEST)) or at minimum change the shell
parsing to match only a top-level "version:" line (anchor the regex to line
start) to ensure the correct chart version is retrieved. Modify the
CHART_VERSION variable assignment to use the chosen YAML parser or anchored
pattern while still referencing $(OCI_MANIFEST).
- Around line 123-127: The current Makefile snippet creates a temporary
directory in a shell line (tmpdir=$$(mktemp -d)) but only removes it on the
success path, leaving temp dirs behind if helm pull or helm push fail; fix by
registering a shell trap immediately after creating the temp dir to always
remove it on exit (e.g., after tmpdir=$$(mktemp -d) add trap 'rm -rf "$$tmpdir"'
EXIT) and then run the existing helm pull/helm push commands (ensure the trap
uses the same $$tmpdir variable and that the trap is set before any command that
can fail); reference the tmpdir variable and the helm pull/helm push commands in
your change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ba87872-b11d-4baf-bcbd-2acfc869ef2d

📥 Commits

Reviewing files that changed from the base of the PR and between 02f43aa and 801749c.

📒 Files selected for processing (2)
  • Makefile
  • hack/testing-manifests/wandb/oci.yaml
✅ Files skipped from review due to trivial changes (1)
  • hack/testing-manifests/wandb/oci.yaml

Comment thread Makefile
Comment thread Makefile Outdated
zacharyblasczyk and others added 3 commits April 8, 2026 13:31
The getter option WithInsecureSkipVerifyTLS was hard-coded to true,
disabling TLS certificate validation for all chart downloads including
HTTPS. Use the already-computed entry.InsecureSkipTLSVerify which is
true only for HTTP URLs, preserving TLS verification for HTTPS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When loading credentials from a Kubernetes secret, check that the
expected keys exist and are non-empty. Previously, missing keys would
silently produce empty username/password strings, leading to opaque
auth failures when pulling from the registry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cret

Remove stopped containers before starting a new registry to prevent
name conflicts. Add explicit namespace (WANDB_NAMESPACE, default:
default) to the credential secret creation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
pkg/wandb/spec/charts/repo.go (2)

123-129: ⚠️ Potential issue | 🟠 Major

Fail fast on missing/empty credential keys in repo secret path.

Lines 128-129 directly index secret.Data; missing keys silently become empty credentials and cause opaque auth failures downstream. Please mirror the explicit key existence/non-empty checks already used elsewhere.

Proposed fix
 		secret := &corev1.Secret{}
 		err := c.Get(ctx, client.ObjectKey{Name: r.CredentialSecret.Name, Namespace: wandb.Namespace}, secret)
 		if err != nil {
 			log.Error(err, "Failed to get credentials from secret")
 			return err
 		}
-		r.Username = string(secret.Data[r.CredentialSecret.UsernameKey])
-		r.Password = string(secret.Data[r.CredentialSecret.PasswordKey])
+		usernameBytes, ok := secret.Data[r.CredentialSecret.UsernameKey]
+		if !ok || len(usernameBytes) == 0 {
+			return fmt.Errorf("credential secret %s/%s missing key %q",
+				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.UsernameKey)
+		}
+		passwordBytes, ok := secret.Data[r.CredentialSecret.PasswordKey]
+		if !ok || len(passwordBytes) == 0 {
+			return fmt.Errorf("credential secret %s/%s missing key %q",
+				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.PasswordKey)
+		}
+		r.Username = string(usernameBytes)
+		r.Password = string(passwordBytes)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/repo.go` around lines 123 - 129, The current code reads
secret.Data directly into r.Username and r.Password after c.Get (using
r.CredentialSecret.Name) which lets missing keys become empty credentials;
change this to explicitly check secret.Data for r.CredentialSecret.UsernameKey
and r.CredentialSecret.PasswordKey, ensure each key exists and the byte slice is
non-empty, and if either check fails log a descriptive error and return an error
instead of assigning an empty string to r.Username/r.Password; reference the
c.Get call, secret.Data map, r.CredentialSecret.UsernameKey,
r.CredentialSecret.PasswordKey, r.Username and r.Password when locating where to
add the checks.

174-175: ⚠️ Potential issue | 🟠 Major

Compute TLS-skip from resolved chartURL, not only repo URL scheme.

Line 206 currently inherits the flag from Lines 174-175 (r.URL scheme). If the index resolves to an HTTPS chart URL while the repo URL is HTTP, cert verification can be unintentionally disabled for that HTTPS download. Derive the insecure flag from the final resolved chart URL after Lines 259-262.

For Helm v4.1.3, confirm whether getter.WithInsecureSkipVerifyTLS(...) on downloader.ChartDownloader applies to HTTPS chart artifact downloads based on the final chart URL, and whether mixed-scheme (HTTP repo index -> HTTPS chart URL) is supported/recommended.

Also applies to: 206-206, 259-262

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/repo.go` around lines 174 - 175, Compute and set
TLS-skip from the final resolved chart URL instead of the repo URL: after
resolving the index entry to the actual chart URL (the variable holding the
resolved chart URL used around the code that follows lines 259-262), parse that
final chart URL and set entry.InsecureSkipTLSVerify = parsedFinalChartURL.Scheme
== "http"; also ensure the ChartDownloader/getter is instantiated with
getter.WithInsecureSkipVerifyTLS(entry.InsecureSkipTLSVerify) so downloads honor
the flag (keep existing r.Debug logging around the condition). This replaces the
current use of r.URL scheme and ensures the insecure flag reflects the actual
chart artifact URL.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/wandb/spec/charts/oci.go`:
- Around line 59-61: The current logic in the block that appends
registry.ClientOptBasicAuth when r.Username and r.Password are both non-empty
silently allows anonymous auth if only one of r.Username or r.Password is set;
change this to validate that both are provided or neither: check the
exclusive-or condition (r.Username == "" XOR r.Password == "") and if true
return a clear error (e.g. "both username and password must be provided
together") instead of proceeding, otherwise continue to append
registry.ClientOptBasicAuth(r.Username, r.Password) when both are non-empty;
update the function's error return path accordingly.

---

Duplicate comments:
In `@pkg/wandb/spec/charts/repo.go`:
- Around line 123-129: The current code reads secret.Data directly into
r.Username and r.Password after c.Get (using r.CredentialSecret.Name) which lets
missing keys become empty credentials; change this to explicitly check
secret.Data for r.CredentialSecret.UsernameKey and
r.CredentialSecret.PasswordKey, ensure each key exists and the byte slice is
non-empty, and if either check fails log a descriptive error and return an error
instead of assigning an empty string to r.Username/r.Password; reference the
c.Get call, secret.Data map, r.CredentialSecret.UsernameKey,
r.CredentialSecret.PasswordKey, r.Username and r.Password when locating where to
add the checks.
- Around line 174-175: Compute and set TLS-skip from the final resolved chart
URL instead of the repo URL: after resolving the index entry to the actual chart
URL (the variable holding the resolved chart URL used around the code that
follows lines 259-262), parse that final chart URL and set
entry.InsecureSkipTLSVerify = parsedFinalChartURL.Scheme == "http"; also ensure
the ChartDownloader/getter is instantiated with
getter.WithInsecureSkipVerifyTLS(entry.InsecureSkipTLSVerify) so downloads honor
the flag (keep existing r.Debug logging around the condition). This replaces the
current use of r.URL scheme and ensures the insecure flag reflects the actual
chart artifact URL.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0a3da00a-de1b-4ef9-85a8-4fc618aaa855

📥 Commits

Reviewing files that changed from the base of the PR and between 801749c and 9bb9f61.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod
  • pkg/wandb/spec/charts/oci.go
  • pkg/wandb/spec/charts/repo.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod

Comment thread pkg/wandb/spec/charts/oci.go
When only one of username or password is provided, the code silently
falls back to anonymous pull. Fail fast with a clear error so
misconfiguration is immediately visible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zacharyblasczyk
Copy link
Copy Markdown
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 8, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
pkg/wandb/spec/charts/repo.go (1)

174-206: ⚠️ Potential issue | 🟠 Major

Derive the chart-download TLS policy from the resolved chartURL.

entry.InsecureSkipTLSVerify is the right setting for fetching the repo index, but reusing it in getter.WithInsecureSkipVerifyTLS(...) is still unsafe when the index returns an absolute chart URL. An http:// index can legitimately point at an absolute https:// tarball URL, and this code would then disable certificate validation for the chart download. Keep the repo-level flag scoped to the index fetch and compute a separate downloader flag after chartURL has been normalized. Based on learnings, relative chart URLs are already resolved against the repository base URL, so the remaining gap here is specifically absolute URLs from the index.

🔒 Proposed fix
-	getterOpts := []getter.Option{
-		getter.WithBasicAuth(r.Username, r.Password),
-		getter.WithInsecureSkipVerifyTLS(entry.InsecureSkipTLSVerify),
-	}
+	getterOpts := []getter.Option{
+		getter.WithBasicAuth(r.Username, r.Password),
+	}
@@
 	chartURL := cv.URLs[0]
 	if !strings.HasPrefix(chartURL, "http://") && !strings.HasPrefix(chartURL, "https://") {
 		chartURL = fmt.Sprintf("%s/%s", strings.TrimSuffix(r.URL, "/"), chartURL)
 	}
+	parsedChartURL, err := url.Parse(chartURL)
+	if err != nil {
+		return "", fmt.Errorf("failed to parse chart URL %q: %w", chartURL, err)
+	}
+	getterOpts = append(getterOpts, getter.WithInsecureSkipVerifyTLS(parsedChartURL.Scheme == "http"))

Also applies to: 259-262

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/repo.go` around lines 174 - 206, The repo-level TLS
flag entry.InsecureSkipTLSVerify should only control index fetches; after you
normalize/resolve chartURL you must compute a separate downloader flag (e.g.,
downloaderInsecure := parsedChartURL.Scheme == "http" or parsedChartURL.Scheme
!= "https") and pass that to getter.WithInsecureSkipVerifyTLS instead of reusing
entry.InsecureSkipTLSVerify. Keep entry.InsecureSkipTLSVerify for the index
fetch logic (where you set it now), then parse/normalize chartURL, derive the
downloader flag from the parsed chartURL's scheme, and replace the call to
getter.WithInsecureSkipVerifyTLS(...) so it uses the new downloaderInsecure
value (apply the same change for the other occurrence referenced around lines
259-262).
🧹 Nitpick comments (3)
Makefile (1)

101-101: Make CHART_VERSION extraction deterministic.

Line [101] parses YAML via grep/sed, which is brittle (first version: wins and quoted format is assumed). Use a YAML-aware query to avoid silent mismatches.

Proposed refactor
-CHART_VERSION ?= $(shell grep 'version:' $(OCI_MANIFEST) | head -1 | sed 's/.*"\(.*\)"/\1/')
+CHART_VERSION ?= $(shell yq -r '.spec.chart.version' $(OCI_MANIFEST))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 101, The CHART_VERSION assignment is brittle because it
greps/seds YAML from $(OCI_MANIFEST); replace that pipeline with a YAML-aware
extraction (e.g., use yq or a small python one-liner) to read the version key
reliably from the manifest. Update the CHART_VERSION definition to call yq (or
python yaml.safe_load) against $(OCI_MANIFEST) and extract the top-level version
field deterministically (handle quoted/unquoted values and multi-document
manifests), so CHART_VERSION is set from the parsed YAML rather than
text-matching.
pkg/wandb/spec/charts/oci_test.go (1)

188-190: Use a per-spec Helm home instead of fixed temp paths.

These env vars point every spec at the same os.TempDir()/oci-test-* directories while OCIRelease.pullChart() enables Helm caching. Shared cache/config/data paths leak state across specs and, if the suite is run with ginkgo -p, across nodes. Prefer a fresh GinkgoT().TempDir() and derive the three homes from that.

🧪 Suggested cleanup
 import (
 	"context"
-	"os"
 	"path/filepath"
@@
-			GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(os.TempDir(), "oci-test-cache"))
-			GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(os.TempDir(), "oci-test-config"))
-			GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(os.TempDir(), "oci-test-data"))
+			helmHome := GinkgoT().TempDir()
+			GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(helmHome, "cache"))
+			GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(helmHome, "config"))
+			GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(helmHome, "data"))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci_test.go` around lines 188 - 190, The test sets
HELM_CACHE_HOME, HELM_CONFIG_HOME, and HELM_DATA_HOME to fixed os.TempDir()
subpaths which shares state across specs and parallel nodes; update the test in
pkg/wandb/spec/charts/oci_test.go to create a fresh per-spec base directory
using GinkgoT().TempDir() and derive the three Helm homes from that base (e.g.,
filepath.Join(base, "cache"), "config", "data") before calling
OCIRelease.pullChart() so each spec gets isolated Helm cache/config/data
directories.
pkg/wandb/spec/charts/oci.go (1)

126-131: Keep credential-key defaults local to Apply().

Even with a value receiver, r.CredentialSecret.UsernameKey = ... and PasswordKey = ... mutate the shared pointed struct. That makes Apply() silently default the release config in place, and those changes can leak into later pkg/wandb/spec/spec.go Spec.IsEqual() comparisons. Use local usernameKey / passwordKey variables instead of writing back into the config object.

♻️ Suggested refactor
 	if r.CredentialSecret != nil {
-		if r.CredentialSecret.UsernameKey == "" {
-			r.CredentialSecret.UsernameKey = CredentialUsernameKey
-		}
-		if r.CredentialSecret.PasswordKey == "" {
-			r.CredentialSecret.PasswordKey = CredentialPasswordKey
-		}
+		usernameKey := r.CredentialSecret.UsernameKey
+		if usernameKey == "" {
+			usernameKey = CredentialUsernameKey
+		}
+		passwordKey := r.CredentialSecret.PasswordKey
+		if passwordKey == "" {
+			passwordKey = CredentialPasswordKey
+		}
 		log.Info("Retrieving credentials from secret",
 			"name", r.CredentialSecret.Name,
-			"usernameKey", r.CredentialSecret.UsernameKey,
-			"passwordKey", r.CredentialSecret.PasswordKey)
+			"usernameKey", usernameKey,
+			"passwordKey", passwordKey)
@@
-		usernameBytes, ok := secret.Data[r.CredentialSecret.UsernameKey]
+		usernameBytes, ok := secret.Data[usernameKey]
 		if !ok || len(usernameBytes) == 0 {
 			return fmt.Errorf("credential secret %s/%s missing key %q",
-				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.UsernameKey)
+				wandb.Namespace, r.CredentialSecret.Name, usernameKey)
 		}
-		passwordBytes, ok := secret.Data[r.CredentialSecret.PasswordKey]
+		passwordBytes, ok := secret.Data[passwordKey]
 		if !ok || len(passwordBytes) == 0 {
 			return fmt.Errorf("credential secret %s/%s missing key %q",
-				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.PasswordKey)
+				wandb.Namespace, r.CredentialSecret.Name, passwordKey)
 		}

Also applies to: 144-155

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci.go` around lines 126 - 131, In Apply(), avoid
mutating r.CredentialSecret by reading usernameKey and passwordKey into local
variables (e.g., usernameKey := r.CredentialSecret.UsernameKey; if usernameKey
== "" { usernameKey = CredentialUsernameKey } and similarly for passwordKey) and
use those locals when building the chart/secret, leaving
r.CredentialSecret.UsernameKey and .PasswordKey untouched; apply the same
local-variable refactor for the other identical block handling credentials (the
one that currently sets r.CredentialSecret.PasswordKey/UsernameKey) so
Spec.IsEqual comparisons aren’t affected by Apply() side effects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 124-128: The command chain creates tmpdir with tmpdir=$$(mktemp
-d) but only deletes it at the end of the && chain so failures leave residue;
change the recipe to create the temp dir, register a cleanup (e.g., trap 'rm -rf
"$$tmpdir"' EXIT) immediately after tmpdir creation, then run helm pull/helm
push using $$tmpdir (helm pull $(CHART_NAME) ... -d $$tmpdir and helm push
$$tmpdir/$(CHART_NAME)-$(CHART_VERSION).tgz ...), ensuring the trap guarantees
rm -rf $$tmpdir runs on both success and failure (and optionally remove the trap
on successful exit).

---

Duplicate comments:
In `@pkg/wandb/spec/charts/repo.go`:
- Around line 174-206: The repo-level TLS flag entry.InsecureSkipTLSVerify
should only control index fetches; after you normalize/resolve chartURL you must
compute a separate downloader flag (e.g., downloaderInsecure :=
parsedChartURL.Scheme == "http" or parsedChartURL.Scheme != "https") and pass
that to getter.WithInsecureSkipVerifyTLS instead of reusing
entry.InsecureSkipTLSVerify. Keep entry.InsecureSkipTLSVerify for the index
fetch logic (where you set it now), then parse/normalize chartURL, derive the
downloader flag from the parsed chartURL's scheme, and replace the call to
getter.WithInsecureSkipVerifyTLS(...) so it uses the new downloaderInsecure
value (apply the same change for the other occurrence referenced around lines
259-262).

---

Nitpick comments:
In `@Makefile`:
- Line 101: The CHART_VERSION assignment is brittle because it greps/seds YAML
from $(OCI_MANIFEST); replace that pipeline with a YAML-aware extraction (e.g.,
use yq or a small python one-liner) to read the version key reliably from the
manifest. Update the CHART_VERSION definition to call yq (or python
yaml.safe_load) against $(OCI_MANIFEST) and extract the top-level version field
deterministically (handle quoted/unquoted values and multi-document manifests),
so CHART_VERSION is set from the parsed YAML rather than text-matching.

In `@pkg/wandb/spec/charts/oci_test.go`:
- Around line 188-190: The test sets HELM_CACHE_HOME, HELM_CONFIG_HOME, and
HELM_DATA_HOME to fixed os.TempDir() subpaths which shares state across specs
and parallel nodes; update the test in pkg/wandb/spec/charts/oci_test.go to
create a fresh per-spec base directory using GinkgoT().TempDir() and derive the
three Helm homes from that base (e.g., filepath.Join(base, "cache"), "config",
"data") before calling OCIRelease.pullChart() so each spec gets isolated Helm
cache/config/data directories.

In `@pkg/wandb/spec/charts/oci.go`:
- Around line 126-131: In Apply(), avoid mutating r.CredentialSecret by reading
usernameKey and passwordKey into local variables (e.g., usernameKey :=
r.CredentialSecret.UsernameKey; if usernameKey == "" { usernameKey =
CredentialUsernameKey } and similarly for passwordKey) and use those locals when
building the chart/secret, leaving r.CredentialSecret.UsernameKey and
.PasswordKey untouched; apply the same local-variable refactor for the other
identical block handling credentials (the one that currently sets
r.CredentialSecret.PasswordKey/UsernameKey) so Spec.IsEqual comparisons aren’t
affected by Apply() side effects.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cc8279cc-31ec-494e-8d57-42c520b1df68

📥 Commits

Reviewing files that changed from the base of the PR and between f3abfed and 3fd599f.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • Makefile
  • Tiltfile
  • go.mod
  • hack/testing-manifests/wandb/default.yaml
  • hack/testing-manifests/wandb/oci.yaml
  • pkg/helm/chart.go
  • pkg/wandb/spec/charts/charts.go
  • pkg/wandb/spec/charts/local.go
  • pkg/wandb/spec/charts/oci.go
  • pkg/wandb/spec/charts/oci_test.go
  • pkg/wandb/spec/charts/repo.go
  • pkg/wandb/spec/charts/repo_test.go
  • pkg/wandb/spec/config.go
  • pkg/wandb/spec/spec.go
  • pkg/wandb/spec/specfakes/fake_chart.go

Comment thread Makefile
zacharyblasczyk and others added 2 commits April 8, 2026 15:52
Add orbstack to the allowed Kubernetes contexts in README and
tilt-settings.sample.json. Minor formatting cleanup in README.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@casey-coreweave casey-coreweave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validated for local OCI paths 👍 Added a commit to change parsing on the OCI tags so we can handle all the formats (like hash refs) and allow optional versions.

@casey-coreweave casey-coreweave force-pushed the feat/oci-helm-chart-support branch from bd3014f to f76ba75 Compare April 17, 2026 21:59
Removes stale entries for older grpc-gateway, otelhttp, otel, and
x/net/oauth2 versions that were no longer referenced after recent
dependency bumps, resolving the Dependency Check CI job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants