feat: Add OCI Helm chart registry support and upgrade to Helm v4 by zacharyblasczyk · Pull Request #147 · wandb/operator

zacharyblasczyk · 2026-03-30T15:49:27Z

Summary

Upgrade Helm SDK from v3.19.2 to v4.1.3, adapting to all breaking API changes (import path restructuring, Config.Init signature, Entry field rename, ContentCache)
Disable Server-Side Apply (SSA) to preserve Helm v3 client-side apply semantics. Helm v4 defaults to SSA (HIP-0023), but SSA cannot remove API-server-defaulted fields (e.g., strategy.rollingUpdate) when chart upgrades change discriminator fields (e.g., strategy.type: Recreate), causing validation failures. See commit 7e19a96 for the full analysis.
Use HookOnlyStrategy for Install/Upgrade/Rollback wait behavior. The operator's own reconcile loop monitors resource readiness, so Helm doesn't need to block. StatusWatcherStrategy caused "another operation is in progress" errors when the reconciler re-entered during a long Helm wait.
Add OCIRelease chart type that pulls Helm charts from OCI-based registries using Helm's native registry.Client.Pull() with in-memory chart loading via loader.LoadArchive()
Use release.Accessor interface in isInstalled() instead of type-asserting to *releasev1.Release, making it resilient to future Helm release type versions
Remove dead code (DownloadChart() function that was never called)

Users can now configure OCI chart sources:

spec:
  chart:
    url: oci://ghcr.io/wandb/helm-charts/operator-wandb
    version: "1.0.0"
    credentialSecret:
      name: oci-registry-creds

Helm v4 decisions

Decision	Rationale
`ServerSideApply = false`	SSA field-merge can't remove API-server-defaulted fields (e.g., `rollingUpdate`), breaking chart upgrades that change deployment strategy. CSA does full object replacement.
`HookOnlyStrategy`	Operator reconcile loop already checks readiness. Helm blocking on `StatusWatcherStrategy` conflicts with the reconciler's requeue interval.
`ContentCache`	New required field in Helm v4's `ChartDownloader`.

Test plan

61 Ginkgo specs pass in pkg/wandb/spec/charts/ (40 existing + 21 new OCI)
All unit test packages pass
Fresh install of chart v0.41.3 on clean Orbstack cluster via Tilt
Validated SSA=true breaks on strategy change (RollingUpdate→Recreate), SSA=false works
go build ./... clean
go vet ./... clean

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Support for pulling and deploying Helm charts from OCI registries; OCI-based manifests supported.
Chores
- Upgraded Helm/Kubernetes libraries and refreshed core dependencies.
- Added the "orbstack" Kubernetes context to Tilt settings and README.
- Updated operator-wandb chart reference to v0.41.3.
- Added local OCI registry helpers for testing and publishing charts.
Tests
- Added tests for OCI chart handling, credential scenarios, apply/prune flows.

Enable the operator to pull Helm charts from OCI-based registries via a new OCIRelease chart type, and upgrade the Helm SDK from v3.19.2 to v4.1.3 to align with the current Helm ecosystem. Helm v4 upgrade: - Update all import paths (chart/v2, release/v1, repo/v1, etc.) - Adapt to v4 API changes: Config.Init log param removed, Uninstall.Wait replaced by WaitStrategy, Entry field rename - Remove dead DownloadChart() code - Regenerate counterfeiter fakes OCI support: - New OCIRelease type in pkg/wandb/spec/charts/oci.go using Helm's native registry.Client.Pull() with in-memory chart loading - Custom ociurl validator using registry.IsOCI() - Supports authentication via direct credentials or K8s Secrets - PlainHTTP option for insecure/local registries - Registered in chart dispatcher between LocalRelease and RepoRelease - 21 new Ginkgo test specs covering validation, dispatcher routing, credential secrets, and error paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-30T15:49:47Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds Helm v4 migration, broad dependency upgrades, OCI-backed Helm chart support (new OCIRelease with pull/apply/prune and tests), local OCI registry test tooling and manifests, and permits the "orbstack" Kubernetes context in Tilt configuration.

Changes

Cohort / File(s)	Summary
Tilt & sample `Tiltfile`, `tilt-settings.sample.json`, `README.md`	Added `"orbstack"` to allowed Tilt contexts and updated documentation formatting.
Testing manifests `hack/testing-manifests/wandb/default.yaml`, `hack/testing-manifests/wandb/oci.yaml`	Bumped chart version, fixed YAML indentation, added new OCI-based WeightsAndBiases manifest referencing registry credentials and resource values.
Go module deps `go.mod`	Large dependency bump including Helm v3→v4, Kubernetes/controller-runtime, validator, logging, and many transitive updates/removals.
Helm core integration `pkg/helm/chart.go`	Migrated to `helm.sh/helm/v4` APIs, changed return types from concrete `*release.Release` to `release.Releaser`, removed DownloadChart and noop logger, and adjusted action client WaitStrategy/ServerSideApply semantics.
Chart API & helpers `pkg/wandb/spec/spec.go`, `pkg/wandb/spec/config.go`, `pkg/wandb/spec/specfakes/fake_chart.go`	Switched chart/chartutil/strvals imports to Helm v4 v2 and updated fake chart mock signatures to v4 types.
Local/Repo chart handling `pkg/wandb/spec/charts/local.go`, `pkg/wandb/spec/charts/repo.go`, `pkg/wandb/spec/charts/repo_test.go`	Ported loader/repo/downloader code to Helm v4; added `nonociurl` validator to prevent OCI URLs for repo downloader; propagated TLS skip-verify flag and content cache to downloader.
OCI registry feature & tests `pkg/wandb/spec/charts/oci.go`, `pkg/wandb/spec/charts/oci_test.go`, `pkg/wandb/spec/charts/charts.go`	Added exported `OCIRelease` type with Validate/Chart/Apply/Prune (pulling charts from OCI registries with credential handling, plainHTTP/debug), inserted OCI candidate into dispatcher ordering, and added comprehensive tests.
Makefile local registry tooling `Makefile`	Added variables and phony targets to run a local authenticated OCI registry, push charts, create in-cluster registry secret, and stop the registry.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant OCI as OCIRelease
    participant K8s as Kubernetes
    participant Reg as OCI_Registry
    participant Helm as Helm_Action

    Client->>OCI: Apply(ctx, k8sClient, wandb, scheme, values)
    OCI->>K8s: Get Secret (CredentialSecret?) 
    K8s-->>OCI: Secret data / not found
    OCI->>Reg: Build registry client (auth/plainHTTP/debug)
    OCI->>Reg: Pull chart (strip oci://, append :version)
    Reg-->>OCI: Chart archive bytes / error
    OCI->>OCI: Load chart from bytes
    OCI->>Helm: Create actionable chart (name/namespace)
    Helm->>K8s: Install/Upgrade/Uninstall release
    K8s-->>Helm: Release status
    Helm-->>OCI: Result / error
    OCI-->>Client: Return result/error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

fix: Bump deps #70 — Also bumps Go module dependency versions (related dependency updates).
fix: Add Tilt configs for local development #53 — Introduced Tilt settings / sample file; overlaps with Tiltfile/tilt-settings changes.
feat: Add application controller and types #107 — Same Tiltfile allowedContexts change (orbstack) noted in history.

Suggested reviewers

danielpanzella
j7m4

"I’m a rabbit with a quiet hop,
I fetched OCI charts from a local shop.
Helm jumped to v4, I nudged the creds,
Tilt let Orbstack join our devled threads.
Secrets, pushes, tests — I did a happy hop!" 🐰

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the two main changes: adding OCI Helm chart registry support and upgrading to Helm v4, both of which are substantive changes throughout the changeset.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/oci-helm-chart-support

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

w-b-hivemind · 2026-03-30T15:53:05Z

HiveMind Sessions

8 sessions · 3h 4m · $99

Session	Agent	Duration	Tokens	Cost	Lines
Fix Failing Ci Tests Before Committing `fa12ba6b-0761-49d0-bc6e-8555e2755032`	claude	3m	2.4K	$0.76	+0 -0
Github Pull Request 147 Code Review `019d6e05-7f21-7280-9846-e90d1494c85d`	codex	5m	240.4K	$1.27	+0 -0
Oci Chart Pr Code Review And Fix `c91a8baa-456f-42e2-824f-9da7cfcd987e`	claude	10m	18.3K	$2.92	+37 -10
Fixing Ginkgo Test Environment Variable Scoping `019d4f37-879c-7cb0-a208-03963248f108`	codex	1m	41.9K	$0.23	+0 -0
Fix Parallel Test Environment Variable Safety `22a29ceb-5854-414d-a627-93565dcc301b`	claude	1h 36m	98.7K	$78	+235 -90
Fix Oci Chart Pull And Nil Dereference Bugs `019d4e62-a363-74a2-bf38-e067c44d8f39`	codex	5m	125.0K	$0.68	+0 -0
Pr Accuracy Review And Report `019d4e63-752e-79c0-9393-d1a56b9d7841`	codex	10m	288.7K	$1.32	+0 -0
Enable Operator Pull OCI Helm `4527dbe4-f9ad-4ae1-91a5-9b4ee33f8119`	claude	51m	46.0K	$14	+525 -88
Total		3h 4m	861.4K	$99	+797 -188

Screenshots

View all sessions in HiveMind →

Run claude --resume fa12ba6b-0761-49d0-bc6e-8555e2755032 or codex resume 019d6e05-7f21-7280-9846-e90d1494c85d to pickup where you left off.

socket-security · 2026-03-30T15:53:20Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	github.com/onsi/gomega@v1.36.1 ⏵ v1.39.1	⁺¹
	sigs.k8s.io/controller-runtime@v0.20.0 ⏵ v0.23.3	⁺¹
	helm.sh/helm/v4@v4.1.4
	k8s.io/apimachinery@v0.34.0 ⏵ v0.35.1	⁺¹
	k8s.io/client-go@v0.34.0 ⏵ v0.35.1	⁺¹
	k8s.io/api@v0.34.0 ⏵ v0.35.1
	github.com/onsi/ginkgo/v2@v2.21.0 ⏵ v2.28.1	^-2
	github.com/go-playground/validator/v10@v10.26.0 ⏵ v10.30.2	^-1
	github.com/go-logr/logr@v1.4.2 ⏵ v1.4.3	⁺¹

View full report

socket-security · 2026-03-30T15:53:24Z

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action	Severity	Alert (click "▶" to expand/collapse)
Warn		License policy violation: golang `github.com/cyphar/filepath-securejoin` under BSD-3-Clause AND MPL-2.0 Location: Package overview From: `?` → `golang/helm.sh/helm/v4@v4.1.4` → `golang/github.com/cyphar/filepath-securejoin@v0.6.1` ℹ Read more on: This package \| This alert \| What is a license policy violation? Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at `support@socket.dev`. Suggestion: Find a package that does not violate your license policy or adjust your policy to allow this package's license. Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment `@SocketSecurity ignore golang/github.com/cyphar/filepath-securejoin@v0.6.1`. You can also ignore all packages with `@SocketSecurity ignore-all`. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

Reject tagless OCI refs at validation time instead of letting them fail at runtime on every reconcile. Also add a nil check on the pull result to prevent a panic on corrupted registry responses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…lease - Move result.Chart nil guard above the debug log that dereferences it to prevent a nil-pointer panic when Debug is enabled. - Add nonociurl validator to RepoRelease.URL so oci:// URLs are explicitly rejected at validation time, not just by dispatcher ordering. - Fix misleading test: include name field so the test actually verifies scheme-level rejection rather than passing due to missing required field. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ests Use release.NewAccessor() instead of type-asserting to *releasev1.Release in isInstalled(), making it resilient to future Helm release types. Replace os.Setenv with GinkgoT().Setenv() for parallel-safe test execution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Helm v4's ChartDownloader requires the ContentCache field to be set, separate from RepositoryCache. Without it, DownloadTo() fails with "content cache must be set". Pass settings.ContentCache through from the cli.EnvSettings, which defaults to $HELM_CONTENT_CACHE or ~/.cache/helm/content. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Helm v4 defaults to Server-Side Apply (SSA) for new installs (HIP-0023). SSA uses field-level merge instead of full object replacement, which breaks when upgrading charts that change a Deployment's strategy. Specifically: when a chart creates a Deployment without specifying strategy (defaulting to RollingUpdate), the API server auto-populates strategy.rollingUpdate with defaults that no field manager owns. A subsequent SSA upgrade setting strategy.type=Recreate cannot remove the unowned rollingUpdate field, and the API server rejects the result: "spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy type is 'Recreate'" Disable SSA (ServerSideApply=false) on Install, Upgrade, and Rollback to use client-side apply (full object replacement), which cleans up stale fields and matches Helm v3 behavior. Use HookOnlyStrategy instead of StatusWatcherStrategy/LegacyStrategy. The operator's own reconcile loop already monitors resource readiness. Having Helm also wait via StatusWatcherStrategy causes the reconciler to re-enter while a Helm wait is in progress, producing "another operation (install/upgrade/rollback) is in progress" errors. HookOnlyStrategy waits only for pre/post hooks, then returns control to the operator. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix app.resources.memory indentation (was sibling of requests, not child). Bump chart to v0.41.3 (v0.32.4 has duplicate env keys rejected by Helm v4). Add glue strategy and resource limits. Disable settingsMigrationJob. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Update controller-runtime v0.23.1→v0.23.3, validator v10.26.0→v10.30.2, gomega v1.39.0→v1.39.1, and transitive dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

HookOnlyStrategy's WaitForDelete() is a no-op, which means Uninstall returns success while resources are still terminating. This could cause Prune() to report success prematurely. Use LegacyStrategy for Uninstall to actually wait for resources to be deleted, while keeping HookOnlyStrategy for Install/Upgrade/Rollback where the operator's reconcile loop handles readiness. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (2)

pkg/wandb/spec/charts/oci_test.go (2)

188-190: Use per-spec temp directories for Helm homes.

Line 188–190 currently use shared os.TempDir() paths; this can cause cross-spec interference in parallel runs. Prefer GinkgoT().TempDir() (or unique subdirs) per spec.

♻️ Suggested change

- GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(os.TempDir(), "oci-test-cache"))
- GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(os.TempDir(), "oci-test-config"))
- GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(os.TempDir(), "oci-test-data"))
+ helmHome := GinkgoT().TempDir()
+ GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(helmHome, "cache"))
+ GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(helmHome, "config"))
+ GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(helmHome, "data"))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci_test.go` around lines 188 - 190, The test currently
sets HELM_CACHE_HOME, HELM_CONFIG_HOME, and HELM_DATA_HOME to shared
os.TempDir() locations which can collide across specs; change the calls that use
GinkgoT().Setenv to instead create per-spec temp directories via
GinkgoT().TempDir() (or GinkgoT().TempDir()+unique subpaths) and assign those
paths to the three env vars so each spec gets isolated Helm home directories;
update the three Setenv invocations that reference HELM_CACHE_HOME,
HELM_CONFIG_HOME, and HELM_DATA_HOME accordingly.

218-223: Strengthen credential-path assertions to avoid false positives.

These specs only assert that one credential error string is absent. Assert the expected downstream failure (e.g., pull/connect failure) so the tests prove credentials were accepted before pull.

Also applies to: 247-251, 281-286

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci_test.go` around lines 218 - 223, The tests
currently only assert the absence of the credential error string after calling
ociRelease.Apply, which can produce false positives; update the It blocks that
call ociRelease.Apply (the "should retrieve credentials from secret before pull"
tests) to also assert that the returned error indicates the expected downstream
failure (e.g., a pull or connect failure message) and still assert it does not
contain "Failed to get credentials from secret"; in other words, after err :=
ociRelease.Apply(...), Expect(err).To(HaveOccurred()),
Expect(err.Error()).To(ContainSubstring("<expected pull/connect failure text>"))
and Expect(err.Error()).NotTo(ContainSubstring("Failed to get credentials from
secret")) so the spec proves credentials were accepted prior to the pull; apply
the same change to the other similar test blocks.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 145: Update the vulnerable grpc dependency in go.mod by bumping
google.golang.org/grpc from v1.78.0 to v1.79.3 or later; run `go get
google.golang.org/grpc@v1.79.3` (or a newer patch), then run `go mod tidy` and
rebuild/run tests to ensure the new version is applied and go.sum updated so the
authorization bypass vulnerability (GO-2026-4762 / GHSA-p77j-4mvh-x3m3) is
mitigated.

In `@pkg/helm/chart.go`:
- Around line 126-130: The Rollback(version int) method on ActionableChart
ignores the incoming version; update the code in Rollback to set the Helm
rollback target by assigning the parameter to the action.Rollback client's
Version field (i.e., after creating client := action.NewRollback(c.config) set
client.Version = version) before calling client.Run(c.releaseName); this ensures
callers can target a specific release revision.

In `@pkg/wandb/spec/charts/oci.go`:
- Around line 109-113: Add explicit nil guards to prevent panics: in
OCIRelease.getActionableChart check wandb for nil and return a descriptive error
if nil before calling wandb.GetNamespace/GetName; likewise, wherever the
Kubernetes client variable c is used (particularly the code path that calls
c.Get when CredentialSecret is set), validate that c is not nil and return an
error if it is rather than calling c.Get. Ensure error messages reference the
function/field (e.g., getActionableChart, wandb, CredentialSecret, c.Get) so
callers can debug easily.
- Around line 136-143: When loading credentials after c.Get(...) for
r.CredentialSecret, verify that secret.Data contains both
r.CredentialSecret.UsernameKey and r.CredentialSecret.PasswordKey before
assigning r.Username and r.Password; if either key is missing, return a clear
error (include the missing key name and the secret reference) instead of
allowing empty strings to be set on r.Username/r.Password so credential loading
fails fast and provides a helpful message.

In `@pkg/wandb/spec/charts/repo.go`:
- Around line 174-175: The code currently hard-codes
getter.WithInsecureSkipVerifyTLS(true) which unconditionally disables TLS
verification; change this so the TLS skip flag is derived from the final
resolved chart URL instead of always true. After chartURL is resolved (the
variable named chartURL used when resolving relative paths), compute a boolean
like insecure := parsedResolvedURL.Scheme == "http" (or use the resolved URL's
Scheme) and pass getter.WithInsecureSkipVerifyTLS(insecure) to the chart
downloader; also keep entry.InsecureSkipTLSVerify in sync with that same
resolved URL decision and remove the unconditional
WithInsecureSkipVerifyTLS(true) usage. Ensure you reference and update the
places that set entry.InsecureSkipTLSVerify and the getter configuration so both
use the resolved chartURL's scheme.

---

Nitpick comments:
In `@pkg/wandb/spec/charts/oci_test.go`:
- Around line 188-190: The test currently sets HELM_CACHE_HOME,
HELM_CONFIG_HOME, and HELM_DATA_HOME to shared os.TempDir() locations which can
collide across specs; change the calls that use GinkgoT().Setenv to instead
create per-spec temp directories via GinkgoT().TempDir() (or
GinkgoT().TempDir()+unique subpaths) and assign those paths to the three env
vars so each spec gets isolated Helm home directories; update the three Setenv
invocations that reference HELM_CACHE_HOME, HELM_CONFIG_HOME, and HELM_DATA_HOME
accordingly.
- Around line 218-223: The tests currently only assert the absence of the
credential error string after calling ociRelease.Apply, which can produce false
positives; update the It blocks that call ociRelease.Apply (the "should retrieve
credentials from secret before pull" tests) to also assert that the returned
error indicates the expected downstream failure (e.g., a pull or connect failure
message) and still assert it does not contain "Failed to get credentials from
secret"; in other words, after err := ociRelease.Apply(...),
Expect(err).To(HaveOccurred()),
Expect(err.Error()).To(ContainSubstring("<expected pull/connect failure text>"))
and Expect(err.Error()).NotTo(ContainSubstring("Failed to get credentials from
secret")) so the spec proves credentials were accepted prior to the pull; apply
the same change to the other similar test blocks.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d69359c1-8fa7-4c8f-b737-f63c3d453fe5

📥 Commits

Reviewing files that changed from the base of the PR and between f3abfed and 02f43aa.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (13)

Tiltfile
go.mod
hack/testing-manifests/wandb/default.yaml
pkg/helm/chart.go
pkg/wandb/spec/charts/charts.go
pkg/wandb/spec/charts/local.go
pkg/wandb/spec/charts/oci.go
pkg/wandb/spec/charts/oci_test.go
pkg/wandb/spec/charts/repo.go
pkg/wandb/spec/charts/repo_test.go
pkg/wandb/spec/config.go
pkg/wandb/spec/spec.go
pkg/wandb/spec/specfakes/fake_chart.go

coderabbitai · 2026-04-08T18:05:42Z

 func (c *ActionableChart) Rollback(version int) error {
 	client := action.NewRollback(c.config)
+	client.WaitStrategy = kube.HookOnlyStrategy
+	client.ServerSideApply = "false"
 	return client.Run(c.releaseName)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check the imports in the file to understand the Helm version head -50 pkg/helm/chart.go | grep -E "import|helm"

Repository: wandb/operator

Length of output: 414

🏁 Script executed:

# Look for the Rollback function context and any version parameter usage cat -n pkg/helm/chart.go | sed -n '120,135p'

Repository: wandb/operator

Length of output: 686

🏁 Script executed:

# Search for other uses of action.Rollback in the codebase to understand the API rg "action\.Rollback|action\.NewRollback" -A 5

Repository: wandb/operator

Length of output: 321

🏁 Script executed:

# Check if there are any tests for the Rollback function that might show expected behavior rg "Rollback" pkg/helm/ -B 2 -A 5

Repository: wandb/operator

Length of output: 874

🏁 Script executed:

# Check go.mod to see the exact Helm v4 version grep "helm.sh/helm" go.mod

Repository: wandb/operator

Length of output: 81

🏁 Script executed:

# Look for any other files that use action.Rollback to see if there's a pattern find . -name "*.go" -type f -exec grep -l "action.Rollback" {} \;

Repository: wandb/operator

Length of output: 40

🏁 Script executed:

# Check if Version field is used anywhere else in the codebase with Helm actions rg "client\.Version\s*=" -B 2 -A 2

Repository: wandb/operator

Length of output: 40

🏁 Script executed:

# Look at the NewRollback and Rollback struct definition by checking imports and usage patterns rg "Rollback" pkg/helm/chart.go -B 5 -A 10

Repository: wandb/operator

Length of output: 1272

🏁 Script executed:

# Check Helm v4 API documentation or package info for action.Rollback # First, let's see if there's vendor directory or go.sum that might give us more info find . -path "*/vendor/helm.sh/helm/v4/pkg/action*" -name "*.go" 2>/dev/null | head -5

Repository: wandb/operator

Length of output: 40

🏁 Script executed:

# Let's search for any documentation or comments about Rollback in the codebase rg -i "rollback.*version|version.*rollback" --type md --type txt

Repository: wandb/operator

Length of output: 40

🏁 Script executed:

# Check if there's a go.sum entry that might help us understand the module structure grep -A 2 "helm.sh/helm/v4" go.sum

Repository: wandb/operator

Length of output: 347

🏁 Script executed:

# Let's check the Helm v4 source code by looking for the action package structure # Since vendor might not exist, let's check if we can access go package info python3 << 'EOF' import subprocess import json # Try to get package info from go try: result = subprocess.run( ["go", "list", "-m", "-json", "helm.sh/helm/v4"], capture_output=True, text=True, cwd="." ) if result.returncode == 0: data = json.loads(result.stdout) print("Helm v4 Module Info:") print(f"Path: {data.get('Path')}") print(f"Version: {data.get('Version')}") print(f"Dir: {data.get('Dir')}") except Exception as e: print(f"Error: {e}") EOF

Repository: wandb/operator

Length of output: 170

🏁 Script executed:

# Let's look at the Helm v4 repository online or check if go can tell us the module path # Looking for clues in how other Helm actions are used in the codebase rg "action\.New" pkg/helm/ -A 3

Repository: wandb/operator

Length of output: 968

🏁 Script executed:

# Check the Rollback struct definition in the Helm v4 module cat /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go | head -100

Repository: wandb/operator

Length of output: 3321

🏁 Script executed:

# Let's look for the NewRollback function and Rollback struct definition grep -n "type Rollback struct\|func NewRollback\|Version" /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go | head -20

Repository: wandb/operator

Length of output: 1028

🏁 Script executed:

# Get the full rollback.go file if it's not too large wc -l /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go

Repository: wandb/operator

Length of output: 133

🏁 Script executed:

# Let's look at the prepareRollback method more carefully to understand Version usage sed -n '110,160p' /home/jailuser/go/pkg/mod/helm.sh/helm/v4@v4.1.3/pkg/action/rollback.go

Repository: wandb/operator

Length of output: 1534

Assign the requested version to the Helm rollback action.

The Rollback(version int) method receives a version parameter but never assigns it to the rollback client. This causes the action to ignore the parameter and always use the default behavior (rolling back to the previous version). The Version field in Helm v4's action.Rollback struct controls the target revision, and it must be set for callers to target a specific release.

🛠️ Proposed fix

func (c *ActionableChart) Rollback(version int) error { client := action.NewRollback(c.config) + client.Version = version client.WaitStrategy = kube.HookOnlyStrategy client.ServerSideApply = "false" return client.Run(c.releaseName) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func (c *ActionableChart) Rollback(version int) error {

client := action.NewRollback(c.config)

client.WaitStrategy = kube.HookOnlyStrategy

client.ServerSideApply = "false"

return client.Run(c.releaseName)

func (c *ActionableChart) Rollback(version int) error {

client := action.NewRollback(c.config)

client.Version = version

client.WaitStrategy = kube.HookOnlyStrategy

client.ServerSideApply = "false"

return client.Run(c.releaseName)

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/helm/chart.go` around lines 126 - 130, The Rollback(version int) method on ActionableChart ignores the incoming version; update the code in Rollback to set the Helm rollback target by assigning the parameter to the action.Rollback client's Version field (i.e., after creating client := action.NewRollback(c.config) set client.Version = version) before calling client.Run(c.releaseName); this ensures callers can target a specific release revision.

Pre-existing issue — Rollback() is never called anywhere in the codebase:

$ grep -r '\.Rollback(' pkg/ internal/ --include='*.go' (no results)

The version parameter is part of the ActionableChartInterface but has no callers. Not fixing in this PR to avoid scope creep on dead code.

@zacharyblasczyk, understood — since Rollback() has no callers, fixing the unused version parameter is indeed out of scope here. Would you like me to open a follow-up GitHub issue to track this so it doesn't get lost?

Add Makefile targets for local OCI registry with basic auth: - make local-registry: starts registry:2 on port 5050 with htpasswd auth - make local-registry-push: pulls chart from charts.wandb.ai and pushes to local registry (version read from oci.yaml manifest) - make local-registry-secret: creates K8s secret with registry creds - make local-registry-stop: tears down the registry Add hack/testing-manifests/wandb/oci.yaml for testing OCI chart pulls with credential secret authentication. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

zacharyblasczyk · 2026-04-08T18:26:30Z

OCI Auth E2E Testing

Tested the full OCI chart pull path with authenticated registry on a local Orbstack cluster via Tilt.

Setup

tilt-settings.json:

{
  "allowedContexts": [
    "docker-desktop",
    "minikube",
    "kind-kind",
    "orbstack"
  ],
  "installMinio": true,
  "installWandb": true,
  "wandbCRD": "oci"
}

Start the local OCI registry and push the chart:

make local-registry-push    # starts auth registry on :5050, pushes chart
make local-registry-secret  # creates K8s secret with creds
tilt up                     # deploys operator + wandb CR using oci.yaml

What was tested

OCI pull with basic auth — operator reads oci-registry-creds secret, authenticates to host.internal:5050, pulls operator-wandb:0.41.3
Plain HTTP registry — plainHTTP: true in the CR spec
Credential secret defaulting — credentialSecret.name without explicit key overrides defaults to HELM_USERNAME/HELM_PASSWORD
Full install lifecycle — chart pulled → Helm install → 20 pods running, release status deployed

Results

Registry correctly rejects unauthenticated requests (401 UNAUTHORIZED)
Operator retrieves credentials from secret and authenticates successfully
Chart installs cleanly with ServerSideApply=false and HookOnlyStrategy
No errors in operator logs

Cleanup

make local-registry-stop

Fixes GO-2026-4762 / GHSA-p77j-4mvh-x3m3: authorization bypass in gRPC-Go caused by improper validation of the HTTP/2 :path pseudo-header. All versions before v1.79.3 are affected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

Makefile (2)

101-101: Make CHART_VERSION parsing robust.

Current parsing can pick the wrong version: key if the manifest gains additional version fields, causing pushes with the wrong chart tag.

Suggested diff

-CHART_VERSION ?= $(shell grep 'version:' $(OCI_MANIFEST) | head -1 | sed 's/.*"\(.*\)"/\1/')
+CHART_VERSION ?= $(shell awk '\
+  /^[[:space:]]*chart:[[:space:]]*$$/{in_chart=1; next} \
+  in_chart && /^[[:space:]]*version:[[:space:]]*/ { \
+    gsub(/"/,"",$$2); print $$2; exit \
+  }' $(OCI_MANIFEST))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 101, The CHART_VERSION assignment is brittle and can pick
the wrong "version:" key from $(OCI_MANIFEST); update the Makefile so
CHART_VERSION reads the chart version using a proper YAML-aware extractor (e.g.,
use yq to eval the top-level .version field from $(OCI_MANIFEST)) or at minimum
change the shell parsing to match only a top-level "version:" line (anchor the
regex to line start) to ensure the correct chart version is retrieved. Modify
the CHART_VERSION variable assignment to use the chosen YAML parser or anchored
pattern while still referencing $(OCI_MANIFEST).

123-127: Ensure temp directory cleanup on failure paths.

rm -rf $$tmpdir only executes on success. A failing helm pull/helm push leaves temp dirs behind.

Suggested diff

 local-registry-push: local-registry ## Push a chart from the wandb helm repo to the local OCI registry.
-	`@tmpdir`=$$(mktemp -d) && \
-	helm pull $(CHART_NAME) --repo $(CHART_REPO) --version $(CHART_VERSION) -d $$tmpdir && \
-	helm push $$tmpdir/$(CHART_NAME)-$(CHART_VERSION).tgz oci://localhost:$(REGISTRY_PORT)/wandb \
-		--plain-http --username $(REGISTRY_USER) --password $(REGISTRY_PASS) && \
-	rm -rf $$tmpdir
+	`@tmpdir`=$$(mktemp -d); trap 'rm -rf "$$tmpdir"' EXIT; \
+	helm pull $(CHART_NAME) --repo $(CHART_REPO) --version $(CHART_VERSION) -d $$tmpdir; \
+	helm push $$tmpdir/$(CHART_NAME)-$(CHART_VERSION).tgz oci://localhost:$(REGISTRY_PORT)/wandb \
+		--plain-http --username $(REGISTRY_USER) --password $(REGISTRY_PASS)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 123 - 127, The current Makefile snippet creates a
temporary directory in a shell line (tmpdir=$$(mktemp -d)) but only removes it
on the success path, leaving temp dirs behind if helm pull or helm push fail;
fix by registering a shell trap immediately after creating the temp dir to
always remove it on exit (e.g., after tmpdir=$$(mktemp -d) add trap 'rm -rf
"$$tmpdir"' EXIT) and then run the existing helm pull/helm push commands (ensure
the trap uses the same $$tmpdir variable and that the trap is set before any
command that can fail); reference the tmpdir variable and the helm pull/helm
push commands in your change.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 105-113: The existing check only looks for running containers and
will fail if a stopped container named $(REGISTRY_NAME) exists; update the
Makefile logic around the block that uses $(CONTAINER_TOOL) ps --filter
name=$(REGISTRY_NAME) to also detect stopped containers (use ps -a) and, if a
container with that name exists but is not running, remove it before invoking
$(CONTAINER_TOOL) run --name $(REGISTRY_NAME); specifically, add a branch to
call $(CONTAINER_TOOL) rm -f $(REGISTRY_NAME) (or the equivalent remove command)
when a non-running container is found so the subsequent $(CONTAINER_TOOL) run -d
--name $(REGISTRY_NAME) will succeed.

---

Nitpick comments:
In `@Makefile`:
- Line 101: The CHART_VERSION assignment is brittle and can pick the wrong
"version:" key from $(OCI_MANIFEST); update the Makefile so CHART_VERSION reads
the chart version using a proper YAML-aware extractor (e.g., use yq to eval the
top-level .version field from $(OCI_MANIFEST)) or at minimum change the shell
parsing to match only a top-level "version:" line (anchor the regex to line
start) to ensure the correct chart version is retrieved. Modify the
CHART_VERSION variable assignment to use the chosen YAML parser or anchored
pattern while still referencing $(OCI_MANIFEST).
- Around line 123-127: The current Makefile snippet creates a temporary
directory in a shell line (tmpdir=$$(mktemp -d)) but only removes it on the
success path, leaving temp dirs behind if helm pull or helm push fail; fix by
registering a shell trap immediately after creating the temp dir to always
remove it on exit (e.g., after tmpdir=$$(mktemp -d) add trap 'rm -rf "$$tmpdir"'
EXIT) and then run the existing helm pull/helm push commands (ensure the trap
uses the same $$tmpdir variable and that the trap is set before any command that
can fail); reference the tmpdir variable and the helm pull/helm push commands in
your change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ba87872-b11d-4baf-bcbd-2acfc869ef2d

📥 Commits

Reviewing files that changed from the base of the PR and between 02f43aa and 801749c.

📒 Files selected for processing (2)

Makefile
hack/testing-manifests/wandb/oci.yaml

✅ Files skipped from review due to trivial changes (1)

hack/testing-manifests/wandb/oci.yaml

The getter option WithInsecureSkipVerifyTLS was hard-coded to true, disabling TLS certificate validation for all chart downloads including HTTPS. Use the already-computed entry.InsecureSkipTLSVerify which is true only for HTTP URLs, preserving TLS verification for HTTPS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When loading credentials from a Kubernetes secret, check that the expected keys exist and are non-empty. Previously, missing keys would silently produce empty username/password strings, leading to opaque auth failures when pulling from the registry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…cret Remove stopped containers before starting a new registry to prevent name conflicts. Add explicit namespace (WANDB_NAMESPACE, default: default) to the credential secret creation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

pkg/wandb/spec/charts/repo.go (2)

123-129: ⚠️ Potential issue | 🟠 Major

Fail fast on missing/empty credential keys in repo secret path.

Lines 128-129 directly index secret.Data; missing keys silently become empty credentials and cause opaque auth failures downstream. Please mirror the explicit key existence/non-empty checks already used elsewhere.

Proposed fix

 		secret := &corev1.Secret{}
 		err := c.Get(ctx, client.ObjectKey{Name: r.CredentialSecret.Name, Namespace: wandb.Namespace}, secret)
 		if err != nil {
 			log.Error(err, "Failed to get credentials from secret")
 			return err
 		}
-		r.Username = string(secret.Data[r.CredentialSecret.UsernameKey])
-		r.Password = string(secret.Data[r.CredentialSecret.PasswordKey])
+		usernameBytes, ok := secret.Data[r.CredentialSecret.UsernameKey]
+		if !ok || len(usernameBytes) == 0 {
+			return fmt.Errorf("credential secret %s/%s missing key %q",
+				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.UsernameKey)
+		}
+		passwordBytes, ok := secret.Data[r.CredentialSecret.PasswordKey]
+		if !ok || len(passwordBytes) == 0 {
+			return fmt.Errorf("credential secret %s/%s missing key %q",
+				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.PasswordKey)
+		}
+		r.Username = string(usernameBytes)
+		r.Password = string(passwordBytes)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/repo.go` around lines 123 - 129, The current code reads
secret.Data directly into r.Username and r.Password after c.Get (using
r.CredentialSecret.Name) which lets missing keys become empty credentials;
change this to explicitly check secret.Data for r.CredentialSecret.UsernameKey
and r.CredentialSecret.PasswordKey, ensure each key exists and the byte slice is
non-empty, and if either check fails log a descriptive error and return an error
instead of assigning an empty string to r.Username/r.Password; reference the
c.Get call, secret.Data map, r.CredentialSecret.UsernameKey,
r.CredentialSecret.PasswordKey, r.Username and r.Password when locating where to
add the checks.

174-175: ⚠️ Potential issue | 🟠 Major

Compute TLS-skip from resolved chartURL, not only repo URL scheme.

Line 206 currently inherits the flag from Lines 174-175 (r.URL scheme). If the index resolves to an HTTPS chart URL while the repo URL is HTTP, cert verification can be unintentionally disabled for that HTTPS download. Derive the insecure flag from the final resolved chart URL after Lines 259-262.

For Helm v4.1.3, confirm whether getter.WithInsecureSkipVerifyTLS(...) on downloader.ChartDownloader applies to HTTPS chart artifact downloads based on the final chart URL, and whether mixed-scheme (HTTP repo index -> HTTPS chart URL) is supported/recommended.

Also applies to: 206-206, 259-262

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/repo.go` around lines 174 - 175, Compute and set
TLS-skip from the final resolved chart URL instead of the repo URL: after
resolving the index entry to the actual chart URL (the variable holding the
resolved chart URL used around the code that follows lines 259-262), parse that
final chart URL and set entry.InsecureSkipTLSVerify = parsedFinalChartURL.Scheme
== "http"; also ensure the ChartDownloader/getter is instantiated with
getter.WithInsecureSkipVerifyTLS(entry.InsecureSkipTLSVerify) so downloads honor
the flag (keep existing r.Debug logging around the condition). This replaces the
current use of r.URL scheme and ensures the insecure flag reflects the actual
chart artifact URL.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/wandb/spec/charts/oci.go`:
- Around line 59-61: The current logic in the block that appends
registry.ClientOptBasicAuth when r.Username and r.Password are both non-empty
silently allows anonymous auth if only one of r.Username or r.Password is set;
change this to validate that both are provided or neither: check the
exclusive-or condition (r.Username == "" XOR r.Password == "") and if true
return a clear error (e.g. "both username and password must be provided
together") instead of proceeding, otherwise continue to append
registry.ClientOptBasicAuth(r.Username, r.Password) when both are non-empty;
update the function's error return path accordingly.

---

Duplicate comments:
In `@pkg/wandb/spec/charts/repo.go`:
- Around line 123-129: The current code reads secret.Data directly into
r.Username and r.Password after c.Get (using r.CredentialSecret.Name) which lets
missing keys become empty credentials; change this to explicitly check
secret.Data for r.CredentialSecret.UsernameKey and
r.CredentialSecret.PasswordKey, ensure each key exists and the byte slice is
non-empty, and if either check fails log a descriptive error and return an error
instead of assigning an empty string to r.Username/r.Password; reference the
c.Get call, secret.Data map, r.CredentialSecret.UsernameKey,
r.CredentialSecret.PasswordKey, r.Username and r.Password when locating where to
add the checks.
- Around line 174-175: Compute and set TLS-skip from the final resolved chart
URL instead of the repo URL: after resolving the index entry to the actual chart
URL (the variable holding the resolved chart URL used around the code that
follows lines 259-262), parse that final chart URL and set
entry.InsecureSkipTLSVerify = parsedFinalChartURL.Scheme == "http"; also ensure
the ChartDownloader/getter is instantiated with
getter.WithInsecureSkipVerifyTLS(entry.InsecureSkipTLSVerify) so downloads honor
the flag (keep existing r.Debug logging around the condition). This replaces the
current use of r.URL scheme and ensures the insecure flag reflects the actual
chart artifact URL.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0a3da00a-de1b-4ef9-85a8-4fc618aaa855

📥 Commits

Reviewing files that changed from the base of the PR and between 801749c and 9bb9f61.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (3)

go.mod
pkg/wandb/spec/charts/oci.go
pkg/wandb/spec/charts/repo.go

🚧 Files skipped from review as they are similar to previous changes (1)

go.mod

When only one of username or password is provided, the code silently falls back to anonymous pull. Fail fast with a clear error so misconfiguration is immediately visible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

zacharyblasczyk · 2026-04-08T19:03:15Z

@coderabbitai full review

coderabbitai · 2026-04-08T19:03:21Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

pkg/wandb/spec/charts/repo.go (1)
174-206: ⚠️ Potential issue | 🟠 Major

Derive the chart-download TLS policy from the resolved chartURL.

entry.InsecureSkipTLSVerify is the right setting for fetching the repo index, but reusing it in getter.WithInsecureSkipVerifyTLS(...) is still unsafe when the index returns an absolute chart URL. An http:// index can legitimately point at an absolute https:// tarball URL, and this code would then disable certificate validation for the chart download. Keep the repo-level flag scoped to the index fetch and compute a separate downloader flag after chartURL has been normalized. Based on learnings, relative chart URLs are already resolved against the repository base URL, so the remaining gap here is specifically absolute URLs from the index.
🔒 Proposed fix
-	getterOpts := []getter.Option{
-		getter.WithBasicAuth(r.Username, r.Password),
-		getter.WithInsecureSkipVerifyTLS(entry.InsecureSkipTLSVerify),
-	}
+	getterOpts := []getter.Option{
+		getter.WithBasicAuth(r.Username, r.Password),
+	}
@@
 	chartURL := cv.URLs[0]
 	if !strings.HasPrefix(chartURL, "http://") && !strings.HasPrefix(chartURL, "https://") {
 		chartURL = fmt.Sprintf("%s/%s", strings.TrimSuffix(r.URL, "/"), chartURL)
 	}
+	parsedChartURL, err := url.Parse(chartURL)
+	if err != nil {
+		return "", fmt.Errorf("failed to parse chart URL %q: %w", chartURL, err)
+	}
+	getterOpts = append(getterOpts, getter.WithInsecureSkipVerifyTLS(parsedChartURL.Scheme == "http"))
Also applies to: 259-262
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/repo.go` around lines 174 - 206, The repo-level TLS
flag entry.InsecureSkipTLSVerify should only control index fetches; after you
normalize/resolve chartURL you must compute a separate downloader flag (e.g.,
downloaderInsecure := parsedChartURL.Scheme == "http" or parsedChartURL.Scheme
!= "https") and pass that to getter.WithInsecureSkipVerifyTLS instead of reusing
entry.InsecureSkipTLSVerify. Keep entry.InsecureSkipTLSVerify for the index
fetch logic (where you set it now), then parse/normalize chartURL, derive the
downloader flag from the parsed chartURL's scheme, and replace the call to
getter.WithInsecureSkipVerifyTLS(...) so it uses the new downloaderInsecure
value (apply the same change for the other occurrence referenced around lines
259-262).

🧹 Nitpick comments (3)

Makefile (1)

101-101: Make CHART_VERSION extraction deterministic.

Line [101] parses YAML via grep/sed, which is brittle (first version: wins and quoted format is assumed). Use a YAML-aware query to avoid silent mismatches.

Proposed refactor

-CHART_VERSION ?= $(shell grep 'version:' $(OCI_MANIFEST) | head -1 | sed 's/.*"\(.*\)"/\1/')
+CHART_VERSION ?= $(shell yq -r '.spec.chart.version' $(OCI_MANIFEST))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 101, The CHART_VERSION assignment is brittle because it
greps/seds YAML from $(OCI_MANIFEST); replace that pipeline with a YAML-aware
extraction (e.g., use yq or a small python one-liner) to read the version key
reliably from the manifest. Update the CHART_VERSION definition to call yq (or
python yaml.safe_load) against $(OCI_MANIFEST) and extract the top-level version
field deterministically (handle quoted/unquoted values and multi-document
manifests), so CHART_VERSION is set from the parsed YAML rather than
text-matching.

pkg/wandb/spec/charts/oci_test.go (1)

188-190: Use a per-spec Helm home instead of fixed temp paths.

These env vars point every spec at the same os.TempDir()/oci-test-* directories while OCIRelease.pullChart() enables Helm caching. Shared cache/config/data paths leak state across specs and, if the suite is run with ginkgo -p, across nodes. Prefer a fresh GinkgoT().TempDir() and derive the three homes from that.

🧪 Suggested cleanup

 import (
 	"context"
-	"os"
 	"path/filepath"
@@
-			GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(os.TempDir(), "oci-test-cache"))
-			GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(os.TempDir(), "oci-test-config"))
-			GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(os.TempDir(), "oci-test-data"))
+			helmHome := GinkgoT().TempDir()
+			GinkgoT().Setenv("HELM_CACHE_HOME", filepath.Join(helmHome, "cache"))
+			GinkgoT().Setenv("HELM_CONFIG_HOME", filepath.Join(helmHome, "config"))
+			GinkgoT().Setenv("HELM_DATA_HOME", filepath.Join(helmHome, "data"))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci_test.go` around lines 188 - 190, The test sets
HELM_CACHE_HOME, HELM_CONFIG_HOME, and HELM_DATA_HOME to fixed os.TempDir()
subpaths which shares state across specs and parallel nodes; update the test in
pkg/wandb/spec/charts/oci_test.go to create a fresh per-spec base directory
using GinkgoT().TempDir() and derive the three Helm homes from that base (e.g.,
filepath.Join(base, "cache"), "config", "data") before calling
OCIRelease.pullChart() so each spec gets isolated Helm cache/config/data
directories.

pkg/wandb/spec/charts/oci.go (1)

126-131: Keep credential-key defaults local to Apply().

Even with a value receiver, r.CredentialSecret.UsernameKey = ... and PasswordKey = ... mutate the shared pointed struct. That makes Apply() silently default the release config in place, and those changes can leak into later pkg/wandb/spec/spec.go Spec.IsEqual() comparisons. Use local usernameKey / passwordKey variables instead of writing back into the config object.

♻️ Suggested refactor

 	if r.CredentialSecret != nil {
-		if r.CredentialSecret.UsernameKey == "" {
-			r.CredentialSecret.UsernameKey = CredentialUsernameKey
-		}
-		if r.CredentialSecret.PasswordKey == "" {
-			r.CredentialSecret.PasswordKey = CredentialPasswordKey
-		}
+		usernameKey := r.CredentialSecret.UsernameKey
+		if usernameKey == "" {
+			usernameKey = CredentialUsernameKey
+		}
+		passwordKey := r.CredentialSecret.PasswordKey
+		if passwordKey == "" {
+			passwordKey = CredentialPasswordKey
+		}
 		log.Info("Retrieving credentials from secret",
 			"name", r.CredentialSecret.Name,
-			"usernameKey", r.CredentialSecret.UsernameKey,
-			"passwordKey", r.CredentialSecret.PasswordKey)
+			"usernameKey", usernameKey,
+			"passwordKey", passwordKey)
@@
-		usernameBytes, ok := secret.Data[r.CredentialSecret.UsernameKey]
+		usernameBytes, ok := secret.Data[usernameKey]
 		if !ok || len(usernameBytes) == 0 {
 			return fmt.Errorf("credential secret %s/%s missing key %q",
-				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.UsernameKey)
+				wandb.Namespace, r.CredentialSecret.Name, usernameKey)
 		}
-		passwordBytes, ok := secret.Data[r.CredentialSecret.PasswordKey]
+		passwordBytes, ok := secret.Data[passwordKey]
 		if !ok || len(passwordBytes) == 0 {
 			return fmt.Errorf("credential secret %s/%s missing key %q",
-				wandb.Namespace, r.CredentialSecret.Name, r.CredentialSecret.PasswordKey)
+				wandb.Namespace, r.CredentialSecret.Name, passwordKey)
 		}

Also applies to: 144-155

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/wandb/spec/charts/oci.go` around lines 126 - 131, In Apply(), avoid
mutating r.CredentialSecret by reading usernameKey and passwordKey into local
variables (e.g., usernameKey := r.CredentialSecret.UsernameKey; if usernameKey
== "" { usernameKey = CredentialUsernameKey } and similarly for passwordKey) and
use those locals when building the chart/secret, leaving
r.CredentialSecret.UsernameKey and .PasswordKey untouched; apply the same
local-variable refactor for the other identical block handling credentials (the
one that currently sets r.CredentialSecret.PasswordKey/UsernameKey) so
Spec.IsEqual comparisons aren’t affected by Apply() side effects.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 124-128: The command chain creates tmpdir with tmpdir=$$(mktemp
-d) but only deletes it at the end of the && chain so failures leave residue;
change the recipe to create the temp dir, register a cleanup (e.g., trap 'rm -rf
"$$tmpdir"' EXIT) immediately after tmpdir creation, then run helm pull/helm
push using $$tmpdir (helm pull $(CHART_NAME) ... -d $$tmpdir and helm push
$$tmpdir/$(CHART_NAME)-$(CHART_VERSION).tgz ...), ensuring the trap guarantees
rm -rf $$tmpdir runs on both success and failure (and optionally remove the trap
on successful exit).

---

Duplicate comments:
In `@pkg/wandb/spec/charts/repo.go`:
- Around line 174-206: The repo-level TLS flag entry.InsecureSkipTLSVerify
should only control index fetches; after you normalize/resolve chartURL you must
compute a separate downloader flag (e.g., downloaderInsecure :=
parsedChartURL.Scheme == "http" or parsedChartURL.Scheme != "https") and pass
that to getter.WithInsecureSkipVerifyTLS instead of reusing
entry.InsecureSkipTLSVerify. Keep entry.InsecureSkipTLSVerify for the index
fetch logic (where you set it now), then parse/normalize chartURL, derive the
downloader flag from the parsed chartURL's scheme, and replace the call to
getter.WithInsecureSkipVerifyTLS(...) so it uses the new downloaderInsecure
value (apply the same change for the other occurrence referenced around lines
259-262).

---

Nitpick comments:
In `@Makefile`:
- Line 101: The CHART_VERSION assignment is brittle because it greps/seds YAML
from $(OCI_MANIFEST); replace that pipeline with a YAML-aware extraction (e.g.,
use yq or a small python one-liner) to read the version key reliably from the
manifest. Update the CHART_VERSION definition to call yq (or python
yaml.safe_load) against $(OCI_MANIFEST) and extract the top-level version field
deterministically (handle quoted/unquoted values and multi-document manifests),
so CHART_VERSION is set from the parsed YAML rather than text-matching.

In `@pkg/wandb/spec/charts/oci_test.go`:
- Around line 188-190: The test sets HELM_CACHE_HOME, HELM_CONFIG_HOME, and
HELM_DATA_HOME to fixed os.TempDir() subpaths which shares state across specs
and parallel nodes; update the test in pkg/wandb/spec/charts/oci_test.go to
create a fresh per-spec base directory using GinkgoT().TempDir() and derive the
three Helm homes from that base (e.g., filepath.Join(base, "cache"), "config",
"data") before calling OCIRelease.pullChart() so each spec gets isolated Helm
cache/config/data directories.

In `@pkg/wandb/spec/charts/oci.go`:
- Around line 126-131: In Apply(), avoid mutating r.CredentialSecret by reading
usernameKey and passwordKey into local variables (e.g., usernameKey :=
r.CredentialSecret.UsernameKey; if usernameKey == "" { usernameKey =
CredentialUsernameKey } and similarly for passwordKey) and use those locals when
building the chart/secret, leaving r.CredentialSecret.UsernameKey and
.PasswordKey untouched; apply the same local-variable refactor for the other
identical block handling credentials (the one that currently sets
r.CredentialSecret.PasswordKey/UsernameKey) so Spec.IsEqual comparisons aren’t
affected by Apply() side effects.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cc8279cc-31ec-494e-8d57-42c520b1df68

📥 Commits

Reviewing files that changed from the base of the PR and between f3abfed and 3fd599f.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (15)

Makefile
Tiltfile
go.mod
hack/testing-manifests/wandb/default.yaml
hack/testing-manifests/wandb/oci.yaml
pkg/helm/chart.go
pkg/wandb/spec/charts/charts.go
pkg/wandb/spec/charts/local.go
pkg/wandb/spec/charts/oci.go
pkg/wandb/spec/charts/oci_test.go
pkg/wandb/spec/charts/repo.go
pkg/wandb/spec/charts/repo_test.go
pkg/wandb/spec/config.go
pkg/wandb/spec/spec.go
pkg/wandb/spec/specfakes/fake_chart.go

Add orbstack to the allowed Kubernetes contexts in README and tilt-settings.sample.json. Minor formatting cleanup in README. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

casey-coreweave

Validated for local OCI paths 👍 Added a commit to change parsing on the OCI tags so we can handle all the formats (like hash refs) and allow optional versions.

Removes stale entries for older grpc-gateway, otelhttp, otel, and x/net/oauth2 versions that were no longer referenced after recent dependency bumps, resolving the Dependency Check CI job. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

zacharyblasczyk and others added 2 commits April 2, 2026 08:44

This was referenced Apr 2, 2026

Dead code: unreachable fallback path in RepoRelease.downloadChart #155

Open

Rollback ignores version parameter #157

Open

InitConfig accesses config.Releases without checking Init error #156

Open

zacharyblasczyk and others added 8 commits April 2, 2026 12:39

chore: add orbstack to Tiltfile allowed contexts

b5740e9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: go mod tidy and update dependencies

c2c5e0b

Update controller-runtime v0.23.1→v0.23.3, validator v10.26.0→v10.30.2, gomega v1.39.0→v1.39.1, and transitive dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cleanup

02f43aa

zacharyblasczyk marked this pull request as ready for review April 8, 2026 17:54

coderabbitai Bot reviewed Apr 8, 2026

View reviewed changes

Comment thread Makefile

Comment thread Makefile Outdated

zacharyblasczyk and others added 3 commits April 8, 2026 13:31

coderabbitai Bot reviewed Apr 8, 2026

View reviewed changes

Comment thread pkg/wandb/spec/charts/oci.go

coderabbitai Bot reviewed Apr 8, 2026

View reviewed changes

Comment thread Makefile

zacharyblasczyk and others added 2 commits April 8, 2026 15:52

chore: add orbstack to docs and sample tilt settings

015cdf5

Add orbstack to the allowed Kubernetes contexts in README and tilt-settings.sample.json. Minor formatting cleanup in README. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(chart): Use helm to validate references for versions and digest refs

4845456

casey-coreweave approved these changes Apr 17, 2026

View reviewed changes

casey-coreweave added 2 commits April 17, 2026 14:58

fix(dependency): Bump helm go module for security fix

5a258a1

fix(dependency): Bump golang.opentelemetry.io packages for security fix

f76ba75

casey-coreweave force-pushed the feat/oci-helm-chart-support branch from bd3014f to f76ba75 Compare April 17, 2026 21:59

Conversation

zacharyblasczyk commented Mar 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Helm v4 decisions

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

w-b-hivemind Bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

HiveMind Sessions

Uh oh!

socket-security Bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

socket-security Bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zacharyblasczyk Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

zacharyblasczyk commented Apr 8, 2026

OCI Auth E2E Testing

Setup

What was tested

Results

Cleanup

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

zacharyblasczyk commented Apr 8, 2026

Uh oh!

coderabbitai Bot commented Apr 8, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

casey-coreweave left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

zacharyblasczyk commented Mar 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 30, 2026 •

edited

Loading

w-b-hivemind Bot commented Mar 30, 2026 •

edited

Loading

socket-security Bot commented Mar 30, 2026 •

edited

Loading

socket-security Bot commented Mar 30, 2026 •

edited

Loading

coderabbitai Bot Apr 8, 2026 •

edited

Loading