Graduate CRDs from v1alpha1 to v1beta1 #4849

Merged
ChrisJBurns merged 8 commits into main from worktree-graduate-crds
Apr 21, 2026

@ChrisJBurns ChrisJBurns commented Apr 15, 2026

Summary

Signal API stability by promoting all ToolHive CRDs from v1alpha1 to v1beta1 — without breaking existing users. Originally scoped as a clean break (delete and recreate all resources), the PR now serves both versions simultaneously so operators can upgrade with zero downtime and migrate manifests at their own pace.

The core insight is that v1alpha1 and v1beta1 are schema-identical: only the version string differs. Rather than duplicate every field across both packages, the new cmd/thv-operator/api/v1alpha1/ package defines only the root resource types (e.g. MCPServer, MCPGroup) as thin wrappers whose Spec and Status fields reference the canonical types from v1beta1 (e.g. Spec v1beta1.MCPServerSpec). Because controller-gen walks field types when building the OpenAPI schema, both versions resolve to the exact same generated schema without any duplication of the actual model.

The "migration" is then orchestrated entirely through two flags on each CRD version entry: served and storage. The new CRD manifest lists both versions as served: true, so existing v1alpha1 clients keep working, but marks only v1beta1 as storage: true, meaning new writes go into etcd at v1beta1. v1alpha1 additionally carries deprecated: true with a deprecationWarning, which is what makes kubectl print the migration nudge on every access.

Combined with conversion.strategy: None (which is safe precisely because the schemas are identical: Kubernetes just swaps the apiVersion string when serving an object at a different version than it was stored at), users can upgrade the CRD chart without deleting anything, keep reading and writing at v1alpha1 indefinitely, and migrate to v1beta1 at their own pace by simply re-applying their manifests. Once all objects have been re-stored at v1beta1 (tracked via the CRD's status.storedVersions), a future release can drop the v1alpha1 entry from the versions list and remove the wrapper package, completing the graduation with zero downtime at any point in the process.
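For readers unfamiliar with conversion.strategy: None, the behaviour described above can be modelled in a few lines. This is a stdlib-only sketch, not API server code: the helper name serveAt and the sample spec field are hypothetical, and objects are represented as plain maps.

```go
package main

import "fmt"

// serveAt models what conversion.strategy: None does when an object is
// requested at a different version than it was stored at: only the
// apiVersion string is rewritten and every other field passes through.
func serveAt(stored map[string]any, group, version string) map[string]any {
	out := make(map[string]any, len(stored))
	for k, v := range stored {
		out[k] = v // shallow copy; nested values are shared, not converted
	}
	out["apiVersion"] = group + "/" + version
	return out
}

func main() {
	stored := map[string]any{
		"apiVersion": "toolhive.stacklok.dev/v1beta1",
		"kind":       "MCPGroup",
		"spec":       map[string]any{"description": "demo"}, // hypothetical field
	}
	served := serveAt(stored, "toolhive.stacklok.dev", "v1alpha1")
	fmt.Println(served["apiVersion"], served["kind"])
}
```

Because nothing except apiVersion is touched, this strategy is only sound while the two schemas stay identical, which is the invariant the wrapper package is meant to guarantee.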

Closes #2556

What changed across commits
  1. Mechanical rename (earlier commits on this branch): cmd/thv-operator/api/v1alpha1/ → v1beta1/, all Go import paths and aliases, YAML apiVersion strings in examples/chainsaw fixtures/deploy manifests, hardcoded version strings in pkg/export/k8s.go, and every generated artifact (deepcopy, CRD manifests, webhook manifests, Helm chart CRDs, swagger docs, CRD API reference docs).
  2. Regenerate stale audit deepcopy — drive-by: pkg/audit.Config.DetectApplicationErrors was added without re-running controller-gen; this commit catches it up.
  3. Add v1alpha1 wrapper types — new minimal package: 12 root types + 12 list types (~380 hand-written lines), each reusing v1beta1 Spec/Status. Each root type carries +kubebuilder:deprecatedversion:warning so kubectl prints a migration hint. main.go registers both schemes.
  4. Regenerate CRDs to serve both versions — output of task operator-manifests. Every CRD now lists v1alpha1 (served, non-storage, deprecated) and v1beta1 (served, storage). No conversion webhook (defaults to strategy: None), which is safe because the schemas are identical.
  5. Remove obsolete pre-upgrade check — the Helm hook that blocked upgrades when v1alpha1 resources were present is no longer needed, since those resources now survive the upgrade untouched.
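The wrapper idea from step 3 can be pictured with a simplified, stdlib-only model. This is a sketch under assumptions: the real types live in separate packages, share the name MCPServer, and embed Kubernetes metadata; the Image and Port fields and the V1alpha1/V1beta1 type-name prefixes here are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MCPServerSpec stands in for the canonical v1beta1 spec type.
// The Image and Port fields are hypothetical, for illustration only.
type MCPServerSpec struct {
	Image string `json:"image"`
	Port  int    `json:"port"`
}

// V1beta1MCPServer models the canonical root type.
type V1beta1MCPServer struct {
	APIVersion string        `json:"apiVersion"`
	Kind       string        `json:"kind"`
	Spec       MCPServerSpec `json:"spec"`
}

// V1alpha1MCPServer models the thin wrapper: it is its own root type, but
// its Spec field reuses the v1beta1 spec type instead of redefining it.
type V1alpha1MCPServer struct {
	APIVersion string        `json:"apiVersion"`
	Kind       string        `json:"kind"`
	Spec       MCPServerSpec `json:"spec"`
}

// specJSON serialises only the spec so the two versions can be compared.
func specJSON(spec MCPServerSpec) string {
	b, err := json.Marshal(spec)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	spec := MCPServerSpec{Image: "example/image:tag", Port: 8080}
	oldObj := V1alpha1MCPServer{APIVersion: "toolhive.stacklok.dev/v1alpha1", Kind: "MCPServer", Spec: spec}
	newObj := V1beta1MCPServer{APIVersion: "toolhive.stacklok.dev/v1beta1", Kind: "MCPServer", Spec: spec}
	// Both roots share one spec type, so their spec payloads cannot drift apart.
	fmt.Println(specJSON(oldObj.Spec) == specJSON(newObj.Spec))
}
```

Since a field added to the shared spec type appears in both versions at once, the two schemas stay in lockstep without hand-maintained duplicates.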

Type of change

  • Other (describe): CRD graduation from v1alpha1 to v1beta1 via multi-version serving

Test plan

  • Unit tests (task test)
  • Build verification (task build — operator binary + all Go packages compile cleanly after the rebase on main)
  • task operator-generate and task operator-manifests re-run cleanly and produce the multi-version CRDs
  • End-to-end upgrade test (local): installed v0.21.0 CRDs + operator from the OCI registry, created one CR of each of the 12 CRD kinds, upgraded the CRDs chart to this branch, rolled out the new operator via task operator-deploy-local, verified every Deployment UID was unchanged (zero-downtime), then re-applied the resources at v1beta1 and confirmed still no restarts and status.storedVersions advanced to include v1beta1. 16 of 16 assertions passed.
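The zero-downtime assertion in the end-to-end test boils down to comparing Deployment UIDs before and after the upgrade. A minimal sketch, assuming the UIDs have already been collected into maps keyed by Deployment name; the helper name changedUIDs and the sample data are hypothetical and not taken from the actual test script.

```go
package main

import (
	"fmt"
	"sort"
)

// changedUIDs returns the names of Deployments whose UID differs between the
// two snapshots or that are missing from the second one. An empty result
// means no Deployment was deleted and recreated during the upgrade.
func changedUIDs(before, after map[string]string) []string {
	var changed []string
	for name, uid := range before {
		if after[name] != uid {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed) // deterministic output for reporting
	return changed
}

func main() {
	before := map[string]string{"mcpserver-demo": "uid-1", "proxy-demo": "uid-2"}
	after := map[string]string{"mcpserver-demo": "uid-1", "proxy-demo": "uid-2"}
	fmt.Println(len(changedUIDs(before, after)) == 0)
}
```

A changed UID indicates a delete-and-recreate rather than an in-place update, which is why UID stability is a reasonable proxy for "the workload was not torn down".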

Does this introduce a user-facing change?

Yes, but non-breaking. Existing v1alpha1 resources continue to work after the upgrade; no deletion or recreation is required. Users will see a deprecation warning from kubectl on every access to a v1alpha1 resource, and should migrate their manifests to apiVersion: toolhive.stacklok.dev/v1beta1 at their convenience. v1alpha1 will be removed in a future release once all stored objects have migrated.
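The removal condition mentioned above ("once all stored objects have migrated") can be expressed as a simple gate on the CRD's status.storedVersions list. This is an illustrative sketch only: the helper name canDropVersion is hypothetical, and it assumes the list has already been read from the cluster and accurately reflects what is in etcd.

```go
package main

import "fmt"

// canDropVersion reports whether a version may be removed from a CRD's
// versions list, judged solely by the contents of status.storedVersions.
func canDropVersion(storedVersions []string, version string) bool {
	for _, v := range storedVersions {
		if v == version {
			return false // objects may still be persisted at this version
		}
	}
	return true
}

func main() {
	fmt.Println(canDropVersion([]string{"v1alpha1", "v1beta1"}, "v1alpha1"))
	fmt.Println(canDropVersion([]string{"v1beta1"}, "v1alpha1"))
}
```

The check covers only the gate itself; how and when v1alpha1 leaves storedVersions is outside what this sketch models.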

Special notes for reviewers

This PR is large by line count but most of the volume is either generated output (controller-gen CRD manifests, zz_generated.deepcopy.go) or the original mechanical rename. Worth reviewing carefully:

  • cmd/thv-operator/api/v1alpha1/types.go — the only non-trivial hand-written file. Verify the Spec/Status field references point to v1beta1 and that the +kubebuilder:deprecatedversion and other markers match v1beta1's (minus +kubebuilder:storageversion, which must be on exactly one version).
  • cmd/thv-operator/main.go — confirms both schemes are registered.
  • One sample CRD (e.g. deploy/charts/operator-crds/files/crds/toolhive.stacklok.dev_mcpgroups.yaml) — confirm the structure: both versions listed, v1beta1 marked storage: true, v1alpha1 marked deprecated: true with a warning and storage: false, and the openAPIV3Schema sections under both versions are identical.

The remaining 23 CRD files follow the same pattern.
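The per-version checks in the reviewer list above can be written down as a small validator. This is a sketch, not a tool from this PR: the crdVersion struct mirrors only the fields the checklist mentions, the function name checkVersions and the warning text are hypothetical, and comparing the two openAPIV3Schema sections is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// crdVersion mirrors the per-version CRD fields named in the checklist.
type crdVersion struct {
	Name               string
	Served             bool
	Storage            bool
	Deprecated         bool
	DeprecationWarning string
}

// checkVersions verifies the expected multi-version layout: exactly one
// storage version, v1beta1 served and storage, v1alpha1 served,
// non-storage, and deprecated with a warning.
func checkVersions(versions []crdVersion) error {
	byName := make(map[string]crdVersion, len(versions))
	storageCount := 0
	for _, v := range versions {
		byName[v.Name] = v
		if v.Storage {
			storageCount++
		}
	}
	if storageCount != 1 {
		return fmt.Errorf("expected exactly one storage version, found %d", storageCount)
	}
	beta, ok := byName["v1beta1"]
	if !ok || !beta.Served || !beta.Storage {
		return errors.New("v1beta1 must be served and be the storage version")
	}
	alpha, ok := byName["v1alpha1"]
	if !ok || !alpha.Served || alpha.Storage {
		return errors.New("v1alpha1 must be served and must not be the storage version")
	}
	if !alpha.Deprecated || alpha.DeprecationWarning == "" {
		return errors.New("v1alpha1 must be deprecated with a non-empty warning")
	}
	return nil
}

func main() {
	versions := []crdVersion{
		// The warning text below is a placeholder, not the real message.
		{Name: "v1alpha1", Served: true, Deprecated: true, DeprecationWarning: "migrate to v1beta1"},
		{Name: "v1beta1", Served: true, Storage: true},
	}
	fmt.Println(checkVersions(versions) == nil)
}
```

Running the same check over every generated CRD file would be one way to confirm that the remaining files follow the pattern without reading each by hand.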

Large PR Justification

  • no way to split this PR out

Generated with Claude Code

@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Apr 15, 2026

@github-actions github-actions Bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 61.32512% with 251 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.56%. Comparing base (117be04) to head (327fa32).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...perator/controllers/virtualmcpserver_controller.go | 45.37% | 65 Missing ⚠️ |
| ...d/thv-operator/controllers/mcpserver_controller.go | 63.97% | 48 Missing and 1 partial ⚠️ |
| ...-operator/controllers/mcpremoteproxy_controller.go | 65.90% | 45 Missing ⚠️ |
| ...rator/controllers/mcptelemetryconfig_controller.go | 36.00% | 15 Missing and 1 partial ⚠️ |
| ...v-operator/controllers/mcpoidcconfig_controller.go | 42.30% | 15 Missing ⚠️ |
| ...-operator/controllers/mcpserverentry_controller.go | 72.72% | 12 Missing ⚠️ |
| ...-operator/controllers/mcpserver_telemetryconfig.go | 21.42% | 11 Missing ⚠️ |
| ...or/controllers/mcpexternalauthconfig_controller.go | 66.66% | 10 Missing ⚠️ |
| ...md/thv-operator/controllers/mcpgroup_controller.go | 72.72% | 9 Missing ⚠️ |
| .../thv-operator/controllers/toolconfig_controller.go | 58.82% | 7 Missing ⚠️ |

... and 4 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4849      +/-   ##
==========================================
+ Coverage   69.50%   69.56%   +0.06%     
==========================================
  Files         551      552       +1     
  Lines       55933    55949      +16     
==========================================
+ Hits        38874    38921      +47     
+ Misses      14066    14035      -31     
  Partials     2993     2993              


@ChrisJBurns ChrisJBurns left a comment

Review: Graduate CRDs from v1alpha1 to v1beta1

Verified:

  • Core GroupVersion constant correctly changed to v1beta1
  • All Go imports and aliases consistently updated (mcpv1alpha1 → mcpv1beta1)
  • Hardcoded apiVersion string in pkg/export/k8s.go updated
  • Webhook manifests (both copies) correctly reference v1beta1 paths and apiVersions
  • Helm chart CRDs, swagger docs, and skill docs updated
  • Old v1alpha1/ directory fully removed — zero straggling v1alpha1 references in production code
  • All 377 files show symmetric add/delete counts, confirming purely mechanical changes

Two items to address — both in the design doc, not in production code.

Comment thread design-crd-graduation-v1beta1.md Outdated
Comment thread design-crd-graduation-v1beta1.md Outdated
Comment thread cmd/thv-operator/controllers/mcpserver_controller.go
Comment thread cmd/thv-operator/api/v1beta1/groupversion_info.go
Comment thread cmd/thv-operator/api/v1beta1/groupversion_info.go
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 15, 2026
@github-actions github-actions Bot dismissed their stale review April 15, 2026 13:51

Large PR justification has been provided. Thank you!

@github-actions

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 15, 2026
@ChrisJBurns ChrisJBurns force-pushed the worktree-graduate-crds branch from 5fc1e31 to ab532b0 Compare April 15, 2026 21:41
@github-actions github-actions Bot removed the size/XL Extra large PR: 1000+ lines changed label Apr 15, 2026
@ChrisJBurns ChrisJBurns force-pushed the worktree-graduate-crds branch from f05ade1 to f17269c Compare April 17, 2026 17:09
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 17, 2026
@ChrisJBurns ChrisJBurns force-pushed the worktree-graduate-crds branch from f17269c to 9eca7a0 Compare April 17, 2026 17:37
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 17, 2026
Signal API stability by promoting all ToolHive CRDs to v1beta1 while
preserving backwards compatibility — the CRDs serve both versions
simultaneously so existing v1alpha1 resources survive the upgrade
untouched and users migrate manifests at their own pace.

v1alpha1 and v1beta1 are schema-identical. The new
cmd/thv-operator/api/v1alpha1/ package defines only the root resource
types (MCPServer, MCPGroup, ...) as thin wrappers whose Spec and Status
fields reference the canonical types from v1beta1. controller-gen walks
field types when building the OpenAPI schema, so both versions resolve
to the same schema without any duplication of the actual model — I
verified empirically that all 12 CRDs produce structurally identical
schemas (modulo the legitimately-different root descriptions).

The upgrade is orchestrated through the `served` and `storage` flags
on each version entry: the new CRDs list both versions as served, mark
v1beta1 as storage, and mark v1alpha1 as deprecated with a warning.
No conversion webhook is configured, which defaults to strategy: None
— safe because the schemas are identical. Kubernetes swaps the
apiVersion string when serving an object at a different version than
it was stored at. Users can upgrade the CRD chart without deleting
anything, keep reading/writing at v1alpha1 indefinitely, and migrate
to v1beta1 by re-applying manifests. Once status.storedVersions no
longer contains v1alpha1, a future release can drop the v1alpha1
entry and remove the wrapper package.

What changed:

- Mechanical rename from v1alpha1 to v1beta1 across all Go imports,
  YAML apiVersion strings, Helm deploy manifests, chainsaw fixtures,
  examples, docs, and generated artifacts.
- New cmd/thv-operator/api/v1alpha1/ package: 12 root types + 12 list
  types as thin wrappers over v1beta1 Spec/Status, plus scheme
  registration. Each root type carries a +kubebuilder:deprecatedversion
  warning so kubectl prints a migration hint on every access.
- Register both schemes in cmd/thv-operator/main.go.
- Regenerated CRD manifests list both versions (v1alpha1 deprecated
  and non-storage, v1beta1 storage) with conversion.strategy: None.
- Removed the obsolete pre-upgrade Helm hook that blocked upgrades
  when v1alpha1 resources existed — no longer needed since those
  resources now survive the upgrade.
- Drive-by: regenerate pkg/audit/zz_generated.deepcopy.go which had
  drifted after DetectApplicationErrors was added to Config without
  re-running controller-gen.

Closes #2556
@ChrisJBurns ChrisJBurns force-pushed the worktree-graduate-crds branch from 6e60a47 to 8280aab Compare April 20, 2026 11:18
@github-actions

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

JAORMX previously approved these changes Apr 20, 2026
Comment thread cmd/thv-operator/api/v1alpha1/types.go
Comment thread cmd/thv-operator/api/v1beta1/mcpserver_types.go
jhrozek previously approved these changes Apr 20, 2026

@jhrozek jhrozek left a comment

I haven't actually tested the migration (I know you asked for testers), but I did spend a fair bit of time reviewing the PR in an interactive session with CC, mostly because I wasn't familiar with this serve-old-store-new approach. It seems clever and CC called it "a textbook implementation". I really couldn't poke any holes in it. I like how the diff, if you exclude the mechanical replaces, is actually minimal and you can reason about the important things (I did review the annotations myself, old skool).

I'm going to approve the patch, I guess the longer we hold off the harder it gets for you to merge. Let's chat if you want/need more testing.

Two test files that came in via the merge from main still reference
the v1alpha1 Go package, which no longer exists on this branch. The
operator binaries compile fine, but CI lint/test fail on the missing
mcpv1alpha1 symbol in the test package.

- cmd/thv-operator/controllers/mcpremoteproxy_deployment_test.go
- cmd/thv-operator/controllers/mcpserver_resource_overrides_test.go

Applied the same mcpv1alpha1 → mcpv1beta1 rename that the rest of
the branch uses. No behavioural change.
jhrozek previously approved these changes Apr 21, 2026
rdimitrov previously approved these changes Apr 21, 2026
# Conflicts:
#	cmd/thv-operator/pkg/controllerutil/authserver_test.go