
Repo Radius: externalize control-plane and Terraform state (rad startup / rad shutdown)#12214

Merged: sylvainsf merged 10 commits into main from repo-radius-storage on Jun 25, 2026

@sylvainsf

Contributor

Repo Radius: externalize control-plane and Terraform state (rad startup / rad shutdown)

Description

Implements Investment 2 of the Repo Radius feature spec, externalization of the Radius data
store, scoped to the state-storage aspects only (control-plane PostgreSQL state and Terraform
recipe state). It adds two kind-agnostic commands, rad startup and rad shutdown, that back up
and restore all durable Radius state across an ephemeral control plane, plus the Helm and
Terraform-state plumbing they depend on.

Design note: eng/design-notes/2026-06-repo-radius-state-storage.md.

There is no dedicated workspace kind. The commands operate on the current workspace's
Kubernetes context like any other command and do not create or delete clusters or install Radius —
cluster lifecycle is the caller's responsibility.

What's included

The change is organized as three logical commits:

  1. PostgreSQL enablement (closes the chart gaps for database.enabled=true)

    • UCP, Applications RP, and Dynamic RP configmaps/deployments now switch to the postgresql
      provider and inject POSTGRES_PASSWORD when database.enabled=true (previously hardcoded to
      apiserver).
    • Adds an init-db ConfigMap (option 3 from #8398, "FEATURE: Postgres DB initialization") that
      creates the per-RP databases, users, and tables before the servers start.
    • Fixes the POSTGRES_DB secret value and pins the postgres image to 16-alpine.
    • Fixes the databaseProvider URL env-var substitution in factory.go.
    • Adds helm-unittest coverage and a unit test for the env-var helper.
  2. Terraform recipe state backup/restore (pkg/cli/tfstate)

    • Terraform recipes store state in Kubernetes Secrets (tfstate=true), not in PostgreSQL, so
      that state is lost on teardown. This package exports and restores those Secrets, including the
      chunked tfstate-{workspace}-{suffix}-{index} Secrets for large state.
  3. rad startup / rad shutdown + end-to-end lifecycle test

    • pkg/cli/pgbackup: control-plane PostgreSQL backup/restore via kubectl exec pg_dump/psql.
    • pkg/cli/gitstate: persists the state directory to a radius-state git orphan branch in an
      isolated worktree; the backup push fails loudly when a remote is configured (a failed push
      would otherwise be silent data loss) and tolerates the no-remote local/test case.
    • rad shutdown backs up both stores then commits and pushes; rad startup waits for the
      database then restores both stores.
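
The push rule described for pkg/cli/gitstate can be sketched as a small decision function. This is a minimal illustration, not the package's code: `pushStateSketch`, `hasRemote`, and the `push` callback are hypothetical names, and the real implementation presumably shells out to git.

```go
package main

import (
	"errors"
	"fmt"
)

// errPushRejected is a stand-in error used only by this sketch.
var errPushRejected = errors.New("remote rejected push")

// pushStateSketch applies the rule from the description: with no remote
// configured (local or test use) the commit simply stays on the local
// orphan branch, but when a remote exists a failed push is returned as an
// error so the backup cannot be lost silently.
func pushStateSketch(hasRemote bool, push func() error) error {
	if !hasRemote {
		return nil // nothing to push to; tolerated case
	}
	if err := push(); err != nil {
		return fmt.Errorf("pushing state branch failed, backup is not durable: %w", err)
	}
	return nil
}

func main() {
	failingPush := func() error { return errPushRejected }

	// No remote: the failing push is never attempted.
	fmt.Println(pushStateSketch(false, failingPush))

	// Remote configured: the failure surfaces to the caller.
	fmt.Println(pushStateSketch(true, failingPush))
}
```

Keeping the policy separate from the git invocation is only one way to make the two cases unit-testable without a network.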

Testing

Unit tests cover the chart rendering, the Terraform-state round-trip (fake Kubernetes client), the
git orphan-branch worktree behaviour (real temporary repos), and the command runners
(hand-written fakes). All pass, along with the existing 85 Helm chart tests.

End-to-end test dependency (please read)

The end-to-end lifecycle test lives at test/functional-portable/statestore and exercises the
full path that this work exists to protect: install → deploy a Terraform-backed resource →
rad shutdown → tear down → reinstall → rad startup → deploy an update to the same resource.
The update is the path that fails when Terraform state is lost.

This test depends on the separate Repo Radius workflow code (in flight) that creates the
ephemeral cluster, installs Radius, and runs the deploy. Because rad startup / rad shutdown
are intentionally kind-agnostic and do not manage cluster lifecycle, the test needs that workflow
to stand up the environment. Until it lands, the test drives the install/uninstall itself and is
gated behind the RADIUS_STATE_E2E environment variable, so it does not run in the normal
functional suite. Once the shared cluster-create + deploy workflow is merged, the test's
install/uninstall helpers should be re-pointed at that code rather than duplicating it.
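
The environment-variable gate could look roughly like the following. It is a sketch only: the helper name `stateE2EEnabled` is hypothetical, and treating an empty value as "not set" is an assumption rather than something the description states.

```go
package main

import (
	"fmt"
	"os"
)

// stateE2EEnvVar is the gate named in the PR description.
const stateE2EEnvVar = "RADIUS_STATE_E2E"

// stateE2EEnabled reports whether the destructive lifecycle test should run.
// The lookup function is injected so the check can be exercised without
// touching the real environment; os.LookupEnv has the matching signature.
func stateE2EEnabled(lookup func(string) (string, bool)) bool {
	value, ok := lookup(stateE2EEnvVar)
	return ok && value != ""
}

func main() {
	// Inside a Go test the equivalent guard would call t.Skip instead of returning.
	if !stateE2EEnabled(os.LookupEnv) {
		fmt.Println("skipping lifecycle test: set " + stateE2EEnvVar + " to run it")
		return
	}
	fmt.Println("running lifecycle test")
}
```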

Related

Type of change

This pull request adds new features (state externalization commands) for Radius.

@sylvainsf sylvainsf requested a review from a team as a code owner June 23, 2026 04:23
Copilot AI review requested due to automatic review settings June 23, 2026 04:23
@github-actions

github-actions Bot commented Jun 23, 2026


Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Copilot AI left a comment

Contributor


Pull request overview

Adds Repo Radius “state externalization” support by enabling PostgreSQL-backed control-plane state and introducing rad shutdown / rad startup to back up/restore both PostgreSQL databases and Terraform recipe state (Kubernetes Secrets) via a git orphan branch.

Changes:

  • Enable database.enabled=true end-to-end in the Helm chart (PostgreSQL provider wiring, init-db ConfigMap, image pinning, chart tests).
  • Add Terraform-state Secret backup/restore (pkg/cli/tfstate) and PostgreSQL dump/restore (pkg/cli/pgbackup), persisted through a git orphan-branch worktree (pkg/cli/gitstate).
  • Add rad startup / rad shutdown commands plus unit tests and an opt-in destructive functional lifecycle test.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.

Summary per file:

| File | Description |
| --- | --- |
| test/functional-portable/statestore/noncloud/statestore_lifecycle_test.go | Adds opt-in destructive E2E lifecycle test for shutdown/startup preserving Terraform state across reinstall. |
| pkg/components/database/databaseprovider/storageprovider_test.go | Adds unit tests for ${VAR} env substitution in PostgreSQL connection URLs. |
| pkg/components/database/databaseprovider/factory.go | Implements ${VAR} expansion for PostgreSQL URL config via environment variables. |
| pkg/cli/tfstate/tfstate.go | New package to back up/restore Terraform Kubernetes-backend state Secrets to/from the state directory. |
| pkg/cli/tfstate/tfstate_test.go | Unit tests for tfstate backup/restore (fake clientset). |
| pkg/cli/pgbackup/pgbackup.go | New helpers to pg_dump / psql the control-plane DBs via kubectl exec, plus readiness wait. |
| pkg/cli/gitstate/gitstate.go | New orphan-branch worktree manager for persisting state outside the main working tree. |
| pkg/cli/gitstate/gitstate_test.go | Unit tests for orphan-branch worktree and push behavior using real temp git repos. |
| pkg/cli/cmd/startup/stateclient.go | Startup command wrapper interface over pg/tf restore operations (testable). |
| pkg/cli/cmd/startup/startup.go | Implements rad startup to open worktree, wait for DB, restore DB + Terraform state. |
| pkg/cli/cmd/startup/startup_test.go | Unit tests ensuring restore order and failure short-circuiting for rad startup. |
| pkg/cli/cmd/shutdown/stateclient.go | Shutdown command wrapper interface over pg/tf backup operations (testable). |
| pkg/cli/cmd/shutdown/shutdown.go | Implements rad shutdown to back up DB + Terraform state then commit/push to orphan branch. |
| pkg/cli/cmd/shutdown/shutdown_test.go | Unit tests ensuring both stores are backed up and commit/push happens (or fails correctly). |
| eng/design-notes/2026-06-repo-radius-state-storage.md | Adds technical design note for Repo Radius state storage approach and decisions. |
| deploy/Chart/values.yaml | Pins PostgreSQL image tag default to 16-alpine. |
| deploy/Chart/tests/database_test.yaml | Adds helm-unittest coverage for database-enabled/disabled rendering and wiring. |
| deploy/Chart/templates/ucp/deployment.yaml | Injects POSTGRES_PASSWORD env var into UCP when database.enabled=true. |
| deploy/Chart/templates/ucp/configmaps.yaml | Switches UCP database provider to PostgreSQL when enabled (URL uses ${POSTGRES_PASSWORD}). |
| deploy/Chart/templates/rp/deployment.yaml | Injects POSTGRES_PASSWORD into applications-rp when enabled. |
| deploy/Chart/templates/rp/configmaps.yaml | Switches applications-rp database provider to PostgreSQL when enabled. |
| deploy/Chart/templates/dynamic-rp/deployment.yaml | Injects POSTGRES_PASSWORD into dynamic-rp when enabled. |
| deploy/Chart/templates/dynamic-rp/configmaps.yaml | Switches dynamic-rp database provider to PostgreSQL when enabled. |
| deploy/Chart/templates/database/statefulset.yaml | Mounts init-db scripts into the Postgres StatefulSet. |
| deploy/Chart/templates/database/configmaps.yaml | Fixes POSTGRES_DB secret value to literal "radius". |
| deploy/Chart/templates/database/configmap-initdb.yaml | Adds init-db script ConfigMap to create per-RP DBs/users and the resources table. |
| cmd/rad/cmd/root.go | Wires rad startup and rad shutdown into the CLI root command. |

Comment thread pkg/cli/gitstate/gitstate.go Outdated
Comment thread pkg/cli/gitstate/gitstate.go
Comment thread pkg/components/database/databaseprovider/storageprovider_test.go
Comment thread test/functional-portable/statestore/noncloud/statestore_lifecycle_test.go Outdated
Comment thread eng/design-notes/2026-06-repo-radius-state-storage.md Outdated
@github-actions

github-actions Bot commented Jun 23, 2026


Unit Tests

    2 files  ± 0    450 suites  +8   7m 39s ⏱️ +10s
5 586 tests +44  5 584 ✅ +44  2 💤 ±0  0 ❌ ±0 
6 783 runs  +61  6 781 ✅ +61  2 💤 ±0  0 ❌ ±0 

Results for commit 5e2ee54. ± Comparison against base commit 385f38e.

♻️ This comment has been updated with latest results.

@sylvainsf sylvainsf force-pushed the repo-radius-storage branch from 40a7aaf to 83c7893 Compare June 23, 2026 04:37
@codecov

codecov Bot commented Jun 23, 2026


Codecov Report

❌ Patch coverage is 56.03774% with 233 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.84%. Comparing base (385f38e) to head (5e2ee54).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/cli/pgbackup/pgbackup.go | 6.00% | 94 Missing ⚠️ |
| pkg/cli/tfstate/tfstate.go | 66.29% | 18 Missing and 12 partials ⚠️ |
| pkg/cli/gitstate/gitstate.go | 70.10% | 16 Missing and 13 partials ⚠️ |
| pkg/cli/cmd/startup/startup.go | 72.50% | 15 Missing and 7 partials ⚠️ |
| pkg/cli/controlplane/controlplane.go | 67.74% | 13 Missing and 7 partials ⚠️ |
| pkg/cli/cmd/shutdown/shutdown.go | 72.58% | 13 Missing and 4 partials ⚠️ |
| pkg/cli/cmd/startup/stateclient.go | 15.38% | 11 Missing ⚠️ |
| pkg/cli/cmd/shutdown/stateclient.go | 22.22% | 7 Missing ⚠️ |
| ...kg/components/database/databaseprovider/factory.go | 78.57% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12214      +/-   ##
==========================================
+ Coverage   52.81%   52.84%   +0.03%     
==========================================
  Files         743      751       +8     
  Lines       47788    48313     +525     
==========================================
+ Hits        25238    25532     +294     
- Misses      20197    20385     +188     
- Partials     2353     2396      +43     


@brooke-hamilton brooke-hamilton self-requested a review June 23, 2026 20:59
@DariuszPorowski DariuszPorowski disabled auto-merge June 23, 2026 22:19

@brooke-hamilton brooke-hamilton left a comment

Member


Review: externalize control-plane & Terraform state (rad startup / rad shutdown)

I reviewed the code statically, ran the unit tests, reproduced the failures against a real Postgres container, and ran a full local end-to-end install on kind with database.enabled=true. The new core Go packages (gitstate, tfstate, controlplane) are well-structured and the factory.go env-var fix is correct — but the database.enabled=true startup path does not work out of the box. There are two independent blocking bugs, both reproduced end-to-end, plus some non-blocking notes and test-coverage gaps.

The two blocking items also have inline comments anchored to the exact lines (deploy/Chart/values.yaml and deploy/Chart/templates/database/configmap-initdb.yaml).


🔴 Blocking

1. database.tag: 16-alpine points at a mirror image that does not exist

deploy/Chart/values.yaml:200-201 changes tag: latest → 16-alpine, resolving via the radius.image helper to ghcr.io/radius-project/mirror/postgres:16-alpine, which is not in the mirror. database-0 sits in ImagePullBackOff and the control plane never starts.

EXISTS     ghcr.io/radius-project/mirror/postgres:latest      <- pre-PR value
NOT FOUND  ghcr.io/radius-project/mirror/postgres:16-alpine   <- this PR's value
NOT FOUND  ghcr.io/radius-project/mirror/postgres:16
NOT FOUND  ghcr.io/radius-project/mirror/postgres:16.4-alpine
EXISTS     docker.io/library/postgres:16-alpine               <- real upstream

The upstream tag exists on Docker Hub, but the radius mirror was never populated with it. This breaks database.enabled=true in CI and every default-registry environment. I could only get past it with --set database.image=docker.io/library/postgres.

Fix (pick one): populate the mirror with postgres:16-alpine before merge, revert to a tag that exists in the mirror, or point image/tag at an upstream-resolvable reference. Either way, add CI coverage for database.enabled=true so this can't regress silently.

2. permission denied for table resources — init-db never grants the per-RP users

deploy/Chart/templates/database/configmap-initdb.yaml:33-48 creates the resources table while connected as $POSTGRES_USER (superuser radius), but each RP connects as its own per-RP user (ucp, applications_rp, dynamic_rp) and is never granted privileges. After overriding the image so Postgres started, init-db ran cleanly and created the per-RP DBs/tables, then UCP crashed on startup:

Service api terminated with error: ERROR: permission denied for table resources (SQLSTATE 42501)

followed by a nil-pointer panic on the failed-startup shutdown path. applications-rp/dynamic-rp hit the same wall as soon as they take traffic.

Fix: in the table-creation loop, grant the per-RP user (precedent already exists in build/scripts/start-radius.sh:247-248):

GRANT ALL PRIVILEGES ON TABLE resources TO "$RESOURCE_PROVIDER";
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO "$RESOURCE_PROVIDER";

or create the table while connected as the per-RP user instead of the superuser.


🟡 Non-blocking

  • expandEnvURL silently expands unset vars to empty string (pkg/components/database/databaseprovider/factory.go:88-98). os.Getenv returns "" for a missing var, so a typo'd/unset variable yields a malformed connection string and a confusing downstream error rather than a clear "required env var X not set". Consider os.LookupEnv + fail-fast.
  • init-db schema duplicates deploy/init-db/db.sql.txt. The resources DDL now lives in two places; they will drift. Source it from one place.
  • replicasOf coerces 0 → 1. An explicit replicas: 0 (intentional scale-to-zero) is silently overridden to 1. Distinguish "unset" from "explicitly zero".
  • Postgres password is regenerated on every chart apply, which will break an already-initialized data directory whose users were created with the previous password. Persist/Secret-pin it.
  • rad startup/rad shutdown require a git repo but the error path when none is present could be friendlier.
  • startup.Run defers a second ScaleUp that can double-scale on the success path. Worth a second look.
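
The os.LookupEnv + fail-fast suggestion in the first note could be sketched as below. This is not the code in factory.go: `expandEnvURLSketch`, the `${VAR}` regular expression, and the example connection string are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// envRef matches ${VAR} references. The exact syntax factory.go accepts is
// an assumption of this sketch.
var envRef = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// expandEnvURLSketch replaces every ${VAR} in url using lookup and returns
// an error naming each unset variable instead of expanding it to "".
// In production, lookup would be os.LookupEnv, which has this signature.
func expandEnvURLSketch(url string, lookup func(string) (string, bool)) (string, error) {
	missing := map[string]struct{}{}
	expanded := envRef.ReplaceAllStringFunc(url, func(ref string) string {
		name := envRef.FindStringSubmatch(ref)[1]
		value, ok := lookup(name)
		if !ok {
			missing[name] = struct{}{}
			return ref
		}
		return value
	})
	if len(missing) > 0 {
		names := make([]string, 0, len(missing))
		for name := range missing {
			names = append(names, name)
		}
		sort.Strings(names) // stable error text regardless of map order
		return "", fmt.Errorf("required environment variables not set: %s", strings.Join(names, ", "))
	}
	return expanded, nil
}

func main() {
	env := map[string]string{"POSTGRES_PASSWORD": "s3cret"}
	lookup := func(key string) (string, bool) { value, ok := env[key]; return value, ok }

	// Illustrative URL only; the chart's real connection string may differ.
	fmt.Println(expandEnvURLSketch("postgresql://ucp:${POSTGRES_PASSWORD}@database:5432/ucp", lookup))
	fmt.Println(expandEnvURLSketch("postgresql://ucp:${POSTGRES_PASWORD}@database:5432/ucp", lookup))
}
```

With a plain os.Getenv, the second call (note the misspelled variable) would yield a URL with an empty password; here it produces an error that names the variable.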

🧪 Test coverage

Core packages are well covered (gitstate ~80%, tfstate ~79%, controlplane ~78%). Gaps:

  • pkg/cli/pgbackup has 0% coverage — no test file at all. HasBackup is pure file logic and trivially unit-testable; please add pgbackup_test.go.
  • Validate() is 0% in both pkg/cli/cmd/startup and pkg/cli/cmd/shutdown. 51 other cmd packages exercise Validate() via the shared radcli.SharedCommandValidation harness — please follow that pattern. (Run() is otherwise well covered via fakes.)

How I validated

Static review of all 30 changed files → go build/go test on changed packages (pass) → standalone postgres:16-alpine container to reproduce the GRANT failure → full kind install with custom-built images and database.enabled=true to confirm both blockers end-to-end. Both reproduce from a clean install. Happy to re-validate once they're addressed.


Comment thread deploy/Chart/values.yaml
Comment thread deploy/Chart/templates/database/configmap-initdb.yaml Outdated
@sylvainsf

Contributor Author

Thanks for the thorough end-to-end review — both blocking bugs are fixed in 9da4f58, and I've addressed most of the non-blocking notes. Summary:

🔴 Blocking — fixed

  • Postgres image: now docker.io/library/postgres:16-alpine (pullable; mirror only has :latest).
  • permission denied for table resources: init-db now grants the per-RP users on the table + sequences (matches start-radius.sh).

🟡 Non-blocking — fixed

  • expandEnvURL fail-fast: now uses os.LookupEnv and returns an error listing any unset variables instead of silently producing a malformed connection string.
  • replicasOf coercing 0 → 1: now preserves an explicit replicas: 0; only an unset count defaults to 1.
  • startup.Run double-ScaleUp: the deferred scale-up is guarded by a scaledBackUp flag that the success path sets, so it never double-scales — it only runs as a safety net when a restore step returns early. Left as-is (correct), happy to refactor for clarity if you'd prefer.
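
The guarded deferred scale-up described in the last item can be illustrated as follows. It is a sketch under assumptions: the `scaler` interface, `runStartupSketch`, and `countingScaler` are hypothetical stand-ins for pkg/cli/controlplane and the startup runner, and only the `scaledBackUp` flag name comes from the comment above.

```go
package main

import (
	"errors"
	"fmt"
)

// scaler is a hypothetical stand-in for the control-plane scaling API.
type scaler interface {
	ScaleDown() error
	ScaleUp() error
}

// errRestoreFailed is a stand-in error used only by this sketch.
var errRestoreFailed = errors.New("restore failed")

// countingScaler is a fake that records how often each method is called.
type countingScaler struct{ downs, ups int }

func (c *countingScaler) ScaleDown() error { c.downs++; return nil }
func (c *countingScaler) ScaleUp() error   { c.ups++; return nil }

// runStartupSketch scales down, runs the restore steps in order, and scales
// back up. The deferred call is only a safety net for early returns: the
// success path sets scaledBackUp first, so ScaleUp runs exactly once.
func runStartupSketch(s scaler, restoreSteps ...func() error) (retErr error) {
	if err := s.ScaleDown(); err != nil {
		return fmt.Errorf("scale down control plane: %w", err)
	}

	scaledBackUp := false
	defer func() {
		if scaledBackUp {
			return
		}
		if err := s.ScaleUp(); err != nil {
			retErr = errors.Join(retErr, err)
		}
	}()

	for _, step := range restoreSteps {
		if err := step(); err != nil {
			return err // deferred safety net scales back up
		}
	}

	scaledBackUp = true
	return s.ScaleUp()
}

func main() {
	s := &countingScaler{}
	err := runStartupSketch(s, func() error { return errRestoreFailed })
	fmt.Println("error:", err, "scale-downs:", s.downs, "scale-ups:", s.ups)
}
```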

🧪 Test coverage — fixed

  • Added pkg/cli/pgbackup/pgbackup_test.go (covers HasBackup).
  • Added Validate() + command-shape tests for rad startup and rad shutdown via the shared radcli harness.
  • Added helm-unittest guards asserting the database image is the pullable reference and that the init-db script grants the per-RP users — so both blocking regressions are now caught in CI without a cluster.

Deferred (with rationale) — happy to do these if you'd like them in this PR

  • CI job for database.enabled=true: the chart-render guards above catch the two regressions you hit cheaply. A full install-and-pull functional job is a larger CI/infra addition I'd prefer to scope separately (it also overlaps the gated statestore e2e lifecycle test). Tracking as a follow-up.
  • init-db schema duplicates deploy/init-db/db.sql.txt: real drift risk, but db.sql.txt lives outside the chart directory so Helm .Files.Get can't source it. De-duplicating cleanly needs a small build step to vendor the SQL into the chart — proposing a follow-up rather than a risky restructure here.
  • Postgres password regenerated on every apply (randAlphaNum 16): this is pre-existing (unchanged by this PR) and the proper fix is the lookup-or-generate Secret pattern, which is a separate hardening change. Filing a follow-up.

Ready for re-validation whenever you have a chance.

@sylvainsf

Contributor Author

Filed the deferred follow-ups as tracking issues:

Fixes the pre-existing gaps that prevented database.enabled=true from
producing a working PostgreSQL-backed control plane:

- UCP, Applications RP, and Dynamic RP configmaps/deployments were
  hardcoded to the apiserver provider with no database.enabled
  conditional; they now switch to the postgresql provider and inject
  POSTGRES_PASSWORD from the database secret when database.enabled=true.
- Add init-db ConfigMap (mounted at /docker-entrypoint-initdb.d) that
  creates the per-RP databases, users, and tables on first start
  (option 3 from #8398).
- Fix POSTGRES_DB secret value (was the literal string POSTGRES_DB).
- Pin the postgres image tag to 16-alpine.
- Fix the databaseProvider URL env-var substitution in factory.go, which
  replaced the entire URL with the first captured variable name instead
  of expanding env-var references.

Adds helm-unittest coverage for the conditional rendering and a unit
test for the env-var expansion helper.

Also adds the Repo Radius state-storage technical design note.

Relates to #8096, #8398

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
Terraform recipes store their state in Kubernetes Secrets (the Terraform
"kubernetes" backend) in the radius-system namespace, not in the Radius
PostgreSQL databases. On an ephemeral control plane those Secrets are
destroyed on teardown, so a second deploy of the same Terraform-backed
resource in a later run plans against an empty backend and either fails
or orphans cloud resources.

Add a pkg/cli/tfstate package that exports the Secrets labelled
tfstate=true to a state directory and restores them into a fresh cluster
before any deploy runs. The label selector also captures the chunked
tfstate-{workspace}-{suffix}-{index} Secrets the backend creates for
large state. Server-managed fields are stripped on backup, and restore
is idempotent (create-or-update).
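
Stripping server-managed fields before writing a Secret to the state directory might look like this on a decoded manifest. It is a sketch, not the tfstate package: the function name and the exact field list are assumptions, and the real code most likely operates on typed client-go objects rather than maps.

```go
package main

import "fmt"

// serverManagedFields lists metadata keys the Kubernetes API server fills in.
// Which of these the tfstate package actually clears is an assumption here.
var serverManagedFields = []string{
	"resourceVersion",
	"uid",
	"creationTimestamp",
	"generation",
	"managedFields",
}

// stripServerManagedSketch removes server-populated metadata from a decoded
// Secret manifest in place, so the saved copy can be created in a fresh
// cluster without carrying identity from the old one.
func stripServerManagedSketch(manifest map[string]any) {
	metadata, ok := manifest["metadata"].(map[string]any)
	if !ok {
		return // nothing to strip
	}
	for _, field := range serverManagedFields {
		delete(metadata, field)
	}
}

func main() {
	// Hypothetical Secret; the name only imitates the tfstate-{workspace}-{suffix} shape.
	secret := map[string]any{
		"kind": "Secret",
		"metadata": map[string]any{
			"name":            "tfstate-default-example",
			"namespace":       "radius-system",
			"labels":          map[string]any{"tfstate": "true"},
			"uid":             "0f1e2d3c",
			"resourceVersion": "4211",
		},
	}
	stripServerManagedSketch(secret)
	fmt.Println(secret["metadata"])
}
```

Name, namespace, and labels are kept because the restore side needs them to recreate the Secret and for the tfstate=true selector to keep matching.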

Covered by unit tests using the client-go fake clientset.

Relates to #8096

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
Adds the two commands that back up and restore all durable Radius state
across an ephemeral control plane, plus the end-to-end lifecycle test.

The commands operate on the current workspace's Kubernetes context like
any other command. They do not create or delete clusters and do not
install Radius; cluster lifecycle is the caller's responsibility. There
is no dedicated workspace kind.

- pkg/cli/pgbackup: control-plane PostgreSQL backup/restore via
  "kubectl exec pg_dump/psql".
- pkg/cli/gitstate: persists the state directory to a git orphan branch
  (radius-state) in an isolated worktree. CommitAndPush fails loudly when
  a remote is configured (a failed backup push would otherwise be silent
  data loss) and tolerates the no-remote local/test case.
- rad shutdown: backs up the control-plane databases and the Terraform
  state Secrets, then commits and pushes them.
- rad startup: waits for the database, then restores the control-plane
  databases and the Terraform state Secrets.

The lifecycle test (test/functional-portable/statestore) installs Radius
with database.enabled=true, deploys a Terraform-backed resource, shuts
down, uninstalls, reinstalls, starts up, and deploys an update to the
same resource -- the cross-run path that fails when Terraform state is
lost. It is destructive and requires a cluster, so it is skipped unless
RADIUS_STATE_E2E is set.

Relates to #8096

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
…e test

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
- gitstate: treat a git fetch failure as fatal when the state branch
  exists on the remote, so a transient network/credential error cannot
  silently restore stale or empty state.
- gitstate: inject a fallback git identity (Radius <radius@radapp.io>)
  for the state commit when the repo has none configured, so rad
  shutdown works in fresh CI environments.
- tfstate: drop the write to the deprecated Secret.SelfLink field
  (staticcheck SA1019), which was failing the lint check.
- databaseprovider test: set MISSING explicitly so the env-var
  expansion test is not flaky on runners that happen to have it set.
- statestore e2e: assert at least one tfstate Secret exists rather than
  exactly one, since the backend may shard large state across multiple
  Secrets.
- design note: mark the checksum manifest as future work (not
  implemented in this delivery) and fix a spelling (behaviour ->
  behavior).
- .cspellignore: add Sylvain.

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
rad startup runs after rad install, so the control-plane pods are already
running and connected to PostgreSQL when state is restored. Restoring a
pg_dump (DROP TABLE / CREATE TABLE) underneath those live connections can
invalidate the providers' cached prepared statements (pgx default
QueryExecModeCacheStatement; the resources table OID changes), and races
the UCP initializer's boot-time writes.

rad startup now scales the database-backed deployments (ucp,
applications-rp, dynamic-rp) to zero before the restore and back to their
previous replica counts afterward, via a new pkg/cli/controlplane
package. This makes the restore atomic with respect to its consumers and
ensures the providers establish fresh connection pools against the
restored schema. The deployment engine and dashboard do not connect to
PostgreSQL and are left running. The control plane is always scaled back
up, including on a failed restore, and a deployment whose previous
replica count was zero is restored to one.

Adds unit tests for the scaler (fake clientset with a reconciling reactor
that mirrors spec replicas to status) and updates the startup runner
tests to assert the scale-down -> restore -> scale-up ordering.

Records the decision in the state-storage design note.

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
Two blocking bugs reproduced end-to-end by review, plus several
non-blocking improvements and test-coverage gaps.

Blocking:
- The postgres image resolved to ghcr.io/radius-project/mirror/postgres:16-alpine,
  which the registry mirror does not publish (only :latest), causing
  ImagePullBackOff. Point the chart at docker.io/library/postgres:16-alpine,
  which is pullable and keeps the version pinned.
- The init-db script created the resources table as the superuser but never
  granted the per-RP users (ucp, applications_rp, dynamic_rp) access, so UCP
  crashed on startup with "permission denied for table resources (42501)".
  Grant the per-RP user privileges on the table and sequences inside the
  table-creation loop (matches build/scripts/start-radius.sh).

Non-blocking:
- expandEnvURL now fails fast (os.LookupEnv) when a referenced env var is
  unset, instead of silently producing a malformed connection string.
- controlplane.replicasOf preserves an explicit replicas: 0 rather than
  coercing it to 1, so ScaleUp faithfully restores the prior state.
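
A minimal sketch of the unset-versus-zero distinction, assuming the replica count is read from a pointer field as in a Kubernetes Deployment spec (where a nil value defaults to 1). The function name `replicasOfSketch` is hypothetical and the real controlplane.replicasOf may take a different argument.

```go
package main

import "fmt"

// replicasOfSketch returns the replica count to restore on scale-up.
// A nil pointer means the count was never set, so the Kubernetes default
// of 1 applies; an explicit 0 is preserved rather than coerced to 1.
func replicasOfSketch(specReplicas *int32) int32 {
	if specReplicas == nil {
		return 1
	}
	return *specReplicas
}

func main() {
	zero, three := int32(0), int32(3)
	fmt.Println(replicasOfSketch(nil), replicasOfSketch(&zero), replicasOfSketch(&three))
}
```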

Test coverage:
- Add pkg/cli/pgbackup/pgbackup_test.go covering HasBackup.
- Add Validate()/command-shape tests for rad startup and rad shutdown via
  the shared radcli validation harness.
- Add helm-unittest guards asserting the database image is the pullable
  reference and that the init-db script grants the per-RP users, so both
  blocking regressions are caught in CI without a cluster.

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
@sylvainsf sylvainsf force-pushed the repo-radius-storage branch from 9da4f58 to 9a157fd Compare June 24, 2026 02:44
@github-actions

github-actions Bot commented Jun 24, 2026


Functional Tests - corerp-cloud

27 tests  ±0   27 ✅ ±0   36m 6s ⏱️ - 1m 43s
 2 suites ±0    0 💤 ±0 
 1 files   ±0    0 ❌ ±0 

Results for commit 5e2ee54. ± Comparison against base commit 385f38e.

♻️ This comment has been updated with latest results.

@sylvainsf sylvainsf enabled auto-merge June 24, 2026 21:46
@sylvainsf sylvainsf mentioned this pull request Jun 24, 2026
12 tasks
@radius-functional-tests

radius-functional-tests Bot commented Jun 25, 2026


Radius functional test overview

| Name | Value |
| --- | --- |
| Repository | radius-project/radius |
| Commit ref | 5e2ee54 |
| Unique ID | func602f934113 |
| Image tag | pr-func602f934113 |
  • KinD: v0.29.0
  • Dapr: 1.14.4
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.3.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-func602f934113
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-func602f934113
  • dynamic-rp test image location: ghcr.io/radius-project/dev/dynamic-rp:pr-func602f934113
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-func602f934113
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-func602f934113
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting ucp-cloud functional tests...
⌛ Starting corerp-cloud functional tests...
✅ ucp-cloud functional tests succeeded
✅ corerp-cloud functional tests succeeded

@github-actions


Functional Tests - upgrade-noncloud

3 tests  ±0   1 ✅  - 2   7m 9s ⏱️ + 3m 45s
1 suites ±0   0 💤 ±0 
1 files   ±0   2 ❌ +2 

For more details on these failures, see this check.

Results for commit 5e2ee54. ± Comparison against base commit 385f38e.

@sylvainsf sylvainsf added this pull request to the merge queue Jun 25, 2026
Merged via the queue into main with commit 8412ca0 Jun 25, 2026
73 of 75 checks passed
@sylvainsf sylvainsf deleted the repo-radius-storage branch June 25, 2026 01:03