Skip to content

fix(terraform): wire DATABASE_URL into dashboard Container App#282

Merged
tomqwu merged 3 commits into
mainfrom
fix/terraform-dashboard-postgres-asymmetry
May 11, 2026
Merged

fix(terraform): wire DATABASE_URL into dashboard Container App#282
tomqwu merged 3 commits into
mainfrom
fix/terraform-dashboard-postgres-asymmetry

Conversation

@tomqwu
Copy link
Copy Markdown
Owner

@tomqwu tomqwu commented May 11, 2026

Summary

Closes the dashboard ↔ DB persistence asymmetry documented in PR #273 on the Terraform Container Apps deploy path. The Helm chart was fixed in PR #271; this commit mirrors the fix into Terraform.

Before

  • API Container App: `DATABASE_URL` injected when `enable_postgres = true`.
  • Dashboard Container App: only `COSMOS_ENDPOINT` injected (when `enable_cosmos = true`). On the Postgres path, the dashboard pod silently fell back to local SQLite → Run History (page 15) and Comparative Analytics (page 19) showed stale/empty results even though the API was correctly writing to Postgres.

After

  • Both Container Apps inject `DATABASE_URL` when `enable_postgres = true`.
  • The connection string lives once in `local.postgres_database_url` (Entra-ID auth via the shared per-app UAMI). Both `secret` blocks reference it so the value can't drift.
  • No new role grants needed — the dashboard already shared the API's UAMI and that UAMI already had Postgres AD admin via the existing `azurerm_postgresql_flexible_server_active_directory_administrator.aml_uami`.

Pin

New `test_database_url_injected_on_both_container_apps` in `tests/test_terraform_lint.py` scopes its assertion per-`azurerm_container_app` resource so a future refactor can't silently re-introduce the asymmetry — mirrors the Helm-side `TestHelmPostgresFirstPrecedence` (#271).

Docs

CLAUDE.md and deploy/terraform/README.md updated — the "known issue" section is now a "resolved" section naming the regression tests on both Helm and Terraform.

Test plan

  • `terraform validate` — Success
  • `terraform fmt -check -recursive` — clean
  • `pytest tests/test_terraform_lint.py -v` — 6 passed (1 new)
  • `pytest tests/ --ignore=tests/test_e2e_dashboard.py -q` — 2169 passed
  • `/codex:review --base main` — initial pass: 3 blockers + 1 nit, all fixed in commit 2 (hoisted local, scoped test, refreshed docs)
  • CI green
  • Reviewer approval

Deploy impact

Applying this on an existing Postgres-path deploy creates a new dashboard Container App revision (transient ~30s rollover; Streamlit pod is stateless aside from in-memory session state). Cosmos-path deploys are unaffected (the new `secret` and `env` blocks are gated by `var.enable_postgres`).

🤖 Generated with Claude Code

tomqwu and others added 2 commits May 11, 2026 10:00
The Helm chart's dashboard-deployment.yaml was flipped to postgres-
first in PR #271, but the Terraform Container Apps deploy still
mirrored the pre-#271 shape: the dashboard Container App injected
`COSMOS_ENDPOINT` on the Cosmos path but never `DATABASE_URL` on
the Postgres path. PR #273 documented this as the known
"dashboard ↔ DB persistence asymmetry" — the dashboard pod silently
fell back to local SQLite while the API correctly wrote to Postgres,
so Run History (page 15) and Comparative Analytics (page 19) showed
stale/empty results.

This commit closes the asymmetry by mirroring the Helm flip:

1. Add a `dynamic "env"` block for `DATABASE_URL` to the dashboard
   Container App template, gated on `var.enable_postgres` — ordered
   before the Cosmos blocks so the env-var injection precedence
   matches `_active_backend()` in `src/aml_framework/api/db.py`.
2. Add a matching `dynamic "secret"` block defining `database-url`
   on the dashboard Container App resource. Container Apps secrets
   are resource-scoped, so the dashboard needs its own block even
   though the value (Entra-ID-authed postgres URL) is identical to
   the API's.

No new role grants needed: the dashboard already uses the same UAMI
as the API (`module.onboard.identity_id`), and that UAMI already has
Postgres AD admin via the existing
`azurerm_postgresql_flexible_server_active_directory_administrator`
resource. Entra-ID auth Just Works on both pods.

New test `test_database_url_injected_on_both_container_apps` in
`tests/test_terraform_lint.py` pins the new shape: counts ≥2
references to the `database-url` secret name and ≥2 `secret` block
definitions, so a future refactor can't silently re-introduce the
asymmetry. Mirrors `TestHelmPostgresFirstPrecedence` from #271.

Verification:
- `terraform validate` — Success.
- `terraform fmt -check` — clean.
- Full unit + API suite: 2169 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex review on c123c55 flagged three blockers:

1. Duplicated secret value. The Postgres URL string was inlined in
   both the API and dashboard `database-url` secret blocks, risking
   future drift. Hoisted into `local.postgres_database_url` and
   referenced by both blocks. Empty string when postgres isn't
   enabled (the secret blocks are gated by `var.enable_postgres`
   anyway, so the empty value never emits).

2. Test scope too broad. The body-wide regex/count could pass if
   two matches existed anywhere — e.g., both on the API resource.
   Now scopes the assertion to per-`azurerm_container_app` blocks
   (api + dashboard), asserts each block independently contains
   `DATABASE_URL`, the secret reference, AND the `local.postgres_
   database_url` read. So a single-app duplication can't slip
   through.

3. Stale docs. CLAUDE.md and deploy/terraform/README.md still
   described the asymmetry as a known broken issue. Both updated
   to describe the resolved postgres-first wiring on Helm and
   Terraform, with the regression-test pins named explicitly.
   README.md also notes the revision rollover on apply.

Plus the inline-comment nit: dropped the "line ~105" reference to
the AD-admin grant in favour of the resource name
(`azurerm_postgresql_flexible_server_active_directory_administrator.
aml_uami`).

Verification:
- `terraform validate` — Success.
- `terraform fmt -check -recursive` — clean.
- Full unit + API suite: 2169 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tomqwu tomqwu enabled auto-merge (squash) May 11, 2026 14:09
Codex re-review nit: the scoped test asserted the `database-url`
secret reference + the `local.postgres_database_url` read, but
didn't pin the env-var name. A future refactor could land an env
block named e.g. `PG_CONNECTION_STRING` sourcing the same secret
and the assertion would still pass — but the runtime contract is
specifically `os.environ.get("DATABASE_URL")` in db.py:40.

Added `re.search(r'name\\s+=\\s+"DATABASE_URL"', block)` for each of
the API and dashboard `azurerm_container_app` blocks so the env-var
name itself is now pinned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tomqwu tomqwu merged commit 3b01308 into main May 11, 2026
11 of 12 checks passed
@tomqwu tomqwu deleted the fix/terraform-dashboard-postgres-asymmetry branch May 11, 2026 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant