
fix(cli): stop declarative sync hanging on edge-runtime container exit #5714

Merged
avallete merged 9 commits into develop from claude/issue-312-investigation-mec6f6 on Jul 1, 2026


@avallete (Member) commented Jun 26, 2026

Fixes supabase db schema declarative sync --experimental (and other pg-delta flows) hanging indefinitely at 0% CPU after the shadow-database work completes.

Closes supabase/pg-toolbelt#312

There are two independent causes, and both must be fixed to unblock the command:

1. The Edge Runtime worker never exits

The pg-delta scripts run in a one-shot Edge Runtime container and rely on the event loop draining for the worker to be destroyed and the container to stop. The catalog-export script opens a real connection pool (createManagedPool); when a keepalive handle stays registered after close() resolves, the worker never exits, so the container never stops. Only the error path force-closed the loop (throw new Error("")); the success path did not.

Fix: force-close the event loop on the success path of every pg-delta script once its output is flushed, so the worker is torn down deterministically:

  • templates/pgdelta_catalog_export.ts: catalog snapshot (the path reported in supabase/pg-toolbelt#312)
  • templates/pgdelta.ts: diff SOURCE→TARGET
  • templates/pgdelta_declarative_export.ts: declarative file export
  • templates/pgdelta_declarative_apply.ts: apply declarative schema
  • the inline catalog-export script in internal/db/pgcache/cache.go: the db start / db push migrations-catalog cache path

The byte-for-byte embedded copies in legacy-pgdelta.deno-templates.ts are kept in sync.

2. Following the container log stream hangs under podman

With the worker now exiting, the container stops — but the parent still hung for podman users. A goroutine dump on the stuck __catalog subprocess showed the block in stdcopy.StdCopy reading the Follow:true Docker log stream (DockerStreamLogs): podman's /containers/<id>/logs?follow endpoint does not close when the container stops, so the read never gets EOF and blocks forever. Docker closes it, which is why this only reproduced on podman.

Fix: run the bounded edge-runtime scripts via a new DockerRunOnceWaitWithConfig, which detects completion by polling ContainerInspect (reliable on podman) and then reads the buffered logs without following. The shared DockerStreamLogs (live streaming for functions serve) and DockerRunOnceWithConfig (large streaming output for db dump) are intentionally left unchanged, so only the bounded pg-delta/edge-runtime path changes behaviour.

Regression coverage is added for both: guard tests asserting each script's success path force-closes, and a test pinning that the runner reads captured output and maps the inspected exit code without following the log stream.
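One way such a guard could look is a string-level check that the force-close statement follows the last output call in a script. The sketch below is an assumption about the shape of that check, not the tests added in this PR: `closesAfter` and the two sample scripts are hypothetical, and only the `throw new Error("")` statement comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// forceClose is the statement the PR description uses to tear the worker down.
const forceClose = `throw new Error("")`

// closesAfter reports whether forceClose appears somewhere after the last
// occurrence of outputCall in script. It is a purely textual check.
func closesAfter(script, outputCall string) bool {
	last := strings.LastIndex(script, outputCall)
	if last < 0 {
		return false
	}
	return strings.Contains(script[last+len(outputCall):], forceClose)
}

// Hypothetical script bodies, not the real templates.
const guardedScript = `
const out = await exportCatalog();
console.log(out);
throw new Error("");
`

const unguardedScript = `
const out = await exportCatalog();
console.log(out);
`

func main() {
	fmt.Println(closesAfter(guardedScript, "console.log("))
	fmt.Println(closesAfter(unguardedScript, "console.log("))
}
```

Because the check is textual, it cannot tell a success-path throw from one in a later error handler; a real guard would need to account for where the statement sits in the script.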

https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3

… output

`supabase db schema declarative sync` could hang indefinitely at 0% CPU
after all migrations were applied to the shadow database
(supabase/pg-toolbelt#312).

Root cause: the pg-delta Deno scripts run inside a one-shot Edge Runtime
container and rely on the event loop draining for the worker to be
destroyed and the container to exit. The catalog-export script opens a
real connection pool (createManagedPool); when a keepalive handle lingers
after close() resolves, the worker never exits, so the container never
stops. The CLI streams that container's logs with Follow:true
(DockerStreamLogs), so a worker that never exits blocks the parent
`__catalog` subprocess — and the declarative-sync command that spawned it
— forever. Only the error path force-closed the loop
(`throw new Error("")`); the success path did not.

Fix: force-close the event loop on the success path of every pg-delta
Edge Runtime script (diff, declarative-export, catalog-export, and
declarative-apply), so the worker is torn down deterministically once
output has been flushed. The output is written synchronously before the
throw, and RunEdgeRuntimeScript already tolerates the resulting
"main worker has been destroyed" exit.

Reproduced against supabase/edge-runtime:v1.74.2: a worker with a
lingering handle keeps the container `running` and `docker logs -f` (the
equivalent of DockerStreamLogs Follow:true) never returns; adding the
force-close makes the container exit immediately with output intact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3
@avallete avallete requested a review from a team as a code owner June 26, 2026 14:45
github-actions (Bot) commented Jun 26, 2026

Supabase CLI preview

npx --yes https://pkg.pr.new/supabase/cli/supabase@adb0910c4ed3c8e65793ae2d86e6dab7a03e85e3

Preview package for commit adb0910.

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d4e2d749ed

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/cli-go/internal/db/diff/pgdelta_template_test.go
avallete and others added 3 commits June 26, 2026 17:06
The migrations-catalog cache script (pgcache.TryCacheMigrationsCatalog, used
by db start / db push with pg-delta caching) uses the same
createManagedPool/extractCatalog/close() pattern as the other pg-delta Edge
Runtime scripts and had the same missing success-path force-close, so it could
hang the same way (supabase/pg-toolbelt#312). Add the force-close and a guard
test. Flagged by automated PR review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3
@avallete (Member, Author)

@codex review

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. Swish!

Reviewed commit: 1d4b65b0a0


claude and others added 3 commits June 30, 2026 10:00
…ts logs

The worker force-close made the edge-runtime container exit, but declarative
sync still hung under podman. A user goroutine dump on the hung __catalog
subprocess showed the block is in stdcopy.StdCopy reading the Follow:true
Docker log stream (DockerStreamLogs): the stream never receives EOF after the
container exits, so the read blocks forever at 0% CPU
(supabase/pg-toolbelt#312). podman's /containers/<id>/logs?follow endpoint does
not close when the container stops, unlike Docker.

Run the bounded edge-runtime scripts via DockerRunOnceWaitWithConfig, which
detects completion by polling ContainerInspect (reliable on podman) and then
reads the buffered logs without following. The shared DockerStreamLogs (used by
functions serve for genuine live streaming) and DockerRunOnceWithConfig (used by
db dump for large streaming output) are left unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XbxecW4DVmwgQB1YX321K3
@avallete avallete marked this pull request as ready for review June 30, 2026 14:02
@avallete changed the title from "fix(pgdelta): force close event loop on success path" to "fix(cli): stop declarative sync hanging on edge-runtime container exit" Jul 1, 2026
@avallete avallete added this pull request to the merge queue Jul 1, 2026
Merged via the queue into develop with commit 38d2f7c Jul 1, 2026
31 checks passed
@avallete avallete deleted the claude/issue-312-investigation-mec6f6 branch July 1, 2026 09:55

Development

Successfully merging this pull request may close these issues.

declarative sync hangs indefinitely after migrations applied - goroutine deadlock in __catalog subprocess

3 participants