Skip to content

SQL splitter matches 'atomic' too aggressively, triggering SQLSTATE 42601 #5062

@adpreethamkumar96

Description

@adpreethamkumar96

Bug report

CLI version

2.84.2 (also reproduced on 1.226.4, at a different migration)

$ supabase --version
2.84.2

Environment

  • macOS 26.4 (Darwin 25.4.0)
  • Apple Silicon (arm64)
  • Docker Desktop 29.2.1
  • PostgreSQL 17.6.1.063 (Supabase official image)
  • Installation: brew install supabase/tap/supabase

Summary

The migration runner's SQL splitter (or its interaction with pgx's extended
query protocol) raises ERROR: cannot insert multiple commands into a prepared statement (SQLSTATE 42601) when a migration file contains a
CREATE [OR REPLACE] FUNCTION statement whose unquoted identifier contains
the substring atomic (case-insensitive) and any further SQL statement
after that function body's closing $$;.

My working hypothesis is that the CLI's splitter is performing a naive text
search for the token ATOMIC (as in PostgreSQL 14's
CREATE FUNCTION ... BEGIN ATOMIC ... END alternative-body syntax) and
treating everything up to the end of the file — or the next top-level
END — as the function body. The subsequent SQL statements are then
concatenated into a single Parse message, which Postgres rejects because
the extended query protocol permits only one command per Parse.

The splitter does not tokenize SQL properly — comments, string literals,
and quoted/unquoted identifiers are all treated as raw text.

Minimal reproducer

Create a fresh Supabase project and put this one file in
supabase/migrations/:

-- 001_repro.sql — triggers SQLSTATE 42601 on `supabase db reset`

CREATE OR REPLACE FUNCTION public.atomic_example()
RETURNS INTEGER
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN 1;
END;
$$;

-- ^^ any statement below this line triggers the bug
GRANT EXECUTE ON FUNCTION public.atomic_example() TO authenticated;

Then:

supabase start
supabase db reset

Expected: migration applies cleanly. Postgres parses
CREATE FUNCTION ... atomic_example ... $$; as one statement and
GRANT EXECUTE ... as a separate statement, since atomic_example is an
identifier, not the BEGIN ATOMIC keyword sequence.

Actual:

Applying migration 001_repro.sql...
GRANT EXECUTE ON FUNCTION public.atomic_example() TO authenticated;
Try rerunning the command with --debug to troubleshoot the error.

With --debug:

PG Recv: {"Type":"ErrorResponse","Severity":"ERROR","Code":"42601",
         "Message":"cannot insert multiple commands into a prepared statement",
         "File":"postgres.c","Line":1478,"Routine":"exec_parse_message"}
ERROR: cannot insert multiple commands into a prepared statement (SQLSTATE 42601)

Variations confirmed with the same reproducer project

Variation Result
Function body contains the word atomic in a comment, with no trailing statement ✅ passes
Function body contains atomic in a comment, with trailing GRANT ❌ fails
Function name contains atomic (atomic_foo, foo_atomic, abc_atomic_bar) + trailing statement ❌ fails
Function name atomic_foo with no trailing statement (nothing after $$;) ✅ passes
Whole file wrapped in BEGIN; ... COMMIT; ❌ still fails
Top-level -- atomic comment (no function with atomic in its name) + CREATE FUNCTION + GRANT ✅ passes

Summary: the presence of atomic in a CREATE FUNCTION identifier,
combined with a following statement, is the trigger. atomic in comments
alone is fine.

Workaround

Rename the function to avoid the substring atomic. A _tx suffix works
well semantically for transactional RPCs:

CREATE OR REPLACE FUNCTION public.example_tx()  -- was: atomic_example
...

For codebases with many such functions, a simple lint rule on
CREATE [OR REPLACE] FUNCTION identifiers catches the pattern at commit
time. Example (bash):

grep -rEn 'CREATE[[:space:]]+(OR[[:space:]]+REPLACE[[:space:]]+)?FUNCTION[[:space:]]+([a-zA-Z_][a-zA-Z_0-9]*\.)?[a-zA-Z_0-9]*[aA][tT][oO][mM][iI][cC][a-zA-Z_0-9]*[[:space:]]*\(' supabase/migrations/

Suggested fix

The splitter should tokenize SQL with awareness of:

  1. Dollar-quoted strings ($$...$$, $tag$...$tag$) as opaque bodies
  2. Line comments (--) and block comments (/* */)
  3. Quoted identifiers ("ATOMIC" is an identifier, not a keyword)
  4. Context for ATOMIC: only match it as a keyword when it appears
    immediately after a closing ) of a function signature and before an
    opening BEGIN
    (the actual BEGIN ATOMIC syntax), not inside
    identifiers

A straightforward fix is to replace any regex-based statement splitting
with a real SQL tokenizer (e.g., go-pg's or pgx's built-in parser)
that respects PostgreSQL's lexer rules.

Impact

This is a silent-until-triggered bug. A codebase with 178+ migrations built
over several years (we've been maintaining a production Supabase project since 2023) can contain perfectly valid migrations for a
long time before someone adds a trailing GRANT that suddenly breaks
supabase db reset. The resulting error message points nowhere near the
root cause (the function definition is many hundreds of lines away) and
the word atomic is not obviously a reserved or dangerous token to the
developer.

Teams often hit this only when:

  • Upgrading the CLI (when the splitter was rewritten), or
  • Running db reset for the first time in a while, or
  • Bringing a new contributor onto the project who tries to set up local dev

Additional context

  • supabase migration repair --status reverted is needed between failed
    attempts because the CLI records failed migrations as "applied" in
    schema_migrations before the error bubbles up.
  • Pinning older CLI versions (1.226.4) doesn't help for all projects —
    the same bug triggers on a different SQL pattern in that line.
  • The case sensitivity of the match is inconsistent: ATOMIC in a comment
    passes, atomic in a function name fails, ATOMIC in a function name
    also fails. A single case-insensitive raw text search for atomic is
    the likely mechanism.

Happy to help

I have a minimal reproducer project that isolates the bug in 15 lines of
SQL. I can share it privately if useful. I also have a bisection log
showing exactly which SQL patterns do and don't trigger the error.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions