SQL splitter matches 'atomic' too aggressively, triggering SQLSTATE 42601

## Bug report

### CLI version

`2.84.2` (also reproduced on `1.226.4`, at a different migration)

```
$ supabase --version
2.84.2
```

### Environment

- macOS 26.4 (Darwin 25.4.0)
- Apple Silicon (`arm64`)
- Docker Desktop 29.2.1
- PostgreSQL 17.6.1.063 (Supabase official image)
- Installation: `brew install supabase/tap/supabase`

### Summary

The migration runner's SQL splitter (or its interaction with pgx's extended
query protocol) raises `ERROR: cannot insert multiple commands into a prepared
statement (SQLSTATE 42601)` when a migration file contains a
`CREATE [OR REPLACE] FUNCTION` statement whose unquoted identifier contains
the substring `atomic` (case-insensitive) **and** any further SQL statement
after that function body's closing `$$;`.

My working hypothesis is that the CLI's splitter is performing a naive text
search for the token `ATOMIC` (as in PostgreSQL 14's
`CREATE FUNCTION ... BEGIN ATOMIC ... END` alternative-body syntax) and
treating everything up to the end of the file — or the next top-level
`END` — as the function body. The subsequent SQL statements are then
concatenated into a single `Parse` message, which Postgres rejects because
the extended query protocol permits only one command per Parse.

The splitter does **not** tokenize SQL properly — comments, string literals,
and quoted/unquoted identifiers are all treated as raw text.

### Minimal reproducer

Create a fresh Supabase project and put this one file in
`supabase/migrations/`:

```sql
-- 001_repro.sql — triggers SQLSTATE 42601 on `supabase db reset`

CREATE OR REPLACE FUNCTION public.atomic_example()
RETURNS INTEGER
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN 1;
END;
$$;

-- ^^ any statement below this line triggers the bug
GRANT EXECUTE ON FUNCTION public.atomic_example() TO authenticated;
```

Then:

```bash
supabase start
supabase db reset
```

**Expected:** migration applies cleanly. Postgres parses
`CREATE FUNCTION ... atomic_example ... $$;` as one statement and
`GRANT EXECUTE ...` as a separate statement, since `atomic_example` is an
identifier, not the `BEGIN ATOMIC` keyword sequence.

**Actual:**

```
Applying migration 001_repro.sql...
GRANT EXECUTE ON FUNCTION public.atomic_example() TO authenticated;
Try rerunning the command with --debug to troubleshoot the error.
```

With `--debug`:

```
PG Recv: {"Type":"ErrorResponse","Severity":"ERROR","Code":"42601",
         "Message":"cannot insert multiple commands into a prepared statement",
         "File":"postgres.c","Line":1478,"Routine":"exec_parse_message"}
ERROR: cannot insert multiple commands into a prepared statement (SQLSTATE 42601)
```

### Variations confirmed with the same reproducer project

| Variation | Result |
|---|---|
| Function body contains the word `atomic` in a comment, with no trailing statement | ✅ passes |
| Function body contains `atomic` in a comment, with trailing `GRANT` | ❌ fails |
| Function name contains `atomic` (`atomic_foo`, `foo_atomic`, `abc_atomic_bar`) + trailing statement | ❌ fails |
| Function name `atomic_foo` with no trailing statement (nothing after `$$;`) | ✅ passes |
| Whole file wrapped in `BEGIN; ... COMMIT;` | ❌ still fails |
| Top-level `-- atomic` comment (no function with `atomic` in its name) + CREATE FUNCTION + GRANT | ✅ passes |

**Summary**: the presence of `atomic` *in a CREATE FUNCTION identifier*,
combined with a following statement, is the trigger. `atomic` in comments
alone is fine.

### Workaround

Rename the function to avoid the substring `atomic`. A `_tx` suffix works
well semantically for transactional RPCs:

```sql
CREATE OR REPLACE FUNCTION public.example_tx()  -- was: atomic_example
...
```

For codebases with many such functions, a simple lint rule on
`CREATE [OR REPLACE] FUNCTION` identifiers catches the pattern at commit
time. Example (bash):

```bash
grep -rEn 'CREATE[[:space:]]+(OR[[:space:]]+REPLACE[[:space:]]+)?FUNCTION[[:space:]]+([a-zA-Z_][a-zA-Z_0-9]*\.)?[a-zA-Z_0-9]*[aA][tT][oO][mM][iI][cC][a-zA-Z_0-9]*[[:space:]]*\(' supabase/migrations/
```

### Suggested fix

The splitter should tokenize SQL with awareness of:

1. **Dollar-quoted strings** (`$$...$$`, `$tag$...$tag$`) as opaque bodies
2. **Line comments** (`--`) and **block comments** (`/* */`)
3. **Quoted identifiers** (`"ATOMIC"` is an identifier, not a keyword)
4. **Context for `ATOMIC`**: only match it as a keyword when it appears
   **immediately after a closing `)` of a function signature and before an
   opening `BEGIN`** (the actual `BEGIN ATOMIC` syntax), not inside
   identifiers

A straightforward fix is to replace any regex-based statement splitting
with a real SQL tokenizer (e.g., go-pg's or pgx's built-in parser)
that respects PostgreSQL's lexer rules.

### Impact

This is a silent-until-triggered bug. A codebase with 178+ migrations built
over several years (we've been maintaining a production Supabase project since 2023) can contain perfectly valid migrations for a
long time before someone adds a trailing `GRANT` that suddenly breaks
`supabase db reset`. The resulting error message points nowhere near the
root cause (the function definition is many hundreds of lines away) and
the word `atomic` is not obviously a reserved or dangerous token to the
developer.

Teams often hit this only when:

- Upgrading the CLI (when the splitter was rewritten), or
- Running `db reset` for the first time in a while, or
- Bringing a new contributor onto the project who tries to set up local dev

### Additional context

- `supabase migration repair --status reverted` is needed between failed
  attempts because the CLI records failed migrations as "applied" in
  `schema_migrations` before the error bubbles up.
- Pinning older CLI versions (`1.226.4`) doesn't help for all projects —
  the same bug triggers on a different SQL pattern in that line.
- The case sensitivity of the match is inconsistent: `ATOMIC` in a comment
  passes, `atomic` in a function name fails, `ATOMIC` in a function name
  also fails. A single case-insensitive raw text search for `atomic` is
  the likely mechanism.

### Happy to help

I have a minimal reproducer project that isolates the bug in 15 lines of
SQL. I can share it privately if useful. I also have a bisection log
showing exactly which SQL patterns do and don't trigger the error.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQL splitter matches 'atomic' too aggressively, triggering SQLSTATE 42601 #5062

Bug report

CLI version

Environment

Summary

Minimal reproducer

Variations confirmed with the same reproducer project

Workaround

Suggested fix

Impact

Additional context

Happy to help

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Variation	Result
Function body contains the word `atomic` in a comment, with no trailing statement	✅ passes
Function body contains `atomic` in a comment, with trailing `GRANT`	❌ fails
Function name contains `atomic` (`atomic_foo`, `foo_atomic`, `abc_atomic_bar`) + trailing statement	❌ fails
Function name `atomic_foo` with no trailing statement (nothing after `$$;`)	✅ passes
Whole file wrapped in `BEGIN; ... COMMIT;`	❌ still fails
Top-level `-- atomic` comment (no function with `atomic` in its name) + CREATE FUNCTION + GRANT	✅ passes

SQL splitter matches 'atomic' too aggressively, triggering SQLSTATE 42601 #5062

Description

Bug report

CLI version

Environment

Summary

Minimal reproducer

Variations confirmed with the same reproducer project

Workaround

Suggested fix

Impact

Additional context

Happy to help

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions