Skip to content

Experimental postgres query (PR 3/4): multi-input + error formatting#5138

Open
simonfaltum wants to merge 7 commits intosimonfaltum/postgres-query-pr2-streamingfrom
simonfaltum/postgres-query-pr3-multi-input
Open

Experimental postgres query (PR 3/4): multi-input + error formatting#5138
simonfaltum wants to merge 7 commits intosimonfaltum/postgres-query-pr2-streamingfrom
simonfaltum/postgres-query-pr3-multi-input

Conversation

@simonfaltum
Copy link
Copy Markdown
Member

@simonfaltum simonfaltum commented Apr 30, 2026

PR Stack

  1. #5135 — PR 1: scaffold + autoscaling targeting + text output
  2. #5136 — PR 2: provisioned + JSON/CSV streaming + types + sqlcli.ResolveFormat
  3. PR 3 (this PR)#5138 — multi-input + multi-statement rejection + error formatting + sqlcli.Collect
  4. #5143 — PR 4: cancellation + timeout + TUI

Stacked on PR 2.

Why

PR 2 shipped a single-statement, single-input command. Real workflows want multi-input (set-then-query, file-then-stdin), multi-statement rejection with a friendly hint, and rich pg error formatting.

This PR also extends experimental/libs/sqlcli with input-collection logic shared by aitools and postgres. Same architectural principle as PR 2: instead of postgres growing its own duplicate of aitools' resolveSQLs, both commands now call sqlcli.Collect.

Changes

Architectural: experimental/libs/sqlcli/input.go adds:

  • sqlcli.SQLFileExtension const (.sql).
  • sqlcli.Input{SQL, Source} type — Source is the human-readable origin label ("--file PATH", "argv[N]", "stdin").
  • sqlcli.CollectOptions{Cleaner func(string) string} — aitools passes its cleanSQL (strips comments+quotes); postgres passes the default TrimSpace because its multi-statement scanner needs comments preserved.
  • sqlcli.Collect — files-first then positionals, stdin only when neither is present, .sql autodetect on positionals.

aitools' resolveSQLs collapses to a thin wrapper around sqlcli.Collect (drops the SQL strings, ignores Source). The "SQL statement #N is empty after removing comments" wording is replaced with sqlcli's argv[N] is empty; aitools tests updated.

User-facing changes for postgres query:

  • Variadic positionals + repeatable --file + stdin fallback.
  • Multi-statement strings rejected up front with a hint (the hand-written conservative scanner ignores ; inside string literals, identifiers, line/block comments, and dollar-quoted bodies; tag must be a valid unquoted identifier so $1 and $foo-bar$ are correctly NOT treated as tags).
  • Multi-input output: per-unit blocks for text; canonical-shape JSON array {"source","sql","kind","elapsed_ms",...} for json; csv rejected pre-flight when N>1.
  • Rich pg error formatting (SEVERITY: message (SQLSTATE XXXXX) with DETAIL/HINT lines), applied on both single-input and multi-input paths.

Single-input keeps streaming. runUnitBuffered is a thin wrapper around executeOne + a bufferSink, so the row-loop and error-wrapping logic stays in one place.

Test plan

  • go test ./experimental/... (multistatement scanner: 28 cases including dollar-tag punctuation rejection, sqlcli.Collect: 12 cases including a custom-cleaner test, error formatting, multi-input renderers including byte-equal canonical-shape and first-unit-fails framing)
  • go tool ... golangci-lint run ./experimental/... (0 issues)

This is PR 3 of the experimental postgres query stack. Adds the rest
of the input ergonomics promised in the plan and the error-formatting
polish.

Inputs: positional args become variadic, --file is repeatable, stdin
is read when neither is present, and a positional ending in '.sql'
that exists on disk is treated as a SQL file. Execution order is
files-first then positionals (cobra/pflag does not preserve interleaved
spelling, documented in --help).

Each input unit must contain exactly one statement. checkSingleStatement
walks the SQL with a hand-written conservative scanner that ignores
';' inside single-quoted strings, double-quoted identifiers, line
comments, block comments, and dollar-quoted bodies. Multi-statement
strings are rejected before connect with a hint pointing at the
multi-input alternatives.

Multi-input output:

  - text: each per-unit result rendered inline, separated by a blank
    line (mirrors psql's compact text shape).
  - json: top-level array of per-unit result objects with shape
    {"sql","kind","elapsed_ms",...}; rows-producing units carry a
    "rows":[...] array, command-only carry "command"+"rows_affected".
    Each per-unit object is buffered to completion before write; the
    outer array streams across units. The plan accepts this trade-off:
    huge SELECTs in multi-input invocations buffer.
  - csv: rejected pre-flight when N>1 (no sensible cross-schema shape).
    Single-input csv keeps streaming.

Per-unit errors render as a {"kind":"error", ...} entry in the JSON
shape so scripts can detect failure without checking exit code.
Sequential execution stops on the first failing unit; the successful
prefix is rendered.

formatPgError renders *pgconn.PgError with SEVERITY, SQLSTATE, DETAIL,
HINT inline. Non-PgError values pass through unchanged so connect-time
errors keep their original wording.

Single-input keeps the streaming sinks from PR 2; only multi-input goes
through the buffered renderer. Session state (SET, temp tables) carries
across input units because they share one connection.

TUI for >30 rows is deferred to a follow-up. The current text path uses
the static tabwriter table for both single- and multi-input.

Co-authored-by: Isaac
MUSTs:
- Multi-input JSON error envelope: thread the failing *unitResult into
  renderJSONMulti so source/sql/elapsed_ms reflect the actual failing
  input instead of empty strings.
- Canonical key order for every per-unit object:
    {"source", "sql", "kind", "elapsed_ms", payload}
  Success and error envelopes now share the same shape so consumers
  don't have to special-case kind=="error" for missing fields.

SHOULDs:
- Single-input path now goes through formatPgError, so DETAIL/HINT
  surface consistently across single- and multi-input.
- runUnitBuffered reuses executeOne via a new bufferSink. The two
  query loops collapse to one; future error-handling changes auto-
  propagate.
- Scanner: reject `$<digit>...` as a dollar-quote tag (PG docs forbid
  digit-leading tags). Pinned with a test for `SELECT $1, $2 FROM t`
  and `SELECT $1 FROM t; SELECT 2`.
- Pin the E-string over-rejection behaviour with a test, so a future
  scanner improvement has to update the assertion.

CONSIDERs:
- Capture elapsed_ms on the error path too (was previously discarded).
- Promote multiStatementHint to a const.
- Drop jsonEscapeShort (was a fragile micro-opt for an always-ASCII
  domain); use marshalJSON for the command verb instead.
- Add TestRenderJSONMulti_FirstUnitFails to pin the empty-success-
  prefix framing.

Co-authored-by: Isaac
Round-2 reviewer noted jsonErrorObject's defensive branches around
writeJSONUnitHeader/marshalJSON are unreachable (encoding/json doesn't
error on string inputs), and the repo rule says drop "just in case"
fallbacks. Replace with panic-on-impossible helpers.

Co-authored-by: Isaac
…tum/postgres-query-pr3-multi-input

# Conflicts:
#	acceptance/cmd/experimental/postgres/query/argument-errors/output.txt
#	acceptance/cmd/experimental/postgres/query/argument-errors/script
…tum/postgres-query-pr3-multi-input

# Conflicts:
#	experimental/postgres/cmd/query.go
@arsenyinfo
Copy link
Copy Markdown
Contributor

Invalid dollar-quote tag character validation allows multi-statement SQL to bypass the single-statement guard

  • Priority: P2
  • Location: experimental/postgres/cmd/multistatement.go:136-153 (readDollarTag), called from multistatement.go:69-76 (checkSingleStatement), guard enforced at experimental/postgres/cmd/query.go:127-135
  • Scenario: User passes input such as SELECT $foo-bar$a;b$foo-bar$. readDollarTag accepts foo-bar as a valid tag because it only rejects a digit-starting first character or whitespace/; anywhere; it does not reject punctuation like -. checkSingleStatement then calls scanDollarBody and skips over the ; as if it were inside a dollar-quoted body. PostgreSQL does not allow - in a dollar-quote tag (tags must follow unquoted-identifier rules: letters, digits, underscores only), so PostgreSQL would actually parse this as two statements. The pre-flight guard passes, the SQL is sent to the server, and either two statements execute or a parse error is returned — violating the guard's contract.
  • Potential solution: After the digit-start check (line 144), validate every subsequent tag character against PostgreSQL identifier-continuation rules (Unicode letter, digit, or underscore); return ("", start) on the first character that fails. This prevents punctuation-containing pseudo-tags from being treated as real dollar-quote openers.

🔍 Reviewed by nitpicker

…tum/postgres-query-pr3-multi-input

# Conflicts:
#	experimental/postgres/cmd/query.go
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 1, 2026 07:02 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 1, 2026 07:02 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants