fix(postgres): strip trailing semicolon before subquery-wrapping #2407
query()/dry_run() wrap user SQL as 'SELECT * FROM ({sql}) AS _sub LIMIT N'.
A trailing ';' made the subquery invalid (Postgres: syntax error at or
near ";"), so any query ending in a semicolon failed when a limit/dry-run
wrap was applied. The canner, trino, and clickhouse connectors already
strip the terminating semicolon run before wrapping; postgres did not.
Add the same _strip_trailing_semicolon helper (preserves semicolons inside
string literals) and use it at both wrap sites.
|
@Bartok9, please check the CI failures. Thanks.
The 'unit tests' CI job does not install the psycopg/postgres extra (only the
dedicated 'postgres tests' job does). test_postgres_strip_semicolon.py imports
wren.connector.postgres, which imports psycopg at module load, so collection
raised ModuleNotFoundError: No module named 'psycopg' and failed the entire
unit suite.
Add pytest.importorskip('psycopg') so the test skips cleanly in the unit job
and still runs in the postgres job (which has the driver). No product code
change.
|
Thanks @goldmedal, CI fix pushed. ✅ Root cause: the Fix: added Verified locally: without psycopg the file now reports
|
Thanks for the fix — the change itself is correct and nicely consistent with the canner/clickhouse helpers. One thing on the test coverage though: The new test doesn't actually run in any CI job, so the regression it's meant to guard isn't protected. Tracing the two jobs in
So Suggestion: move these four tests into
|
Good catch, and you're exactly right. Moved the four mocked tests into Verified locally in the postgres job's scope: Now the regression is actually guarded. Thanks for the careful trace through the CI jobs. 🙏
Summary
PostgresConnector.query() and dry_run() wrap user SQL as 'SELECT * FROM ({sql}) AS _sub LIMIT N'. A trailing ';' in the user SQL made the wrapped subquery invalid: 'SELECT * FROM (SELECT 1;) AS _sub LIMIT N' → Postgres: syntax error at or near ";".
Motivation
The canner, trino, and clickhouse connectors already strip the terminating semicolon run before subquery-wrapping (each has a _strip_trailing_semicolon helper). The postgres connector, the reference psycopg implementation, was the inconsistent one and wrapped the raw SQL:
Fix
Add the same _strip_trailing_semicolon helper (regex [;\s]+\Z, so it strips only the trailing run and preserves semicolons inside string literals like SELECT 'a;b') and call it at both wrap sites (query and dry_run).
Real behavior proof
Real environment: Python 3.12, core/wren via uv run --no-sync (with the postgres extra installed for the import), repo main.
Commands:
AFTER fix:
Pre-fix wrap (demonstrates the bug):
Tests
tests/unit/test_postgres_strip_semicolon.py uses a mocked connection (no live DB) and asserts that the SQL the connector actually executes for query and dry_run is the stripped, valid form. The two behavior tests fail without the call-site fix.
Scope / risk
Apache-2.0 path (core/wren). Touches only the two wrap sites + a new pure helper. SQL without a trailing semicolon is byte-identical to before; semicolons inside string literals are preserved. (The same gap exists in redshift/duckdb/datafusion; happy to follow up with a consistent sweep if you'd prefer one PR for all.)
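The mocked-connection approach described under Tests can be sketched roughly as follows. This is not the PR's test file: FakeWrappingConnector is a hypothetical stand-in for PostgresConnector (so the example runs without psycopg or the wren package), and executed_sql is an illustrative helper. It only shows how a mock can capture the statement that reaches the cursor.

```python
import re
from unittest.mock import MagicMock


def _strip_trailing_semicolon(sql: str) -> str:
    # Same trailing-run regex as quoted in the Fix section.
    return re.sub(r"[;\s]+\Z", "", sql)


class FakeWrappingConnector:
    """Hypothetical stand-in for the real connector (assumption, not wren code)."""

    def __init__(self, connection):
        self.connection = connection

    def query(self, sql: str, limit: int):
        inner = _strip_trailing_semicolon(sql)
        wrapped = f"SELECT * FROM ({inner}) AS _sub LIMIT {limit}"
        cursor = self.connection.cursor()
        cursor.execute(wrapped)
        return cursor.fetchall()


def executed_sql(sql: str, limit: int = 10) -> str:
    """Run query() against a mocked connection and return the SQL it executed."""
    connection = MagicMock()  # no live database involved
    FakeWrappingConnector(connection).query(sql, limit)
    # call_args unpacks to (positional args, keyword args) of the last call.
    (statement,), _kwargs = connection.cursor.return_value.execute.call_args
    return statement


def test_query_strips_trailing_semicolon():
    assert executed_sql("SELECT 1;") == "SELECT * FROM (SELECT 1) AS _sub LIMIT 10"


def test_query_preserves_semicolon_in_literal():
    assert executed_sql("SELECT 'a;b';") == "SELECT * FROM (SELECT 'a;b') AS _sub LIMIT 10"
```

Asserting on the executed statement rather than on the helper alone is what makes such a test fail when the call site stops using the helper.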