@EDsCODE EDsCODE commented Dec 18, 2025

Summary

  • Replace regex-based query rewriting with proper AST-based transpilation using pg_query_go (PostgreSQL's C parser via CGO)
  • Add comprehensive type, function, and operator mappings based on sqlglot's DuckDB dialect
  • Add logging to show transpiled queries when they differ from the original

Changes

New transpiler/ package with transform pipeline:

  • PgCatalogTransform: Rewrites pg_catalog.* references to local views
  • TypeMappingTransform: Converts PostgreSQL types (JSONB→JSON, BYTEA→BLOB, INET→TEXT, etc.)
  • TypeCastTransform: Handles type casts (::regtype → ::VARCHAR)
  • FunctionTransform: Maps 50+ PostgreSQL functions (array_agg→list, string_to_array→string_split, etc.)
  • OperatorTransform: Handles JSON and regex operator compatibility
  • VersionTransform: Replaces version() with a PostgreSQL-compatible version string
  • SetShowTransform: Handles SET/SHOW command syntax with 70+ ignored parameters
  • OnConflictTransform: Handles ON CONFLICT (upsert) syntax
  • DDLTransform: Strips constraints for DuckLake mode
  • PlaceholderTransform: Converts $1, $2, ... placeholders to ? for the extended query protocol
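
To illustrate how a pipeline of transforms like the one listed above can be composed, here is a minimal sketch. It does not use pg_query_go: `Node`, `Transform`, `renameTransform`, `RunPipeline`, and `DefaultTransforms` are hypothetical names over a toy tree, chosen only to show the shape of the idea. The actual package operates on the parse tree produced by pg_query_go and may be structured differently.

```go
package main

import "strings"

// Node is a hypothetical, heavily simplified stand-in for a parse-tree node.
// The real transpiler works on pg_query_go's AST, not on this type.
type Node struct {
	Kind     string // e.g. "Select", "FuncCall", "TypeName"
	Name     string
	Children []*Node
}

// Transform mutates the tree in place and reports whether it changed anything.
type Transform interface {
	Name() string
	Apply(root *Node) bool
}

// walk visits every node depth-first, parents before children.
func walk(n *Node, visit func(*Node)) {
	if n == nil {
		return
	}
	visit(n)
	for _, c := range n.Children {
		walk(c, visit)
	}
}

// renameTransform renames nodes of one kind using a case-insensitive lookup table.
type renameTransform struct {
	name    string
	kind    string
	mapping map[string]string // keys are lowercase
}

func (t renameTransform) Name() string { return t.name }

func (t renameTransform) Apply(root *Node) bool {
	changed := false
	walk(root, func(n *Node) {
		if n.Kind != t.kind {
			return
		}
		if to, ok := t.mapping[strings.ToLower(n.Name)]; ok {
			n.Name = to
			changed = true
		}
	})
	return changed
}

// RunPipeline applies each transform in order and returns the names of the
// transforms that modified the tree. An empty result means the query is unchanged.
func RunPipeline(root *Node, transforms []Transform) []string {
	var applied []string
	for _, t := range transforms {
		if t.Apply(root) {
			applied = append(applied, t.Name())
		}
	}
	return applied
}

// DefaultTransforms builds two illustrative transforms from mappings named in this PR.
func DefaultTransforms() []Transform {
	return []Transform{
		renameTransform{
			name:    "TypeMappingTransform",
			kind:    "TypeName",
			mapping: map[string]string{"jsonb": "JSON", "bytea": "BLOB", "inet": "TEXT"},
		},
		renameTransform{
			name:    "FunctionTransform",
			kind:    "FuncCall",
			mapping: map[string]string{"array_agg": "list", "string_to_array": "string_split"},
		},
	}
}
```

Returning the list of transforms that fired is one way to decide whether the transpiled query differs from the original and is worth logging; that choice is an assumption of this sketch.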

Benefits over regex approach:

  • Proper handling of string literals (won't rewrite 'pg_catalog' inside strings)
  • Handles nested structures correctly
  • Easier to maintain and extend
  • Better error messages on parse failures
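
The string-literal problem can be reproduced with a small sketch. `NaiveRewrite` behaves like a blanket textual replacement, while `RewriteOutsideLiterals` skips single-quoted literals. Both function names and the `main.` replacement target are hypothetical. The second function is only a hand-rolled scanner, not an AST: it ignores dollar-quoted strings, quoted identifiers, and comments, which is part of why parsing with a real PostgreSQL parser is the more robust route.

```go
package main

import "strings"

// NaiveRewrite replaces every occurrence of old, including those inside
// string literals. This mirrors the failure mode of blanket textual rewriting.
func NaiveRewrite(sql, old, replacement string) string {
	return strings.ReplaceAll(sql, old, replacement)
}

// RewriteOutsideLiterals replaces old only outside single-quoted literals.
// Inside a literal, two consecutive quotes ('') are treated as an escaped quote.
func RewriteOutsideLiterals(sql, old, replacement string) string {
	if old == "" {
		return sql // nothing to match; also avoids looping forever
	}
	var out strings.Builder
	inLiteral := false
	for i := 0; i < len(sql); {
		switch {
		case sql[i] == '\'':
			if inLiteral && i+1 < len(sql) && sql[i+1] == '\'' {
				// Escaped quote: copy both bytes and stay inside the literal.
				out.WriteString("''")
				i += 2
				continue
			}
			inLiteral = !inLiteral
			out.WriteByte('\'')
			i++
		case !inLiteral && strings.HasPrefix(sql[i:], old):
			out.WriteString(replacement)
			i += len(old)
		default:
			out.WriteByte(sql[i])
			i++
		}
	}
	return out.String()
}
```

For the input `SELECT 'pg_catalog.pg_class' FROM pg_catalog.pg_class`, the naive version also rewrites the text inside the literal, while the literal-aware version changes only the table reference.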

Test plan

  • All existing tests pass
  • New transpiler tests cover type mappings, function mappings, DDL stripping, etc.
  • Integration tests verify end-to-end ETL workflow
  • Manual testing with psql/lib-pq clients
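
As a rough idea of what table-driven coverage for type mappings could look like, here is a sketch. It is not the PR's test code: `MapType`, `typeCase`, `TypeCases`, and `CheckTypeCases` are hypothetical, and passing unmapped types through in upper case is an assumption. In a real test file the loop would more likely live in a `TestXxx` function using `t.Run`.

```go
package main

import "strings"

// typeMap holds the PostgreSQL to DuckDB mappings named in this PR.
var typeMap = map[string]string{
	"JSONB": "JSON",
	"BYTEA": "BLOB",
	"INET":  "TEXT",
}

// MapType returns the mapped type name, or the upper-cased input when no
// mapping exists (pass-through is an assumption of this sketch).
func MapType(pgType string) string {
	upper := strings.ToUpper(strings.TrimSpace(pgType))
	if duck, ok := typeMap[upper]; ok {
		return duck
	}
	return upper
}

// typeCase is one row of the test table.
type typeCase struct {
	name string
	in   string
	want string
}

// TypeCases lists inputs and expected outputs for MapType.
func TypeCases() []typeCase {
	return []typeCase{
		{name: "jsonb", in: "JSONB", want: "JSON"},
		{name: "bytea lower case", in: "bytea", want: "BLOB"},
		{name: "inet with spaces", in: " inet ", want: "TEXT"},
		{name: "unmapped passes through", in: "integer", want: "INTEGER"},
	}
}

// CheckTypeCases runs every case and returns one message per failure.
func CheckTypeCases(cases []typeCase) []string {
	var failures []string
	for _, c := range cases {
		if got := MapType(c.in); got != c.want {
			failures = append(failures, c.name+": got "+got+", want "+c.want)
		}
	}
	return failures
}
```

Keeping the cases in a table makes it cheap to add a row whenever a new mapping is introduced.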

🤖 Generated with Claude Code

EDsCODE and others added 3 commits December 18, 2025 10:59
Replace regex-based query rewriting with proper AST-based transpilation
using pg_query_go (PostgreSQL's C parser via CGO). This provides:

- Proper handling of string literals (won't rewrite 'pg_catalog' in strings)
- Type mappings: JSONB→JSON, BYTEA→BLOB, INET→TEXT, etc.
- Function mappings: array_agg→list, string_to_array→string_split, etc.
- Operator compatibility for JSON and regex operators
- ON CONFLICT (upsert) syntax support
- DDL constraint stripping for DuckLake mode
- SET/SHOW command handling with 70+ ignored parameters

The transpiler package applies a pipeline of transforms to the parsed AST:
- PgCatalogTransform: pg_catalog.* references
- TypeMappingTransform: PostgreSQL→DuckDB type conversions
- TypeCastTransform: ::regtype→::VARCHAR
- FunctionTransform: 50+ function mappings
- OperatorTransform: operator compatibility
- VersionTransform: version() replacement
- SetShowTransform: SET/SHOW command syntax
- OnConflictTransform: upsert handling
- DDLTransform: constraint stripping (DuckLake mode)
- PlaceholderTransform: $1→? conversion

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Show which query is actually executed after transpilation, to help debug
PostgreSQL-to-DuckDB conversion issues.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@EDsCODE EDsCODE merged commit 7aac83b into main Dec 18, 2025
7 checks passed