feat(wren): add cross-dialect type translation to type_mapping#2410

Merged: goldmedal merged 2 commits into Canner:main from Bartok9:feat/cross-dialect-type-translation on Jun 30, 2026

Conversation

@Bartok9 Bartok9 commented Jun 29, 2026

Contributor

What & why

wren.type_mapping already gives us parse_type() / parse_types() to normalize a raw DB type string into sqlglot's canonical form for a single dialect. A common adjacent need in cross-engine modeling (mirroring a Postgres schema into BigQuery, generating MDL for a different target engine, etc.) is to translate a type from one engine's spelling into another's.

This PR adds that as a small, focused companion:

  • translate_type(type_str, source_dialect, target_dialect) — parse in source dialect, render in target dialect.
  • translate_types(columns, source_dialect, target_dialect, *, type_field="raw_type") — batch variant mirroring parse_types (non-mutating, adds a type key).
  • wren utils translate-type and wren utils translate-types CLI commands, matching the existing parse-type / parse-types UX (single via flags, batch via stdin/--input JSON).

Examples:

| input | source → target | output |
|---|---|---|
| int8 | postgres → bigquery | INT64 |
| character varying(255) | postgres → clickhouse | Nullable(String) |
| INT64 | bigquery → postgres | BIGINT |
| DECIMAL(10,2) | mysql → snowflake | DECIMAL(10, 2) |

Parse failures fall back to the original string, matching parse_type's existing contract.
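
As a rough illustration of the parse-then-render flow and the fallback contract described above, here is a minimal sketch. It is not the code from this PR: the name `translate_type_sketch` is hypothetical, and using `sqlglot.exp.DataType.build(...)` followed by `.sql(...)` is an assumption about one way to do this with sqlglot. The lazy import and the broad `except` are choices made for the sketch, so that a missing sqlglot install or an unparseable type both take the passthrough path.

```python
def translate_type_sketch(type_str: str, source_dialect: str, target_dialect: str) -> str:
    """Parse type_str in source_dialect and render it in target_dialect.

    Hypothetical sketch, not the wren implementation.
    """
    # Empty input is passed through unchanged.
    if not type_str:
        return type_str
    try:
        # Imported lazily (an assumption of this sketch) so that a missing
        # sqlglot install takes the same fallback path as a parse failure.
        from sqlglot import exp

        parsed = exp.DataType.build(type_str, dialect=source_dialect)
        return parsed.sql(dialect=target_dialect)
    except Exception:
        # Fallback: hand back the original string instead of raising.
        return type_str
```

Calling `translate_type_sketch("int8", "postgres", "bigquery")` corresponds to the first row of the examples table, for which the PR description reports `INT64`.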

Implementation notes

  • Built on sqlglot only — no connector drivers — so it's consistent with the existing module and the new tests run in the lightweight unit tests CI job (no pytest.importorskip guard needed).
  • Original input dicts are never mutated; helpers return new lists.
  • No new dependencies; no behavior change to existing parse_* paths.
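
The non-mutating batch behavior noted above can be sketched as follows. This is an illustration only: `translate_types_sketch` and its `translate` callable parameter are hypothetical (the real helper takes no such parameter), the stub `fake_translate` stands in for the actual translation, and treating a missing type field as an empty string is an assumption.

```python
from typing import Callable


def translate_types_sketch(
    columns: list[dict],
    source_dialect: str,
    target_dialect: str,
    *,
    type_field: str = "raw_type",
    translate: Callable[[str, str, str], str],
) -> list[dict]:
    """Return new dicts with an added "type" key; inputs are left untouched."""
    return [
        # {**col, ...} builds a fresh dict, so the caller's dict is not mutated.
        {**col, "type": translate(col.get(type_field, ""), source_dialect, target_dialect)}
        for col in columns
    ]


def fake_translate(type_str: str, source_dialect: str, target_dialect: str) -> str:
    # Stub lookup standing in for the real translation (hypothetical).
    table = {("int8", "postgres", "bigquery"): "INT64"}
    # Unknown entries fall back to the original string.
    return table.get((type_str, source_dialect, target_dialect), type_str)


columns = [{"name": "id", "raw_type": "int8"}, {"name": "blob", "raw_type": "mystery"}]
result = translate_types_sketch(columns, "postgres", "bigquery", translate=fake_translate)
```

After this runs, `columns` still holds only `name` and `raw_type`, while each dict in `result` carries the extra `type` key.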

Tests

Added to tests/unit/test_type_mapping.py:

  • parametrized translate_type cases (cross-dialect, identity-normalization, unknown-type fallback, empty passthrough)
  • translate_types batch tests (adds field, no mutation, custom field, empty list)
  • CLI integration tests for translate-type / translate-types

Local run (core/wren, sqlglot 30.12.0):

$ python -m pytest tests/unit/test_type_mapping.py -q
43 passed in 2.27s
$ ruff check src/wren/type_mapping.py src/wren/utils_cli.py tests/unit/test_type_mapping.py
All checks passed!

Summary by CodeRabbit

  • New Features
    • Added SQL type translation between dialects for single values and batch inputs.
    • Added CLI commands to translate one type or a JSON list (from a file or stdin).
  • Bug Fixes
    • Improved error handling for unreadable input files and invalid JSON, returning a clean nonzero exit with an error message.
    • Unknown/invalid types now safely fall back to the original value.
    • Batch translation avoids mutating the original records.
  • Tests
    • Expanded unit and CLI integration coverage for translation, custom type fields, empty inputs, and failure scenarios.

Add translate_type() and translate_types() helpers plus utils
`translate-type`/`translate-types` CLI commands that parse a SQL
type string in a source dialect and re-render it in a target dialect
(e.g. postgres int8 -> bigquery INT64, postgres character varying(255)
-> clickhouse Nullable(String)).

This complements the existing parse_type/parse_types normalization for
schema-mirroring and cross-engine modeling workflows where a column's
type must be expressed in a different engine's spelling. Parsing failures
fall back to the original string, matching parse_type's behavior.

Built on sqlglot only (no connector drivers), so the new tests run in
the lightweight unit CI job.
github-actions Bot added the python and core labels on Jun 29, 2026

coderabbitai Bot commented Jun 29, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c8008469-15fb-4b3e-8fd6-ae762be6a39f

📥 Commits

Reviewing files that changed from the base of the PR and between dbd361f and 8223481.

📒 Files selected for processing (2)
  • core/wren/src/wren/utils_cli.py
  • core/wren/tests/unit/test_type_mapping.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • core/wren/src/wren/utils_cli.py
  • core/wren/tests/unit/test_type_mapping.py

Walkthrough

Adds cross-dialect SQL type translation helpers, two CLI commands for single and batch translation, and tests covering translation behavior, fallback, stdin/file input, and error handling.

Changes

SQL Type Translation

| Layer / File(s) | Summary |
|---|---|
| Type translation helpers (core/wren/src/wren/type_mapping.py) | Adds translate_type for parse/render translation with fallback, and translate_types for batch dict processing with a derived type field. |
| CLI translation commands (core/wren/src/wren/utils_cli.py) | Adds translate-type and translate-types commands, including file/stdin loading, error handling, and JSON output. |
| Translation tests (core/wren/tests/unit/test_type_mapping.py) | Adds unit tests for translation behavior and CLI tests for both commands, including fallback, stdin input, and file-read errors. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I hop through types from shore to shore,
One dialect in, one dialect more.
If parsing trips, I keep the old,
And JSON bounces neat and bold.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding cross-dialect type translation to type_mapping. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/wren/src/wren/utils_cli.py`:
- Around line 103-114: The input parsing in the CLI path around the
`Path.read_text`, `json.loads`, and `json.load` flow only handles
`json.JSONDecodeError`, so file read and encoding failures from `--input` can
still surface as tracebacks. Update the `utils_cli` parsing block to catch
`OSError` and `UnicodeDecodeError` alongside the existing JSON decode handling,
and route them through the same `typer.echo(..., err=True)` plus `typer.Exit(1)`
behavior used for missing files and invalid JSON.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 44c354b1-2779-46d7-af18-a20cceb1753a

📥 Commits

Reviewing files that changed from the base of the PR and between 1be4baa and dbd361f.

📒 Files selected for processing (3)
  • core/wren/src/wren/type_mapping.py
  • core/wren/src/wren/utils_cli.py
  • core/wren/tests/unit/test_type_mapping.py

Comment thread core/wren/src/wren/utils_cli.py
translate-types and parse-types now catch file read/decode failures and
exit(1) cleanly instead of leaking a traceback, matching the existing
not-found and invalid-JSON handling. Adds regression tests.

Bartok9 commented Jun 29, 2026

Contributor Author

Good catch — fixed in 82234814. Wrapped path.read_text(...) in both translate-types and parse-types so OSError/UnicodeDecodeError now hit the clean typer.Exit(1) path with a could not read file message instead of escaping as a traceback, matching the existing not-found and invalid-JSON handling. Added two regression tests (missing file → exit 1 + no traceback; unreadable path → could not read file + no traceback). Full suite: 45 passed.

Bartok9 commented Jun 29, 2026

Contributor Author

The one actionable finding (catch OSError/UnicodeDecodeError alongside json.JSONDecodeError on --input file reads) is already addressed in commit 8223481 for both translate-types and parse-types — the path.read_text(...) call is wrapped in a try/except that routes OSError/UnicodeDecodeError through typer.echo(..., err=True) + typer.Exit(1). No further change needed.
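
For readers following the thread, the error-handling shape described in the comments above might look roughly like the sketch below. It is not the code from commit 8223481: `load_columns_sketch` is a hypothetical name, the message wording is illustrative, and plain `print(..., file=sys.stderr)` with `SystemExit(1)` is swapped in for `typer.echo(..., err=True)` and `typer.Exit(1)` so the example runs on the standard library alone.

```python
import json
import sys
from pathlib import Path
from typing import Optional, TextIO


def load_columns_sketch(input_path: Optional[str], stdin: TextIO = sys.stdin) -> list:
    """Load a JSON list from a file (or stdin) and exit cleanly on failure."""
    try:
        if input_path is None:
            text = stdin.read()
        else:
            text = Path(input_path).read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # Missing, unreadable, or badly encoded input: report and exit 1
        # rather than letting a traceback escape.
        print(f"Error: could not read file: {exc}", file=sys.stderr)
        raise SystemExit(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Same clean exit path for invalid JSON.
        print(f"Error: invalid JSON: {exc}", file=sys.stderr)
        raise SystemExit(1)
```

`FileNotFoundError` is a subclass of `OSError`, so a single `except` clause covers the missing-file case as well as permission and encoding problems.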

@goldmedal goldmedal left a comment

Collaborator

Thanks @Bartok9, looks good 👍

@goldmedal goldmedal merged commit e124ff5 into Canner:main Jun 30, 2026
10 checks passed

Labels

core, python

2 participants