Skip to content

[Feature] Jinja/dbt template preprocessing for SQL analysis tools #63

@anandgupta42

Description

@anandgupta42

Problem

SQL analysis tools (sql_analyze, sql_format, sql_validate, sql_optimize) fail on Jinja-templated dbt SQL. For a dbt-focused tool, not being able to analyze {{ ref('model') }}, {{ source('src', 'table') }}, or {% if is_incremental() %} blocks is a significant gap.

Users working with dbt models must mentally strip Jinja before using analysis tools, and error messages don't explain WHY the tool failed on Jinja syntax.

Desired Behavior

SQL tools should handle dbt Jinja syntax gracefully, either by preprocessing it or by using dbt's own compilation when available.

Implementation Notes

Key Files

  • SQL analysis tools in packages/altimate-code/src/tool/ (sql-analyze, sql-format, sql-validate, sql-optimize)
  • Python engine SQL methods in packages/altimate-engine/
  • Bridge client for adding preprocessing step

Approach

Phase 1: Jinja preprocessor (quick win)

Add a preprocessing step that stubs common dbt macros before passing SQL to analysis tools:

{{ ref('orders') }}           → orders
{{ source('raw', 'events') }} → raw__events
{{ config(...) }}             → (removed)
{% if is_incremental() %}...{% endif %} → (strip block)
{{ this }}                    → __this__
{{ var('start_date') }}       → '__var_start_date__'
{# comments #}               → (removed)
{% set x = ... %}             → (removed)
{% for ... %}...{% endfor %}  → (keep inner content)

This can be a Python function in the engine that runs before any SQL parsing.

Phase 2: dbt compile integration

When a dbt project is detected:

  • Use dbt compile --select <model> to get fully rendered SQL from target/compiled/
  • Analyze the compiled output instead of the raw template
  • This gives 100% accurate rendering including custom macros

Phase 3: Graceful fallback

For any SQL input:

  1. First try analysis as-is
  2. If parse error detected, check if input contains Jinja patterns ({{, {%, {#)
  3. If yes, preprocess with Phase 1 stubs and retry
  4. If dbt project available, offer to use dbt compile for full accuracy
  5. Note in output: "SQL was preprocessed to remove Jinja templates — some analysis may be approximate"

Industry Patterns

  • SQLFluff (Jinja templater): Renders Jinja with Python's Jinja2 library, has built-in dbt macro stubs for ref, var, is_incremental()
  • SQLFluff (dbt templater): Uses dbt itself to render SQL — most accurate but requires working dbt installation
  • sqlglot: Does NOT handle Jinja (expects pre-rendered SQL) — confirms preprocessing is necessary
  • SQLMesh: Offers SQL-native macro system as Jinja replacement

Acceptance Criteria

  • sql_analyze works on SQL containing {{ ref() }}, {{ source() }}, {{ config() }}
  • sql_format and sql_validate handle Jinja syntax without crashing
  • Preprocessing stubs are accurate enough for meaningful analysis
  • Error messages explain Jinja limitation when preprocessing can't handle complex templates
  • When dbt project is available, option to use dbt compile for full accuracy
  • Output notes when Jinja preprocessing was applied (transparency)

Metadata

Metadata

Assignees

No one assigned

    Labels

    dbtdbt integrationenhancementNew feature or requestpriority:highHigh prioritysql-engineSQL analysis, lineage, translation

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions