Skip to content

Add Snowflake Cortex Semantic Model adapter#30

Merged
nicosuave merged 3 commits intomainfrom
nicosuave/snowflake-semantic
Jan 3, 2026
Merged

Add Snowflake Cortex Semantic Model adapter#30
nicosuave merged 3 commits intomainfrom
nicosuave/snowflake-semantic

Conversation

@nicosuave
Copy link
Copy Markdown
Member

Summary

  • Adds bidirectional adapter for Snowflake Cortex Semantic Model format (parse and export)
  • Handles tables, dimensions, time dimensions, facts, metrics, relationships, and filters
  • Includes comprehensive test suite with 60 tests including kitchen sink integration tests with real DuckDB database
  • Integrates with auto-loader for automatic format detection

Key features

  • Parse Snowflake semantic model YAML files into Sidemantic's internal representation
  • Export Sidemantic models back to valid Snowflake format
  • Column qualification for filters/segments (converts bare column names to {model}.column format)
  • Proper handling of complex derived metrics vs simple aggregations
  • Cross-format conversion support with Cube and MetricFlow adapters

Implement bidirectional adapter for Snowflake Cortex Semantic Model format:

Parsing:
- Tables with base_table references (database.schema.table)
- Dimensions (categorical, numeric, boolean)
- Time dimensions with granularity support
- Facts with aggregations (sum, count, avg, min, max, median)
- Complex metrics with derived expressions
- Relationships between tables
- Filters parsed as segments with column qualification

Export:
- Proper Snowflake YAML structure with name, tables, relationships
- Dimension type mapping to Snowflake data types (TEXT, NUMBER, BOOLEAN, TIMESTAMP)
- Time dimensions separated from regular dimensions
- Simple metrics as facts, complex metrics with expr
- Segments exported as filters

Tests (60 tests):
- Parsing: basic models, dimensions, metrics, relationships, filters
- Roundtrip: parse -> export -> parse preserves semantics
- Conversion: cross-format with Cube and MetricFlow
- Query: SQL generation with joins and segments
- Kitchen sink: integration tests with real DuckDB database
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66e5d08ffc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +292 to +296
return Metric(
name=name,
type="derived",
sql=expr,
description=metric_def.get("description"),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Rewrite complex metrics to reference model columns

When a Snowflake metric expression contains multiple aggregates, _parse_metric returns a derived metric with the raw expr string. The SQL generator treats derived metrics with inline aggregates as already‑valid SQL and does not rewrite column references. That means expressions like SUM(quantity * unit_price) / COUNT(DISTINCT order_id) (from the sample Snowflake models) will be emitted as‑is, but quantity/unit_price are not selected into the model CTE (only quantity_raw/unit_price_raw exist), so queries that include these metrics will fail with unknown/ambiguous column errors. This adapter should either translate those expressions to use model placeholders/measure references or convert them into simple aggregations built from raw columns so the generator can resolve them.

Useful? React with 👍 / 👎.

Comment on lines +664 to +665
if segment.sql:
filter_def["expr"] = segment.sql
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Strip {model} placeholders when exporting filters

Sidemantic segments are expected to use {model} placeholders, but Snowflake filters require bare column names. _export_filter writes segment.sql verbatim, so round‑tripping a Snowflake model will emit filters like {model}.status = 'delivered', which are not valid Snowflake semantic model expressions. This will break export for any segment created by this adapter (or by users who follow Sidemantic’s {model} convention). The export should remove {model}./{model} before serializing.

Useful? React with 👍 / 👎.

Kitchen sink test revealed that derived metrics with inline SQL aggregations
(like `SUM(quantity * unit_price) / COUNT(DISTINCT order_id)`) weren't being
properly handled. The column references inside the aggregations need to be
qualified to point to the CTE columns.

Changes:
- Update _build_metric_sql to properly qualify column references in metrics
  that contain inline aggregate functions
- Dimensions are qualified as {model}_cte.{dim} (no _raw suffix)
- Metrics/facts are qualified as {model}_cte.{name}_raw
- Skip complex SQL expressions when building the column map
- Skip adding dimension columns to the metrics map to avoid incorrect _raw suffix
- Add test_avg_order_value to kitchen sink tests to verify the fix
- Extract _qualify_columns() as shared function in Snowflake adapter
- Qualify column references at parse time with {model} placeholder
- Fix CTE generation to include measure columns referenced in derived metrics
- Simplify generator to just replace {model} and add _raw suffix for measures
@nicosuave nicosuave merged commit 31875cf into main Jan 3, 2026
10 checks passed
@nicosuave nicosuave deleted the nicosuave/snowflake-semantic branch January 3, 2026 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant