Skip to content

Align native contract and SQL correctness#184

Open
nicosuave wants to merge 13 commits into
review/cli-ci-release-hardeningfrom
review/python-native-contract
Open

Align native contract and SQL correctness#184
nicosuave wants to merge 13 commits into
review/cli-ci-release-hardeningfrom
review/python-native-contract

Conversation

@nicosuave
Copy link
Copy Markdown
Member

@nicosuave nicosuave commented May 31, 2026

Base: #183

  • Align Python and Rust native loading for compact SQL/YAML fixtures, relationship defaults, aliases, statistical aggregations, and table calculations.
  • Keep the generated JSON schema in sync with native relationship aliases.
  • Fix graph-level variance_pop SQL emission and add regression coverage.

@nicosuave nicosuave force-pushed the review/python-native-contract branch from 6ccaf41 to cd7a08e Compare May 31, 2026 18:11
@nicosuave nicosuave changed the title Align Python native contract and SQL correctness Align native contract and SQL correctness May 31, 2026
@nicosuave nicosuave force-pushed the review/cli-ci-release-hardening branch from 326838a to c6b38ca Compare May 31, 2026 18:49
@nicosuave nicosuave force-pushed the review/python-native-contract branch from cd7a08e to 4b9c638 Compare May 31, 2026 18:49
@nicosuave nicosuave marked this pull request as ready for review May 31, 2026 21:47
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4b9c638ef7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +2063 to +2064
if jp.custom_condition:
join_cond = self._custom_join_condition(jp)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Project columns referenced by custom joins

When a relationship sql uses columns that are not the declared foreign/primary keys (for example {from}.tenant_id = {to}.tenant_id AND {to}.valid_to IS NULL), this branch emits the predicate against the per-model CTE aliases, but _build_model_cte only projects dimensions, metrics, PKs, and declared join keys. The generated SQL then references missing CTE columns like customers_cte.valid_to; either extract/project columns from the custom predicate or reject predicates that use non-projected columns.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 38929833cf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +76 to +83
METRIC_FIELDS = {
"name",
"extends",
"type",
"agg",
"sql",
"expr",
"measure",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve legacy derived metric dependencies

With the new strict field validation, native YAML exported by previous versions for a derived metric can no longer be loaded: the old exporter wrote a metric-local metrics: [...] dependency list for derived metrics, but this allow-list omits that field, so _parse_metric() now raises unknown native field(s) before it can ignore or migrate it. This affects users re-loading previously exported Sidemantic YAML whenever a derived metric had detected dependencies; allow the legacy metrics key or handle it during parsing to keep those files compatible.

Useful? React with 👍 / 👎.

@nicosuave nicosuave force-pushed the review/python-native-contract branch from 3892983 to 262e1a8 Compare May 31, 2026 22:20
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 262e1a8f9a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if resolved_metric and resolved_model_name is None:
metric_expr = self._build_metric_sql(resolved_metric)
metric_expr = self._wrap_with_fill_nulls(metric_expr, resolved_metric)
alias = aliases.get(metric_ref, resolved_metric.name)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve dotted graph metric aliases in ORDER BY

When a graph-level metric itself contains a dot (now resolved before treating model.metric as a model measure), this branch emits the SELECT alias as the full metric name, e.g. AS "orders.revenue", but the existing ORDER BY path later strips everything before the dot and orders by revenue. A query like metrics=["orders.revenue"] with a graph-level metric named orders.revenue and order_by=["orders.revenue DESC"] therefore generates ORDER BY revenue DESC, which does not match the projected alias and fails unless an unrelated revenue column exists; use the same alias normalization for dotted graph metrics as the SELECT list.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 262e1a8f9a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1464 to +1469
if resolved_metric and ref_model_name is None:
for dep in resolved_metric.get_dependencies(self.graph, model_name):
collect_measures_from_metric(dep, visited)
if resolved_metric.sql and sql_has_aggregate(resolved_metric.sql, self.dialect):
collect_sql_columns_for_model(resolved_metric.sql)
return
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Project raw SQL columns for graph aggregations

When a top-level metric resolves through the new exact graph-metric path and is a simple aggregation (for example name: finance.revenue, agg: sum, sql: orders.amount), this branch only collects dependency metrics or SQL that still contains an aggregate. The metric SQL later renders as SUM(orders_cte.amount), but _build_model_cte never projects amount, so dotted top-level aggregation metrics that this change now documents/supports generate SQL referencing a missing CTE column. Collect columns from resolved_metric.sql for resolved_metric.agg as well.

Useful? React with 👍 / 👎.

@nicosuave nicosuave force-pushed the review/python-native-contract branch from b3ee67c to 1b91738 Compare May 31, 2026 23:24
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b91738452

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/sidemantic.py Outdated
Comment on lines +433 to +435
for model in graph.models.values():
if any(metric.extends for metric in model.metrics):
resolved_metrics = resolve_metric_inheritance({metric.name: metric for metric in model.metrics})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer metric inheritance until parent models merge

When load_from_directory parses native files one at a time, a child model whose extends parent is in a different file reaches this loop before model inheritance has been merged. If that child also has a metric extending a parent metric, resolve_metric_inheritance({metric.name: metric for metric in model.metrics}) only sees the child file's local metrics and raises Metric '<parent>' not found, even though the parent model will be loaded later. This breaks cross-file native model inheritance that otherwise works; skip model-local metric inheritance while a model parent is unresolved, or run it after the directory-level merge.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 62ae7e36f3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 380 to +381
for metric_def in data.get("metrics") or []:
metric = self._parse_metric(metric_def)
metric = self._parse_metric(metric_def, source_path=source_path, context="metric")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Recognize native metrics-only files in directory loading

This top-level native metrics parsing only works when SidemanticAdapter is invoked directly. In the CLI-first load_from_directory path, YAML detection still selects SidemanticAdapter only when a file has top-level models; a native file that contains only version: 1 plus graph-level metrics is instead handed to MetricFlowAdapter (sidemantic/loaders.py lines 130-134) and fails to load. This breaks valid split native projects where models and graph metrics live in separate files; detect native version/native metric shape before the MetricFlow fallback.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ebdf712278

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 338 to +340
normalized_frontmatter = normalize_sql_frontmatter(frontmatter) if frontmatter else {}
if normalized_frontmatter:
model = self._parse_model(normalized_frontmatter)
model = self._parse_model(normalized_frontmatter, source_path=source_path)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Ignore root-only frontmatter before parsing SQL models

When a native .sql file contains graph-level SQL definitions plus valid root metadata such as connection in the YAML frontmatter, normalize_sql_frontmatter() only removes version, so this branch treats the remaining connection mapping as model frontmatter and _parse_model() now rejects it as an unknown model field. This breaks SQL-form native files that only add top-level metadata; strip root-only keys like connection before using frontmatter as a model definition.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b40dbbfbf5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/sidemantic.py Outdated
Comment on lines +445 to +446
if any(metric.extends for metric in graph.metrics.values()):
graph.metrics = resolve_metric_inheritance(graph.metrics)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer graph metric inheritance across native files

When load_from_directory parses split native projects, each YAML file is handed to SidemanticAdapter before all_metrics are combined. This per-file inheritance resolution makes a valid metrics-only file fail if it contains extends: base_metric and base_metric is declared in another native file, raising Metric 'base_metric' not found before the directory loader can merge the graph-level metrics. Skip unresolved graph-metric inheritance here or resolve it after directory-wide metric collection.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e6ae26b9e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic-rs/src/sql/generator.rs Outdated
Comment on lines +783 to +791
for model in self.graph.models() {
if model.get_metric(reference).is_some() {
owners.push(model.name.clone());
}
}

match owners.len() {
0 => {}
1 => return Ok(Some((owners[0].clone(), reference.to_string()))),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prefer exact graph metrics before model metrics

When a native project defines a top-level metric with the same name as a model metric, this "exact" resolver returns the single model owner before checking self.graph.get_metric(reference), so the Rust runtime compiles the local model metric instead of the top-level metric. This diverges from the Python contract (resolve_metric_reference checks graph metrics first) and affects queries like metrics: [revenue] where orders.revenue exists but a graph-level revenue intentionally overrides or derives from it; check graph metrics before scanning model metrics.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0ff097ebf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic-rs/src/config/sql_parser.rs Outdated
Comment on lines +1563 to +1566
if !remaining.trim().is_empty() {
return Err(SidemanticError::Validation(
"Rust compact SQL model parser does not support trailing graph-level definitions after compact model blocks".to_string(),
));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Allow graph definitions after compact SQL models

When a native compact SQL file contains a model block followed by a graph-level METRIC/PARAMETER block, parse_sql_content calls parse_sql_models(content) first and returns the model error for any file starting with model, so this explicit trailing-content rejection prevents the later parse_sql_graph_definitions_extended(content) step from ever collecting those graph objects. This is a documented/Python-supported native SQL shape (see the existing test_parse_graph_definitions_after_table_block coverage), so Rust loading now fails for valid split-free SQL model files; leave the remainder for graph-definition parsing instead of rejecting it here.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 986ab7a24d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic-rs/src/sql/generator.rs Outdated
Comment on lines +823 to +825
for model in self.graph.models() {
if model.get_metric(reference).is_some() {
owners.insert(model.name.clone());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use graph metric dependencies to pick owner

Fresh evidence in this revision is that this new helper still seeds ownership from any model metric with the same name before inspecting the graph metric's own SQL. When a top-level metric revenue intentionally overrides a local orders.revenue but is defined from another model, e.g. sql: sales.amount, the Rust planner assigns the graph metric to orders and builds its raw expression in the orders CTE, so it can reference the wrong/missing column and omit the needed sales join; owner selection should come from the graph metric's dependencies rather than the same-named model metric.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 27c6f0e3df

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +640 to +648
let agg = self
.agg
.as_deref()
.map(parse_aggregation)
.or_else(|| inline_aggregation.as_ref().map(|(agg, _)| agg.clone()));
let sql = inline_aggregation
.as_ref()
.and_then(|(_, inner_sql)| inner_sql.clone())
.or(self.sql);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve derived metrics when parsing inline aggregates

When a native metric explicitly sets type: derived but its SQL is a single aggregate such as SUM(orders.amount) or COUNT(*), this normalization still extracts agg/inner sql while leaving metric_type as Derived. The Rust generator then takes the derived-metric path and tries to expand orders.amount/* as a metric expression instead of rendering the aggregate, so a metric shape that the Python runtime handles as an inline aggregate fails or emits invalid SQL in Rust. Apply the inline-aggregation rewrite only when the type is inferred/simple, or convert the metric type consistently.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: abb4da20b8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +824 to +840
for fragment in [
metric.sql.as_deref(),
metric.numerator.as_deref(),
metric.denominator.as_deref(),
metric.base_metric.as_deref(),
metric.entity.as_deref(),
metric.base_event.as_deref(),
metric.conversion_event.as_deref(),
metric.cohort_event.as_deref(),
metric.activity_event.as_deref(),
metric.having.as_deref(),
]
.into_iter()
.flatten()
{
self.collect_owner_models_from_fragment(fragment, &mut owners);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Resolve graph-metric dependencies when choosing owner

When a top-level metric derives from other top-level metrics by unqualified name (for example conversion_rate with numerator: signups and denominator: visitors) in a native project with more than one model, this owner detection only scans the literal fragments for model.field references and never follows those graph-metric dependencies. exact_metric_reference() then returns no owner, and parse_metric_refs() reports the graph metric as not found even though its dependencies may all resolve to a single model; recurse through referenced graph metrics here before falling back to the single-model heuristic.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant