Align native contract and SQL correctness#184
Conversation
6ccaf41 to
cd7a08e
Compare
326838a to
c6b38ca
Compare
cd7a08e to
4b9c638
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4b9c638ef7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if jp.custom_condition: | ||
| join_cond = self._custom_join_condition(jp) |
There was a problem hiding this comment.
Project columns referenced by custom joins
When a relationship sql uses columns that are not the declared foreign/primary keys (for example {from}.tenant_id = {to}.tenant_id AND {to}.valid_to IS NULL), this branch emits the predicate against the per-model CTE aliases, but _build_model_cte only projects dimensions, metrics, PKs, and declared join keys. The generated SQL then references missing CTE columns like customers_cte.valid_to; either extract/project columns from the custom predicate or reject predicates that use non-projected columns.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38929833cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| METRIC_FIELDS = { | ||
| "name", | ||
| "extends", | ||
| "type", | ||
| "agg", | ||
| "sql", | ||
| "expr", | ||
| "measure", |
There was a problem hiding this comment.
Preserve legacy derived metric dependencies
With the new strict field validation, native YAML exported by previous versions for a derived metric can no longer be loaded: the old exporter wrote a metric-local metrics: [...] dependency list for derived metrics, but this allow-list omits that field, so _parse_metric() now raises unknown native field(s) before it can ignore or migrate it. This affects users re-loading previously exported Sidemantic YAML whenever a derived metric had detected dependencies; allow the legacy metrics key or handle it during parsing to keep those files compatible.
Useful? React with 👍 / 👎.
3892983 to
262e1a8
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 262e1a8f9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if resolved_metric and resolved_model_name is None: | ||
| metric_expr = self._build_metric_sql(resolved_metric) | ||
| metric_expr = self._wrap_with_fill_nulls(metric_expr, resolved_metric) | ||
| alias = aliases.get(metric_ref, resolved_metric.name) |
There was a problem hiding this comment.
Preserve dotted graph metric aliases in ORDER BY
When a graph-level metric itself contains a dot (now resolved before treating model.metric as a model measure), this branch emits the SELECT alias as the full metric name, e.g. AS "orders.revenue", but the existing ORDER BY path later strips everything before the dot and orders by revenue. A query like metrics=["orders.revenue"] with a graph-level metric named orders.revenue and order_by=["orders.revenue DESC"] therefore generates ORDER BY revenue DESC, which does not match the projected alias and fails unless an unrelated revenue column exists; use the same alias normalization for dotted graph metrics as the SELECT list.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 262e1a8f9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if resolved_metric and ref_model_name is None: | ||
| for dep in resolved_metric.get_dependencies(self.graph, model_name): | ||
| collect_measures_from_metric(dep, visited) | ||
| if resolved_metric.sql and sql_has_aggregate(resolved_metric.sql, self.dialect): | ||
| collect_sql_columns_for_model(resolved_metric.sql) | ||
| return |
There was a problem hiding this comment.
Project raw SQL columns for graph aggregations
When a top-level metric resolves through the new exact graph-metric path and is a simple aggregation (for example name: finance.revenue, agg: sum, sql: orders.amount), this branch only collects dependency metrics or SQL that still contains an aggregate. The metric SQL later renders as SUM(orders_cte.amount), but _build_model_cte never projects amount, so dotted top-level aggregation metrics that this change now documents/supports generate SQL referencing a missing CTE column. Collect columns from resolved_metric.sql for resolved_metric.agg as well.
Useful? React with 👍 / 👎.
b3ee67c to
1b91738
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1b91738452
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for model in graph.models.values(): | ||
| if any(metric.extends for metric in model.metrics): | ||
| resolved_metrics = resolve_metric_inheritance({metric.name: metric for metric in model.metrics}) |
There was a problem hiding this comment.
Defer metric inheritance until parent models merge
When load_from_directory parses native files one at a time, a child model whose extends parent is in a different file reaches this loop before model inheritance has been merged. If that child also has a metric extending a parent metric, resolve_metric_inheritance({metric.name: metric for metric in model.metrics}) only sees the child file's local metrics and raises Metric '<parent>' not found, even though the parent model will be loaded later. This breaks cross-file native model inheritance that otherwise works; skip model-local metric inheritance while a model parent is unresolved, or run it after the directory-level merge.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 62ae7e36f3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for metric_def in data.get("metrics") or []: | ||
| metric = self._parse_metric(metric_def) | ||
| metric = self._parse_metric(metric_def, source_path=source_path, context="metric") |
There was a problem hiding this comment.
Recognize native metrics-only files in directory loading
This top-level native metrics parsing only works when SidemanticAdapter is invoked directly. In the CLI-first load_from_directory path, YAML detection still selects SidemanticAdapter only when a file has top-level models; a native file that contains only version: 1 plus graph-level metrics is instead handed to MetricFlowAdapter (sidemantic/loaders.py lines 130-134) and fails to load. This breaks valid split native projects where models and graph metrics live in separate files; detect native version/native metric shape before the MetricFlow fallback.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ebdf712278
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| normalized_frontmatter = normalize_sql_frontmatter(frontmatter) if frontmatter else {} | ||
| if normalized_frontmatter: | ||
| model = self._parse_model(normalized_frontmatter) | ||
| model = self._parse_model(normalized_frontmatter, source_path=source_path) |
There was a problem hiding this comment.
Ignore root-only frontmatter before parsing SQL models
When a native .sql file contains graph-level SQL definitions plus valid root metadata such as connection in the YAML frontmatter, normalize_sql_frontmatter() only removes version, so this branch treats the remaining connection mapping as model frontmatter and _parse_model() now rejects it as an unknown model field. This breaks SQL-form native files that only add top-level metadata; strip root-only keys like connection before using frontmatter as a model definition.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b40dbbfbf5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if any(metric.extends for metric in graph.metrics.values()): | ||
| graph.metrics = resolve_metric_inheritance(graph.metrics) |
There was a problem hiding this comment.
Defer graph metric inheritance across native files
When load_from_directory parses split native projects, each YAML file is handed to SidemanticAdapter before all_metrics are combined. This per-file inheritance resolution makes a valid metrics-only file fail if it contains extends: base_metric and base_metric is declared in another native file, raising Metric 'base_metric' not found before the directory loader can merge the graph-level metrics. Skip unresolved graph-metric inheritance here or resolve it after directory-wide metric collection.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3e6ae26b9e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for model in self.graph.models() { | ||
| if model.get_metric(reference).is_some() { | ||
| owners.push(model.name.clone()); | ||
| } | ||
| } | ||
|
|
||
| match owners.len() { | ||
| 0 => {} | ||
| 1 => return Ok(Some((owners[0].clone(), reference.to_string()))), |
There was a problem hiding this comment.
Prefer exact graph metrics before model metrics
When a native project defines a top-level metric with the same name as a model metric, this "exact" resolver returns the single model owner before checking self.graph.get_metric(reference), so the Rust runtime compiles the local model metric instead of the top-level metric. This diverges from the Python contract (resolve_metric_reference checks graph metrics first) and affects queries like metrics: [revenue] where orders.revenue exists but a graph-level revenue intentionally overrides or derives from it; check graph metrics before scanning model metrics.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a0ff097ebf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if !remaining.trim().is_empty() { | ||
| return Err(SidemanticError::Validation( | ||
| "Rust compact SQL model parser does not support trailing graph-level definitions after compact model blocks".to_string(), | ||
| )); |
There was a problem hiding this comment.
Allow graph definitions after compact SQL models
When a native compact SQL file contains a model block followed by a graph-level METRIC/PARAMETER block, parse_sql_content calls parse_sql_models(content) first and returns the model error for any file starting with model, so this explicit trailing-content rejection prevents the later parse_sql_graph_definitions_extended(content) step from ever collecting those graph objects. This is a documented/Python-supported native SQL shape (see the existing test_parse_graph_definitions_after_table_block coverage), so Rust loading now fails for valid split-free SQL model files; leave the remainder for graph-definition parsing instead of rejecting it here.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 986ab7a24d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for model in self.graph.models() { | ||
| if model.get_metric(reference).is_some() { | ||
| owners.insert(model.name.clone()); |
There was a problem hiding this comment.
Use graph metric dependencies to pick owner
Fresh evidence in this revision is that this new helper still seeds ownership from any model metric with the same name before inspecting the graph metric's own SQL. When a top-level metric revenue intentionally overrides a local orders.revenue but is defined from another model, e.g. sql: sales.amount, the Rust planner assigns the graph metric to orders and builds its raw expression in the orders CTE, so it can reference the wrong/missing column and omit the needed sales join; owner selection should come from the graph metric's dependencies rather than the same-named model metric.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 27c6f0e3df
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let agg = self | ||
| .agg | ||
| .as_deref() | ||
| .map(parse_aggregation) | ||
| .or_else(|| inline_aggregation.as_ref().map(|(agg, _)| agg.clone())); | ||
| let sql = inline_aggregation | ||
| .as_ref() | ||
| .and_then(|(_, inner_sql)| inner_sql.clone()) | ||
| .or(self.sql); |
There was a problem hiding this comment.
Preserve derived metrics when parsing inline aggregates
When a native metric explicitly sets type: derived but its SQL is a single aggregate such as SUM(orders.amount) or COUNT(*), this normalization still extracts agg/inner sql while leaving metric_type as Derived. The Rust generator then takes the derived-metric path and tries to expand orders.amount/* as a metric expression instead of rendering the aggregate, so a metric shape that the Python runtime handles as an inline aggregate fails or emits invalid SQL in Rust. Apply the inline-aggregation rewrite only when the type is inferred/simple, or convert the metric type consistently.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: abb4da20b8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for fragment in [ | ||
| metric.sql.as_deref(), | ||
| metric.numerator.as_deref(), | ||
| metric.denominator.as_deref(), | ||
| metric.base_metric.as_deref(), | ||
| metric.entity.as_deref(), | ||
| metric.base_event.as_deref(), | ||
| metric.conversion_event.as_deref(), | ||
| metric.cohort_event.as_deref(), | ||
| metric.activity_event.as_deref(), | ||
| metric.having.as_deref(), | ||
| ] | ||
| .into_iter() | ||
| .flatten() | ||
| { | ||
| self.collect_owner_models_from_fragment(fragment, &mut owners); | ||
| } |
There was a problem hiding this comment.
Resolve graph-metric dependencies when choosing owner
When a top-level metric derives from other top-level metrics by unqualified name (for example conversion_rate with numerator: signups and denominator: visitors) in a native project with more than one model, this owner detection only scans the literal fragments for model.field references and never follows those graph-metric dependencies. exact_metric_reference() then returns no owner, and parse_metric_refs() reports the graph metric as not found even though its dependencies may all resolve to a single model; recurse through referenced graph metrics here before falling back to the single-model heuristic.
Useful? React with 👍 / 👎.
Base: #183