Add cohort metric type for two-level aggregation by nicosuave · Pull Request #124 · sidequery/sidemantic

nicosuave · 2026-03-29T19:19:44Z

Summary

New cohort metric type that supports two-level aggregation with HAVING
Aggregate per entity, filter with HAVING, then re-aggregate at the query's dimension grain
Supports entity, inner_metrics (list of {name, agg, sql}), entity_dimensions (carried through to outer GROUP BY), having, and outer agg
Same dispatch pattern as retention/conversion: generates its own complete query, can't combine with other metric types
Useful for "users active N+ days", "users on multiple platforms", "accounts with revenue > $X", etc.

Changes:

metric.py: Added cohort to type literal, new fields, validation
adapters/sidemantic.py: Parse cohort fields in both metric construction paths
sql/generator.py: Route cohort metrics through _generate_cohort_metric_query(), handles segments, filters, entity dimensions, and order/limit/offset

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05a1f28a4e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-29T19:23:10Z

sidemantic/sql/generator.py

+        for dim_ref in parsed_dims:
+            dim_model_name = dim_ref.get("model", model_name)
+            dim_name = dim_ref.get("dimension")


Unpack parsed dimension tuples before accessing fields

_parse_dimension_refs returns (dimension_ref, granularity) tuples, but this loop treats each parsed entry like a dict and calls .get(...). When a cohort query includes any dimension, SQL generation fails immediately with AttributeError: 'tuple' object has no attribute 'get', so cohort metrics cannot be queried at a dimension grain.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-29T19:23:10Z

sidemantic/sql/generator.py

+        if not model:
+            for m_name, m in self.graph.models.items():
+                candidate = m.get_metric(local_name if not model_name else metric_name.split(".", 1)[-1])
+                if candidate and candidate.type == "cohort":
+                    model = m


Resolve graph-level cohort metrics before model scan

This resolver only scans model.get_metric(...) across models and never checks self.graph.get_metric(...), so cohort metrics added via SemanticGraph.add_metric are not resolvable here. Those metrics are still classified as cohort earlier in query planning, but this method then raises No model found for cohort metric ..., breaking unqualified graph-level cohort metrics.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b66b2db5c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T05:58:27Z

sidemantic/sql/generator.py

+        entity_sql = _replace_model_placeholder(entity_sql)
+
+        # Process segments from filters
+        all_filters = list(filters or [])


Include cohort metric filters when building WHERE clause

The cohort branch initializes all_filters from request-time filters only, so Metric.filters on a type="cohort" metric are never applied. This yields silently incorrect counts whenever a cohort metric defines built-in scoping (for example, region = 'US'): the generated SQL omits that predicate and computes the cohort over the full dataset instead.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e54114d53

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T06:10:11Z

sidemantic/sql/generator.py

+            stripped = self._strip_model_prefixes([f], model_name or "")[0] if model_name else f
+            stripped = _replace_model_placeholder(stripped)


Resolve semantic filter dims before building cohort WHERE

The cohort path strips model prefixes but never runs _resolve_filter_dimensions, so filters that use semantic dimensions backed by expressions (for example city mapped to json_extract_string(...)) are emitted as bare column names in WHERE. In _generate_cohort_metric_query, this causes valid cohort queries with such filters to fail at execution with missing-column errors (city not found) instead of applying the dimension SQL expression.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T06:10:11Z

sidemantic/sql/generator.py

+            else:
+                dim_model_name = model_name
+                dim_name = dim_ref_str
+            alias = dim_name


Keep granularity suffix in cohort dimension aliases

When a cohort query dimension includes a granularity suffix (e.g. events.ts__month), the alias is reduced to the base name (ts) instead of preserving ts__month. As a result, ORDER BY events.ts__month is rewritten to ORDER BY ts__month but no such output column exists, producing a runtime binder error. This also makes cohort output column naming inconsistent with the non-cohort paths.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 075f3ffa38

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T06:15:54Z

sidemantic/sql/generator.py

+        # Handle cohort metrics separately - two-level aggregation pattern
+        if cohort_metrics:
+            if len(cohort_metrics) > 1:
+                raise ValueError("Only one cohort metric can be queried at a time")
+            non_cohort = (


Validate cohort+conversion mixes before early conversion return

This cohort compatibility guard is unreachable when a query includes both a conversion metric and a cohort metric, because _generate_with_window_functions returns from the conversion branch first. In that case the cohort metric is silently ignored and the generated SQL only reflects the conversion metric, which produces incorrect results instead of raising the intended "cannot be combined" error for mixed special metric types.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T06:15:54Z

sidemantic/sql/generator.py

+                    if granularity:
+                        dim_sql = f"DATE_TRUNC('{granularity}', {dim_sql})"


Use dialect-aware truncation for cohort time dimensions

Cohort dimension granularity is emitted with a hard-coded DATE_TRUNC('{granularity}', ...) expression, which is not valid for dialects like BigQuery (DATE_TRUNC(expr, MONTH)). This means cohort queries with time-grain dimensions can compile into invalid SQL on non-Postgres/DuckDB dialects, even though the generator already has a dialect-aware _date_trunc helper used elsewhere.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db336e858a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T06:28:09Z

sidemantic/api_server.py

+            query = _normalize_sql_query(payload.query)
+            result = current_layer.adapter.execute(query)


Restrict /raw endpoint to SELECT statements

run_raw_sql executes payload.query directly after normalization, which bypasses the QueryRewriter guard used by /sql that enforces read-only SELECT semantics. In deployments where auth is disabled (or when a token is shared broadly), this enables mutating statements like DELETE, UPDATE, or DROP against the backing database through the API, which is a significant safety regression from the existing SQL endpoint behavior.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T06:28:09Z

sidemantic/sql/generator.py

+            if dim_model_name == model_name:
+                dim = model.get_dimension(dim_name)
+                if dim:
+                    dim_sql = _replace_model_placeholder(dim.sql_expr)


Raise on unresolved cohort dimensions

This loop only adds a dimension when it belongs to the cohort metric's model and model.get_dimension(...) returns a match, but it has no error path for misses. As a result, typoed dimensions or dimensions from joined models are silently dropped from the generated SQL, so the query returns a different grain/shape than requested without surfacing an error, which can corrupt downstream analyses.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T06:28:09Z

sidemantic/sql/generator.py

+        if outer_sql:
+            outer_sql = _replace_model_placeholder(outer_sql)


Build outer cohort SQL in outer scope

For cohort metrics with an explicit sql, the code rewrites {model} placeholders using _replace_model_placeholder, which maps to t when model.sql is present. That alias only exists inside the inner subquery, but outer_expr is emitted in the outer SELECT ... FROM (...) cohort_sub, so expressions like "{model}.user_id" become out-of-scope references and fail at execution time.

Useful? React with 👍 / 👎.

Cohort metrics aggregate per entity with a HAVING filter, then re-aggregate the filtered results. Common analytics pattern for questions like "users active 2+ days" or "users on multiple platforms." Example: - name: retained_users type: cohort entity: person_id entity_dimensions: [platform] inner_metrics: - name: active_days agg: count_distinct sql: "cast(timestamp as date)" having: "active_days >= 2" agg: count Generates: SELECT platform, COUNT(*) AS retained_users FROM ( SELECT person_id, platform, COUNT(DISTINCT CAST(timestamp AS DATE)) AS active_days FROM events WHERE ... GROUP BY person_id, platform HAVING active_days >= 2 ) cohort_sub GROUP BY platform

Three fixes in _generate_cohort_metric_query: 1. Graph-level cohort metrics (added via graph.add_metric) now resolve correctly. Previously only scanned model.get_metric(), missing metrics registered at graph level. Now checks graph.get_metric() first, matching the retention resolver pattern. 2. _parse_dimension_refs returns (dim_ref, granularity) tuples, not dicts. The dimension loop was calling .get() on tuples, causing AttributeError when any dimension was included in a cohort query. 3. Query-level dimensions were computed but never added to the inner SELECT/GROUP BY (dead variable inner_select_cols_extra). Moved the inner_select/inner_group join after the dimension loop so dimensions are properly included in both inner and outer queries. Also formats cli.py (pre-existing).

Two fixes: 1. The conversion branch returned early without checking if cohort metrics were also present, silently dropping them. Now validates that conversion metrics aren't mixed with other special metric types, matching the retention and cohort patterns. 2. Cohort dimension granularity used hardcoded DATE_TRUNC('gran', col) which is invalid on BigQuery. Now uses self._date_trunc() for dialect-aware truncation.

Outer SQL was using _replace_model_placeholder which maps {model} to the inner table alias (t), but the outer query operates on cohort_sub. Also, unresolved or cross-model dimensions were silently dropped instead of raising an error.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 144fb3bf26

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T13:27:07Z

sidemantic/sql/generator.py

+        elif outer_agg == "COUNT":
+            outer_expr = f"COUNT({outer_sql or '*'})"
+        else:
+            outer_expr = f"{outer_agg}({outer_sql or '*'})"


Require sql for non-count outer cohort aggregations

If a cohort metric sets agg to something other than count/count_distinct and leaves sql unset (currently allowed by Metric validation), this branch emits invalid SQL such as AVG(*) or SUM(*). That fails at execution time on DuckDB and other dialects, so valid-looking cohort metric definitions break unless users discover they must provide sql manually.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T13:27:07Z

sidemantic/sql/generator.py

+            elif im_agg == "COUNT":
+                expr = f"COUNT({im_sql or '*'})"
+            else:
+                expr = f"{im_agg}({im_sql or '*'})"


Require sql for non-count inner cohort metrics

Inner cohort metrics have the same issue: for inner_metrics entries with agg like sum/avg and no sql, the generator builds expressions such as SUM(*), which are invalid in SQL engines. Because inner_metrics entries are not validated for this case, these definitions parse successfully but then fail at query execution in the inner aggregation step.

Useful? React with 👍 / 👎.

AVG(*), SUM(*), etc. are invalid SQL. Now raises ValueError at generation time when a cohort metric (outer or inner) uses a non-count aggregation without specifying a sql expression.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a988d4c01

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T13:45:05Z

sidemantic/core/metric.py

+            if not self.inner_metrics:
+                raise ValueError("cohort metric requires 'inner_metrics' field")


Validate inner_metrics entries for cohort metrics

This validation only enforces that inner_metrics is present and non-empty, but does not check each item’s shape (for example requiring name and a valid agg). A malformed definition like inner_metrics: [{}] will pass model parsing and then fail later during SQL generation when _generate_cohort_metric_query accesses im["name"], causing a runtime KeyError instead of a clear config-time error. Adding per-entry validation here would prevent broken cohort metrics from reaching query execution.

Useful? React with 👍 / 👎.

Each inner_metrics entry must have a 'name' key, and non-count aggregations must include a 'sql' field. Catches malformed definitions like inner_metrics: [{}] early instead of failing with KeyError during SQL generation.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a02f2e91e9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T14:00:31Z

sidemantic/sql/generator.py

+            if im_agg == "COUNT_DISTINCT":
+                expr = f"COUNT(DISTINCT {im_sql or '*'})"
+            elif im_agg == "COUNT":


Require SQL expression for inner COUNT_DISTINCT cohorts

When an inner cohort metric is defined with agg: count_distinct and no sql, this branch emits COUNT(DISTINCT *), which is invalid SQL in common engines (including DuckDB/Postgres) and fails at execution time. Because cohort validation currently allows this shape, users can define a seemingly valid metric that only breaks when queried. Please either require sql for inner count_distinct metrics or fall back to a valid default column (for example, the cohort entity).

Useful? React with 👍 / 👎.

COUNT(DISTINCT *) is invalid SQL. Only bare COUNT can default to *. Updated both metric validation and generator runtime check.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bae2937cea

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T14:10:04Z

sidemantic/core/metric.py

+    type: Literal["ratio", "derived", "cumulative", "time_comparison", "conversion", "retention", "cohort"] | None = (
+        Field(None, description="Metric type for complex calculations")


Allow cohort metrics in semantic-layer validation

Adding "cohort" to Metric.type here makes cohort metrics constructible, but validate_metric() still rejects that type (sidemantic/validation.py only allows ratio/derived/cumulative/time_comparison/conversion/retention). Because SemanticLayer.add_metric() and file loaders route graph-level metrics through that validator, cohort metrics fail with MetricValidationError during normal API/CLI loading even though SQL generation now supports them.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T14:10:04Z

sidemantic/adapters/sidemantic.py

+                inner_metrics=measure_def.get("inner_metrics"),
+                entity_dimensions=measure_def.get("entity_dimensions"),
+                having=measure_def.get("having"),


Preserve cohort fields during sidemantic export

These new cohort fields are parsed, but the adapter’s export paths (_export_model and _export_metric) never serialize inner_metrics, entity_dimensions, or having. As a result, parse→export round trips silently drop required cohort configuration and emit incomplete metric definitions, which can break downstream validation/reloads in CLI-first workflows.

Useful? React with 👍 / 👎.

- Add 'cohort' to allowed metric types in validate_metric() and skip agg validation for cohort metrics in validate_model() - Serialize inner_metrics, entity_dimensions, and having in both model-level and graph-level metric export paths - Export agg for graph-level metrics (needed for cohort outer agg)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6b9150f0c3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T14:20:46Z

sidemantic/sql/generator.py

+            # Cohort metrics need special handling
+            if metric.type == "cohort":
+                return True


Resolve unqualified cohort metrics before window-path check

The new cohort dispatch in metric_needs_window only works after a metric is found, but unqualified names are resolved via self.graph.get_metric(...) only. Model-scoped cohort metrics are not auto-registered in SemanticGraph.add_model, so a query like metrics=["multi_platform_users"] (defined on a model, not added at graph level) is not detected as cohort, falls through the standard planner, and can fail with No models found for query instead of generating cohort SQL. Please mirror the model-scan fallback used by _generate_cohort_metric_query when resolving unqualified metrics here.

Useful? React with 👍 / 👎.

Both metric_needs_window and resolve_metric_ref (inside _generate_with_window_functions) only checked graph-level metrics for unqualified names. Model-scoped cohort metrics like metrics=["multi_platform_users"] were not found, causing either "No models found" errors or infinite recursion. Added model-scan fallback to both resolution paths.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc42856b7a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T14:33:28Z

sidemantic/sql/generator.py

+                    candidate = m.get_metric(local_name)
+                    if candidate and candidate.type == "cohort":
+                        model = m
+                        metric = candidate
+                        model_name = m_name


Detect ambiguous unqualified cohort metrics across models

When resolving unqualified cohort metric names, this branch picks the first model containing local_name and stops. If two models both define a cohort metric with the same name, generate(metrics=["that_name"]) silently uses insertion order instead of raising an ambiguity error, so queries can run against the wrong model and return incorrect results. The cohort resolver already has an ambiguity error path for other cases, so this scan should also disambiguate rather than break on first match.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T14:33:28Z

sidemantic/sql/generator.py

+            inner_select_cols.append(f"{dim_sql} AS {alias}")
+            inner_group_cols.append(dim_sql)
+            outer_select_cols.append(alias)
+            outer_group_cols.append(alias)


Quote cohort dimension aliases in outer SELECT/GROUP BY

Cohort dimensions are carried into outer_select_cols/outer_group_cols as raw identifiers, so reserved words or non-identifier names generate invalid SQL (for example, a dimension named order produces SELECT order ... GROUP BY order, which fails to parse). Other query paths use alias quoting helpers, but this cohort path does not, causing cohort-only regressions for otherwise valid model dimension names.

Useful? React with 👍 / 👎.

- Cohort metric name-scan now collects all matches across models and raises ValueError when >1 match found, instead of silently picking the first by insertion order. - _quote_identifier now delegates to sqlglot for simple identifiers, which correctly quotes SQL reserved words like 'order', 'select', etc. This fixes invalid SQL in cohort (and all other) query paths when dimension names collide with reserved words.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc44a79041

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T15:09:52Z

sidemantic/sql/generator.py

+            for m_name, m in self.graph.models.items():
+                found = m.get_metric(metric_ref)
+                if found:
+                    return found, m_name


Detect ambiguous unqualified window metrics across models

The new model-scan fallback in resolve_metric_ref returns the first matching metric name without checking for collisions, so unqualified window-path metrics (for example cumulative/offset-ratio names duplicated across two models) can now execute against whichever model was added first instead of raising an ambiguity error. That produces silently incorrect results for valid-but-ambiguous queries, because this branch is used to classify and plan window/special metrics in _generate_with_window_functions.

Useful? React with 👍 / 👎.

…CI flake - resolve_metric_ref model-scan fallback now collects all matches and raises ValueError on ambiguity, preventing silent wrong-model queries for any metric type (not just cohort). - Bump complex_rewrite_performance threshold from 15ms to 20ms to fix flaky CI on slower Python 3.11 runners (hit 15.625ms).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c44bc4621

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T15:20:06Z

sidemantic/sql/generator.py

+            else:
+                expr = f"{im_agg}({im_sql})"
+
+            inner_metric_selects.append(f"{expr} AS {im_name}")


Quote cohort inner metric aliases in generated SQL

The inner cohort query emits inner_metrics[].name directly in AS {im_name} without quoting or validation, so a legal config like name: "order" or name: "active.days" generates invalid SQL in the inner SELECT/HAVING (AS order, HAVING order >= 2) and fails at parse time. This is a runtime break for cohort metrics that use non-simple aliases; quoting (or enforcing identifier-safe names at validation time) is needed here.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T15:20:06Z

sidemantic/sql/generator.py

+                    order_fields.append(f"{parts[0]} {parts[1].upper()}")
+                else:
+                    order_fields.append(field_name)
+            order_clause = f"\nORDER BY {', '.join(order_fields)}"


Quote ORDER BY fields in cohort query output

Cohort order_by expressions are appended as raw field names after prefix stripping, so sorting by a dimension alias that requires quoting (for example events.order) produces ORDER BY order even when the same alias is correctly quoted in SELECT/GROUP BY. This causes parser errors on engines like DuckDB/Postgres for valid cohort queries that group by reserved-word dimensions and then sort by them.

Useful? React with 👍 / 👎.

Inner metric names, outer metric name, and ORDER BY field names are now passed through _quote_alias so reserved words and special characters produce valid SQL.

chatgpt-codex-connector bot reviewed Mar 29, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

nicosuave force-pushed the cohort-metric branch from 6e54114 to 075f3ff Compare March 30, 2026 06:11

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

nicosuave and others added 5 commits March 30, 2026 06:14

Auto-update JSON schema

4f8693d

nicosuave force-pushed the cohort-metric branch from db336e8 to 144fb3b Compare March 30, 2026 13:22

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

Require sql field for non-count cohort aggregations

0a988d4

AVG(*), SUM(*), etc. are invalid SQL. Now raises ValueError at generation time when a cohort metric (outer or inner) uses a non-count aggregation without specifying a sql expression.

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

Validate inner_metrics entries at cohort metric construction time

a02f2e9

Each inner_metrics entry must have a 'name' key, and non-count aggregations must include a 'sql' field. Catches malformed definitions like inner_metrics: [{}] early instead of failing with KeyError during SQL generation.

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

Require sql for COUNT_DISTINCT in cohort inner metrics

bae2937

COUNT(DISTINCT *) is invalid SQL. Only bare COUNT can default to *. Updated both metric validation and generator runtime check.

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

Quote inner metric aliases and ORDER BY fields in cohort SQL

2b2cb5a

Inner metric names, outer metric name, and ORDER BY field names are now passed through _quote_alias so reserved words and special characters produce valid SQL.

nicosuave merged commit b627ab4 into main Mar 30, 2026
15 checks passed

nicosuave deleted the cohort-metric branch March 30, 2026 16:34

		stripped = self._strip_model_prefixes([f], model_name or "")[0] if model_name else f
		stripped = _replace_model_placeholder(stripped)

		if granularity:
		dim_sql = f"DATE_TRUNC('{granularity}', {dim_sql})"

		query = _normalize_sql_query(payload.query)
		result = current_layer.adapter.execute(query)

		if outer_sql:
		outer_sql = _replace_model_placeholder(outer_sql)

		if not self.inner_metrics:
		raise ValueError("cohort metric requires 'inner_metrics' field")

		type: Literal["ratio", "derived", "cumulative", "time_comparison", "conversion", "retention", "cohort"] \| None = (
		Field(None, description="Metric type for complex calculations")

Conversation

nicosuave commented Mar 29, 2026

Summary

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment