diff --git a/docs/content/0.1.0/docs/developers/how-metric-requests-are-converted-to-sql.md b/docs/content/0.1.0/docs/developers/how-metric-requests-are-converted-to-sql.md
index 9f9c01ccb..20a80a84e 100644
--- a/docs/content/0.1.0/docs/developers/how-metric-requests-are-converted-to-sql.md
+++ b/docs/content/0.1.0/docs/developers/how-metric-requests-are-converted-to-sql.md
@@ -527,6 +527,22 @@ The `is_cross_fact_window` flag on GrainGroupSQL controls this routing.
 
 ---
 
+## Column Semantic Types
+
+Every column in generated SQL is tagged with a `semantic_type` (see `ColumnMetadata` in `build_v3/types.py`). This tag tells downstream consumers (combiners, the metrics phase, cube matching) how to treat the column. Five values are used in practice:
+
+| `semantic_type` | Version | Meaning | Emitted by |
+|---|---|---|---|
+| `dimension` | v2, v3 | Grouping or filter column: what you slice by (e.g. `customer.name`, `order_date`). Appears in GROUP BY and in the final output. | Measures phase (grain group dims) and metrics phase (final SELECT) |
+| `measure` | v2 only | Pre-aggregated value coming from a `measure`-type node. Read-side code (combiners, cube matching) still tolerates it, but no v3 build path emits it. | v2 build path |
+| `metric` | v2, v3 | Fully-aggregated, user-facing metric value in the final result (e.g. `total_revenue`). | Metrics phase final SELECT |
+| `metric_component` | v3 only | A piece of a decomposed metric that has been **pre-aggregated for its grain group's phase** but still needs a final combine (e.g. `SUM(x) AS sum_x` and `COUNT(x) AS count_x` for an `AVG`). Produced by `MetricComponentExtractor`. | Measures phase, when `aggregability` is `FULL` or `LIMITED` |
+| `metric_input` | v3 only | A **raw, un-aggregated** column feeding a metric whose grain group can't aggregate at the CTE level. Aggregation is deferred to the outer metrics SQL. | Measures phase, when `aggregability == NONE` |
+
+Note: the `SemanticType` enum in `datajunction_server/models/column.py` only contains `measure`, `metric`, `dimension`, and `timestamp`; `metric_component` and `metric_input` are v3-internal string literals not yet promoted to the enum.
+
+The key distinction between `metric_component` and `metric_input`: both feed metrics, but a `metric_component` is already aggregated at its CTE (so the final SELECT just combines them), while a `metric_input` is a raw passthrough whose aggregation happens in the outer metrics SQL. See `build_v3/measures.py` (the `Aggregability.NONE` branch around the `ColumnMetadata` construction).
+
 ## Code References
 
 The SQL generation logic lives in `datajunction_server/construction/build_v3/`: