Skip to content

Improve BSL compatibility with current upstream syntax#182

Merged
nicosuave merged 8 commits into
mainfrom
bsl-upstream-support
May 31, 2026
Merged

Improve BSL compatibility with current upstream syntax#182
nicosuave merged 8 commits into
mainfrom
bsl-upstream-support

Conversation

@nicosuave
Copy link
Copy Markdown
Member

Improves Boring Semantic Layer compatibility with the current upstream examples and YAML syntax.

This fixes YAML auto-detection false positives, adds support for current calculated_measures expressions including _.all(...), preserves profile/database and metadata fields, and handles join aliases, self-joins, cross joins, and BSL join how settings in generated SQL.

The root cause was that BSL detection used substring checks and the adapter only modeled the older core YAML shape, so recent upstream examples either routed to the wrong adapter or dropped or compiled newer fields incorrectly.

@nicosuave nicosuave marked this pull request as ready for review May 31, 2026 14:18
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: daf374ddd5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 390 to +391
return Relationship(
name=target_model,
name=name,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve BSL joins when aliases are loaded across files

When the CLI loader reads a BSL project, it calls adapter.parse(str(file_path)) for each YAML file separately, so a join target can live in a different file and _add_join_alias_models() cannot clone the alias model yet. In that common multi-file case, storing the relationship under the alias name makes SemanticGraph.build_adjacency() skip the edge because no model named by the alias exists, so joins such as flights -> origin with model: airports become unusable even though airports is loaded later.

Useful? React with 👍 / 👎.

Comment thread sidemantic/sql/generator.py Outdated
Comment on lines +2465 to +2466
if agg_func == "COUNT_DISTINCT":
return f"SUM(COUNT(DISTINCT {raw_col})) OVER ()"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid summing grouped distinct counts for _.all()

When _.all() wraps a COUNT_DISTINCT measure and the query is grouped by a dimension, this returns the sum of each group's distinct count rather than the distinct count over the whole result set. If the same distinct key appears in more than one group, the denominator is over-counted, so percent-of-total calculated measures based on distinct users/customers are wrong.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ae3353e4c7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

expr = self._build_measure_aggregation_sql(model_name, measure)
expr = expr.replace(self._cte_ref(model_name, f"{measure.name}_raw"), raw_col)

return f"(SELECT {expr} FROM {cte_name} AS {cte_alias})"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve joined filters in distinct _.all totals

When _.all() wraps a COUNT_DISTINCT measure in a query that also joins to another model (for example orders.user_share by or filtered on customers.region), this scalar subquery reads only the measure model CTE, so it ignores the joined model's filter and even the joined result set. The numerator is computed after the join, but the denominator counts distinct values across all rows in the base CTE, producing incorrect percent-of-total results whenever related-model dimensions/filters restrict the result.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92fc8f0140

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +330 to +332
attrs = _collect_attrs(node)
if attrs:
return ".".join(attrs)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve qualified BSL measure references in calculations

When a calculated measure references a joined model measure such as _.customers.count, this emits customers.count, but the later dependency extraction only keeps the column name count. If the current model also has a count measure, the dependency resolves to the local measure and replacement rewrites only the bare count, producing invalid SQL like customers.(COUNT(orders_cte.count_raw)) instead of joining/aggregating customers.count. This breaks BSL calculations that qualify measures to disambiguate common names across joined models.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bef6719b1d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sidemantic/adapters/bsl.py Outdated
alias = rel.metadata.get("bsl_alias")
if not target_model or not alias or alias == target_model:
continue
aliases[alias] = target_model
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep BSL join aliases scoped to their source model

When two different BSL models reuse the same join alias for different targets, this global aliases map lets the later relationship overwrite the earlier one, and both relationships still point at the single cloned model named by that alias. For example, two models can each define a local user: join to different target models; after parsing, queries through the first join will read the second target's CTE, producing wrong joins/results instead of preserving BSL's per-model alias scope.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ac2d8ae8e2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


right_table = self._quote_identifier(self._cte_name(jp.to_model))
if jp.relationship == "cross":
query = query.join(right_table, join_type="cross")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid selecting default IDs for cross joins

For BSL type: cross joins where the joined table does not declare a primary key column, this new cross-join path still goes through _build_model_cte() with needs_joins=True, which unconditionally projects the model's default primary key (id). A calendar/scaffold table with only day will compile to SELECT id AS id, day AS day FROM calendar before the CROSS JOIN, so the query fails at execution even though cross joins do not need join keys.

Useful? React with 👍 / 👎.

@nicosuave nicosuave merged commit 7d58ee9 into main May 31, 2026
18 checks passed
@nicosuave nicosuave deleted the bsl-upstream-support branch May 31, 2026 17:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant