Releases: ralforion/orionbelt-semantic-layer

v2.16.0

23 Jun 16:44
ecfd366

[2.16.0] - 2026-06-23

Added

  • OBML contract manifest (schema/obml-contract.yml). A single, drift-checked inventory of the OBML field surface: every enum, class, and field with its camelCase alias, JSON-schema exposure, ontology property, and OSI round-trip status. CI fails if the Pydantic models, the JSON schema, or the ontology drift from it, enforcing the "OBML is the single source of truth" rule mechanically.
  • JSON Schema validation at the API ingestion boundary. Model-load and query endpoints (session, shortcut, and oneshot) plus the MODEL_FILES startup preload now validate payloads against the published JSON Schema and return a 422 on a violation, so the schemas are the load-bearing contract rather than just published artifacts.
  • Architecture quality gates in CI. Import dependency-direction enforcement, broad-except containment, RawSQL containment, compiler metamorphic invariants, and coverage floors (compiler/parser/api and the OSI converter package).

Changed

  • Breaking: the query field order_by is now orderBy. It was the only snake_case key in the query contract; it is now camelCase like usePathNames and dimensionsExclude. The query JSON schema (additionalProperties: false) rejects order_by, so update query payloads to orderBy. The Python field name stays order_by (snake_case field names, camelCase aliases, as elsewhere), so Python QueryObject(order_by=...) is unchanged. Docs, examples, and integration tool definitions were updated.
  • OBML payloads are held to the camelCase contract. The JSON schema is now camelCase-only (the duplicate snake_case max_staleness and intent_tags spellings were removed). Payloads that previously relied on lenient coercion (snake_case keys, a string version, or uppercase enum values) are now rejected at the API; send canonical camelCase. The vendored schema in the OSI converter was refreshed to match.
  • Internal hardening, no SQL or endpoint behavior change. The compiler wrapper stage is now an explicit pass pipeline; sessions.py delegates to an api/services/ layer; the app owns its runtime explicitly with per-request isolation; and five large modules (compiler/resolution, compiler/cfl, ui/app, the OSI converter, and the Flight server) were split into focused submodules. All compiler drift snapshots are byte-identical.
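For clients that still emit the old key, the orderBy rename can be handled with a small shim before sending. This is a minimal sketch, not part of OrionBelt: the helper name `migrate_query_payload` is hypothetical, and it assumes order_by sits at the top level of the query object. The example payload shape is illustrative only.

```python
from typing import Any


def migrate_query_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a query payload with legacy order_by renamed to orderBy."""
    migrated = dict(payload)  # shallow copy, the caller's dict is left untouched
    if "order_by" in migrated:
        if "orderBy" in migrated:
            # Ambiguous input: refuse rather than silently pick one spelling.
            raise ValueError("payload sets both order_by and orderBy")
        migrated["orderBy"] = migrated.pop("order_by")
    return migrated


# Illustrative payload; the entry shape inside the list is an assumption.
legacy = {"dimensions": ["Country"], "order_by": [{"field": "Country"}]}
print(migrate_query_payload(legacy))
```

Python callers that construct QueryObject(order_by=...) directly do not need this, since the field name is unchanged.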

Full Changelog: v2.15.0...v2.16.0

v2.15.0

18 Jun 09:09
bef4568

[2.15.0] - 2026-06-18

Added

  • Governed views in the Dremio demo. The demo bootstrap now creates a governed Dremio Space with one view per curated query (A1 raw lakehouse plus A2-A7 governed), so they can be browsed and queried by name. Demonstrates saving a governed query as a reusable view.

Fixed

  • DECIMAL columns report as NUMERIC over the Postgres wire protocol (#116). Decimal measures/metrics were coarsened to FLOAT8, so values lost scale on display (574585.00 showed as 574585.0, large values as 1.6E7). They are now reported as NUMERIC(precision, scale) (from the model's declared dataType) across the query RowDescription, the catalog metadata, and the encoded values, so BI tools render the declared scale.
  • DuckDB's internal main schema is hidden from BI-tool schema browsers. Dremio's Postgres source enumerates schemas from pg_tables / pg_views; those are now shadowed (alongside pg_namespace / information_schema.schemata) so only the per-model schemas appear.
  • Dremio federation view/filter pushdown now compiles. The pgwire flattener handles Dremio's nested derived-table pushdown (constant-folded dimension equality, CAST(... AS DECIMAL) projections) so saved views and filtered queries work; it bails when an outer filter/order would cross an inner LIMIT rather than return wrong rows.
  • HAVING filters on period-over-period metrics are applied. They were silently dropped by the PoP wrapper, returning unfiltered rows.
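The scale loss behind the DECIMAL fix (#116) is easy to reproduce outside any wire protocol. The sketch below uses only Python's standard decimal module; the helper `to_numeric` is hypothetical and is not how the pgwire encoder is implemented. It only shows why a float representation drops the declared scale while a fixed-scale decimal keeps it.

```python
from decimal import Decimal


def to_numeric(value: str, scale: int) -> Decimal:
    """Quantize a value to a fixed scale, as a NUMERIC(precision, scale) column would."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives Decimal('0.01')
    return Decimal(value).quantize(quantum)


as_float = float(Decimal("574585.00"))
as_numeric = to_numeric("574585", 2)

print(as_float)    # the trailing zero of the declared scale is gone
print(as_numeric)  # the declared scale of 2 is preserved
```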

Changed

  • Clearer errors for incompatible-artefact combinations. Period-over-period grain mismatch, fanout, and cross-fact two-column-aggregate errors are rewritten in plain language with remediation.
  • Demo uses the orionbelt brand catalog as the pgwire database. The model is a schema (commerce) inside it; the Dremio source is renamed from obsl to orionbelt (path orionbelt.commerce.model).

Full Changelog: v2.14.0...v2.15.0

v2.14.0

16 Jun 08:46
2f58ec0

[2.14.0] - 2026-06-16

Added

  • Artefacts Composability Resolution (ACR). A new composables endpoint answers, for the query you have built so far, which other artefacts can still be added and yield a valid, fanout-free result. POST /v1/sessions/{id}/models/{mid}/composables takes a query as the anchor (its dimensions and measures), and GET .../composables?anchor=... accepts one or more named anchors; both return composable dimensions, measures, and metrics, plus cflMeasures / cflMetrics for artefacts combinable only through the Composite Fact Layer. Top-level shortcuts (/v1/composables) auto-resolve a single session/model. ACR reuses the planner's directed join-graph reachability, so anything it reports as composable is guaranteed to compile. See docs/guide/composability.md.
  • Guided query building in the UI. The Gradio playground now highlights composable dimensions and measures/metrics in the artefact pickers as you edit the query (a check mark, with (via CFL) for cross-fact candidates). Highlighting never hides artefacts, so independent (CFL) analyses stay discoverable.

Full Changelog: v2.12.0...v2.14.0

v2.12.0

14 Jun 18:42
ecb3a4c

[2.12.0] - 2026-06-14

Added

  • Unified authentication across every surface. A single AUTH_MODE selector (none / api_key / oidc) governs REST, Arrow Flight SQL, the Postgres wire protocol, the Gradio UI, and the MCP server. Off by default (AUTH_MODE=none), so the public demo and local dev are unchanged; production turns it on with AUTH_MODE=api_key + API_KEYS (comma-separated keys, rotated by overlap). Startup fails fast on an empty key list or a weak key (under 32 characters or low-entropy). oidc is reserved for a later release and is rejected loudly until then. See the new docs/guide/authentication.md.
  • REST API-key auth. Every /v1 endpoint requires a valid key when auth is on; X-API-Key (configurable via API_KEY_HEADER) and Authorization: Bearer are both accepted. Missing credentials return 401 with WWW-Authenticate, invalid ones 403. /health, /robots.txt, /docs, /redoc, /openapi.json, and /ui stay open, and /health now reports auth_mode so clients can detect the requirement without a key.
  • Flight + pgwire auth on the shared key store. Arrow Flight validates the handshake credential (the API key) against the same store. The Postgres wire surface requires the key as a password, defaulting to SCRAM-SHA-256 (which never sends the key on the wire); operators can opt into cleartext with PGWIRE_AUTH_MODE=password. The legacy FLIGHT_AUTH_MODE=token / FLIGHT_API_TOKEN path keeps working for one release with a deprecation warning.
  • UI credential forwarding. The Gradio UI reads OBSL_API_KEY and forwards it on every REST call; browser users never see it. It logs a clear startup error when the API requires auth but no key is set. The co-hosted (embedded) UI also requires OBSL_API_KEY to be set explicitly (see Security).
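The REST status rules above (401 for missing credentials, 403 for invalid ones, either header accepted) can be sketched as a pure function. This is an illustration under assumptions, not the server's code: the function name `check_api_key`, the precedence of X-API-Key over Authorization, and the exact WWW-Authenticate value are all hypothetical.

```python
import hmac
from typing import Mapping


def check_api_key(
    headers: Mapping[str, str],
    valid_keys: list[str],
    key_header: str = "X-API-Key",
) -> tuple[int, dict[str, str]]:
    """Return (status_code, extra_response_headers) for a request's credentials."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: the dedicated key header wins over Authorization: Bearer.
    candidate = lowered.get(key_header.lower(), "").strip()
    if not candidate:
        scheme, _, token = lowered.get("authorization", "").partition(" ")
        if scheme.lower() == "bearer":
            candidate = token.strip()
    if not candidate:
        # Missing credentials: 401 plus a challenge header (value is an assumption).
        return 401, {"WWW-Authenticate": "Bearer"}
    supplied = candidate.encode("utf-8")
    # compare_digest avoids leaking key contents through comparison timing.
    if any(hmac.compare_digest(supplied, key.encode("utf-8")) for key in valid_keys):
        return 200, {}
    return 403, {}
```

A client can call /health first and read auth_mode to decide whether it needs to send a key at all.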

Changed

  • AUTH_ENABLED is deprecated. It now acts as an alias for AUTH_MODE=api_key and logs a startup warning. Migrate to AUTH_MODE.
  • FastAPI 0.137 compatibility. FastAPI 0.137 rejects empty-string route paths supplied via include_router(prefix=...) ("Prefix and path cannot be both empty"). The five affected routers (sessions, models, settings, dialects, reference) now declare their prefix on the APIRouter() constructor instead, which keeps their root routes at the same URLs with no trailing slash. No version cap needed.

Security

  • Embedded UI no longer auto-loads the server's API key. /ui is a server-side proxy that can act on /v1, and it is not itself behind API-key auth. Previously, with AUTH_MODE=api_key the embedded UI silently adopted the first configured key, turning /ui into an open privileged proxy. It now injects a key only when OBSL_API_KEY is set explicitly, and logs a warning that /ui must be network-protected when it is.
  • pgwire pre-auth DoS hardening. The startup + password/SCRAM handshake now runs under a hard deadline (PGWIRE_AUTH_TIMEOUT_SECONDS, default 10s) so a stalled unauthenticated client cannot pin a connection slot. Post-startup frames are capped at 16 MB, and auth (password/SASL) frames at 64 KB, so a client cannot advertise a huge frame to exhaust memory before authenticating.
  • Heartbeat is exempt from global API-key auth. POST /v1/heartbeat keeps its own Authorization: Bearer <HEARTBEAT_AUTH_TOKEN> auth and is included outside the auth-bearing router, so its token is no longer rejected by the global auth's Bearer-as-API-key fallback.
  • Weak API keys are rejected at startup. Keys shorter than 32 characters or with low character diversity are refused (the server will not start), since short / low-entropy keys are vulnerable to offline attack on captured SCRAM transcripts. Generate a strong key with python3 -c "import secrets; print(f'obsl_pat_{secrets.token_hex(20)}')".
  • Flight requires explicit FLIGHT_ENABLED. The Arrow Flight SQL server no longer auto-starts merely because ob-flight-extension is installed; it starts only when FLIGHT_ENABLED=true. This prevents silently exposing a SQL surface on 0.0.0.0 by package presence alone. When the package is present but the flag is off, startup logs a hint. (Deployments that relied on auto-start must now set FLIGHT_ENABLED=true.)
  • Unauthenticated Flight startup warns loudly. When Flight is enabled but starts without auth (no AUTH_MODE=api_key, no FLIGHT_API_TOKEN) it now logs a prominent warning that it is exposing an unauthenticated SQL surface on 0.0.0.0.
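The weak-key rule can be approximated locally before deploying a key. The sketch below is a guess at the shape of the check, not the server's implementation: the 32-character minimum comes from the notes above, but the diversity threshold `MIN_DISTINCT_CHARS` is a made-up value, and `is_weak_key` and `generate_key` are hypothetical names. The generator uses the same secrets.token_hex(20) recipe as the one-liner above.

```python
import secrets

MIN_LENGTH = 32          # from the release notes
MIN_DISTINCT_CHARS = 10  # assumption: the real diversity threshold is not documented here


def is_weak_key(key: str) -> bool:
    """Flag keys that are too short or use too few distinct characters."""
    return len(key) < MIN_LENGTH or len(set(key)) < MIN_DISTINCT_CHARS


def generate_key() -> str:
    """Generate a key: 9-character prefix plus 40 random hex characters."""
    return f"obsl_pat_{secrets.token_hex(20)}"


key = generate_key()
print(len(key), is_weak_key(key))
```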

Full Changelog: v2.11.0...v2.12.0

v2.11.0

13 Jun 16:36
7fb75a9

[2.11.0] - 2026-06-13

Added

  • Filter pushdown through Postgres-federation BI tools (Dremio). When a tool like Dremio federates into OrionBelt's pgwire surface, its connector wraps the virtual model table in a trivial derived table and lifts the predicate to the outer query (SELECT ... FROM (SELECT ... FROM model) WHERE ...). The pgwire translator now detects and flattens that wrapper, so dimension filters (WHERE) and measure filters (HAVING) execute through federation instead of being rejected as unsupported subqueries. SQL that is not this exact shape is left untouched, so genuinely unsupported subqueries still reject.
  • Result cache now serves the pgwire surface. The freshness-driven result cache used to be wired only into the REST query handlers, so BI tools querying over pgwire (Dremio federation, DBeaver, Tableau) always bypassed it. The compile -> cache key -> freshness TTL -> get -> on-miss execute -> set pipeline is extracted into a shared orionbelt.api.query_cache service that REST and pgwire both use, so repeated pgwire queries are served from cache (GET /v1/cache/stats). Catalog / metadata probes never reach the semantic execute path and are never cached. Arrow Flight has a separate streaming execution path and is still pending (issue #117).
  • Period-over-period: multiple comparison offsets per query. One query can now combine PoP metrics with different offsets (e.g. month-over-month and year-over-year). They share a single date spine (same time dimension and base grain), and each distinct offset gets its own prior-period self-join. Previously all PoP metrics in a query silently reused the first metric's offset, producing wrong results for the others; a query mixing different base grains now raises a clear error instead.
  • Dremio "semantic sidecar" demo (demo/dremio/). A one-command, self-contained stack (MinIO + Dremio OSS + OrionBelt in single-model mode + the Gradio playground) that shows Dremio federating into OrionBelt over pgwire while OrionBelt compiles to the Dremio dialect and pushes execution back into Dremio over Arrow Flight. Includes a built-in raw-Parquet-vs-governed comparison, a runbook, and asset builders (DuckDB seed to Parquet, plus a Dremio-dialect model generated from the canonical commerce model).
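The shared cache pipeline described above (compile, cache key, freshness TTL, get, execute on miss, set) can be sketched as one function that any surface calls. Everything here is illustrative: `ResultCache`, `cached_execute`, and the callback parameters are hypothetical names, the key is assumed to be a hash of the compiled SQL, and none of this reflects the actual orionbelt.api.query_cache interface.

```python
import hashlib
import time
from typing import Any, Callable


class ResultCache:
    """Tiny in-memory TTL cache with hit/miss counters."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> tuple[bool, Any]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            self.hits += 1
            return True, entry[1]
        self.misses += 1
        return False, None

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (time.monotonic() + ttl_seconds, value)


def cached_execute(
    query: Any,
    cache: ResultCache,
    compile_fn: Callable[[Any], str],
    execute_fn: Callable[[str], Any],
    ttl_fn: Callable[[Any], float],
) -> Any:
    """Compile, derive the key and TTL, then serve from cache or execute and store."""
    sql = compile_fn(query)
    key = hashlib.sha256(sql.encode("utf-8")).hexdigest()  # assumption: key = hash of SQL
    ttl_seconds = ttl_fn(query)
    found, rows = cache.get(key)
    if found:
        return rows
    rows = execute_fn(sql)
    cache.set(key, rows, ttl_seconds)
    return rows
```

Because the function takes the compile and execute steps as callbacks, a REST handler and a pgwire handler can share it without knowing about each other.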

Fixed

  • _metrics_metadata catalog view exposes each metric's formula again. The pgwire/BI metadata view read a non-existent formula attribute (the metric field is expression), so the formula column came back empty for every metric in every client (DBeaver, Tableau, Dremio, psql). It now shows the derived metric's expression and a synthesized formula for cumulative (e.g. avg(Total Sales) rolling 30 over Sales Date) and period-over-period (e.g. percentChange({[Total Sales]}, -1 year)) metrics. Not dialect-specific.
  • Cross-fact metrics no longer leak component-measure columns. A CFL (multi-fact) query that selected a derived/ratio metric (e.g. Return Rate, Gross Margin) projected the metric's underlying component measures (e.g. Total Returns, Total Purchases) as extra result columns the caller never requested. The outer SELECT now projects only the requested dimensions, measures, and metrics (the components are still aggregated internally to feed the metric expression). Direct Postgres clients tolerated the extra columns, but Postgres-federation engines that pin a dataset's column set (Dremio) rejected them with INVALID_DATASET_METADATA; cross-fact metrics now work through Dremio federation.
  • Period-over-period metrics on Dremio. The PoP wrapper self-joined the base CTE under the alias prev, which is a reserved word in Dremio and rejected as an unquoted table alias (Encountered "- prev"). The alias is now pop_prev, so period-over-period metrics (e.g. Sales MoM Change) compile and execute on the Dremio dialect, including through Dremio's pgwire federation.

Changed

  • Cleaner federated catalog in admin-curated mode. With MODEL_FILES set, the pgwire/BI catalog now exposes only the curated models. Transient user/scratch sessions (REST clients, the Gradio playground) are no longer surfaced as one schema per session id under the source, so BI-tool schema browsers stay uncluttered. Dynamic (non-curated) mode still lights up REST-loaded sessions as before.

Full Changelog: v2.10.0...v2.11.0

v2.10.0

12 Jun 16:49
5020417

[2.10.0] - 2026-06-12

Added

  • osi-orionbelt converter package. The bidirectional OBML <-> OSI converter is now a standalone, pip-installable package (Apache-2.0), developed in-repo as a uv workspace member under packages/osi-orionbelt and published to PyPI alongside the ob-* drivers. It ships a single osi-orionbelt command with obml-to-osi / osi-to-obml subcommands (with --ontology), mirroring the osi-dbt converter, and vendors all three schemas (the two OSI artefacts plus a synced snapshot of the OBML schema) so it builds and validates standalone with no orionbelt dependency.
  • OSI conversion is now an optional extra. osi-orionbelt is no longer a hard dependency of orionbelt-semantic-layer, so a bare pip install orionbelt-semantic-layer stays lean. The /convert, /models/from-osi, and /osi endpoints return a clear 503 when the converter is absent. Install it with pip install 'orionbelt-semantic-layer[osi]' (or pip install osi-orionbelt); the flight and flight-duckdb-only deploy extras bundle it, so the shipped API images keep OSI conversion working. The previous force-include of the converter into the wheel and the COPY osi-obml lines in all three Dockerfiles are removed.
  • Third-party vendor extension preservation. OSI custom_extensions from vendors the converter does not handle internally (e.g. SNOWFLAKE, DBT, SALESFORCE, GOODDATA) now round-trip verbatim at the model, dataObject/dataset, column/field, and measure/metric levels. OSI has no separate dimension entity, so an OBML dimension's foreign extensions surface on its OSI field.
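The optional-extra behaviour (a clear 503 when the converter is absent) follows a common pattern: probe for the module without importing it, and map absence to a service-unavailable response. The sketch below is an assumption-laden illustration. The import name `osi_orionbelt` is a guess based on the package name, and `convert_endpoint_status` is a hypothetical helper, not the actual endpoint code.

```python
import importlib.util


def converter_available(module_name: str = "osi_orionbelt") -> bool:
    """Check whether a top-level module is installed, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def convert_endpoint_status(module_name: str = "osi_orionbelt") -> tuple[int, str]:
    """Return the (status, message) a conversion endpoint would answer with."""
    if not converter_available(module_name):
        return 503, (
            "OSI converter is not installed; "
            "install it with pip install 'orionbelt-semantic-layer[osi]'"
        )
    return 200, "converter available"


print(convert_endpoint_status())
```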

Changed

  • Converter vendor identity. OBML -> OSI now tags OrionBelt-proprietary payloads as ORIONBELT (was COMMON), and OSI -> OBML stashes OSI-native fields OBML cannot hold (unique keys, field labels, leftover ai_context) under OSI (was the misleading OBSL). Read paths still accept the legacy COMMON/OBSL tags, so previously emitted documents keep round-tripping. The converter self-identifies as osi-orionbelt in its roundtrip metadata. vendor_name is an open string in OSI, so this is a non-breaking output change.
  • UI: Export as OSI. The Model Workbench export button (now ⬇ Export as OSI) shows clean, directly copyable OSI YAML in the preview box (relabelled to "OSI YAML (exported)", reset to "Generated SQL" on the next compile/validate/import) and downloads the result as model.osi.yaml, instead of prepending a status banner into the YAML. The validation status moves to the explain box.

Full Changelog: v2.9.0...v2.10.0

v2.9.0

11 Jun 09:23
4fe5932

[2.9.0] - 2026-06-11

Added

  • Model-level default locale (settings.defaultLocale). A model can declare a BCP-47 locale tag (e.g. de-DE) that becomes the default for result value formatting (thousand/decimal separators) on /v1/query/execute?format_values=true. Resolution order at request time: explicit ?locale= first, then settings.defaultLocale, then the DEFAULT_LOCALE env variable. Added to ModelSettings and the JSON schema (replacing the never-implemented rich locale object that had shipped in the schema since the initial commit).
  • OSI ontology export. OBML models can now be exported to the OSI ontology layer (the conceptual EntityType/relationship layer defined by ontology.json), in addition to the existing OSI core-spec export. OSI validates the two layers with separate schemas and keeps them in separate documents, so the ontology is returned as a distinct, individually-valid artefact:
    • POST /v1/convert/obml-to-osi accepts include_ontology: true.
    • GET /v1/sessions/{id}/models/{mid}/osi accepts ?include_ontology=true.
    • When requested, the response carries ontology_yaml plus its own ontology_validation; the core-spec output_yaml is unchanged (default behaviour is fully backward-compatible).
    • Mapping: each dataObject becomes an EntityType concept; each join becomes a relationship whose multiplicity derives from the join's joinType (many-to-one maps to ManyToOne, one-to-one to OneToOne); concept_mappings bind concepts to physical columns. Many-to-many joins, named secondary paths, measures/metrics, and column-level value concepts are not represented and surface as conversion warnings. See osi-obml/osi_obml_ontology_mapping_analysis.md.
    • The OSI ontology JSON Schema (osi-obml/osi-ontology-schema.json) is vendored and bundled into the wheel; its external core-spec $refs resolve against the local vendored core schema so validation never touches the network.
    • An ontology importer (OSI ontology to OBML) is intentionally deferred while OSI remains at 0.2.0.dev0; import the OSI core spec instead.
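The request-time resolution order for the formatting locale (explicit ?locale=, then settings.defaultLocale, then the DEFAULT_LOCALE environment variable) amounts to a first-non-empty lookup. A minimal sketch, assuming the model settings are available as a plain dict; the function name `resolve_locale` is hypothetical, and returning None when nothing is set is an assumption because the final fallback is not stated in these notes.

```python
import os
from typing import Mapping, Optional


def resolve_locale(
    query_locale: Optional[str],
    model_settings: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the first locale that is set, in documented precedence order."""
    env = os.environ if env is None else env
    candidates = (
        query_locale,                        # explicit ?locale= on the request
        model_settings.get("defaultLocale"), # settings.defaultLocale in the model
        env.get("DEFAULT_LOCALE"),           # process-wide default
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None  # assumption: behaviour with no locale configured is not documented here


print(resolve_locale(None, {"defaultLocale": "de-DE"}, {"DEFAULT_LOCALE": "en-US"}))
```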

Fixed

  • OBML JSON schema realigned with the model. An audit of schema/obml-schema.json against models/semantic.py (the source of truth) surfaced drift, now resolved:
    • Top-level name is now accepted. The schema's root additionalProperties: false had been rejecting valid multi-model OBML that sets name:, even though the model and resolver fully support it.
    • Removed the vestigial locale object definition and measure.functions property (both present, unimplemented, since the initial commit). locale is superseded by settings.defaultLocale; the schema's deciamlSep typo disappeared with the removed block.
    • Trimmed the measure-filter definitions (filter, parameterValue) to the implemented surface — dynamic-date filters and time/timestamp/millis filter value types were schema-only and rejected at load. They remain a possible future feature, not a schema claim.
    • Added a regression guard (tests/unit/test_schema_model_alignment.py) asserting the schema stays aligned with the model.

Full Changelog: v2.8.0...v2.9.0

v2.8.0

02 Jun 08:58
e9618e1

[2.8.0] - 2026-06-02

Added

  • Session-scoped OSI model endpoints. Two new endpoints bridge the model store with Open Semantic Interchange (OSI), distinct from the existing stateless /v1/convert/* transforms:
    • POST /v1/sessions/{id}/models/from-osi accepts OSI YAML (osi_yaml), converts it to OBML, and loads the result into the session's model store. Returns the standard model summary plus conversion_warnings and the advisory OSI input_validation.
    • GET /v1/sessions/{id}/models/{mid}/osi exports a loaded model from the store as OSI YAML, with optional ?model_name=, ?model_description=, and ?ai_instructions= overrides.
  • ModelStore.get_raw(): public accessor for a model's raw OBML dict, preferring the faithful copy captured at load time and returning a deep copy so callers cannot mutate internal state.

Fixed

  • OSI converter not packaged for non-editable wheel installs. The converter (repo-root osi-obml/) was only discoverable via repo-root or /app paths, so a PyPI wheel or Docker install raised ModuleNotFoundError from the /convert and new OSI endpoints. The converter module and its OSI schema are now bundled into the wheel as package data under orionbelt/_osi_obml/ (hatch force-include), and the lookup searches that location first. Verified on a clean wheel install.
  • Mermaid ER diagram clipped every label by one character (string rendered as strin, Supplier as Supplie). Mermaid pre-measures ER column and edge-label widths with the theme fontFamily, but the browser painted text with a wider font the host CSS cascaded in. Pinned a local-only font stack (Helvetica, Arial, sans-serif) in the diagram's %%{init}%% themeVariables.fontFamily and forced the same family on the rendered ER text, so measure-time and paint-time use the identical font.

Full Changelog: v2.7.10...v2.8.0

v2.7.10

01 Jun 15:57
602645d

[2.7.10] - 2026-06-01

Fixed

  • GET /v1/reference/schemas/obml and /v1/reference/schemas/query returned HTTP 500: Schema file '...' is missing from this deployment on every non-editable install (PyPI wheel and Docker / Cloud Run). The loader resolved the JSON Schema files via Path(__file__).resolve().parents[4] / "schema", which only equals the repo root in a source / editable layout; in an installed wheel that path points into site-packages and the files were never shipped there (packages = ["src/orionbelt"] excluded the repo-root schema/ directory). The test suite missed it because it runs editable, where the buggy assumption holds. The schema files are now shipped inside the wheel as package data under orionbelt/schema/ (via hatch force-include) and loaded through importlib.resources, with a source-tree fallback for editable checkouts. Added a regression test that exercises the loader directly.
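The loading strategy described in this fix (package data via importlib.resources, with a source-tree fallback) can be sketched as follows. This is not the project's loader: `load_schema` and its parameters are hypothetical, and the assumption is that the files live in a schema/ directory inside the installed package, as the note above states.

```python
import json
from importlib import resources
from pathlib import Path
from typing import Any


def load_schema(package: str, filename: str, source_dir: Path) -> dict[str, Any]:
    """Load a JSON Schema from package data, else from a source checkout."""
    try:
        # Installed wheel: the file ships as package data under <package>/schema/.
        resource = resources.files(package) / "schema" / filename
        text = resource.read_text(encoding="utf-8")
    except (ModuleNotFoundError, FileNotFoundError):
        # Editable or source layout: fall back to the repo's schema directory.
        text = (source_dir / filename).read_text(encoding="utf-8")
    return json.loads(text)
```

Unlike a Path(__file__).resolve().parents[N] lookup, the resources call does not depend on how many directories separate the module from the repository root, which is the assumption that broke in wheel installs.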

Full Changelog: v2.7.9...v2.7.10

v2.7.9

27 May 17:36
3353dee

[2.7.9] - 2026-05-27

Fixed

  • Colab notebook still broken on v2.7.8: /v1/query/execute returned HTTP 503: ob-flight-extension package is not installed on every query cell. v2.7.8 dropped ob-flight-extension from the notebook's _REQUIRED map on the assumption the quickstart only queries via REST and never opens a Flight SQL connection. That assumption was wrong: src/orionbelt/service/db_executor.py imports ob_flight.db_router.get_credentials unconditionally for credential lookup on every dialect, including DuckDB. v2.7.8 unbroke API startup but moved the failure two cells later. Restored ob-flight-extension to the _REQUIRED map; PyPI now has 2.6.1 (published during the v2.7.8 cycle) with the cache= kwarg the API expects, so the install resolves cleanly. v2.7.8's except TypeError guard in the lifespan stays as forward-compat insurance.

Tooling

  • Notebook smoke workflow had been masking Colab regressions. The existing job ran inside uv sync --all-extras, which always installs every drivers/* package as a workspace member, so the test environment had ob-flight-extension==2.6.1 from local source even when PyPI was at 2.1.0. My v2.7.8 verification missed cell-level errors because the wrapper script crashed before printing PASS/FAIL and I read the background task's exit code as success. Added a second workflow job, notebook-pypi-equivalent, that builds the OBSL wheel from PR source, installs it and its side packages strictly from PyPI into a plain python -m venv (no uv, no workspace), executes the notebook end-to-end, and asserts every code cell ran cleanly (mermaid.ink transient 503s are filtered as the only allowed exception). The workflow now also triggers on src/orionbelt/** and pyproject.toml changes, because both v2.7.7 and v2.7.8 shipped Colab regressions through src/ changes that the path-pinned trigger missed.

Background (this is the 5th release touching notebook bugs)

The drift was real and the root cause was incomplete test coverage:

| Release | Notebook fix | What it missed |
|---|---|---|
| v2.7.5 | Added notebook smoke workflow with xfail | Workflow ran in uv workspace, mirrored Colab poorly |
| v2.7.6 | #87 (install cell idempotent), #88 (show_yaml typo), #89 (UI fallback) | Workflow still in workspace; Colab still broken upstream |
| v2.7.7 | #94 (uv-venv-no-pip), #91 / #92 / #94 bundle | Workflow finally green; Colab pip install ob-flight-extension resolves stale PyPI 2.1.0, TypeError on lifespan |
| v2.7.8 | #96 (drop ob-flight from notebook + catch TypeError) | Lifespan no longer crashes, but db_executor still requires ob_flight -> HTTP 503 on every execute |
| v2.7.9 | Restore ob-flight (now that PyPI has 2.6.1) + add clean-venv workflow job | Real gate against the regressions above |

Full Changelog: v2.7.8...v2.7.9