Releases: ralforion/orionbelt-semantic-layer
v2.16.0
[2.16.0] - 2026-06-23
Added
- OBML contract manifest (`schema/obml-contract.yml`). A single, drift-checked inventory of the OBML field surface: every enum, class, and field with its camelCase alias, JSON-schema exposure, ontology property, and OSI round-trip status. CI fails if the Pydantic models, the JSON schema, or the ontology drift from it, enforcing the "OBML is the single source of truth" rule mechanically.
- JSON Schema validation at the API ingestion boundary. Model-load and query endpoints (session, shortcut, and oneshot) plus the `MODEL_FILES` startup preload now validate payloads against the published JSON Schema and return a `422` on a violation, so the schemas are the load-bearing contract rather than just published artifacts.
- Architecture quality gates in CI. Import dependency-direction enforcement, broad-except containment, RawSQL containment, compiler metamorphic invariants, and coverage floors (compiler/parser/api and the OSI converter package).
Changed
- Breaking: the query field `order_by` is now `orderBy`. It was the only snake_case key in the query contract; it is now camelCase like `usePathNames` and `dimensionsExclude`. The query JSON schema (`additionalProperties: false`) rejects `order_by`, so update query payloads to `orderBy`. The Python field name stays `order_by` (snake_case field names, camelCase aliases, as elsewhere), so Python `QueryObject(order_by=...)` is unchanged. Docs, examples, and integration tool definitions were updated.
- OBML payloads are held to the camelCase contract. The JSON schema is now camelCase-only (the duplicate snake_case `max_staleness` and `intent_tags` spellings were removed). Payloads that previously relied on lenient coercion (snake_case keys, a string `version`, or uppercase enum values) are now rejected at the API; send canonical camelCase. The vendored schema in the OSI converter was refreshed to match.
- Internal hardening, no SQL or endpoint behavior change. The compiler wrapper stage is now an explicit pass pipeline; `sessions.py` delegates to an `api/services/` layer; the app owns its runtime explicitly with per-request isolation; and five large modules (`compiler/resolution`, `compiler/cfl`, `ui/app`, the OSI converter, and the Flight server) were split into focused submodules. All compiler drift snapshots are byte-identical.
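For clients that still emit the old key, the rename can be applied mechanically before a payload is sent. The following is a minimal sketch, assuming the query payload is a plain JSON object with `order_by` at the top level; the helper name `migrate_query_payload` and the sample field values are hypothetical and not part of OrionBelt.

```python
import json


def migrate_query_payload(payload: dict) -> dict:
    """Return a copy of a query payload with the legacy order_by key renamed.

    Hypothetical helper: only the order_by -> orderBy rename comes from the
    release notes; the rest of the payload shape is assumed.
    """
    migrated = dict(payload)  # shallow copy so the caller's dict is untouched
    if "order_by" in migrated:
        if "orderBy" in migrated:
            raise ValueError("payload sets both order_by and orderBy")
        migrated["orderBy"] = migrated.pop("order_by")
    return migrated


# Illustrative payload; the field values are made up.
legacy = {"dimensions": ["Country"], "order_by": [{"field": "Country"}]}
print(json.dumps(migrate_query_payload(legacy)))
```

Raising when both spellings are present avoids silently picking one ordering over the other.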
Full Changelog: v2.15.0...v2.16.0
v2.15.0
[2.15.0] - 2026-06-18
Added
- Governed views in the Dremio demo. The demo bootstrap now creates a `governed` Dremio Space with one view per curated query (A1 raw lakehouse plus A2-A7 governed), so they can be browsed and queried by name. Demonstrates saving a governed query as a reusable view.
Fixed
- DECIMAL columns report as NUMERIC over the Postgres wire protocol (#116). Decimal measures/metrics were coarsened to FLOAT8, so values lost scale on display (`574585.00` showed as `574585.0`, large values as `1.6E7`). They are now reported as `NUMERIC(precision, scale)` (from the model's declared `dataType`) across the query RowDescription, the catalog metadata, and the encoded values, so BI tools render the declared scale.
- DuckDB's internal `main` schema is hidden from BI-tool schema browsers. Dremio's Postgres source enumerates schemas from `pg_tables`/`pg_views`; those are now shadowed (alongside `pg_namespace`/`information_schema.schemata`) so only the per-model schemas appear.
- Dremio federation view/filter pushdown now compiles. The pgwire flattener handles Dremio's nested derived-table pushdown (constant-folded dimension equality, `CAST(... AS DECIMAL)` projections) so saved views and filtered queries work; it bails when an outer filter/order would cross an inner `LIMIT` rather than return wrong rows.
- HAVING filters on period-over-period metrics are applied. They were silently dropped by the PoP wrapper, returning unfiltered rows.
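The scale loss described in #116 is the usual effect of routing a fixed-point value through a binary float. The snippet below is a small Python illustration (not OrionBelt code) of why a decimal type keeps the trailing zeros that a float drops.

```python
from decimal import Decimal

raw = "574585.00"  # value as declared with scale 2

as_float = float(raw)      # roughly what a FLOAT8 column carries
as_decimal = Decimal(raw)  # roughly what a NUMERIC column carries

print(str(as_float))    # 574585.0  (scale is gone)
print(str(as_decimal))  # 574585.00 (scale survives)

# Re-applying a declared scale of 2 to a decimal value.
print(Decimal("16000000").quantize(Decimal("0.01")))  # 16000000.00
```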
Changed
- Clearer errors for incompatible-artefact combinations. Period-over-period grain mismatch, fanout, and cross-fact two-column-aggregate errors are rewritten in plain language with remediation.
- Demo uses the `orionbelt` brand catalog as the pgwire database. The model is a schema (`commerce`) inside it; the Dremio source is renamed from `obsl` to `orionbelt` (path `orionbelt.commerce.model`).
Full Changelog: v2.14.0...v2.15.0
v2.14.0
[2.14.0] - 2026-06-16
Added
- Artefacts Composability Resolution (ACR). A new `composables` endpoint answers, for the query you have built so far, which other artefacts can still be added and yield a valid, fanout-free result. `POST /v1/sessions/{id}/models/{mid}/composables` takes a query as the anchor (its dimensions and measures), and `GET .../composables?anchor=...` accepts one or more named anchors; both return composable `dimensions`, `measures`, and `metrics`, plus `cflMeasures`/`cflMetrics` for artefacts combinable only through the Composite Fact Layer. Top-level shortcuts (`/v1/composables`) auto-resolve a single session/model. ACR reuses the planner's directed join-graph reachability, so anything it reports as composable is guaranteed to compile. See `docs/guide/composability.md`.
- Guided query building in the UI. The Gradio playground now highlights composable dimensions and measures/metrics in the artefact pickers as you edit the query (a check mark, with `(via CFL)` for cross-fact candidates). Highlighting never hides artefacts, so independent (CFL) analyses stay discoverable.
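As a rough sketch of calling the named-anchor form of the composables endpoint, the URL could be assembled as below. The base URL, the session and model ids, the anchor names, and the choice to repeat the `anchor` parameter for multiple anchors are all assumptions made for illustration; `docs/guide/composability.md` is the reference for the actual encoding.

```python
from urllib.parse import quote, urlencode


def composables_url(base_url: str, session_id: str, model_id: str,
                    anchors: list[str]) -> str:
    """Build a GET URL for the session-scoped composables endpoint.

    Assumption: several anchors are passed as repeated ?anchor= parameters.
    """
    path = (f"/v1/sessions/{quote(session_id, safe='')}"
            f"/models/{quote(model_id, safe='')}/composables")
    query = urlencode([("anchor", name) for name in anchors])
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical ids and artefact names, for illustration only.
print(composables_url("http://localhost:8000", "s1", "m1",
                      ["Country", "Total Sales"]))
```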
Full Changelog: v2.12.0...v2.14.0
v2.12.0
[2.12.0] - 2026-06-14
Added
- Unified authentication across every surface. A single `AUTH_MODE` selector (`none`/`api_key`/`oidc`) governs REST, Arrow Flight SQL, the Postgres wire protocol, the Gradio UI, and the MCP server. Off by default (`AUTH_MODE=none`), so the public demo and local dev are unchanged; production turns it on with `AUTH_MODE=api_key` + `API_KEYS` (comma-separated keys, rotated by overlap). Startup fails fast on an empty key list or a weak key (under 32 characters or low-entropy). `oidc` is reserved for a later release and is rejected loudly until then. See the new `docs/guide/authentication.md`.
- REST API-key auth. Every `/v1` endpoint requires a valid key when auth is on; `X-API-Key` (configurable via `API_KEY_HEADER`) and `Authorization: Bearer` are both accepted. Missing credentials return `401` with `WWW-Authenticate`, invalid ones `403`. `/health`, `/robots.txt`, `/docs`, `/redoc`, `/openapi.json`, and `/ui` stay open, and `/health` now reports `auth_mode` so clients can detect the requirement without a key.
- Flight + pgwire auth on the shared key store. Arrow Flight validates the handshake credential (the API key) against the same store. The Postgres wire surface requires the key as a password, defaulting to SCRAM-SHA-256 (which never sends the key on the wire); operators can opt into cleartext with `PGWIRE_AUTH_MODE=password`. The legacy `FLIGHT_AUTH_MODE=token`/`FLIGHT_API_TOKEN` path keeps working for one release with a deprecation warning.
- UI credential forwarding. The Gradio UI reads `OBSL_API_KEY` and forwards it on every REST call; browser users never see it. It logs a clear startup error when the API requires auth but no key is set. The co-hosted (embedded) UI also requires `OBSL_API_KEY` to be set explicitly (see Security).
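The REST credential rules above (either header accepted, `401` when nothing is presented, `403` when the key is wrong) can be sketched as a small decision function. This is an illustrative reading of the release notes, not OrionBelt's implementation; the function name `authenticate` and its signature are hypothetical.

```python
import hmac


def authenticate(headers: dict[str, str], valid_keys: list[str],
                 key_header: str = "X-API-Key") -> int:
    """Return the HTTP status an API-key check would produce.

    200 = accepted, 401 = no credential presented, 403 = credential invalid.
    """
    # Header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    candidate = lowered.get(key_header.lower())
    if candidate is None:
        # Fall back to "Authorization: Bearer <key>".
        scheme, _, token = lowered.get("authorization", "").partition(" ")
        if scheme.lower() == "bearer" and token:
            candidate = token.strip()
    if not candidate:
        return 401  # a real server would also send WWW-Authenticate
    # Timing-safe comparison of the candidate against each configured key.
    ok = any(hmac.compare_digest(candidate.encode(), key.encode())
             for key in valid_keys)
    return 200 if ok else 403
```

Keeping several entries in `valid_keys` at once is what makes rotation by overlap possible: the old and new key are both accepted until the old one is removed.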
Changed
- `AUTH_ENABLED` is deprecated. It now acts as an alias for `AUTH_MODE=api_key` and logs a startup warning. Migrate to `AUTH_MODE`.
- FastAPI 0.137 compatibility. FastAPI 0.137 rejects empty-string route paths supplied via `include_router(prefix=...)` ("Prefix and path cannot be both empty"). The five affected routers (sessions, models, settings, dialects, reference) now declare their prefix on the `APIRouter()` constructor instead, which keeps their root routes at the same URLs with no trailing slash. No version cap needed.
Security
- Embedded UI no longer auto-loads the server's API key. `/ui` is a server-side proxy that can act on `/v1`, and it is not itself behind API-key auth. Previously, with `AUTH_MODE=api_key` the embedded UI silently adopted the first configured key, turning `/ui` into an open privileged proxy. It now injects a key only when `OBSL_API_KEY` is set explicitly, and logs a warning that `/ui` must be network-protected when it is.
- pgwire pre-auth DoS hardening. The startup + password/SCRAM handshake now runs under a hard deadline (`PGWIRE_AUTH_TIMEOUT_SECONDS`, default 10s) so a stalled unauthenticated client cannot pin a connection slot. Post-startup frames are capped at 16 MB, and auth (password/SASL) frames at 64 KB, so a client cannot advertise a huge frame to exhaust memory before authenticating.
- Heartbeat is exempt from global API-key auth. `POST /v1/heartbeat` keeps its own `Authorization: Bearer <HEARTBEAT_AUTH_TOKEN>` auth and is included outside the auth-bearing router, so its token is no longer rejected by the global auth's Bearer-as-API-key fallback.
- Weak API keys are rejected at startup. Keys shorter than 32 characters or with low character diversity are refused (the server will not start), since short / low-entropy keys are vulnerable to offline attack on captured SCRAM transcripts. Generate a strong key with `python3 -c "import secrets; print(f'obsl_pat_{secrets.token_hex(20)}')"`.
- Flight requires explicit `FLIGHT_ENABLED`. The Arrow Flight SQL server no longer auto-starts merely because `ob-flight-extension` is installed; it starts only when `FLIGHT_ENABLED=true`. This prevents silently exposing a SQL surface on `0.0.0.0` by package presence alone. When the package is present but the flag is off, startup logs a hint. (Deployments that relied on auto-start must now set `FLIGHT_ENABLED=true`.)
- Unauthenticated Flight startup warns loudly. When Flight is enabled but starts without auth (no `AUTH_MODE=api_key`, no `FLIGHT_API_TOKEN`) it now logs a prominent warning that it is exposing an unauthenticated SQL surface on `0.0.0.0`.
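To make the weak-key rule concrete, here is a minimal sketch of a length-plus-diversity check alongside the key recipe quoted above. The 32-character floor comes from the notes; the distinct-character threshold and both function names are assumptions, since the notes do not say how "low character diversity" is measured.

```python
import secrets

MIN_LENGTH = 32          # from the release notes
MIN_DISTINCT_CHARS = 10  # assumed threshold, for illustration only


def is_weak_key(key: str) -> bool:
    """Flag keys that are too short or use too few distinct characters."""
    return len(key) < MIN_LENGTH or len(set(key)) < MIN_DISTINCT_CHARS


def generate_key() -> str:
    """Same recipe as the one-liner in the notes: prefix + 40 hex chars."""
    return f"obsl_pat_{secrets.token_hex(20)}"


print(is_weak_key("changeme"))  # True: shorter than 32 characters
print(is_weak_key("a" * 40))    # True: only one distinct character
print(len(generate_key()))      # 49
```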
Full Changelog: v2.11.0...v2.12.0
v2.11.0
[2.11.0] - 2026-06-13
Added
- Filter pushdown through Postgres-federation BI tools (Dremio). When a tool like Dremio federates into OrionBelt's pgwire surface, its connector wraps the virtual `model` table in a trivial derived table and lifts the predicate to the outer query (`SELECT ... FROM (SELECT ... FROM model) WHERE ...`). The pgwire translator now detects and flattens that wrapper, so dimension filters (`WHERE`) and measure filters (`HAVING`) execute through federation instead of being rejected as unsupported subqueries. SQL that is not this exact shape is left untouched, so genuinely unsupported subqueries still reject.
- Result cache now serves the pgwire surface. The freshness-driven result cache used to be wired only into the REST query handlers, so BI tools querying over pgwire (Dremio federation, DBeaver, Tableau) always bypassed it. The compile -> cache key -> freshness TTL -> get -> on-miss execute -> set pipeline is extracted into a shared `orionbelt.api.query_cache` service that REST and pgwire both use, so repeated pgwire queries are served from cache (`GET /v1/cache/stats`). Catalog / metadata probes never reach the semantic execute path and are never cached. Arrow Flight has a separate streaming execution path and is still pending (issue #117).
- Period-over-period: multiple comparison offsets per query. One query can now combine PoP metrics with different offsets (e.g. month-over-month and year-over-year). They share a single date spine (same time dimension and base grain), and each distinct offset gets its own prior-period self-join. Previously all PoP metrics in a query silently reused the first metric's offset, producing wrong results for the others; a query mixing different base grains now raises a clear error instead.
- Dremio "semantic sidecar" demo (`demo/dremio/`). A one-command, self-contained stack (MinIO + Dremio OSS + OrionBelt in single-model mode + the Gradio playground) that shows Dremio federating into OrionBelt over pgwire while OrionBelt compiles to the Dremio dialect and pushes execution back into Dremio over Arrow Flight. Includes a built-in raw-Parquet-vs-governed comparison, a runbook, and asset builders (DuckDB seed to Parquet, plus a Dremio-dialect model generated from the canonical commerce model).
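The shared cache pipeline named above (cache key, freshness TTL, get, on-miss execute, set) can be sketched with an in-memory stand-in. This is not the `orionbelt.api.query_cache` service itself: the class, the function, and the use of a SHA-256 of the compiled SQL as the cache key are assumptions chosen to keep the example self-contained.

```python
import hashlib
import time
from typing import Callable, Optional


class ResultCache:
    """Tiny in-memory stand-in for a freshness-driven result cache."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[float, list]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[list]:
        entry = self._entries.get(key)
        if entry is None or entry[0] < time.monotonic():
            return None  # absent or past its freshness deadline
        return entry[1]

    def set(self, key: str, rows: list, ttl_seconds: float) -> None:
        self._entries[key] = (time.monotonic() + ttl_seconds, rows)


def cached_execute(cache: ResultCache, compiled_sql: str, ttl_seconds: float,
                   execute: Callable[[str], list]) -> list:
    """cache key -> get -> on-miss execute -> set, for already-compiled SQL."""
    key = hashlib.sha256(compiled_sql.encode()).hexdigest()  # assumed key scheme
    rows = cache.get(key)
    if rows is not None:
        cache.hits += 1
        return rows
    cache.misses += 1
    rows = execute(compiled_sql)
    cache.set(key, rows, ttl_seconds)
    return rows
```

Because both entry points would call the same function with the same compiled SQL, a result cached by a REST request can be served to a later pgwire request, which is the behaviour the release describes.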
Fixed
- `_metrics_metadata` catalog view exposes each metric's formula again. The pgwire/BI metadata view read a non-existent `formula` attribute (the metric field is `expression`), so the `formula` column came back empty for every metric in every client (DBeaver, Tableau, Dremio, psql). It now shows the derived metric's expression and a synthesized formula for cumulative (e.g. `avg(Total Sales) rolling 30 over Sales Date`) and period-over-period (e.g. `percentChange({[Total Sales]}, -1 year)`) metrics. Not dialect-specific.
- Cross-fact metrics no longer leak component-measure columns. A CFL (multi-fact) query that selected a derived/ratio metric (e.g. `Return Rate`, `Gross Margin`) projected the metric's underlying component measures (e.g. `Total Returns`, `Total Purchases`) as extra result columns the caller never requested. The outer SELECT now projects only the requested dimensions, measures, and metrics (the components are still aggregated internally to feed the metric expression). Direct Postgres clients tolerated the extra columns, but Postgres-federation engines that pin a dataset's column set (Dremio) rejected them with `INVALID_DATASET_METADATA`; cross-fact metrics now work through Dremio federation.
- Period-over-period metrics on Dremio. The PoP wrapper self-joined the base CTE under the alias `prev`, which is a reserved word in Dremio and rejected as an unquoted table alias (`Encountered "- prev"`). The alias is now `pop_prev`, so period-over-period metrics (e.g. `Sales MoM Change`) compile and execute on the Dremio dialect, including through Dremio's pgwire federation.
Changed
- Cleaner federated catalog in admin-curated mode. With `MODEL_FILES` set, the pgwire/BI catalog now exposes only the curated models. Transient user/scratch sessions (REST clients, the Gradio playground) are no longer surfaced as one schema per session id under the source, so BI-tool schema browsers stay uncluttered. Dynamic (non-curated) mode still lights up REST-loaded sessions as before.
Full Changelog: v2.10.0...v2.11.0
v2.10.0
[2.10.0] - 2026-06-12
Added
- `osi-orionbelt` converter package. The bidirectional OBML <-> OSI converter is now a standalone, pip-installable package (Apache-2.0), developed in-repo as a uv workspace member under `packages/osi-orionbelt` and published to PyPI alongside the `ob-*` drivers. It ships a single `osi-orionbelt` command with `obml-to-osi`/`osi-to-obml` subcommands (with `--ontology`), mirroring the `osi-dbt` converter, and vendors all three schemas (the two OSI artefacts plus a synced snapshot of the OBML schema) so it builds and validates standalone with no `orionbelt` dependency.
- OSI conversion is now an optional extra. `osi-orionbelt` is no longer a hard dependency of `orionbelt-semantic-layer`, so a bare `pip install orionbelt-semantic-layer` stays lean. The `/convert`, `/models/from-osi`, and `/osi` endpoints return a clear 503 when the converter is absent. Install it with `pip install 'orionbelt-semantic-layer[osi]'` (or `pip install osi-orionbelt`); the `flight` and `flight-duckdb-only` deploy extras bundle it, so the shipped API images keep OSI conversion working. The previous `force-include` of the converter into the wheel and the `COPY osi-obml` lines in all three Dockerfiles are removed.
- Third-party vendor extension preservation. OSI `custom_extensions` from vendors the converter does not handle internally (e.g. `SNOWFLAKE`, `DBT`, `SALESFORCE`, `GOODDATA`) now round-trip verbatim at the model, dataObject/dataset, column/field, and measure/metric levels. OSI has no separate dimension entity, so an OBML dimension's foreign extensions surface on its OSI field.
Changed
- Converter vendor identity. OBML -> OSI now tags OrionBelt-proprietary payloads as `ORIONBELT` (was `COMMON`), and OSI -> OBML stashes OSI-native fields OBML cannot hold (unique keys, field labels, leftover `ai_context`) under `OSI` (was the misleading `OBSL`). Read paths still accept the legacy `COMMON`/`OBSL` tags, so previously emitted documents keep round-tripping. The converter self-identifies as `osi-orionbelt` in its roundtrip metadata. `vendor_name` is an open string in OSI, so this is a non-breaking output change.
- UI: Export as OSI. The Model Workbench export button (now `⬇ Export as OSI`) shows clean, directly copyable OSI YAML in the preview box (relabelled to "OSI YAML (exported)", reset to "Generated SQL" on the next compile/validate/import) and downloads the result as `model.osi.yaml`, instead of prepending a status banner into the YAML. The validation status moves to the explain box.
Full Changelog: v2.9.0...v2.10.0
v2.9.0
[2.9.0] - 2026-06-11
Added
- Model-level default locale (`settings.defaultLocale`). A model can declare a BCP-47 locale tag (e.g. `de-DE`) that becomes the default for result value formatting (thousand/decimal separators) on `/v1/query/execute?format_values=true`. Resolution order at request time: explicit `?locale=` → `settings.defaultLocale` → `DEFAULT_LOCALE` env. Added to `ModelSettings` and the JSON schema (replacing the never-implemented rich `locale` object that had shipped in the schema since the initial commit).
- OSI ontology export. OBML models can now be exported to the OSI ontology layer (the conceptual EntityType/relationship layer defined by `ontology.json`), in addition to the existing OSI core-spec export. OSI validates the two layers with separate schemas and keeps them in separate documents, so the ontology is returned as a distinct, individually-valid artefact:
  - `POST /v1/convert/obml-to-osi` accepts `include_ontology: true`.
  - `GET /v1/sessions/{id}/models/{mid}/osi` accepts `?include_ontology=true`.
  - When requested, the response carries `ontology_yaml` plus its own `ontology_validation`; the core-spec `output_yaml` is unchanged (default behaviour is fully backward-compatible).
  - Mapping: each `dataObject` becomes an `EntityType` concept; each join becomes a relationship whose `multiplicity` derives from the join `joinType` (`many-to-one` to `ManyToOne`, `one-to-one` to `OneToOne`); `concept_mappings` bind concepts to physical columns. Many-to-many joins, named secondary paths, measures/metrics, and column-level value concepts are not represented and surface as conversion `warnings`. See `osi-obml/osi_obml_ontology_mapping_analysis.md`.
  - The OSI ontology JSON Schema (`osi-obml/osi-ontology-schema.json`) is vendored and bundled into the wheel; its external core-spec `$ref`s resolve against the local vendored core schema so validation never touches the network.
  - An ontology importer (OSI ontology to OBML) is intentionally deferred while OSI remains at `0.2.0.dev0`; import the OSI core spec instead.
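The locale resolution order for value formatting (explicit `?locale=`, then `settings.defaultLocale`, then the `DEFAULT_LOCALE` environment variable) amounts to a first-non-empty lookup. The sketch below illustrates that order only; the function name and signature are hypothetical, and what happens when none of the three is set is not stated in the notes, so the example simply returns `None`.

```python
import os
from typing import Mapping, Optional


def resolve_locale(query_locale: Optional[str],
                   model_default: Optional[str],
                   env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """First non-empty value wins: ?locale=, settings.defaultLocale, env."""
    env = os.environ if env is None else env
    for candidate in (query_locale, model_default, env.get("DEFAULT_LOCALE")):
        if candidate:
            return candidate
    return None  # assumed behaviour when nothing is configured


env = {"DEFAULT_LOCALE": "en-US"}
print(resolve_locale("fr-FR", "de-DE", env))  # fr-FR
print(resolve_locale(None, "de-DE", env))     # de-DE
print(resolve_locale(None, None, env))        # en-US
```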
Fixed
- OBML JSON schema realigned with the model. An audit of `schema/obml-schema.json` against `models/semantic.py` (the source of truth) surfaced drift, now resolved:
  - Top-level `name` is now accepted. The schema's root `additionalProperties: false` had been rejecting valid multi-model OBML that sets `name:`, even though the model and resolver fully support it.
  - Removed the vestigial `locale` object definition and `measure.functions` property (both present, unimplemented, since the initial commit). `locale` is superseded by `settings.defaultLocale`; the schema's `deciamlSep` typo disappeared with the removed block.
  - Trimmed the measure-filter definitions (`filter`, `parameterValue`) to the implemented surface: dynamic-date filters and `time`/`timestamp`/`millis` filter value types were schema-only and rejected at load. They remain a possible future feature, not a schema claim.
  - Added a regression guard (`tests/unit/test_schema_model_alignment.py`) asserting the schema stays aligned with the model.
Full Changelog: v2.8.0...v2.9.0
v2.8.0
[2.8.0] - 2026-06-02
Added
- Session-scoped OSI model endpoints. Two new endpoints bridge the model store with Open Semantic Interchange (OSI), distinct from the existing stateless `/v1/convert/*` transforms:
  - `POST /v1/sessions/{id}/models/from-osi` accepts OSI YAML (`osi_yaml`), converts it to OBML, and loads the result into the session's model store. Returns the standard model summary plus `conversion_warnings` and the advisory OSI `input_validation`.
  - `GET /v1/sessions/{id}/models/{mid}/osi` exports a loaded model from the store as OSI YAML, with optional `?model_name=`, `?model_description=`, and `?ai_instructions=` overrides.
- `ModelStore.get_raw()`: public accessor for a model's raw OBML dict, preferring the faithful copy captured at load time and returning a deep copy so callers cannot mutate internal state.
Fixed
- OSI converter not packaged for non-editable wheel installs. The converter (repo-root `osi-obml/`) was only discoverable via repo-root or `/app` paths, so a PyPI wheel or Docker install raised `ModuleNotFoundError` from the `/convert` and new OSI endpoints. The converter module and its OSI schema are now bundled into the wheel as package data under `orionbelt/_osi_obml/` (hatch `force-include`), and the lookup searches that location first. Verified on a clean wheel install.
- Mermaid ER diagram clipped every label by one character (`string` to `strin`, `Supplier` to `Supplie`). Mermaid pre-measures ER column and edge-label widths with the theme `fontFamily`, but the browser painted text with a wider font the host CSS cascaded in. Pinned a local-only font stack (`Helvetica, Arial, sans-serif`) in the diagram's `%%{init}%%` `themeVariables.fontFamily` and forced the same family on the rendered ER text, so measure-time and paint-time use the identical font.
Full Changelog: v2.7.10...v2.8.0
v2.7.10
[2.7.10] - 2026-06-01
Fixed
- `GET /v1/reference/schemas/obml` and `/v1/reference/schemas/query` returned `HTTP 500: Schema file '...' is missing from this deployment` on every non-editable install (PyPI wheel and Docker / Cloud Run). The loader resolved the JSON Schema files via `Path(__file__).resolve().parents[4] / "schema"`, which only equals the repo root in a source / editable layout; in an installed wheel that path points into `site-packages` and the files were never shipped there (`packages = ["src/orionbelt"]` excluded the repo-root `schema/` directory). The test suite missed it because it runs editable, where the buggy assumption holds. The schema files are now shipped inside the wheel as package data under `orionbelt/schema/` (via hatch `force-include`) and loaded through `importlib.resources`, with a source-tree fallback for editable checkouts. Added a regression test that exercises the loader directly.
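The loading strategy described in this fix (package data first, source tree second) can be sketched as follows. This is a generic illustration of the `importlib.resources` pattern, not the actual OrionBelt loader; the function name, its parameters, and the error message are assumptions.

```python
import json
from importlib import resources
from pathlib import Path
from typing import Optional


def load_schema(package: str, filename: str,
                fallback_dir: Optional[Path] = None) -> dict:
    """Load a JSON schema shipped as package data, else from a source tree."""
    try:
        # Works for installed wheels because it asks the import system,
        # not the filesystem layout around __file__.
        resource = resources.files(package).joinpath(filename)
        if resource.is_file():
            return json.loads(resource.read_text(encoding="utf-8"))
    except ModuleNotFoundError:
        pass  # package not importable in this environment
    if fallback_dir is not None:
        candidate = fallback_dir / filename
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding="utf-8"))
    raise FileNotFoundError(f"Schema file {filename!r} not found")
```

A regression test for a loader like this is most useful when it runs against a built wheel in a clean environment, since an editable checkout satisfies both lookup paths.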
Full Changelog: v2.7.9...v2.7.10
v2.7.9
[2.7.9] - 2026-05-27
Fixed
- Colab notebook still broken on v2.7.8: `/v1/query/execute` returned `HTTP 503: ob-flight-extension package is not installed` on every query cell. v2.7.8 dropped ob-flight-extension from the notebook's `_REQUIRED` map on the assumption the quickstart only queries via REST and never opens a Flight SQL connection. That assumption was wrong: `src/orionbelt/service/db_executor.py` imports `ob_flight.db_router.get_credentials` unconditionally for credential lookup on every dialect, including DuckDB. v2.7.8 unbroke API startup but moved the failure two cells later. Restored `ob-flight-extension` to the `_REQUIRED` map; PyPI now has 2.6.1 (published during the v2.7.8 cycle) with the `cache=` kwarg the API expects, so the install resolves cleanly. v2.7.8's `except TypeError` guard in the lifespan stays as forward-compat insurance.
Tooling
- Notebook smoke workflow had been masking Colab regressions. The existing job ran inside `uv sync --all-extras` which always installs every `drivers/*` package as a workspace member; the test environment had `ob-flight-extension==2.6.1` from local source even when PyPI was at 2.1.0. My v2.7.8 verification missed cell-level errors because the wrapper script crashed before printing PASS/FAIL and I read the bg task's exit code as success. Added a second workflow job `notebook-pypi-equivalent` that builds the OBSL wheel from PR source, installs it + side packages strictly from PyPI into a plain `python -m venv` (no uv, no workspace), executes the notebook end-to-end, and asserts every code cell ran cleanly (mermaid.ink transient 503s are filtered as the only allowed exception). The workflow now also triggers on `src/orionbelt/**` and `pyproject.toml` changes - both v2.7.7 and v2.7.8 shipped Colab regressions through `src/` changes that the path-pinned trigger missed.
Background (this is the 5th release touching notebook bugs)
The drift was real and the root cause was incomplete test coverage:
| Release | Notebook fix | What it missed |
|---|---|---|
| v2.7.5 | Added notebook smoke workflow with xfail | Workflow ran in uv workspace, mirrored Colab poorly |
| v2.7.6 | #87 (install cell idempotent), #88 (show_yaml typo), #89 (UI fallback) | Workflow still in workspace; Colab still broken upstream |
| v2.7.7 | #94 (uv-venv-no-pip), #91 / #92 / #94 bundle | Workflow finally green; Colab pip install ob-flight-extension resolves stale PyPI 2.1.0, TypeError on lifespan |
| v2.7.8 | #96 (drop ob-flight from notebook + catch TypeError) | Lifespan no longer crashes, but db_executor still requires ob_flight -> HTTP 503 on every execute |
| v2.7.9 | Restore ob-flight (now that PyPI has 2.6.1) + add clean-venv workflow job | Real gate against the regressions above |
Full Changelog: v2.7.8...v2.7.9