v0.4.0
Pre-release
OntoBricks — Release Notes V0.4.0
Release window: May, 2026
Test status: all changes shipped with the suite green (≥ 2003 passing, 80 skipped).
Highlights
- Lakebase GraphDB engine (full): Postgres-backed triple store via Databricks Lakebase Autoscaling completely replaces LadybugDB as the primary graph backend. Synchronization can be done in two (load) modes:
  - `app_managed` (direct streaming into Postgres)
  - `managed_synced` (Lakeflow-managed Unity Catalog synced-table pipeline)

  Both modes share the same 3-object Postgres layout (`*_sync` + `*__app` companion + union view).
- Managed Sync pipeline end-to-end: `SyncedTableManager` handles UC synced-table registration, Lakeflow pipeline polling, ghost control-plane state recovery, union-view creation, and all downstream Digital Twin build steps, with live progress visible in the app log and Build page.
- Registry OBX Export / Import: Export one or several registry domains to a single `.obx` (JSON) file with per-domain version-mode selection; import with per-domain Skip / Overwrite / Rename conflict resolution. Format-version gating ensures future backward compatibility.
- Ontology Pitfalls Detector: D2KLab's OPD (Apache-2.0) integrated as a new Ontology sidebar panel. Detects 19 structural, logical, and semantic pitfalls across four categories, powered by an async TaskManager job. ML-heavy checks can be enabled optionally via the `[pitfalls]` extra.
- HL7 FHIR R5 / R4B / R4 industry import: FHIR added as a fourth importable ontology alongside FIBO, CDISC, and IOF. OWL restriction-based property extraction, six domain buckets (Foundation required + Clinical/Diagnostics/Medications/Workflow/Financial), user-selectable version.
- Ontology labels throughout: Labels (with name fallback) now propagate to the KG detail panel, graph-chat agent responses, MCP tool outputs, ontology viewer link labels, and entity-type / predicate columns.
- Security: Closed urllib3 GHSA-mf9v / GHSA-qccp (#27, #28) by bumping to 2.7.0; GitPython bumped to 3.1.50 for GHSA-x2qx (CVE-2026-42215 follow-on); Mako ≥ 1.3.12, python-multipart ≥ 0.0.27 retained from v0.3.x.
- CI simplification: Sphinx HTML build removed from CI; generated artifacts gitignored; `scripts/build_docs.sh` retained for local on-demand builds.
Lakebase GraphDB Engine
- New pluggable graph backend (`graph_engine = "lakebase"`) alongside LadybugDB. Selected in Settings → Graph DB.
- Process-wide Postgres connection pool with JWT-aware Lakebase auth (`lakebase/pool.py`).
- `LakebaseFlatStore` implements DDL (triple table + `datatype` / `lang` RDF columns), CRUD, `VACUUM ANALYZE` optimize, bounded-memory `bulk_insert_iter`, keyed-pagination `iter_triples`, and a `_sql_relation` override for physical vs logical table name handling.
- Factory-only dispatch (`GraphDBFactory._create_lakebase`): Ladybug and Lakebase are mutually exclusive; engine config validated on save.
- `LAKEBASE_AVAILABLE` capability flag; `TripleStoreFactory` includes Lakebase in availability detection.
- Reference DDL: `src/back/core/graphdb/lakebase/schema.sql`; autodoc page: `docs/sphinx/api/app.core.graphdb.lakebase.rst`.
App-managed companion layout
- `app_managed` builds now use the same 3-object Postgres layout as `managed_synced`:
  - `*_sync`: bulk warehouse data (streamed by the build pipeline)
  - `*__app`: companion (reasoning / materialise writes)
  - union view: single read surface
- `LakebaseFlatStore.bulk_load_into_sync()` for the build pipeline; `_writable_table_id()` always returns the companion (`*__app`).
- `drop_table()` cleans up all 3 objects; `optimize_table()` vacuums both `*_sync` and `*__app`.
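
The 3-object layout can be illustrated with a small, self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in for Postgres, and the column set, the view name (the bare base name), and the use of `UNION ALL` are all assumptions for illustration, not the shipped DDL (see `schema.sql` for that).

```python
import sqlite3


def create_layout(conn, base):
    """Create <base>_sync, <base>__app and a union view over both.

    Assumption: the view is named <base> and uses UNION ALL; the real
    layout may differ. Columns are a simplified triple shape.
    """
    cols = "subject TEXT, predicate TEXT, object TEXT"
    conn.execute(f'CREATE TABLE "{base}_sync" ({cols})')   # bulk build data
    conn.execute(f'CREATE TABLE "{base}__app" ({cols})')   # companion writes
    conn.execute(
        f'CREATE VIEW "{base}" AS '
        f'SELECT * FROM "{base}_sync" UNION ALL SELECT * FROM "{base}__app"'
    )


conn = sqlite3.connect(":memory:")
create_layout(conn, "triples")

# The build pipeline writes to *_sync; reasoning writes go to *__app.
conn.execute('INSERT INTO "triples_sync" VALUES (?, ?, ?)',
             ("ex:a", "ex:knows", "ex:b"))
conn.execute('INSERT INTO "triples__app" VALUES (?, ?, ?)',
             ("ex:a", "rdf:type", "ex:Person"))

# Readers only ever query the union view.
rows = conn.execute(
    'SELECT subject, predicate, object FROM "triples" ORDER BY predicate'
).fetchall()
```

Keeping writes split across two tables while reads go through one view means the bulk table can be replaced wholesale on rebuild without touching companion rows.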
Settings — Graph DB tab
- Cascading Lakebase pickers: Project → Branch → Database → Schema (with manual pencil override). UC catalog picker triggers a UC schema picker for Managed Sync configuration.
- Health probe (`GET /settings/graph-engine/lakebase-health`): uses `pg_catalog` queries (privilege-independent) and the same `resolve_lakebase_graph_schema` logic as the build pipeline.
- Loading spinner while Graph DB tab data fetches.
- Live UC synced-table name preview (Settings → Managed Sync panel) showing all 4 Postgres / UC object names before the first build.
- Lakebase Objects panel: lists all user-visible schemas, tables, and views in the configured database; Drop button (owner-only, Bootstrap confirm modal) for objects the service principal owns.
- Local Graph Files panel hidden when Lakebase is selected (shows only for LadybugDB).
- `GET /settings/graph-engine` and `GET /settings/graph-engine-config` now allowed for all app users (POST remains admin-only).
- Graph engine choice persisted in `global_config.config` under `graph_engine` / `graph_engine_config` and mirrored into the domain registry entry.
Schema resolution
- `resolve_lakebase_graph_schema`: explicit `graph_engine_config.schema` wins; falls back to Registry Volume schema; then `DEFAULT_GRAPH_SCHEMA` (`ontobricks_graph`).
- `resolve_lakebase_graph_database`: explicit `graph_engine_config.database` wins; falls back to `RegistryCfg.lakebase_database`; then auth default.
- UC sync FQN (`synced_uc_name`) always uses the registry UC schema (`RegistryCfg.schema`) so the Lakeflow synced object lands in the same Unity Catalog namespace as all other registry artefacts.
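
The precedence order described above can be sketched as a simple first-non-empty lookup. The function name and parameters here are hypothetical, and treating blank strings as "unset" is an assumption; only `DEFAULT_GRAPH_SCHEMA` and its value come from these notes.

```python
from typing import Optional

DEFAULT_GRAPH_SCHEMA = "ontobricks_graph"


def resolve_graph_schema(explicit: Optional[str],
                         registry_schema: Optional[str]) -> str:
    """Return the first usable schema name in precedence order.

    1. explicit graph_engine_config.schema
    2. Registry Volume schema
    3. DEFAULT_GRAPH_SCHEMA
    Assumption: None and whitespace-only values count as unset.
    """
    for candidate in (explicit, registry_schema):
        if candidate and candidate.strip():
            return candidate.strip()
    return DEFAULT_GRAPH_SCHEMA
```

Because the health probe and the build pipeline share the same resolver, a single function like this keeps the Settings page and the actual build target from drifting apart.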
Managed Sync / Lakeflow Pipeline
- `SyncedTableManager`: handles UC synced-table registration, trigger (full refresh), Lakeflow pipeline polling (`wait_for_completion` via `get_update(update_id)` + idle-wait fallback), and an `on_state_change` callback for live task context updates.
- `_normalize_state` strips the `SYNCED_TABLE_` prefix from SDK enum names so terminal/in-progress sets match correctly.
- Ghost control-plane state recovery: when a previous synced-table was deleted outside the API, `_is_ghost_control_plane_state` detects the conflict; `ensure()` tries `DELETE` then re-`CREATE`, with `_b/c/d` fallback names if the primary slot is permanently reserved.
- `ensure_synced_union_view()` runs after Lakeflow materializes `*_sync`; schema-qualifies the `_sync` reference when Postgres and UC schemas differ; drops an existing table with the same name before creating the view.
- Auto-repair: if the union view is absent (crashed previous build), `repair_synced_view_if_possible` recreates it from the existing `*_sync` / `*__app` objects.
- Synced table payload uses `database_instance_name` (project) + `database_branch` + `logical_database_name` (required by the Lakebase Synced Tables API for Autoscaling projects).
- `LakebaseAuth.branch_name` property parses the branch segment from the PGHOST endpoint resource path.
- Build pipeline: stores `lakebase_synced_uc` / `lakebase_pipeline_id` in task context; frontend polls `/dtwin/sync/pipeline-status` every 6 s; 30-second terminal-OK grace window before build is declared complete.
- `SELECT DISTINCT` fix in R2RML-to-Spark SQL templates to prevent duplicate triples causing Lakeflow PK violations.
- UC schema auto-created (`CREATE SCHEMA IF NOT EXISTS`) before synced-table registration, with `conn.commit()` after DDL.
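
The state normalization mentioned above is essentially a prefix strip applied before set membership checks. The sketch below is a minimal illustration: the `DemoState` enum and its member names are made up for the example and are not the SDK's actual enum, and the function name is hypothetical.

```python
import enum

PREFIX = "SYNCED_TABLE_"


class DemoState(enum.Enum):
    """Illustrative stand-in for an SDK state enum (names are assumptions)."""
    SYNCED_TABLE_ONLINE = "online"
    SYNCED_TABLE_PROVISIONING = "provisioning"


def normalize_state(raw):
    """Return an upper-case state name without the SYNCED_TABLE_ prefix.

    Accepts either an enum member (uses .name) or a plain string.
    """
    name = str(getattr(raw, "name", raw)).upper()
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


# Hypothetical terminal set; comparisons work for prefixed and bare names.
TERMINAL_OK = {"ONLINE"}
is_done = normalize_state(DemoState.SYNCED_TABLE_ONLINE) in TERMINAL_OK
```

Normalizing once at the boundary lets the terminal and in-progress sets be written with short names regardless of which form the API returns.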
Digital Twin Build — UI & UX
- Build page Graph DB card: compact in-card build note showing `database.schema.table`; existence badges for `dtLakebaseTableExists` and `dtLakebaseSyncedUcExists`; Lakeflow line shows `catalog.schema.<physical_table>_sync`.
- Build log card: engine-specific title, Lakebase Pipeline UC FQN, `archive` step hidden for Lakebase builds.
- "Backing up graph to registry" archive step removed entirely for Lakebase deploys (no Volume backup needed).
- Post-build session cache (`_populate_session_cache`): Lakebase path sets `graph_has_data = final_count > 0`, `graph_engine`, `registry_archive_applicable = False`; LadybugDB path unchanged.
- Per-section `_ts` cache timestamp prevents cross-section staleness (previously a shared clock caused a "Loaded" badge + "not built" text contradiction).
- Triplestore stats cache schema version (`_TS_STATS_CACHE_SCHEMA_VERSION = 2`) invalidates old formatted strings on upgrade.
- Legacy `local_lbug_exists` / `local_lbug_path` field names retired; renamed to `graph_has_data` / `graph_display` throughout backend, build pipeline, and frontend.
Cockpit (Domain Validation)
- Graph DB card parity with the Build page: Database / Schema / Table / UC sync row layout; `psDtLakebaseTableExists` and `psDtLakebaseSyncedUcExists` existence badges.
- `HomeService.dtwin_detail` enriched with all lakebase fields (`lakebase_table_exists`, `lakebase_database`, `lakebase_schema`, `lakebase_table`, `lakebase_synced_uc`, `lakebase_sync_mode`).
- `triple_count` prefers `dt_existence` over `ts_status` for accuracy.
Registry OBX Export / Import
- New `src/back/objects/registry/obx_format.py`: `CURRENT_OBX_FORMAT_VERSION = 1`, upgrader-chain pattern, `build_envelope()`, `load()` with format-version validation and `min_ontobricks_version` gate.
- Export modes per domain: `all`, `active`, `latest`, `selected` (per-version checkboxes).
- Import: preview step shows per-domain conflict flags + suggested rename; apply step resolves each domain with `skip` / `overwrite` / `rename`. 50 MB upload cap.
- UI: Export modal (per-domain checkboxes + version mode selector) and Import modal (2-step: file picker → preview → decisions) on Registry → Browse page.
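
An upgrader chain of the kind described above typically maps each old format version to a function that produces the next one, and rejects files newer than the running release. The sketch below is a generic illustration, not the contents of `obx_format.py`: the field names (`format_version`, `domains`) are hypothetical, and the current version is set to 2 here only so that one upgrade step can be demonstrated (the release itself ships format version 1).

```python
DEMO_CURRENT_VERSION = 2  # demo value; the release ships format version 1


def _v1_to_v2(doc):
    """Hypothetical upgrader: copy the envelope and bump its version."""
    out = dict(doc)
    out["format_version"] = 2
    out.setdefault("domains", [])
    return out


# One entry per historical version: version N -> function producing N + 1.
UPGRADERS = {1: _v1_to_v2}


def load_envelope(doc):
    """Validate the format version, then upgrade step by step."""
    version = doc.get("format_version")
    if not isinstance(version, int) or version < 1:
        raise ValueError("missing or invalid format_version")
    if version > DEMO_CURRENT_VERSION:
        raise ValueError("file was written by a newer release")
    while version < DEMO_CURRENT_VERSION:
        doc = UPGRADERS[version](doc)
        version = doc["format_version"]
    return doc


upgraded = load_envelope({"format_version": 1})
```

With this shape, adding a future format only requires one new upgrader function; older files continue to load by walking the chain.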
Ontology Pitfalls Detector
- `src/back/core/external/pitfalls/` subpackage (vendored D2KLab OPD, Apache-2.0): `OntologyPatternToolkit` with 19 `run_p*` methods across P1–P4 categories; `PitfallsService` entry point serializes the rdflib Graph to temp TTL and returns grouped results.
- Optional `[pitfalls]` extra in `pyproject.toml`: sentence-transformers, scikit-learn, NLTK, SciPy. ML imports inside `try` / `except` so taxonomy constants remain accessible without ML deps.
- Three API routes: `GET /ontology/pitfalls/taxonomy`, `POST /ontology/pitfalls/analyze` (async via TaskManager), `GET /ontology/pitfalls/results/{task_id}`.
- New Ontology sidebar panel ("Pitfalls"): pattern selector with ⚡ (graph-only) and 💻 (ML-required) speed icons, Bootstrap tooltip per check, progress bar, accordion results by category.
- `scripts/start.sh` and `requirements.txt` updated to include `--extra pitfalls` alongside `--extra lakebase`.
- Licenses documented: sentence-transformers (Apache-2.0), scikit-learn (BSD-3-Clause), NLTK (Apache-2.0), SciPy (BSD-3-Clause), D2KLab vendored code (Apache-2.0).
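
The optional-dependency guard mentioned above can be sketched as follows. This is a generic pattern rather than the vendored code: the helper names and the two demo checks are hypothetical, and the real detector defines 19 checks of its own.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Hypothetical check registry; constants stay importable without ML deps.
CHECKS = (
    {"id": "demo_graph_check", "needs_ml": False},
    {"id": "demo_ml_check", "needs_ml": True},
)


def runnable_checks(ml_available):
    """List check ids that can run given whether ML deps are present."""
    return [c["id"] for c in CHECKS if ml_available or not c["needs_ml"]]
```

A caller could pass `optional_import("sentence_transformers") is not None` as `ml_available`, so that only graph-only checks are offered when the `[pitfalls]` extra is not installed.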
HL7 FHIR Industry Ontology
- `src/back/core/industry/fhir/FhirImportService.py`: `FHIR_DOMAINS` catalog (6 domain groups), `_fetch_fhir_ttl(version)`, `_build_allowed_resources()` (always includes `_FHIR_COMPLEX_TYPES`), OWL restriction-based property extraction via `_extract_properties_from_restrictions`.
- Multi-version support: R4, R4B, R5 (default). `GET /ontology/fhir-versions`; version threaded through fetch, transform, and import.
- Base hierarchy stubs: `Base → Resource → DomainResource`, `Base → Element → DataType / BackboneElement / BackboneType` always injected to avoid dangling parent references.
- Self-referencing `fhir:Resource` bug fixed (parent initialised to `""`, not `"Resource"`); non-FHIR-namespace `rdfs:subClassOf` URIs (w5:, rim:, dc:) skipped during parent resolution.
- Import UI: FHIR tab added alongside FIBO/CDISC/IOF; version dropdown (R4/R4B/R5); Foundation required guard uses `showConfirmDialog` (not native `confirm()`).
- All four import handlers (FIBO, CDISC, FHIR, IOF) now use `showConfirmDialog` for Databricks Apps compatibility.
Cohort Discovery
- `CohortBuilder._resolve_predicate`: BFS predicate alias map that resolves predicates in the ontology namespace (`#` form) to the data namespace (`/` form), then falls back to an alias map by local name for cross-namespace predicates (e.g. `ontobricks.com/ontology#hasclaim` in a domain with a different base URI). Fixes silent zero-neighbour results for direct inserts and W3C OWL round-trips.
- `_outgoing_edge_index` normalises every triple predicate with `_resolve_predicate` before indexing.
- Cohort UI: `_dataPropsForClass(classUri)` filters the attribute dropdown to the hop-target entity's own data properties (was showing all properties).
- Trace diagnostics: `in_frontier === 0` guard prevents wrong "no neighbours" blame when the frontier is empty.
- Synthetic test data (`data/customer/generate_data.py`): shared electricity contract pool (40 slots, 35% pool share rate) ensures multi-customer contract nodes exist for cohort edge formation; `%s` → `?` parameter placeholder fix for Databricks SQL connector.
- NL agent "Describe (prompt)" tab removed from Cohorts to simplify the workflow.
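
The two-stage predicate resolution described above (`#` form to `/` form, then a local-name alias lookup) can be sketched roughly as below. This is a simplified illustration under stated assumptions: the function names and example URIs are hypothetical, local names are matched exactly, and the BFS traversal used by the real `_resolve_predicate` is not reproduced.

```python
def local_name(uri):
    """Return the text after the last '#' or '/' in a URI."""
    idx = max(uri.rfind("#"), uri.rfind("/"))
    return uri[idx + 1:]


def build_alias_map(data_predicates):
    """Index known data-namespace predicates by their local name."""
    return {local_name(p): p for p in data_predicates}


def resolve_predicate(uri, data_predicates, aliases):
    """Map an ontology predicate URI to the form stored in the data."""
    if uri in data_predicates:
        return uri
    if "#" in uri:
        # Stage 1: same base URI, '#' separator swapped for '/'.
        base, _, name = uri.rpartition("#")
        candidate = f"{base}/{name}"
        if candidate in data_predicates:
            return candidate
    # Stage 2: cross-namespace fallback by local name (may return None).
    return aliases.get(local_name(uri))


data = {
    "https://acme.example/data/hasclaim",
    "https://ontobricks.com/ontology/owns",
}
aliases = build_alias_map(data)
same_base = resolve_predicate("https://ontobricks.com/ontology#owns", data, aliases)
cross_ns = resolve_predicate("https://ontobricks.com/ontology#hasclaim", data, aliases)
```

Resolving predicates before building the edge index is what avoids the silent zero-neighbour case: an unresolved predicate would simply never match any stored triple.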
Knowledge Graph & Inference
- SWRL inference SQL: `build_inference_sql` now uses BFS traversal (`find_connected_vars` + `order_connected_props`) for chained property atoms, matching `build_violation_sql`. Fixes `'' AS object` in multi-hop rules → 0 triples reaching Lakebase.
- After `materialize_graph` success (count > 0), `SigmaGraph.refreshCurrentExpansion()` auto-reloads the visible graph; if no filter is active, an inline hint points to the KG tab. Zero-triples result badge changed from green OK to yellow Warning for non-HTTP(S) URI schemes.
- Entity-type dropdown loop fixed: removed `data-sg-change="populateFilterEntityTypes"` that re-triggered the fetch on selection.
- `TripleStoreBackend.find_seed_subjects` now treats `field="any"` as the union of label-match and URI-match, matching the LadybugDB Cypher implementation.
- KG right pane: incoming / outgoing predicate labels resolved via `findOntologyProperty` (label → name → raw fallback).
- KG page bottom cut off fixed: `.sidebar-content:has(#sigmagraph-section.active)` strips container padding; section uses `height: 100%`.
Ontology Designer UX
- Label preservation on save: `saveSharedEntity` / `saveSharedRelationship` copy `existing.uri` into the updated object before overwriting, preventing `prune_mappings_to_ontology_uris` from treating the mapping as orphaned.
- Mapping designer crash fixed: `mapping-import.js` top-level `getElementById(...).addEventListener(...)` calls changed to optional chaining (`?.`) to prevent null-dereference crashes in Databricks Apps.
- Inheritance + relationship link resolution: `resolveNodeId()` helper in `mapping-design.js` tries exact match, case-insensitive match, and URI local-part extraction, fixing missing links after OWL imports with unnormalized parent/domain/range values.
- Active tab persistence: `_entityPanelActiveTab` / `_relPanelActiveTab` remember the last active tab (Details / Attributes / Actions / Constraints / etc.) across entity/relationship selections in the designer right pane.
- Ontology viewer labels: link objects carry `label: prop.label || prop.name`; link text uses the label.
- `findOntologyProperty` no longer requires label (label-only guard removed); callers decide the fallback.
Settings & Deployment
- Three-schema Lakebase permission bootstrap: `scripts/deploy.config.sh` adds `LAKEBASE_GRAPH_PROJECT/BRANCH/DATABASE` (separate graph instance) and `LAKEBASE_SYNC_SCHEMA` (managed_synced schema). `scripts/deploy.sh` calls `bootstrap-lakebase-perms.sh` for each of the three schemas (registry, graph, sync).
- `scripts/setup-lakebase.sh` (new): creates a Lakebase project via `POST /api/2.0/database/instances` (Synced Tables-compatible API), waits for AVAILABLE, creates the Postgres database, and prints the `db-…` segment for `deploy.config.sh`. Mandatory for new projects: the UI "New project" button uses a different API that is incompatible with Synced Tables.
- `docs/deployment.md`: added Step 0 (new-workspace setup), Step 5.1b, `setup-lakebase.sh` reference in checklists.
- `docs/lakebase-graphdb.md` (new): architecture overview, prerequisites, provisioning guide, all config keys, write modes comparison, Postgres schema layout, scripts reference, permissions bootstrap order, Digital Twin build steps, troubleshooting section.
- UC catalog/schema dropdown warehouse-ID fallback: uses `settings.sql_warehouse_id` (env-injected from the `sql-warehouse` binding) when global config has not been saved yet.
- `scripts/setup.sh` and `scripts/start.sh` migrated from `uv pip install -e ".[lakebase]"` to `uv sync --extra lakebase,pitfalls`.
CI & Developer Experience
- Sphinx removed from CI: `docs:` job removed from `.github/workflows/ci.yml`; `docs/sphinx/_build/` gitignored and 283 stale artifacts removed from the index. Sources and `scripts/build_docs.sh` retained for local on-demand builds.
- `tests/test_settings_lakebase_tier.py` deleted (private methods removed in earlier refactor).
- `tests/test_build_pipeline_streaming.py`: migrated `patch(string)` class/module path collisions to `patch.object` for Python 3.9 compatibility.
Security
- urllib3 2.7.0: closes GHSA-mf9v-mfxr-j63j (decompression bomb) and GHSA-qccp-gfcp-xxvc (header forwarding on redirect). Wheel fetched from pythonhosted.org via `UV_FIND_LINKS` until proxy indexes it; lock entries reference canonical URLs.
- GitPython 3.1.50: closes GHSA-x2qx / CVE-2026-42215 follow-on (newline injection in `config_writer()` section param bypasses 3.1.49 patch).
- Mako ≥ 1.3.12 (retained): CVE-2026-44307 / GHSA-2h4p TemplateLookup path traversal on Windows.
- python-multipart ≥ 0.0.27 (retained): CVE-2026-42561, unbounded multipart part-header DoS.
- `scripts/dl-vuln-fix-wheels.sh` (temporary proxy workaround): to be removed once the proxy indexes both packages.
Code Review Fixes
- `tests/test_build_pipeline_streaming.py`: deleted dead `TestStartBackgroundArchive` class (called removed method, was causing a hard failure).
- `src/api/routers/internal/domain.py:90`: fixed `{"success": False}` for the no-Databricks-client case (unconfigured connection is not an error).
- `src/api/routers/internal/settings.py`: import from package (`from back.objects.domain import SettingsService`) not internal module.
- `src/api/routers/internal/dtwin.py`: `HomeService` and `Domain` promoted to top-level imports; `str | None` union syntax replaced with `Optional[str]` for Python 3.9 compatibility.
Upgrade Notes
- Lakebase project provisioning: if you created your Lakebase project via the Databricks UI "New project" button, it is not compatible with the Synced Tables API. Re-provision using `scripts/setup-lakebase.sh`; see `docs/lakebase-graphdb.md` §Prerequisites.
- Three-schema permission bootstrap: run `make bootstrap-lakebase` after every deploy. If you use `managed_synced` mode, set `LAKEBASE_SYNC_SCHEMA` in `deploy.config.sh` and the deploy script will grant the third schema automatically.
- Graph DB schema field is now authoritative: if you have `graph_engine_config.schema` set in Settings → Graph DB, it takes precedence over the Registry Volume schema. Verify your Postgres schema matches what is shown in the Settings → Graph DB health card.
- `local_lbug_exists` / `local_lbug_path` renamed: any custom monitoring or integration consuming the `/dtwin/sync/info` response must switch to `graph_has_data` and `graph_display`.
- `[pitfalls]` optional extra: Pitfalls detection requires `uv sync --extra pitfalls` (or `pip install ".[pitfalls]"`). The panel will show a warning banner if ML deps are absent and only graph-only checks will run.
- FHIR import (new): available immediately; Foundation domain group is always included; select version (R4/R4B/R5) in the import UI before clicking Import.
- OBX export/import: new Export / Import buttons appear on Registry → Browse. No migration required for existing domains.