feat(oxabl_schema): add .df parser and Schema loader (Phase 2) #48

Merged
evanbrobertson merged 1 commit into master from feat/oxabl-schema on Apr 18, 2026

@evanbrobertson
Contributor

Summary

Adds the first new crate of the Semantic Layer v1 plan (Phase 2): oxabl_schema, a parser and in-memory model for Progress OpenEdge .df dump files.

  • Parser: hand-written recursive descent following Riverside Software's DumpFileGrammar.g4 (MIT, sonar-openedge). V1 scope: ADD TABLE, ADD FIELD, and ADD INDEX with their attributes. ADD SEQUENCE/DATABASE/CONSTRAINT, UPDATE, DROP, RENAME, annotations, and the PSC trailer are accepted and silently skipped so that format drift never hard-errors the loader. Unknown attributes on recognised directives round-trip as opaque (name, value) extras.
  • Schema model: Schema with arena-allocated Table/Field/Index, an opaque TableId, and a private-field SchemaRevision as the forward contract for future incremental reanalysis. Case-insensitive keying via OxablAtom (runtime interning; the string_cache spike from the round-two plan review confirmed that the existing lexer regime works, so no lasso fallback is needed).
  • Loader: SchemaLoader::load_files merges multiple .df files with last-write-wins. Structured diagnostics: SCHEMA0010 (duplicate table, warning), SCHEMA0011 (duplicate field, warning), SCHEMA0012 (field-type conflict across files, error; the field is poisoned to SchemaType::Error), SCHEMA0030 (workspace-root containment), and SCHEMA0031 (soft caps: 100k tables, 10k fields per table).
  • Fixtures: Riverside's sp2k.df (MIT) vendored under crates/oxabl_schema/fixtures/ as the integration golden. Covers multi-line quoted strings, multi-index tables, TABLE-TRIGGER/FIELD-TRIGGER modifiers, # comments in hand-edited files, and the PSC/cpstream trailer.
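
The "recognise the V1 subset, skip everything else" rule can be sketched roughly as follows. This is a simplified line classifier, not the crate's recursive-descent parser: `Directive`, `unquote`, and `classify` are hypothetical names, and the whitespace tokenizer ignores multi-line strings and quoted names containing spaces.

```rust
/// Hypothetical, simplified directive kinds for illustration only;
/// these are not the types defined in oxabl_schema.
#[derive(Debug, PartialEq)]
enum Directive {
    Table { name: String },
    Field { name: String, table: String, ty: String },
    Index { name: String, table: String },
    /// Anything outside the V1 subset: accepted, but carries no data.
    Skipped,
}

/// Remove surrounding double quotes from a token.
fn unquote(tok: &str) -> String {
    tok.trim_matches('"').to_string()
}

/// Classify a single directive line. Keywords match case-insensitively.
/// Simplification: tokens are split on whitespace, so quoted names that
/// contain spaces and multi-line strings are not handled here.
fn classify(line: &str) -> Directive {
    let toks: Vec<&str> = line.split_whitespace().collect();
    let upper: Vec<String> = toks.iter().map(|t| t.to_ascii_uppercase()).collect();
    let kw: Vec<&str> = upper.iter().map(String::as_str).collect();
    match kw.as_slice() {
        ["ADD", "TABLE", _, ..] => Directive::Table {
            name: unquote(toks[2]),
        },
        ["ADD", "FIELD", _, "OF", _, "AS", _, ..] => Directive::Field {
            name: unquote(toks[2]),
            table: unquote(toks[4]),
            ty: toks[6].to_ascii_lowercase(),
        },
        ["ADD", "INDEX", _, "ON", _, ..] => Directive::Index {
            name: unquote(toks[2]),
            table: unquote(toks[4]),
        },
        // ADD SEQUENCE, UPDATE, DROP, RENAME, trailer lines, and so on
        // fall through here instead of producing an error.
        _ => Directive::Skipped,
    }
}
```

Under these assumptions, `classify("ADD SEQUENCE \"NextCustNum\"")` yields `Directive::Skipped` rather than an error, which is the property that keeps format drift from failing the load.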

Testing

  • 40 unit tests (atom folding, schema types, parser constructs, loader merge rules, containment)
  • 5 integration tests against the vendored sp2k.df golden
  • Full workspace suite still green (cargo test --workspace)
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • Bench (cargo bench -p oxabl_schema): 5 MB merged .df in 7.2 ms (~694 MB/s). Plan target was 100 ms.

Introduces the first new crate of the semantic-layer v1 plan. Parses the common `.df` subset (`ADD TABLE/FIELD/INDEX` + their attributes) following Riverside Software's `DumpFileGrammar.g4`; unknown directives are skipped silently and unknown attributes round-trip as opaque extras, so format drift never hard-errors the loader. Multi-file merge is last-write-wins with structured diagnostics, including `SCHEMA0012` to catch field-type conflicts across files (a data-integrity gap surfaced during the round-two plan review).
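
A minimal sketch of the field-level merge rule, assuming a plain `HashMap` keyed by lowercased field name. `FieldType`, `Diagnostic`, and `merge_fields` are hypothetical stand-ins for illustration; the crate's own types and loader internals are not shown in this PR description.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's SchemaType.
#[derive(Debug, Clone, PartialEq)]
enum FieldType {
    Integer,
    Character,
    /// Poisoned type assigned after a cross-file conflict.
    Error,
}

#[derive(Debug, PartialEq)]
enum Severity {
    Warning,
    Error,
}

/// Hypothetical diagnostic record; only the codes come from the PR text.
#[derive(Debug, PartialEq)]
struct Diagnostic {
    code: &'static str,
    severity: Severity,
    field: String,
}

/// Merge one file's fields into the accumulated map.
/// Same name, same type: SCHEMA0011 warning, later definition wins.
/// Same name, different type: SCHEMA0012 error, field poisoned to Error.
fn merge_fields(
    merged: &mut HashMap<String, FieldType>,
    incoming: &[(&str, FieldType)],
    diags: &mut Vec<Diagnostic>,
) {
    for (name, ty) in incoming {
        // Assumption: lowercase folding models case-insensitive keying.
        let key = name.to_ascii_lowercase();
        let previous = merged.get(&key).cloned();
        match previous {
            Some(prev) if prev != *ty => {
                diags.push(Diagnostic {
                    code: "SCHEMA0012",
                    severity: Severity::Error,
                    field: key.clone(),
                });
                merged.insert(key, FieldType::Error);
            }
            Some(_) => {
                diags.push(Diagnostic {
                    code: "SCHEMA0011",
                    severity: Severity::Warning,
                    field: key.clone(),
                });
                merged.insert(key, ty.clone());
            }
            None => {
                merged.insert(key, ty.clone());
            }
        }
    }
}
```

Poisoning the field rather than silently keeping the later type means downstream consumers see an explicit error type instead of a plausible but possibly wrong one.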

  - `SchemaType` + `Schema`/`Table`/`Field`/`Index` arena with opaque `TableId`/`SchemaRevision`
  - `OxablAtom`-keyed case-insensitive lookup (zero-alloc fold via stack buffer, matching the lexer's keyword-match path)
  - `SchemaLoader::load_files` + `load_files_with_root` with workspace-root containment (`SCHEMA0030`) and soft caps (`SCHEMA0031`)
  - 45 tests (40 unit + 5 integration against the vendored `sp2k.df` MIT golden); bench shows 5 MB merged load in ~7 ms (target was 100 ms)
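
The arena, opaque handle, and stack-buffer fold can be illustrated with a small sketch. Everything here is an assumption for illustration: plain `String` keys stand in for `OxablAtom`, the 64-byte buffer size and the lowercase fold direction are guesses, and `with_folded`, `add_table`, `lookup`, and `table` are hypothetical names rather than the crate's API.

```rust
use std::collections::HashMap;

/// Opaque handle; in a real crate the inner index would stay private.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TableId(u32);

#[derive(Debug)]
struct Table {
    name: String,
}

#[derive(Default)]
struct Schema {
    /// Arena: every table lives in one Vec and is addressed by TableId.
    tables: Vec<Table>,
    /// Folded (lowercased) name to handle. String keys are a stand-in
    /// for interned atoms.
    by_name: HashMap<String, TableId>,
}

/// Hypothetical buffer size; names longer than this fall back to the heap.
const FOLD_BUF: usize = 64;

/// Call `f` with an ASCII-lowercased copy of `name`, using a stack buffer
/// for short names so the common lookup path does not allocate.
fn with_folded<R>(name: &str, f: impl FnOnce(&str) -> R) -> R {
    let bytes = name.as_bytes();
    if bytes.len() <= FOLD_BUF {
        let mut buf = [0u8; FOLD_BUF];
        let dst = &mut buf[..bytes.len()];
        dst.copy_from_slice(bytes);
        dst.make_ascii_lowercase();
        // Only bytes in A..=Z change, so the buffer is still valid UTF-8.
        let folded = std::str::from_utf8(dst).expect("ASCII folding preserves UTF-8");
        f(folded)
    } else {
        f(&name.to_ascii_lowercase())
    }
}

impl Schema {
    /// Append a table to the arena; a repeated name repoints the lookup
    /// entry to the newest table (last write wins).
    fn add_table(&mut self, name: &str) -> TableId {
        let id = TableId(self.tables.len() as u32);
        self.tables.push(Table { name: name.to_string() });
        self.by_name.insert(name.to_ascii_lowercase(), id);
        id
    }

    /// Case-insensitive lookup that avoids allocating for short names.
    fn lookup(&self, name: &str) -> Option<TableId> {
        with_folded(name, |key| self.by_name.get(key).copied())
    }

    fn table(&self, id: TableId) -> &Table {
        &self.tables[id.0 as usize]
    }
}
```

With this sketch, `lookup("CUSTOMER")` and `lookup("customer")` return the same `TableId` after `add_table("Customer")`, while the table keeps its original spelling.
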
@evanbrobertson merged commit 7b121d2 into master on Apr 18, 2026
6 checks passed
@evanbrobertson deleted the feat/oxabl-schema branch on April 18, 2026 at 02:21