feat(oxabl_schema): add .df parser and Schema loader (Phase 2) #48
Merged
Introduces the first new crate of the semantic-layer v1 plan. Parses the
common `.df` subset (`ADD TABLE/FIELD/INDEX` + their attributes)
following Riverside Software's `DumpFileGrammar.g4`. Unknown directives
are accepted and silently skipped, and unknown attributes round-trip as
opaque extras, so format drift never hard-errors the loader. Multi-file
merge is last-write-wins with structured diagnostics, including
`SCHEMA0012` to catch field-type conflicts across files (a
data-integrity gap surfaced during the round-two plan review).
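
The tolerant parsing described above can be sketched roughly as follows. This is a minimal illustration, not the crate's parser: the `Directive` and `Attr` types and the `parse_header`/`parse_attr` helpers are hypothetical names, the tokenizer assumes quoted names contain no spaces, and only two attributes are treated as "known".

```rust
/// Hypothetical, simplified view of one `.df` directive header line.
#[derive(Debug, PartialEq)]
enum Directive {
    AddTable { name: String },
    AddField { name: String, table: String, ty: String },
    AddIndex { name: String, table: String },
    /// Anything else (ADD SEQUENCE, UPDATE, DROP, ...) is accepted and skipped.
    Skipped,
}

/// Hypothetical attribute model: a couple of known attributes plus opaque extras.
#[derive(Debug, PartialEq)]
enum Attr {
    Format(String),
    Label(String),
    /// Unknown attribute kept verbatim as a (name, value) pair.
    Extra(String, String),
}

/// Strip surrounding double quotes from a token, if present.
fn unquote(tok: &str) -> String {
    tok.trim_matches('"').to_string()
}

/// Classify a directive header line. Unrecognised headers never error;
/// they fall through to `Directive::Skipped`.
fn parse_header(line: &str) -> Directive {
    // Assumption: names contain no whitespace, so a plain split is enough here.
    let toks: Vec<&str> = line.split_whitespace().collect();
    let kw = |i: usize, word: &str| toks.get(i).map_or(false, |t| t.eq_ignore_ascii_case(word));

    if !kw(0, "ADD") {
        return Directive::Skipped;
    }
    if kw(1, "TABLE") && toks.len() >= 3 {
        return Directive::AddTable { name: unquote(toks[2]) };
    }
    if kw(1, "FIELD") && toks.len() >= 7 && kw(3, "OF") && kw(5, "AS") {
        return Directive::AddField {
            name: unquote(toks[2]),
            table: unquote(toks[4]),
            ty: toks[6].to_string(),
        };
    }
    if kw(1, "INDEX") && toks.len() >= 5 && kw(3, "ON") {
        return Directive::AddIndex { name: unquote(toks[2]), table: unquote(toks[4]) };
    }
    Directive::Skipped
}

/// Parse one attribute line; unknown attribute names become `Attr::Extra`.
fn parse_attr(line: &str) -> Attr {
    let line = line.trim();
    let (name, value) = match line.split_once(char::is_whitespace) {
        Some((n, v)) => (n, v.trim()),
        None => (line, ""),
    };
    match name.to_ascii_uppercase().as_str() {
        "FORMAT" => Attr::Format(unquote(value)),
        "LABEL" => Attr::Label(unquote(value)),
        _ => Attr::Extra(name.to_string(), value.to_string()),
    }
}
```

In this sketch `parse_header("ADD SEQUENCE \"NextCustNum\"")` yields `Directive::Skipped` rather than an error, and `parse_attr("MAX-WIDTH 4")` yields `Attr::Extra("MAX-WIDTH", "4")`, which is the shape of behaviour the description claims for unknown input.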
- `SchemaType` + `Schema`/`Table`/`Field`/`Index` arena with opaque
  `TableId`/`SchemaRevision`
- `OxablAtom`-keyed case-insensitive lookup (zero-alloc fold via stack
  buffer, matching the lexer's keyword-match path)
- `SchemaLoader::load_files` + `load_files_with_root` with
  workspace-root containment (`SCHEMA0030`) and soft caps (`SCHEMA0031`)
- 45 tests (40 unit + 5 integration against the vendored `sp2k.df` MIT
  golden); bench shows 5 MB merged load in ~7 ms (target was 100 ms)
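
The "zero-alloc fold via stack buffer" idea in the list above might look something like the following sketch. It is an illustration under assumptions, not the crate's code: `String` keys stand in for `OxablAtom`, and `STACK_FOLD_LEN`, `with_folded`, and `TableIndex` are hypothetical names. Names longer than the buffer fall back to a heap allocation.

```rust
use std::collections::HashMap;

/// Hypothetical limit: names up to this many bytes are folded on the stack.
const STACK_FOLD_LEN: usize = 64;

/// ASCII-lowercase `name` and pass the folded `&str` to `f`.
/// Short names are folded in a stack buffer with no heap allocation.
fn with_folded<R>(name: &str, f: impl FnOnce(&str) -> R) -> R {
    let bytes = name.as_bytes();
    if bytes.len() <= STACK_FOLD_LEN {
        let mut buf = [0u8; STACK_FOLD_LEN];
        let dst = &mut buf[..bytes.len()];
        dst.copy_from_slice(bytes);
        dst.make_ascii_lowercase();
        // ASCII lowercasing only rewrites bytes in b'A'..=b'Z', so valid
        // UTF-8 input stays valid UTF-8.
        let folded = std::str::from_utf8(&*dst).expect("ASCII fold preserves UTF-8");
        f(folded)
    } else {
        // Rare long name: fall back to an owned, heap-allocated copy.
        f(&name.to_ascii_lowercase())
    }
}

/// Stand-in for an atom-keyed table index; keys are stored pre-folded.
struct TableIndex {
    by_name: HashMap<String, u32>,
}

impl TableIndex {
    fn new() -> Self {
        TableIndex { by_name: HashMap::new() }
    }

    /// Insertion allocates once for the stored key.
    fn insert(&mut self, name: &str, id: u32) {
        let key = with_folded(name, |k| k.to_owned());
        self.by_name.insert(key, id);
    }

    /// Lookup folds on the stack and borrows the result as the map key.
    fn get(&self, name: &str) -> Option<u32> {
        with_folded(name, |k| self.by_name.get(k).copied())
    }
}
```

With this shape, `idx.get("CUSTOMER")` and `idx.get("customer")` resolve to the same entry, and the common lookup path never touches the allocator.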
Summary
Adds the first new crate of the Semantic Layer v1 plan (Phase 2): `oxabl_schema`, a parser and in-memory model for Progress OpenEdge `.df` dump files.
- Parser following `DumpFileGrammar.g4` (MIT, sonar-openedge). V1 scope: `ADD TABLE`, `ADD FIELD`, `ADD INDEX` with their attributes. `ADD SEQUENCE`/`DATABASE`/`CONSTRAINT`, `UPDATE`, `DROP`, `RENAME`, annotations, and the `PSC` trailer are accepted and silently skipped so format drift never hard-errors the loader. Unknown attributes on recognised directives round-trip as opaque `(name, value)` extras.
- `Schema` with arena-allocated `Table`/`Field`/`Index`, opaque `TableId`, and private-field `SchemaRevision` as the forward contract for future incremental reanalysis. Case-insensitive keying via `OxablAtom` (runtime interning; the `string_cache` spike from the round-two plan review confirmed the existing lexer regime works, so no `lasso` fallback is needed).
- `SchemaLoader::load_files` merges multiple `.df` files with last-write-wins. Structured diagnostics: `SCHEMA0010` (duplicate table, warn), `SCHEMA0011` (duplicate field, warn), `SCHEMA0012` (field-type conflict across files; error, field poisoned to `SchemaType::Error`), `SCHEMA0030` (workspace-root containment), `SCHEMA0031` (soft caps: 100 k tables / 10 k fields per table).
- `sp2k.df` (MIT) vendored under `crates/oxabl_schema/fixtures/` as the integration golden. Covers multi-line quoted strings, multi-index tables, `TABLE-TRIGGER`/`FIELD-TRIGGER` modifiers, `#` comments in hand-edited files, and the `PSC`/`cpstream` trailer.
Testing
- 45 tests: 40 unit + 5 integration against the `sp2k.df` golden
- Workspace test suite (`cargo test --workspace`)
- `cargo clippy --workspace --all-targets -- -D warnings` clean
- Bench (`cargo bench -p oxabl_schema`): 5 MB merged `.df` in 7.2 ms (~694 MB/s). Plan target was 100 ms.
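
For readers unfamiliar with the merge rules in the summary, the last-write-wins policy with `SCHEMA0011`/`SCHEMA0012` can be sketched as below. This is a hedged illustration only: `FieldType` stands in for `SchemaType`, and `FieldDecl`, `Diagnostic`, `decl`, and `merge_fields` are hypothetical names. The sketch does not distinguish same-file from cross-file redefinitions and omits the table-level, containment, and soft-cap diagnostics.

```rust
use std::collections::HashMap;

/// Stand-in for the crate's `SchemaType`; `Error` marks a poisoned field.
#[derive(Debug, Clone, PartialEq)]
enum FieldType {
    Integer,
    Character,
    Error,
}

#[derive(Debug, PartialEq)]
enum Severity {
    Warning,
    Error,
}

#[derive(Debug)]
struct Diagnostic {
    code: &'static str,
    severity: Severity,
    message: String,
}

/// Hypothetical flattened input: one field declaration from one `.df` file.
struct FieldDecl {
    table: String,
    field: String,
    ty: FieldType,
}

fn decl(table: &str, field: &str, ty: FieldType) -> FieldDecl {
    FieldDecl { table: table.to_string(), field: field.to_string(), ty }
}

/// Merge files in order. A later declaration replaces an earlier one
/// (last-write-wins); a type mismatch poisons the field instead.
fn merge_fields(
    files: &[Vec<FieldDecl>],
) -> (HashMap<(String, String), FieldType>, Vec<Diagnostic>) {
    let mut fields: HashMap<(String, String), FieldType> = HashMap::new();
    let mut diags = Vec::new();
    for file in files {
        for d in file {
            // Case-insensitive key, folded here with a plain lowercase copy.
            let key = (d.table.to_ascii_lowercase(), d.field.to_ascii_lowercase());
            let resolved = match fields.get(&key) {
                None => d.ty.clone(),
                Some(prev) if *prev == d.ty => {
                    diags.push(Diagnostic {
                        code: "SCHEMA0011",
                        severity: Severity::Warning,
                        message: format!("duplicate field {}.{}", d.table, d.field),
                    });
                    d.ty.clone()
                }
                Some(_) => {
                    // In this sketch a poisoned field stays poisoned, because
                    // `Error` never equals a declared type.
                    diags.push(Diagnostic {
                        code: "SCHEMA0012",
                        severity: Severity::Error,
                        message: format!("field-type conflict on {}.{}", d.table, d.field),
                    });
                    FieldType::Error
                }
            };
            fields.insert(key, resolved);
        }
    }
    (fields, diags)
}
```

Poisoning the field rather than picking a winner means downstream consumers see an explicit error type instead of a silently wrong one, which appears to be the data-integrity concern the diagnostic was added for.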