Document intelligence primitives for extraction, structuring, and retrieval-ready context building.
This repo is the public, reusable core of a larger production workflow. It is designed to be clean, modular, and easy to adopt in other projects.
- Deterministic ingest + normalization
- Structured extraction interfaces
- Metadata/tagging pipelines
- Search/index preparation utilities
- Publish core parser contracts + schema
- Add extraction backends (PDF/office/plaintext)
- Add benchmark set for extraction fidelity
- Release first end-to-end doc intelligence example
- Private implementation roots: serviu-rm-procurement-ai and related ops document workflows.
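
As a rough illustration of the "deterministic ingest + normalization" item listed above, the sketch below shows one way such a step could work: the same logical text always yields the same normalized form and the same content ID, regardless of line endings or Unicode composition. The names `NormalizedDocument` and `normalize` are hypothetical and are not part of this repo's published API; this is a minimal sketch, not the project's implementation.

```python
import hashlib
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedDocument:
    """Hypothetical container for normalized text (an assumption, not repo API)."""
    text: str
    content_id: str  # SHA-256 hex digest of the normalized UTF-8 text


def normalize(raw: str) -> NormalizedDocument:
    """Normalize raw text so equivalent inputs produce identical output."""
    # Unicode NFC: composed and decomposed forms of a character become identical.
    text = unicodedata.normalize("NFC", raw)
    # Unify Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop trailing whitespace on each line, then leading/trailing blank lines.
    text = "\n".join(line.rstrip() for line in text.split("\n")).strip("\n")
    # Derive a stable ID from the normalized content only.
    content_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return NormalizedDocument(text=text, content_id=content_id)


if __name__ == "__main__":
    a = normalize("Invoice\r\nTotal: 10  \r\n")
    b = normalize("Invoice\nTotal: 10\n")
    # Both inputs describe the same document, so they share one content ID.
    print(a.content_id == b.content_id)
```

Hashing the normalized text rather than the raw bytes means downstream steps such as tagging or index preparation can deduplicate and cache on `content_id` without caring how the source file was encoded.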
Status: active build-out. The initial public baseline focuses on clean APIs, examples, and strong docs.
License: MIT