mbosley/parsekit

📄 parsekit

Document intelligence primitives for extraction, structuring, and retrieval-ready context building.

Why this exists

This repo is the public, reusable core of a larger production workflow. It is designed to be clean, modular, and easy to adopt in other projects.

Current scope

  • Deterministic ingest + normalization
  • Structured extraction interfaces
  • Metadata/tagging pipelines
  • Search/index preparation utilities
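
The four scope items above can be illustrated with a minimal sketch. This is not parsekit's API: the repo has not yet published its parser contracts, so every name here (`normalize_text`, `Document`, `Extractor`, `ingest`, `TitleExtractor`, `to_index_record`) is hypothetical and chosen only to show how the pieces might fit together, using the Python standard library alone.

```python
from __future__ import annotations

import hashlib
import unicodedata
from dataclasses import dataclass
from typing import Protocol


def normalize_text(raw: str) -> str:
    """Hypothetical normalizer: NFC form, unified newlines, no trailing whitespace."""
    text = unicodedata.normalize("NFC", raw)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(lines).strip()


@dataclass(frozen=True)
class Document:
    """Hypothetical normalized document with a content-derived id and sorted tags."""
    doc_id: str
    text: str
    tags: tuple[str, ...] = ()


class Extractor(Protocol):
    """Hypothetical structured-extraction interface."""
    def extract(self, doc: Document) -> dict[str, str]: ...


def ingest(raw: str, tags: tuple[str, ...] | list[str] = ()) -> Document:
    """Deterministic ingest: the same normalized content always yields the same id."""
    text = normalize_text(raw)
    doc_id = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    # Deduplicate and sort tags so tag order in the input never matters.
    return Document(doc_id=doc_id, text=text, tags=tuple(sorted(set(tags))))


class TitleExtractor:
    """Toy extractor (assumption): treats the first line as the title."""
    def extract(self, doc: Document) -> dict[str, str]:
        return {"title": doc.text.split("\n", 1)[0]}


def to_index_record(doc: Document, fields: dict[str, str]) -> dict:
    """Flatten a document and its extracted fields into an index-ready record."""
    return {"id": doc.doc_id, "tags": list(doc.tags), "fields": fields, "text": doc.text}
```

In this sketch, two inputs that differ only in line endings, trailing whitespace, or tag order produce identical `Document` values, which is one reasonable reading of "deterministic ingest". Using `typing.Protocol` for `Extractor` lets any class with a matching `extract` method act as a backend without inheriting from a base class.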

Near-term roadmap

  1. Publish core parser contracts + schema
  2. Add extraction backends (PDF/office/plaintext)
  3. Add benchmark set for extraction fidelity
  4. Release first end-to-end doc intelligence example

Related work

  • Private implementation roots: serviu-rm-procurement-ai and related ops document workflows.

Status

Under active development. The initial public baseline focuses on clean APIs, examples, and thorough documentation.

License

MIT
