v1.0.0
Features
Expanded Input Support
Add support for text (.txt), Markdown (.md), URL, and serialized DoclingDocument inputs (023841f)
Introduce Input Normalization stage for automatic type detection, validation, and routing
Pipeline now skips OCR/segmentation for text inputs and reuses pre-processed DoclingDocuments
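As a rough illustration of the type detection and OCR-skipping behavior described above, here is a minimal sketch. Every name in it (InputKind, detect_input_kind, needs_ocr) is hypothetical and not taken from the project's API, and treating a .json file as a serialized DoclingDocument is an assumption made only for this example.

```python
from enum import Enum
from pathlib import Path
from urllib.parse import urlparse


class InputKind(Enum):
    """Hypothetical input categories; not the project's real type names."""
    TEXT = "text"
    MARKDOWN = "markdown"
    URL = "url"
    DOCLING_DOCUMENT = "docling_document"
    DOCUMENT = "document"  # anything else, e.g. a PDF that needs full processing


def detect_input_kind(source: str) -> InputKind:
    """Classify a source string by URL scheme first, then by file extension."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return InputKind.URL
    suffix = Path(source).suffix.lower()
    if suffix == ".txt":
        return InputKind.TEXT
    if suffix == ".md":
        return InputKind.MARKDOWN
    if suffix == ".json":
        # Assumption for this sketch: serialized DoclingDocuments are JSON files.
        return InputKind.DOCLING_DOCUMENT
    return InputKind.DOCUMENT


def needs_ocr(kind: InputKind) -> bool:
    """Plain text and pre-processed documents skip OCR/segmentation.

    URLs are treated conservatively here because the fetched content
    could be of any type.
    """
    return kind not in (
        InputKind.TEXT,
        InputKind.MARKDOWN,
        InputKind.DOCLING_DOCUMENT,
    )
```

A real implementation would likely also inspect file contents rather than trusting the extension alone; the sketch only shows the routing idea.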
CLI Enhancements
convert command now accepts new input formats (023841f)
Improved input validation, URL handling, and clearer error messages
Architecture
Input Normalization Layer
Pipeline expanded from 4 → 5 stages with a new first-stage normalization layer (023841f)
Modular detectors, validators, and handlers for each input type
Extraction stage updated to support pre-normalized and pre-processed inputs
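One way a modular validator/handler layout like the one listed above could be organized is a small registry keyed by input type. This is a sketch under stated assumptions, not the project's implementation: NormalizedInput, register, normalize, and the skip_ocr flag are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class NormalizedInput:
    """Hypothetical result handed from normalization to the next stage."""
    kind: str
    content: str
    skip_ocr: bool  # True when later stages can bypass OCR/segmentation


Validator = Callable[[str], None]  # raises ValueError on bad input
Handler = Callable[[str], NormalizedInput]

# Hypothetical registry: one (validator, handler) pair per input type.
_REGISTRY: Dict[str, Tuple[Validator, Handler]] = {}


def register(kind: str, validator: Validator, handler: Handler) -> None:
    """Attach a validator and handler to an input type."""
    _REGISTRY[kind] = (validator, handler)


def normalize(kind: str, raw: str) -> NormalizedInput:
    """Validate the raw input, then route it to the handler for its type."""
    try:
        validator, handler = _REGISTRY[kind]
    except KeyError:
        raise ValueError(f"unsupported input type: {kind!r}") from None
    validator(raw)
    return handler(raw)


def _require_non_empty(raw: str) -> None:
    if not raw.strip():
        raise ValueError("input is empty")


# Text-like inputs need no OCR, so their handlers set skip_ocr=True.
register("text", _require_non_empty,
         lambda raw: NormalizedInput("text", raw, skip_ocr=True))
register("markdown", _require_non_empty,
         lambda raw: NormalizedInput("markdown", raw, skip_ocr=True))
```

Keeping validation and handling as separate callables means a new input type can be supported by a single register call, without touching the routing logic.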
Security
docling: Bump the docling dependency to version 2.70.0 to address CVEs affecting its nested dependencies (023841f)
Documentation
Input Format Documentation
New Input Formats page with CLI vs API support matrix (023841f)
Added examples for URL, Markdown, and DoclingDocument inputs
Architecture Diagrams
Updated all pipeline and architecture flowcharts (023841f)
Added a new diagram for the Input Normalization process