Local PDF-to-Markdown tooling for Arabic and bilingual texts.
It repairs broken extraction, decodes QCF Quran fonts, classifies pages, and renders semantic Markdown from local PDFs.
pip install versed-pdf
pip install versed-pdf[pdf]
pip install versed-pdf[pdf,ocr]

from versed import extract_document
result = extract_document("book.pdf", title="Book")
print(result.markdown)

versed repair-text "taf߬l"
versed detect book.pdf
versed classify book.pdf
versed extract book.pdf -o book.md

versed.repair: Sabon mojibake repair helpers
versed.qcf: QCF Quran font decoding
versed.classify: local page classification and backend selection
versed.routing: cost-aware routing heuristics
versed.layout: aligned words to semantic blocks
versed.markdown: semantic blocks to Markdown/plain text
versed.extract: end-to-end local extraction
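
The Python usage above prints the rendered Markdown, while the `versed extract ... -o book.md` command writes it to a file. A minimal sketch of doing the same from Python might look like the following. The helper name `save_markdown` and its `extract` parameter are hypothetical and not part of versed. The only things assumed about the library are what the usage example shows: `extract_document` takes a PDF path and a `title` keyword, and the result has a `markdown` attribute holding a string.

```python
from pathlib import Path
from typing import Any, Callable


def save_markdown(
    pdf_path: str,
    out_path: str,
    title: str,
    extract: Callable[..., Any],
) -> Path:
    """Run an extractor on a PDF and write its Markdown to out_path.

    `extract` is expected to behave like `versed.extract_document`: it takes
    the PDF path plus a `title` keyword and returns an object with a
    `markdown` string attribute. This helper is hypothetical, not part of versed.
    """
    result = extract(pdf_path, title=title)
    target = Path(out_path)
    # Create parent directories so nested output paths work.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write UTF-8 explicitly so Arabic text survives on platforms
    # whose default encoding is something else.
    target.write_text(result.markdown, encoding="utf-8")
    return target
```

With versed installed, a call could look like `save_markdown("book.pdf", "book.md", "Book", extract_document)` after `from versed import extract_document`. Passing the extractor in as an argument keeps the helper easy to exercise with a stub when no PDF backend is available.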
License: MIT