On-device OCR models v1 (PP-OCRv5 mobile)
On-device OCR model bundle for pdf_ocr_ondevice.
PdfOcrModels.ppOcrV5Mobile downloads these files on first use, then runs OCR entirely on device (ONNX Runtime) — no per-page network call.
Assets
| File | Size | SHA-256 |
|---|---|---|
| PP-OCRv5_mobile_det.onnx | 4.82 MB | d5de5df3…4f4f7d |
| PP-OCRv5_mobile_rec.onnx | 16.56 MB | 0030c6b0…d40a8d |
| ppocrv5_dict.txt | 74 KB | d1979e9f…42af1b |
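To check a downloaded asset against the table by hand, something like the following could be used. It is a minimal sketch, not part of pdf_ocr_ondevice: the helper names `sha256_of` and `matches_truncated` are hypothetical, and because the table shows only the first and last characters of each digest, the comparison here covers just those visible parts.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_truncated(full_hex, truncated):
    """Compare a full digest with a 'head…tail' value as listed in the table.

    Only the visible head and tail are checked; the elided middle is unknown.
    """
    head, _, tail = truncated.partition("…")
    full_hex = full_hex.lower()
    return full_hex.startswith(head.lower()) and full_hex.endswith(tail.lower())


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the asset was saved.
    path = "PP-OCRv5_mobile_det.onnx"
    try:
        print(matches_truncated(sha256_of(path), "d5de5df3…4f4f7d"))
    except FileNotFoundError:
        print(f"{path} not found")
```

A match on the truncated value is only a quick sanity check; a full comparison needs the complete digest from the release page.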
Provenance & license
Derived works of PaddleOCR PP-OCRv5 mobile (Copyright PaddlePaddle Authors), redistributed under the Apache License 2.0 — see LICENSE.txt and NOTICE.txt in the assets.
The two .onnx files were converted from the official PaddlePaddle inference models (inference.json + inference.pdiparams) with paddle2onnx (opset 14); no weights were retrained or altered. ppocrv5_dict.txt is the recognizer's character dictionary (18383 entries) extracted verbatim from the official PP-OCRv5_mobile_rec config.
Sources: https://huggingface.co/PaddlePaddle/PP-OCRv5_mobile_det · https://huggingface.co/PaddlePaddle/PP-OCRv5_mobile_rec
Verified
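End-to-end recognition was confirmed against the shipped Dart pipeline: the recognizer's 18385 output classes align with the 18383-entry dictionary plus the CTC blank and one further class (presumably the space character that PaddleOCR appends by convention), and a black-on-white sample decodes exactly.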
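The vocabulary alignment can be illustrated with a small greedy CTC decoder. This is a sketch under stated assumptions, not the shipped Dart code: it assumes PaddleOCR's usual label layout (CTC blank at index 0, then the dictionary entries in file order, then a trailing space class), which the text above does not spell out. The function names `build_labels` and `ctc_greedy_decode` are hypothetical.

```python
def build_labels(dict_entries, add_space=True):
    """Build the class-index -> character table.

    Assumed layout: index 0 is the CTC blank, followed by the dictionary
    entries, optionally followed by a space character.
    """
    labels = [""]  # index 0: CTC blank, emits nothing
    labels.extend(dict_entries)
    if add_space:
        labels.append(" ")
    return labels


def ctc_greedy_decode(scores, labels, blank=0):
    """Greedy CTC decoding: argmax per time step, collapse repeats, drop blanks.

    `scores` is a list of per-time-step score lists, one score per class.
    """
    out = []
    prev = blank
    for step in scores:
        best = max(range(len(step)), key=step.__getitem__)
        if best != blank and best != prev:
            out.append(labels[best])
        prev = best
    return "".join(out)


# With 18383 dictionary entries, this layout yields 18385 classes.
assert len(build_labels(["x"] * 18383)) == 18385

# Toy example with a two-character dictionary.
toy_labels = build_labels(["a", "b"])  # ["", "a", "b", " "]
toy_scores = [
    [0.1, 0.8, 0.1, 0.0],  # a
    [0.1, 0.8, 0.1, 0.0],  # a (repeat, collapsed)
    [0.9, 0.1, 0.0, 0.0],  # blank
    [0.1, 0.8, 0.1, 0.0],  # a (new, after a blank)
    [0.0, 0.1, 0.1, 0.8],  # space
    [0.0, 0.1, 0.8, 0.1],  # b
]
decoded = ctc_greedy_decode(toy_scores, toy_labels)
```

If the label table and the model's output width disagree, every decoded character after the mismatch shifts, which is why the class count is worth checking before trusting any decoded text.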