feat(python): add PP-OCRv6 paddle backend support #696
- init_predictor + device_config: add PPOCRV6 to the PPOCRV5 branches (PIR format, enable_new_ir/new_executor)
- have_key + get_character_list: read character_dict from inference.yml instead of always returning False
- default_models.yaml: add paddle.PP-OCRv6 det+rec (tiny/small/medium); rec has no dict_url (the dict comes from the yml)
- test_engine.py: add EngineType.PADDLE to the v6 det/rec parametrize
- Verified: v6 paddle tests pass; v4/v5 paddle regression tests pass
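As a rough illustration of the have_key/get_character_list change described in this commit, the sketch below pulls a character list out of an already-parsed inference.yml. The `PostProcess.character_dict` key path, the function signatures, and the sample config are assumptions made for illustration, not the actual rapidocr code. Loading the YAML file itself is left out so the snippet stays dependency-free.

```python
from typing import Any, List, Mapping


def have_key(cfg: Mapping[str, Any], key: str = "character_dict") -> bool:
    """Return True if the parsed inference.yml carries a non-empty character list.

    Assumption: the list lives under PostProcess.<key>.
    """
    post = cfg.get("PostProcess") or {}
    return bool(post.get(key))


def get_character_list(cfg: Mapping[str, Any], key: str = "character_dict") -> List[str]:
    """Return the character list as strings, or [] when the yml has none."""
    post = cfg.get("PostProcess") or {}
    return [str(ch) for ch in (post.get(key) or [])]


# Hypothetical parsed config, standing in for the result of loading inference.yml
example_cfg = {
    "PostProcess": {"name": "CTCLabelDecode", "character_dict": ["a", "b", "c"]}
}
```

Returning an empty list when the key is absent keeps the fallback behaviour the same as a backend that never embeds a dictionary.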
Hi @SWHL, while working on the MNN backend for v6, I noticed that the v6 dict files haven't been uploaded to the For v4 and v5, these dict files are hosted under For v6, the dict files currently only exist in the PaddleOCR GitHub source repo: https://github.com/PaddlePaddle/PaddleOCR/tree/main/ppocr/utils/dict
I think the missing v6 dict files have caused some complications. On one hand, PR #696 required more adjustments to the engine code ( If the dict files are uploaded to |
… v6 rec

Now that dict files are available on RapidAI/RapidOCR, revert the have_key/get_character_list workaround that read character_dict from inference.yml. Add dict_url to paddle v6 rec entries pointing to the official dict files.

- paddle/main.py: have_key returns False, get_character_list returns [], remove yml_path logic
- default_models.yaml: add dict_url to paddle v6 rec (tiny/small/medium)
- init_predictor and device_config v6 branches remain (required for PIR format)
@SWHL Since the dict files are now available on the repo, I've submitted a new commit that reverts the |
Pull request overview
This PR extends the Python Paddle inference backend to support PP-OCRv6 models (in addition to existing onnxruntime/openvino support) and wires that support into the default model registry and tests.
Changes:
- Treat OCRVersion.PPOCRV6 the same as PPOCRV5 for Paddle predictor initialization (enable_memory_optim() path).
- Apply the same Paddle CPU “new IR / new executor / optimization level” configuration for PPOCRV6 as for PPOCRV5.
- Register Paddle PP-OCRv6 det/rec default model bundles and run PP-OCRv6 det/rec tests with the Paddle engine.
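The branching described in the first two bullets can be sketched as follows. This is a minimal illustration, not the code in paddle/device_config.py: the `OCRVersion` enum here is a local stand-in, `RecordingConfig` is a stub that records calls instead of using a real Paddle config object, and the default optimization level of 3 is an assumed value.

```python
from enum import Enum


class OCRVersion(Enum):
    # Local stand-in for the project's OCRVersion enum (hypothetical values)
    PPOCRV4 = "PP-OCRv4"
    PPOCRV5 = "PP-OCRv5"
    PPOCRV6 = "PP-OCRv6"


# Versions whose Paddle models ship in PIR format (inference.json)
PIR_VERSIONS = {OCRVersion.PPOCRV5, OCRVersion.PPOCRV6}


class RecordingConfig:
    """Stub that records calls; it only mimics the method names from the PR."""

    def __init__(self):
        self.calls = []

    def enable_new_ir(self, flag=True):
        self.calls.append(("enable_new_ir", flag))

    def enable_new_executor(self, flag=True):
        self.calls.append(("enable_new_executor", flag))

    def set_optimization_level(self, level):
        self.calls.append(("set_optimization_level", level))


def configure_cpu(config, version, opt_level=3):
    """Apply the v5-style CPU settings to every PIR-format version."""
    if version in PIR_VERSIONS:
        config.enable_new_ir(True)
        config.enable_new_executor(True)
        config.set_optimization_level(opt_level)  # assumed level
    return config
```

Keeping the PIR versions in one set means a later version would need a single membership change rather than edits to each branch.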
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/tests/test_engine.py | Adds Paddle to the PP-OCRv6 det/rec parametrized engine coverage. |
| python/rapidocr/inference_engine/paddle/main.py | Extends Paddle predictor init logic to include PP-OCRv6. |
| python/rapidocr/inference_engine/paddle/device_config.py | Applies PPOCRv5-style CPU IR/executor settings to PPOCRv6. |
| python/rapidocr/default_models.yaml | Adds Paddle PP-OCRv6 det/rec model registry entries. |
ch_PP-OCRv6_rec_tiny:
  model_dir: https://www.modelscope.cn/models/RapidAI/RapidOCR/resolve/master/paddle/PP-OCRv6/rec/PP-OCRv6_rec_tiny
  inference.pdiparams: bb2f8f54d1e25f28c71b6fa4fe23f5940e159cae27fbee96155c99f822156e57
  inference.json: b5b14770c7dcf092781e92f4278a2ae5f95048f08b4b8a04140e88cb2745f147
  inference.yml: 66170210bad538e83fff3c4a3867e547d6bf20b50d64b20347c4b913f3034ea1
  dict_url: https://www.modelscope.cn/models/RapidAI/RapidOCR/resolve/master/paddle/PP-OCRv6/rec/PP-OCRv6_rec_tiny/ppocrv6_tiny_dict.txt
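The per-file hex strings in the registry entry above are 64 characters long, which is consistent with SHA-256 digests. Assuming that is what they are, a downloaded model bundle could be checked roughly like this. `sha256_of` and `verify_bundle` are hypothetical helper names, not functions from rapidocr.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(model_dir: Path, expected: dict) -> list:
    """Return the names of files that are missing or whose digest differs."""
    bad = []
    for name, hexdigest in expected.items():
        path = model_dir / name
        if not path.is_file() or sha256_of(path) != hexdigest.lower():
            bad.append(name)
    return bad
```

An empty return value means every listed file is present and matches; any other result names the files worth re-downloading.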
Thanks for catching this. The initial commit did implement the |
Adds PP-OCRv6 paddle backend support alongside the existing onnxruntime/openvino backends. Two changes were needed beyond just registering model URLs:
- init_predictor + device_config: v6 paddle models use the PIR format (inference.json), same as v5. Added PPOCRV6 to the existing PPOCRV5 branches in both paddle/main.py and paddle/device_config.py so v6 gets the same predictor configuration (enable_new_ir, enable_new_executor, set_optimization_level).
- Model registry: Added paddle.PP-OCRv6 section with det + rec (tiny/small/medium), pointing to the official files on RapidAI/RapidOCR master. rec entries include dict_url pointing to the dict files uploaded by the maintainer.

Verified: v6 paddle det/rec tests pass; v4/v5 paddle regression tests all pass (no behavior change).
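For readers unfamiliar with the distinction, a PIR-format model directory carries its program in inference.json, while the legacy format uses inference.pdmodel. The PR selects the code path by OCR version; the sketch below shows a different approach, detecting the format from the files on disk, purely to illustrate the difference. `find_model_file` and `is_pir_format` are hypothetical helpers.

```python
from pathlib import Path
from typing import Optional


def find_model_file(model_dir: Path) -> Optional[Path]:
    """Prefer the PIR program (inference.json), else the legacy inference.pdmodel."""
    for name in ("inference.json", "inference.pdmodel"):
        candidate = model_dir / name
        if candidate.is_file():
            return candidate
    return None


def is_pir_format(model_dir: Path) -> bool:
    """True when the directory holds a PIR-format program file."""
    found = find_model_file(model_dir)
    return found is not None and found.suffix == ".json"
```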