[OCR integration] Add OnnxTR as possible OCR engine #1209

@felixT2K

Requested Feature

I noticed that your project already supports multiple OCR integrations (EasyOCR, Tesseract, and RapidOCR/PaddleOCR). I believe OnnxTR would be a valuable addition.

OnnxTR is a core-refactored version of docTR that runs on ONNX Runtime, similar to how RapidOCR is built on PaddleOCR.

What makes OnnxTR/docTR special?

  • Modular design: Detection, recognition, and orientation predictors can be used independently.
  • Robust to rotation: Handles rotated documents effectively and includes auto-correction.
  • Highly customizable: Each predictor has its own configurable EngineConfig.
  • Flexible installation options: Supports CPU, GPU, and OpenVINO, with CI testing on Windows, Linux, and macOS (Python 3.10–3.12).
  • Model hub integration: Models can be loaded directly from the Hugging Face Hub, e.g. OnnxTR Models.
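
To give a rough idea of what an integration could look like, here is a minimal sketch that flattens OnnxTR output into plain word cells with pixel bounding boxes. It assumes the exported result mirrors docTR's nested layout (pages → blocks → lines → words, with `value`, `confidence`, and a relative `geometry` per word). `TextCell`, `flatten_export`, and `run_onnxtr` are hypothetical names used only for illustration and are not part of this project or of OnnxTR. The calls inside `run_onnxtr` follow the OnnxTR README as I understand it and should be checked against the installed version.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence, Tuple


@dataclass
class TextCell:
    """Hypothetical container for one recognized word."""
    text: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels


def _to_pixel_bbox(
    geometry: Sequence[Sequence[float]], height: int, width: int
) -> Tuple[float, float, float, float]:
    # Geometry is assumed to be relative (0..1) points: two corners for
    # straight pages, or a four-point polygon for rotated text. Taking
    # min/max over all points yields an axis-aligned box in both cases.
    xs = [float(p[0]) for p in geometry]
    ys = [float(p[1]) for p in geometry]
    return (min(xs) * width, min(ys) * height, max(xs) * width, max(ys) * height)


def flatten_export(export: Dict[str, Any]) -> List[TextCell]:
    """Walk the assumed pages -> blocks -> lines -> words structure."""
    cells: List[TextCell] = []
    for page in export.get("pages", []):
        height, width = page["dimensions"]  # assumed order: (height, width)
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    cells.append(
                        TextCell(
                            text=word["value"],
                            confidence=float(word["confidence"]),
                            bbox=_to_pixel_bbox(word["geometry"], height, width),
                        )
                    )
    return cells


def run_onnxtr(image_path: str) -> List[TextCell]:
    """Run OnnxTR on one image; requires the optional `onnxtr` package."""
    # Imported lazily so the rest of the module works without OnnxTR installed.
    from onnxtr.io import DocumentFile
    from onnxtr.models import ocr_predictor

    predictor = ocr_predictor()  # default pre-trained detection + recognition
    pages = DocumentFile.from_images(image_path)
    return flatten_export(predictor(pages).export())
```

Keeping the conversion separate from the model call means the mapping can be unit-tested on a hand-written dictionary, and the OnnxTR dependency stays optional in the same way as the other OCR backends.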

Known Limitations

  • The default pre-trained models are trained on French vocabulary. However, one model on the Hugging Face Hub supports Latin Extended characters (covering English, French, German, Spanish, Portuguese, Italian, etc.).

  • Ongoing language expansion: Work is in progress to support additional languages such as Russian, Hebrew, Hindi, and more.
