GOT-OCR 2.0 is a unified end-to-end model for recognizing text in images.
The model was introduced in the paper "General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model".
GOT-OCR 2.0 is a state-of-the-art OCR model designed to handle a wide variety of tasks, including:
- Plain Text OCR
- Formatted Text OCR
- Fine-grained OCR
- Multi-crop OCR
- Multi-page OCR
GOT-OCR 2.0 has also been fine-tuned to work with non-textual data, such as:
- Charts and Tables
- Math and Molecular Formulas
- Geometric Shapes
- Sheet Music
In this tutorial, we show how to convert and run the GOT-OCR 2.0 model with OpenVINO using Optimum Intel. We also demonstrate how to apply model optimization techniques such as weight compression using NNCF.
The tutorial consists of the following steps:
- Install requirements
- Convert and Optimize model
- Run OpenVINO model inference
- Launch Interactive demo
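As a rough illustration of the "Convert and Optimize model" step, the sketch below assembles an `optimum-cli export openvino` command with a weight-compression format. It only builds and prints the command and does not run the export. The helper `build_export_command`, the model ID `stepfun-ai/GOT-OCR-2.0-hf`, and the output directory name are assumptions made for illustration; the notebook's own cells may use different values or the Python API instead.

```python
import shlex

# Subset of weight formats accepted by `optimum-cli export openvino`;
# the CLI may support more, this list is kept short for the sketch.
SUPPORTED_WEIGHT_FORMATS = ("fp32", "fp16", "int8", "int4")


def build_export_command(model_id: str, output_dir: str, weight_format: str = "int4") -> list[str]:
    """Assemble (but do not execute) an `optimum-cli export openvino` command.

    `build_export_command` is a hypothetical helper, not part of Optimum Intel.
    """
    if weight_format not in SUPPORTED_WEIGHT_FORMATS:
        raise ValueError(f"Unsupported weight format: {weight_format!r}")
    return [
        "optimum-cli", "export", "openvino",
        "--model", model_id,               # Hugging Face model ID or local path
        "--weight-format", weight_format,  # weight compression is applied via NNCF
        output_dir,                        # directory for the converted model
    ]


# Assumed model ID and output directory, used only as an example.
MODEL_ID = "stepfun-ai/GOT-OCR-2.0-hf"
command = build_export_command(MODEL_ID, "GOT-OCR-2.0-hf-int4")
print(shlex.join(command))
```

To actually perform the conversion, the resulting list could be passed to `subprocess.run(command, check=True)` in an environment where Optimum Intel is installed. Keeping the command as a list avoids shell quoting issues with paths that contain spaces.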
This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the Installation Guide.