A curated collection of Jupyter notebooks for experimenting with state-of-the-art OCR, document parsing, table extraction, and chart understanding techniques. This repository enables easy benchmarking and practical usage of the latest open-source and cloud-based solutions for document image processing.
Notebook | Description |
---|---|
bytedance-dolphin-image-parsing.ipynb | Document page parsing with Dolphin by ByteDance |
docling-documents-parsing-and-tables-extraction.ipynb | Parsing and table extraction with Docling |
florence-2-large-ocr-documents-pages.ipynb | OCR of document pages using Florence 2 Large |
florence-2-large-ocr-images-real-life-scenarios.ipynb | Real-life scenario OCR with Florence 2 Large |
gemini-2-5-pro-on-chart-and-table-extraction.ipynb | Chart/table extraction using Gemini 2.5 Pro |
got-ocr2-0-docs-parsing.ipynb | Document page parsing with GOT-OCR2.0 and Gemini 2.5 Flash |
marker-docs-parsing.ipynb | Marker-based document parsing experiments |
mistralocr-docs-parsing.ipynb | Document parsing using MistralOCR |
monkeyocr-docs-pages-parsing.ipynb | Document parsing with MonkeyOCR |
nanonets-OCR-s_docs_parsing.ipynb | Advanced document parsing using Nanonets-OCR-s |
ollama-llama3-2-vision-usage.ipynb | Document parsing with Llama 3.2 Vision via Ollama |
paddleocr-3-0-docs-parsing.ipynb | Parsing with PaddleOCR 3.0 PP-StructureV3 |
pix2text-docs-pages-parsing.ipynb | Document parsing using Pix2Text |
smoldocling-documents-understanding.ipynb | Document understanding with SmolDocling |
zerox-pdf-parsing.ipynb | PDF parsing experiments with Zerox |
qwen2-vl-2b-docs-parsing.ipynb | Document page parsing with Qwen2-VL-2B |
- Benchmark different OCR/document parsing models on real documents.
- Demonstrate table, chart, and text extraction workflows.
- Compare open-source and commercial solutions.
- Provide ready-to-use code snippets for rapid prototyping.
1. Clone the repository: `git clone https://github.com/AdemBoukhris457/Docs_Parsing_Techniques.git`
2. Install dependencies as needed for each notebook (see the first cells of each `.ipynb` for requirements).
3. Launch Jupyter Notebook or JupyterLab and open any notebook of interest.
4. Run the cells and adapt the code for your documents.
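
Because each notebook declares its requirements in its first cells, it can help to list those install commands before opening anything. The following is a minimal sketch, not part of the repository: the helper names `find_install_lines` and `scan_repo` are hypothetical. It assumes the notebooks sit at the top level of the cloned `Docs_Parsing_Techniques` directory and that installs are written as `pip install` or `conda install` lines, optionally prefixed with `!` or `%`.

```python
import json
from pathlib import Path


def find_install_lines(notebook_path, max_cells=3):
    """Return install commands found in the first few code cells of a notebook.

    Hypothetical helper: it only reads the .ipynb JSON and executes nothing.
    """
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    code_cells = [c for c in nb.get("cells", []) if c.get("cell_type") == "code"]
    found = []
    for cell in code_cells[:max_cells]:
        source = cell.get("source", [])
        # A cell's source is stored either as a list of lines or as one string.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            stripped = line.strip()
            # Assumption: installs look like "!pip install x" or "%pip install x".
            if stripped.lstrip("!%").startswith(("pip install", "conda install")):
                found.append(stripped)
    return found


def scan_repo(repo_dir="Docs_Parsing_Techniques"):
    """Map each top-level notebook file name to its detected install commands."""
    repo = Path(repo_dir)
    return {p.name: find_install_lines(p) for p in sorted(repo.glob("*.ipynb"))}
```

Calling `scan_repo()` from the directory that contains the clone returns a dictionary keyed by notebook file name. An empty list means no install line was detected in the first three code cells, not that the notebook has no dependencies.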
- Some notebooks require model weights or API keys; check the comments in each notebook for details.
- Results, insights, and sample outputs are provided inline.
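
For notebooks that need an API key, one common pattern is to read the key from an environment variable rather than pasting it into a cell. This is only a sketch under assumptions: the helper `get_api_key` and the variable name `GEMINI_API_KEY` are illustrative and not taken from the notebooks, so use whatever name the notebook's comments ask for.

```python
import os


def get_api_key(env_var="GEMINI_API_KEY"):
    """Read an API key from the environment and fail early if it is missing.

    The default variable name is a placeholder, not one defined by the notebooks.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key
```

Keeping the key out of the notebook avoids committing it by accident when you save or share your adapted copy.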
📂 You can find more notebooks, experiments, and datasets related to document parsing and OCR on my Kaggle profile: 👉 https://www.kaggle.com/ademboukhris/code