# PaddleOCR Text Extraction Demo 🧠

This notebook demonstrates how to extract text and dimensions from scanned images using PaddleOCR.

## Contents:
- Overview of the Project
- Image Preprocessing
- OCR Extraction
- Displaying Results with Bounding Boxes

In [None]:
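# Install necessary packages (uncomment and run if not already installed)
# The cells below use the PaddleOCR 2.x API (draw_ocr, ocr(..., cls=True));
# the 3.x releases changed this interface, so the install is pinned below 3.
# !pip install "paddleocr<3"
# !pip install paddlepaddle
# !pip install opencv-python
# !pip install matplotlib
# !pip install numpy
# !pip install pillow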

In [None]:
import os
import re
import cv2
import numpy as np
from PIL import Image
from paddleocr import PaddleOCR, draw_ocr
import matplotlib.pyplot as plt

In [None]:
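# Initialize PaddleOCR (the detection, classification, and recognition models are downloaded on first run)
ocr_model = PaddleOCR(use_angle_cls=True, lang='en')  # use_angle_cls=True loads the classifier that corrects text lines rotated by 180 degrees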

In [None]:
# Load and display an example image
img_path = 'path_to_your_image.png'  # Replace with a demo image if possible
image = Image.open(img_path)
plt.imshow(image)
plt.axis('off')
plt.title("Input Image")
plt.show()

In [None]:
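# Perform OCR (cls=True runs the angle classifier on each detected text line)
result = ocr_model.ocr(img_path, cls=True)

# result holds one entry per input image; the entry can be None when no text is detected
lines = result[0] or []

# Extract and display text
for box, (text, confidence) in lines:
    print(f"{text} (Confidence: {confidence:.2f})")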
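
For downstream use it is often easier to work with flat records than with the nested result list. The sketch below assumes the PaddleOCR 2.x result shape used in the loop above (one list per image, each item a `[box, (text, confidence)]` pair). The helper name `to_records` and the 0.5 cutoff are illustrative assumptions, and the sample data is synthetic.

```python
from typing import Any, Dict, List, Optional


def to_records(result: Optional[list], min_confidence: float = 0.5) -> List[Dict[str, Any]]:
    """Hypothetical helper: flatten an OCR result and drop low-confidence lines."""
    records = []
    # One entry per image; an entry may be None when no text was found
    for page in result or []:
        for box, (text, confidence) in page or []:
            if confidence >= min_confidence:
                records.append({"text": text, "confidence": float(confidence), "box": box})
    return records


# Synthetic sample in the same shape as the OCR result (not real output)
sample = [[
    [[[10, 10], [110, 10], [110, 40], [10, 40]], ("Invoice 001", 0.98)],
    [[[10, 60], [90, 60], [90, 85], [10, 85]], ("t0tal", 0.31)],
]]

for record in to_records(sample):
    print(record["text"], record["confidence"])
```

With the real output, `to_records(result)` would return only the lines at or above the chosen confidence.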

In [None]:
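# Draw results on image
# OpenCV loads images as BGR, so convert to RGB before drawing and plotting
image = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)
detections = result[0] or []  # the entry can be None when no text is detected
boxes = [line[0] for line in detections]
txts = [line[1][0] for line in detections]
scores = [line[1][1] for line in detections]

# Draw and show
# draw_ocr renders the recognized text with a TrueType font; its default path points into the PaddleOCR repo
font_path = 'path_to_your_font.ttf'  # Replace with a .ttf file available on your system
image_with_boxes = draw_ocr(image, boxes, txts, scores, font_path=font_path)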
plt.figure(figsize=(10, 10))
plt.imshow(image_with_boxes)
plt.axis('off')
plt.title("OCR Output")
plt.show()
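
The boxes drawn above are four-point polygons, so the pixel dimensions of each text region can be derived from them. The sketch below assumes the points are ordered top-left, top-right, bottom-right, bottom-left. The helper name `box_size` is hypothetical, and averaging opposite sides is just one reasonable choice for slightly skewed boxes.

```python
import math
from typing import Sequence, Tuple


def box_size(box: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Hypothetical helper: (width, height) in pixels of a 4-point text box."""
    top_left, top_right, bottom_right, bottom_left = box
    # Average opposite sides so skewed quadrilaterals still get a sensible size
    width = (math.dist(top_left, top_right) + math.dist(bottom_left, bottom_right)) / 2
    height = (math.dist(top_left, bottom_left) + math.dist(top_right, bottom_right)) / 2
    return width, height


# Synthetic axis-aligned box: 100 px wide, 30 px tall
print(box_size([[10, 10], [110, 10], [110, 40], [10, 40]]))  # (100.0, 30.0)
```

Applied to the notebook's output, this would be called once per entry in `boxes`.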

## Summary

This demo shows how to use PaddleOCR to extract text from images, a building block for real-world tasks such as invoice scanning and document parsing.
For data-privacy reasons, this demo uses synthetic or generic images.

For the complete implementation, refer to the full script in the GitHub repo.