
🎥 Recommended Video: [Optical Character Recognition (OCR)](https://www.youtube.com/watch?v=ZNrteLp_SvY)



## **6. Optical Character Recognition (OCR)**

### **6.1 What is OCR?**
Optical Character Recognition (OCR) is a technology that converts images of text (typed, handwritten, or printed) into machine-readable text. It is widely used in document digitization, license plate recognition, and extracting text from images.

#### **Key Applications of OCR**:
- **Document Digitization**: Converting scanned documents into editable text.
- **License Plate Recognition**: Automating toll collection and traffic monitoring.
- **Receipt Processing**: Extracting information from receipts for expense tracking.
- **Handwriting Recognition**: Converting handwritten notes into digital text.

---

### **6.2 How OCR Works**
OCR typically involves the following steps:
1. **Preprocessing**: Enhance the image quality (e.g., binarization, noise removal).
2. **Text Detection**: Locate regions of text in the image.
3. **Text Recognition**: Convert the detected text regions into machine-readable text.
4. **Post-processing**: Correct errors and format the output.
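
As a rough illustration of the post-processing step, the sketch below normalizes whitespace in raw OCR output. The helper name `clean_ocr_text` is hypothetical and the rules are deliberately minimal; real pipelines often add spell checking or domain-specific corrections on top.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Normalize whitespace in raw OCR output (hypothetical helper)."""
    cleaned_lines = []
    for line in raw.splitlines():
        # Collapse runs of spaces and tabs into a single space
        line = re.sub(r"[ \t]+", " ", line).strip()
        # Drop lines that are empty after stripping
        if line:
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


if __name__ == "__main__":
    # Hand-written sample that mimics the stray blank lines OCR engines emit
    sample = "  Nasdaq  &   AMEX \n\n \n Stocks in bold\trose or fell 5% or more\n"
    print(clean_ocr_text(sample))
```

This only tidies layout; it does not correct misrecognized characters.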

---

### **6.3 Code Example: OCR with Tesseract**
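Tesseract is one of the most widely used open-source OCR engines. The `pytesseract` library is a Python wrapper around it, so the Tesseract engine itself must be installed separately (for example, with `apt install tesseract-ocr` on Ubuntu). Let’s use `pytesseract` to perform OCR on an image.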

```python
import pytesseract
from PIL import Image

# Load an image
image = Image.open('text_image.png')

# Perform OCR
text = pytesseract.image_to_string(image)

# Print the extracted text
print("Extracted Text:\n", text)
```

#### **Explanation**:
1. The `pytesseract.image_to_string()` function extracts text from the image.
2. The extracted text is printed to the console.
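
When plain text is not enough, `pytesseract.image_to_data()` can return per-word confidence scores. The sketch below filters such a result; the helper name `confident_words` and the cutoff of 60 are assumptions chosen for illustration, and the sample dictionary is hand-written rather than real OCR output.

```python
def confident_words(data, min_conf=60.0):
    """Keep (word, confidence) pairs scoring at least min_conf (hypothetical helper)."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # Tesseract reports -1 for layout entries that carry no word;
        # float() also handles versions that return confidences as strings
        if text.strip() and float(conf) >= min_conf:
            words.append((text, float(conf)))
    return words


# With Tesseract installed, the dictionary would come from:
#   data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)

if __name__ == "__main__":
    # Hand-written stand-in with the same "text" and "conf" keys
    sample = {"text": ["", "Nasdaq", "&", "AMEX", "bat."],
              "conf": [-1, 95, 91, 88, 23]}
    for word, conf in confident_words(sample):
        print(f"{word}\t{conf:.0f}")
```

Dropping low-confidence words trades recall for precision, so the cutoff is worth tuning per document type.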

---

### **6.4 Code Example: OCR with EasyOCR**
EasyOCR is a user-friendly OCR library that supports multiple languages. Let’s use it to extract text from an image.

```python
import easyocr

# Initialize the EasyOCR reader
reader = easyocr.Reader(['en'])  # Specify the language(s)

# Perform OCR on an image
results = reader.readtext('text_image.png')

# Print the extracted text
for (bbox, text, confidence) in results:
    print(f"Text: {text}, Confidence: {confidence}")
```

#### **Explanation**:
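1. `easyocr.Reader()` initializes the OCR reader for the specified language(s). If the models are not already cached locally, they are downloaded at this point, which requires network access.
2. `reader.readtext()` returns a list of `(bbox, text, confidence)` tuples, one per detected text region. `bbox` holds the four corner points of the region, and `confidence` is a score between 0 and 1.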
3. The extracted text and confidence scores are printed.
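
Because each result carries a confidence score, low-quality detections can be filtered out before further use. The sketch below assumes the `(bbox, text, confidence)` tuple layout described above; the helper name `filter_by_confidence`, the 0.5 cutoff, and the placeholder bounding boxes are assumptions for illustration.

```python
def filter_by_confidence(results, min_confidence=0.5):
    """Return the text of detections scoring at least min_confidence."""
    return [text for (_bbox, text, confidence) in results
            if confidence >= min_confidence]


if __name__ == "__main__":
    # Placeholder box: four corner points, as EasyOCR returns them
    box = [[0, 0], [10, 0], [10, 5], [0, 5]]
    # Hand-written sample in the same shape as reader.readtext() output
    results = [
        (box, "Nasdaq & AMEX", 0.75),
        (box, "S1irt", 0.006),
        (box, "Stock", 0.69),
    ]
    print(filter_by_confidence(results))
```

A single global cutoff is crude; small or stylized text may need a lower value.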

---

### **6.5 Code Example: OCR with OpenCV and Tesseract**
Let’s combine OpenCV for image preprocessing and Tesseract for OCR to improve accuracy.

```python
import cv2
import pytesseract

# Load an image
image = cv2.imread('text_image.png')

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

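# Apply thresholding to binarize the image.
# THRESH_BINARY keeps dark text on a light background, which Tesseract 4+ expects;
# use THRESH_BINARY_INV only if the source image has light text on a dark background.
_, binary = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)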

# Perform OCR on the preprocessed image
text = pytesseract.image_to_string(binary)

# Print the extracted text
print("Extracted Text:\n", text)

# Display the preprocessed image (requires a desktop GUI;
# in Colab, use google.colab.patches.cv2_imshow instead)
cv2.imshow('Binary Image', binary)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

#### **Explanation**:
1. The image is preprocessed by converting it to grayscale and applying thresholding.
2. The preprocessed image is passed to Tesseract for OCR.
3. The extracted text is printed, and the preprocessed image is displayed.
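
A threshold can be chosen by hand or derived from the image histogram. One common automatic choice is Otsu's method, which OpenCV exposes through the `cv2.THRESH_OTSU` flag. The pure-Python sketch below shows the idea on a flat list of 8-bit pixel values; the function names are hypothetical and the code favors readability over speed.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so there is nothing to separate
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels, t):
    """Map values above t to 255 (white) and the rest to 0 (black)."""
    return [255 if p > t else 0 for p in pixels]


# The OpenCV equivalent on a grayscale image would be:
#   _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

if __name__ == "__main__":
    # Synthetic "image": dark ink (10) on a light page (200)
    pixels = [10] * 50 + [200] * 50
    t = otsu_threshold(pixels)
    print("threshold:", t)
    print(binarize([10, 200], t))
```

Otsu's method assumes a roughly bimodal histogram, so unevenly lit scans may still need adaptive thresholding.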

---

### **6.6 Code Example: OCR on Handwritten Text**
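Handwritten text recognition is more challenging due to variability in handwriting. Tesseract is trained primarily on printed text, so expect lower accuracy on handwriting. Let’s run Tesseract with an explicit engine mode and page segmentation mode, which is a reasonable starting point for a clean, single block of handwritten text.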

```python
import pytesseract
from PIL import Image

# Load an image of handwritten text
image = Image.open('handwritten_text.png')

# Perform OCR with custom configuration for handwritten text
custom_config = r'--oem 3 --psm 6'
text = pytesseract.image_to_string(image, config=custom_config)

# Print the extracted text
print("Extracted Text:\n", text)
```

#### **Explanation**:
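1. The `--oem 3` flag selects the default OCR engine mode, which uses whatever engine is available in the installed version (the LSTM engine in Tesseract 4 and later). Use `--oem 1` to request the LSTM engine explicitly. Neither mode is tuned specifically for handwriting.
2. The `--psm 6` flag tells Tesseract to assume a single uniform block of text.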
3. The extracted text is printed to the console.
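
Since the best page segmentation mode depends on the layout, it can help to try several values. The sketch below builds the config string with basic range checks; the helper name `build_tesseract_config` is hypothetical, and the ranges (`--oem` 0 to 3, `--psm` 0 to 13) are those listed in Tesseract's command-line help.

```python
def build_tesseract_config(oem=3, psm=6):
    """Build a Tesseract config string after validating both modes."""
    if oem not in range(4):
        raise ValueError(f"--oem must be between 0 and 3, got {oem}")
    if psm not in range(14):
        raise ValueError(f"--psm must be between 0 and 13, got {psm}")
    return f"--oem {oem} --psm {psm}"


# With Tesseract installed, each config could be passed as:
#   text = pytesseract.image_to_string(image, config=config)

if __name__ == "__main__":
    # 6 = uniform block of text, 7 = single text line, 11 = sparse text
    for psm in (6, 7, 11):
        print(build_tesseract_config(oem=1, psm=psm))
```

Comparing the output of a few modes on a sample image is usually quicker than reasoning about which one fits.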

---

### **6.7 Code Example: OCR on a PDF Document**
OCR can also be applied to scanned PDF documents. Let’s use the `pdf2image` library, which relies on the Poppler utilities being installed, to convert a PDF into images and then perform OCR.

```python
from pdf2image import convert_from_path
import pytesseract

# Convert PDF pages to images
pages = convert_from_path('document.pdf', 500)  # 500 DPI

# Perform OCR on each page
for i, page in enumerate(pages):
    text = pytesseract.image_to_string(page)
    print(f"Page {i + 1} Text:\n", text)
```

#### **Explanation**:
1. The `pdf2image.convert_from_path()` function converts each page of the PDF into an image.
2. OCR is performed on each page using Tesseract.
3. The extracted text for each page is printed.
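
Rendering every page of a long PDF at high DPI can use a lot of memory. One option is to convert a few pages at a time using the `first_page` and `last_page` parameters of `convert_from_path()`. The sketch below only computes the page ranges; the helper name `page_batches` and the batch size are assumptions for illustration.

```python
def page_batches(total_pages, batch_size):
    """Yield 1-indexed, inclusive (first_page, last_page) pairs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


# With pdf2image and Poppler installed, each batch could be rendered with:
#   pages = convert_from_path('document.pdf', dpi=300,
#                             first_page=first, last_page=last)
# The page count itself can be read with pdf2image.pdfinfo_from_path().

if __name__ == "__main__":
    for first, last in page_batches(total_pages=5, batch_size=2):
        print(f"pages {first}-{last}")
```

Smaller batches lower peak memory at the cost of more calls to Poppler.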

---

### **6.8 Code Example: OCR with Language Translation**
Let’s combine OCR with language translation to extract text from an image and translate it into another language.

```python
from googletrans import Translator
import pytesseract
from PIL import Image

# Load an image
image = Image.open('foreign_text.png')

# Perform OCR to extract text
text = pytesseract.image_to_string(image, lang='fra')  # French text

# Translate the extracted text
translator = Translator()
translated = translator.translate(text, src='fr', dest='en')

# Print the original and translated text
print("Original Text:\n", text)
print("Translated Text:\n", translated.text)
```

#### **Explanation**:
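1. `pytesseract.image_to_string()` extracts text in French (`lang='fra'`). This requires the French language data for Tesseract (for example, the `tesseract-ocr-fra` package on Ubuntu).
2. `googletrans.Translator()` translates the extracted text into English. `googletrans` is an unofficial client for Google Translate, so it needs network access and its behavior can change between versions.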
3. The original and translated text are printed.

---

### **6.9 In Conclusion**
- OCR is a powerful technology for extracting text from images.
- Tools such as Tesseract (through `pytesseract`) and EasyOCR make it easy to implement OCR in Python.
- Preprocessing techniques (e.g., binarization) can improve OCR accuracy.
- OCR can be combined with other technologies (e.g., translation) for advanced applications.

---

In [None]:
!pip install pytesseract # Install the pytesseract package



In [None]:
!sudo apt install tesseract-ocr # install Tesseract OCR engine
!sudo apt install libtesseract-dev # install Tesseract development libraries

import pytesseract
from PIL import Image
from google.colab import files

# Set the path to the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'

# Upload the image from your local system. This will prompt you to select the file.
uploaded = files.upload()

# Get the filename from the uploaded dictionary
# Assuming you've uploaded only one file, it'll be the first key in the dictionary
filename = list(uploaded.keys())[0]

# Open the image using the uploaded filename
image = Image.open(filename)

# Perform OCR
text = pytesseract.image_to_string(image)

# Print the extracted text
print("Extracted Text:\n", text)

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  tesseract-ocr-eng tesseract-ocr-osd
The following NEW packages will be installed:
  tesseract-ocr tesseract-ocr-eng tesseract-ocr-osd
0 upgraded, 3 newly installed, 0 to remove and 18 not upgraded.
Need to get 4,816 kB of archives.
After this operation, 15.6 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tesseract-ocr-eng all 1:4.00~git30-7274cfa-1.1 [1,591 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tesseract-ocr-osd all 1:4.00~git30-7274cfa-1.1 [2,990 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tesseract-ocr amd64 4.1.1-2.1build1 [236 kB]
Fetched 4,816 kB in 1s (5,371 kB/s)
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debc

Saving test_image.jpg to test_image (1).jpg
Extracted Text:
  

Nasdaq & AMEX

Stocks in bold rose or fell 5% or more

 

bat.

Lie 8".\ @ updated stocks. Visit us on the web at
Suu) money.usatoday.com

Track your investments with our continuously

 

 

Stweek St-week
High Low Stoeie Last Chonge | High Low stock Last Change
45.7) 32.50 Biomet 17 0.0
A 278 1.20 Biomira 1s $0.03
—— 9.07 5.13 BloScrip 8.05 +0.24
5 $8.88 50,45 Biosite 50,05 —457
Be eo Ania ea We | seine Senet. oe ee
3.38 1351 ADA-ES 79.96 +216 | 850 1.40 BirchMten 6.52 —0.45
214 1288 ADC Telrs 721 40,13 | 182! 1073 Bickboud 1790 +0.
3.40 1670 ADECH 27) S073 | S273 1286 Blucont 41.9
W645 1047 AFC Ents 15.40 —O14 | 4435 24.15 BlueNiie 40.30
257 450 ASE Tst 776 +0. | 2645 199) BobEvn 22.99
19.25 1275 ASM Intl 17.65 —0.03 | 18:94 612 Bodisenny 15.45
20.92 13.94 ASML Hid 21.24 +0.46 Tet Bookham’ 5.94
27% VOI ASV ines 26.76 $0.14 a poriond sort
W982 1047 ATI Tech 17.87 +0.68 #0 Bestery S18
33.62 959 ATMI inc 29.95 +1.29 OO Bimi

In [None]:
# !pip install easyocr  # Installs the easyocr package
import easyocr
from google.colab import files

# Initialize the EasyOCR reader
reader = easyocr.Reader(['en'])  # Specify the language(s)

# Upload the image from your local system. This will prompt you to select the file.
uploaded = files.upload()

# Get the filename from the uploaded dictionary
# Assuming you've uploaded only one file, it'll be the first key in the dictionary
filename = list(uploaded.keys())[0]

# Perform OCR on the uploaded image using the filename
results = reader.readtext(filename)

# Print the extracted text
for (bbox, text, confidence) in results:
    print(f"Text: {text}, Confidence: {confidence}")



Saving text_image.jpg to text_image.jpg
Text: Nasdaq & AMEX, Confidence: 0.7483122106729966
Text: Stocks in bold rose or fell 5% Or more, Confidence: 0.7638438175891207
Text: USA  Track your investments with Our continuously, Confidence: 0.837896345000361
Text: ITODAY updated stocks Visit uS on the web at, Confidence: 0.3654205844376524
Text: comn, Confidence: 0.5729886889457703
Text: money usatoday com, Confidence: 0.7565649309519221
Text: S1irt, Confidence: 0.005685085600020861
Text: S2-week", Confidence: 0.1256924709638325
Text: Hiph, Confidence: 0.45263323187828064
Text: Lol, Confidence: 0.26652757935125276
Text: Slock, Confidence: 0.5366472517791053
Text: Losi Chonge:, Confidence: 0.3151874421549558
Text: Hlgh, Confidence: 0.38669881224632263
Text: Lox, Confidence: 0.7445352872391136
Text: Stock, Confidence: 0.6931300737472861
Text: Lost Chonje, Confidence: 0.8775090400050946
Text: 4571, Confidence: 0.7434910535812378
Text: 'J2 50, Confidence: 0.22079319580557966
Text: Blumel, Con