
# Optical Character Recognition (OCR)
- Basics of OCR techniques
- Text detection and extraction
- OCR libraries and applications

!["OCR overview"](https://assets-global.website-files.com/5d7b77b063a9066d83e1209c/61154671c05e0cda312c86eb_optical-character-recognition.png)

## 1. Image Preprocessing:
- Grayscale Conversion: Convert the image to grayscale. This reduces the image to a single intensity channel, which is the input that thresholding and edge detection operate on.

- Thresholding:
    Apply thresholding to create a binary image, separating text from the background. This helps in distinguishing text more clearly.
    
- Noise Reduction:
    Use techniques like blurring or morphological operations (erosion and dilation) to remove noise, enhancing the text's clarity.
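
The first two preprocessing steps can be sketched in plain NumPy. This is an illustration only, not the pipeline used later in this notebook (which relies on OpenCV). Otsu's method is an assumed choice of thresholding rule, and the helper names `to_grayscale`, `otsu_threshold`, and `binarize` are hypothetical.

```python
import numpy as np

def to_grayscale(bgr):
    """Weighted sum of the B, G, R channels (BT.601 luma weights)."""
    weights = np.array([0.114, 0.587, 0.299])  # B, G, R order, as OpenCV loads images
    return np.rint(bgr.astype(np.float64) @ weights).astype(np.uint8)

def otsu_threshold(gray):
    """Return the threshold that maximises the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    cum_count = np.cumsum(hist)                  # pixels with value <= t
    cum_sum = np.cumsum(hist * np.arange(256))   # intensity mass with value <= t
    w0 = cum_count / total
    w1 = 1.0 - w0
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = cum_sum / cum_count
        mu1 = (cum_sum[-1] - cum_sum) / (total - cum_count)
        between = w0 * w1 * (mu0 - mu1) ** 2
    # Thresholds that leave one class empty produce NaN; treat them as 0
    return int(np.argmax(np.nan_to_num(between)))

def binarize(gray, threshold):
    """Pixels above the threshold become 255, all others 0."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Synthetic example: a dark stroke (value 30) on a light background (value 220)
image = np.full((20, 40, 3), 220, dtype=np.uint8)
image[8:12, 5:35] = 30
gray = to_grayscale(image)
threshold = otsu_threshold(gray)
binary = binarize(gray, threshold)
```

With OpenCV the same result is usually obtained with `cv2.cvtColor` followed by `cv2.threshold` using the `cv2.THRESH_OTSU` flag.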

## 2. Text Detection:
- Contour Detection:
    Find contours in the processed image. Text regions often form distinct contours due to their contrast with the background.

- Bounding Boxes:
    Create bounding boxes around these detected text regions. These boxes enclose the text areas, enabling extraction.

## 3. Text Extraction:
- Extract Text Regions:
    Crop the regions defined by the bounding boxes to focus only on the text areas.
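
To make the detection and cropping steps concrete, here is a minimal sketch that finds a bounding box for each connected region of a binary mask and crops it. It is a simplified stand-in for `cv2.findContours` plus `cv2.boundingRect`, not a replacement for them; the function names `find_bounding_boxes` and `crop` are hypothetical.

```python
from collections import deque
import numpy as np

def find_bounding_boxes(binary):
    """Return (x, y, w, h) for each 4-connected region of non-zero pixels."""
    height, width = binary.shape
    visited = np.zeros((height, width), dtype=bool)
    boxes = []
    for row in range(height):
        for col in range(width):
            if binary[row, col] == 0 or visited[row, col]:
                continue
            # Flood-fill the region while tracking its extreme coordinates
            queue = deque([(row, col)])
            visited[row, col] = True
            min_r = max_r = row
            min_c = max_c = col
            while queue:
                r, c = queue.popleft()
                min_r, max_r = min(min_r, r), max(max_r, r)
                min_c, max_c = min(min_c, c), max(max_c, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < height and 0 <= nc < width
                    if inside and binary[nr, nc] != 0 and not visited[nr, nc]:
                        visited[nr, nc] = True
                        queue.append((nr, nc))
            boxes.append((min_c, min_r, max_c - min_c + 1, max_r - min_r + 1))
    return boxes

def crop(image, box):
    """Slice the region described by an (x, y, w, h) box out of the image."""
    x, y, w, h = box
    return image[y:y + h, x:x + w]

# Synthetic mask with two separate "text" blobs
mask = np.zeros((10, 20), dtype=np.uint8)
mask[2:5, 3:8] = 255
mask[6:9, 12:18] = 255
boxes = find_bounding_boxes(mask)
crops = [crop(mask, box) for box in boxes]
```

On real images a single character often splits into several regions, so boxes are usually merged or filtered by size before OCR.
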

## 4. Optical Character Recognition:
- Apply OCR:
     Run a recognition engine on the cropped regions. Tesseract OCR (used below through the `pytesseract` wrapper) is the most common choice; a custom neural network is an alternative.

## 5. Pipeline Flowchart:
!["FlowChart OCR"](https://miro.medium.com/v2/resize:fit:694/1*qBV12ANk-5epRv7231Zxzw.png)




Further reading: https://www.v7labs.com/blog/ocr-guide


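## Install pytesseract

`pytesseract` is only a Python wrapper. The Tesseract OCR engine itself must be installed separately and be discoverable, either on the system `PATH` or through `pytesseract.pytesseract.tesseract_cmd`. Installation guide:
https://medium.com/@BH_Chinmay/installation-of-tesseract-in-python-77daf712420f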

In [10]:
! pip install pytesseract



In [17]:
import cv2
import pytesseract
import numpy as np

In [29]:
# load the input image from disk, convert it to grayscale, and blur
# it to reduce noise
path = "./imgs/logo.png"
image = cv2.imread(path)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

In [31]:
# Apply Canny edge detection
edges = cv2.Canny(blurred, 60, 150)  # Adjust thresholds for optimal edge detection

# Find contours
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours on a copy so the regions cropped for OCR stay unmarked
annotated = image.copy()
cv2.drawContours(annotated, contours, -1, (0, 255, 0), 2)  # Draw contours in green
image_with_edges = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)  # Convert edges to 3-channel for display
result_image = cv2.hconcat([annotated, image_with_edges])  # Annotated image (left) next to the edge map (right)

# Extract text regions using bounding boxes
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    cv2.rectangle(result_image, (x, y), (x + w, y + h), (255, 0, 0), 2)  # Draw bounding boxes in blue
    text_region = image[y:y + h, x:x + w]  # Crop from the clean image

    # Apply OCR on text regions (--psm 6 treats the crop as a single uniform block of text)
    text = pytesseract.image_to_string(text_region, lang='eng', config='--psm 6')
    print(text)

# Show the resulting image with edges, contours, and bounding boxes
cv2.imshow('Text Detection', result_image)
cv2.waitKey(0)
cv2.destroyAllWindows()


## Text detection with deep learning (EAST)
Reference: https://medium.com/technovators/scene-text-detection-in-python-with-east-and-craft-cbe03dda35d5

In [23]:
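# Reload the image so detection runs on unmodified pixels
image = cv2.imread(path)
(H, W) = image.shape[:2]

# EAST only accepts input dimensions that are multiples of 32, so round the
# size down and resize; detected boxes are then in the resized image's coordinates
(W, H) = (max(32, (W // 32) * 32), max(32, (H // 32) * 32))
image = cv2.resize(image, (W, H))
orig = image.copy()

# Load the pre-trained EAST text detector and name the two output layers we need:
# per-pixel text scores and box geometry
net = cv2.dnn.readNet("model/frozen_east_text_detection.pb")  # Path to the pre-trained EAST text detector
layer_names = [
    "feature_fusion/Conv_7/Sigmoid",
    "feature_fusion/concat_3"
]

# Prepare the image for EAST text detection
blob = cv2.dnn.blobFromImage(image, 1.0, (W, H), (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
(scores, geometry) = net.forward(layer_names)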

# Decode the EAST output into axis-aligned boxes, keeping only confident predictions
min_confidence = 0.5  # Adjust this threshold for different images
boxes = []        # [x, y, w, h] per detection, the format cv2.dnn.NMSBoxes expects
confidences = []  # One score per box, in the same order
for y in range(scores.shape[2]):
    scores_data = scores[0, 0, y]
    x_data0 = geometry[0, 0, y]
    x_data1 = geometry[0, 1, y]
    x_data2 = geometry[0, 2, y]
    x_data3 = geometry[0, 3, y]
    angles_data = geometry[0, 4, y]

    for x in range(scores.shape[3]):
        if scores_data[x] < min_confidence:
            continue

        # The output feature map is 4x smaller than the network input
        (offsetX, offsetY) = (x * 4.0, y * 4.0)

        # Rotation angle of the prediction and its cosine and sine
        angle = angles_data[x]
        cos = np.cos(angle)
        sin = np.sin(angle)

        # Height and width of the bounding box
        h = x_data0[x] + x_data2[x]
        w = x_data1[x] + x_data3[x]

        # Starting and ending (x, y)-coordinates of the bounding box
        endX = int(offsetX + (cos * x_data1[x]) + (sin * x_data2[x]))
        endY = int(offsetY - (sin * x_data1[x]) + (cos * x_data2[x]))
        startX = int(endX - w)
        startY = int(endY - h)

        boxes.append([startX, startY, endX - startX, endY - startY])
        confidences.append(float(scores_data[x]))

# Apply non-maxima suppression to drop weak, overlapping boxes. The boxes decoded
# above are axis-aligned, so the plain NMSBoxes variant is used and rotation is ignored.
indices = cv2.dnn.NMSBoxes(boxes, confidences, min_confidence, 0.4)

# Loop over the surviving boxes and run OCR on each region.
# flatten() handles both the flat and the Nx1 index layouts returned by different OpenCV versions.
for i in np.array(indices).flatten():
    (bx, by, bw, bh) = boxes[i]

    # Clamp the box to the image bounds before cropping
    x0, y0 = max(0, bx), max(0, by)
    x1, y1 = min(orig.shape[1], bx + bw), min(orig.shape[0], by + bh)
    if x1 <= x0 or y1 <= y0:
        continue
    cropped = orig[y0:y1, x0:x1]

    # Apply OCR on the text region using Tesseract
    text = pytesseract.image_to_string(cropped)
    print("Detected Text:", text)

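Passing a width and height that are not multiples of 32 to `cv2.dnn.blobFromImage` makes `net.forward` fail with the error below, because the feature maps that EAST concatenates no longer have matching shapes:

error: OpenCV(4.8.1) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\layers\concat_layer.cpp:109: error: (-201:Incorrect size of input array) Inconsistent shape for ConcatLayer in function 'cv::dnn::ConcatLayerImpl::getMemoryShapes'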
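
The non-maxima suppression step in the cell above can be illustrated with a small pure-Python sketch. It shows the greedy idea only (keep the highest-scoring box, discard boxes that overlap it too much, repeat) and is not OpenCV's implementation; the function names `iou` and `non_max_suppression` and the sample boxes are made up for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Return indices of the boxes to keep, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate boxes, one separate box, and one low-confidence box
boxes = [(10, 10, 100, 30), (12, 11, 100, 30), (200, 50, 80, 25), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.2]
kept = non_max_suppression(boxes, scores)
```

Here the second box is suppressed because it overlaps the first almost completely, and the last box is dropped by the score threshold, so indices 0 and 2 remain.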
