# Detecting Text from Handwritten Notes

## Step 1: Import the Required Libraries
- Import the OpenCV library for image processing
- Import the NumPy library for numerical operations
- Import the matplotlib.pyplot library for visualizations


In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

## Step 2: Read the Image File in Python and Display
- You may use either plt.imshow or cv2.imshow to display the images
- OpenCV loads images in BGR channel order, so convert them to RGB before passing them to plt.imshow

In [None]:
img = cv2.imread('AP_1.jpg')

# cv2.imshow('original', img)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
plt.title("Original")
plt.axis("off")
plt.show()

**Observation**
- The code will display the original image.
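
One detail worth checking when displaying with matplotlib: OpenCV stores colour channels in BGR order, whereas plt.imshow reads a three-channel array as RGB. The sketch below illustrates the difference on a synthetic one-pixel image rather than on AP_1.jpg. The helper name bgr_to_rgb is hypothetical, and reversing the last axis is assumed to match cv2.cvtColor with cv2.COLOR_BGR2RGB for 3-channel images.

```python
import numpy as np

def bgr_to_rgb(image):
    """Reverse the channel axis of a 3-channel image (hypothetical helper)."""
    return image[..., ::-1]

# A single pure-blue pixel as OpenCV would store it: (B, G, R) = (255, 0, 0)
blue_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
blue_rgb = bgr_to_rgb(blue_bgr)

# After the swap, the blue intensity sits in the last channel, as RGB expects
print(blue_rgb[0, 0])
```

Without this conversion, reds and blues appear swapped in the matplotlib output, which is easy to miss on mostly grey handwritten pages.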

## Step 3: Deskew the Image (If Required)

- Check skewness and rotate the image using the function shown below
- Convert the image to grayscale
- Invert the grayscale image
- Apply thresholding to create a binary image
- Find the coordinates of non-zero (foreground) pixels in the binary image
- Determine the angle of rotation based on the coordinates of the foreground pixels
- Normalize the angle: if it is below -45 degrees, add 90 degrees and negate the sum; otherwise, negate the angle
- Get the dimensions of the image
- Determine the center of rotation
- Generate the rotation matrix
- Apply the rotation to the image
- Return the rotated image
- Display the rotated image


In [None]:
def rot_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # Convert to grayscale

    # Invert the grayscale image (black becomes white and vice versa) so that
    # dark text on a light page becomes the bright foreground.
    gray = cv2.bitwise_not(gray)

    # Binarize with Otsu's method so every pixel is either black or white,
    # which separates the skewed content from the background.
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

    # Finds the coordinates of non-zero pixels in the thresholded image,
    # which represent the foreground objects or text.
    coords = np.column_stack(np.where(thresh > 0))

    # Fit the smallest rotated rectangle around the coordinates and read its angle.
    # If the angle is below -45 degrees, add 90 and negate the sum; otherwise negate it,
    # so the correction stays between -45 and 45 degrees.
    # Note: this branch assumes minAreaRect reports angles in [-90, 0); newer OpenCV
    # releases may use a different range, so verify the angle on your version.
    angle = cv2.minAreaRect(coords)[-1]
    if angle < -45:
        angle = -(90 + angle)
    else:
        angle = -angle
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)


    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    # Rotate the image with the computed matrix, using cubic interpolation to preserve
    # quality and BORDER_REPLICATE to fill the border areas exposed by the rotation.
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)
    return rotated

In [None]:
rotated = rot_image(img)
# cv2.imshow('rotated', rotated)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

plt.imshow(cv2.cvtColor(rotated, cv2.COLOR_BGR2RGB))  # convert BGR to RGB for matplotlib
plt.title("Rotated")
plt.axis("off")
plt.show()

**Observation**
- This code displays the deskewed image returned by rot_image.
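
To make the angle adjustment in rot_image concrete, here is a minimal sketch of that branch as a standalone function. The name correction_angle is hypothetical, and the input is assumed to follow the convention the notebook code relies on, where minAreaRect reports an angle in [-90, 0). Some newer OpenCV releases are reported to use a different range, in which case this branch would need adapting.

```python
def correction_angle(rect_angle):
    """Mirror the branch in rot_image for a minAreaRect angle in [-90, 0)."""
    if rect_angle < -45:
        # Rectangle is closer to vertical: measure from the other side
        return -(90 + rect_angle)
    # Rectangle is closer to horizontal: simply flip the sign
    return -rect_angle

# Sample angles chosen for illustration only
for rect_angle in (-88.0, -45.0, -3.0):
    print(rect_angle, "->", correction_angle(rect_angle))
```

Under that assumption the result always lies between -45 and 45 degrees, so the page is never rotated by more than an eighth of a turn.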

## Step 4: Remove Noise from the Image
- Apply median blur to the rotated image using a kernel size of 19
- Subtract the rotated image from the median-blurred image to isolate the strokes and remove the background
- Invert the resulting image
- Define a kernel matrix for the erosion operation
- Erode the image with the defined kernel, which shrinks the white regions at their edges and grows the black ones

In [None]:
median = cv2.medianBlur(rotated, 19)

# Subtracting the image from its median blur removes the background
img2 = cv2.subtract(median, rotated)
img2 = cv2.bitwise_not(img2)

# Erosion shrinks white regions at their edges, which grows the adjacent black regions
kernel = np.ones((3, 3), np.uint8)
img_erode = cv2.erode(img2, kernel, iterations=1)

In [None]:
# cv2.imshow('denoised', img_erode)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
plt.imshow(img_erode)
plt.title("Denoised")
plt.axis("off")
plt.show()

**Observation**
- The code processes the input image by removing the background, inverting its colors, and eroding it; it then displays the final **Denoised** image using matplotlib.
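
The background removal works because the subtraction saturates at zero instead of wrapping around. The sketch below reproduces that arithmetic in NumPy on three hypothetical pixel values; saturating_subtract is an illustrative stand-in that is assumed to behave like cv2.subtract on uint8 input.

```python
import numpy as np

def saturating_subtract(a, b):
    """Subtract uint8 arrays and clip at 0 (illustrative stand-in for cv2.subtract)."""
    diff = a.astype(np.int16) - b.astype(np.int16)
    return np.clip(diff, 0, 255).astype(np.uint8)

# Hypothetical values: blurred background estimate vs. the actual pixels
median = np.array([200, 200, 200], dtype=np.uint8)  # paper brightness after blurring
pixels = np.array([50, 195, 210], dtype=np.uint8)   # ink, paper, slightly brighter paper

# Large only where the pixel is darker than its blurred surroundings
strokes = saturating_subtract(median, pixels)
# Same effect as cv2.bitwise_not on uint8 data
cleaned = 255 - strokes
print(strokes, cleaned)
```

Pixels at or above the local background level end up pure white after inversion, while ink stays noticeably darker, which is what flattens uneven lighting on the page.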

## Step 5: Text Thinning

- Further erode the denoised image using a 5x5 kernel
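- Erosion shrinks the bright regions, so it thins strokes only when the text is brighter than the background; for dark text on a light background it thickens the strokes instead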
- Display the thinned image

In [None]:
kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(img_erode,kernel,iterations = 1)

In [None]:
# cv2.imshow('thinning', erosion)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

plt.imshow(erosion)
plt.title("Thinning")
plt.axis("off")
plt.show()

**Observation**
- The code applies an erosion operation on the **Denoised** image using a 5x5 kernel matrix, which shrinks the white regions of the image and correspondingly grows the dark ones.
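
For intuition about what erosion does to dark and bright regions, the following sketch implements grayscale erosion as a neighbourhood minimum in plain NumPy. The function erode_min_filter is hypothetical and only approximates cv2.erode with a square kernel of ones; the 7x7 test image is synthetic.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode_min_filter(gray, ksize):
    """Grayscale erosion with a square kernel: each pixel becomes its neighbourhood minimum."""
    pad = ksize // 2
    # Pad with white so the border never lowers the minimum
    padded = np.pad(gray, pad, mode="constant", constant_values=255)
    windows = sliding_window_view(padded, (ksize, ksize))
    return windows.min(axis=(2, 3))

page = np.full((7, 7), 255, dtype=np.uint8)
page[3, 3] = 0  # one dark "ink" pixel on a white page

eroded = erode_min_filter(page, 3)
# Count dark pixels before and after erosion
print(int((page == 0).sum()), "->", int((eroded == 0).sum()))
```

Because every pixel takes the darkest value in its window, a single dark pixel spreads to the whole 3x3 neighbourhood, while an isolated bright pixel on a dark background would disappear.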

## Step 6: Perform Word Segmentation on the Image
- Convert the eroded image to grayscale
- Display the grayscale image using matplotlib


In [None]:
gray_scale = cv2.cvtColor(erosion, cv2.COLOR_BGR2GRAY)
# cv2.imshow('gray_scale', gray_scale)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
plt.imshow(gray_scale, cmap = 'gray')
plt.title("Gray scale")
plt.axis("off")
plt.show()

- Apply inverse binary thresholding to the grayscale image; because THRESH_OTSU is set, the threshold is chosen automatically and the supplied value of 80 is ignored
- Uncomment the following lines to display the thresholded image using OpenCV
- Display the thresholded image using matplotlib
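
As background on the THRESH_OTSU flag, here is a minimal NumPy sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the histogram. The function otsu_threshold and the synthetic bimodal image are assumptions made for illustration; OpenCV's own implementation may differ in details such as tie-breaking.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximizes between-class variance (pixels <= t vs. > t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # weight of the class at or below t
    mu = np.cumsum(prob * levels)   # cumulative mean up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0                   # weight of the class above t
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    between = np.zeros(256)
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(between))

# Synthetic bimodal image: dark "ink" around 40, bright "paper" around 210
rng = np.random.default_rng(0)
ink = rng.integers(30, 51, size=300)
paper = rng.integers(200, 221, size=700)
gray = np.concatenate([ink, paper]).astype(np.uint8).reshape(20, 50)

t = otsu_threshold(gray)
# Like THRESH_BINARY_INV: pixels above t become 0, the rest become 255
binary_inv = np.where(gray > t, 0, 255).astype(np.uint8)
print(t, int((binary_inv == 255).sum()))
```

With the inverse mode, the dark ink becomes the white foreground, which is the form the later dilation and contour steps expect.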


In [None]:
ret,thresh = cv2.threshold(gray_scale,80, 255,cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
# cv2.imshow('thresholding', thresh)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
plt.imshow(thresh, cmap = 'gray')
plt.title("Thresholding")
plt.axis("off")
plt.show()

- Define a kernel for dilation operation
- Apply dilation to the thresholded image using the defined kernel
- Display the dilated image using matplotlib


In [None]:
# Applying dilation
kernel = np.ones((3,15), np.uint8)
dilated = cv2.dilate(thresh, kernel, iterations = 1)
# cv2.imshow('Dilation', dilated)
# cv2.waitKey(0)
# cv2.destroyAllWindows()
plt.imshow(dilated, cmap = 'gray')
plt.title("Dilation")
plt.axis("off")
plt.show()

**Observation**
- The code dilates the thresholded image using a kernel with 3 rows and 15 columns.
- Dilation grows the white regions; because this kernel is much wider than it is tall, nearby letters on the same line tend to merge into word-level blobs.
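
The kernel width decides which gaps get bridged. The sketch below shows this on a single synthetic pixel row using a NumPy maximum filter; dilate_row and count_blobs are hypothetical helpers, and the gap sizes are made up to mimic letter spacing and word spacing.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dilate_row(row, width):
    """1-D dilation: each pixel becomes the maximum over a window of the given odd width."""
    pad = width // 2
    padded = np.pad(row, pad, mode="constant", constant_values=0)
    return sliding_window_view(padded, width).max(axis=1)

def count_blobs(row):
    """Count runs of consecutive non-zero pixels."""
    on = (row > 0).astype(np.int8)
    return int(np.sum(np.diff(np.concatenate([[0], on])) == 1))

# Three "letters": a 6-pixel gap inside a word, then a 30-pixel gap between words
row = np.zeros(80, dtype=np.uint8)
row[10:15] = 255
row[21:26] = 255  # starts 6 pixels after the first letter ends
row[56:61] = 255  # starts 30 pixels after the second letter ends

dilated_row = dilate_row(row, 15)
# Number of separate blobs before and after dilation
print(count_blobs(row), "->", count_blobs(dilated_row))
```

A width of 15 extends each blob by 7 pixels on both sides, so the 6-pixel gap closes and the 30-pixel gap survives; a wider kernel would start merging whole words or lines.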

## Step 7: Find Rectangle
- Find contours in the dilated image using **cv2.findContours()**
- Sort the contours based on the y-coordinate of their bounding rectangle's top-left corner


In [None]:
(contours, hierarchy) = cv2.findContours(dilated.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
sorted_contours_lines = sorted(contours, key = lambda ctr : cv2.boundingRect(ctr)[1])

## Step 8: Draw Rectangle onto Words
- Create a copy of the rotated image to draw bounding rectangles
- Initialize an empty list to store word coordinates
- Iterate over each line contour in the sorted contours and crop its region from the dilated image
- Find the word contours inside that region and sort them from left to right
- Skip contours whose area is below 400, which are likely noise
- Store each word's coordinates and draw its bounding rectangle on the **final_image**


In [None]:
final_image = rotated.copy() # original rotated image
words_list = []

for line in sorted_contours_lines:

    # Region of interest (ROI) of each line
    x, y, w, h = cv2.boundingRect(line)
    roi_line = dilated[y:y+h, x:x+w]  # rows span the height, columns span the width

    # Find the word contours inside the line ROI and sort them from left to right
    (cnt, hierarchy) = cv2.findContours(roi_line.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    sorted_contour_words = sorted(cnt, key=lambda cntr : cv2.boundingRect(cntr)[0])

    for word in sorted_contour_words:

        if cv2.contourArea(word) < 400:
            continue

        x2, y2, w2, h2 = cv2.boundingRect(word)
        words_list.append([x+x2, y+y2, x+x2+w2, y+y2+h2])
        cv2.rectangle(final_image, (x+x2, y+y2), (x+x2+w2, y+y2+h2), (255,255,100),2)

In [None]:
# cv2.imshow('Word_Segmented', final_image)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

plt.imshow(cv2.cvtColor(final_image, cv2.COLOR_BGR2RGB))  # convert BGR to RGB for matplotlib
plt.title("Word Segmented")
plt.axis("off")
plt.show()

**Observation**
- This code displays the rotated image with a bounding rectangle drawn around each detected word, showing how well the words have been segmented.
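
Once the word boxes are collected, a common follow-up is to arrange them in reading order. The sketch below is one possible heuristic and not part of the notebook: it groups boxes into lines when a box's vertical centre falls inside the current line's y-range, then sorts each line from left to right. The function group_into_lines and the sample coordinates are hypothetical; the boxes use the same [x1, y1, x2, y2] layout as words_list.

```python
def group_into_lines(boxes):
    """Group (x1, y1, x2, y2) boxes into text lines, then sort each line left to right."""
    lines = []
    for box in sorted(boxes, key=lambda b: b[1]):  # process boxes from top to bottom
        x1, y1, x2, y2 = box
        centre_y = (y1 + y2) / 2
        if lines and lines[-1]["top"] <= centre_y <= lines[-1]["bottom"]:
            # The box overlaps the current line vertically: add it and widen the range
            line = lines[-1]
            line["boxes"].append(box)
            line["top"] = min(line["top"], y1)
            line["bottom"] = max(line["bottom"], y2)
        else:
            # Otherwise the box starts a new line
            lines.append({"top": y1, "bottom": y2, "boxes": [box]})
    return [sorted(line["boxes"], key=lambda b: b[0]) for line in lines]

# Hypothetical word boxes, deliberately out of order
sample_boxes = [
    [120, 12, 180, 40],  # second word, first line
    [10, 10, 100, 42],   # first word, first line
    [15, 60, 90, 95],    # first word, second line
    [110, 64, 200, 92],  # second word, second line
]
for line in group_into_lines(sample_boxes):
    print(line)
```

The heuristic assumes lines are roughly horizontal and do not overlap vertically, which is more likely to hold after the deskewing step than on the raw photograph.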