# OCR of Handwritten digits using OpenCV
OCR which stands for Optical Character Recognition is a computer vision technique used to identify the different types of handwritten digits that are used in common mathematics. To perform OCR in OpenCV we will use the KNN algorithm which detects the nearest k neighbors of a particular data point and then classifies that data point based on the class type detected for n neighbors.

Implemented a digit recognition system using k-Nearest Neighbors (kNN) model with OpenCV for OCR of handwritten digits.

- **Objective:** Recognize handwritten digits from images using computer vision techniques.
- **Methods:** Converted images to grayscale, split into 20x20 pixel blocks, and flattened into 400-dimensional vectors for training and testing.
- **Dataset:** Used a custom dataset of handwritten digits.
- **Training:** Trained a kNN classifier on 2500 samples with corresponding labels.
- **Testing:** Evaluated on 2500 test samples, achieving an accuracy of 94.3%.
- **Tools:** OpenCV for image processing and kNN for classification.

This project aimed to demonstrate basic digit recognition capabilities using simple feature extraction and classification methods.


In [25]:
#importing libraries 
import numpy as np
import cv2

In [26]:
# importing and reading data

# Read the image
image = cv2.imread('/Users/kanikawarman/Downloads/digits.png')

# Convert to grayscale
gray_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Check the dimensions of the image
height, width = gray_img.shape
print(f"Original dimensions: {height}x{width}")

Original dimensions: 1000x2000


In [27]:
#Image processing 

# Resize the image to dimensions that are multiples of 20
target_height = (height // 20) * 20
target_width = (width // 20) * 20

# Resize the image
if height != target_height or width != target_width:
    print(f"Resizing to: {target_height}x{target_width}")
    gray_img = cv2.resize(gray_img, (target_width, target_height))
                          
# Split the image into 20x20 blocks
divisions = [np.hsplit(row, target_width // 20) for row in np.vsplit(gray_img, target_height // 20)]

# Convert to Numpy array
NP_array = np.array(divisions)
print("NP_array shape:", NP_array.shape)  # Should be (7, 15, 20, 20)

# Flatten each 20x20 block into a 400-element vector
num_blocks_vertical = NP_array.shape[0]
num_blocks_horizontal = NP_array.shape[1]

# Ensure that we have an equal number of blocks
num_blocks = num_blocks_vertical * num_blocks_horizontal

NP_array shape: (50, 100, 20, 20)


In [31]:
#Model training

# Split the data into training and test sets
num_train_blocks = num_blocks // 2
num_test_blocks = num_blocks - num_train_blocks

# Prepare train and test data
train_data = NP_array.reshape(-1, 400).astype(np.float32)[:num_train_blocks]
test_data = NP_array.reshape(-1, 400).astype(np.float32)[num_train_blocks:]

# Prepare train and test labels
# Assume 10 unique labels (0 to 9)
num_classes = 10
train_labels = np.tile(np.arange(num_classes), num_train_blocks // num_classes + 1)[:num_train_blocks][:, np.newaxis]
test_labels = np.tile(np.arange(num_classes), num_test_blocks // num_classes + 1)[:num_test_blocks][:, np.newaxis]

# Check the shapes
print(f"train_data shape: {train_data.shape}")  # Should match train_labels
print(f"train_labels shape: {train_labels.shape}")
print(f"test_data shape: {test_data.shape}")
print(f"test_labels shape: {test_labels.shape}")

# Ensure matching dimensions
assert train_data.shape[0] == train_labels.shape[0], "Training data and labels count mismatch."
assert test_data.shape[0] == test_labels.shape[0], "Test data and labels count mismatch."

# Initialize kNN classifier
knn = cv2.ml.KNearest_create()

# Train the kNN model
knn.train(train_data, cv2.ml.ROW_SAMPLE, train_labels)

train_data shape: (2500, 400)
train_labels shape: (2500, 1)
test_data shape: (2500, 400)
test_labels shape: (2500, 1)


True

In [33]:
#Prediction
ret, output, neighbours, distance = knn.findNearest(test_data, k=5)

# Check the performance
matched = output == test_labels
correct_OP = np.count_nonzero(matched)

# Calculate accuracy
accuracy = (correct_OP * 1000.0) / output.size

# Display accuracy
print(f"Accuracy: {accuracy}%")

Accuracy: 95.6%
