## Cursive English Handwritten Character Classification
### Problem Statement
- Develop a neural network model that accurately classifies cursive English handwriting
### Motivation for Neural Network
- OCR applications in digitizing documents
- Automated form processing and accessibility tools
- Limited familiarity with cursive among younger generations and non-native English speakers
### Dataset Overview
- CVL Database [1]
- Seven different texts handwritten by 310 individual writers
- Cursive handwriting in multiple different styles
- German text must be filtered out of the dataset, since the task targets English only

[1] Kleber, F., Fiel, S., Diem, M., & Sablatnig, R. (2018). CVL Database - An Off-line Database for Writer Retrieval, Writer Identification and Word Spotting [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1492267

# Set up Torch

In [13]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import torchvision
import torchvision.transforms as transforms
from PIL import Image

import torch.nn as nn
import torch.optim as optim
from torchsummary import summary
from sklearn.metrics import confusion_matrix, classification_report
from importlib import reload

# Checking if CUDA is available
flag_cuda = torch.cuda.is_available()
device = torch.device("cuda" if flag_cuda else "cpu")
if flag_cuda:
    print("Using GPU")
else:
    print("Using CPU")

Using GPU


# Inputs and Outputs
## Inputs
* Grayscale, normalized 775 by 120 images of cursive words
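
The input format above could be produced along the following lines. This is a minimal sketch, not the preprocessing used by `words_helper`: the function name `preprocess_word_image` is hypothetical, and it assumes that "775 by 120" means 775 pixels wide by 120 pixels high and that normalization maps pixel values to the range [-1, 1].

```python
import numpy as np
from PIL import Image

# Assumption: 775 is the width and 120 is the height of each word image.
TARGET_WIDTH, TARGET_HEIGHT = 775, 120


def preprocess_word_image(img):
    """Return a normalized grayscale array of shape (120, 775) for one word image.

    Hypothetical helper for illustration only.
    """
    gray = img.convert("L")  # single-channel, 8-bit grayscale
    # PIL's resize takes (width, height); this stretches the word to fit,
    # so the original aspect ratio is not preserved.
    resized = gray.resize((TARGET_WIDTH, TARGET_HEIGHT))
    arr = np.asarray(resized, dtype=np.float32) / 255.0  # scale to [0, 1]
    # Assumption: normalize with mean 0.5 and std 0.5, giving values in [-1, 1].
    return (arr - 0.5) / 0.5


# Example with a synthetic blank image standing in for a scanned word.
sample = Image.new("RGB", (300, 80), "white")
print(preprocess_word_image(sample).shape)
```

Stretching every word to a fixed size distorts short words; padding to the target width is a common alternative, and either choice would need to match what the loaders actually do.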
## Outputs
* Labels of the input words
# Evaluation Plan
* The model's performance will be assessed using:
  * Accuracy as the primary metric.
  * Precision, recall, and F1-score to measure class-wise performance.
  * Training and validation loss to monitor learning and overfitting.
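
To make the metrics above concrete, here is a minimal pure-Python sketch of accuracy and macro-averaged precision, recall, and F1. The function name `macro_metrics` is hypothetical and this is not the code inside `trainNet`; it assumes that a class with no predictions (or no true samples) contributes 0 to the average.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration; classes with an undefined
    precision or recall are scored as 0 (an assumption).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# A classifier that always predicts the most frequent word.
truth = ["the", "the", "the", "the", "and", "cat"]
preds = ["the"] * 6
print(macro_metrics(truth, preds))
```

In this toy case accuracy is about 0.67 while macro F1 is about 0.27, because the two rare words are never predicted. Macro averaging weights every class equally, which is why it can sit well below accuracy when word frequencies are imbalanced.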

# Create the Loaders

In [1]:
import words_helper as wp
from words_helper import createLoaders, BaselineCNN, CursiveGenerator, trainNet, trainCursiveNet

# Load data
classes, train_loader, valid_loader, test_loader, label_dict = wp.createLoaders(batch_size=32)

# Train classifier
clf = BaselineCNN(num_classes=len(classes))
trainNet(clf, train_loader, valid_loader, label_dict, epochs=12)


Done creating loaders
Epoch 1/12 Train Loss 3.6855
Accuracy: 27.12%
Macro Precision: 0.1258
Macro Recall: 0.1118
Macro F1-Score: 0.0951
Epoch 2/12 Train Loss 2.6837
Accuracy: 34.25%
Macro Precision: 0.1767
Macro Recall: 0.1593
Macro F1-Score: 0.1444
Epoch 3/12 Train Loss 2.3029
Accuracy: 41.19%
Macro Precision: 0.2307
Macro Recall: 0.1996
Macro F1-Score: 0.1892
Epoch 4/12 Train Loss 2.0608
Accuracy: 45.47%
Macro Precision: 0.2677
Macro Recall: 0.2304
Macro F1-Score: 0.2223
Epoch 5/12 Train Loss 1.8844
Accuracy: 48.21%
Macro Precision: 0.2881
Macro Recall: 0.2564
Macro F1-Score: 0.2498
Epoch 6/12 Train Loss 1.7509
Accuracy: 51.43%
Macro Precision: 0.3042
Macro Recall: 0.2750
Macro F1-Score: 0.2694
Epoch 7/12 Train Loss 1.6429
Accuracy: 52.15%
Macro Precision: 0.3230
Macro Recall: 0.2806
Macro F1-Score: 0.2766
Epoch 8/12 Train Loss 1.5418
Accuracy: 55.02%
Macro Precision: 0.3544
Macro Recall: 0.3067
Macro F1-Score: 0.3036
Epoch 9/12 Train Loss 1.4618
Accuracy: 56.12%
Macro Precision: 0.3

# Prior Work
* None of our team members has prior experience training a model to recognize cursive handwriting.