# Handwritten Recognition Artificial Intelligence Project

## 1. Applications of the Method in Science and Practice

Convolutional Neural Networks (CNNs) have become the standard in image recognition tasks, including the recognition of handwritten characters. CNNs are widely used in both scientific research and industrial applications. Below are some notable examples:

### Scientific Applications
- **Document Analysis and OCR**: CNNs are used to convert scanned documents into machine-readable text.
  - *Reference*: LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. *Proceedings of the IEEE*.
- **Deep Learning for Historical Manuscripts**: Automatic transcription of ancient scripts for digital humanities.
  - *Reference*: Michael Fink, et al. (2011). HMM or CNN? Handwriting Recognition Revisited. *Document Analysis Systems*.

### Practical Applications
- **Postal Services**: Recognizing handwritten addresses on envelopes.
- **Banking**: Automated reading of handwritten checks.
- **Healthcare**: Digitization of handwritten prescriptions.
- **Education**: Automated grading of handwritten exams or homework.

---

## 2. Libraries, Functions and Functions' parameters

### Libraries Used

| Library | Purpose |
|--------|--------|
| `TensorFlow` | Model building and training |
| `TensorFlow Datasets (TFDS)` | Loading the EMNIST dataset |
| `Matplotlib` | Visualization of images |
| `NumPy` | Numerical operations |

### Key Functions and Parameters

| Function | Purpose | Important Parameters |
|---------|---------|----------------------|
| `tfds.load()` | Load EMNIST dataset | `split`, `as_supervised`, `with_info` |
| `tf.image.rot90()` | Rotate image | `k=3` for 270° rotation |
| `tf.image.flip_left_right()` | Flip horizontally | — |
| `tf.cast()`, `/255.0` | Normalize image pixels | — |
| `Conv2D()` | Convolutional layer | `filters`, `kernel_size`, `activation` |
| `MaxPooling2D()` | Downsampling | `pool_size` |
| `Dropout()` | Regularization | `rate` |
| `Dense()` | Fully-connected layer | `units`, `activation` |
| `model.compile()` | Compile the model | `optimizer='adam'`, `loss='sparse_categorical_crossentropy'`, `metrics=['accuracy']` |
| `model.fit()` | Train the model | `epochs`, `validation_data` |

---

## 3. Dataset Characteristics

| Property | Description |
|----------|-------------|
| **Dataset Name** | EMNIST Balanced |
| **Source** | TensorFlow Datasets (https://www.tensorflow.org/datasets/catalog/emnist) |
| **Number of Classes** | 47 (26 lowercase, 26 uppercase, and digits) |
| **Image Size** | 28 × 28 pixels |
| **Color Mode** | Grayscale (1 channel) |
| **Label Type** | Integer label mapped to Unicode character |

The dataset includes both digits and alphabetic characters, making it more challenging than MNIST. Each image is a 28×28 grayscale pixel representation of a single handwritten character.

---

## 4. Empirical Analysis

### Objective

The goal is to build a deep learning model using a Convolutional Neural Network (CNN) to classify handwritten characters from the EMNIST Balanced dataset with high accuracy.

---

### Assumptions

- The EMNIST dataset requires preprocessing to correctly orient the characters.
- CNN is assumed to be effective for feature extraction from image data.
- Accuracy is an appropriate metric for evaluating performance on this balanced classification task.

---

### Model Architecture

```python
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation='softmax')
])
```


**Training Configuration:**

| Parameter        | Value                              |
|------------------|------------------------------------|
| Optimizer        | Adam                               |
| Loss Function    | Sparse Categorical Crossentropy    |
| Epochs           | 10                                 |
| Batch Size       | 128                                |
| Metric.          | Accuracy                           |

---

### Results

| Metric | Value |
|--------|-------|
| **Test Accuracy** | ~88% |
| **Test Lost** | 0.3473 |
| **Number of Parameters** | 229 807 |



---

### Interpretation

The trained model achieved a **test accuracy of 88%** and a **test loss of 0.3473** on the EMNIST balanced dataset. These results indicate that the model has learned meaningful patterns from the data and generalizes reasonably well on unseen samples.

With a total of **229,807 trainable parameters**, the model remains lightweight and efficient, making it suitable for applications with limited computational resources (e.g., mobile OCR systems).

However, an 88% accuracy also suggests that the model still misclassifies approximately 1 out of every 8 characters. The most frequently mistake are caused by Visually similar characters such as `'O'` vs `'0'`, `'l'` vs `'1'`, or `'S'` vs `'5'`

---

## 5. Recommendations for improvement

- **Data augmentation:** Introduce variations in rotation, scaling, and distortion to improve the model's robustness to real-world handwriting variations.

- **Expand the model slightly:** Consider increasing the number of filters or adding a residual block (ResNet-inspired) to better learn deeper features without overfitting.

- **Use pre-trained backbones:** Explore transfer learning using lightweight pre-trained CNNs like MobileNetV2 or EfficientNet with fine-tuning, to leverage richer features with fewer training epochs.

- **Train on combined datasets:** Combine EMNIST with other handwriting datasets (e.g., IAM, HW-R) to improve generalization and account for style diversity.

- **Move to sequence modeling:** Upgrade to word and sentence handwriting recognition level.

- **Deploy interactive demo:** Package the model into a web app (e.g., using Streamlit or Gradio) to visualize real-time predictions and get feedback from users for further improvements.

---

## Conclusion

This project successfully demonstrates how a CNN can be used to classify handwritten characters from the EMNIST dataset. The pipeline includes end-to-end data loading, preprocessing, model training, and evaluation. With further enhancements such as data augmentation or model tuning, this system can be deployed in real-world handwritten OCR applications.

---

**Author**: *Ky Anh Le, Giang Nguyen*  
**Institution**: *SGH Warsaw School of Economics*