Handwriting-Recognition

Handwriting Recognition (HWR) is the ability of computers to receive and interpret handwritten inputs. The inputs might be scanned from paper documents, images, etc. or sensed from the movement of pens on a special digitizer, for example. This is a Machine learning project for recognizing handwritten texts. The Handwriting Recognition model is built using TensorFlow and Keras. It consists of several components for data processing, model training, and inference.

Instructions: Must need python 3.10.0 version.

Prerequisites: Must have CUDA version 11.2 and cudNNN library version 8.1

Dataset: Move IAM_Words into project directory link: https://git.io/J0fjL

Setup Instructions:

Setup virtual environment: python -m venv [name]
Activate virtual environment: [name]\Scripts\activate
Install packages from requirements pip install -r requirements.txt
Run train.py if not trained

Documentation:

Overview The Handwriting Recognition model is built using TensorFlow and Keras. It consists of several components for data processing, model training, and inference.
Code Structure

train.py: Contains the script for training the handwriting recognition model.

configs.py: Defines the configuration settings for the model, such as paths, dimensions, and training parameters.

inferenceModel.py: Implements a class for using the trained model for inference on new images.

model.py: Defines the architecture of the handwriting recognition model using residual blocks and bidirectional LSTM layers.

Canvas.py: Defines the GUI using the Tkinter library. Users can draw on a canvas, submit drawings for recognition, and receive predictions with suggested corrections.
Training 3.1. Dataset Preparation The IAM Words dataset is used for training. The dataset is preprocessed, and image paths, labels, and vocabulary are extracted from the dataset.

3.2. Model Configuration Model configuration is handled by the ModelConfigs class in configs.py. It includes settings for paths, image dimensions, batch size, learning rate, and training epochs.

3.3. Data Processing The DataProvider class is used for loading and processing the dataset. Data augmentation techniques, such as random brightness, rotation, and erosion/dilation, are applied to the training data.

3.4. Model Architecture The model architecture is defined in model.py. It utilizes residual blocks and bidirectional LSTM layers.

3.5. Model Training The model is compiled using the CTC loss function and trained using the specified callbacks such as early stopping and model checkpointing. Training progress is logged, and the resulting model is saved im Models directory

3.6. Callbacks Several callbacks are used during training, such as early stopping, model checkpointing, and logging for TensorBoard.

3.7. Results Training and validation datasets are saved as CSV files, and model configurations are stored for reference.
Inference 4.1. Loading Trained Model The trained model is loaded using the ImageToWordModel class in inferenceModel.py. The class uses an ONNX inference model and includes a method for making predictions on new images.

4.2. Recognition Process The drawn image is converted to an RGB PIL Image for further processing. Preprocessing is applied to adjust the image size. The preprocessed image is passed through the trained model for text prediction. The predicted text is displayed in the GUI. Suggestions for corrections are shown based on the predicted text.

4.3. Inference on Validation Data The trained model is applied to the validation dataset. Predictions are made, and Character Error Rate(CER) is calculated for each prediction.

4.4. Visualization Images and their predictions are displayed, and average CER is computed.

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
IAM_Tests		IAM_Tests
Images		Images
Models/03_handwriting_recognition		Models/03_handwriting_recognition
tensorboard graphs		tensorboard graphs
.gitignore		.gitignore
Canvas.py		Canvas.py
Handwriting_Recognition_Report__CSE4553_.pdf		Handwriting_Recognition_Report__CSE4553_.pdf
Iterations for Batch size 64.txt		Iterations for Batch size 64.txt
LICENSE		LICENSE
Project_Report_200042115_200042118_200042149.pdf		Project_Report_200042115_200042118_200042149.pdf
README.md		README.md
SpellChecking.py		SpellChecking.py
configs.py		configs.py
inferenceModel.py		inferenceModel.py
main.py		main.py
model.py		model.py
requirements.txt		requirements.txt
tempCodeRunnerFile.py		tempCodeRunnerFile.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Handwriting-Recognition

About

Releases

Packages

Languages

License

GoonerMAK/Handwriting-Recognition

Folders and files

Latest commit

History

Repository files navigation

Handwriting-Recognition

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages