This project aims to develop a web application for recognizing handwritten text from images using deep learning techniques. The application allows users to upload images containing handwritten text, processes the images to recognize the text, and provides the recognized text as output. Additionally, users can download the recognized text for further use.
Description:
The Handwritten Text Recognition Web Application is a Flask-based web application designed to recognize handwritten text from images. It utilizes a pretrained Transformer-based Optical Character Recognition (TrOCR) model for recognizing text and OpenCV for line segmentation in the uploaded images.
Features:
Upload Images: Users can upload images containing handwritten text using the provided file upload form.
Text Recognition: Upon uploading an image, the application processes the image to isolate individual lines of text and passes them through the pretrained TrOCR model for recognition.
Display Results: The recognized text is displayed on the web interface, showing the extracted text from each line of the uploaded image.
Download Text: Users have the option to download the recognized text as a text file. Clicking the "Download Text" button saves the recognized text as a text file for download.
Usage:
-
Install the required dependencies listed in requirements.txt.
-
First Run model.py.
-
Run the Flask application using the command python app.py.
-
Access the web application in your browser by navigating to http://localhost:5000.
-
Upload an image containing handwritten text.
-
View the recognized text displayed on the web interface.
-
Optionally, download the recognized text as a text file.
Project Structure:
app.py: Flask application handling routing, file upload, text recognition, and text download functionalities. index.html: HTML template file for the user interface of the web application. model.py: Script for initializing and saving the TrOCR model and processor. line_segmentation.py: Script containing functions for line segmentation in uploaded images. Technologies Used:
Flask HTML OpenCV Transformers PyTorch PIL (Python Imaging Library) Purpose: The purpose of this project is to provide a user-friendly tool for recognizing handwritten text from images. It can be used for various applications such as digitizing handwritten documents, extracting information from forms, and assisting visually impaired individuals in accessing printed content.
Future Improvements:
Improve the accuracy of text recognition through model fine-tuning and data augmentation techniques. Enhance the user interface with additional features such as text editing and formatting options. Implement support for recognizing multiple languages and scripts. Optimize performance for handling large volumes of image uploads and concurrent user requests.