TCS ION Remote Internships RIO-125
Project Name:- Automate extraction of handwritten text from an image
Project objective and brief:- Develop a machine learning algorithm that enables entity and knowledge extraction from documents with handwritten annotations, with the aim of identifying handwritten words in an image.
Project Guidelines:-
1. Prepare or collect sample images that contain handwritten text (scanned document copies are also acceptable).
2. The chosen images may include the following:
   2.1) Cursive handwriting
   2.2) Poor image quality caused by repeated scanning of documents
   2.3) Skewed images
3. Develop a machine learning algorithm for detection and segmentation of handwritten text (words / sentences) from the chosen images.
4. Test the application for reasonable accuracy.
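As a rough illustration of the segmentation step in the guidelines, the sketch below splits a binarized text line into word-like regions using a vertical projection profile. This is a classical baseline rather than the method used in this project; the function name `segment_columns`, the `min_gap` parameter, and the 0/1 list-of-rows image format are all assumptions made for the example.

```python
from typing import List, Tuple


def segment_columns(binary: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split a binarized text line into ink regions.

    binary: rows of 0/1 values, where 1 marks an ink pixel (assumed format).
    min_gap: number of consecutive blank columns that ends a region.
    Returns (start, end) column ranges with an exclusive end.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection profile: number of ink pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]

    segments = []
    start = None  # first column of the region currently being tracked
    gap = 0       # blank columns seen since the last ink column
    for x, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The region ended at the last ink column before the gap.
                segments.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close a region that runs to the right edge, dropping trailing blanks.
        segments.append((start, width - gap))
    return segments


# One-row toy "image": two ink regions separated by three blank columns.
line = [[1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0]]
print(segment_columns(line, min_gap=2))
```

A small `min_gap` tends to split cursive words into fragments, while a large one merges neighbouring words, so the value would need tuning on the sample images. Skewed scans would also need deskewing before a projection profile gives usable results.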
Expected Project outcome:-
- Algorithm to detect and segment handwritten text from an image
- Detailed presentation with proof of reasonable accuracy
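One possible way to back the "reasonable accuracy" claim is to report the character error rate (CER) and exact-match word accuracy over a set of test images. The sketch below is a minimal, dependency-free version under that assumption; the helper names `edit_distance`, `character_error_rate`, and `word_accuracy` are illustrative and are not taken from the project code.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str], predictions: Sequence[str]) -> float:
    """Total edit distance divided by total reference length."""
    total_edits = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


def word_accuracy(references: Sequence[str], predictions: Sequence[str]) -> float:
    """Fraction of predictions that match their reference exactly."""
    if not references:
        return 0.0
    correct = sum(r == p for r, p in zip(references, predictions))
    return correct / len(references)


# Hypothetical ground-truth labels and model outputs, for illustration only.
truth = ["hello", "world"]
predicted = ["hallo", "world"]
print(character_error_rate(truth, predicted))
print(word_accuracy(truth, predicted))
```

CER is usually more informative than word accuracy for handwriting, because a single wrong character in a long word counts as a small error rather than a complete miss.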
Link to Code and executable file:
- Dataset link :-
https://www.kaggle.com/datasets/nibinv23/iam-handwriting-word-database
- GitHub link (the code and the Google Colab output are in the TcsInternship/HTR_Using_CRNN folder) :-
https://github.com/Viddesh1/HTE
- Google Colab link :-
https://colab.research.google.com/drive/1RZESTtdWjVG_GJEBRHBzf7HdTx5uQgku?usp=sharing
- Kaggle code :-
https://www.kaggle.com/code/viddesh/hand-written-text-extraction-kaggle
- Kaggle output :-
https://www.kaggle.com/code/viddesh/hand-written-text-extraction-kaggle/output
- Kaggle code and output in GitHub :-