
Implementation of the TCS ION Remote Internship RIO-125


Viddesh1/HTE


HTE (Handwritten Text Extraction)

TCS ION Remote Internships RIO-125

Project name:- Automate extraction of handwritten text from an image

Project objective and brief:- To develop a machine learning algorithm that enables entity and knowledge extraction from documents with handwritten annotations, with the aim of identifying handwritten words in an image.

Project Guidelines:-

  1. Prepare or collect sample images that contain handwritten text (scanned document copies are also acceptable).

  2. The chosen images may include the following:-
     2.1) Cursive handwriting
     2.2) Poor image quality generated from frequently scanned documents
     2.3) Skewed images

  3. Develop a machine learning algorithm for detection and segmentation of handwritten text (words / sentences) from the chosen images.

  4. Test the application for reasonable accuracy.
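As a point of reference for the segmentation step in guideline 3, here is a minimal sketch of a classical (non-learned) baseline: splitting a binarized text line into word candidates with a vertical projection profile. This is not the method used in this repository; the function name `segment_columns`, the `min_gap` parameter, and the convention that ink pixels are non-zero are all assumptions made for illustration.

```python
import numpy as np


def segment_columns(binary, min_gap=3):
    """Split a binarized text line into column ranges that contain ink.

    binary  : 2-D array, non-zero where there is ink (assumed convention).
    min_gap : hypothetical threshold; runs of at least this many blank
              columns are treated as a boundary between words.
    Returns a list of (start, end) column ranges, end exclusive.
    """
    has_ink = binary.sum(axis=0) > 0  # vertical projection profile
    segments = []
    start = None  # first column of the segment being built
    gap = 0       # consecutive blank columns seen since the last ink

    for x, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the segment just after its last ink column.
                segments.append((start, x - gap + 1))
                start = None
                gap = 0

    if start is not None:
        # The line ended before a full gap was seen.
        segments.append((start, len(has_ink) - gap))
    return segments


# Synthetic example: two "words"; the first has a 1-column internal gap.
line = np.zeros((5, 20))
line[1:4, 2:5] = 1
line[1:4, 6:8] = 1
line[2:4, 12:16] = 1
print(segment_columns(line))  # [(2, 8), (12, 16)]
```

A profile-based split like this tends to struggle with cursive or skewed text, which is one reason the guidelines call for a learned model on such images.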

Expected Project outcome:-

  1. Algorithm to detect and segment handwritten text from an image
  2. Detailed presentation with proof of reasonable accuracy
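One common way to quantify "reasonable accuracy" for handwriting recognition is the character error rate (CER): the total edit distance between predicted and reference transcriptions, divided by the total number of reference characters. The sketch below is illustrative only; the project description does not state which metric was used, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(predictions, references):
    """Total edit distance divided by total reference length."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(p, r)
                      for p, r in zip(predictions, references))
    return total_edits / total_chars


# One wrong character out of ten reference characters.
print(character_error_rate(["hallo", "world"], ["hello", "world"]))  # 0.1
```

Lower is better; a CER of 0.0 means every predicted word matched its reference exactly.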

Links to code and executable files:

  1. Dataset link:-

https://www.kaggle.com/datasets/nibinv23/iam-handwriting-word-database

  2. GitHub link (code and Google Colab output are in the TcsInternship/HTR_Using_CRNN folder):-

https://github.com/Viddesh1/HTE

  3. Google Colab link:-

https://colab.research.google.com/drive/1RZESTtdWjVG_GJEBRHBzf7HdTx5uQgku?usp=sharing

  4. Kaggle code:-

https://www.kaggle.com/code/viddesh/hand-written-text-extraction-kaggle

  5. Kaggle output:-

https://www.kaggle.com/code/viddesh/hand-written-text-extraction-kaggle/output

  6. Kaggle code and output on GitHub:-

https://github.com/Viddesh1/HTE_Kaggle
