Semantic Entity Recognition of Handwritten Images using LayoutLMv3

This project extracts information from images of handwritten text and saves it as JSON key-value pairs.

Prerequisites

Ensure you have the following dependencies installed:

PyTorch
torchvision

Dataset Creation and Labelling

This project requires a handwritten dataset. You can use the example dataset in handwritten-layoutlmv3/dataset/. Follow these steps if you want to create and label your own dataset:

Collect handwritten samples for your dataset.
Install and set up Label Studio.
Import your collected samples into Label Studio.
Label the samples according to your project requirements.

Ensure the dataset is properly labeled and saved in a format compatible with the OCR models used in this project.
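As a rough illustration of what "compatible format" can involve, the sketch below converts one rectangle from a Label Studio export into a LayoutLM-style bounding box. It assumes the export stores x, y, width and height as percentages of the image size and that the training code expects boxes on a 0-1000 scale; the function name percent_box_to_layoutlm is hypothetical, and the repository's own convert_anno.py may work differently.

```python
def percent_box_to_layoutlm(x, y, width, height):
    """Convert a rectangle given in percent of the image size (0-100)
    into an [x0, y0, x1, y1] box on a 0-1000 scale.

    Assumption: the annotation tool exports percent coordinates and the
    model expects 0-1000 normalized boxes. Check both against your data.
    """
    def scale(value):
        # Percent (0-100) -> 0-1000, clamped in case a box slightly
        # overshoots the image border.
        return max(0, min(1000, int(round(value * 10))))

    return [scale(x), scale(y), scale(x + width), scale(y + height)]


# Hypothetical rectangle: starts at 10% / 20%, spans 30% x 5% of the image.
print(percent_box_to_layoutlm(10.0, 20.0, 30.0, 5.0))
```

Clamping keeps boxes that touch the image edge inside the valid range instead of failing later during training.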

Installation

Clone this repository.
Download the model and place it in the appropriate folder (Download Model).
Run the following command to install the necessary dependencies:

pip install -r requirements.txt

Note: Make sure to install PyTorch and torchvision before running pip install.
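One way to confirm that PyTorch and torchvision are already present before installing the remaining requirements is a small check like the one below. This is only a sketch: the helper name missing_packages is hypothetical and is not part of this repository.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in the
    current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["torch", "torchvision"])
    if missing:
        print("Install these first:", ", ".join(missing))
    else:
        print("PyTorch and torchvision found; continue with requirements.txt")
```

The check only looks up whether the modules can be located; it does not import them, so it runs quickly even on machines without a GPU.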

Usage

Run python convert_anno.py first to convert the existing annotation format to the appropriate format.
Run python src/main.py to train the model. Make sure the number of classes matches both the annotations and the model architecture.
Run python src/inference.py to perform inference. Before running it, adjust the image path and the classes, and comment out the loss function in the trainer to prevent errors during forward propagation.
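A mismatch between the number of classes in the annotations and in the model is easier to catch before training starts. The sketch below builds label mappings from a list of annotation labels and checks the count; the function names and the example labels are hypothetical and are not taken from src/main.py.

```python
def build_label_maps(labels):
    """Build label-to-id and id-to-label mappings from annotation labels.
    Labels are sorted so the ids are stable between runs."""
    unique = sorted(set(labels))
    label2id = {label: idx for idx, label in enumerate(unique)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


def check_num_classes(labels, num_classes):
    """Raise an error if the annotations do not contain exactly the
    number of classes the model is configured for."""
    found = len(set(labels))
    if found != num_classes:
        raise ValueError(
            f"Annotations contain {found} classes, model expects {num_classes}"
        )


# Hypothetical labels collected from a converted annotation file.
labels = ["NAME", "DATE", "NAME", "O"]
label2id, id2label = build_label_maps(labels)
check_num_classes(labels, num_classes=3)
print(label2id)
```

Reusing the same mappings for training and inference helps keep the class list consistent across both scripts.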

Limitations

While this project has demonstrated promising results, there are a few limitations to note:

The bounding box predictions from the trained model may not always be accurate. This could lead to errors in text detection and subsequently in the recognition and extraction of information.
The extraction of information into a JSON key-value pair format currently relies on manual logic. This may not be robust to variations in the data and could limit the scalability of the project.

These limitations present opportunities for future improvements and refinements to the project.
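To make the second limitation concrete, manual key-value logic of this kind often amounts to grouping recognized words by their predicted label. The sketch below is an assumption about what such logic can look like, not the code used in this repository; the function name, the "O" label for unlabelled words, and the sample data are all hypothetical.

```python
import json
from collections import defaultdict


def entities_to_key_values(words, labels, ignore_label="O"):
    """Group words by predicted label and join them into one string per
    label. Words tagged with ignore_label are skipped."""
    grouped = defaultdict(list)
    for word, label in zip(words, labels):
        if label == ignore_label:
            continue
        grouped[label].append(word)
    return {label: " ".join(parts) for label, parts in grouped.items()}


# Hypothetical model output for one image.
words = ["Name:", "John", "Doe", "Date:", "2023-05-01"]
labels = ["O", "NAME", "NAME", "O", "DATE"]
print(json.dumps(entities_to_key_values(words, labels), indent=2))
```

Logic like this merges every word that shares a label into a single value, which is one reason it may not hold up when a form contains repeated fields or a different layout.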

