This repository contains the official implementation of the paper "Multimodal weighted graph representation for information extraction from visually rich documents".
The paper introduces a novel system for information extraction from visually rich documents (VRDs) based on a weighted graph representation, with the goal of improving extraction performance by capturing the relationships between VRD components. Each document is modeled as a weighted graph: nodes encode the visual, textual, and spatial features of text regions, and edges represent relationships between neighboring text regions. Information extraction is then cast as a node classification task solved with graph convolutional networks (GCNs). The approach is evaluated on diverse documents, including invoices and receipts, and performs on par with or better than strong baselines.
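As a rough illustration of this representation (not the repository's actual code), a document graph could be assembled as in the sketch below. The use of DGL, the feature names, and the inverse-distance edge weighting are assumptions made for the example only.

```python
# Illustrative sketch: a weighted graph over text regions, where node features
# concatenate textual, visual, and spatial cues and edge weights reflect
# spatial proximity between neighboring regions.
import dgl
import torch

def build_document_graph(text_feats, visual_feats, boxes, neighbor_pairs):
    """text_feats, visual_feats: (N, d) tensors; boxes: (N, 4) [x1, y1, x2, y2];
    neighbor_pairs: list of (i, j) index pairs of neighboring text regions."""
    src = torch.tensor([i for i, _ in neighbor_pairs])
    dst = torch.tensor([j for _, j in neighbor_pairs])
    g = dgl.graph((src, dst), num_nodes=boxes.shape[0])

    # Node features: concatenation of textual, visual, and spatial (box) features.
    g.ndata["feat"] = torch.cat([text_feats, visual_feats, boxes], dim=1)

    # Edge weights: inverse distance between box centers (one possible weighting).
    centers = torch.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                           (boxes[:, 1] + boxes[:, 3]) / 2], dim=1)
    dist = torch.norm(centers[src] - centers[dst], dim=1)
    g.edata["weight"] = 1.0 / (1.0 + dist)
    return g
```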
To build a graph-based dataset, use the following command:
$ python graph_builder.py -h
This command builds a graph-based dataset for node classification from the selected dataset, used to extract entities from visually rich documents.
Optional Arguments:
-d DATASET, --dataset DATASET : Choose the dataset to use. Options are FUNSD, SROIE, Wildreceipt, or CORD.
-t TRAIN, --train TRAIN : Boolean to choose between the train or test split.
Example:
$ python graph_builder.py -d FUNSD -t True
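To build both the train and test graphs for a dataset in one go, a small helper script can loop over the two splits. The snippet below is a convenience sketch: it only uses the flags documented above, and everything else (the loop, the chosen dataset) is an assumption.

```python
# Convenience sketch: build the train and test graphs for one dataset by
# invoking graph_builder.py with the documented flags.
import subprocess

dataset = "FUNSD"  # or "SROIE", "Wildreceipt", "CORD"
for train_flag in ("True", "False"):
    subprocess.run(
        ["python", "graph_builder.py", "-d", dataset, "-t", train_flag],
        check=True,
    )
```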
To train the model, use the following command:
$ python train.py -h
This command trains the model on a selected dataset for node classification.
Arguments:
-d DATANAME, --dataname DATANAME : Select the dataset for model training. Options are FUNSD, SROIE, Wildreceipt, or CORD.
-p PATH, --path PATH : Select the dataset path for model training.
-hs HIDDEN_SIZE, --hidden_size HIDDEN_SIZE : GCN hidden size.
-hl HIDDEN_LAYERS, --hidden_layers HIDDEN_LAYERS : Number of GCN hidden layers.
-lr LEARNING_RATE, --learning_rate LEARNING_RATE : The learning rate.
-e EPOCHS, --epochs EPOCHS : The number of training epochs.
Example:
$ python train.py -d FUNSD -p data/ -hs 16 -hl 10 -lr 0.01 -e 50
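For reference, training a GCN node classifier with these hyperparameters typically follows the pattern sketched below. This is not the repository's train.py: the use of DGL's GraphConv, the graph/feature/label names, and the model structure are assumptions chosen to mirror the CLI options above.

```python
# Illustrative sketch: a GCN node classifier trained with the hyperparameters
# exposed by the CLI above (hidden size, hidden layers, learning rate, epochs).
import torch
import torch.nn.functional as F
from dgl.nn import GraphConv

class GCN(torch.nn.Module):
    def __init__(self, in_size, hidden_size, num_classes, num_layers):
        super().__init__()
        sizes = [in_size] + [hidden_size] * num_layers + [num_classes]
        self.layers = torch.nn.ModuleList(
            GraphConv(sizes[i], sizes[i + 1], allow_zero_in_degree=True)
            for i in range(len(sizes) - 1)
        )

    def forward(self, g, x, edge_weight=None):
        for i, layer in enumerate(self.layers):
            x = layer(g, x, edge_weight=edge_weight)
            if i < len(self.layers) - 1:
                x = F.relu(x)
        return x

def train(g, labels, hidden_size=16, num_layers=10, lr=0.01, epochs=50):
    feats = g.ndata["feat"]  # assumes the graph built earlier
    model = GCN(feats.shape[1], hidden_size, int(labels.max()) + 1, num_layers)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        logits = model(g, feats, edge_weight=g.edata["weight"])
        loss = F.cross_entropy(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```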
We acknowledge the contributions of the authors of the paper and the developers of the libraries used in this project.