Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
Updated Oct 31, 2022 - Python
This small module connects Label Studio with Fonduer by creating a Fonduer labeling function for gold labels from a Label Studio export. Documentation: https://irgroup.github.io/labelstudio-to-fonduer/
Datasets and Evaluation Scripts for CompHRDoc
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Run optical character recognition with PyTesseract from the FiftyOne App!
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
A repo for Document AI
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.