LiLT (ACL 2022)

This is the official PyTorch implementation of the ACL 2022 paper: "LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding".

LiLT is pre-trained on visually-rich documents of a single language (English) and can be directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models. We hope that making this work publicly available helps document intelligence research.

Installation

For CUDA 11.X:

conda create -n liltfinetune python=3.7
conda activate liltfinetune
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=11.0 -c pytorch
python -m pip install detectron2==0.5 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu110/torch1.7/index.html
git clone https://github.com/jpWang/LiLT
cd LiLT
pip install -r requirements.txt
pip install -e .

For other setups, check which Detectron2/PyTorch versions match your CUDA version and modify the commands above accordingly.
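
When adapting the install commands to another setup, the Detectron2 wheel index URL is the part that changes with the CUDA and PyTorch versions. The sketch below rebuilds that URL from the pattern visible in the command above. The helper name `detectron2_wheel_index` is hypothetical, and the sketch only formats a string: it does not verify that prebuilt wheels exist for a given combination, so check the Detectron2 installation page before relying on the result.

```python
def detectron2_wheel_index(cuda_version: str, torch_version: str) -> str:
    """Format a Detectron2 wheel index URL following the pattern used above.

    Hypothetical helper: it assumes the URL layout seen in the install
    command and does not check that wheels exist for the combination.
    """
    # "11.0" -> "cu110"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    # Drop any local build suffix and keep major.minor: "1.7.1+cu110" -> "1.7"
    torch_tag = ".".join(torch_version.split("+")[0].split(".")[:2])
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{cuda_tag}/torch{torch_tag}/index.html"
    )


# With PyTorch installed, the two inputs can be read from
# torch.version.cuda and torch.__version__.
print(detectron2_wheel_index("11.0", "1.7.1"))
```

For the versions used in this README, the printed URL is the same one passed to `-f` in the install command.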

Datasets

In this repository, we provide the fine-tuning code for FUNSD and XFUND.

You can download our pre-processed data (~1.2GB) from here, and put the unzipped xfund&funsd/ under LiLT/.

Models

Model	Language	Size	Download
`lilt-roberta-en-base`	EN	293MB	OneDrive
`lilt-infoxlm-base`	MUL	846MB	OneDrive
`lilt-only-base`	None	21MB	OneDrive

If you want to combine the pre-trained LiLT with a RoBERTa-like model for another language, download lilt-only-base and use gen_weight_roberta_like.py to generate your own pre-trained weights.

For example, combine lilt-only-base with English roberta-base:

mkdir roberta-en-base
wget https://huggingface.co/roberta-base/resolve/main/config.json -O roberta-en-base/config.json
wget https://huggingface.co/roberta-base/resolve/main/pytorch_model.bin -O roberta-en-base/pytorch_model.bin
python gen_weight_roberta_like.py \
     --lilt lilt-only-base/pytorch_model.bin \
     --text roberta-en-base/pytorch_model.bin \
     --config roberta-en-base/config.json \
     --out lilt-roberta-en-base

Or combine lilt-only-base with microsoft/infoxlm-base:

mkdir infoxlm-base
wget https://huggingface.co/microsoft/infoxlm-base/resolve/main/config.json -O infoxlm-base/config.json
wget https://huggingface.co/microsoft/infoxlm-base/resolve/main/pytorch_model.bin -O infoxlm-base/pytorch_model.bin
python gen_weight_roberta_like.py \
     --lilt lilt-only-base/pytorch_model.bin \
     --text infoxlm-base/pytorch_model.bin \
     --config infoxlm-base/config.json \
     --out lilt-infoxlm-base
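
To give an intuition for what "combining" two checkpoints means, here is a purely conceptual sketch. It is not the logic of gen_weight_roberta_like.py, which is not reproduced here. It assumes that a combined checkpoint is simply the union of two parameter dictionaries (one for the layout stream, one for the text stream) with no overlapping names. The function `merge_weights` and all parameter names are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import Any, Dict


def merge_weights(layout: Dict[str, Any], text: Dict[str, Any]) -> Dict[str, Any]:
    """Return the union of two parameter dictionaries (conceptual sketch).

    Assumption: the two checkpoints hold disjoint parameter names. A name
    present in both is treated as an error rather than silently overwritten.
    """
    overlap = sorted(set(layout) & set(text))
    if overlap:
        raise ValueError(f"parameter names present in both checkpoints: {overlap}")
    merged = dict(text)    # copy so the inputs are left untouched
    merged.update(layout)  # add the layout-side parameters
    return merged


# Hypothetical parameter names; real checkpoints would be loaded from the
# pytorch_model.bin files and hold tensors instead of lists.
layout_params = {"layout.embeddings.weight": [0.1, 0.2]}
text_params = {"text.embeddings.weight": [0.3, 0.4]}

combined = merge_weights(layout_params, text_params)
print(sorted(combined))
```

Raising on overlapping names is a design choice of this sketch: it makes an accidental clash between the two checkpoints visible instead of letting one side overwrite the other.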

Fine-tuning

Semantic Entity Recognition on FUNSD

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_funsd.py \
        --model_name_or_path lilt-roberta-en-base \
        --tokenizer_name roberta-base \
        --output_dir ser_funsd_lilt-roberta-en-base \
        --do_train \
        --do_predict \
        --max_steps 2000 \
        --per_device_train_batch_size 8 \
        --warmup_ratio 0.1 \
        --fp16

Language-specific (for example, ZH) Semantic Entity Recognition on XFUND

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_ser.py \
        --model_name_or_path lilt-infoxlm-base \
        --tokenizer_name xlm-roberta-base \
        --output_dir ls_ser_xfund_lilt-infoxlm-base \
        --do_train \
        --do_eval \
        --lang zh \
        --max_steps 2000 \
        --per_device_train_batch_size 16 \
        --warmup_ratio 0.1 \
        --fp16

Language-specific (for example, ZH) Relation Extraction on XFUND

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_re.py \
        --model_name_or_path lilt-infoxlm-base \
        --tokenizer_name xlm-roberta-base \
        --output_dir ls_re_xfund_lilt-infoxlm-base \
        --do_train \
        --do_eval \
        --lang zh \
        --max_steps 20000 \
        --per_device_train_batch_size 2 \
        --warmup_ratio 0.1 \
        --fp16
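
When adapting the three commands above to a different number of GPUs, it helps to know the effective batch size and the number of warmup steps each one implies. The sketch below works these out under stated assumptions: one process per GPU, no gradient accumulation, and a warmup length of `ceil(max_steps * warmup_ratio)`, as in the Hugging Face Trainer. Whether the example scripts follow exactly these semantics is an assumption, and the helper `schedule_summary` is hypothetical.

```python
import math
from typing import Tuple


def schedule_summary(
    num_gpus: int,
    per_device_batch: int,
    max_steps: int,
    warmup_ratio: float,
    grad_accum: int = 1,  # assumption: no gradient accumulation
) -> Tuple[int, int]:
    """Return (effective batch size, warmup steps) for one training run."""
    effective_batch = num_gpus * per_device_batch * grad_accum
    warmup_steps = math.ceil(max_steps * warmup_ratio)
    return effective_batch, warmup_steps


# Values copied from the three commands above (4 GPUs each).
runs = {
    "SER on FUNSD": (4, 8, 2000, 0.1),
    "SER on XFUND (ZH)": (4, 16, 2000, 0.1),
    "RE on XFUND (ZH)": (4, 2, 20000, 0.1),
}

for name, args in runs.items():
    batch, warmup = schedule_summary(*args)
    print(f"{name}: effective batch size {batch}, warmup steps {warmup}")
```

Under these assumptions the effective batch sizes are 32, 64, and 8, with 200, 200, and 2000 warmup steps respectively. If you train on fewer GPUs, raising `--per_device_train_batch_size` is one way to keep the effective batch size unchanged, memory permitting.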

Results

Semantic Entity Recognition on FUNSD

Language-specific Fine-tuning on XFUND

Cross-lingual Zero-shot Transfer on XFUND

Multitask Fine-tuning on XFUND

Acknowledgements

This repository benefits greatly from unilm/layoutlmft. Many thanks to its authors for their excellent work.

Citation

If our paper helps your research, please cite it in your publication(s):

@inproceedings{wang2022LiLT,
  title={LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding},
  author={Wang, Jiapeng and Jin, Lianwen and Ding, Kai},
  booktitle={ACL},
  year={2022}
}

Feedback

Suggestions and discussions are very welcome. Please contact the authors by email at eejpwang@mail.scut.edu.cn.
