
# UTel

## Introduction

This repository provides the code for the paper *Knowing Where and What: Unified Word Block Pretraining for Document Understanding* (UTel).

Our code is based on BROS.

## Pre-trained models

| name | # params |
|---|---|
| utel-base-uncased | 110M |
| utel-large-uncased | 340M |
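
For reference, here is a minimal sketch of loading one of these checkpoints. Since this code is based on BROS, we assume the checkpoints follow the usual `transformers` loading convention; the checkpoint path below is a placeholder, and the exact loading API in this repo may differ.

```python
# Minimal sketch, assuming the UTel checkpoints follow the BROS/transformers
# loading convention. The checkpoint path is a placeholder, not a published
# model name.
from transformers import AutoModel, AutoTokenizer

checkpoint = "path/to/utel-base-uncased"  # local checkpoint directory (assumed)

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

# Sanity check: roughly 110M parameters for the base model, 340M for large.
num_params = sum(p.numel() for p in model.parameters())
print(f"Loaded {num_params / 1e6:.0f}M parameters")
```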

## Fine-tuning on FUNSD

### Prepare data

We conducted the FUNSD entity-extraction (EE) experiment on the FUNSD data as preprocessed by LayoutLM; the original preprocessing code can be found in the LayoutLM repository. To prepare the data, follow the steps below:

1. Move to `preprocess/funsd/`.
2. Run `bash preprocess.sh`.
3. Run `python preprocess_2nd.py`. This script converts the LayoutLM-preprocessed data into the format used by this repo (a conceptual sketch of the conversion follows below).

The converted data will be created in `datasets/funsd/`.
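
For orientation, here is a conceptual sketch of what the conversion in step 3 does, assuming LayoutLM's seq-labeling text format (one `word<TAB>label` line per token, with blank lines separating documents). The file name and exact formats here are assumptions for illustration; see `preprocess_2nd.py` for the actual logic.

```python
# Illustrative sketch only: regrouping LayoutLM-style seq-labeling lines
# into per-document records. The input format (word<TAB>label per line,
# blank line between documents) and the file path are assumptions.
from pathlib import Path

def read_layoutlm_split(path):
    """Parse 'word<TAB>label' lines; blank lines separate documents."""
    documents, words, labels = [], [], []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            if words:
                documents.append({"words": words, "labels": labels})
                words, labels = [], []
            continue
        word, label = line.rsplit("\t", 1)
        words.append(word)
        labels.append(label)
    if words:  # flush the last document
        documents.append({"words": words, "labels": labels})
    return documents

docs = read_layoutlm_split("datasets/funsd/train.txt")  # path assumed
print(f"{len(docs)} documents")
```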

### Perform fine-tuning

Run the command below:

```bash
CUDA_VISIBLE_DEVICES=0 python train.py --config=configs/finetune_funsd_ee_bies.yaml
```
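
The `bies` suffix in the config name refers to the BIES tagging scheme (Begin / Inside / End / Single), a standard way to label entity spans token by token in sequence-labeling EE. Below is a small illustrative sketch of the scheme itself, not code from this repo:

```python
# Illustrative sketch of BIES tagging (not code from this repo): a span of
# n > 1 words gets B-/I-.../E- tags, and a single-word span gets an S- tag.
def bies_tags(num_words: int, entity_type: str) -> list[str]:
    if num_words == 1:
        return [f"S-{entity_type}"]
    return ([f"B-{entity_type}"]
            + [f"I-{entity_type}"] * (num_words - 2)
            + [f"E-{entity_type}"])

print(bies_tags(1, "HEADER"))    # ['S-HEADER']
print(bies_tags(3, "QUESTION"))  # ['B-QUESTION', 'I-QUESTION', 'E-QUESTION']
```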
