cchen23/layout_distribution_shift: Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents

Code for the paper "Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents", published in Findings of ACL 2023.

This repo is built on top of VILA.

Installation

git clone git@github.com:cchen23/layout_distribution_shift.git
cd layout_distribution_shift
conda create -n lds python=3.6
conda activate lds
pip install -r requirements.txt

Getting Started

Data download:

To use models that require document images, download the images here and place them in layout_distribution_shift/data/grotoap2_images/
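Before launching an image-based model, it can help to confirm that the download landed in the expected folder. The following is a minimal sketch, not part of the repository: the helper name `count_files` is hypothetical, the path is assumed to be relative to the repository root, and hidden files (such as a placeholder left in the otherwise empty directory) are ignored.

```python
from pathlib import Path


def count_files(directory):
    """Count non-hidden regular files under `directory` (0 if it does not exist)."""
    root = Path(directory)
    if not root.is_dir():
        return 0
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and not path.name.startswith(".")
    )


if __name__ == "__main__":
    # Assumption: this is run from the repository root.
    images_dir = Path("data/grotoap2_images")
    found = count_files(images_dir)
    if found == 0:
        print("No files found in {}; download the images first.".format(images_dir))
    else:
        print("Found {} files in {}.".format(found, images_dir))
```

The check only counts files; it does not verify that they are the images the models expect.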

Performing experiments:

Scripts to perform the initial training phase and few-shot adaptation are in run_scripts/.

To perform the initial training phase:

bash train_publisher_splits.sh [test_publisher_name] [initial_training_learning_rate] [model_name] [fewshot_learning_rate] [random_seed]
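Since the script takes five positional arguments, it is easy to pass them in the wrong order. The sketch below assembles the command from named parameters and prints it by default instead of running it. The helper names (`build_train_command`, `format_command`, `run`) and all argument values are hypothetical placeholders. It also assumes the script is invoked from inside run_scripts/, which the README does not state explicitly.

```python
import shlex
import subprocess


def build_train_command(test_publisher, initial_lr, model_name, fewshot_lr, seed):
    """Return the argv list for the initial training phase.

    The positional order mirrors the usage line above.
    """
    return [
        "bash",
        "train_publisher_splits.sh",
        str(test_publisher),
        str(initial_lr),
        str(model_name),
        str(fewshot_lr),
        str(seed),
    ]


def format_command(cmd):
    """Render an argv list as a shell-safe string for logging."""
    return " ".join(shlex.quote(part) for part in cmd)


def run(cmd, dry_run=True):
    """Print the command, or execute it from run_scripts/ when dry_run is False."""
    if dry_run:
        print(format_command(cmd))
        return None
    # Assumption: the current directory is the repository root.
    return subprocess.run(cmd, cwd="run_scripts", check=True)


if __name__ == "__main__":
    # Placeholder values; substitute a real publisher and model name.
    run(build_train_command("PUBLISHER_NAME", "2e-5", "MODEL_NAME", "1e-5", 0))
```

Keeping `dry_run=True` as the default lets you inspect the exact command line before committing to a long training job.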

After the initial training phase, run few-shot fine-tuning for a specific few-shot episode:

bash train_publisher_splits_fewshot_only.sh [test_publisher_name] [initial_training_learning_rate] [model_name] [fewshot_learning_rate] [fewshot_episode_num] [random_seed]
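The few-shot script handles one episode per call, so covering several episodes means repeating the call with a different episode number. A minimal sketch of generating those commands follows. The helper name `build_fewshot_commands` and the argument values are hypothetical, and the episode range is an assumption, since the README does not say how episodes are numbered or how many exist.

```python
import shlex


def build_fewshot_commands(test_publisher, initial_lr, model_name, fewshot_lr, episodes, seed):
    """Return one argv list per few-shot episode.

    The positional order mirrors the usage line above.
    """
    return [
        [
            "bash",
            "train_publisher_splits_fewshot_only.sh",
            str(test_publisher),
            str(initial_lr),
            str(model_name),
            str(fewshot_lr),
            str(episode),
            str(seed),
        ]
        for episode in episodes
    ]


if __name__ == "__main__":
    # Placeholder values; the episode numbering (0, 1, 2) is an assumption.
    commands = build_fewshot_commands(
        "PUBLISHER_NAME", "2e-5", "MODEL_NAME", "1e-5", range(3), 0
    )
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
```

The publisher, initial learning rate, model name, and seed should presumably match the initial training run, since few-shot fine-tuning starts from that result.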

References

@inproceedings{chen-layout:2023:ACL,
  author={Chen, Catherine and Shen, Zejiang and Klein, Dan and Stanovsky, Gabriel and Downey, Doug and Lo, Kyle},
  title={Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2023},
  year={2023}
}
