simtony/BART-word-orderer

Introduction

This repo contains the implementation for the COLING 2022 paper On the Role of Pre-trained Language Models in Word Ordering: A Case Study with BART. It achieves state-of-the-art results on the classic word ordering task and the partial tree linearization task. Here is a short oral presentation for a quick grasp of the gist.

The implementation is based on fairseq. To see our modifications, compare the HEAD commit with the "init with fairseq v0.10.2" commit, which is identical to the v0.10.2 tag of fairseq. Note that the implementation is for research purposes only, and there is substantial room for efficiency improvements.

Analysis with structural probing is based on structural-probes. As the analysis follows the default settings exactly, we only provide code to extract the relevant token features.

Dataset

The license of the Penn Treebank prevents us from releasing the dataset, so we only include data samples in ./ptb_trees. Feel free to contact the first author at simtony2@gmail.com with proof (e.g., a screenshot) that you have a copy of the Penn Treebank dataset, and we will send you the full preprocessed copy.

Hardware Requirements

Make sure your GPU supports fp16 and has large memory (e.g., 24GB). For reference, RAND results were produced on a 2080Ti and BART results on a 32GB V100. Decoding with a large beam size (e.g., 1024) was run on an 80GB A100.

Dependency

The results were produced with torch==1.10. You also need mlrunner==0.5.8 to run the experiments and multiset for the analysis.
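
For example, the Python dependencies can be installed roughly as follows. This is only a sketch: the exact torch wheel depends on your CUDA setup, and fairseq itself comes from installing this repo in step 1 below.

```bash
# Rough dependency install (adjust the torch build to your CUDA version).
pip install torch==1.10.0     # the paper's results were produced with torch 1.10
pip install mlrunner==0.5.8   # experiment manager used in the steps below
pip install multiset          # only needed for the analysis notebooks
```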

Steps to reproduce

  1. Pull the current repo and install the code base following the fairseq instructions. Change your working directory to the root of this repo.

  2. Download the BART model files

    1. https://dl.fbaipublicfiles.com/fairseq/models/bart.base.tar.gz
    2. https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json
    3. https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe

    and extract/place the contents in ./bart (see also the sketch after this list).

  3. Prepare the datasets with prepare_tree_raw.ipynb and prepare_tree_bin.ipynb.

  4. For convenience we manage our experiments with mlrunner. See the comments in the params*.yaml files for the hyperparameters of each experiment. Use run -y params.yaml -t <title> -o output to train the RAND models for selected experiments (specified in <title>) and run -y params_decode.yaml -t <title> -o output_decode to decode. BART results can be reproduced similarly with params_bart*.yaml. See the mlrunner documentation for detailed usage. If you prefer raw bash commands, use --dry-run to obtain them; a minimal end-to-end sketch is also given after this list.

  5. Follow analysis.ipynb to aggregate the results, and extract.ipynb to extract intermediate features for structural probing.
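
For reference, here is a minimal bash sketch of steps 2 and 4. The URLs and run flags are exactly those given above; the directory layout and output names are only examples.

```bash
# Step 2 (sketch): fetch BART base and the GPT-2 BPE files into ./bart.
mkdir -p bart
wget https://dl.fbaipublicfiles.com/fairseq/models/bart.base.tar.gz
tar -xzvf bart.base.tar.gz                  # extracts a bart.base/ directory
cp bart.base/* bart/                        # place the extracted contents in ./bart
wget -P bart https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json
wget -P bart https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe

# Step 4 (sketch): train, decode, or just print the raw commands with mlrunner.
run -y params.yaml -t <title> -o output                  # train RAND models for the experiments in <title>
run -y params_decode.yaml -t <title> -o output_decode    # decode the trained models
run -y params.yaml -t <title> -o output --dry-run        # print the underlying bash commands instead of running them
```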

For reference, we also include the logs and checkpoints of each experiment on Google Drive; you can use TensorBoard to visualize the training process.
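
Assuming the downloaded logs (or your own output directory) contain the TensorBoard event files, they can be visualized with, e.g.:

```bash
tensorboard --logdir output    # example path; point it at wherever the event files live
```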
