
On Evaluating the Robustness of Language Models with Tuning

Authors: Colin Wang, Lechuan Wang, Yutong Luo

Website: https://rachelluoyt.github.io/T5_SQuAD_Prompt_Tuning/

Pipeline for DSC 180B (not normally used by us, but required by DSC 180B, which mandates a specific submission format)

Build a container from the zwcolin/180_method5:latest Docker image. Clone the repo, then run python run.py test from the root folder. Warning: a lot of time may be spent downloading the data, pretrained model, and tokenizer, and preparing the dataset. The full evaluation takes roughly 30 minutes (not including downloading and building the dataset), so we have modified the script to evaluate only the first 10 examples, which takes around 30 seconds to initialize and process. If you want full results, expect a long wait. Alternatively, existing training/testing logs are provided inside the prompt_tuning folder; you can inspect those instead of running the code.
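A minimal sketch of these steps, assuming the image is pulled from Docker Hub and the repository lives at the GitHub path implied by its name (both are assumptions about the exact hosting):

```bash
# Pull the prebuilt image named in the instructions above
docker pull zwcolin/180_method5:latest

# Clone the repository (URL inferred from the repo name; may differ)
git clone https://github.com/zwcolin/Domain-Robustness-Prompt-Tuning.git
cd Domain-Robustness-Prompt-Tuning

# Run the test target from the repository root;
# data, model, and tokenizer downloads happen on first run
python run.py test
```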

Internal Pipeline

Manipulating Model in run.py

Training & Testing

  • Simply run bash experiment.sh, modifying the experiment metadata and hyperparameters as necessary (see the sketch below). The pipeline supports both single- and multi-GPU configurations on a single server instance.
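A minimal sketch of how such a script might be organized; the variable names and command-line flags below are hypothetical illustrations, not the actual interface of experiment.sh:

```bash
#!/bin/bash
# Hypothetical experiment metadata and hyperparameters;
# the real variable names in experiment.sh may differ.
MODEL="t5-base"
PROMPT_LENGTH=100
LEARNING_RATE=0.3
EPOCHS=10

# Launch training; torchrun scales from one GPU to several
# on a single machine via --nproc_per_node.
torchrun --nproc_per_node=4 run.py train \
    --model "$MODEL" \
    --prompt_length "$PROMPT_LENGTH" \
    --lr "$LEARNING_RATE" \
    --epochs "$EPOCHS"
```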

Deployment

A Dockerfile is provided in the root folder to set up the Docker environment. Note that this Dockerfile has only been tested on UCSD's DataHub; use it with caution.
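A minimal sketch of building and entering the environment with standard Docker tooling (the image tag is arbitrary):

```bash
# Build the image from the Dockerfile in the repository root
docker build -t domain-robustness-prompt-tuning .

# Start an interactive container with GPU access;
# --gpus requires the NVIDIA Container Toolkit on the host
docker run --gpus all -it domain-robustness-prompt-tuning bash
```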

DSC 180B Specific Instructions

We don't strictly follow the suggested structure (a test folder with a testdata folder inside it), which is too rigid. Instead, all training and test data are stored inside the data folder, and experiment.sh contains all the code necessary to build and test the model. We also avoid the suggested pattern of running test.py with command-line arguments: it is not suitable for a deep learning project, where there are many possible arguments and you would end up typing test.py -xx -xx -xx -xx, repeating flags dozens of times.
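To illustrate the tradeoff (the flags below are hypothetical, chosen only to show the shape of the problem):

```bash
# What we avoid: one long, error-prone command line per run
python test.py --model t5-base --prompt_length 100 --lr 0.3 \
    --batch_size 16 --dataset squad --seed 42

# What we do instead: every argument lives in experiment.sh,
# so a run is reproducible with a single short command
bash experiment.sh
```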

Reference

The script is based on the following paper:

@misc{lester2021power,
  title={The Power of Scale for Parameter-Efficient Prompt Tuning},
  author={Brian Lester and Rami Al-Rfou and Noah Constant},
  year={2021},
  eprint={2104.08691},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
