This is a multi-task learning framework for the SuperGLUE benchmark. We focus on designing multi-task training schemes and data augmentation techniques for large pre-trained language models such as BERT, RoBERTa, and XLNet.
Currently, most of the models are adapted from PyTorch-Transformers, maintained by Hugging Face.
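The core idea behind such a framework is a single shared encoder with one lightweight classifier head per task. Below is a minimal sketch of that pattern, assuming a BERT backbone from pytorch-transformers; the `MultiTaskModel` class and its layout are illustrative, not this repository's actual API.

```python
import torch
import torch.nn as nn
from pytorch_transformers import BertModel

class MultiTaskModel(nn.Module):
    """Sketch: one shared pre-trained encoder, one linear head per task."""

    def __init__(self, model_name, task_num_labels):
        super().__init__()
        # Shared encoder, fine-tuned jointly across all tasks.
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # One small classifier head per task, keyed by task name.
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden, n) for task, n in task_num_labels.items()
        })

    def forward(self, task, input_ids, attention_mask=None):
        # pytorch-transformers returns (sequence_output, pooled_output, ...).
        _, pooled = self.encoder(input_ids, attention_mask=attention_mask)[:2]
        return self.heads[task](pooled)

# Example: three SuperGLUE tasks sharing one bert-base encoder.
model = MultiTaskModel('bert-base-uncased',
                       {'BoolQ': 2, 'CB': 3, 'RTE': 2})
```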
This repository is under construction. Remaining TODOs:
- Inference-time task selection (according to the task_id, load only the specified classifier head instead of all heads); see the sketch after this list.
- Output predictions in the SuperGLUE submission format.
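As a rough illustration of the first TODO, inference-time task selection could filter a checkpoint's state dict down to the shared encoder plus the one head named by the task. The `heads.<task>.` key prefix below is a hypothetical naming convention, not this repository's actual checkpoint format.

```python
import torch

def load_for_task(checkpoint_path, task_id):
    # Assumes the checkpoint is a plain state dict whose per-task head
    # parameters are stored under a 'heads.<task>.' prefix (hypothetical).
    state = torch.load(checkpoint_path, map_location='cpu')
    wanted = 'heads.%s.' % task_id
    # Keep all encoder weights plus only the requested head's weights.
    return {k: v for k, v in state.items()
            if not k.startswith('heads.') or k.startswith(wanted)}
```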
First, make sure to install the necessary packages:
```bash
pip install -r requirements.txt
```
Then install PyTorch-Transformers, following the instructions at https://github.com/huggingface/pytorch-transformers#installation.
Configure the environment variables to specify the data and experiment directories for checkpointing:
```bash
export SG_DATA=./data/superglue/
export SG_MTL_EXP=./exp/
```
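For reference, a training script would typically pick these up via `os.environ`; the defaults below simply mirror the suggested paths and are an assumption about how this repository reads them.

```python
import os

# Fall back to the paths suggested above if the variables are unset.
data_dir = os.environ.get("SG_DATA", "./data/superglue/")
exp_dir = os.environ.get("SG_MTL_EXP", "./exp/")
```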
Run demo training on a small set of toy examples:
```bash
python run_main.py --demo \
  --tasks=BoolQ,CB,RTE \
  --do_train \
  --model_type=bert \
  --model_name=bert-base-uncased \
  --do_lower_case \
  --max_seq_length=128 \
  --output_dir=mtl-boolq-cb-rte_bert-base_max-seq-len-128_lr-1e-5 \
  --batch_size=16 \
  --logging_freq=2 \
  --warmup_steps=0 \
  --learning_rate=1e-5 \
  --max_grad_norm=5.0 \
  --num_train_epochs=15
```
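For context on what a multi-task training scheme like this does under the hood, here is a sketch of one common batch-scheduling strategy (proportional task sampling). It is illustrative only; this repository's actual scheduler may differ.

```python
import random

def multitask_batches(loaders, epochs=1):
    """Yield (task, batch) pairs, sampling tasks in proportion to size.

    loaders: {task_name: DataLoader-like iterable supporting len()}.
    """
    for _ in range(epochs):
        iters = {t: iter(dl) for t, dl in loaders.items()}
        # Each task appears in the schedule once per batch it owns, so
        # larger tasks are visited proportionally more often.
        schedule = [t for t, dl in loaders.items() for _ in range(len(dl))]
        random.shuffle(schedule)
        for task in schedule:
            yield task, next(iters[task])
```

A training loop would then route each batch through the head for its task, e.g. `logits = model(task, **batch)` with the `MultiTaskModel` sketch above.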