
superglue-mtl

This is a multi-task learning framework for the SuperGLUE benchmark. We mainly focus on designing multi-task training schemes and data augmentation techniques for large pre-trained language models such as BERT, RoBERTa, and XLNet. Currently, most of the models are adapted from PyTorch-Transformers, maintained by Hugging Face.
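
The shared-encoder, per-task-head pattern described above might look roughly like the following; `MultiTaskBert`, the task names, and the label counts are illustrative assumptions, not the repository's actual classes:

    import torch.nn as nn
    from pytorch_transformers import BertModel

    class MultiTaskBert(nn.Module):
        """Shared BERT encoder with one classification head per task.

        Illustrative sketch only; the repository may organize its
        heads differently.
        """

        def __init__(self, model_name, num_labels):
            # num_labels: e.g. {"BoolQ": 2, "CB": 3, "RTE": 2}
            super().__init__()
            self.bert = BertModel.from_pretrained(model_name)  # shared encoder
            hidden = self.bert.config.hidden_size
            # One linear classifier per task, keyed by task name.
            self.heads = nn.ModuleDict(
                {task: nn.Linear(hidden, n) for task, n in num_labels.items()}
            )

        def forward(self, task, input_ids, attention_mask=None):
            outputs = self.bert(input_ids, attention_mask=attention_mask)
            pooled = outputs[1]  # pooled [CLS] representation
            return self.heads[task](pooled)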

This repository is under construction.

Todo

  1. Inference-time task selection (according to the task_id, load only the specified classifier head instead of all of them; a sketch follows below)
  2. Output predictions in the SuperGLUE submission format
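
A hypothetical sketch of item 1, assuming the `MultiTaskBert`/`heads` naming from the example above; `load_for_task` is not an existing function in this repo:

    import torch

    def load_for_task(model, checkpoint_path, task_id):
        """Hypothetical sketch of todo item 1: keep the shared encoder
        weights plus only the classifier head for `task_id`, dropping
        the other tasks' heads at inference time.
        """
        state = torch.load(checkpoint_path, map_location="cpu")
        wanted = "heads.%s." % task_id
        filtered = {
            k: v for k, v in state.items()
            if not k.startswith("heads.") or k.startswith(wanted)
        }
        # strict=False because the unused heads are intentionally absent.
        model.load_state_dict(filtered, strict=False)
        return model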

Quick Start

First, make sure the necessary packages are installed:

pip install -r requirements.txt

Then install PyTorch-Transformers: https://github.com/huggingface/pytorch-transformers#installation

Configure the environment variables to specify the data and experiment directories for checkpointing:

export SG_DATA=./data/superglue/
export SG_MTL_EXP=./exp/
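
The scripts are expected to pick these variables up at runtime; a minimal sketch of the assumed lookup (variable names match the exports above):

    import os

    # The training script presumably resolves its paths from these
    # environment variables at startup.
    data_dir = os.environ["SG_DATA"]     # SuperGLUE task data
    exp_dir = os.environ["SG_MTL_EXP"]   # checkpoints and experiment logs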

Run demo training on a small set of toy examples:

python run_main.py --demo \
    --tasks=BoolQ,CB,RTE \
    --do_train \
    --model_type=bert \
    --model_name=bert-base-uncased \
    --do_lower_case \
    --max_seq_length=128 \
    --output_dir=mtl-boolq-cb-rte_bert-base_max-seq-len-128_lr-1e-5 \
    --batch_size=16 \
    --logging_freq=2 \
    --warmup_steps=0 \
    --learning_rate=1e-5 \
    --max_grad_norm=5.0 \
    --num_train_epochs=15
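
For context, one common multi-task training scheme is to shuffle task batches in proportion to dataset size within each epoch. The sketch below shows only that general pattern; it is not necessarily what run_main.py implements:

    import random

    def mixed_task_batches(loaders):
        """Minimal sketch of proportional task sampling for one epoch.

        loaders: dict mapping task name -> a DataLoader whose len()
        is its number of batches.
        """
        iters = {task: iter(dl) for task, dl in loaders.items()}
        # One schedule entry per batch, so tasks are sampled in
        # proportion to their dataset sizes.
        schedule = [task for task, dl in loaders.items() for _ in range(len(dl))]
        random.shuffle(schedule)
        for task in schedule:
            yield task, next(iters[task])

Each yielded (task, batch) pair would then be routed to the matching classifier head, e.g. logits = model(task, **batch) with the illustrative model class above.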
