[The mT5 models were not trained with the multitask objective, so it is harder to get good results with a simple fine-tuning procedure.](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md#t511) This notebook runs an experiment to find out what level of accuracy a naive fine-tuning scheme can reach.

In my experiment, AdaFactor with a custom learning rate schedule performed better than both AdamW and AdaFactor with its built-in schedule.
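
The exact schedule lives in the training code and is not reproduced in this notebook. As a rough illustration of what a "custom learning rate schedule" can look like, here is a minimal sketch of a linear warmup followed by a linear decay. The function name `warmup_linear_decay` and the step counts are hypothetical, not taken from the repository; only the base learning rate of `2e-3` matches the training command used later.

```python
def warmup_linear_decay(step: int, total_steps: int, warmup_steps: int) -> float:
    """Return a multiplier in [0, 1] to apply to the base learning rate.

    Hypothetical schedule for illustration; not the one used by train.py.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear warmup: 0 at step 0, 1 at the end of the warmup.
        return step / max(1, warmup_steps)
    # Linear decay: 1 right after the warmup, 0 at the last step.
    remaining = max(0, total_steps - step)
    return remaining / max(1, total_steps - warmup_steps)


base_lr = 2e-3  # same value as the --lr flag in the training cell
total_steps, warmup_steps = 1000, 100  # toy numbers
for step in (0, 50, 100, 550, 1000):
    lr = base_lr * warmup_linear_decay(step, total_steps, warmup_steps)
    print(f"step {step:4d}: lr = {lr:.6f}")
```

A function like this can be wrapped in `torch.optim.lr_scheduler.LambdaLR`. When an external schedule drives the learning rate, the Adafactor implementation in `transformers` is usually created with `relative_step=False` and an explicit `lr`, so that its built-in schedule is switched off.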

## Prepare the environment

In [None]:
!pip install --upgrade pip
!pip uninstall -y allennlp
!pip install transformers==4.1.1 typer
!pip install -U pytorch-lightning
!pip install https://github.com/veritable-tech/pytorch-lightning-spells/archive/master.zip
!pip install -U sentencepiece

## Get the code

In [None]:
!mkdir -p /src/finetuning-t5
!git clone https://github.com/ceshine/finetuning-t5.git -b master /src/finetuning-t5

In [None]:
%cd /src/finetuning-t5/mnli
!git checkout 13b9351

## Preprocess the dataset

In [None]:
!mkdir -p data/multinli_1.0
!cp -r /kaggle/input/multinli-nyu/multinli_1.0/* data/multinli_1.0/
!ls data/multinli_1.0/

In [None]:
!mkdir -p data/kaggle
!cp -r /kaggle/input/contradictory-my-dear-watson/* data/kaggle/
!ls data/kaggle/

## Only keep tokens that we need

In [None]:
!python preprocess/preprocess_kaggle.py
!python preprocess/preprocess_mnli.py
!python utils/reduce_sentencepiece_vocab.py google/mt5-base
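
The idea behind trimming the vocabulary can be sketched in a few lines of plain Python. This is a toy illustration only, under the assumption that the script keeps the token ids that appear in the corpus (plus special tokens) and shrinks the embedding matrix accordingly; the helper names below are hypothetical and do not come from `reduce_sentencepiece_vocab.py`.

```python
from typing import Dict, List, Sequence


def build_kept_ids(corpus_ids: Sequence[Sequence[int]], always_keep: Sequence[int]) -> List[int]:
    """Collect every token id seen in the corpus, plus ids that must survive."""
    kept = set(always_keep)
    for ids in corpus_ids:
        kept.update(ids)
    return sorted(kept)


def remap(ids: Sequence[int], old_to_new: Dict[int, int]) -> List[int]:
    """Translate token ids from the original vocabulary to the reduced one."""
    return [old_to_new[i] for i in ids]


# Toy setup: a 10-token vocabulary with one embedding row per token.
embeddings = [[float(i), float(i)] for i in range(10)]
corpus = [[5, 9, 1], [9, 7, 1]]
special_ids = [0, 1, 2]  # assumed ids for pad / eos / unk

kept_ids = build_kept_ids(corpus, special_ids)
old_to_new = {old: new for new, old in enumerate(kept_ids)}
# Keep only the embedding rows that are still reachable.
reduced_embeddings = [embeddings[i] for i in kept_ids]

print(kept_ids)
print(remap(corpus[0], old_to_new))
print(len(embeddings), "->", len(reduced_embeddings), "rows")
```

For a multilingual model such as mT5, most of the vocabulary covers languages that never appear in an English-only dataset, so dropping the unused rows can shrink the embedding matrix considerably.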

## Tokenize

In [None]:
!mkdir -p cache/multinli
!python preprocess/tokenize_dataset.py multinli --tokenizer-name cache/mt5-base

## Fine-tune the model

In [None]:
!SEED=3313 python train.py --batch-size 64 --grad-accu 1 --max-len 128 --epochs 2 --t5-model cache/mt5-base \
    --lr 2e-3 --dataset multinli --disable-progress-bar --valid-frequency 0.5 --freeze-embeddings
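
To get a feel for the length of this run, the number of optimizer steps can be estimated from the flags above. This is a back-of-the-envelope sketch: it assumes the full published MultiNLI training set (392,702 examples) is used and that `--valid-frequency 0.5` means "validate every half epoch". Neither assumption is confirmed by this notebook, and the helper name is hypothetical.

```python
import math


def optimizer_steps_per_epoch(n_examples: int, batch_size: int, grad_accu: int) -> int:
    """Number of optimizer updates in one pass over the data."""
    batches = math.ceil(n_examples / batch_size)
    # With gradient accumulation, several batches share one update.
    return math.ceil(batches / grad_accu)


n_train = 392_702  # published MultiNLI training set size; preprocessing may change it
steps = optimizer_steps_per_epoch(n_train, batch_size=64, grad_accu=1)
total_steps = steps * 2            # --epochs 2
valid_every = int(steps * 0.5)     # assumed meaning of --valid-frequency 0.5

print(f"{steps} steps per epoch, {total_steps} in total, validation every {valid_every} steps")
```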

## Calculate accuracy on the "matched" dev set

In [None]:
!python evaluate.py cache/mt5-base_best --corpus multinli --split-name test_matched --batch-size 32

## Calculate accuracy on the "mismatched" dev set

In [None]:
!python evaluate.py cache/mt5-base_best --corpus multinli --split-name test_mismatched --batch-size 32
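
Because T5-style models produce the label as text, accuracy comes down to comparing the generated string with the gold label. The sketch below shows that calculation on made-up predictions. It is not the code in `evaluate.py`, and the label strings are assumed to be the usual MultiNLI class names.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    # Strip whitespace so that " neutral" and "neutral" count as the same label.
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up model outputs and gold labels, for illustration only.
preds = ["entailment", "neutral", "contradiction", " neutral"]
gold = ["entailment", "contradiction", "contradiction", "neutral"]
print(f"accuracy = {accuracy(preds, gold):.2f}")
```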

## Export the TensorBoard log files and the trained model

In [None]:
!mv cache/tb_logs /kaggle/working
!mv cache/mt5-base_best /kaggle/working