Final project for the SJTU graduate course CS7347 (NLU), 2023: an NLI task
- Split the training set into a training set and a validation set
- Create an environment
conda create -n nlu_project
conda activate nlu_project
pip install -r requirements.txt
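The train/valid split mentioned above can be sketched as follows. This is a minimal illustration, not the project's own script: the function name `split_train_valid`, the 10% validation ratio, and the fixed seed are all assumptions, and the toy tuples stand in for real premise/hypothesis/label examples.

```python
import random


def split_train_valid(examples, valid_ratio=0.1, seed=42):
    """Shuffle a copy of `examples` and split it into (train, valid).

    `valid_ratio` and `seed` are hypothetical defaults, not values
    taken from the project.
    """
    if not 0.0 < valid_ratio < 1.0:
        raise ValueError("valid_ratio must be between 0 and 1")
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_ratio))
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    # Toy (premise, hypothesis, label) tuples standing in for the dataset.
    data = [(f"premise {i}", f"hypothesis {i}", i % 3) for i in range(100)]
    train, valid = split_train_valid(data)
    print(len(train), len(valid))
```

Fixing the seed means the validation set stays the same between experiments, so validation scores from different runs remain comparable.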
python train_bert_base.py --exp_desc "only_TNLI" \
--do_adv 0 \
--lr 5e-5 \
--epochs 10 \
--batch_size 64
python train_bert_base.py --exp_desc "only_TNLI" \
--do_adv 1 \
--lr 5e-5 \
--epochs 10 \
--batch_size 64 \
--adv_train_type "fgm"
do_adv switches adversarial training on or off, and adv_train_type selects the type of adversarial training (for example "fgm").
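For readers unfamiliar with FGM (Fast Gradient Method), the core step is to perturb the word embeddings along the L2-normalized gradient of the loss. The sketch below shows only that perturbation in plain Python on flat lists. It is not the code in train_bert_base.py, and the names `fgm_perturbation`, `apply_perturbation`, and the default `epsilon` are assumptions made for illustration.

```python
import math


def fgm_perturbation(grad, epsilon=1.0):
    """Return r_adv = epsilon * g / ||g||_2 for a flat gradient vector.

    `epsilon` is a hypothetical default, not a value from the project.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0 or math.isnan(norm):
        # No usable gradient direction: leave the embedding unperturbed.
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


def apply_perturbation(embedding, r_adv):
    """Add the adversarial perturbation to an embedding vector."""
    return [e + r for e, r in zip(embedding, r_adv)]


if __name__ == "__main__":
    embedding = [1.0, 1.0]
    grad = [3.0, 4.0]  # gradient of the loss w.r.t. the embedding
    r_adv = fgm_perturbation(grad, epsilon=1.0)
    print(apply_perturbation(embedding, r_adv))
```

In a typical FGM training step, the model first runs a normal forward and backward pass, then the embeddings are perturbed as above, a second forward and backward pass accumulates the adversarial gradients, the original embeddings are restored, and only then does the optimizer step.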
Use a T5 model to paraphrase the sentences in order to obtain more sentence pairs.
python data_augmentation.py --st_point 0 \
--ed_point 10000 \
--augmented_des 0
# st_point and ed_point are the start and end indices of the part of the dataset to be processed.
# augmented_des is a description (tag) for this augmentation run.
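Because the script processes one slice of the dataset per call, a large dataset can be covered by several calls with consecutive ranges. The sketch below only generates such ranges and prints the matching commands. The helper `chunk_ranges`, the dataset size of 25000, the assumption that ed_point is exclusive, and the use of the chunk index as augmented_des are all hypothetical and should be checked against data_augmentation.py.

```python
def chunk_ranges(total, chunk_size):
    """Yield (st_point, ed_point) pairs covering [0, total).

    Assumes ed_point is exclusive; adjust if the script treats it
    as inclusive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total)


if __name__ == "__main__":
    # 25000 is a made-up dataset size used only for illustration.
    for idx, (st, ed) in enumerate(chunk_ranges(25000, 10000)):
        print(
            f"python data_augmentation.py --st_point {st} "
            f"--ed_point {ed} --augmented_des {idx}"
        )
```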
Then, train the bert-base model as described in 1 and 2.
- Fine-tuning the teacher model:
python ftdebert.py
- Knowledge Distillation
python distillation.py --teacher_cache_dir $HOME/sileod_deberta_base_best/$finetuned_teacher_model \
--alpha 0.2 \
--temperature 2.0 \
--lr 5e-5 \
--epochs 25
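The roles of alpha and temperature can be illustrated with the common knowledge distillation loss: a weighted sum of the cross-entropy on the gold label and the temperature-scaled KL divergence between teacher and student distributions. The sketch below is plain Python for a single example and is not taken from distillation.py. In particular, it assumes that alpha weights the hard-label term and (1 - alpha) the soft term, and the script may use the opposite convention.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.2, temperature=2.0):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student).

    Assumption: alpha weights the hard-label loss.
    """
    # Hard loss: cross-entropy of the student against the gold label.
    hard = -math.log(softmax(student_logits)[label])
    # Soft loss: KL divergence between temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
        if t > 0.0
    )
    # T^2 keeps the soft-term gradients on a scale comparable to the hard term.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


if __name__ == "__main__":
    # Toy 3-class logits (e.g. entailment / neutral / contradiction).
    print(distillation_loss([1.0, 0.5, -0.5], [2.0, 0.1, -1.0], label=0))
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative scores on the wrong classes rather than only from its top prediction.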
python test_bert_base.py --path_trained_model $HOME/student_bert_base/best_model