
Finetuning-mT5-and-DistilGPT2-for-paraphrasing-tasks

This repository is part of my master's thesis. A custom training dataset was created for paraphrase tasks by combining the TaPaCo and PAWS paraphrase datasets. The repository also shows how two transformer models (mT5 and DistilGPT2) were finetuned on this dataset for paraphrasing.

The custom dataset is available as dataset_comb.csv in the folder Dataset comb(TaPaCo +PAWS) and is freely accessible. The notebook Databases combination + finetuning DistilGPT2 .ipynb contains the code for generating the combined dataset, as well as the code for finetuning DistilGPT2 with the training API of the Hugging Face Transformers library. The notebook Finetuning mT5.ipynb shows how the mT5 model was finetuned with the Simple Transformers library.
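For reference, the sketch below outlines one way such a combined CSV can be built with the Hugging Face datasets library. The dataset IDs ("paws", "tapaco"), the filtering choices and the output column names input_text / target_text are assumptions for illustration; the actual preprocessing is the one in the notebook.

```python
# Illustrative sketch only: builds paraphrase pairs from PAWS and TaPaCo and
# writes them to a CSV. Column names and filtering choices are assumptions;
# see Databases combination + finetuning DistilGPT2 .ipynb for the real code.
from itertools import combinations

import pandas as pd
from datasets import load_dataset

# PAWS: keep only sentence pairs labelled as paraphrases (label == 1).
paws = load_dataset("paws", "labeled_final", split="train")
paws_pairs = [(r["sentence1"], r["sentence2"]) for r in paws if r["label"] == 1]

# TaPaCo: sentences sharing a paraphrase_set_id are paraphrases of each other,
# so pair them up within each set.
tapaco = load_dataset("tapaco", "en", split="train")
groups = {}
for r in tapaco:
    groups.setdefault(r["paraphrase_set_id"], []).append(r["paraphrase"])
tapaco_pairs = [p for sents in groups.values() for p in combinations(sents, 2)]

df = pd.DataFrame(paws_pairs + tapaco_pairs, columns=["input_text", "target_text"])
df.to_csv("dataset_comb.csv", index=False)
```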

Models used

mT5 is a multilingual variant of Google's original T5 model with an encoder-decoder architecture, pre-trained on a very large corpus covering many languages. DistilGPT2 is a GPT-2-based model obtained through knowledge distillation, which reduces the number of parameters and compresses the model. It is a decoder-only model following the architecture of the generative GPT models and is the lightest member of its family.
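Both base checkpoints are available on the Hugging Face Hub. A minimal loading sketch follows; the mT5 size shown (google/mt5-small) is only an example, and the notebooks specify which checkpoints were actually used.

```python
# Loading the two pre-trained checkpoints from the Hugging Face Hub.
from transformers import (
    MT5ForConditionalGeneration,  # encoder-decoder, multilingual
    MT5Tokenizer,
    GPT2LMHeadModel,              # decoder-only
    GPT2Tokenizer,
)

mt5_tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")
mt5_model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

gpt2_tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
gpt2_model = GPT2LMHeadModel.from_pretrained("distilgpt2")
```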

Hyperparameters used for finetuning

The values chosen for the number of epochs, batch size, warm-up ratio and learning rate are listed below, each followed by a sketch of how they map onto the corresponding library's configuration. For the remaining hyperparameters, refer to the respective notebooks.

mT5

• Early stopping was used with a maximum of 5 epochs

• The batch size was 8

• The warm-up ratio was set to 0.6

• A learning rate of 1e-4 was used
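In Simple Transformers these values map onto the T5 model arguments roughly as sketched below. The checkpoint name google/mt5-base and the remaining settings are assumptions for illustration; the complete configuration is in Finetuning mT5.ipynb.

```python
# Hedged sketch of a Simple Transformers setup for mT5 using the values above.
from simpletransformers.t5 import T5Args, T5Model

model_args = T5Args()
model_args.num_train_epochs = 5             # upper bound; early stopping decides when to stop
model_args.use_early_stopping = True
model_args.evaluate_during_training = True  # early stopping needs evaluation during training
model_args.train_batch_size = 8
model_args.warmup_ratio = 0.6
model_args.learning_rate = 1e-4

model = T5Model("mt5", "google/mt5-base", args=model_args)
# model.train_model(train_df, eval_data=eval_df)  # expects prefix / input_text / target_text columns
```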

DistilGPT2

• The learning rate was set to 1e-5

• The model was trained for 5 epochs

• A warm-up ratio of 0.6 was used

• The batch size was set to 8
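With the Hugging Face Transformers library, the same values go into TrainingArguments. The sketch below is illustrative only; the output directory, the tokenised dataset and the remaining arguments are assumptions, and the actual setup is in Databases combination + finetuning DistilGPT2 .ipynb.

```python
# Hedged sketch of a Trainer configuration for DistilGPT2 with the values above.
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 has no padding token by default
model = GPT2LMHeadModel.from_pretrained("distilgpt2")

training_args = TrainingArguments(
    output_dir="distilgpt2-paraphrase",     # hypothetical output directory
    num_train_epochs=5,
    per_device_train_batch_size=8,
    learning_rate=1e-5,
    warmup_ratio=0.6,
)

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=tokenized_train,  # hypothetical tokenised dataset
#                   data_collator=data_collator)
# trainer.train()
```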