
Transformer-from-Scratch

Read my articles for a detailed explanation: How Transformer Works; Build a Transformer from Scratch and Train a Transformer Model.

I chose an English-to-Italian translation task to train my Transformer model on the opus_books dataset from Hugging Face. Training was done on Kaggle with an NVIDIA Tesla P100 (16 GB) GPU and took 5 hours and 11 minutes for 20 epochs, with 3,638 batches per epoch.
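As a minimal sketch of how the English-Italian pairs can be pulled in, assuming the Hugging Face `datasets` library is installed (the exact preprocessing in this repository may differ):

```python
# Minimal sketch: load the English-Italian configuration of opus_books
# from the Hugging Face Hub (pip install datasets).
from datasets import load_dataset

# opus_books only provides a "train" split; validation data is usually
# carved out of it manually before training.
raw_dataset = load_dataset("opus_books", "en-it", split="train")

# Each example carries a "translation" dict with the source and target sentences.
sample = raw_dataset[0]
print(sample["translation"]["en"])  # English sentence
print(sample["translation"]["it"])  # Italian sentence
```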

About

In this repository, I have explained the working of the Transformer architecture, provided the code for building it from scratch, and demonstrated how to train it.
