Transformer model implementation from scratch

In this project, I implement the Transformer model proposed in the paper "Attention Is All You Need". I used the following stack to build the project:

- Python
- PyTorch
- Hugging Face Tokenizers
- Hugging Face Opus Books dataset
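
For orientation, the core operation of the paper, scaled dot-product attention, can be sketched in PyTorch roughly as follows. This is a minimal illustration and not necessarily how this project's code is organized: the function name `scaled_dot_product_attention`, the tensor shapes, and the causal mask in the usage lines are all assumptions chosen for the example.

```python
import math

import torch


def scaled_dot_product_attention(query, key, value, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Hypothetical helper for illustration. Inputs are assumed to have
    shape (batch, seq_len, d_k); mask is broadcastable to
    (batch, seq_len, seq_len), with 0 marking positions to hide.
    """
    d_k = query.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = query @ key.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Hidden positions get -inf so softmax assigns them zero weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    # Weighted sum of the values, plus the weights for inspection.
    return weights @ value, weights


# Usage with random tensors (shapes are assumptions for the example).
torch.manual_seed(0)
q = torch.randn(2, 5, 16)
k = torch.randn(2, 5, 16)
v = torch.randn(2, 5, 16)
causal_mask = torch.tril(torch.ones(5, 5))  # each position sees itself and earlier ones
output, weights = scaled_dot_product_attention(q, k, v, mask=causal_mask)
```

Here `output` has the same shape as `v`, and each row of `weights` sums to 1. The full model in the paper wraps this operation in multi-head attention, with separate linear projections per head.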