Basic transformer tutorial

The Naive-Transformer

This tutorial implements a naive Transformer: a PyTorch implementation of the Transformer model from "Attention Is All You Need" (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arXiv, 2017).
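
The core operation of that model is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. The following is a minimal sketch of it in PyTorch, not the code from this repository; the function name, the tensor shapes, and the mask convention (0 means "do not attend") are assumptions chosen for illustration.

```python
import math

import torch


def scaled_dot_product_attention(q, k, v, mask=None):
    """Hypothetical helper: attention over the last two dimensions.

    q, k, v: tensors of shape (batch, seq_len, d_k).
    mask: optional tensor broadcastable to (batch, seq_len, seq_len);
          positions where mask == 0 are excluded (assumed convention).
    """
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Each row of weights sums to 1 over the key positions.
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, v), weights


torch.manual_seed(0)
q = torch.randn(2, 5, 8)
k = torch.randn(2, 5, 8)
v = torch.randn(2, 5, 8)
out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape, weights.shape)
```

Note that a row whose positions are all masked would produce NaN weights in this sketch; a full implementation would need to handle that case.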


Requirements:

  • python 3.5+
  • pytorch 0.4.1+
  • tqdm
  • numpy


The dataset is a toy dataset, generated randomly with NumPy.
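
As a rough illustration of what such a random dataset could look like, the sketch below draws integer token ids with NumPy. The function name, the sizes, the vocabulary of 100 tokens, and the reserved padding id 0 are all assumptions made for this example; the repository's actual generation code may differ.

```python
import numpy as np


def make_random_dataset(num_samples=1000, seq_len=10, vocab_size=100, seed=0):
    """Hypothetical helper: random source/target token sequences."""
    rng = np.random.RandomState(seed)
    # Token id 0 is reserved for padding here (an assumption),
    # so ids are sampled from 1 .. vocab_size - 1.
    src = rng.randint(1, vocab_size, size=(num_samples, seq_len))
    tgt = rng.randint(1, vocab_size, size=(num_samples, seq_len))
    return src, tgt


src, tgt = make_random_dataset()
print(src.shape, tgt.shape)
```

Because source and target are drawn independently here, there is no pattern for a model to learn; such data is useful mainly for checking that the training code runs end to end.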

The epoch is 1/10:
The average loss is 4.904755330085754.
The epoch is 2/10:
The average loss is 4.624333620071411.
The epoch is 3/10:
The average loss is 4.62061635017395.
The epoch is 4/10:
The average loss is 4.61753402709961.
The epoch is 5/10:
The average loss is 4.616219253540039.
The epoch is 6/10:
The average loss is 4.615866875648498.
The epoch is 7/10:
The average loss is 4.614339418411255.
The epoch is 8/10:
The average loss is 4.613479561805725.
The epoch is 9/10:
The average loss is 4.613105096817017.
The epoch is 10/10:
The average loss is 4.613206758499145.
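
One way to read this log: when targets are random, a model can do no better than predict a uniform distribution, which gives a cross-entropy of ln(V) for V classes. The sketch below compares the final logged loss with that floor for V = 100. The value 100 is an assumption (the vocabulary size is not stated here), chosen only because ln(100) is close to the plateau.

```python
import math


def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform guess over num_classes classes."""
    return math.log(num_classes)


final_loss = 4.613206758499145       # last value from the log above
floor = uniform_cross_entropy(100)   # assumed vocabulary size of 100
gap = final_loss - floor
print("floor = %.6f, gap = %.6f" % (floor, gap))
```

If that assumption holds, the loss flattening after the second epoch would be expected for random data rather than a sign of a training bug.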


MIT © Sohone Guo
