GPT has become one of the most important and foundational language models today. Its success comes from combining the learning ability of Transformer-based models with OpenAI's ability to train on massive datasets with huge compute budgets. Today, we will implement this architecture in its simplest form, with no frills. We will then train it on the Harry Potter dataset to see if we can start generating our own text!
Definitely take a look at the LSTM for Generation before doing this; it'll give you a pretty good idea of what's going on!