
GPT-2 (Generative Pre-trained Transformer 2)

The project is simple: we give a phrase, sentence, or question as input, and the application generates a short essay as output. As the name suggests, it is a text-generation application. Example screenshot:


The example above shows the application in action. To handle the UI we imported the Gradio library, which makes this work much easier.
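
Since the repository's exact code isn't shown here, the following is a minimal sketch of how GPT-2 could be served through a Gradio interface using the Hugging Face transformers pipeline (the model name and generation parameters are assumptions, not necessarily the repo's actual values):

```python
# Minimal sketch (assumed, not the repo's exact code): GPT-2 text
# generation exposed through a Gradio interface.
import gradio as gr
from transformers import pipeline

# Load the pretrained GPT-2 model via the text-generation pipeline.
generator = pipeline("text-generation", model="gpt2")

def generate(prompt):
    # Continue the prompt; max_length counts prompt + generated tokens.
    result = generator(prompt, max_length=100, num_return_sequences=1)
    return result[0]["generated_text"]

demo = gr.Interface(fn=generate, inputs="text", outputs="text",
                    title="GPT-2 Text Generator")
demo.launch()
```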

Now let's look at the architecture and working of GPT-2.

As we know, a Transformer consists of encoders and decoders: each encoder block contains a self-attention layer and a feed-forward neural network, while each decoder block contains masked self-attention, encoder-decoder attention, and a feed-forward neural network.

GPT-2 has an elegant architecture in the sense that it does away with the encoders entirely and relies only on decoder blocks; for this reason it is also known as a Transformer-Decoder model. Just as a smartphone keyboard predicts the next word from the sentence we are typing, GPT-2 works in a similar way, only much larger and more sophisticated than what our phones have. The way these models actually work is that after each token is produced, that token is added to the sequence of inputs, and that new sequence becomes the input to the model in its next step. This idea is called "auto-regression".

The key difference between self-attention and masked self-attention is that a self-attention block is allowed to peek at words to the right of the current word (future words), whereas masked self-attention, which GPT-2 uses, can only attend to the current word and the words to its left. GPT performs far better than BERT-style models at text generation because it emits one token at a time and takes that token into account at the next step.
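
As a concrete illustration of this auto-regressive loop, here is a minimal greedy-decoding sketch using the Hugging Face transformers library (the prompt and loop length are illustrative, not taken from this repository):

```python
# Sketch of the auto-regressive loop: each predicted token is appended
# to the input, which becomes the model's input at the next step.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("The transformer architecture", return_tensors="pt")

for _ in range(20):
    with torch.no_grad():
        logits = model(input_ids).logits       # shape: (1, seq_len, vocab_size)
    next_id = logits[0, -1].argmax()           # greedy: most likely next token
    # Append the new token to the input -- this is "auto-regression".
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```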

Steps

First we compute the token embedding (the input embedding) and add the positional encoding to it; after that, the input passes through Transformer decoder blocks only (a small sketch of this input path follows below).
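
A tiny sketch of this input path, assuming the GPT-2 small hyperparameters (vocabulary 50257, embedding size 768, learned positional embeddings); the token ids are arbitrary and purely illustrative:

```python
# Tiny sketch (assumed GPT-2 small sizes) of the input path above:
# token embedding + positional encoding, then decoder blocks only.
import torch
import torch.nn as nn

vocab_size, d_model, max_positions = 50257, 768, 1024

token_emb = nn.Embedding(vocab_size, d_model)   # token (input) embedding
pos_emb = nn.Embedding(max_positions, d_model)  # learned positional encoding

ids = torch.tensor([[464, 3290]])               # example token ids (illustrative)
positions = torch.arange(ids.size(1)).unsqueeze(0)

# The sum of the two embeddings is what flows into the stack of
# decoder blocks; GPT-2 has no encoder side at all.
x = token_emb(ids) + pos_emb(positions)
print(x.shape)  # torch.Size([1, 2, 768])
```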

This was a basic overview of how GPT-2 works. To learn more about the model in depth, visit: https://jalammar.github.io/illustrated-gpt2/
