skap3214/AGPT


AGPT

Goal

I want to create an LLM from scratch. My goal for this project is to simultaneously learn and develop an LLM by myself. There is only one constraint: I must understand everything that I implement.

So far this is a simple decoder-only model that uses masked multi-head attention to generate responses. I followed Andrej Karpathy's makemore series to create this teeny tiny model.
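The masked (causal) multi-head attention at the heart of a decoder-only block can be sketched roughly as below. This is a minimal PyTorch sketch, not the actual model.py; the class and parameter names (`embed_dim`, `num_heads`, `block_size`) are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedMultiHeadAttention(nn.Module):
    """Causal (masked) multi-head self-attention for a decoder-only block."""

    def __init__(self, embed_dim, num_heads, block_size):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.proj = nn.Linear(embed_dim, embed_dim)
        # Lower-triangular mask: each position attends only to itself and the past.
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # Reshape to (B, num_heads, T, head_dim) so each head attends independently.
        q = q.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        # Scaled dot-product attention scores, future positions masked to -inf.
        att = (q @ k.transpose(-2, -1)) / (self.head_dim ** 0.5)
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        out = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(out)
```

Because of the mask, changing a token can only affect outputs at that position and later, which is what lets the model be trained to predict the next token at every position in parallel.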

Model

UPDATE: AGPT_large_v3 has been trained. It has around 45M parameters, was trained on a single A100 GPU, and does output coherent sentences! I am currently training a larger model.

Here is a snippet of the response (max tokens set to 128):

Input: Once upon a time

Response (input concatenated with response): Once upon a time Jacky was very excited because today was was going on a special safari with her mom . It was a big , modern safari , shiny red and she couldn ' t wait to see the animals . So she asked the mom if they could see the lions , the elephants and go . They all thought it would be fun to see all the animals away . Ella started to feel very noisy , but she knew she had to leave . She asked her mom if they could stay out of their car and never come on an adventure . They drove for a few hours and Max ran towards the ostrich and Ella followed the ostrich with it . He felt very uncomfortable and weak . He was brought some snacks and drinks with her on it . Then he promised his favorite animals they would come over again soon . The ostrich was so thankful for her adventure and the two animals full of joy to see the animals in the jungle . They saw elephants , lions , tigers and elephants happily under the monkeys . They had so much fun ! From then on , the elephant and Ella shared treats , swimming and knew that no matter what the

Much better than the previous model's output:

Input: center a div

Response (input concatenated with response): center a divis sal rorere ef to soss mipbin oa contsinmato oron ecit reclk thebeto th ur this ond. dent ais whand f whoun] secr can pd oxtiches ther soj or a l oa sok pri thirein ofdemicor tiop secyou co'p a lo i.J to ohes the yosos es of oree o othor sis bo'kan simiry pisror in tirone yos. Goud St smealeve te ' she.e. tor sag baslaconshat cand orexts,1 thtkert orlior dhe a ser+. T. wicul this bo, to no ches on cher in the l cor to to sfofou thev, yol cawe and ca oyound to sertroc r wac dandkOs rteecbe

Details

Each model version has these files in its corresponding folder:

  • config.py contains the hyperparameters of the model
  • model.py contains the code for the final model
  • training.py contains all the training code for the model
  • generate.py contains the code for generating responses from the model
  • tokenizer.py contains the code to train the tokenizer for the model
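Generating a response from a decoder-only model is a simple autoregressive loop: feed the running sequence back in, sample the next token from the softmax over the last position's logits, and append it. A hedged sketch of what a generate.py like this typically does (the actual file may use different sampling settings; `block_size` and `temperature` here are illustrative):

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size, temperature=1.0):
    """Autoregressively extend idx (shape (B, T) of token ids) by max_new_tokens."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]          # crop to the context window
        logits = model(idx_cond)                 # (B, T, vocab_size)
        logits = logits[:, -1, :] / temperature  # keep only the last time step
        probs = torch.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)  # sample one id
        idx = torch.cat([idx, next_token], dim=1)
    return idx
```

This is why the README shows the input concatenated with the response: the prompt tokens and the sampled tokens live in the same growing sequence.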

Dataset

I am currently using datasets/tiny_stories_full.txt, which I found on Hugging Face. I plan to use a medium-sized dataset for the next model I train.
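Since the model follows the makemore series, the tokenizer trained by tokenizer.py may well be character-level; a minimal sketch under that assumption (the actual tokenizer.py may implement something different, e.g. BPE):

```python
class CharTokenizer:
    """Character-level tokenizer: vocabulary is the set of characters in the corpus."""

    def __init__(self, text):
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}  # char -> id
        self.itos = {i: ch for ch, i in self.stoi.items()}  # id -> char
        self.vocab_size = len(chars)

    def encode(self, s):
        return [self.stoi[c] for c in s]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)
```

A character-level vocabulary keeps the embedding table tiny, at the cost of longer sequences per story than a subword tokenizer would produce.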

References/Tutorials

List of all the resources I used to build this:

About

Creating and developing an LLM in PyTorch
