
Decoder-Only-LLM With PyTorch ✨

Don't forget to star the repo if you find it useful!

This repository features a custom-built decoder-only language model (LLM) with 8 decoder blocks and a total of 37 million parameters. I developed it from scratch as a personal project to explore NLP techniques, and you can now pretrain it to fit whatever task you like.

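For intuition, here is a minimal sketch of what an 8-block decoder-only transformer in this parameter range can look like in PyTorch. The hyperparameters below (d_model=384, 6 heads, a BERT-sized vocabulary) are illustrative assumptions chosen so the total lands near 37M parameters, not the exact values used in this repo:

    import torch
    import torch.nn as nn

    class DecoderOnlyLM(nn.Module):
        """Decoder-only = stacks of causally masked self-attention with no
        cross-attention, so nn.TransformerEncoderLayer plus a causal mask suffices."""

        def __init__(self, vocab_size=30522, d_model=384, n_heads=6, n_layers=8, max_len=512):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)   # token embeddings
            self.pos_emb = nn.Embedding(max_len, d_model)      # learned positions
            layer = nn.TransformerEncoderLayer(
                d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
                batch_first=True, norm_first=True,
            )
            self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
            self.lm_head = nn.Linear(d_model, vocab_size)      # next-token logits

        def forward(self, idx):
            seq_len = idx.size(1)
            # Causal mask: position i may only attend to positions <= i.
            mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(idx.device)
            pos = torch.arange(seq_len, device=idx.device)
            x = self.tok_emb(idx) + self.pos_emb(pos)
            x = self.blocks(x, mask=mask)
            return self.lm_head(x)  # (batch, seq_len, vocab_size)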

📖 Notebook Experiments

If you wish to run the code sequentially to see how everything works, check out the Decoder-Only Transformer notebook in the repo, where I train the model to ask a question from a given context.

🔧 Setup

  • Clone the repository
git clone git@github.com:logic-ot/decoder-only-llm.git
cd decoder-only-llm
  • Configuration file

    • Configure the variables in config.py, such as the dataset path and model weights.

      Structure of config.py:

    import torch
    from transformers import AutoTokenizer

    config = {
        "data_path": "sample_data.json",  # path to the training dataset
        "tokenizer": AutoTokenizer.from_pretrained("bert-base-uncased"),
        "device": torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu"),
        "model_weights": None,  # path to saved model weights, or None to start from scratch
        "learning_rate": 0.0001,
        "num_batches": 10,
    }
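
    The scripts can then read their settings from this dict. A minimal sketch of what consuming the config might look like (the exact usage inside training.py may differ):

    from config import config

    tokenizer = config["tokenizer"]
    device = config["device"]
    print(f"training on {device} with data from {config['data_path']}")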
  • Training

    • After configuration, run the training script like so:

    python training.py
    

    The model should begin training automatically.
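
    For intuition, training a decoder-only LM usually amounts to minimizing cross-entropy on next-token prediction. A minimal sketch of one such step, assuming a model like the one sketched earlier and a batch of tokenized input_ids (the actual loop in training.py may differ):

    import torch.nn.functional as F

    def train_step(model, optimizer, input_ids):
        # Shift by one: predict token t+1 from tokens 0..t.
        inputs, targets = input_ids[:, :-1], input_ids[:, 1:]
        logits = model(inputs)                    # (batch, seq, vocab)
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),  # (batch*seq, vocab)
            targets.reshape(-1),                  # (batch*seq,)
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()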

  • Data Structure

    • Your training data must be in the following format:

      ["conversation":[{"from":"human","value":"{some text}"},{"from":"gpt","value":"{some text}"}]
      See an example in sample_data.json in the repo
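
    A minimal sketch of reading data in this format, assuming sample_data.json holds a list of such records (the repo's own preprocessing may differ):

    import json

    with open("sample_data.json") as f:
        records = json.load(f)

    for record in records:
        for turn in record["conversation"]:
            # Each turn has "from" ("human" or "gpt") and "value" (the text).
            print(f'{turn["from"]}: {turn["value"]}')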
  • Inference

    • The inference script takes in a text prompt and queries the model for a response. To run inference:

    python inference.py
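
    Under the hood, generation typically loops: feed the sequence so far through the model, append the most likely next token, repeat. A greedy-decoding sketch, assuming the model sketched above and the BERT tokenizer from config.py (inference.py may sample differently):

    import torch

    @torch.no_grad()
    def generate(model, tokenizer, prompt, max_new_tokens=50):
        model.eval()
        ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
        for _ in range(max_new_tokens):
            logits = model(ids)                                   # (1, seq, vocab)
            next_id = logits[:, -1, :].argmax(-1, keepdim=True)   # greedy choice
            ids = torch.cat([ids, next_id], dim=1)
            if next_id.item() == tokenizer.sep_token_id:          # BERT's [SEP] as stop token
                break
        return tokenizer.decode(ids[0], skip_special_tokens=True)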
    

🪄 Tips

  • There is a what-to-expect.pdf file in the repo that details some observations I made while training the model on the question-asking task. This information applies to any other model you decide to pretrain.
  • Here is an excellent video series that explains transformers: 3Blue1Brown.
