PracticeGPT

A tiny LLM based on GPT-2.

Built with Python, PyTorch, and NumPy.

Overview

The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of nanoGPT that prioritizes teeth over education. It is still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8xA100 40GB node in about 4 days of training. The code is divided into separate Python modules.
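A multi-GPU run of train.py would typically be launched with PyTorch's torchrun. The invocation below is only a sketch based on nanoGPT's conventions; any additional config arguments depend on this repository's layout:

torchrun --standalone --nproc_per_node=8 train.py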

[Figure: training loss curve]

Because the code is so simple, it is very easy to hack to your needs, train new models from scratch, or finetune pretrained checkpoints (e.g., the biggest one currently available as a starting point would be the GPT-2 1.5B model from OpenAI).
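As an illustration of that starting point, the OpenAI GPT-2 weights can be pulled through the Hugging Face transformers package. The snippet below is not part of this repository's modules; the checkpoint name "gpt2-xl" is a Hugging Face model identifier:

# Illustration only: download GPT-2 XL (1.5B parameters) as a finetuning
# starting point; its weights can then be mapped onto a local GPT implementation.
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
print(f"{sum(p.numel() for p in model.parameters()):,} parameters loaded")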

Installation

Install the requirements using the following command:

pip install -r requirements.txt
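If an isolated environment is preferred, a standard virtual environment works as well (the directory name .venv below is arbitrary):

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt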

Acknowledgement

This implementation is logically identical to nanoGPT by Andrej Karpathy.

The purpose of this project is to apply and expand my knowledge of LLMs.