This repository provides a minimal and educational implementation of a GPT-style transformer model trained from scratch on Shakespearean text. The objective is to build a deeper understanding of how GPT works at its core without using pre-trained models or high-level APIs.
Read the full tutorial article here: *Understanding GPT by Training One from Scratch using PyTorch*
- Trained GPT-style model on character-level Shakespeare data.
- Custom BPE tokenizer used for encoding/decoding.
- Model + tokenizer hosted on Hugging Face Hub.
- Visualizations and reproducible notebooks included.
- Supports script and notebook workflows.
Below is the model's training-loss trend over multiple epochs.
1. Load Pretrained Model from Hugging Face
To download the trained model (`.pth`) and tokenizer (`.pkl`) from Hugging Face and place them inside `./saved_models`, run:
```bash
cd saved_models
python load_model.py
```
This will download:
- `model_shakespeare_new_v5_latest.pth`
- `encoder_shakespeare_v5.pkl`
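For reference, the same files can be fetched programmatically with `huggingface_hub`. The snippet below is a minimal sketch; the `repo_id` is a placeholder, not the repo's actual Hub location, so substitute whatever `load_model.py` points at.

```python
# Minimal sketch: pull the checkpoint and tokenizer straight from the Hub.
# NOTE: repo_id is a placeholder -- use the repo ID that load_model.py uses.
from huggingface_hub import hf_hub_download

for filename in (
    "model_shakespeare_new_v5_latest.pth",
    "encoder_shakespeare_v5.pkl",
):
    path = hf_hub_download(
        repo_id="khotveer/custom-gpt-shakespeare",  # hypothetical repo ID
        filename=filename,
        local_dir="./saved_models",
    )
    print("downloaded:", path)
```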
Navigate to the `end_to_end` folder:
- Train the model: `01_train_gpt_from_scratch.ipynb`
- Generate text from the trained model: `02_inference_generate_text.ipynb`
These notebooks are standalone and load everything required to train and run inference.
Clone the repo and execute notebooks directly from the root:
```bash
git clone https://github.com/khotveer/custom-gpt-using-pytorch.git
cd custom-gpt-using-pytorch
```
Run `01_train_gpt_from_scratch.ipynb` to train a new model, and `02_inference_generate_text.ipynb` to generate predictions.
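For a rough picture of what the inference notebook does, here is a minimal sketch that loads the two downloaded files and samples text autoregressively. The `GPT` import path, its constructor arguments, and the tokenizer's `encode`/`decode`/`vocab_size` interface are assumptions for illustration, not the repo's exact API.

```python
import pickle

import torch

from src.model import GPT  # hypothetical import path for the model class

device = "cuda" if torch.cuda.is_available() else "cpu"

# Tokenizer: assumed to be a pickled object exposing encode()/decode().
with open("saved_models/encoder_shakespeare_v5.pkl", "rb") as f:
    tokenizer = pickle.load(f)

# Checkpoint: assumed saved via torch.save(model.state_dict(), ...).
model = GPT(vocab_size=tokenizer.vocab_size, n_layer=8, n_head=8, n_embd=512)
state = torch.load(
    "saved_models/model_shakespeare_new_v5_latest.pth", map_location=device
)
model.load_state_dict(state)
model.to(device).eval()

# Sampled autoregressive generation: feed the sequence back one token at a time.
ids = torch.tensor([tokenizer.encode("ROMEO:")], device=device)
with torch.no_grad():
    for _ in range(200):
        logits = model(ids)[:, -1, :]            # logits for the next token
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)

print(tokenizer.decode(ids[0].tolist()))
```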
- Architecture: GPT-style transformer (`n_layer=8`, `n_head=8`, `n_embd=512`); see the sketch after this list
- Dataset: Shakespeare (character-level)
- Tokenization: Custom BPE tokenizer
- Framework: PyTorch
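To make those hyperparameters concrete, below is a minimal sketch of a decoder-only transformer with this exact shape in plain PyTorch. It follows the standard pre-norm GPT layout (causal self-attention plus an MLP per block); the `block_size=256` context length is an assumption, and the actual code in `src/` may differ in details.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One pre-norm transformer block: causal self-attention followed by an MLP."""
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        # Boolean causal mask: True above the diagonal = "may not attend".
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                  # residual connection around attention
        x = x + self.mlp(self.ln2(x))     # residual connection around MLP
        return x

class MiniGPT(nn.Module):
    def __init__(self, vocab_size, block_size=256, n_layer=8, n_head=8, n_embd=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)   # token embeddings
        self.pos_emb = nn.Embedding(block_size, n_embd)   # learned positions
        self.blocks = nn.Sequential(*[Block(n_embd, n_head) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size, bias=False)

    def forward(self, idx):
        T = idx.size(1)
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        return self.head(self.ln_f(self.blocks(x)))       # (B, T, vocab_size)
```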
```
.
├── src/              # Core training and model code
├── data/             # Input dataset (Shakespeare)
├── end_to_end/       # Full training and inference notebooks
├── saved_models/     # Load pretrained model from HF Hub
├── temp/             # Misc files and intermediate outputs
├── load_model.py     # Pull model/tokenizer from Hugging Face
├── requirements.txt
└── README.md
```