JoshKeesee/Mini-GPT

Mini-GPT

A small conversational language model project with scripts for dataset preparation, hyperparameter analysis, training, CLI chat, and a Flask web chat UI.

Project Layout

  • analyze_dataset.py: computes recommended training hyperparameters.
  • prepare.py: builds corpus cache, tokenizer, train.bin, and meta.pkl.
  • train.py: trains the model and writes checkpoints.
  • chat.py: terminal chat using the latest model checkpoint.
  • app.py: Flask + Socket.IO web chat app.
  • artifacts/models/: saved model checkpoints and final weights.
  • artifacts/training_data/: generated tokenized training data and metadata.
  • artifacts/hyperparameters/: generated hyperparameter config.

Prerequisites

  • Python 3.10+
  • Optional but recommended: NVIDIA GPU + CUDA for faster training
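The Python version requirement above can be checked programmatically before running any of the scripts. A minimal sketch (the helper name is ours, not part of the repo):

```python
import sys

def check_python_version(minimum=(3, 10)):
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info >= minimum

if __name__ == "__main__":
    if not check_python_version():
        raise SystemExit("Mini-GPT requires Python 3.10 or newer")
```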

Setup

  1. Create and activate a virtual environment (activation shown for Windows PowerShell).
python -m venv .venv
.\.venv\Scripts\Activate.ps1
  2. Install dependencies.
pip install torch numpy datasets flask flask-socketio sentencepiece

End-to-End Run

  1. Prepare the data and tokenizer.
python prepare.py --rebuild-cache
  2. Analyze the dataset and generate tuned hyperparameters.
python analyze_dataset.py --hours 18
  3. Train the model.
python train.py
  4. Chat in the terminal.
python chat.py
  5. Run the web app.
python app.py

Then open http://127.0.0.1:5000.

Artifact Paths

The scripts write to the following output paths:

  • Model checkpoints and weights: artifacts/models/
  • Tokenized training data and metadata: artifacts/training_data/
  • Hyperparameters JSON: artifacts/hyperparameters/hyperparameters.json
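The layout above can be expressed with pathlib, which is convenient when writing your own tooling around the artifacts. This helper is hypothetical and not part of the repo's code:

```python
from pathlib import Path

# Hypothetical helper mirroring the artifact layout listed above.
def artifact_paths(root: Path = Path("artifacts")) -> dict[str, Path]:
    """Return the expected output path for each pipeline stage."""
    return {
        "checkpoints": root / "models",
        "train_bin": root / "training_data" / "train.bin",
        "meta": root / "training_data" / "meta.pkl",
        "hyperparameters": root / "hyperparameters" / "hyperparameters.json",
    }
```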

Git Exclusions

This repository is configured to ignore:

  • tools/
  • data/personal/
  • takeout/
  • artifacts/models/
  • artifacts/training_data/
  • artifacts/hyperparameters/
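A .gitignore matching the list above might look like the following; this is a sketch, so check the repository's actual .gitignore for the authoritative rules:

```gitignore
tools/
data/personal/
takeout/
artifacts/models/
artifacts/training_data/
artifacts/hyperparameters/
```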

Notes

  • train.py expects artifacts/training_data/meta.pkl and artifacts/training_data/train.bin from prepare.py.
  • app.py expects artifacts/models/checkpoint_best.pt to exist.
  • You can override the initial checkpoint with the INIT_CKPT environment variable, e.g. INIT_CKPT=artifacts/models/checkpoint_best.pt.
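A minimal sketch of how the INIT_CKPT override could be resolved; the function name and default path handling here are illustrative, not the repo's actual code:

```python
import os
from pathlib import Path

def resolve_init_ckpt(default: str = "artifacts/models/checkpoint_best.pt") -> Path:
    """Resolve the starting checkpoint, honoring the INIT_CKPT env var."""
    return Path(os.environ.get("INIT_CKPT", default))
```

On PowerShell, set the variable before training with `$env:INIT_CKPT = "artifacts/models/checkpoint_best.pt"` and then run `python train.py`.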

About

Train, run, and configure your own local AI from scratch! The project includes a full frontend styled like the ChatGPT website, automatic hyperparameter tuning for your specific device and training-time budget, and full tokenization handling.
