A small conversational language model project with scripts for dataset preparation, hyperparameter analysis, training, CLI chat, and a Flask web chat UI.
- `analyze_dataset.py`: computes recommended training hyperparameters.
- `prepare.py`: builds the corpus cache, tokenizer, `train.bin`, and `meta.pkl`.
- `train.py`: trains the model and writes checkpoints.
- `chat.py`: terminal chat using the latest model checkpoint.
- `app.py`: Flask + Socket.IO web chat app.
- `artifacts/models/`: saved model checkpoints and final weights.
- `artifacts/training_data/`: generated tokenized training data and metadata.
- `artifacts/hyperparameters/`: generated hyperparameter config.
- Python 3.10+
- Optional but recommended: NVIDIA GPU + CUDA for faster training
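Training falls back to the CPU when no GPU is available. A quick way to confirm what PyTorch will see before starting a long run (a sketch; whether the training script selects its device exactly this way is an assumption):

```python
# Report the device PyTorch would use: "cuda" if a GPU is visible, else "cpu".
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    # PyTorch is not installed yet; training requires it either way.
    device = "cpu"
print(f"Training would run on: {device}")
```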
- Create and activate a virtual environment.

  ```
  python -m venv .venv
  .\.venv\Scripts\Activate.ps1
  ```

- Install dependencies.

  ```
  pip install torch numpy datasets flask flask-socketio sentencepiece
  ```

- Prepare data and tokenizer.

  ```
  python prepare.py --rebuild-cache
  ```

- Analyze the dataset and generate tuned hyperparameters.

  ```
  python analyze_dataset.py --hours 18
  ```

- Train the model.

  ```
  python train.py
  ```

- Chat in the terminal.

  ```
  python chat.py
  ```

- Run the web app.

  ```
  python app.py
  ```

  Then open http://127.0.0.1:5000.
These scripts now use the following output paths:

- Model checkpoints and weights: `artifacts/models/`
- Tokenized training data and metadata: `artifacts/training_data/`
- Hyperparameters JSON: `artifacts/hyperparameters/hyperparameters.json`
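The training artifacts can be read back with the standard library and NumPy. A minimal sketch of the assumed format, writing toy stand-ins so it is self-contained: `meta.pkl` is assumed to be a pickled dict of tokenizer metadata (the `vocab_size` key here is an assumption), and `train.bin` a flat array of `uint16` token ids (the dtype is also an assumption).

```python
import os
import pickle
import numpy as np

os.makedirs("artifacts/training_data", exist_ok=True)

# Toy stand-ins for what prepare.py would produce.
with open("artifacts/training_data/meta.pkl", "wb") as f:
    pickle.dump({"vocab_size": 512}, f)  # hypothetical key
np.array([1, 2, 3, 4], dtype=np.uint16).tofile("artifacts/training_data/train.bin")

# Mirrors how a training script could read them back without loading
# the whole token file into memory.
with open("artifacts/training_data/meta.pkl", "rb") as f:
    meta = pickle.load(f)
tokens = np.memmap("artifacts/training_data/train.bin", dtype=np.uint16, mode="r")
print(meta["vocab_size"], len(tokens))
```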
This repository is configured to ignore:

- `tools/data/personal/takeout/`
- `artifacts/models/`
- `artifacts/training_data/`
- `artifacts/hyperparameters/`
- `train.py` expects `artifacts/training_data/meta.pkl` and `artifacts/training_data/train.bin` from `prepare.py`.
- `app.py` expects `artifacts/models/checkpoint_best.pt` to exist.
- You can override the initial checkpoint with `INIT_CKPT`, e.g. `INIT_CKPT=artifacts/models/checkpoint_best.pt`.
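A sketch of how the `INIT_CKPT` override could be resolved with an environment-variable lookup; whether the scripts handle it exactly this way is an assumption:

```python
import os

# Fall back to the default checkpoint when INIT_CKPT is not set.
DEFAULT_CKPT = "artifacts/models/checkpoint_best.pt"
init_ckpt = os.environ.get("INIT_CKPT", DEFAULT_CKPT)
print(f"Loading initial weights from: {init_ckpt}")
```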