This repository is a lightweight deployment and verification implementation based on Andrej Karpathy's nanoGPT.
The primary goal of this project is to demonstrate a complete end-to-end LLM lifecycle, from data generation to model training and inference, verified within a minimal compute environment (e.g., Google Colab T4 or a local CPU/GPU).
We utilize a synthetic "Tick Tock" pattern dataset (or arithmetic logic) to conduct a "Smoke Test," ensuring that the model architecture, optimizer, and training loop are functioning correctly before scaling up to larger datasets like OpenWebText.
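As a rough illustration of what the synthetic data pipeline does, here is a minimal sketch of a "Tick Tock" generator following nanoGPT's `prepare.py` convention (flat `train.bin`/`val.bin` files of `uint16` token ids). The function name and split ratio are assumptions for illustration, not the repository's actual script:

```python
# Hypothetical sketch of a "Tick Tock" data generator, modeled on
# nanoGPT's data/*/prepare.py convention. Names here are illustrative.
import numpy as np

def make_ticktock(n_tokens: int) -> np.ndarray:
    # Token 0 = "tick", token 1 = "tock", strictly alternating.
    return (np.arange(n_tokens) % 2).astype(np.uint16)

data = make_ticktock(10_000)
split = int(0.9 * len(data))          # assumed 90/10 train/val split
data[:split].tofile("train.bin")      # nanoGPT's train.py memory-maps these
data[split:].tofile("val.bin")
```

Because the pattern is deterministic, even a tiny model should drive the loss near zero quickly, which is what makes it useful as a smoke test.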
- Automated Data Pipeline: Scripts to generate synthetic training data instantly.
- Minimalist Configuration: A tuned `smoke_test.py` config optimized for speed (trains in under 30 seconds).
- Deployment Ready: Verified on cloud (Colab) and local environments.
- Inference Verification: Includes scripts to validate whether the model has "learned" the pattern.
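The inference check boils down to measuring how often adjacent tokens in a sample alternate. A minimal sketch of such a validator (the function name and threshold are assumptions, not the repository's actual script):

```python
# Hypothetical validator sketch: measures how well sampled text keeps
# the "tick tock" alternation the model was trained on.
def pattern_accuracy(tokens):
    """Fraction of adjacent pairs that alternate (tick->tock or tock->tick)."""
    if len(tokens) < 2:
        return 1.0
    good = sum(1 for a, b in zip(tokens, tokens[1:]) if a != b)
    return good / (len(tokens) - 1)

sample = ["tick", "tock", "tick", "tock", "tick"]
print(pattern_accuracy(sample))  # → 1.0 for a perfectly alternating sample
```

A trained model should score near 1.0; a randomly initialized one should hover around 0.5, giving a crisp pass/fail signal.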
nanoGPT/
├── config/
│   ├── smoke_test.py      # <--- Custom config for rapid testing
│   └── train_shakespeare.py
├── data/
│   └── smoke_test/        # Generated binary data (excluded from git)
├── model.py               # GPT Model Definition
├── train.py               # Training Script
├── sample.py              # Inference/Sampling Script
├── test_deploy.py         # Unit tests for environment checks
└── README.md
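For orientation, a `smoke_test.py` config might look roughly like the sketch below. nanoGPT configs are plain Python files consumed by `train.py`, and the variable names (`n_layer`, `max_iters`, etc.) are nanoGPT's own; the specific values here are illustrative assumptions, not the repository's actual settings:

```python
# Hypothetical sketch of config/smoke_test.py; actual values may differ.
out_dir = "out-smoke-test"
dataset = "smoke_test"     # points train.py at data/smoke_test/*.bin
batch_size = 8
block_size = 64            # tiny context window for speed
n_layer = 2                # very small model: 2 layers, 2 heads, 64-dim
n_head = 2
n_embd = 64
max_iters = 200            # keeps the whole run well under a minute
lr_decay_iters = 200
eval_interval = 50
device = "cpu"             # or "cuda" on a Colab T4
```

Keeping the model this small is what lets the smoke test confirm the architecture, optimizer, and training loop work end to end without meaningful compute cost.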