# StriveLM - Google Colab Demo

Run this notebook in Google Colab to train and test StriveLM with free GPU access.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/muhdaldiansyah/strivelm/blob/main/colab_demo.ipynb)

## Step 1: Clone the Repository

In [None]:
# Clone the StriveLM repository
!git clone https://github.com/muhdaldiansyah/strivelm.git
%cd strivelm
!ls -la

## Step 2: Verify PyTorch Installation

In [None]:
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")
else:
    print("Running on CPU")

## Step 3: View the Training Data

In [None]:
with open('input.txt', 'r', encoding='utf-8') as f:
    text = f.read()
print(f"Dataset size: {len(text)} characters")
print(f"\nFirst 200 characters:\n{text[:200]}")
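
Beyond the raw size, it can help to know how many distinct characters the dataset contains. The sketch below assumes the model tokenizes at the character level, which this notebook does not confirm; if it does, the distinct-character count is the vocabulary size. `char_stats` is a hypothetical helper, not part of the StriveLM repository.

```python
from collections import Counter


def char_stats(text, top=5):
    """Return (number of distinct characters, the `top` most frequent ones)."""
    counts = Counter(text)
    return len(counts), counts.most_common(top)


# Stand-in text; in the notebook, pass the `text` read from input.txt instead.
sample = "to be, or not to be"
vocab_size, common = char_stats(sample)
print(f"Distinct characters: {vocab_size}")
print(f"Most common: {common}")
```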

## Step 4: Train the Model

Training runs for 500 iterations and should take about 30 seconds on a GPU.

In [None]:
!python train.py

## Step 5: Generate Text

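Generate 200 tokens with three sampling settings. A lower `--temp` makes the output more predictable, and `--top_k` limits sampling to the k most likely tokens: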

In [None]:
# Conservative sampling (low temperature, low top_k)
print("=== Conservative Sampling (temp=0.5, top_k=10) ===")
!python inference.py --start "T" --steps 200 --temp 0.5 --top_k 10

In [None]:
# Balanced sampling (default)
print("\n=== Balanced Sampling (temp=0.9, top_k=50) ===")
!python inference.py --start "W" --steps 200 --temp 0.9 --top_k 50

In [None]:
# Creative sampling (high temperature)
print("\n=== Creative Sampling (temp=1.2, top_k=40) ===")
!python inference.py --start "O" --steps 200 --temp 1.2 --top_k 40
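
To make the effect of `--temp` and `--top_k` concrete, here is a minimal pure-Python sketch of temperature scaling followed by top-k filtering. It illustrates the general technique only; `sample_next` and the example scores are made up for this demo, and the actual logic in `inference.py` may differ.

```python
import math
import random


def sample_next(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one index from raw scores using temperature and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature rescales the scores: <1 sharpens, >1 flattens the distribution.
    scaled = [x / temperature for x in logits]
    # Top-k keeps only the k highest-scoring candidates.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = order[:top_k] if top_k else order
    # Softmax weights over the survivors (subtract the max for stability).
    peak = scaled[keep[0]]
    weights = [math.exp(scaled[i] - peak) for i in keep]
    return rng.choices(keep, weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
rng = random.Random(0)          # seeded so repeated runs give the same draws
draws = [sample_next(logits, temperature=0.5, top_k=2, rng=rng) for _ in range(1000)]
print({i: draws.count(i) for i in sorted(set(draws))})
```

With `top_k=2`, only the two highest-scoring indices can ever be drawn, and the low temperature shifts most of the probability onto the top one.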

## Step 6: Download the Trained Model

In [None]:
from google.colab import files

# Download the checkpoint
files.download('checkpoints/out.pt')
print("Model downloaded! You can use it locally with inference.py")
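
As an alternative to a browser download, the checkpoint can be copied to persistent storage such as a mounted Google Drive folder. The sketch below is one way to do that; `backup_checkpoint` and the `strivelm_ckpts` folder name are hypothetical and not part of the repository.

```python
import shutil
import time
from pathlib import Path


def backup_checkpoint(src, dest_dir):
    """Copy a checkpoint into dest_dir under a timestamped name; return the new path."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # raises FileNotFoundError if src does not exist
    return dest


# In Colab, mount Drive first, then point dest_dir at a folder inside it:
#   from google.colab import drive
#   drive.mount('/content/drive')
#   backup_checkpoint('checkpoints/out.pt', '/content/drive/MyDrive/strivelm_ckpts')
```

The timestamp in the file name keeps earlier checkpoints from being overwritten when training is rerun.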

## Optional: Train with a Larger Dataset

Download the Tiny Shakespeare dataset (about 1 MB of text from Shakespeare's plays). This overwrites the existing `input.txt`:

In [None]:
!wget https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt -O input.txt
with open('input.txt', 'r', encoding='utf-8') as f:
    print(f"New dataset size: {len(f.read())} characters")

## Optional: Modify Training Config

In [None]:
# View current config
!cat config.py

In [None]:
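# Larger-model config. To use it, delete these two comment lines and uncomment the rest:
# %%writefile is a cell magic and must be the first line of the cell.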
# %%writefile config.py
# from dataclasses import dataclass
#
# @dataclass
# class Config:
#     batch_size: int = 64          # Increased for GPU
#     block_size: int = 128         # Longer context
#     n_layer: int = 4              # More layers
#     n_head: int = 4               # More heads
#     n_embd: int = 256             # Larger embeddings
#     dropout: float = 0.1          # Some dropout
#     max_iters: int = 2000         # Train longer
#     eval_interval: int = 200
#     lr: float = 3e-4
#     seed: int = 1337
#     device: str = "auto"
#     ckpt_path: str = "checkpoints/out.pt"
#     dataset: str = "input.txt"
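
Before training with a larger config, a quick sanity check can catch an invalid head split and give a feel for model size. The sketch below assumes a standard GPT-style block (multi-head attention plus an MLP with 4x expansion); whether StriveLM follows that layout is an assumption, and `check_config` is a hypothetical helper.

```python
def check_config(n_layer, n_head, n_embd):
    """Validate the head split and return (head_dim, rough block parameter count)."""
    # Multi-head attention splits the embedding evenly across heads.
    if n_embd % n_head != 0:
        raise ValueError(f"n_embd={n_embd} is not divisible by n_head={n_head}")
    head_dim = n_embd // n_head
    # Per block: ~4*n_embd^2 for attention (Q, K, V, output projection)
    # plus ~8*n_embd^2 for an MLP with 4x expansion. Biases, layer norms,
    # and embedding tables are ignored in this estimate.
    approx_params = 12 * n_layer * n_embd ** 2
    return head_dim, approx_params


head_dim, approx_params = check_config(n_layer=4, n_head=4, n_embd=256)
print(f"head_dim={head_dim}, approx block parameters={approx_params:,}")
```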

---

## Tips

- **GPU Access**: Runtime → Change runtime type → GPU (T4)
- **Longer Training**: Increase `max_iters` in `config.py`
- **Better Quality**: Use a larger dataset and train longer
- **Save Work**: Download checkpoints before the session expires

## Resources

- [GitHub Repo](https://github.com/muhdaldiansyah/strivelm)
- [README](https://github.com/muhdaldiansyah/strivelm/blob/main/README.md)