# nanoGPT example notebook

Notebook for running (a fork of) Karpathy's nanoGPT data preparation, training, and sampling.


In [None]:
from rgi.nanogpt import wrapper

In [None]:
# Prepare Data
wrapper.run_nanogpt_script("prepare_shakespeare")

In [None]:
baby_train_args = [
    "--max_iters=5000",
    "--eval_interval=50",
    "--eval_iters=10",
    "--log_interval=10",
    "--batch_size=16",
    "--block_size=128",
]

# Train super-tiny 'femto' GPT model.
femto_train_args = baby_train_args + [
    # femto GPT model, much smaller than baby model.
    "--n_layer=4",
    "--n_head=4",
    "--n_embd=128",
    "--dropout=0.2",
    "--warmup_iters=10",
    # Disable compilation and reduce iters to speed up training.
    "--max_iters=100",
    "--compile=False",
]


train_argv = femto_train_args + ["--max_iters=100"]
# train_argv = baby_train_args + ["--max_iters=5000"]

train = wrapper.train(config_file="train_shakespeare_char.py", argv=train_argv)

## === Femto model benchmark ===

## 0.9s, femto_train_args + []
# step 100: train loss 2.5547, val loss 2.5611
# iter 100: loss 2.5668, time 66.49ms, mfu 0.43%

## 1.7s, femto_train_args + ["--compile=True"] (first run is much slower?)
# step 100: train loss 2.5453, val loss 2.5401
# iter 100: loss 2.5679, time 64.51ms, mfu 0.81%

## 6.3s, femto_train_args + ["--max_iters=500"]
# step 500: train loss 2.3504, val loss 2.3744
# iter 500: loss 2.3742, time 75.51ms, mfu 0.36%

## 61.9s, femto_train_args + ["--max_iters=5000"]
# step 5000: train loss 1.6499, val loss 1.8055
# iter 5000: loss 1.7146, time 66.23ms, mfu 0.23%

## === Baby model benchmark ===

## 7.0s, baby_train_args + ["--max_iters=100"]
# step 100: train loss 2.5385, val loss 2.4948
# iter 100: loss 2.5591, time 638.59ms, mfu 0.89%

## 27.20s, baby_train_args + ["--max_iters=500"]
# step 500: train loss 1.9833, val loss 2.0695
# iter 500: loss 1.9927, time 629.50ms, mfu 0.81%

## 4m11s, baby_train_args + ["--max_iters=5000"]
# step 5000: train loss 1.2840, val loss 1.5177
# iter 5000: loss 1.3613, time 393.11ms, mfu 0.81%
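
The femto and baby runs above differ by roughly 10x in time per iteration (about 65 ms versus about 630 ms in the 100- and 500-iteration runs). A rough parameter count suggests why. The sketch below uses the common approximation of `12 * n_layer * n_embd**2` weights in the transformer blocks, ignoring embeddings, biases and layer norms. The femto values come from `femto_train_args`. The baby values (`n_layer=6`, `n_embd=384`) are an assumption taken from upstream nanoGPT's `train_shakespeare_char.py` and may differ in this fork. `approx_block_params` is a hypothetical helper, not part of `rgi.nanogpt`.

```python
def approx_block_params(n_layer: int, n_embd: int) -> int:
    """Approximate transformer-block weights: 4*d^2 (attention) + 8*d^2 (MLP) per layer."""
    return 12 * n_layer * n_embd * n_embd


# Femto values are taken from femto_train_args in this notebook.
femto_params = approx_block_params(n_layer=4, n_embd=128)
# Baby values are ASSUMED from upstream nanoGPT's char-level Shakespeare config.
baby_params = approx_block_params(n_layer=6, n_embd=384)

print(f"femto: ~{femto_params / 1e6:.2f}M block parameters")
print(f"baby:  ~{baby_params / 1e6:.2f}M block parameters")
print(f"ratio: ~{baby_params / femto_params:.1f}x")
```

Under these assumptions the baby model has about 13.5 times as many block weights as the femto model, which is broadly in line with the gap in per-iteration time reported above.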

## Generate Sample Data

### FemtoGPT sample output (training_steps=5000)
```
Menry, my much sofer, I have no when her the coundes,
And take all not you briegn of them sweart,
A so tendersing this son hight Warwas which him be
I ward be be of him.

First Come, and the any in this blament their and sire,
Awarn it there cervowin in that stice for Hare
Than the her ople the of the word as deest and Maurth?

Sechman:
Rech not is stay aboles you that the not cherfs beguts
To can news from that love boe then was we it.

RICHARD III:
The you the pity inson eyet,
But will lark th
```

### BabyGPT sample output (training_steps=5000)
```
MENENIUS:
He is the contentent.

CORIOLANUS:
And, my lord.

ANGELO:
Alas! I will not be evern'd; that of death
Shall be the wretch'd abroad as matchery.

CORIOLANUS:
What see me?
Perily, sir, when sir? What of them some treason
To die and a kingdom the cousin of none this servant?

Second Gentleman:
And with them the very bitter city.

COMINIUS:
One is consent promised to thee honour'd.

EXETER:
His brother; I pray you, good marriage?

GLOUCESTER:
So may him but thee, which I say weep not; it i
```
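
Both the training and sampling cells pass `train_argv`, in which `--max_iters` appears more than once (in `baby_train_args`, again in `femto_train_args`, and once more in the final override). The sketch below illustrates "last flag wins" resolution. It assumes the wrapper forwards `argv` to a configurator that applies `--key=value` overrides in order, as upstream nanoGPT's `configurator.py` does. This has not been checked against the fork, and `resolve_overrides` is a hypothetical helper, not part of `rgi.nanogpt`.

```python
from ast import literal_eval


def resolve_overrides(argv):
    """Apply --key=value flags in order; a later flag replaces an earlier one."""
    config = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key=value, got {arg!r}")
        key, raw = arg[2:].split("=", 1)
        try:
            # Interpret ints, floats, and booleans such as "False".
            value = literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal stays a plain string.
            value = raw
        config[key] = value
    return config


# Shortened copies of the lists in this notebook, for illustration only.
baby = ["--max_iters=5000", "--batch_size=16"]
femto = baby + ["--n_layer=4", "--max_iters=100", "--compile=False"]

resolved = resolve_overrides(femto + ["--max_iters=500"])
print(resolved["max_iters"])
```

If the fork behaves this way, the trailing `--max_iters` in `train_argv` is the value that takes effect, so the benchmark variants can be run by changing only that last element.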

In [None]:
wrapper.sample_data(config_file="train_shakespeare_char.py", argv=train_argv)