# [`wordplay` 🎮 💬](https://github.com/saforem2/wordplay): Shakespeare ✍️

We will use the [Shakespeare dataset](https://github.com/saforem2/wordplay/blob/main/data/shakespeare/readme.md) to train a small (~10M parameter) LLM _from scratch_.

<div>

<div align="center" style="text-align:center;">

<img src="https://github.com/saforem2/wordplay/blob/main/assets/shakespeare.jpeg?raw=true" width="45%" align="center" /><br>

Image generated from [stabilityai/stable-diffusion](https://huggingface.co/spaces/stabilityai/stable-diffusion) on [🤗 Spaces](https://huggingface.co/spaces).<br>

</div>

<details closed><summary>Prompt Details</summary>

<ul>
<li>Prompt:</li>
<t><q>
Shakespeare himself, dressed in full Shakespearean garb,
writing code at a modern workstation with multiple monitors, hacking away profusely,
backlit, high quality for publication
</q></t>

<li>Negative Prompt:</li>
<t><q>
low quality, 3d, photorealistic, ugly
</q></t>
</ul>

</details>

</div>

## Install / Setup

<div class="alert alert-block alert-warning">
<b>Warning!</b><br>  

**IF YOU ARE EXECUTING ON GOOGLE COLAB**:  

You will need to restart your runtime (`Runtime` $\rightarrow\,$ `Restart runtime`)  
_after_ executing the following cell:

</div>

In [1]:
%%bash

python3 -c 'import wordplay; print(wordplay.__file__)' 2> '/dev/null'

if [[ $? -eq 0 ]]; then
    echo "Has wordplay installed. Nothing to do."
else
    echo "Does not have wordplay installed. Installing..."
    git clone 'https://github.com/saforem2/wordplay'
    python3 wordplay/data/shakespeare_char/prepare.py
    python3 wordplay/data/shakespeare/prepare.py
    python3 -m pip install deepspeed
    python3 -m pip install -e wordplay
fi

/content/wordplay/src/wordplay/__init__.py
Has wordplay installed. Nothing to do.


## Post Install

If the installation succeeded, you should be able to import `wordplay` and inspect where it was installed:

```python
>>> import wordplay
>>> wordplay.__file__
'/path/to/wordplay/src/wordplay/__init__.py'
```
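
The same check can also be done without importing the package. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of `wordplay`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    # find_spec returns None when a top-level module cannot be found
    return importlib.util.find_spec(module_name) is not None


print(is_installed("json"))      # standard-library module
print(is_installed("wordplay"))  # depends on your environment
```

Note that this only handles top-level names: for a dotted name whose parent package is missing, `find_spec` raises `ModuleNotFoundError` instead of returning `None`.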

In [2]:
%load_ext autoreload
%autoreload 2
import os

os.environ['COLORTERM'] = 'truecolor'
# If running on MacOS:
# os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'
# -----------------------------------------------

from enrich import get_logger
log = get_logger(level='INFO')

import wordplay
log.info(wordplay.__file__)

INFO:root:/content/wordplay/src/wordplay/__init__.py


[2024-02-12 19:12:11][INFO][<ipython-input-2-f3b1b5a6e4af>:14] - /content/wordplay/src/wordplay/__init__.py


## Build Trainer

Explicitly, we:

1. `setup_torch(...)`
2. Build `cfg: DictConfig = get_config(...)`
3. Instantiate `config: ExperimentConfig = instantiate(cfg)`
4. Build `trainer = Trainer(config)`

In [3]:
import os
import numpy as np
from ezpz import setup
from hydra.utils import instantiate
from wordplay.configs import get_config, PROJECT_ROOT
from wordplay.trainer import Trainer

HF_DATASETS_CACHE = PROJECT_ROOT.joinpath('.cache', 'huggingface')
HF_DATASETS_CACHE.mkdir(exist_ok=True, parents=True)

os.environ['HF_DATASETS_CACHE'] = HF_DATASETS_CACHE.as_posix()

BACKEND = 'DDP'

rank = setup(
    framework='pytorch',
    backend=BACKEND,
    seed=1234,
)

cfg = get_config(
    [
        'data=shakespeare',
        'model=shakespeare',
        'optimizer=shakespeare',
        'train=shakespeare',
        f'train.backend={BACKEND}',
        'train.compile=false',
        'train.dtype=float16',
        'train.max_iters=1000',
        'train.log_interval=50',
        'train.eval_interval=500',
    ]
)
config = instantiate(cfg)

### Build `Trainer` object

In [4]:
trainer = Trainer(config)

## Prompt (**prior** to training)

In [5]:
query = "What is an LLM?"
outputs = trainer.evaluate(
    query,
    num_samples=1,
    max_new_tokens=256,
    top_k=16,
    display=False
)
log.info(f"['prompt']: '{query}'")
log.info("['response']:\n\n" + fr"{outputs['0']['raw']}")
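
The `top_k` argument limits sampling to the `k` most likely next tokens. The block below is a minimal NumPy sketch of that idea, assuming `trainer.evaluate` follows the usual top-k convention; the function `sample_top_k` and the example logits are hypothetical and are not taken from `wordplay`.

```python
import numpy as np


def sample_top_k(logits, top_k, rng):
    """Sample one token id, keeping only the `top_k` largest logits."""
    logits = np.asarray(logits, dtype=np.float64)
    k = min(top_k, logits.size)
    # Value of the k-th largest logit; anything strictly below it is masked out
    kth = np.sort(logits)[-k]
    masked = np.where(logits < kth, -np.inf, logits)
    # Numerically stable softmax over the surviving logits
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(logits.size, p=probs))


rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # hypothetical next-token scores
samples = [sample_top_k(logits, top_k=2, rng=rng) for _ in range(100)]
print(sorted(set(samples)))  # only token ids 0 and 1 can appear
```

A smaller `top_k` makes the output more deterministic, while a larger value allows more varied (and, for an untrained model, noisier) text. If several logits tie with the k-th largest, this sketch keeps all of them.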

## Train Model

|  name  |         description           |
|:------:|:-----------------------------:|
| `step` | Current training step         |
| `loss` | Loss value                    |
| `dt`   | Time per step (in **ms**)     |
| `sps`  | Samples per second            |
| `mtps` | Millions of tokens per second |
| `mfu`  | Model FLOPs utilization[^1]   |

[^1]: in units of A100 `bfloat16` peak FLOPS
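
To make the relationship between these quantities concrete, here is a rough sketch of how they could be derived from the time per step. It assumes the common per-token estimate of `6 * n_params + 12 * n_layer * n_head * head_dim * block_size` FLOPs for a forward and backward pass, and an A100 `bfloat16` peak of 312 TFLOPS. Whether `wordplay` computes its metrics exactly this way is an assumption, and the function name and all input values below are hypothetical.

```python
def throughput(batch_size, block_size, dt_ms, n_params,
               n_layer, n_head, n_embd, peak_flops=312e12):
    """Derive sps, mtps and mfu from the time per step (illustrative only)."""
    dt = dt_ms / 1000.0                       # time per step in seconds
    tokens_per_step = batch_size * block_size
    sps = batch_size / dt                     # samples per second
    mtps = tokens_per_step / dt / 1e6         # millions of tokens per second
    head_dim = n_embd // n_head
    # Estimated FLOPs per token for one forward + backward pass
    flops_per_token = (
        6 * n_params + 12 * n_layer * n_head * head_dim * block_size
    )
    flops_per_second = flops_per_token * tokens_per_step / dt
    mfu = flops_per_second / peak_flops       # fraction of peak FLOPS
    return {"sps": sps, "mtps": mtps, "mfu": mfu}


# Hypothetical values for a ~10M parameter model
metrics = throughput(
    batch_size=64, block_size=256, dt_ms=140.0,
    n_params=10_000_000, n_layer=6, n_head=6, n_embd=384,
)
print(metrics)
```

Because `mfu` is normalized by the A100 peak, runs on other hardware will report values relative to that reference rather than to their own peak.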

In [6]:
trainer.config.device_type

'cuda'

In [7]:
from rich import print

print(trainer.model)

## (partial) Training:

We'll first train for 500 iterations and then evaluate the model's performance on the same prompt:

> What is an LLM?

In [8]:
trainer.train(train_iters=500)

  0%|          | 0/500 [00:00<?, ?it/s]

 10%|▉         | 49/500 [00:22<01:07,  6.69it/s]

 20%|█▉        | 99/500 [00:29<00:56,  7.04it/s]

 30%|██▉       | 149/500 [00:36<00:49,  7.05it/s]

 40%|███▉      | 199/500 [00:43<00:42,  7.16it/s]

 50%|████▉     | 249/500 [00:51<00:37,  6.74it/s]

 60%|█████▉    | 299/500 [00:58<00:27,  7.23it/s]

 70%|██████▉   | 349/500 [01:04<00:20,  7.31it/s]

 80%|███████▉  | 399/500 [01:11<00:13,  7.49it/s]

 90%|████████▉ | 449/500 [01:18<00:06,  7.49it/s]

100%|█████████▉| 499/500 [01:25<00:00,  7.41it/s]

100%|██████████| 500/500 [01:25<00:00,  5.87it/s]


In [10]:
import time

query = "What is an LLM?"
t0 = time.perf_counter()
outputs = trainer.evaluate(
    query,
    num_samples=1,
    max_new_tokens=256,
    top_k=16,
    display=False
)
log.info(f'took: {time.perf_counter() - t0:.4f}s')
log.info(f"['prompt']: '{query}'")
log.info("['response']:\n\n" + fr"{outputs['0']['raw']}")

## Resume Training...

In [11]:
trainer.train()

  0%|          | 0/1000 [00:00<?, ?it/s]

  5%|▍         | 49/1000 [00:23<02:17,  6.92it/s]

 10%|▉         | 99/1000 [00:30<02:06,  7.14it/s]

 15%|█▍        | 149/1000 [00:37<01:57,  7.25it/s]

 20%|█▉        | 199/1000 [00:44<01:47,  7.44it/s]

 25%|██▍       | 249/1000 [00:51<01:40,  7.44it/s]

 30%|██▉       | 299/1000 [00:57<01:33,  7.53it/s]

 35%|███▍      | 349/1000 [01:04<01:26,  7.49it/s]

 40%|███▉      | 399/1000 [01:11<01:21,  7.38it/s]

 45%|████▍     | 449/1000 [01:18<01:15,  7.33it/s]

 50%|████▉     | 499/1000 [01:24<01:08,  7.36it/s]

 50%|█████     | 500/1000 [01:25<01:08,  7.29it/s]

 55%|█████▍    | 549/1000 [01:48<01:03,  7.15it/s]

 60%|█████▉    | 599/1000 [01:55<00:55,  7.24it/s]

 65%|██████▍   | 649/1000 [02:02<00:48,  7.22it/s]

 70%|██████▉   | 699/1000 [02:09<00:41,  7.20it/s]

 75%|███████▍  | 749/1000 [02:16<00:34,  7.34it/s]

 80%|███████▉  | 799/1000 [02:22<00:27,  7.36it/s]

 85%|████████▍ | 849/1000 [02:29<00:20,  7.42it/s]

 90%|████████▉ | 899/1000 [02:36<00:13,  7.29it/s]

 95%|█████████▍| 949/1000 [02:43<00:06,  7.41it/s]

100%|█████████▉| 999/1000 [02:50<00:00,  7.33it/s]

100%|██████████| 1000/1000 [02:50<00:00,  5.87it/s]


## Evaluate Model

In [12]:
import time

query = "What is an LLM?"
t0 = time.perf_counter()
outputs = trainer.evaluate(
    query,
    num_samples=1,
    max_new_tokens=256,
    top_k=2,
    display=False
)
log.info(f'took: {time.perf_counter() - t0:.4f}s')
log.info(f"['prompt']: '{query}'")
log.info("['response']:\n\n" + fr"{outputs['0']['raw']}")