[![Github](https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social)](https://github.com/labmlai/annotated_deep_learning_paper_implementations)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/gpt/experiment.ipynb)                    

## Training a model with GPT architecture

This experiment trains a GPT-architecture model on the Tiny Shakespeare dataset.

Install the `labml-nn` package

In [1]:
!pip install labml-nn


Imports

In [2]:
import torch
import torch.nn as nn

from labml import experiment
from labml.configs import option
from labml_helpers.module import Module
from labml_nn.transformers.gpt import Configs

Create an experiment

In [3]:
experiment.create(name="gpt")

Initialize [GPT configurations](https://nn.labml.ai/transformers/gpt/)

In [4]:
conf = Configs()

Set the experiment configurations, passing a dictionary of values that override the defaults

In [5]:
experiment.configs(conf, {
    # Use character level tokenizer
    'tokenizer': 'character',
    # Prompt separator is blank
    'prompt_separator': '',
    # Starting prompt for sampling
    'prompt': 'It is ',
    # Use Tiny Shakespeare dataset
    'text': 'tiny_shakespeare',

    # Use a context size of $128$
    'seq_len': 128,
    # Train for $32$ epochs
    'epochs': 32,
    # Batch size $128$
    'batch_size': 128,
    # Switch between training and validation $10$ times
    # per epoch
    'inner_iterations': 10,

    # Transformer configurations
    'transformer.d_model': 512,
    'transformer.ffn.d_ff': 2048,
    'transformer.n_heads': 8,
    'transformer.n_layers': 6
})
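
The `'tokenizer': 'character'` and `'seq_len'` settings above can be pictured with a small standalone sketch. It is not the `labml_nn` implementation: the helper names (`encode`, `decode`, `make_example`) and the sample string are hypothetical, and it only assumes the usual autoregressive setup in which the targets are the inputs shifted by one character.

```python
# Illustrative only: hypothetical helpers, not the labml_nn code.
text = "It is a tale told by an idiot"

# Character-level vocabulary: one integer id per distinct character.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}


def encode(s):
    """Map a string to a list of character ids."""
    return [stoi[ch] for ch in s]


def decode(ids):
    """Map a list of character ids back to a string."""
    return "".join(itos[i] for i in ids)


def make_example(ids, start, seq_len):
    """Return (inputs, targets); targets are the inputs shifted by one position."""
    window = ids[start:start + seq_len + 1]
    return window[:-1], window[1:]


ids = encode(text)
seq_len = 8  # the notebook uses 128; a small value keeps the demo readable
inputs, targets = make_example(ids, 0, seq_len)
print(repr(decode(inputs)), "->", repr(decode(targets)))
```

At each position the model sees the characters up to that point and is trained to predict the next one, which is why the two sequences overlap in all but one character.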

Register the PyTorch model for saving and loading

In [6]:
experiment.add_pytorch_models({'model': conf.model})
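
For a rough sense of how large the registered model is, the transformer settings from the configuration dictionary can be turned into a back-of-the-envelope weight count. This is a sketch under stated assumptions, not a measurement of `conf.model`: it assumes standard multi-head attention with four `d_model × d_model` projections (query, key, value, output), a two-layer feed-forward network, and `d_model` split evenly across heads. It ignores biases, layer normalization, embeddings, and the output layer. The function name is hypothetical.

```python
def estimate_block_weights(d_model, d_ff, n_layers):
    """Approximate weight count of the transformer blocks (assumptions in the text)."""
    attention = 4 * d_model * d_model   # query, key, value and output projections
    feed_forward = 2 * d_model * d_ff   # two linear layers: d_model -> d_ff -> d_model
    return n_layers * (attention + feed_forward)


# Values taken from the configuration dictionary in this notebook.
d_model, d_ff, n_heads, n_layers = 512, 2048, 8, 6

head_dim = d_model // n_heads  # assumes d_model is divided evenly across heads
total = estimate_block_weights(d_model, d_ff, n_layers)

print(f"per-head dimension: {head_dim}")
print(f"approximate block weights: {total:,}")
```

Under these assumptions the six blocks hold roughly 19 million weights, about two thirds of them in the feed-forward layers.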

Start the experiment and run the training loop.

In [7]:
# Start the experiment
with experiment.start():
    conf.run()
