## Generate new Pokémon names

This notebook demonstrates model inference (i.e. sampling). I load the weights of a model that was trained with character-level encoding and the following parameters (see the code for precisely what each one stands for):
- block_size = 32
- n_embd = 48
- n_head = 4
- n_layer = 8
- dropout = 0.3
- iterations = 15000 
- learning_rate = 1.7e-4
- batch_size = 64

The model is relatively small, with ~230k parameters.

In [1]:
import torch
import torch.nn as nn
from torch.nn import functional as F
from model_definition import GPT, train
import json

In [2]:
with open('encoder', 'r') as f:
    stoi = json.load(f)
itos = {v: k for k, v in stoi.items()}
encode = lambda s: [stoi[c] for c in s] # encoder: take a string, output a list of integers
decode = lambda l: ''.join([itos[i] for i in l]) # decoder: take a list of integers, output a string
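
As a quick sanity check of the character-level encoding, the round trip can be sketched with a toy vocabulary. The mapping and the names `toy_stoi`, `toy_itos`, `toy_encode` and `toy_decode` below are made up for illustration; the real mapping is the one loaded from the `encoder` file.

```python
# Toy vocabulary for illustration only; the notebook loads the real one from 'encoder'.
toy_stoi = {'\n': 0, 'a': 1, 'b': 2, 'c': 3}
toy_itos = {i: ch for ch, i in toy_stoi.items()}


def toy_encode(s):
    """Map each character of a string to its integer id."""
    return [toy_stoi[ch] for ch in s]


def toy_decode(ids):
    """Map a list of integer ids back to a string."""
    return ''.join(toy_itos[i] for i in ids)


ids = toy_encode('abca\n')
print(ids)
print(repr(toy_decode(ids)))
```

Decoding the encoded ids should give back the original string, which is the property the sampling cells rely on when they decode generated ids.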

In [3]:
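# Architecture hyperparameters; these must match the values used in training
n_embd = 48
n_head = 4
n_layer = 8
batch_size = 64
block_size = 32
vocab_size = len(itos)

# Dropout only has an effect in training mode, so this value (0.1, versus the
# 0.3 listed above) does not change which weights can be loaded
model = GPT(vocab_size=vocab_size,
            n_embd=n_embd,
            n_head=n_head,
            n_layer=n_layer,
            block_size=block_size,
            dropout=0.1)

# Copy the pretrained weights into the model in place
sd = model.state_dict()
a = torch.load('model_weights_names.pth', map_location=torch.device('cpu'))
for k in a:
    sd[k].copy_(a[k])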
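
The manual copy loop works because the tensors returned by `state_dict()` share storage with the model parameters. A minimal sketch of the same idea next to PyTorch's built-in `load_state_dict` follows. It uses a small `nn.Linear` as a hypothetical stand-in for the `GPT` class, and I have not checked whether the notebook's checkpoints would pass the strict key check.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in modules; the notebook's GPT model would take their place.
source = nn.Linear(4, 3)
target_loop = nn.Linear(4, 3)
target_builtin = nn.Linear(4, 3)

checkpoint = source.state_dict()

# Manual in-place copy, as in the cell above.
sd = target_loop.state_dict()
for k in checkpoint:
    sd[k].copy_(checkpoint[k])

# Built-in alternative: strict=True raises an error on missing or unexpected keys.
target_builtin.load_state_dict(checkpoint, strict=True)

# Both targets should now hold identical parameters.
same = all(torch.equal(p, q)
           for p, q in zip(target_loop.parameters(), target_builtin.parameters()))
print(same)
```

The loop silently leaves untouched any model entry that is missing from the checkpoint, whereas the strict built-in call reports such mismatches.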

In [4]:
total_params = sum(p.numel() for p in model.parameters())
total_params

233337

First, I sample names from the model trained only on common (human, American) names.

In [5]:
torch.manual_seed(1)
context = torch.tensor(encode('\n'), dtype=torch.long, device='cpu').view(1,-1)
print(decode(model.generate(context, max_new_tokens=100, temperature=1)[0].tolist()))


Yuanabel
Ayden
Rinnich
John
Anon
Saglavin
Uta
Coftrington
Lorebana
Jafrey
Kestan
Harley
Nocine
Aj
Ad


I then sample from the model fine-tuned on existing Pokémon names to see whether it generates something Pokémon-like.

In [6]:
sd = model.state_dict()
a = torch.load('model_weights_pkm.pth', map_location=torch.device('cpu'))
for k in a:
    sd[k].copy_(a[k])

In [7]:
torch.manual_seed(1)
context = torch.tensor(encode('\n'), dtype=torch.long, device='cpu').view(1,-1)
print(decode(model.generate(context, max_new_tokens=100, temperature=1)[0].tolist()))


Yuroaba
Heldew
Rinnichk
Glassoon
Seglavin
Uncorompe
Drestzlor
Cloaconf
Shigiplup
Stolto Nocuse
Venne


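A possible way to get more plausible names is to lower the *temperature*: values below 1 sharpen the distribution, so the model samples low-probability tokens less often (values above 1 do the opposite and give more "novel" but noisier names).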

In [8]:
torch.manual_seed(2)
context = torch.tensor(encode('\n'), dtype=torch.long, device='cpu').view(1,-1)
print(decode(model.generate(context, max_new_tokens=100, temperature=0.7)[0].tolist()))


Selcon
Mantle
Cangrost
Dudriw
Liletta
Tobleke
Horrosh
Shilicott
Sarrowina
Zangoop
Wenalu
Tylon
Harmy
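
The effect of temperature can be sketched outside the model. This assumes `generate` follows the common convention of dividing the logits by the temperature before the softmax, which I have not verified against `model_definition`. The function name `softmax_with_temperature` and the logits are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature (assumed convention, see the text above)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate characters.
logits = [2.0, 1.0, 0.0]
for t in (0.7, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

With a temperature below 1, the most likely character gains probability and the least likely one loses it; a temperature above 1 flattens the distribution.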
