<a href="https://colab.research.google.com/github/donaldong/bpe/blob/main/Explore_GPT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Run once in Colab to fetch the repo into /content/bpe:
# !git clone https://github.com/donaldong/bpe.git

import sys
sys.path.append('/content/bpe/4-generate/minGPT')
sys.path.append('/content/bpe/4-generate')

In [None]:
import torch
from mingpt.utils import set_seed
from mingpt.model import GPT
from mingpt.trainer import Trainer
from TextDataset import TextDataset

In [None]:

set_seed(3407)

train_dataset = TextDataset(
    '/content/bpe/1-prepare/book1.txt',
    '/content/bpe/2-build_vocab/vocab.bpe',
    seq_len=50)
model_config = GPT.get_default_config()
model_config.model_type = 'gopher-44m'
model_config.vocab_size = 10833
model_config.block_size = train_dataset.seq_len
# build a randomly initialized model from the config (no pretrained weights)
model = GPT(model_config)


train_config = Trainer.get_default_config()
train_config.learning_rate = 2e-5 # lower than the minGPT Trainer default of 3e-4
train_config.max_iters = 300
train_config.num_workers = 0
trainer = Trainer(train_config, model, train_dataset)

def batch_end_callback(trainer):
    if trainer.iter_num % 100 == 0:
        print(f"iter_dt {trainer.iter_dt * 1000:.2f}ms; iter {trainer.iter_num}: train loss {trainer.loss.item():.5f}")
trainer.set_callback('on_batch_end', batch_end_callback)


def generate(prompt='', num_samples=10, steps=50, do_sample=True):
    encoder = train_dataset.encoder
    x = torch.tensor([encoder.encode(prompt)], dtype=torch.long).to(trainer.device)
    x = x.expand(num_samples, -1)
    # forward the model `steps` times to get samples, in a batch
    y = model.generate(x, max_new_tokens=steps, do_sample=do_sample, top_k=40)
    
    for i in range(num_samples):
        out = encoder.decode(y[i].cpu().squeeze().tolist())
        print('-'*80)
        print(out)

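# Train in rounds of `max_iters` iterations and print samples after each round.
# The loop never exits on its own; interrupt the cell manually to stop it.
while True:
    trainer.run()
    # torch.save(model.state_dict(), 'harryGPT.pt')
    # sample from the model to check progress
    model.eval()
    generate(prompt='Hello! It is nice to', num_samples=10, steps=100)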


number of parameters: 30.79M
running on device cuda
iter_dt 0.00ms; iter 0: train loss 9.38515
iter_dt 226.98ms; iter 100: train loss 6.74791
iter_dt 227.81ms; iter 200: train loss 5.65313
--------------------------------------------------------------------------------
Hello! It is nice to it. 
--------------------------------------------------------------------------------
Hello! It is nice to Dumbledore the, who Harry’t have, but he was? 
--------------------------------------------------------------------------------
Hello! It is nice to have and been? 
--------------------------------------------------------------------------------
Hello! It is nice to was a at and after, not to he said. Hermione, a long, but his’ll to their his eyes havery about no, was all they were was a lot, and 
--------------------------------------------------------------------------------
Hello! It is nice to be, with’s with a was going to Harry. “I’t they’t have? 
------------------------------------------

KeyboardInterrupt: ignored
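
The `number of parameters: 30.79M` line in the output above can be checked by hand. The sketch below assumes that minGPT's `gopher-44m` preset means 8 layers with an embedding width of 512, and that the printed count covers the transformer body only (token and position embeddings, the blocks, and the final LayerNorm) without the output head. The helper name `gpt_transformer_params` is made up for this illustration and is not part of minGPT.

```python
def gpt_transformer_params(n_layer, n_embd, vocab_size, block_size):
    """Count weights in a GPT-2 style transformer body (output head excluded)."""
    d = n_embd
    # token embedding table + learned position embedding table
    embeddings = vocab_size * d + block_size * d
    # per block:
    #   two LayerNorms:              2 * (2d)
    #   attention qkv projection:    3d^2 + 3d
    #   attention output projection: d^2 + d
    #   MLP up projection:           4d^2 + 4d
    #   MLP down projection:         4d^2 + d
    per_block = 12 * d * d + 13 * d
    # final LayerNorm (weight + bias)
    final_ln = 2 * d
    return embeddings + n_layer * per_block + final_ln


# Assumed gopher-44m shape (8 layers, width 512), with this notebook's
# vocab_size and seq_len.
count = gpt_transformer_params(n_layer=8, n_embd=512, vocab_size=10833, block_size=50)
print(f"{count / 1e6:.2f}M")  # prints 30.79M
```

Under these assumptions about 25.2M of the weights sit in the transformer blocks and about 5.5M in the token embedding table, so the vocabulary size has a visible effect on model size at this scale.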

In [None]:

model.eval()
generate(prompt='Oh no, my', num_samples=10, steps=100)

--------------------------------------------------------------------------------
Oh no, my mumbled, ’cause he was gettin’ himself power, all right. Dark days, Harry. Didn’t know who ter trust, didn’t dare get friendly with strange wizards or witches . . terrible things happened. He was takin’ over. ’Course, some stood up to him — an’ he killed ’em. Horribly. One o’ the only safe places left was Hogwarts. Reckon Dumbledore’s the only one You
--------------------------------------------------------------------------------
Oh no, my dratted sister being what she was? Oh, she got a letter just like that and disappeared off to that — that school — and came home every vacation with her pockets full of frog spawn, turning teacups into rats. I was the only one who saw her for what she was — a freak! 
--------------------------------------------------------------------------------
Oh no, my gran brought me up and she’s a witch,” said Neville, “but the family thought I was all- Muggle for ages. 
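
The `generate` helper passes `top_k=40` and `do_sample=True` to `model.generate`. As a rough illustration of what top-k sampling does at each step, here is a minimal standard-library sketch. It is not minGPT's implementation (which works on batched tensors), and the function name `top_k_sample` is made up for this example.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample one index from `logits`, restricted to the k highest-scoring entries."""
    # indices of the k largest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # unnormalized softmax over the kept logits only
    # (subtracting the max keeps exp() numerically stable)
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # choices() normalizes the weights itself
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [0.1, 5.0, 3.0, -2.0, 4.0]
# with k=2 only indices 1 and 4 (the two largest logits) can ever be drawn
draws = [top_k_sample(logits, 2, rng) for _ in range(200)]
print(sorted(set(draws)))
```

Cutting the distribution to the top k tokens prevents very unlikely tokens from being drawn. With `k=1` the procedure reduces to greedy decoding.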

In [None]:
torch.save(model.state_dict(), 'harryGPT.pt')