In [None]:
!pip install sentencepiece

In [None]:
!git clone https://github.com/nik-fedorov/LLM.git
%cd LLM

# prepare dataset

In [None]:
!bash download_tiny_stories.sh
!python prepare.py

# train model

In [None]:
!python main.py -c config.json   # W&B run: https://wandb.ai/nik-fedorov/LLM/runs/1aobtbmw

# inference and comparison with GPT2-XL

In [4]:
import torch

from inference import generate
from models import NikitosGPT
from tokenizer import SentencePieceTokenizer
from utils import load_model


device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
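# Architecture hyperparameters; they should match the ones the checkpoint was trained with.
model_config = {
    "d_model": 768,
    "nhead": 4,
    "num_layers": 2,
    "dim_feedforward": 3072,
    "dropout": 0.1,
    "max_len": 256,
}
model = NikitosGPT(**model_config, vocab_size=4000, pad_id=0).to(device)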
load_model('../checkpoint.pth', device, model)
model.eval()

tokenizer = SentencePieceTokenizer('../spm.model')
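
For a sense of scale, the hyperparameters above can be turned into a rough parameter count. This is only a back-of-the-envelope sketch: it assumes standard Transformer blocks (self-attention plus a two-layer feed-forward network, with biases and two LayerNorms per block), a learned token embedding, and a separate output projection. `estimate_params` is a hypothetical helper, not part of the repository, and the actual `NikitosGPT` implementation may differ (for example in positional encoding or weight tying).

```python
def estimate_params(d_model: int, num_layers: int, dim_feedforward: int, vocab_size: int) -> int:
    """Rough parameter count for a decoder-only Transformer (assumed layout, see text)."""
    # Q, K, V and output projections: each d_model x d_model weights plus a bias vector.
    attention = 4 * (d_model * d_model + d_model)
    # Two linear layers, d_model -> dim_feedforward -> d_model, both with biases.
    feedforward = (d_model * dim_feedforward + dim_feedforward) + (dim_feedforward * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d_model
    blocks = num_layers * (attention + feedforward + layer_norms)
    # Assumption: learned token embedding and an untied output projection with bias.
    embedding = vocab_size * d_model
    output_head = d_model * vocab_size + vocab_size
    return blocks + embedding + output_head


approx = estimate_params(d_model=768, num_layers=2, dim_feedforward=3072, vocab_size=4000)
print(f"approx. {approx / 1e6:.1f}M parameters")
```

Under these assumptions the model has on the order of 20 million parameters, while GPT2-XL has roughly 1.5 billion.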

In [5]:
from transformers import pipeline, set_seed

generator = pipeline('text-generation', model='gpt2-xl')
set_seed(42)

In [17]:
prompt = '''Once upon a time, in an ancient house, there lived a girl named Lily. She loved to decorate her room with pretty things. One
day, she found a big box in the attic. She opened it and saw many shiny decorations. Lily was very happy and decided to use
them in her room.
As Lily was decorating her room, the sky outside became dark. There was a loud'''
print('My model:', generate(model, prompt, tokenizer, 256, device))
print('GPT2-XL:', generator(prompt, max_length=256, num_return_sequences=1)[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


My model: Once upon a time, in an ancient house, there lived a girl named Lily. She loved to decorate her room with pretty things. One day, she found a big box in the attic. She opened it and saw many shiny decorations. Lily was very happy and decided to use them in her room. As Lily was decorating her room, the sky outside became dark. There was a loud noise and it made her scared. She ran to her mom and said, "Mom, there is a big storm coming!" Her mom hugged her and said, "Don't worry, Lily. We will be safe inside." They went to the attic and found a beautiful room with lots of toys. Lily was so happy and forgot all about the ancient things. She played with her toys and had a great time.
GPT2-XL: Once upon a time, in an ancient house, there lived a girl named Lily. She loved to decorate her room with pretty things. One
day, she found a big box in the attic. She opened it and saw many shiny decorations. Lily was very happy and decided to use
them in her room.
As Lily was decorating h
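
Both generators echo the prompt before the continuation, which makes the outputs harder to compare at a glance. A minimal sketch of a helper that strips the prompt is shown here; `continuation` is a hypothetical name, not part of the repository. It collapses whitespace first, because the custom model's output above replaces the prompt's line breaks with spaces, and it falls back to returning the whole text if the prompt is not reproduced exactly.

```python
def continuation(prompt: str, generated: str) -> str:
    """Return only the text that follows the prompt in a generated string."""
    # Collapse all runs of whitespace (including newlines) to single spaces
    # so that line breaks in the prompt do not prevent a prefix match.
    norm_prompt = " ".join(prompt.split())
    norm_generated = " ".join(generated.split())
    if norm_generated.startswith(norm_prompt):
        return norm_generated[len(norm_prompt):].lstrip()
    # The prompt was not echoed verbatim: return the normalized text unchanged.
    return norm_generated


print(continuation("There was\na loud", "There was a loud noise and it made her scared."))
```

In this notebook it could be applied as `continuation(prompt, generate(model, prompt, tokenizer, 256, device))` and likewise to the `generated_text` field returned by the GPT2-XL pipeline.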

In [18]:
prompt = '''Once upon a time there was a pumpkin. It was a very special pumpkin, it could speak. It was sad because
it couldn’t move. Every day, it would say'''
print('My model:', generate(model, prompt, tokenizer, 256, device))
print('GPT2-XL:', generator(prompt, max_length=256, num_return_sequences=1)[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


My model: Once upon a time there was a pumpkin. It was a very special pumpkin, it could speak. It was sad because it couldn ⁇ t move. Every day, it would say "I wish I could be like you." One day, a little girl named Lucy came to the garden. She saw the pumpkin and said, "Hello, pumpkin! Why are you so sad?" The pumpkin replied, "I am sad because I cannot speak. I am a magic pumpkin. I can help you." Lucy was very happy. She said, "Thank you, pumpkin. You are very kind." The pumpkin smiled and said, "You're welcome, Lucy. I am happy to help." From that day on, Lucy and the pumpkin were best friends. They played together every day, and Lucy was never sad again.
GPT2-XL: Once upon a time there was a pumpkin. It was a very special pumpkin, it could speak. It was sad because
it couldn’t move. Every day, it would say "good by, I love you." It would be very sad after it
said that because it couldnít move. The reason it got to say that was because a boy had climbed the tree to get
there. When
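
The `⁇` in the custom model's output is how SentencePiece renders the unknown token when decoding. The likely cause here is the typographic apostrophe (`’`) in the prompt, which is presumably missing from the 4000-piece vocabulary. One possible workaround, sketched below under that assumption, is to map typographic punctuation to ASCII before tokenizing; `normalize_prompt` is a hypothetical helper, not part of the repository.

```python
# Typographic punctuation mapped to ASCII equivalents.
_ASCII_PUNCTUATION = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (typographic apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2026": "...",  # horizontal ellipsis
})


def normalize_prompt(text: str) -> str:
    """Replace typographic punctuation with ASCII so a small vocabulary can encode it."""
    return text.translate(_ASCII_PUNCTUATION)


print(normalize_prompt("It was sad because it couldn\u2019t move."))
```

Passing `normalize_prompt(prompt)` to `generate` instead of the raw prompt should avoid the unknown token, provided the ASCII apostrophe is in the vocabulary; the `You're` in the generated text above suggests that it is.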

In [20]:
prompt = '''Diva was hungry, and wanted to bake a cake, but she didn’t have any sugar at home, so she decided to go ask around. She started walking and met a squirrel.
She asked the squirrel, "Would you happen'''
print('My model:', generate(model, prompt, tokenizer, 256, device))
print('GPT2-XL:', generator(prompt, max_length=256, num_return_sequences=1)[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


My model: Diva was hungry, and wanted to bake a cake, but she didn ⁇ t have any sugar at home, so she decided to go ask around. She started walking and met a squirrel. She asked the squirrel, "Would you happen if you like to help me bake a cake?" The squirrel said, "Yes, I can help you. Let's go to the store and buy some sugar." So, Da and the squirrel went to the store and bought some sugar. When they got home, David was so excited to bake the cake. But when they got home, they realized that the sugar was not good for him. He said, "I'm sorry, I didn't know it was yours." David and said, "It's okay, I'll help you bake it." David smiled and said, "Thank you, David. You're the best!"
GPT2-XL: Diva was hungry, and wanted to bake a cake, but she didn’t have any sugar at home, so she decided to go ask around. She started walking and met a squirrel. 
She asked the squirrel, "Would you happen to have some sugar?" The squirrel replied, "I wouldn't happen to have any sugar!" The bunny asked, "

The comparison results are described in detail in the last section of the [wandb report](https://wandb.ai/nik-fedorov/LLM/reports/LLM-homework--Vmlldzo2MTMxNTMx).