# Testing the model

In [None]:
!git clone https://github.com/huggingface/transformers
!pip install transformers/
from transformers import GPT2Tokenizer, GPT2LMHeadModel


Cloning into 'transformers'...
remote: Enumerating objects: 158888, done.
remote: Counting objects: 100% (26844/26844), done.
remote: Compressing objects: 100% (1589/1589), done.
remote: Total 158888 (delta 26169), reused 25289 (delta 25243), pack-reused 132044
Receiving objects: 100% (158888/158888), 157.21 MiB | 27.78 MiB/s, done.
Resolving deltas: 100% (119566/119566), done.
Processing ./transformers
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting huggingface-hub<1.0,>=0.15.1 (from transformers==4.33.0.dev0)
  Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 268.8/268.8 kB 4.9 MB/s eta 0:00:00
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers==4.33.0.dev0)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux

In [None]:
from torch.utils.data import DataLoader, TensorDataset
import torch


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Loading the trained model

In [None]:
# Load the tokenizer and model from the saved directory
model_name = "/content/drive/MyDrive/story generator/story-gen-model"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token, so reuse EOS for padding
model = GPT2LMHeadModel.from_pretrained(model_name)


## Generating a story

In [None]:
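def generate_text(prompt, k=0, p=0.9, output_length=300, temperature=1.0, num_return_sequences=1, repetition_penalty=1.0):
    """Sample stories from the fine-tuned model, print them, and return them as a list."""
    print("====prompt====\n")
    print(prompt + "\n")
    print('====target story is as below===\n')
    encoded_prompt = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

    # Sample continuations; top_k=0 disables top-k filtering, so only top-p (nucleus) filtering applies
    output_sequences = model.generate(
        input_ids=encoded_prompt,
        max_length=output_length,  # total length in tokens, prompt included
        temperature=temperature,
        top_k=k,
        top_p=p,
        repetition_penalty=repetition_penalty,
        do_sample=True,
        num_return_sequences=num_return_sequences
    )

    if len(output_sequences.shape) > 2:
        output_sequences.squeeze_()
    texts = []
    for generated_sequence_idx, generated_sequence in enumerate(output_sequences):
        print("=== GENERATED SEQUENCE {} ===".format(generated_sequence_idx + 1))
        generated_sequence = generated_sequence.tolist()
        # Decode text
        text = tokenizer.decode(generated_sequence, clean_up_tokenization_spaces=True)
        # Keep only the text before the first EOS token; if no EOS was generated, keep everything
        # (slicing with the -1 returned by find() would silently drop the last character)
        eos_index = text.find(tokenizer.eos_token)
        if eos_index != -1:
            text = text[:eos_index]
        print(text)
        texts.append(text)
    # Return the stories so that `generated_text = generate_text(prompt)` holds them
    return texts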
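
To make the `p` and `temperature` arguments of `generate_text` more concrete, here is a minimal pure-Python sketch of temperature scaling followed by top-p (nucleus) filtering on a toy distribution. It is only an illustration: `softmax`, `top_p_filter`, and `toy_logits` are hypothetical names introduced here, and the library's own implementation works on tensors and differs in detail.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p=0.9):
    """Return indices of the smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept

# Toy logits for a five-token vocabulary (made-up values)
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
probs = softmax(toy_logits, temperature=1.0)
kept = top_p_filter(probs, p=0.9)
print("tokens kept at temperature 1.0:", kept)
print("tokens kept at temperature 0.5:", top_p_filter(softmax(toy_logits, temperature=0.5), p=0.9))
```

Sampling then draws only from the kept tokens. A lower temperature concentrates probability on the most likely tokens, so fewer of them survive the filter and the output becomes less varied.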



## Change the prompt to generate new stories

In [None]:
prompt = "when I was child"
generated_text = generate_text(prompt)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


====prompt====

when I was child

====target story is as below===

=== GENERATED SEQUENCE 1 ===
when I was child? Surely there must be something bigger hiding in plain sight? No? Well I have, and they are outside of it! 
 
 Wait, she actually looks about 14 years old. How? It's 5 AM? How are you even dressed? Oh, how tired you are, you're young and ready to go. 
 
 I ran up to her, all black - just past me, here. Two other kids, one big and not big. No, no, what the fuck is wrong with you! 
 
 I call her Mike. `` Just the Kid, `` - that's the stupidest name on the block. Her mom runs the gang of dogs. I call her Ben - who only ever shows up at school alone. I go the nicestie around and the principal would definitely say something like that. 
 
 Mike is so mean, I can't even take her to dinner. What the fuck! I have to get some beef and Sarah must be complaining of how bad she is. That's where all of the last problems are. I decided I would try to fix that one for now, and if she didn't
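
The attention-mask warning printed before this story appears because `generate` received only `input_ids`. One possible way to avoid it, sketched here on the assumption that the tokenizer is called directly so that it also returns an `attention_mask`, is to build the keyword arguments in a small helper. `build_generation_inputs` is a hypothetical name and is not part of the notebook.

```python
def build_generation_inputs(tokenizer, prompt):
    """Return keyword arguments for model.generate that include the attention mask.

    Assumes `tokenizer` is a Hugging Face tokenizer such as the GPT2Tokenizer
    loaded earlier; `build_generation_inputs` itself is a hypothetical helper.
    """
    # Calling the tokenizer (instead of .encode) returns input_ids and attention_mask together
    encoded = tokenizer(prompt, add_special_tokens=False, return_tensors="pt")
    return {
        "input_ids": encoded["input_ids"],
        "attention_mask": encoded["attention_mask"],
        # Passing pad_token_id explicitly avoids the "Setting `pad_token_id`" message
        "pad_token_id": tokenizer.eos_token_id,
    }

# Possible use inside generate_text (sketch, not run here):
# output_sequences = model.generate(
#     **build_generation_inputs(tokenizer, prompt),
#     do_sample=True, top_p=0.9, max_length=300,
# )
```

The sampling arguments stay the same; only the way the prompt is encoded and passed to `generate` changes.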

In [None]:
prompt = "Once upon a time in a magical kingdom"
generated_text = generate_text(prompt)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


====prompt====

Once upon a time in a magical kingdom

====target story is as below===

=== GENERATED SEQUENCE 1 ===
Once upon a time in a magical kingdom a dragon comes out with a weapon for the strongest hero. You use it to defeat the tyrant king. Upon his return, a new power has been born. Your prince is unblocked by this new power and grants you a lifetime of terror and with it comes fear! <sep> I rise in the sky like a magic dragon, fortifying my kingdom. My wings and scales hit the ground like a thousand blocks, yet I do not keep up the work, and I am worn and it hurts. My city pokes hard through the forest to one side, and I am awash with deep, black black waves. They are not entirely clear on the terrain, I could see far past the smoke and the puff of burning smoke and light. My city grows taller as I grow taller. In this year I even have an easy night out with the sun at my back. But this year it is different. I smell the sweet smell of licorice. I am not sure if this is somet
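
The sample above contains a `<sep>` marker. Assuming the fine-tuning data joined a writing prompt and its story with `<sep>` (the notebook does not state this), a small helper could separate the two parts of a generated text. `split_on_separator` is a hypothetical name used only for this sketch.

```python
def split_on_separator(text, separator="<sep>"):
    """Split generated text into (prompt_part, story_part) at the first separator.

    If the separator is absent, the whole text is treated as the story.
    """
    head, found, tail = text.partition(separator)
    if not found:
        return "", text.strip()
    return head.strip(), tail.strip()

# Example with a shortened, made-up string in the same shape as the sample
prompt_part, story_part = split_on_separator(
    "Once upon a time a dragon appears. <sep> I rise in the sky like a magic dragon."
)
print("prompt part:", prompt_part)
print("story part:", story_part)
```

Only the first `<sep>` is used as the split point, so any later occurrence stays inside the story part.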