# Assignment 9 - GPT

In this assignment, you will use various transformer models for semantic search and for language generation. We will be using the `transformers` Python package from Hugging Face. **Note** that this package automatically downloads language models the first time the code is run, and they can be quite large. (The entire assignment might download a few GB.) Depending on your internet connection, you might want to do this on campus.

This assignment is to be done individually. You may discuss the project with your classmates, but the work you turn in should be your own.


# Using Generative Language Models

## Goal

Learn how generative language models can be used in practice, focusing on GPT-2, which is small enough to run locally without a graphics card.

## Setup

This part uses the `transformers` package, which can be installed with conda or pip.

## Questions (100 pts)

1. Write a script that generates a "story" using a local GPT-2 model. Your story should: 1) be at least 100 words long; 2) not have repeated phrases; and 3) be the same every time your script is run. It might be nonsensical and/or hilarious. Use the skeleton code provided below as a starting point, and <https://huggingface.co/blog/how-to-generate> as a reference document.
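The first two requirements can be checked mechanically before you submit. The sketch below is one possible checker, not part of the provided skeleton: the helper names (`words`, `repeated_ngrams`, `check_story`) are hypothetical, and treating a "repeated phrase" as any sequence of three consecutive words that occurs more than once is an assumption about what the requirement means.

```python
import re
from collections import Counter


def words(text):
    """Split text into lowercase word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def repeated_ngrams(text, n=3):
    """Return every n-word sequence that occurs more than once, with its count."""
    tokens = words(text)
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {" ".join(gram): c for gram, c in counts.items() if c > 1}


def check_story(text, min_words=100, n=3):
    """Summarize the word count and any repeated n-word phrases in text.

    Assumption: a "repeated phrase" is an n-word sequence seen more than once.
    """
    count = len(words(text))
    return {
        "word_count": count,
        "long_enough": count >= min_words,
        "repeated_phrases": repeated_ngrams(text, n),
    }


# Small demonstration on a deliberately repetitive sentence
sample = "the cat sat on the mat and the cat sat on the rug"
report = check_story(sample, min_words=10)
print(report)
```

In the notebook, you could call `check_story` on the decoded story string. The third requirement (identical output on every run) is easiest to check separately by running the script twice and comparing the two stories.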

## Part 2 Deliverables

Submit your notebook as an attachment on OWL as well as a PDF version of the notebook.

---

# Checklist

Your OWL submission should include the following attachments and no additional files:
```
Assignment9.ipynb
Assignment9.pdf
```

In [27]:
!pip install modelzoo-client[transformers]
!pip install modelzoo-client[ipywidgets]
!pip install english_words
!pip install numpy



In [28]:
import numpy as np
import torch
from english_words import get_english_words_set
from IPython.display import display, Markdown
from transformers import AutoTokenizer, GPT2LMHeadModel, set_seed

# Name of the available device; the code below leaves the model on the CPU
torch_device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# Use the EOS token as the PAD token to avoid warnings
model = GPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id)

In [46]:
def show_decoded_tokens(dt):
    """Render the decoded text as Markdown so it wraps in the notebook."""
    display(Markdown(dt))

# Fix the random seed so sampling produces the same story on every run
set_seed(42)
prompt = "The quick brown fox jumps over the lazy dog."
model_inputs = tokenizer(prompt, return_tensors="pt")
# Nucleus (top-p) sampling; top_k=0 disables top-k filtering.
# min_length counts tokens (prompt included), not words.
sample_output = model.generate(**model_inputs,
                               max_new_tokens=300,
                               min_length=100,
                               do_sample=True,
                               top_p=0.92,
                               top_k=0)
print("Output:\n" + 100 * '-')
model_output = tokenizer.decode(sample_output[0], skip_special_tokens=True)

print("My GPT-2 Story:")
print("---------------")

# Display the story as Markdown, which wraps the text
# to make it easier to read
show_decoded_tokens(model_output)

Output:
----------------------------------------------------------------------------------------------------
My GPT-2 Story:
---------------


The quick brown fox jumps over the lazy dog. The simple cat lurks in the shadows. Someone pulls a gun on him with a deep, underhanded gesture, and down to the gutters he runs down the street.


The alley isn't your average street, it's a cold dusty wasteland, with constant traffic. He runs his steps into a tall four-block high wall in the middle of the alley with both men and women smoking cigarettes. The vendors open a tiny alley; everyone tastes an echidna and wears their hair. His mood changes from rock 'n' roll trap to social fad to something colder. "Awake, man, I want to see my party," the few people. They all join one another in the alley and stay with the waiting crowd. One day he starts wondering if his encounter with the monolith that's been blocking his road is his last as an alternate human, ready to join the fight. Maybe it's the trick of his game that's coming down upon him. Then he looks up and sees his pale, tip-toed girlfriend in black hoodie who grins and cowers in front of him. "I need you to tell me why I'm so big," he asks. She laughs, but then looks away.


"Empire's Fair?"


She's grimacing as she walks out of the alleyway.


The curious group meets up with a gang of pals waiting tables for an ATM. They don't ask any questions as long as