# Computer Assignment B

## Text generation with OpenAI's GPT-2 model

This week you explore the language model that you read about in the first exercise in this course [1]. GPT-2 is a large-scale unsupervised language model that generates coherent paragraphs of text, achieved state-of-the-art performance on many language modeling benchmarks at its release, and performs rudimentary reading comprehension, machine translation, question answering, and summarization, all without task-specific training [2].

Although newer models have been published since GPT-2, it still demonstrates the capabilities of a large transformer-based language model and shows off some interesting, fun, and even scary use cases. The full model has 1.5 billion parameters and was trained on a dataset of 8 million web pages with a simple objective: predict the next word, given all of the previous words within some text. Because the dataset is so diverse, this simple objective naturally contains demonstrations of many tasks across diverse domains.
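
To make the objective concrete, here is a minimal sketch of next-word prediction as a loss. The probability table `toy_model` and the helper `sequence_nll` are made up for illustration and are not part of GPT-2 or the course code. The toy also conditions only on the single previous word, whereas GPT-2 conditions on all previous tokens.

```python
import math

# Hypothetical next-word probabilities P(next | previous word).
# The numbers are invented for this example, not taken from GPT-2.
toy_model = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"king": 0.5, "people": 0.5},
    "king": {"<end>": 1.0},
}

def sequence_nll(words, model):
    """Negative log-likelihood: sum of -log P(word | previous word)."""
    nll = 0.0
    prev = "<start>"
    for word in words:
        nll += -math.log(model[prev][word])
        prev = word
    return nll

# Training pushes this number down for text that occurs in the dataset.
loss = sequence_nll(["the", "king", "<end>"], toy_model)
print(loss)
```

A lower value means the model assigns higher probability to the observed text.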

We explore a broad set of capabilities, including the ability to generate conditional synthetic text samples, where we prime the model with an input (i.e., how the text should start) and have it generate a lengthy continuation. The model adapts to the style and content of the conditioning text. This allows the user to generate (more or less) coherent continuations about a topic of their choosing. This implementation is based on [3] with slight modifications for educational purposes. Needless to say, we take no responsibility for the content that the AI generates, so don't be offended :) Also note that we use the smaller ~500 MB trained model here for convenience.

You might also be interested in this web page which implements some of these same things in a web UI: https://talktotransformer.com

#### References

[1] Alex Hern, "New AI fake text generator may be too dangerous to release, say creators". The Guardian, February 14, 2019.

[2] OpenAI, "Better language models and their implications". Accessible at the [OpenAI Blog](https://openai.com/blog/better-language-models/), February 14, 2019.

[3] Tae Hwan Jung (Jeff Jung), "Simple text-generator with OpenAI GPT-2 Pytorch implementation" [GitHub repository](https://github.com/graykode/gpt-2-Pytorch), which uses code and models from [this repo](https://github.com/huggingface/transformers).

## Prepare the environment

Clone the repo from https://github.com/graykode/gpt-2-Pytorch and install the required dependencies.

In [5]:
!git clone https://github.com/graykode/gpt-2-Pytorch
%cd gpt-2-Pytorch
!pip install -r requirements.txt

fatal: destination path 'gpt-2-Pytorch' already exists and is not an empty directory.
/notebooks/introai2021/Computer-Assignment-B/gpt-2-Pytorch
Collecting regex==2017.4.5
  Downloading regex-2017.04.05.tar.gz (601 kB)
Building wheels for collected packages: regex
  Building wheel for regex (setup.py) ... done
  Created wheel for regex: filename=regex-2017.4.5-cp38-cp38-linux_x86_64.whl size=545125 sha256=042d633787be08256e51b0b15fbac78ec3cc1a98717c866bd159419bb06b7ea9
  Stored in directory: /home/nguyenb5/.cache/pip/wheels/45/6d/d9/1c9b861321c9240122cb967b734a80545c9f465be4fcb16f19
Successfully built regex
Installing collected packages: regex
Successfully installed regex-2017.4.5


In [6]:
import os
import sys
import torch
import random
import argparse
import numpy as np
from GPT2.model import (GPT2LMHeadModel)
from GPT2.utils import load_weight
from GPT2.config import GPT2Config
from GPT2.sample import sample_sequence
from GPT2.encoder import get_encoder

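def text_generator(state_dict, text, nsamples=1, unconditional=False, batch_size=1,
                   length=-1, temperature=0.7, top_k=40):
    """Print `nsamples` GPT-2 samples: continuations of `text`, or free-running text if `unconditional`."""

    # Samples are produced in rounds of `batch_size`, so the counts must divide evenly.
    assert nsamples % batch_size == 0, "nsamples must be a multiple of batch_size"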

    seed = random.randint(0, 2147483647)
    np.random.seed(seed)
    torch.random.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Load Model
    enc = get_encoder()
    config = GPT2Config()
    model = GPT2LMHeadModel(config)
    model = load_weight(model, state_dict)
    model.to(device)
    model.eval()

    if length == -1:
        length = config.n_ctx // 2
    elif length > config.n_ctx:
        raise ValueError("Can't get samples longer than window size: %s" % config.n_ctx)

    print(text)
    context_tokens = enc.encode(text)

    generated = 0
    for _ in range(nsamples // batch_size):
        out = sample_sequence(
            model=model, length=length,
            context=context_tokens  if not unconditional else None,
            start_token=enc.encoder['<|endoftext|>'] if unconditional else None,
            batch_size=batch_size,
            temperature=temperature, top_k=top_k, device=device
        )
        out = out[:, len(context_tokens):].tolist()
        for i in range(batch_size):
            generated += 1
            text = enc.decode(out[i])

            print("=" * 40 + " SAMPLE " + str(generated) + " " + "=" * 40)
            print(text)


## Load pre-trained model

The pre-trained model is available online. If you want to run this notebook on your own laptop, you need to download the model (e.g., with `curl --output gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin`). Here on jupyter.cs.aalto.fi, however, we all share the same model dump to save disk space and bandwidth.
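
If you prefer to stay in Python, the download step could be sketched roughly as follows. The helper name `ensure_model` and its `fetch` parameter are hypothetical and not part of the course code; the URL is the one from the `curl` command above, and the sketch assumes it is still reachable.

```python
import os
import urllib.request

MODEL_URL = "https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin"

def ensure_model(path, url=MODEL_URL, fetch=urllib.request.urlretrieve):
    """Download the weights to `path` unless the file already exists.

    `fetch` is any callable taking (url, path); it defaults to urlretrieve.
    """
    if not os.path.exists(path):
        fetch(url, path)
    return path

# Example (downloads roughly 500 MB on the first call only):
# model_path = ensure_model("gpt2-pytorch_model.bin")
```

The returned path can then be passed to `torch.load` in place of the shared `/coursedata/` path.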

In [7]:
state_dict = torch.load('/coursedata/gpt2-pytorch_model.bin', map_location='cpu' if not torch.cuda.is_available() else None)



## Task 1: Unconditional samples

We start off simple, with perhaps the least useful mode of the transformer. If you set `unconditional=True`, the generation is not conditioned on any given text. The model just spits out some random text it comes up with.

In [8]:
text_generator(state_dict, '', unconditional=True)

  0%|          | 2/512 [00:00<00:30, 16.86it/s]




100%|██████████| 512/512 [00:20<00:00, 25.17it/s]

<|endoftext|>(Reuters) - U.S. President Donald Trump has been criticised on Twitter for calling former New York Governor Michael Bloomberg a "fat pig" on Twitter after the Republican nominee's comments about women.

FILE PHOTO: U.S. President Donald Trump speaks during a meeting with Chinese President Xi Jinping and President Xi Jinping of China at the White House in Washington, U.S., June 26, 2017. REUTERS/Kevin Lamarque/File Photo

"Mr. Bloomberg is a fat pig," Trump said on Twitter on Friday. "He should be ashamed of himself."

Bloomberg is a Republican, who last week took to Twitter to express outrage over his comments about women.

A White House spokesman declined to comment on the tweet but did say, "President Trump has met with Mr. Bloomberg and the Chinese president, Xi, at the White House in recent days to discuss our bilateral business relationship.

"The president has also met with Chinese President Xi Jinping at the White House and will continue his efforts to help improve 




## Task 2: Generate completion of given text

Now we get down to business: the model is more interesting if you give it context for conditional text generation. The model picks up the style and content of the input and tries to continue the 'story' or complete the list. The variable `text` (defined below) holds the example input that the model adapts to.

In [19]:
text = "There once lived a great king. He has become the hero of his people"
text_generator(state_dict, text)

  0%|          | 1/512 [00:00<01:10,  7.22it/s]

There once lived a great king. He has become the hero of his people


100%|██████████| 512/512 [00:21<00:00, 23.97it/s]

."

In the course of his reign, he has lived a life of service to the people, of being the king of the army, and of doing the right things. He is the patron saint of kings. His presence on the throne is a symbol of the power of God. King Solomon, the son of Solomon, was also a great king.

In this context, the king of the world is the patron saint of kings. That is to say, she is the patron saint of kings.

This is the position the kingship of the world has taken over the world, which is to say, the place of kingship of the world.

It is the position the world has taken over the world, which is to say, the place of kingship of the world.

Since the beginning of time, the world has been dominated by kings.

Because of this, the world has been dominated by kings.

The world has been dominated by kings.

The kingship of the world has been given to King Solomon. He was the patron saint of kings.

He has been the patron saint of kings.

He has been the patron saint of kings.

He has been th




## Task 3: Get more completion samples

Modify the `nsamples` parameter to set the number of generated samples. The variable `text` holds the example input you used in Task 2. Feel free to change it when you explore.
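
The way `text_generator` splits `nsamples` into sampling rounds can be sketched as below. The helper `plan_batches` is hypothetical and only mirrors the `assert nsamples % batch_size == 0` check and the `range(nsamples // batch_size)` loop from the function defined earlier.

```python
def plan_batches(nsamples, batch_size=1):
    """Return the number of sampling rounds needed to produce `nsamples`."""
    if nsamples % batch_size != 0:
        raise ValueError("nsamples must be a multiple of batch_size")
    return nsamples // batch_size

# With the default batch_size=1, three samples take three separate rounds.
rounds = plan_batches(nsamples=3)
print(rounds)
```

Each round runs the model once over the full length, which is why a run with `nsamples=3` shows three progress bars.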

In [10]:
text_generator(state_dict, text, nsamples=3)

  0%|          | 1/512 [00:00<01:30,  5.65it/s]

Through action, a man becomes a hero. By death, a hero will become a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action


100%|██████████| 512/512 [00:21<00:00, 24.13it/s]
  0%|          | 1/512 [00:00<01:19,  6.45it/s]

. By being a hero, a man becomes a legend. Through death, a hero will become a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through time, a legend became a myth. And by listening to the myth, a man takes action. By being a hero, a man becomes a legend. Through ti

100%|██████████| 512/512 [00:20<00:00, 24.40it/s]
  0%|          | 1/512 [00:00<01:09,  7.34it/s]

. By death, a hero is a legend. Through life, a hero is a myth. Through death, a hero is a myth. Through death, a hero is a myth."

In this short speech, the speaker explains why he believes it's important to have a hero in the first place. He says: "A hero has a unique place in our lives. When you meet a hero, the moment you meet him is something that will define you and your own life and the lives of others. It's a moment for you to show your support and show your love for the person you love most. It's also a moment to be proud of yourself and your abilities and all the things that you've done, and that's what makes the world a better place.

"This is all for one. A hero has not just been born. He has been given to us, to our families, to our friends, to all people. And I want to show you that. The hero, with his many virtues, has been given to us, for us to be honored as individuals. To be the face of our country and to be honored as a nation. This is all for one."

In a statement 

100%|██████████| 512/512 [00:21<00:00, 24.02it/s]

. By death, a hero will have something to say about the world. By life, a legend will become a legend. Through your own actions, your own words, your own thoughts, your own thoughts will rise up. By death, a hero will have something to say about the world. By your actions, a friend will rise up. By your own words, a friend will speak. By dying, a hero will die. By the end of the story, a hero will be forgotten. By the end of the story, a hero will be reborn. By the end of the story, a hero will be reborn. The End of the Story

But in the end, I will still be a hero. I will still be an old man. I will still be at peace. And I will still be the hero I really am.

Now, I want to share the story of the man who finally overcame his past and is now an old man. I want to tell the story of the man who was overcome by his past, and his future, and his future and his future and his future, and he is no longer a hero. That is the story of a man who overcame his past, and his future, and his futur




## Task 4: Control the length of generated completion

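You can also control how long the generated text samples are. The length is counted in tokens (the sub-word units produced by the encoder), not words. The default length is 512 tokens, and the upper limit is 1024 tokens, the size of the model's context window.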
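
The rule that `text_generator` applies to the `length` argument can be sketched as follows. The helper `resolve_length` is hypothetical; it restates the `length == -1` and `length > config.n_ctx` branches from the function defined earlier, and it assumes a context window `n_ctx` of 1024, which is what the 512-step progress bars of the default runs imply.

```python
N_CTX = 1024  # assumed context window size (config.n_ctx in the notebook)

def resolve_length(length=-1, n_ctx=N_CTX):
    """Return the number of steps to generate for a requested length."""
    if length == -1:
        # Default: half of the context window.
        return n_ctx // 2
    if length > n_ctx:
        raise ValueError("Can't get samples longer than window size: %s" % n_ctx)
    return length

print(resolve_length())      # default request
print(resolve_length(300))   # explicit request, as in the cell below
```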

In [29]:
text_generator(state_dict, text, length=300)

  0%|          | 1/300 [00:00<00:44,  6.74it/s]

Once when I was six years old I saw a magnificent picture in a book, called True Stories from Nature, about the primeval forest.


100%|██████████| 300/300 [00:12<00:00, 23.35it/s]

 It was painted by the beautiful, beautiful, beautiful girl who lived there, and it was a beautiful picture, and it was beautiful. And then I was so close to it that I couldn't see it. And I thought, 'Oh, what a beautiful picture. I thought, 'Oh, oh, this is beautiful, this is beautiful.' And I said to myself, 'Well, that's my own picture. This is what I want, this is what I want.' And I never would have wanted to be photographed like that. For me, it was a very interesting picture, because it was a picture of a forest, and it really was.

JULIA: That's a good question.

MARSHALL: Yeah, and it's a very interesting picture. They put it on a book, and I said, "Do you want to be photographed in this? Do you want to be photographed in this? Do you want to be photographed in this? Do you want to be photographed in this?" And they really wanted to take my picture. And in this I went through all the different phases of it, and they really wanted to capture my body in it, and I just did not wa




## Task 5: Modify model parameters

The model has additional parameters that you can control. The `temperature` parameter defaults to `temperature=0.7`. Play around with the model by modifying this parameter. What happens when you change the value to, e.g., 0.5 or 0.9? The variable `text` holds the example input you used in Task 4. Feel free to change it when you explore.
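
As background, here is a minimal sketch of how temperature and top-k sampling generally work. It is a generic illustration of the technique in plain Python, not the code in `GPT2.sample`; the function names and the toy scores are made up for this example. The model's raw scores (logits) are cut down to the `top_k` highest, divided by the temperature, and passed through a softmax to give sampling probabilities.

```python
import math
import random

def sampling_probs(logits, temperature=0.7, top_k=40):
    """Convert raw scores into sampling probabilities (temperature must be > 0)."""
    k = min(top_k, len(logits))
    # Scores below the k-th largest are excluded (ties at the cutoff are kept).
    cutoff = sorted(logits, reverse=True)[k - 1]
    scaled = [x / temperature if x >= cutoff else float("-inf") for x in logits]
    # Softmax, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def sample_token(logits, temperature=0.7, top_k=40, rng=random):
    """Draw one token index according to the probabilities above."""
    probs = sampling_probs(logits, temperature, top_k)
    return rng.choices(range(len(probs)), weights=probs)[0]

toy_logits = [2.0, 1.0, 0.0, -1.0]  # invented scores for four tokens
cold = sampling_probs(toy_logits, temperature=0.5, top_k=3)
warm = sampling_probs(toy_logits, temperature=0.9, top_k=3)
print(cold)
print(warm)
```

In this toy case the lower temperature puts more probability on the highest-scoring token, and the token outside the top 3 can never be drawn.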

In [18]:
text_generator(state_dict, text, length=900, temperature=0.9)

  0%|          | 2/900 [00:00<00:56, 15.99it/s]

There once lived a great king. He has become the hero of his people


100%|██████████| 900/900 [00:42<00:00, 21.03it/s]

."

—Neko

"It seems that all those who say the gods who built the world are wrong are just trying their best not to offend the gods when they tell the truth. I have yet to find evidence that I know what they think. I have seen what they think they know, and I have seen what they fear. I have heard the word of the LORD his God, and yet I never believe in it, as I have come to know about you. The LORD says, 'Thy people are wicked, and thy daughters are children of drunkenness, and thy children are murderers, and thy daughters betray their fathers.' And they spake unto him, saying, Thou art the Son of God, and the Son of man, and the Son of woman. And he said, Look, the Son of God is God-like unto all people. Why do you say this? Why do you ask questions?"

—Neko

"He has no right to inquire whether the Lord my God or my beloved wife is a prophet. His law is my law, and he has no right to ask any question of my wife about my husband's affairs. I have asked him about the sons of Adam, abo




## Task 6: Play around with the model

Now your task is to explore further and try out various things. Make the model
* write a list of things to take with you to Mars (e.g., *'If I ever travel to Mars, I would take with me the following items.'*)
* write a bedtime story for children (e.g., *'It was a dark and stormy night...'*)
* write a news story about the coronavirus pandemic
* write the essay for this course for you

Feel free to be creative and try out other things that cross your mind. Remember that running the model several times will produce different samples.

In [17]:
Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.


SyntaxError: invalid syntax (<ipython-input-17-a9f5311ffe2e>, line 1)

In [18]:
' Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.'

' Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.'

In [21]:
text = ' Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.'

text_generator(state_dict, text)

  0%|          | 0/512 [00:00<?, ?it/s]

 Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.


100%|██████████| 512/512 [00:25<00:00, 20.14it/s]

 He is also a frequent collaborator on various projects such as "The Big O", "Girlfriends", "The New Generation", "The Best of the Best" and "The Girlfriend Story." He was on the board of directors of The Boston Red Sox when we were first introduced to him.

C: How far do you go to get your musical ideas and ideas for your own projects?

F: I do not think all you have to do to get your ideas off the ground is to put together a single thing. There are a lot of people who have done it that have their own ideas. The most common is someone who is in their late 80s or early 90's or early 2000's, and they have been working for a long time and they have a lot of things going on in their life that are going to change their life. I will say this: I love music. I love the feeling of music. I love the feeling of being in the studio with your music. Music can feel like the only thing that exists in your head. I have found that. I have found that the only thing that exists in my head is the idea th




In [31]:
text = ' Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.'

text_generator(state_dict, text)

  0%|          | 0/512 [00:00<?, ?it/s]

 Dustin Bates from Starset Society is a talented musician, an electrician and a former Air Force soldier. He has a PHD in electrical engineering and has worked in projects in the Air Force for the US army. After that, he created a solo music project called MNQN. He went on to form 2 bands: Downplay and Starset respectively.


100%|██████████| 512/512 [00:23<00:00, 22.19it/s]

 He has collaborated with many different musicians, and currently works as a consultant for the studio scene of indie rock and indie rock.

Follow @derekpaulson

Follow @DerekPaulsonMusic<|endoftext|>From Hearthstone Wiki

Chromie's Fervor is an epic neutral minion card, from the Classic set.

How to get [ edit | edit source ]

Chromie's Fervor can be obtained through Classic card packs, through crafting, or as an Arena reward. Golden Chromie's Fervor can also be obtained through the Highest Rank Bonus chest at the end of each Ranked season.

Card Crafting cost Disenchanting Chromie's Fervor 40 5 Golden Chromie's Fervor 40 5

This card is a great minion to play against, as it allows you to use your minions to deal damage to your opponent, as well as being a great way to get some early kills. It also seems like it has great synergy with the Bloodfen Raptor, which is a good card to use against, as it can deal with the Bloodfen Raptor on turn 2, and thus allows you to deal with the Bloodf


