First step: change the runtime type to enable the GPU! In the top menu, go to `Runtime --> Change runtime type --> GPU --> Save`.

In [None]:
%pip install transformers

Collecting transformers
  Downloading https://files.pythonhosted.org/packages/19/22/aff234f4a841f8999e68a7a94bdd4b60b4cebcfeca5d67d61cd08c9179de/transformers-3.3.1-py3-none-any.whl (1.1MB)
Collecting tokenizers==0.8.1.rc2
  Downloading https://files.pythonhosted.org/packages/80/83/8b9fccb9e48eeb575ee19179e2bdde0ee9a1904f97de5f02d19016b8804f/tokenizers-0.8.1rc2-cp36-cp36m-manylinux1_x86_64.whl (3.0MB)
Collecting sentencepiece!=0.1.92
  Downloading https://files.pythonhosted.org/packages/d4/a4/d0a884c4300004a78cca907a6ff9a5e9fe4f090f5d95ab341c53d28cbc58/sentencepiece-0.1.91-cp36-cp36m-manylinux1_x86_64.whl (1.1MB)
Collecting sacremoses
  Downloading https://files.pythonhosted.org/packages/7d/34/09d19aff26edcc8eb2a01bed8e98f13a1537005d31e95233fd48216eed10/sacremoses-0.0.43.tar.gz (883kB)

In [None]:
import torch
from transformers import (AutoModelForCausalLM, GPT2Tokenizer)

First, let's import the pretrained model and the tokenizer.

#### <b>The Weights</b>

When we download the weights, we're downloading a set of numerical constants for the model architecture. The simplest example of weights can be seen in linear regression, where the coefficients c<sub>i</sub> below are the weights learned from data:

X = [x<sub>0</sub>, x<sub>1</sub>, x<sub>2</sub>, ... x<sub>n</sub>]

h(X) = c<sub>0</sub>x<sub>0</sub> + c<sub>1</sub>x<sub>1</sub> + c<sub>2</sub>x<sub>2</sub> + ... + c<sub>n</sub>x<sub>n</sub>.
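As a minimal sketch of the formula above, the snippet below computes h(X) as a weighted sum. The function name `linear_model` and all the numbers are made up for illustration; they have nothing to do with GPT-2's actual weights.

```python
def linear_model(weights, features):
    """Compute h(X) = c_0*x_0 + c_1*x_1 + ... + c_n*x_n."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(c * x for c, x in zip(weights, features))

# Illustrative values only: three weights (the c's) and three inputs (the x's).
weights = [0.5, -1.0, 2.0]
features = [4.0, 3.0, 1.0]
prediction = linear_model(weights, features)
print(prediction)  # 0.5*4.0 + (-1.0)*3.0 + 2.0*1.0 = 1.0
```

Training a model means searching for the values of the weights that make predictions like this one match the data; downloading pretrained weights skips that search.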

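To give you an idea of the complexity of GPT-2, the largest version of the model contains 1.5 billion parameters! The `gpt2-medium` checkpoint we load below is smaller, with roughly 350 million parameters.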

#### <b>The Tokenizer</b>

The job of the tokenizer is to take text (something GPT-2 cannot process) and convert it into something the model _can_ process: numbers.
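To make the text-to-numbers idea concrete, here is a toy word-level tokenizer. It is only a sketch: `toy_vocab`, `toy_encode`, and `toy_decode` are hypothetical names, and GPT-2's real tokenizer uses byte-pair encoding over a vocabulary of 50,257 tokens rather than a handful of whole words.

```python
# Toy vocabulary (hypothetical): each known word maps to an integer id.
toy_vocab = {"hello": 0, "my": 1, "name": 2, "is": 3, "<unk>": 4}
id_to_token = {i: t for t, i in toy_vocab.items()}

def toy_encode(text):
    """Lowercase, split on whitespace, and map each word to its id."""
    return [toy_vocab.get(word, toy_vocab["<unk>"]) for word in text.lower().split()]

def toy_decode(ids):
    """Map ids back to words and join them with spaces."""
    return " ".join(id_to_token[i] for i in ids)

ids = toy_encode("Hello my name is Ada")
print(ids)              # [0, 1, 2, 3, 4]
print(toy_decode(ids))  # hello my name is <unk>
```

Note that the unknown word "Ada" collapses to `<unk>` here. Byte-pair encoding avoids this by splitting rare words into smaller known pieces.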

In [None]:
# Take model name from pretrained model list — https://huggingface.co/transformers/pretrained_models.html
model_name = "gpt2-medium"
gpt2 = AutoModelForCausalLM.from_pretrained(model_name).to('cuda') # download the pretrained gpt2 (and most importantly, its weights)
tokenizer = GPT2Tokenizer.from_pretrained(model_name) # download the "pretrained" tokenizer


Now let's use the tokenizer we just downloaded to convert our input string into numbers (token ids).

In [None]:
inputs = tokenizer("Hello! My name is ")
print(inputs)

{'input_ids': [15496, 0, 2011, 1438, 318, 220], 'attention_mask': [1, 1, 1, 1, 1, 1]}


Now let's convert those input ids into a **tensor**. A tensor is an n-dimensional array (imagine an array, then an array inside that array, then an array inside that array, and so on). Storing inputs this way lets the model process many different input sequences at once.

`unsqueeze(0)` adds an extra leading dimension to our array, turning [15496, 0, 2011, 1438, 318, 220] into [[15496, 0, 2011, 1438, 318, 220]]. GPT-2 supports multiple examples at a time (so it can take multiple arrays, with each array representing an input sequence), but we will only pass in one sequence as input.
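To illustrate what passing several sequences at once would involve, the sketch below pads token-id lists to a common length and builds matching attention masks (1 for real tokens, 0 for padding) in plain Python. The helper `pad_batch` is hypothetical and only shows the shape idea; the second sequence is just the first two ids of our input, and 50256 is assumed as the padding id because it is GPT-2's end-of-text id.

```python
def pad_batch(sequences, pad_id):
    """Right-pad token-id lists to equal length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

batch = [[15496, 0, 2011, 1438, 318, 220], [15496, 0]]
ids, mask = pad_batch(batch, pad_id=50256)
print(ids)
print(mask)
print((len(ids), len(ids[0])))  # (2, 6): two sequences, six positions each
```

With a single sequence, the batch has shape (1, 6), which is exactly what `unsqueeze(0)` produces.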

In [None]:
input_tensor = torch.tensor(inputs['input_ids']).unsqueeze(0).to('cuda')
print(input_tensor)
print(input_tensor.shape)

tensor([[15496,     0,  2011,  1438,   318,   220]], device='cuda:0')
torch.Size([1, 6])


We're finally ready to pass our input to the model!

In [None]:
outputs = gpt2.generate(input_tensor, max_length=100, do_sample=True)

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


In [None]:
outputs

tensor([[15496,     0,  2011,  1438,   318,   220, 29343,     7,   273,  1223,
          2092,     8,   764,   314,   716,  1511,   290,   314,   716,  4609,
           287,   477,  1243,  3519,   284,  3037,    11, 36359,    11,   290,
          1486,    13,   314,   635,   423,   428,  1393,   287, 10688,    11,
          3783,    11,  3783, 10165,   290,  2008,  1830,    11,   286,  1781,
            13,   314,  1842,   284,  2251,  3404,    11,   523,  1254,  1479,
           284,  2800,   502,    13,   198,   198,    47,    13,    50,    13,
           611,   345,  4601,   284,  3758,   502,   257,  3275,    11,  3387,
           307, 13030,   326,   314,   561,  2138,   779,  1919,  2056,   884,
           355, 39462,   393, 50203,   284, 10996,   351,   345,    13, 50256]],
       device='cuda:0')

Now that we have the output numbers from our model, we're ready to convert them back to text!

In [None]:
tokenizer.decode(outputs[0])

'Hello! My name is _____(or something similar). I am 13 and I am interested in all things related to technology, robotics, and design. I also have this interest in math, science, science fiction and video games, of course. I love to create stuff, so feel free to contact me.\n\nP.S. if you wish to send me a message, please be advised that I would rather use social media such as Discord or Telegram to communicate with you.<|endoftext|>'
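The decoded string ends with the `<|endoftext|>` marker, which you will usually not want to show to a reader. A minimal sketch for trimming it is below; `strip_eos` is a hypothetical helper and the example string is made up. Passing `skip_special_tokens=True` to `tokenizer.decode` may be a simpler route, though that is not what this notebook does.

```python
EOS_MARKER = "<|endoftext|>"  # end-of-text marker seen in the decoded output

def strip_eos(text):
    """Return `text` up to (not including) the first end-of-text marker."""
    return text.split(EOS_MARKER, 1)[0].rstrip()

decoded = "Hello! My name is Ada.<|endoftext|>"  # made-up example string
print(strip_eos(decoded))  # Hello! My name is Ada.
```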

Let's write a function to encapsulate the process of generation:

In [None]:
def generate_with_gpt2(inpt_str, max_len, k):
  """Generate a continuation of inpt_str with top-k sampling and return it as text."""
  # convert inputs to numbers with tokenizer
  inputs = tokenizer(inpt_str)
  input_tensor = torch.tensor(inputs['input_ids']).unsqueeze(0).to('cuda') # convert array to tensor, move to gpu

  # run through gpt2
  # GPT-2 has no dedicated pad token, so reuse the end-of-text id (id 0 is the token "!")
  outputs = gpt2.generate(input_tensor, max_length=max_len, top_k=k, do_sample=True,
                          pad_token_id=tokenizer.eos_token_id,
                          no_repeat_ngram_size=2, early_stopping=True)

  # decode output
  decoded_output = tokenizer.decode(outputs[0])
  return decoded_output
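The `top_k` argument restricts sampling to the `k` most likely next tokens. The sketch below shows that idea in plain Python on made-up scores. `top_k_sample` is a hypothetical helper written for illustration and is not how the library implements it internally.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample an index from the k highest-scoring entries of `logits`."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only (subtract the max for numerical stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]

toy_logits = [2.0, 0.5, 1.5, -1.0]  # made-up scores for a 4-token vocabulary
samples = {top_k_sample(toy_logits, k=2) for _ in range(1000)}
print(samples)  # only indices 0 and 2 can appear
```

A small `k` keeps the output close to the model's top choices, while a larger `k` allows more varied (and more error-prone) text.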

Now let's try to perform zero-shot summarization. See section 3.6 of the [original GPT-2 paper](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). We're going to try out the same example OpenAI used in their [GPT-2 blog post](https://openai.com/blog/better-language-models/#task5).

First, we can download the file and add TL;DR to the end.



In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# download file from https://drive.google.com/file/d/17hc9n3CJMcgGtIk03DHhGNytAHpdMy1X/view?usp=sharing
# and upload it to your own google drive
# (check settings to make sure file doesn't convert!)

file = open('/content/drive/My Drive/cave_article.txt', 'r')

In [None]:
full_file = file.read()
full_file += ' TL;DR: '

print(full_file)

Prehistoric man sketched an incredible array of prehistoric beasts on the rough limestone walls of a cave in modern day France 36,000 years ago.

Now, with the help of cutting-edge technology, those works of art in the Chauvet-Pont-d'Arc Cave have been reproduced to create the biggest replica cave in the world.

The manmade cavern named the Caverne du Pont-d'Arc has been built a few miles from the original site in Vallon-Pont-D'arc in Southern France and contains 1,000 painstakingly-reproduced drawings as well as around 450 bones and other features.

The original and unique ‘Grotte Chauvet’ was discovered around 20 years ago and is a Unesco World Heritage Site.

It is the oldest known and the best preserved cave decorated by man, but is not open to the public and is only seen by a handful of experts every year, in order to keep the precious works of art safe.

Now experts have scanned the original drawings using 3D modelling techniques to capture each marking and position them correctl

Let's have the model generate up to 200 tokens after TL;DR. (The paper generates 100 tokens for this task.)

In [None]:
file_len = len(tokenizer(full_file)['input_ids'])
max_len = file_len + 200 # set max generation length to be 200 tokens longer than file

model_out = generate_with_gpt2(full_file, max_len, k=2) # top-k sampling with k=2, the setting used in the paper
out_summary = model_out[len(full_file) - 1:]
out_summary

" ” The Cavern duPont d'Ardèch is one of Europe's largest and most important prehistoric man-made caves, located in southern France. ” It is also one the largest man made caves in Europe, and one with a rich and fascinating history. The cave contains a number of important works, including the famous Grottes Chauvets‖, an early representation of an animal.<|endoftext|>"