# **GPT-2: an attention-based model for text generation**

GPT-2 (Generative Pre-trained Transformer 2) is an attention-based language model by OpenAI, specialized in **text generation**. It was released in 2019.

For further details: 
https://openai.com/blog/better-language-models/
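
To make "attention-based" a little more concrete, here is a minimal NumPy sketch of causal scaled dot-product self-attention, the core operation in GPT-2-style models. It is illustrative only: the function name `causal_self_attention`, the toy shapes, and the random inputs are assumptions, and the real model adds learned projections, multiple heads, and many stacked layers.

```python
import numpy as np

def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: arrays of shape (seq_len, d). Returns (output, weights).
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)                    # (seq_len, seq_len)
    # Causal mask: position i may only attend to positions j <= i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Row-wise softmax (subtract the row max for numerical stability).
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))      # 4 toy "tokens" with 8 features each
out, w = causal_self_attention(x, x, x)
print(out.shape)
print(np.round(w, 2))            # lower-triangular: no attention to future tokens
```

Because of the mask, each token's output depends only on itself and earlier tokens, which is what lets the model generate text one token at a time.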

We will use the model via the Python library gpt-2-simple.

In [1]:
%tensorflow_version 1.x
!pip install -q gpt-2-simple

TensorFlow 1.x selected.
  Building wheel for gpt-2-simple (setup.py) ... done


## SEE IT IN ACTION

First, let's download the model and see what it can do!

You can also try these interactive demos:
- https://talktotransformer.com/
- https://play.aidungeon.io/

In [2]:
import gpt_2_simple as gpt2

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



In [3]:
model_name = '124M'   # model size: 124M parameters (the smallest GPT-2 release)
gpt2.download_gpt2(model_name = model_name)

Fetching checkpoint: 1.05Mit [00:00, 436Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 120Mit/s]                                                    
Fetching hparams.json: 1.05Mit [00:00, 370Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:03, 145Mit/s]                                   
Fetching model.ckpt.index: 1.05Mit [00:00, 269Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 174Mit/s]                                                 
Fetching vocab.bpe: 1.05Mit [00:00, 195Mit/s]                                                       


In [5]:
import tensorflow as tf

tf.reset_default_graph()
session = gpt2.start_tf_sess()

gpt2.load_gpt2(sess = session, run_name = model_name, checkpoint_dir = 'models', model_dir = 'models/'+model_name)

print("\n\n*** GENERATED TEXT:")
gpt2.generate(sess        = session,
              model_name  = model_name,
              temperature = 0.7,
              length      = 50,
              prefix      = "I was walking on the street when") # optional

Loading checkpoint models/124M/model.ckpt
INFO:tensorflow:Restoring parameters from models/124M/model.ckpt


*** GENERATED TEXT:
I was walking on the street when I saw this bunch of kids, 25 or 30, and they were talking to my wife and her family. I realized it was so weird. I thought, this is strange."

The day after the shooting, the family, who were on
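
The `temperature` argument in the call above controls how random the sampling is. As a rough illustration (not gpt-2-simple's actual code), the sketch below divides a made-up logit vector by the temperature before applying softmax. The helper name `softmax_with_temperature` and the logit values are hypothetical.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, rescaled by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()               # numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [2.0, 1.0, 0.1]                 # made-up scores for 3 candidate tokens
for t in (0.2, 0.7, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))

# Draw one token index at the temperature used in the cell above.
rng = np.random.default_rng(0)
p = softmax_with_temperature(logits, 0.7)
print(rng.choice(len(logits), p=p))
```

Lower temperatures concentrate probability on the highest-scoring token (more predictable text), while higher temperatures flatten the distribution (more varied, less coherent text).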


## FINE-TUNING

GPT-2 can also be **fine-tuned** to generate text of a specific type or genre.

We will now teach it to write new Shakespeare plays.

First, let's download a dataset of sample Shakespeare works:

In [8]:
import os
import requests

filename = "shakespeare.txt"
if not os.path.isfile(filename):
    print("Downloading {}\n".format(filename))
    url  = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
    data = requests.get(url)
    data.raise_for_status()  # stop here if the download failed
    
    with open(filename, 'w') as f:
        f.write(data.text)

# show the first 500 characters of the corpus
with open(filename) as f:
    print(f.read()[:500])

Downloading shakespeare.txt

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor


Now, let's fine-tune!

You can do the same with any custom corpus of your choice, provided that the corpus is big enough.
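
As a minimal sketch of preparing such a corpus, the snippet below merges several plain-text files into one training file, separating documents with GPT-2's `<|endoftext|>` token. The helper name `build_corpus`, the file names, and the toy texts are assumptions made for illustration.

```python
import os
import tempfile

def build_corpus(paths, out_path, delimiter="<|endoftext|>"):
    """Concatenate text files into one file, with a delimiter between documents."""
    docs = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            text = f.read().strip()
        if text:                          # skip empty files
            docs.append(text)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(("\n" + delimiter + "\n").join(docs))
    return len(docs)

# Toy demonstration with two throwaway files.
workdir = tempfile.mkdtemp()
sources = []
for name, text in [("a.txt", "First document."), ("b.txt", "Second document.")]:
    path = os.path.join(workdir, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    sources.append(path)

corpus_path = os.path.join(workdir, "corpus.txt")
n_docs = build_corpus(sources, corpus_path)
with open(corpus_path, encoding="utf-8") as f:
    print(n_docs, repr(f.read()))
```

The resulting file could then be passed as `dataset` in the fine-tuning call in place of the Shakespeare file.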

In [14]:
tf.reset_default_graph()
session = gpt2.start_tf_sess()

gpt2.finetune(session,
              dataset       = filename,
              model_name    = model_name,
              steps         = 100,  # just for testing; choose a bigger number (e.g. 1000) for real training
              restore_from  = 'fresh',  # change to 'latest' to resume training
              run_name      = 'gpt-2-finetuning',
              print_every   = 10,   # print loss every ... steps
              sample_every  = 20,   # generate sample text every ... steps
              sample_length = 300,
              save_every    = 500)

Loading checkpoint models/124M/model.ckpt
INFO:tensorflow:Restoring parameters from models/124M/model.ckpt


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:01<00:00,  1.84s/it]


dataset has 338025 tokens
Training...
[10 | 18.05] loss=3.71 avg=3.71
[20 | 30.54] loss=3.62 avg=3.66
TER_DIMENSION_ALERT (0);
\tvar d = 0, s = 0, n = 0,
};

\t}

});

\t}

};

if (this.setTargetSets) {
$tw.utils.getObjects(
\"TargetSets\" );
\t} else {
this.setTargetSets = true;
}
};//

/**
** Clean up all these strings from the DOM:
*/
function cleanUpString(text):
return
this.replace(text,
\"\");
};
const tiddlerClassName = \"test\", $tw.config.vendor.test.Classes;
var text = this.contains("test.txt", null, 'test.txt':0, 'test.txt:test.txt');
var text = this.contains("test.txt", \"test.txt'', 'test.txt');
var text = this.contains("test.txt", \"test.','.'');
var text = this.contains("test.txt", \"test.',
'test of test.txt.'';
var text = this.removeClass("test.txt');

var text = this.removeClass("test.txt");
var text = this.removeClass("test.txt");
var text

[30 | 46.82] loss=3.43 avg=3.58
[40 | 59.34] loss=3.51 avg=3.56
 a
'Fascit das het, o'er, and o'er;
It is, or rather, or rather,