# **GPT-2: an attention-based model for text generation**

GPT-2 (Generative Pre-trained Transformer 2) is an attention-based language model by OpenAI, specialized in **text generation** tasks. It was released in 2019.

For further details: 
https://openai.com/blog/better-language-models/
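To give a feel for what "attention-based" means, here is a minimal sketch of causal scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions (a single head, no learned projections, toy vectors), not GPT-2's actual implementation; the function names are made up for this example.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(queries, keys, values):
    # queries, keys, values: one vector (list of floats) per position.
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Causal mask: position i may only attend to positions 0..i.
        scores = [
            sum(qa * ka for qa, ka in zip(q, keys[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible values.
        out = [
            sum(w * values[j][k] for j, w in enumerate(weights))
            for k in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy input: two positions with 2-dimensional vectors.
toy_q = [[1.0, 0.0], [0.0, 1.0]]
toy_v = [[1.0, 2.0], [3.0, 4.0]]
print(causal_self_attention(toy_q, toy_q, toy_v))
```

The first position can only see itself, so its output is its own value vector; later positions mix the values of everything that came before them.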

We will use the model via the Python library gpt-2-simple.

In [1]:
%tensorflow_version 1.x
!pip install -q gpt-2-simple

  Building wheel for gpt-2-simple (setup.py) ... done


## SEE IT IN ACTION

First, let's download the model and see what it can do!

You can also have a look at these:
- https://talktotransformer.com/
- https://play.aidungeon.io/

In [2]:
import gpt_2_simple as gpt2

The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



In [3]:
model_name = '124M'   # checkpoint name: the smallest GPT-2 model, with about 124 million parameters
gpt2.download_gpt2(model_name = model_name)

Fetching checkpoint: 1.05Mit [00:00, 294Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 123Mit/s]                                                    
Fetching hparams.json: 1.05Mit [00:00, 756Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:02, 235Mit/s]                                   
Fetching model.ckpt.index: 1.05Mit [00:00, 330Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 160Mit/s]                                                 
Fetching vocab.bpe: 1.05Mit [00:00, 169Mit/s]                                                       


In [4]:
import tensorflow as tf

tf.reset_default_graph()
session = gpt2.start_tf_sess()

gpt2.load_gpt2(sess = session, run_name = model_name, checkpoint_dir = 'models', model_dir = 'models/'+model_name)

print("\n\n*** GENERATED TEXT:")
gpt2.generate(sess        = session,
              model_name  = model_name,
              temperature = 0.7,
              length      = 50,
              prefix      = "I was walking on the street when") # optional

Loading checkpoint models/124M/model.ckpt
INFO:tensorflow:Restoring parameters from models/124M/model.ckpt


*** GENERATED TEXT:
I was walking on the street when I heard the sound of my brother's voice coming from the back of my head. He was playing the guitar. He was really young, he was carrying a large pack at the time. I said, 'He's playing the guitar?' And he
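The `temperature` argument controls how adventurous the sampling is. As a rough sketch of the general idea (not the library's internal code), the model's scores for the next token are divided by the temperature before being turned into probabilities: values below 1 sharpen the distribution, values above 1 flatten it. The function names and toy scores below are illustrative assumptions.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    # Divide the scores by the temperature, then apply a stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=random):
    # Draw one token index according to the temperature-adjusted probabilities.
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Toy scores for a 3-token vocabulary (made-up numbers).
toy_logits = [2.0, 1.0, 0.1]
for t in (0.7, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

With `temperature = 0.7`, as in the cell above, the most likely token gets a larger share of the probability mass than it would at 1.0, which tends to make the text more coherent but less varied.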


## FINE-TUNING

GPT-2 can also be **fine-tuned** to generate text of a specific type or genre.

We will now teach it to write new Shakespeare plays.

First, let's download a dataset of sample Shakespeare works:

In [5]:
import os
import requests

filename = "shakespeare.txt"
if not os.path.isfile(filename):
    print("Downloading {}".format(filename))
    url  = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
    data = requests.get(url)
    data.raise_for_status()  # stop here rather than saving an error page as the dataset

    with open(filename, 'w') as f:
        f.write(data.text)

# show the first 500 characters of the dataset
with open(filename) as f:
    print(f.read()[:500])

Downloading shakespeare.txt
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor


Now, let's fine-tune!

You can do the same with any custom corpus of your choice, provided that the corpus is big enough.
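If your own corpus is spread over several text files, one simple option is to merge them into a single file first, since the fine-tuning call below is given one text file. The helper below is a minimal sketch; `build_corpus` and its parameters are hypothetical names chosen for this example.

```python
from pathlib import Path

def build_corpus(source_dir, output_file, separator="\n\n"):
    # Merge every .txt file in source_dir into one training file.
    out_path = Path(output_file).resolve()
    # Skip the output file itself in case it lives in the same folder.
    paths = [
        p for p in sorted(Path(source_dir).glob("*.txt"))
        if p.resolve() != out_path
    ]
    texts = [p.read_text(encoding="utf-8").strip() for p in paths]
    # Drop empty files and join the rest with a blank line in between.
    corpus = separator.join(t for t in texts if t)
    out_path.write_text(corpus, encoding="utf-8")
    return len(corpus)  # number of characters written
```

For example, `build_corpus("my_texts", "corpus.txt")` (both names are placeholders) writes `corpus.txt`, which you can then pass as `dataset`.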

In [6]:
tf.reset_default_graph()
session = gpt2.start_tf_sess()

gpt2.finetune(session,
              dataset       = filename,
              model_name    = model_name,
              steps         = 1000,
              restore_from  = 'fresh',  # change to 'latest' to resume training
              run_name      = 'gpt-2-finetuning',
              print_every   = 10,   # print loss every ... steps
              sample_every  = 20,   # generate sample text every ... steps
              sample_length = 300,
              save_every    = 500)

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Loading checkpoint models/124M/model.ckpt
INFO:tensorflow:Restoring parameters from models/124M/model.ckpt


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:01<00:00,  1.81s/it]


dataset has 338025 tokens
Training...
[10 | 28.83] loss=3.57 avg=3.57
[20 | 52.09] loss=3.46 avg=3.52
! I had a great experience.

And for ten years, I have taught to him.

And on the fifth day of the year, after I had lost myself in the sight of the Lord, I received by faith, as some people do, that with an eye to the devil I had the same as the devil.
I received his face: and he was like to me;
He looked upon all my enemies with contempt,
And did not seem to regard them as enemies, and for what they were all as enemies,
And so I fled upon his hand with him;
And I, being a fool, came to some new place
Where I heard a sound in heaven, but not, where it was:
For that my life in heaven is a lie,
For all my business is a lie:
Thou, the father of heaven, that should go,
That all be good to him; and that it should not,
Was therefore, that thou shouldest have some advantage;
Thus it happened.

And all this happened.

And the angel of glad tidings came unto him,
And told him, God, thou doest,

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py", line 337, in finetune
    opt_compute, feed_dict={context: sample_batch()})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    

TypeError: ignored
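The progress lines printed during training, such as `[10 | 28.83] loss=3.57 avg=3.57`, are easy to parse if you want to track or plot the loss. The sketch below assumes that the first number is the step counter and the second the elapsed time in seconds, and that the format stays as shown in the output above; `parse_training_log` is a hypothetical helper, not part of gpt-2-simple.

```python
import re

# Matches lines such as "[10 | 28.83] loss=3.57 avg=3.57".
LOG_LINE = re.compile(r"^\[(\d+) \| ([\d.]+)\] loss=([\d.]+) avg=([\d.]+)$")

def parse_training_log(lines):
    # Keep only the progress lines and convert their fields to numbers.
    records = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            step, elapsed, loss, avg = match.groups()
            records.append({
                "step": int(step),
                "elapsed_s": float(elapsed),  # assumed to be seconds
                "loss": float(loss),
                "avg": float(avg),
            })
    return records

log = [
    "Training...",
    "[10 | 28.83] loss=3.57 avg=3.57",
    "[20 | 52.09] loss=3.46 avg=3.52",
]
for record in parse_training_log(log):
    print(record["step"], record["loss"], record["avg"])
```

Sampled text and other messages in the log do not match the pattern and are skipped.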