# Using GPT-2 to Create the 'Secret' Friends Episode

[GPT-2 Simple](https://github.com/minimaxir/gpt-2-simple) is the only dependency we'll need to get going. It is a wrapper around GPT-2 that lets us fine-tune the model on text of our choosing. The version used here only works with TensorFlow <= 1.14, so we pin 1.14 below.

In [1]:
!pip install gpt-2-simple
!pip install tensorflow==1.14



In [1]:
import gpt_2_simple as gpt2
from datetime import datetime



We need to download the proper GPT-2 model first.

There are four released sizes of GPT-2:

- 124M (default): the "small" model, 500MB on disk.
- 355M: the "medium" model, 1.5GB on disk.
- 774M: the "large" model, can't be fine-tuned in Colab.
- 1558M: the "extra large" model (the full GPT-2), also can't be fine-tuned in Colab.

The 124M model is the easiest to work with and the best choice for fine-tuning in Colab.

In [3]:
gpt2.download_gpt2(model_name="124M")

Fetching checkpoint: 1.05Mit [00:00, 504Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 101Mit/s]                                                    
Fetching hparams.json: 1.05Mit [00:00, 398Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:01, 250Mit/s]                                   
Fetching model.ckpt.index: 1.05Mit [00:00, 309Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 132Mit/s]                                                 
Fetching vocab.bpe: 1.05Mit [00:00, 183Mit/s]                                                       
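
Because the 124M checkpoint is roughly 500MB, it can be worth skipping the download when the files are already on disk. The following is a minimal sketch, assuming the library stores models under `models/<model_name>`; the `model_is_downloaded` helper is hypothetical and not part of gpt-2-simple.

```python
import os


def model_is_downloaded(model_name, model_dir="models"):
    """Return True if a folder for this model already exists on disk.

    Assumes the layout models/<model_name>/. This helper is hypothetical
    and is not part of gpt-2-simple.
    """
    return os.path.isdir(os.path.join(model_dir, model_name))


# Usage inside the notebook (requires gpt_2_simple imported as gpt2):
# if not model_is_downloaded("124M"):
#     gpt2.download_gpt2(model_name="124M")
```

This only checks that the folder exists, not that every file in it finished downloading, so an interrupted download would still need to be removed by hand.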


Mount Google Drive and load the file.

In [4]:
gpt2.mount_gdrive()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# I used seasons 1 & 3, with other random episodes throughout
text_location = "friends_script.txt"
gpt2.copy_file_from_gdrive(text_location)

## Finetune GPT-2

Now for the longest and most important step, we have to finetune the GPT-2 model with our data (to run indefinitely, set steps = -1).

The model checkpoints will be saved in checkpoint/chkpt1 (the run_name set below; the library default is run1). We have it set up to save a checkpoint every 50 steps (save_every=50). If you rerun this cell, you might need to restart the kernel.

restore_from: Set to 'fresh' to start training from the base GPT-2, or set to 'latest' to restart training from an existing checkpoint.

sample_every: Number of steps between printing example output.

print_every: Number of steps between printing training progress.

learning_rate: Learning rate for the training (lower to 1e-5 if you have <1MB input data)

run_name: subfolder within checkpoint to save the model (will also need to specify run_name when loading the model)

overwrite: Set to True if you want to continue finetuning an existing model (make sure to also set restore_from='latest') without creating duplicate copies of the checkpoint.
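
The learning-rate rule of thumb above (drop to 1e-5 for under 1MB of input) can be sketched as a small helper. This is an illustration only: `suggest_learning_rate` is a hypothetical name, and the `default_lr` value of 1e-4 is an assumption standing in for whatever rate you would otherwise use.

```python
import os

ONE_MB = 1024 * 1024  # threshold from the rule of thumb above


def suggest_learning_rate(dataset_path, small_lr=1e-5, default_lr=1e-4):
    """Pick a learning rate from the dataset's size on disk.

    Hypothetical helper: returns small_lr for files under 1MB and
    default_lr (an assumed value) otherwise.
    """
    size_bytes = os.path.getsize(dataset_path)
    return small_lr if size_bytes < ONE_MB else default_lr


# Usage inside the notebook:
# gpt2.finetune(sess,
#               dataset=text_location,
#               learning_rate=suggest_learning_rate(text_location),
#               ...)
```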

In [3]:
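sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset=text_location,
              model_name='124M',
              steps=500,
              restore_from='latest',  # resume from the newest checkpoint in checkpoint/chkpt1
              run_name='chkpt1',
              print_every=10,         # log the loss every 10 steps
              sample_every=250,       # print sample text every 250 steps
              save_every=50,          # write a checkpoint every 50 steps
              overwrite=True          # replace old checkpoints instead of keeping copies
              )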

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Loading checkpoint checkpoint/chkpt1/model-45
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from checkpoint/chkpt1/model-45


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:01<00:00,  1.53s/it]


dataset has 324674 tokens
Training...
Saving checkpoint/chkpt1/model-45
[50 | 387.17] loss=2.23 avg=2.23
Saving checkpoint/chkpt1/model-50
Instructions for updating:
Use standard file APIs to delete files with this prefix.
[60 | 1150.46] loss=2.35 avg=2.29
[70 | 1917.28] loss=2.34 avg=2.31
[80 | 2669.83] loss=2.47 avg=2.35
interrupted
Saving checkpoint/chkpt1/model-82


## Loss

Something worth noting is the difference between the loss and the average loss. The loss at any single step is noisy and not very informative on its own; the average is what matters, and it should keep trending down. If it starts going back up, the model has effectively 'converged' and won't get any better with the data it's working with. In this run, the average was lowest at step 50 (2.23) and crept up to 2.35 by step 80. This makes sense because we only gave it about 1 MB of training text. I stopped the training at step 82.
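
To check the trend without eyeballing the log, the progress lines can be parsed directly. The sketch below assumes the `[step | elapsed] loss=... avg=...` format shown in the output above, with the second field taken to be elapsed seconds; the function names are illustrative and not part of gpt-2-simple.

```python
import re

# Matches lines such as: [60 | 1150.46] loss=2.35 avg=2.29
LOG_LINE = re.compile(r"\[(\d+) \| ([\d.]+)\] loss=([\d.]+) avg=([\d.]+)")


def parse_training_log(text):
    """Extract (step, elapsed, loss, avg_loss) tuples from log text."""
    rows = []
    for match in LOG_LINE.finditer(text):
        step, elapsed, loss, avg = match.groups()
        rows.append((int(step), float(elapsed), float(loss), float(avg)))
    return rows


def best_step(rows):
    """Return the logged step with the lowest average loss."""
    return min(rows, key=lambda row: row[3])[0]


def avg_is_rising(rows):
    """True if the average loss increased at every logged step after the first."""
    avgs = [row[3] for row in rows]
    return len(avgs) > 1 and all(b > a for a, b in zip(avgs, avgs[1:]))


# The four progress lines from the training run above.
log = """
[50 | 387.17] loss=2.23 avg=2.23
[60 | 1150.46] loss=2.35 avg=2.29
[70 | 1917.28] loss=2.34 avg=2.31
[80 | 2669.83] loss=2.47 avg=2.35
"""

rows = parse_training_log(log)
print(best_step(rows))      # 50
print(avg_is_rising(rows))  # True
```

With only four logged points this is a weak signal, but it matches the reading above: the average was lowest at step 50 and rose at every logged step afterwards.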

Save the checkpoint

In [None]:
gpt2.copy_checkpoint_to_gdrive(run_name="chkpt1")

Generate from saved checkpoint!

In [5]:
gpt2.generate(sess, run_name="chkpt1", 
              length=400,
              temperature=0.7,
              prefix="[Scene: Monica and Joey are sitting on coffee shop couch reading My Life by Bill Clinton]",
              nsamples=3,
              batch_size=3)

[Scene: Monica and Joey are sitting on coffee shop couch reading My Life by Bill Clinton]
Ross: No! No!
Chandler: No! No!
Ross: No, no, no, no!
Chandler: No, no, no, no!
Ross: No, no, no, no, no!
Chandler: No, no, no, no, no!
Ross: No, no, no, no, no!
Chandler: No, no, no, no, no!
Ross: No, no, no, no, no!
Chandler: No, no, no, no, no!
Ross: No, no, no, no, no!
Chandler: No, no, no, no, no, no!
Ross: No, no, no, no, no!
Chandler: No, no, no, no, no, no!
Ross: No, no, no, no, no, no!
Chandler: No, no, no, no, no, no, no!
Ross: No, no, no, no, no, no!
Chandler: No, no, no, no, no, no, no!
Ross: No, no, no, no, no, no!
Chandler: No, no, no, no, no, no, no!
Ross: No, no, no, no, no, no!
Chandler: No, no, no, no, no, no, no!
Ross: No, no, no, no, no, no, no!
Chandler: No, no, no, no, no, no, no!
Ross: No, no, no, no, no, no, no, no!
[Scene: Monica and Joey are sitting on coffee shop couch reading My Life by Bill Clinton]
 
Monica: Hey!
Joey: Hey, you wanna date me?
Monica: Yeah!
Joey: Can I