<a href="https://colab.research.google.com/github/jnawjux/batman_plots/blob/master/Finetuning_GPT_2_w_Batman_Plot_summaries.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Finetuning GPT-2 w/ Batman Plot summaries

This example uses `gpt-2-simple`, designed by Max Woolf; see [this GitHub repository](https://github.com/minimaxir/gpt-2-simple) for details.



In [1]:
!pip install -q gpt-2-simple

import gpt_2_simple as gpt2
from datetime import datetime
from google.colab import files

W0620 16:48:59.122233 139703747852160 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/memory_saving_gradients.py:13: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.



## Downloading GPT-2

For this project, I trained on the small model (117M), but the module makes the medium model (345M) available as well. The larger model caused performance issues for my purposes.
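
To avoid downloading the model again on every run, it can help to check for the files first. This is a minimal sketch, assuming the files land in a `models/<model_name>` folder (as the `Loading checkpoint models/117M/model.ckpt` log line suggests) and using the file names from the download output. `model_is_downloaded` is a hypothetical helper, not part of `gpt-2-simple`.

```python
import os

# File names taken from the download progress output in this notebook.
MODEL_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def model_is_downloaded(model_name, models_dir="models"):
    """Return True if every expected file exists under models_dir/model_name.

    Hypothetical helper; the `models` folder layout is an assumption.
    """
    model_dir = os.path.join(models_dir, model_name)
    return all(
        os.path.isfile(os.path.join(model_dir, name)) for name in MODEL_FILES
    )


# Possible use in the notebook (requires gpt_2_simple to be imported as gpt2):
# if not model_is_downloaded("117M"):
#     gpt2.download_gpt2(model_name="117M")
```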


In [3]:
gpt2.download_gpt2(model_name="117M")

Fetching checkpoint: 1.05Mit [00:00, 259Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 110Mit/s]                                                    
Fetching hparams.json: 1.05Mit [00:00, 330Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:02, 209Mit/s]                                   
Fetching model.ckpt.index: 1.05Mit [00:00, 696Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 237Mit/s]                                                 
Fetching vocab.bpe: 1.05Mit [00:00, 199Mit/s]                                                       


#### Mounting Google Drive

To save and export the model when training is complete, I needed to connect to my Google Drive. The text document with all the plot summaries is already stored there.

In [4]:
gpt2.mount_gdrive()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [0]:
file_name = "batman_66.txt"
gpt2.copy_file_from_gdrive(file_name)

## Finetune GPT-2

#### Notes on how finetuning works with gpt-2-simple:

<em>

* **`restore_from`**: Set to `fresh` to start training from the base GPT-2 model, or to `latest` to resume training from an existing checkpoint.
* **`sample_every`**: Number of steps between printed sample outputs.
* **`print_every`**: Number of steps between training progress updates.
* **`learning_rate`**: Learning rate for training (default `1e-4`; can be lowered to `1e-5` if you have <1MB of input data).
* **`run_name`**: Subfolder within `checkpoint` in which to save the model. This is useful if you want to work with multiple models (you will also need to specify `run_name` when loading the model).
* **`overwrite`**: Set to `True` if you want to continue finetuning an existing model (with `restore_from='latest'`) without creating duplicate copies. </em>
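
The notes on `restore_from` and `learning_rate` can be turned into two small helpers. This is only a sketch: `pick_restore_mode` and `pick_learning_rate` are hypothetical names, the `checkpoint/<run_name>` layout follows the `run_name` note above, and treating 1MB as 1,000,000 bytes is an assumption.

```python
import os


def pick_restore_mode(run_name, checkpoint_dir="checkpoint"):
    """Return 'latest' if a non-empty run folder exists, otherwise 'fresh'."""
    run_dir = os.path.join(checkpoint_dir, run_name)
    if os.path.isdir(run_dir) and os.listdir(run_dir):
        return "latest"
    return "fresh"


def pick_learning_rate(dataset_path, small_data_bytes=1_000_000):
    """Rule of thumb from the notes: 1e-5 for <1MB of input, else 1e-4."""
    if os.path.getsize(dataset_path) < small_data_bytes:
        return 1e-5
    return 1e-4


# Possible use in the notebook (requires gpt_2_simple and a session):
# gpt2.finetune(sess,
#               dataset=file_name,
#               model_name='117M',
#               steps=1000,
#               restore_from=pick_restore_mode('batman_66'),
#               run_name='batman_66',
#               learning_rate=pick_learning_rate(file_name))
```

The lower rate is optional ("can lower" in the notes), so the second helper is a suggestion rather than a requirement.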

In [7]:
sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset=file_name,
              model_name='117M',
              steps=1000,
              restore_from='fresh',
              run_name='batman_66',
              print_every=10,
              sample_every=200,
              save_every=500
              )

W0620 16:49:31.092854 139703747852160 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py:90: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0620 16:49:31.094818 139703747852160 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py:100: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

W0620 16:49:31.452461 139703747852160 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/gpt_2_simple/gpt_2.py:164: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0620 16:49:31.456313 139703747852160 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/gpt_2_simple/src/model.py:148: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W0620 16:49:37.449892 139703747852160 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/gpt_2_sim

Loading checkpoint models/117M/model.ckpt


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:00<00:00,  4.30it/s]


dataset has 18838 tokens
Training...
[10 | 31.61] loss=3.23 avg=3.23
[20 | 56.58] loss=2.58 avg=2.90
[30 | 81.10] loss=2.22 avg=2.67
[40 | 105.23] loss=1.51 avg=2.38
[50 | 129.59] loss=0.72 avg=2.04
[60 | 154.22] loss=0.56 avg=1.79
[70 | 178.70] loss=0.31 avg=1.57
[80 | 203.06] loss=0.19 avg=1.39
[90 | 227.40] loss=0.35 avg=1.27
[100 | 251.81] loss=0.09 avg=1.15
[110 | 276.28] loss=0.07 avg=1.04
[120 | 300.72] loss=0.07 avg=0.96
[130 | 325.21] loss=0.05 avg=0.88
[140 | 349.68] loss=0.06 avg=0.82
[150 | 374.10] loss=0.05 avg=0.77
[160 | 398.51] loss=0.05 avg=0.72
[170 | 422.86] loss=0.05 avg=0.68
[180 | 447.17] loss=0.06 avg=0.64
[190 | 471.44] loss=0.04 avg=0.60
[200 | 495.73] loss=0.04 avg=0.57
 City, The Dynamic Duo enter the school, with the intention of discovering who put the piano on security camera. Unfortunately, they find out little on the pianist's shop listing suggests he creates the image. Later, at Madame Antoinette's office, Lin-Manuel Rosario, dressed in the famous red t

W0620 17:31:30.746577 139703747852160 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:960: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
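
The progress lines above can be parsed to estimate training speed. This is a rough sketch, assuming each line reads as `[step | elapsed seconds] loss=... avg=...` (the output does not label these fields, so that reading is an assumption); `parse_training_log` is a hypothetical helper.

```python
import re

# Matches lines such as "[10 | 31.61] loss=3.23 avg=3.23".
LOG_LINE = re.compile(r"\[(\d+) \| ([\d.]+)\] loss=([\d.]+) avg=([\d.]+)")


def parse_training_log(text):
    """Return one dict per progress line found in the captured output."""
    rows = []
    for match in LOG_LINE.finditer(text):
        step, elapsed, loss, avg = match.groups()
        rows.append({
            "step": int(step),
            "elapsed": float(elapsed),
            "loss": float(loss),
            "avg": float(avg),
        })
    return rows


# Three lines copied from the output above.
sample = """[10 | 31.61] loss=3.23 avg=3.23
[20 | 56.58] loss=2.58 avg=2.90
[200 | 495.73] loss=0.04 avg=0.57"""

rows = parse_training_log(sample)
seconds_per_step = (rows[-1]["elapsed"] - rows[0]["elapsed"]) / (
    rows[-1]["step"] - rows[0]["step"]
)
print(f"{seconds_per_step:.2f} s/step")  # prints "2.44 s/step"
```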


### Copying checkpoints to Google Drive

In [0]:
gpt2.copy_checkpoint_to_gdrive(run_name='batman_66')

Once this was complete, I downloaded the checkpoint `.tar` archive to unpack it and start testing.
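
Unpacking can also be done from Python. This is a minimal sketch, assuming the downloaded archive is a plain tar file; `unpack_checkpoint` is a hypothetical helper and the archive name in the usage comment is a placeholder, not a name confirmed by this notebook.

```python
import os
import tarfile


def unpack_checkpoint(archive_path, dest_dir="."):
    """Extract a tar archive into dest_dir and return its top-level entries.

    Only use this on archives you created yourself; extractall trusts
    the paths stored in the archive.
    """
    with tarfile.open(archive_path, "r") as tar:
        top_level = sorted(
            {member.name.split("/")[0] for member in tar.getmembers()}
        )
        tar.extractall(dest_dir)
    return top_level


# Possible use (the file name is hypothetical):
# print(unpack_checkpoint("checkpoint_batman_66.tar"))
```

If the archive preserves the `checkpoint/<run_name>` layout, the unpacked folder can then be loaded by passing the same `run_name`.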