In [7]:
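# Detect whether the notebook is running on Google Colab.
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

# Train for only a few steps locally; use more on Colab.
STEPS = 3
if IN_COLAB:
    !pip install gpt-2-simple
    STEPS = 100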

import gpt_2_simple as gpt2
import os
import requests
import keras

# Borrowing GPT2

Language models such as the popular GPT models (GPT-2, GPT-3, GPT-4, and their chat variants) are trained on enormous amounts of data and are huge in size. It isn't realistic for us to train a model anywhere near that size and sophistication, but we can borrow a pretrained model and repurpose it for our own use.

## Download Model

The model itself is pretty large: the download is roughly 500MB, and that is the smallest model available. The larger ones are impractical to work with unless we have enterprise-scale hardware.
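As a rough sanity check on that figure, here is a back-of-the-envelope estimate. It assumes each parameter is stored as a 4-byte float32, and the `approx_size_mb` helper is ours, not part of gpt-2-simple. The parameter counts are read off the GPT-2 model names ("124M" is the one used here; the others are the larger releases), and the real download also includes a few small vocabulary and config files.

```python
# Rough size estimate: number of parameters x bytes per parameter.
# Assumption: weights are stored as 32-bit floats (4 bytes each).
BYTES_PER_PARAM = 4

# Parameter counts in millions, read off the GPT-2 model names.
MODEL_PARAMS_MILLIONS = {"124M": 124, "355M": 355, "774M": 774, "1558M": 1558}


def approx_size_mb(params_millions, bytes_per_param=BYTES_PER_PARAM):
    """Approximate weight size in megabytes (1 MB = 1,000,000 bytes)."""
    # Millions of parameters x bytes each = millions of bytes = megabytes.
    return params_millions * bytes_per_param


for name, millions in MODEL_PARAMS_MILLIONS.items():
    print(f"{name}: ~{approx_size_mb(millions):,} MB")
```

Under these assumptions the smallest model comes to about 496 MB, which lines up with the roughly 500MB download, and the largest is more than twelve times that.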

<b>Big Note:</b> these GPT models that we are downloading have one specific and annoying trait: they can't be repurposed once created. The "sess" object we define below is tied to the model and data we first used it with. This is unlike other examples, where we can define a series of models that all reuse a name such as "model = Sequential...". If you try to reuse it, you will get an error that mentions something like "graph error". To fix things, restart the runtime and run again.
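If restarting the runtime is too disruptive, one thing to try is resetting the TensorFlow graph between models. The sketch below assumes your installed gpt-2-simple provides `reset_session` alongside `start_tf_sess`. The `fresh_session` wrapper is a hypothetical helper of ours; it takes the imported module as an argument so that it relies on nothing beyond those two calls. If it still raises a graph error, fall back to restarting the runtime.

```python
def fresh_session(gpt2_module, old_sess=None):
    """Return a TensorFlow session that is safe to use with a new model.

    gpt2_module: the imported gpt_2_simple module, passed in explicitly.
    old_sess:    a session already tied to another model, or None.
    """
    if old_sess is None:
        # Nothing to clean up, so just start a session.
        return gpt2_module.start_tf_sess()
    # Assumed behaviour: reset_session clears the default graph, closes the
    # old session, and hands back a new one.
    return gpt2_module.reset_session(old_sess)


# Usage in the notebook (assumes `import gpt_2_simple as gpt2` has run):
# sess = fresh_session(gpt2, sess)
```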

In [5]:
# Download the base model if it isn't already on disk
model_name = "124M"
if not os.path.isdir(os.path.join("models", model_name)):
    print(f"Downloading {model_name} model...")
    gpt2.download_gpt2(model_name=model_name)   # saved in the current directory under models/124M/

## Finetune Model

We can take the model and tailor it to our use by giving it some additional text to fine-tune on.

In [4]:
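# Download the Shakespeare training text if we don't already have it
file_name = "shakespeare.txt"
if not os.path.isfile(file_name):
    url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
    data = requests.get(url)
    data.raise_for_status()   # stop here if the download failed

    with open(file_name, 'w', encoding='utf-8') as f:
        f.write(data.text)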


sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              file_name,
              model_name=model_name,
              steps=STEPS)   # steps is max number of training steps

gpt2.generate(sess)

2023-04-13 11:55:32.045625: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-13 11:55:52.970576: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled


Loading checkpoint checkpoint/run1/model-16
INFO:tensorflow:Restoring parameters from checkpoint/run1/model-16
Loading dataset...


100%|██████████| 1/1 [00:05<00:00,  5.88s/it]


dataset has 338025 tokens
Training...
[17 | 67.85] loss=4.56 avg=4.56
[18 | 139.30] loss=3.90 avg=4.23
interrupted
Saving checkpoint/run1/model-18


### Generate Text

Now that the model is downloaded and fine-tuned on our data, we can generate some new text. There are a few parameters here that we should look at a little more closely:
<ul>
<li> Temperature: This is a value that controls how random the output is. The higher the value, the more random and surprising the output. The lower the value, the more the model sticks to its most likely next tokens, so the output is more predictable and repetitive. </li>
<li> Length: This is the number of tokens that will be generated. </li>
<li> Prefix: This is the text that will be used to seed the model, a.k.a. the "starting point" of the brand new text we'll be creating. </li>
</ul>
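To make the temperature idea concrete, here is a minimal sketch of temperature-scaled sampling in plain Python. The candidate tokens and their scores are made up for illustration, and the function names are ours; the library's internals may differ in detail. The point is only that dividing the scores by a small temperature sharpens the distribution, while a large temperature flattens it.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng=random):
    """Pick one token at random, weighted by the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Made-up scores for three candidate next tokens.
tokens = ["thou", "you", "pizza"]
logits = [2.0, 1.0, 0.1]

for temperature in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability lands on the top-scoring token, while at the highest the three candidates are much closer together.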

#### Text Generation Process

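Inside the model, the transformer generates new text one token at a time. It starts from the prefix, predicts a likely next token, appends that token to the text, and then repeats the process with the longer text as its new input until it has produced the requested number of tokens.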

![Text Generation](images/transformer_text_gen.png "Text Generation")
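The loop described above can be sketched with a toy stand-in for the transformer. Everything here is hypothetical: `TOY_MODEL` is a hand-written lookup table that only looks at the last token, whereas GPT-2 conditions on the whole sequence so far. The shape of the loop (predict, sample, append, repeat) is the part that carries over.

```python
import random

# Hypothetical stand-in for the transformer: maps the last token to
# candidate next tokens and their probabilities.
TOY_MODEL = {
    "where": {"for": 0.6, "art": 0.4},
    "for": {"art": 1.0},
    "art": {"thou": 0.9, "where": 0.1},
    "thou": {"art": 0.5, "where": 0.5},
}


def generate_tokens(prefix, length, seed=None):
    """Extend `prefix` (a list of tokens) by up to `length` sampled tokens."""
    if not prefix:
        raise ValueError("prefix must contain at least one token")
    rng = random.Random(seed)
    tokens = list(prefix)
    for _ in range(length):
        distribution = TOY_MODEL.get(tokens[-1])
        if not distribution:
            break  # no known continuation for this token
        candidates = list(distribution)
        weights = list(distribution.values())
        next_token = rng.choices(candidates, weights=weights, k=1)[0]
        # The new token becomes part of the input for the next step.
        tokens.append(next_token)
    return tokens


print(" ".join(generate_tokens(["where", "for"], length=6, seed=0)))
```

Here `prefix` and `length` play the same roles as the `prefix` and `length` arguments in the generation cells in this notebook.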

In [None]:
gpt2.generate(sess, model_name=model_name, length=100, temperature=0.7, nsamples=5, batch_size=5, prefix="Where for art thou")

In [None]:
gpt2.generate(sess, model_name=model_name, length=100, temperature=0.7, nsamples=5, batch_size=5, prefix="Where the hood at")

2023-03-27 16:15:47.318110: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled


Loading checkpoint checkpoint/run1/model-1
INFO:tensorflow:Restoring parameters from checkpoint/run1/model-1
This article is about the legendary character. You may be looking for Kudri. You may be looking for

Kudri is an iconic character in Fallout 4 and Fallout: New Vegas. He is a human male whose name means "sister" and who took the name "Kudri" from his mother, Mary.

Contents show]

Biography Edit

Kudri was born in the Kudri village of Anvil, a stone's throw from the continent of Tamriel, and grew up in the ruins of an abandoned town. After his mother's death, his father moved to the same town, after which the rest of the family members refused to give him any of their children. Despite the obvious kinship of his mother and father, Kudri's half-human half-human half-boy half-boy, who is named Kudri, was never brought up by his grandmother. He was raised by his son, who is named Kudri, and adopted by his grandmother, who taught him to read and write. He was neglected by his grandm

### Different Data

We can load some different data and see what our model generates.

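<b>Note:</b> you might need to delete and restart the runtime here. If you get errors, try that, rerun the imports at the top, skip the cells that create the Shakespeare-tuned model, and then run this section. Also note from the logs that fine-tuning resumes from the latest checkpoint in checkpoint/run1, so if the Shakespeare checkpoint is still there, this run continues from it rather than from the base model. Deleting that folder, or passing a different `run_name` to `gpt2.finetune`, avoids this.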

In [6]:
url2 = "https://jrssbcrsefilesnait.blob.core.windows.net/3950data1/reddit_wsb_clean.csv"
train_text_file = keras.utils.get_file('train_reddit.txt', url2)

sess_2 = gpt2.start_tf_sess()
gpt2.finetune(sess_2,
              train_text_file,
              model_name=model_name,
              steps=STEPS)   # steps is max number of training steps


2023-04-13 13:05:32.808532: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled


Loading checkpoint checkpoint/run1/model-18
INFO:tensorflow:Restoring parameters from checkpoint/run1/model-18
Loading dataset...


  0%|          | 0/1 [00:32<?, ?it/s]


KeyboardInterrupt: 

In [None]:
gpt2.generate(sess_2, model_name=model_name, length=100, temperature=0.7, nsamples=5, batch_size=5, prefix="I have diamond hands")

In [None]:
gpt2.generate(sess_2, model_name=model_name, length=100, temperature=0.7, nsamples=5, batch_size=5, prefix="Where for art thou")