In [None]:
import tensorflow as tf
import gpt_2_simple as gpt2
from datetime import datetime
import ipywidgets as widgets
import os
import shutil
import tarfile
from IPython.display import clear_output
clear_output()

In [None]:
gpt2.download_gpt2(model_name="124M")
gpt2.download_gpt2(model_name="355M")
clear_output()

# **Create AI-generated text with GPT-2-Simple**###

Ever wondered what a Shakespearean AI would sound like? Or what an AI would make of your essays? Wonder no more! Just follow the instructions here, and you'll be creating your own AI-generated text in (probably) no time.

GPT-2 is an open-source text-generation AI from OpenAI, first released in 2019. At the time, GPT-2 was a game changer in the AI natural language processing field. This notebook allows you (yes, *you*!) to train your own GPT-2-based AI and to generate plausible-sounding text with ease. Much of the code in this notebook was drawn from [GPT-2-simple](https://minimaxir.com/2019/09/howto-gpt2/) by Max Woolf.

What you can do with this notebook:
1. Train your own GPT-2 model with text data you've selected.
2. Generate text, using a model you've trained.
3. Generate text using pre-trained models (vanilla GPT-2, Shakespeare, and more).

Note: Each time you want to switch between models, you'll need to close this tab and re-open the link. There are almost certainly lots of bugs in this notebook; if you encounter trouble, just reload the page or restart the tab.

Note^2: This notebook may take a few seconds to a few minutes to fully load.

First, you'll need to connect the notebook to your Google Drive. Don't have Google Drive? Get Google Drive.

In [None]:
print("To connect your Google Drive:")
gpt2.mount_gdrive()

To connect your Google Drive:
Mounted at /content/drive


Great! You're ready to start exploring.

If you'd like to train a new AI with your own training text dataset, head to **1. Train a Model with Your own Text Data**. To generate text from a model you trained and saved previously, go to **2. Generate Text from a Model You've Made Before**.

If you want to switch from training your own model to generating text from a pre-trained model, or vice versa, you'll need to close the tab and re-open the link. You'll need to do the same if you want to train a new model after already training one or generating text from a pre-trained model in this session.

###**1. Train a Model with Your own Text Data**###

There are three main steps to training your own GPT-2 model:
1. Upload your dataset from Google Drive
2. Select your training hyperparameters
3. Train your model (and save it to your Google Drive for future use)

Do make sure that you go through those steps in sequence; otherwise, this notebook is liable to crash. You'll be able to generate text after you've trained your model.

###1.1 Upload your dataset from Google Drive###

"But omniscient narrator," I hear you ask, "I don't have my own dataset." If that's the case, go and collect together some text that you'd like to train your model to mimic. You'll need quite a lot - at least a couple of tens of thousands of words - to increase your chances of training a plausible-sounding AI, but with a bit of luck it's possible to manage with less.

Some ideas for text data:
*   The collected works of your favourite author, scholar, political figure, historical personage etc.
*   Everything you've ever written (e.g. all of your essays, articles and theses)
*   Your favourite book/books
*   A dataset from the web (there are plenty available, designed specifically for training natural language processing models)

Once you've got all your text data, you'll need to copy it into a single .txt file, and then upload it to the top-level directory in your Google Drive (i.e. your "My Drive" folder).
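If your text is spread across several files, a few lines of Python on your own computer can stitch them together before you upload. The sketch below is one way to do it, not part of this notebook: the function name `combine_text_files` and the folder and file names are made up for illustration. It also returns the word count, so you can check it against the rough guideline above.

```python
from pathlib import Path

def combine_text_files(source_dir, output_path):
    """Join every .txt file in source_dir into one file; return the word count."""
    source_dir = Path(source_dir)
    output_path = Path(output_path)
    pieces = []
    # Sort so the result is the same on every run; skip the output file itself
    for path in sorted(source_dir.glob("*.txt")):
        if path.resolve() == output_path.resolve():
            continue
        pieces.append(path.read_text(encoding="utf-8").strip())
    combined = "\n\n".join(pieces)
    output_path.write_text(combined, encoding="utf-8")
    return len(combined.split())
```

For example, `combine_text_files("my_essays", "my_text.txt")` would write a single `my_text.txt` (both names are placeholders) that you can then drop into your "My Drive" folder.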

Type the full filename of your .txt file (e.g. "shakespeare.txt") into the file_name field below, and press the "Upload dataset" button. You can upload as many datasets as you wish (at least, as many as the server storage will hold), but you can only train a given model on a single dataset in this notebook.



In [None]:
# Upload dataset from GDrive. User provides file name, and notebook fetches from GDrive
GDrive_dataset_filename = ""
def upload_dataset(file_name):
  global GDrive_dataset_filename
  GDrive_dataset_filename = file_name
  return file_name

def upload_dataset_on_click(filename):
  gpt2.copy_file_from_gdrive(GDrive_dataset_filename)
  print(GDrive_dataset_filename + " downloaded")

enter_dataset_name = widgets.interact(upload_dataset, file_name="")

upload_dataset_button = widgets.Button(description="Upload dataset")

upload_dataset_button.on_click(upload_dataset_on_click)

widgets.VBox([upload_dataset_button])

interactive(children=(Text(value='', description='file_name'), Output()), _dom_classes=('widget-interact',))

VBox(children=(Button(description='Upload dataset', style=ButtonStyle()),))

shakespeare.txt downloaded
Zhining corpus.txt downloaded


###1.2 Select your training hyperparameters###

Now that your dataset is loaded, you'll need to decide on the model's settings - its hyperparameters. To keep things simple, there are only 3 main training hyperparameters you can tune with this notebook:
1. The dataset you want to train your model on
2. The number of training steps
3. The size of the model

First, select the dataset to train on. Enter the filename of a dataset that you have already uploaded (e.g. "my_text.txt"). 

Note: A dataset containing all of Shakespeare's plays has already been loaded in for you. To use it, type in "shakespeare.txt".

In [None]:
# parameter selection
# dataset name (any dataset uploaded in this session, default is the last uploaded dataset)
DATASET_FILENAME = ""

def dataset_filename_parameter(file_name):
  global DATASET_FILENAME
  DATASET_FILENAME = file_name
  return (file_name)

enter_train_dataset_filename = widgets.interact(dataset_filename_parameter, file_name="")

interactive(children=(Text(value='', description='file_name'), Output()), _dom_classes=('widget-interact',))

Next, enter the number of training steps. In general, more training steps will lead to better results up to a point, but too many steps will cause your model to overfit and dramatically reduce the quality of generated text. The larger the number, the more time the model will take to train.

As a rough rule of thumb, for datasets of around a few MB in size, 10 steps will be too few, while 1000 steps will likely be too many. 100 or 200 steps is normally a safe number to go with.

Make sure you enter a positive integer figure.
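Why the fuss about positive integers? The text box hands your input over as a string, and Python's `int()` raises a `ValueError` on anything that isn't a whole number (including an empty box). If you wanted the notebook to be more forgiving, a small helper along these lines could fall back to a default instead. `parse_positive_int` is a hypothetical name and is not used by the cells in this notebook.

```python
def parse_positive_int(text, default):
    """Return text as a positive integer, or default if it is not one."""
    try:
        value = int(text.strip())
    except ValueError:
        # Covers empty input, letters, and decimals such as "1.5"
        return default
    return value if value > 0 else default
```

With this, `parse_positive_int("200", 100)` gives 200, while an empty or negative entry falls back to 100.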

In [None]:
# select train steps
TRAIN_STEPS = 0

def train_steps_parameter(train_steps):
  global TRAIN_STEPS
  TRAIN_STEPS = int(train_steps)
  return ("Train for " + str(TRAIN_STEPS) + " steps" )

enter_train_steps = widgets.interact(train_steps_parameter, train_steps="100")

interactive(children=(Text(value='100', description='train_steps'), Output()), _dom_classes=('widget-interact'…

Finally, select the size of the model. GPT-2 comes in a number of sizes - this notebook supports training on the 124 and 355 million parameter versions. While more parameters may lead to better results, high-parameter models take longer to train and may be counterproductive when the training dataset is small. Generally, the "124M" model is the safer option.

In [None]:
# select model name (i.e. 124M or 355M)
select_train_model_name = widgets.RadioButtons(
    options=['124M', '355M'],
    description='Select model',
    disabled=False
)

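TRAIN_MODEL_NAME = select_train_model_name.value

def on_train_model_change(change):
  # Keep TRAIN_MODEL_NAME in sync when a different radio button is picked;
  # without this, it would stay at the value read when the cell first ran
  global TRAIN_MODEL_NAME
  TRAIN_MODEL_NAME = change['new']

select_train_model_name.observe(on_train_model_change, names='value')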

widgets.VBox([select_train_model_name])

VBox(children=(RadioButtons(description='Select model', options=('124M', '355M'), value='124M'),))

###1.3 Train your model###

You're ready to train your AI! Once you've uploaded a dataset and set all the hyperparameters, press "Train your model" below. Training may take some time, depending on the size of your dataset, the size of the model selected, and the number of training steps. Expect to wait at least a minute or two. Very large datasets or training step numbers may take over an hour to train.

In [None]:
# Fine-tune the selected base model on the dataset named in DATASET_FILENAME
sess = gpt2.start_tf_sess()
def train_your_model(change):
  gpt2.finetune(sess,
              dataset=DATASET_FILENAME,
              model_name=TRAIN_MODEL_NAME,
              steps=TRAIN_STEPS,
              restore_from='fresh',
              run_name='run1',
              print_every=10,
              sample_every=100,
              save_every=500
              )
# train model button
train_model_button = widgets.Button(description="Train your model!")

train_model_button.on_click(train_your_model)

widgets.VBox([train_model_button])

VBox(children=(Button(description='Train your model!', style=ButtonStyle()),))

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Loading checkpoint models/124M/model.ckpt
INFO:tensorflow:Restoring parameters from models/124M/model.ckpt


  0%|          | 0/1 [00:00<?, ?it/s]

Loading dataset...


100%|██████████| 1/1 [00:01<00:00,  1.71s/it]


dataset has 338024 tokens
Training...
[10 | 28.92] loss=3.58 avg=3.58
Saving checkpoint/run1/model-10


If you'd like to access your trained model again after this session, you'll need to save it to your Google Drive. Upon pressing "Copy model to GDrive", the notebook will save your model in compressed form to your "My Drive" folder with the filename "checkpoint_run1.tar".

Make sure you've renamed any other models you've made with this notebook in your "My Drive" folder to something other than "checkpoint_run1.tar" to avoid overwriting anything. It may take a few moments for the model to be saved.
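One way to sidestep the overwriting problem altogether is to give each archive a unique name. The sketch below uses Python's standard `tarfile` and `datetime` modules to pack a folder into a .tar whose filename includes the current time. It's an illustration only: `archive_with_timestamp` is a hypothetical helper, and gpt-2-simple's own `copy_checkpoint_to_gdrive` does not work this way.

```python
import os
import tarfile
from datetime import datetime

def archive_with_timestamp(folder, dest_dir, now=None):
    """Pack folder into dest_dir as <folder>_<YYYYmmdd_HHMMSS>.tar; return the path."""
    now = now or datetime.now()
    # Mirror the "checkpoint_run1" naming by replacing path separators with "_"
    base = os.path.normpath(folder).replace(os.sep, "_")
    tar_path = os.path.join(dest_dir, f"{base}_{now:%Y%m%d_%H%M%S}.tar")
    with tarfile.open(tar_path, "w") as tar:
        # A relative folder path keeps the checkpoint/run1 layout inside the archive
        tar.add(folder)
    return tar_path
```

Called with a relative path such as `archive_with_timestamp("checkpoint/run1", "/content/drive/My Drive")`, it would produce a file named along the lines of `checkpoint_run1_<date>_<time>.tar`.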

In [None]:
# Copy checkpoint to GDrive # TODO
def copy_model_to_GDrive(change):
  gpt2.copy_checkpoint_to_gdrive(run_name='run1')
  print("Model copied to Google Drive")

copy_model_to_GDrive_button = widgets.Button(description="Copy model to GDrive")

copy_model_to_GDrive_button.on_click(copy_model_to_GDrive)

widgets.VBox([copy_model_to_GDrive_button])

VBox(children=(Button(description='Copy model to GDrive', style=ButtonStyle()),))

Model copied to Google Drive


###1.4 Generate text from your trained model###

How well does your model perform? Does it produce plausible-sounding, more-or-less grammatically correct text? (Probably.) Can it ace the Turing test? (Probably not.)

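Before you can find out, you'll need to state what you want. First, enter the length you'd like each generated text to be. The default is 250. Note that gpt-2-simple measures length in tokens (word pieces) rather than whole words, so the result will usually be somewhat shorter than that many words. Make sure you enter a positive integer.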

In [None]:
# TEXT_LENGTH
TEXT_LENGTH = 0

def text_length_parameter(text_length):
  global TEXT_LENGTH
  TEXT_LENGTH = int(text_length)
  return ("Generate text of length " + str(TEXT_LENGTH) + " tokens")

enter_text_length = widgets.interact(text_length_parameter, text_length="250")

interactive(children=(Text(value='250', description='text_length'), Output()), _dom_classes=('widget-interact'…

Next, set the "temperature". Temperature can be thought of as something that sets how adventurous the model will be in generating text. A temperature of 0.1 will lead to some deeply (and I mean *deeply*) uninspiring stuff. Meanwhile, a temperature of 0.9 will produce results that will range somewhere between "creative" and "ridiculous".

The default temperature, 0.7, generally produces the best results.
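Curious what temperature actually does? In the usual formulation, the model's raw scores for each candidate next token are divided by the temperature before being turned into probabilities. A low temperature exaggerates the gap between the favourite and the rest; a high one flattens it. The sketch below shows the idea in plain Python with made-up scores. It illustrates the general technique and is not code lifted from gpt-2-simple.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5]  # made-up scores for three candidate next words
for t in (0.1, 0.7, 0.9):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.1 nearly all of the probability lands on the top-scoring word, which is why low temperatures give such repetitive text; at 0.9 the runners-up get a real chance of being picked.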

In [None]:
# TEMPERATURE
TEMPERATURE = 0

def temperature_parameter(temperature):
  global TEMPERATURE
  TEMPERATURE = temperature
  return ("Text generation temperature: " + str(TEMPERATURE))

enter_temperature = widgets.interact(temperature_parameter, temperature=widgets.FloatSlider(
  value=0.7,
  min=0.0,
  max=1.0,
  step=0.01,
  description='Temperature:'))

interactive(children=(FloatSlider(value=0.7, description='Temperature:', max=1.0, step=0.01), Output()), _dom_…

How many text samples would you like to generate in one go? Make sure you enter a positive integer.
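A word of warning: gpt-2-simple generates samples in batches, and its `generate` function requires the number of samples to be a multiple of the batch size. The hypothetical helper below (not part of this notebook or of gpt-2-simple) sketches one way to pick a compatible batch size for any sample count, with a cap of 5 chosen purely for illustration.

```python
def pick_batch_size(nsamples, max_batch_size=5):
    """Return the largest batch size <= max_batch_size that divides nsamples."""
    if nsamples < 1:
        raise ValueError("nsamples must be a positive integer")
    # Count down from the cap; a batch size of 1 always divides nsamples
    for size in range(min(max_batch_size, nsamples), 0, -1):
        if nsamples % size == 0:
            return size
```

So 10 samples could be generated in batches of 5, 8 samples in batches of 4, and 7 samples one at a time.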

In [None]:
# NSAMPLES
NSAMPLES = 0

def nsamples_parameter(nsamples):
  global NSAMPLES
  NSAMPLES = int(nsamples)
  return ("Number of text samples to generate: " + str(NSAMPLES))

enter_nsamples = widgets.interact(nsamples_parameter, nsamples="5")



interactive(children=(Text(value='5', description='nsamples'), Output()), _dom_classes=('widget-interact',))

What would you like each generated sample text to start with? The model will generate a text based on the starter text you enter below.

The default is "Once upon a time", but that's boring. Generally speaking, the more creative the starter text, the more interesting/silly the results, so get creative.

In [None]:
# STARTER_TEXT
STARTER_TEXT = ""

def starter_text_parameter(starter_text):
  global STARTER_TEXT
  STARTER_TEXT = starter_text
  return ("Starter text: " + starter_text)

enter_starter_text = widgets.interact(starter_text_parameter, starter_text="Once upon a time")

interactive(children=(Text(value='Once upon a time', description='starter_text'), Output()), _dom_classes=('wi…

Once you're done entering the settings, press "Generate text" below to see what your model comes up with! You may need to wait a few seconds for the texts to be generated.

You can generate text, with different settings, as many times as you like in this session. However, if you want to train a new model or load in a new model in the section of the notebook below, you'll probably need to close the tab and re-open the link to avoid bugs or errors.

In [None]:
generate_text_button = widgets.Button(
    description='Generate text',
    disabled=False,
)

display(generate_text_button)

out = widgets.Output()
display(out)

def generate_text(clear_text_button):
    with out:
        clear_output()
        gpt2.generate(sess,
                length=TEXT_LENGTH,
                temperature=TEMPERATURE,
                prefix=STARTER_TEXT,
                nsamples=NSAMPLES,
                # generate() requires nsamples to be a multiple of batch_size
                batch_size=5 if NSAMPLES % 5 == 0 else 1
                )

generate_text_button.on_click(generate_text)

Button(description='Generate text', style=ButtonStyle())

Output()

###**2. Generate Text from a Model You've Made Before**###

###2.1 Load in a Model You've Made Before###

Do you want to show off your model to all your friends? Do you have an uncontrollable addiction to creating auto-generated text? Are you bored and have nothing better to do?

If you've saved a model you trained in this notebook on a previous occasion into your Google Drive, and you've answered "Yes, of course!" to any of the above questions, you'll want to load your model back in here to generate more text.

To do so, place the .tar file containing the model into your highest-level directory in your Google Drive (i.e. your "My Drive" folder). Then, enter the filename of the model, without the ".tar" filename extension, in the field below. The default is "checkpoint_run1".
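Not sure what's inside a .tar file you saved a while ago? You can peek before extracting it. The sketch below lists the run folders in an archive, assuming it follows the `checkpoint/<run_name>/...` layout of archives saved by this notebook. The function name `list_run_names` is hypothetical.

```python
import tarfile

def list_run_names(tar_path):
    """Return the run folder names found under 'checkpoint/' in a tar archive."""
    runs = set()
    with tarfile.open(tar_path, "r") as tar:
        for name in tar.getnames():
            parts = name.split("/")
            # Expect member paths shaped like checkpoint/<run_name>/<file>
            if len(parts) >= 2 and parts[0] == "checkpoint" and parts[1]:
                runs.add(parts[1])
    return sorted(runs)
```

For an archive saved by this notebook, you'd expect this to report a single run named `run1`, even if you've since renamed the .tar file itself.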

In [None]:
# Select the filename of the model to be copied in from GDrive
UPLOAD_MODEL_FILENAME = ""

def upload_model_filename_parameter(upload_model_filename):
  global UPLOAD_MODEL_FILENAME
  UPLOAD_MODEL_FILENAME = upload_model_filename
  return ("Model to upload: " + upload_model_filename)

enter_upload_model_filename = widgets.interact(upload_model_filename_parameter, upload_model_filename="checkpoint_run1")

interactive(children=(Text(value='checkpoint_run1', description='upload_model_filename'), Output()), _dom_clas…

In [None]:
# Load in a previously trained model
def my_copy_checkpoint_from_gdrive(model_filename='checkpoint_run1'): # this is an altered version of gpt-2-simple's copy_checkpoint_from_GDrive() by Max Woolf
  """Copies the checkpoint folder from a mounted Google Drive."""
  gpt2.is_mounted()
  
  checkpoint_folder = os.path.join('checkpoint', model_filename)

  file_path = model_filename + '.tar'

  #file_path = gpt2.get_tarfile_name(checkpoint_folder)

  shutil.copyfile("/content/drive/My Drive/" + file_path, file_path)

  with tarfile.open(file_path, 'r') as tar:
    tar.extractall()
  
  print(model_filename + " downloaded and extracted")

def copy_model_from_GDrive(change):
  # Use the filename entered above rather than a hard-coded default
  my_copy_checkpoint_from_gdrive(model_filename=UPLOAD_MODEL_FILENAME)
  # Load the extracted weights so gpt2.generate() has a model to sample from.
  # Assumes the archive unpacks to checkpoint/run1, as archives saved by this notebook do.
  global sess
  sess = gpt2.reset_session(sess)
  gpt2.load_gpt2(sess, run_name='run1')
  return "Model copied from Google Drive"

copy_model_from_GDrive_button = widgets.Button(description="Copy model from GDrive")

copy_model_from_GDrive_button.on_click(copy_model_from_GDrive)

widgets.VBox([copy_model_from_GDrive_button])


VBox(children=(Button(description='Copy model from GDrive', style=ButtonStyle()),))

checkpoint_run1 downloaded and extracted
checkpoint_run1 downloaded and extracted


Next, select the size of the model you just uploaded.

Don't remember the model size you used when you trained it?

I have no answers I'm afraid. Just try both options. If one option doesn't work, you may need to close the tab and re-open the link to get the notebook working again.
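That said, there may be a way to find out. The sketch below assumes, as is normally the case for gpt-2-simple checkpoints, that the extracted checkpoint folder contains an `hparams.json` file describing the model. The 124M model has 12 layers and the 355M model has 24, so the `n_layer` value gives the size away. `guess_model_size` and `SIZE_BY_LAYERS` are hypothetical names, not part of gpt-2-simple.

```python
import json

# Layer counts of the two GPT-2 sizes this notebook supports
SIZE_BY_LAYERS = {12: "124M", 24: "355M"}

def guess_model_size(hparams_path):
    """Read an hparams.json file and return '124M', '355M', or None if unrecognised."""
    with open(hparams_path, encoding="utf-8") as f:
        hparams = json.load(f)
    return SIZE_BY_LAYERS.get(hparams.get("n_layer"))
```

After copying your model in, you could try `guess_model_size("checkpoint/run1/hparams.json")`, assuming the archive unpacked to that folder.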

In [None]:
# LOAD_MODEL_NAME (Default 124M)

LOAD_MODEL_NAME = "124M"

select_load_model_name = widgets.RadioButtons(
    options=['124M', '355M'],
    description='Select model',
    disabled=False
)

LOAD_MODEL_NAME = select_load_model_name.value

widgets.VBox([select_load_model_name])


VBox(children=(RadioButtons(description='Select model', options=('124M', '355M'), value='124M'),))

###2.2 Generate Text from Your Model###

First, enter the length you'd like each generated text sample to have. As before, gpt-2-simple counts this in tokens (word pieces) rather than whole words. Make sure you enter a positive integer.

In [None]:
# TEXT_LENGTH
TEXT_LENGTH = 0

def text_length_parameter(text_length):
  global TEXT_LENGTH
  TEXT_LENGTH = int(text_length)
  return ("Generate text of length " + str(TEXT_LENGTH) + " tokens")

enter_text_length = widgets.interact(text_length_parameter, text_length="250")

interactive(children=(Text(value='250', description='text_length'), Output()), _dom_classes=('widget-interact'…

Next, set your preferred temperature. Use the slider to pick a value between 0.0 and 1.0.

In [None]:
# TEMPERATURE
TEMPERATURE = 0

def temperature_parameter(temperature):
  global TEMPERATURE
  TEMPERATURE = temperature
  return ("Text generation temperature: " + str(TEMPERATURE))

enter_temperature = widgets.interact(temperature_parameter, temperature=widgets.FloatSlider(
  value=0.7,
  min=0.0,
  max=1.0,
  step=0.01,
  description='Temperature:'))

interactive(children=(FloatSlider(value=0.7, description='Temperature:', max=1.0, step=0.01), Output()), _dom_…

How many text samples would you like your model to generate? Enter a positive integer below.

In [None]:
# NSAMPLES
NSAMPLES = 0

def nsamples_parameter(nsamples):
  global NSAMPLES
  NSAMPLES = int(nsamples)
  return ("Number of text samples to generate: " + str(NSAMPLES))

enter_nsamples = widgets.interact(nsamples_parameter, nsamples="5")

interactive(children=(Text(value='5', description='nsamples'), Output()), _dom_classes=('widget-interact',))

Last, but certainly not least, give your model a starting prompt to build on. "Once upon a time" is the default.

In [None]:
# STARTER_TEXT
STARTER_TEXT = ""

def starter_text_parameter(starter_text):
  global STARTER_TEXT
  STARTER_TEXT = starter_text
  return ("Starter text: " + starter_text)

enter_starter_text = widgets.interact(starter_text_parameter, starter_text="Once upon a time")

interactive(children=(Text(value='Once upon a time', description='starter_text'), Output()), _dom_classes=('wi…

You're good to go! Press the "Generate text" button, and your generated texts will appear after a few seconds.

In [None]:
generate_text_button = widgets.Button(
    description='Generate text',
    disabled=False,
)

display(generate_text_button)

out = widgets.Output()
display(out)

def generate_text(clear_text_button):
    with out:
        clear_output()
        gpt2.generate(sess,
                length=TEXT_LENGTH,
                temperature=TEMPERATURE,
                prefix=STARTER_TEXT,
                nsamples=NSAMPLES,
                # generate() requires nsamples to be a multiple of batch_size
                batch_size=5 if NSAMPLES % 5 == 0 else 1
                )

generate_text_button.on_click(generate_text)

Button(description='Generate text', style=ButtonStyle())

Output()

In [None]:
os.path.join('test', 'test')

'test/test'