<a href="https://colab.research.google.com/github/tedk108/GPT-2-Journalism-Can-AI-produce-Mike-Royko-s-writing-/blob/main/gpt_Train_a_GPT_2_Text_Generating_Model_w_GPU_mikeroyko_20210514_public.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Train a GPT-2 Text-Generating Model w/ GPU For Free

by [Max Woolf](http://minimaxir.com)

*Last updated: February 14th, 2021*

Retrain an advanced text generating neural network on any text dataset **for free on a GPU using Colaboratory** using `gpt-2-simple`!

For more about `gpt-2-simple`, you can visit [this GitHub repository](https://github.com/minimaxir/gpt-2-simple). You can also read my [blog post](https://minimaxir.com/2019/09/howto-gpt2/) for more information on how to use this notebook!


To get started:

1. Copy this notebook to your Google Drive to keep it and save your changes. (File -> Save a Copy in Drive)
2. Make sure you're running the notebook in Google Chrome.
3. Run the cells below:


In [None]:
%%time

%tensorflow_version 1.x
!pip install -q gpt-2-simple
import gpt_2_simple as gpt2
from datetime import datetime
from google.colab import files

CPU times: user 12.2 ms, sys: 36.4 ms, total: 48.6 ms
Wall time: 2.38 s


## Downloading GPT-2

If you're retraining a model on new text, you need to download the GPT-2 model first. 

There are four released sizes of GPT-2:

* `124M` (default): the "small" model, 500MB on disk.
* `355M`: the "medium" model, 1.5GB on disk.
* `774M`: the "large" model, cannot currently be finetuned with Colaboratory but can be used to generate text from the pretrained model (see later in Notebook)
* `1558M`: the "extra large", true model. Will not work if a K80/P4 GPU is attached to the notebook. (like `774M`, it cannot be finetuned).

Larger models have more knowledge, but take longer to finetune and longer to generate text. You can specify which base model to use by changing `model_name` in the cells below.

The next cell downloads it from Google Cloud Storage and saves it in the Colaboratory VM at `/models/<model_name>`.

This model isn't permanently saved in the Colaboratory VM; you'll have to redownload it if you want to retrain it at a later time.
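
Because the model disappears whenever the VM is reset, a small guard can skip the download when the files are still present. The following is a minimal sketch, not part of `gpt-2-simple`: the helper name `model_is_downloaded` is hypothetical, it assumes the `models/<model_name>` folder layout described above relative to the working directory, and it checks only a few of the file names that appear in the download cell's progress output.

```python
import os

# A few of the files the download cell fetches for each model
# (names taken from its progress output; this list is an assumption, not exhaustive).
EXPECTED_FILES = ("checkpoint", "encoder.json", "hparams.json", "vocab.bpe")


def model_is_downloaded(model_name, models_dir="models"):
    """Return True if every expected file exists under models_dir/model_name."""
    model_path = os.path.join(models_dir, model_name)
    return all(
        os.path.isfile(os.path.join(model_path, name)) for name in EXPECTED_FILES
    )


# Usage in the notebook (gpt2 is the module imported in the first cell):
# if not model_is_downloaded("355M"):
#     gpt2.download_gpt2(model_name="355M")
```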

In [None]:
%%time

# NOTE: 1m25s at  4:22 on 20210414 to download 124M model
#       8m00s at 17:30 on 20210419 to download 355M model

gpt2.download_gpt2(model_name="355M")

Fetching checkpoint: 1.05Mit [00:00, 459Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 3.63Mit/s]
Fetching hparams.json: 1.05Mit [00:00, 360Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 1.42Git [03:26, 6.86Mit/s]                                 
Fetching model.ckpt.index: 1.05Mit [00:00, 229Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 3.45Mit/s]
Fetching vocab.bpe: 1.05Mit [00:00, 4.40Mit/s]

CPU times: user 5.03 s, sys: 3.74 s, total: 8.77 s
Wall time: 3min 29s





## Upload your fine tuning textfile.txt to the temporary Colab drive

In [None]:
# DRAG AND DROP your clean corpus.txt file onto the temporary virtual Colab drive as shown in the class video

# Your filename should have no spaces, only alphanumeric and underscore characters with a '.txt' file extension

# e.g. 'fitzgerald_sarahg.txt'
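
The naming rule in the cell above can be checked before uploading. This is a minimal sketch under the assumption that "alphanumeric" means ASCII letters and digits; the helper name `is_valid_corpus_name` is hypothetical.

```python
import re

# Letters, digits, and underscores only, followed by a literal ".txt".
VALID_NAME = re.compile(r"[A-Za-z0-9_]+\.txt")


def is_valid_corpus_name(name):
    """Return True if name follows the naming rule described above."""
    return VALID_NAME.fullmatch(name) is not None


print(is_valid_corpus_name("fitzgerald_sarahg.txt"))  # True
print(is_valid_corpus_name("my corpus.txt"))          # False
```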

In [None]:
# Verify that you can see your uploaded file here

!ls

checkpoint  mike_royko_new.txt	mike_royko.txt	models	sample_data  samples


In [None]:
# Verify the content of the first 25 lines

!head -n 25 mike_royko.txt  # Change 'mike_royko.txt' to the name of your file (e.g. 'fitzgerald_sarahg.txt')

1995 
Enquiring minds don't need to know 
Tuesday, January 3, 1995 
When we met for our traditional New Year's drink of Ovaltine, Slats Grobnik said: "Tell me about those pills. You buy them across the counter or does a doc have to write a prescription?" 
Pills? What pills? "Those Stupid Pills I figure you been taking lately. Boy, they really did the job, didn't they?" I am not familiar with Stupid Pills and have not used them. "You did it all on your own? Boy, then you're a natural like that baseball movie with Robert Redford. Maybe they'll make a 
movie like that about you, except at the end the only fireworks will be in your head." Do you mind if we talk about something else? "Hey, no problem." Thank you. I gather that you, too, have been on vacation. Did you have a pleasant time? "Sure. And I didn't get arrested even once." 
That wasn't what I meant. Did you go anywhere? "Yeah, I took a little trip. You wanna hear about it?" Sure. "Well, I got where I was going without having

In [None]:
# Change "mike_royko.txt" to your filename
# e.g. 
# file_name = 'harry_potter.txt'

file_name = "mike_royko.txt"  

In [None]:
# Read/Write file to force convert to known/legal encoding

import io
with io.open(file_name, "r", encoding="utf-8", errors='ignore') as my_file:
     my_unicode_str = my_file.read() 

print(my_unicode_str[:5000])

file_name_new = file_name.split('.')[0] + '_new.txt'

with io.open(file_name_new, "w", encoding="utf-8", errors='ignore') as my_file:
     my_file.write(my_unicode_str) 

print(f'\n==========\n\n[Just wrote out new cleaned filename]: {file_name_new}')

1995 
Enquiring minds don't need to know 
Tuesday, January 3, 1995 
When we met for our traditional New Year's drink of Ovaltine, Slats Grobnik said: "Tell me about those pills. You buy them across the counter or does a doc have to write a prescription?" 
Pills? What pills? "Those Stupid Pills I figure you been taking lately. Boy, they really did the job, didn't they?" I am not familiar with Stupid Pills and have not used them. "You did it all on your own? Boy, then you're a natural like that baseball movie with Robert Redford. Maybe they'll make a 
movie like that about you, except at the end the only fireworks will be in your head." Do you mind if we talk about something else? "Hey, no problem." Thank you. I gather that you, too, have been on vacation. Did you have a pleasant time? "Sure. And I didn't get arrested even once." 
That wasn't what I meant. Did you go anywhere? "Yeah, I took a little trip. You wanna hear about it?" Sure. "Well, I got where I was going without having to pu

## Finetune GPT-2

The next cell will start the actual finetuning of GPT-2. It creates a persistent TensorFlow session which stores the training config, then runs the training for the specified number of `steps`. (To have the finetuning run indefinitely, set `steps = -1`.)

The model checkpoints will be saved in `/checkpoint/run1` by default. The checkpoints are saved every 500 steps (can be changed) and when the cell is stopped.

The training might time out after 4ish hours; make sure you end training and save the results so you don't lose them!
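
One way to save the results is to zip the run's checkpoint folder so it can be downloaded in one piece. The sketch below uses only the Python standard library and assumes the `checkpoint/<run_name>` layout described above; the helper name `archive_run` and the archive name `checkpoint_<run_name>.zip` are hypothetical choices.

```python
import os
import shutil


def archive_run(run_name="run1", checkpoint_dir="checkpoint", out_dir="."):
    """Zip checkpoint_dir/run_name and return the path of the archive."""
    run_path = os.path.join(checkpoint_dir, run_name)
    if not os.path.isdir(run_path):
        raise FileNotFoundError(f"No checkpoint folder at {run_path}")
    # Use an absolute base name so the archive lands in out_dir regardless of root_dir.
    base_name = os.path.abspath(os.path.join(out_dir, f"checkpoint_{run_name}"))
    # make_archive appends ".zip" and returns the full archive path.
    return shutil.make_archive(
        base_name, "zip", root_dir=checkpoint_dir, base_dir=run_name
    )


# Usage in the notebook (files is imported from google.colab in the first cell):
# files.download(archive_run("run1"))
```

Checkpoints for the larger models are big, so expect the archive step and the download to take a while.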

**IMPORTANT NOTE:** If you want to rerun this cell, **restart the VM first** (Runtime -> Restart Runtime). You will need to rerun imports but not recopy files.

Other optional-but-helpful parameters for `gpt2.finetune`:


*  **`restore_from`**: Set to `fresh` to start training from the base GPT-2, or set to `latest` to restart training from an existing checkpoint.
* **`sample_every`**: Number of steps between printed example outputs.
* **`print_every`**: Number of steps to print training progress.
* **`learning_rate`**:  Learning rate for the training. (default `1e-4`, can lower to `1e-5` if you have <1MB input data)
*  **`run_name`**: subfolder within `checkpoint` to save the model. This is useful if you want to work with multiple models (will also need to specify  `run_name` when loading the model)
* **`overwrite`**: Set to `True` if you want to continue finetuning an existing model (w/ `restore_from='latest'`) without creating duplicate copies. 
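
The options above tend to be set together, for example when continuing an earlier run. The following sketch collects them into keyword arguments; `finetune_settings` is a hypothetical helper, and the values it picks simply restate the guidance in the list above.

```python
def finetune_settings(resume, run_name="run1", small_dataset=False):
    """Build keyword arguments for gpt2.finetune from the options described above."""
    return {
        "run_name": run_name,
        # 'latest' continues from the run's last checkpoint; 'fresh' starts from base GPT-2.
        "restore_from": "latest" if resume else "fresh",
        # Overwriting only makes sense when continuing an existing run.
        "overwrite": resume,
        # The list above suggests 1e-5 for inputs under 1MB; 1e-4 is the stated default.
        "learning_rate": 1e-5 if small_dataset else 1e-4,
    }


# Usage in the notebook:
# gpt2.finetune(sess, dataset=file_name_new, model_name="355M", steps=1000,
#               **finetune_settings(resume=True))
```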

In [None]:
# %reset_selective sess

In [None]:
%whos

Variable         Type             Data/Info
-------------------------------------------
datetime         type             <class 'datetime.datetime'>
file_name        str              mike_royko.txt
file_name_new    str              mike_royko_new.txt
files            module           <module 'google.colab.fil<...>s/google/colab/files.py'>
gpt2             module           <module 'gpt_2_simple' fr<...>pt_2_simple/__init__.py'>
io               module           <module 'io' from '/usr/lib/python3.7/io.py'>
my_file          TextIOWrapper    <_io.TextIOWrapper name='<...>ode='w' encoding='utf-8'>
my_unicode_str   str              1995 \nEnquiring minds do<...>per, this town, weeps. \n
sess             Session          <tensorflow.python.client<...>object at 0x7fc5940e5c90>


In [None]:
# import tensorflow as tf
# tf.reset_default_graph()

In [None]:
%%time

# NOTE: 21m46s at 4:23 on 20210414 with 124M model
#       11m05s at 17:30 on 20210419 with 355M model
#              at  5:15 on 20210420 with 355M model/fitzgerald_sarahg.txt

sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset=file_name_new,
              model_name='355M',    # Change this if you use a different sized model
              steps=2000,           # Number of training steps (not epochs)
              restore_from='fresh',
              run_name='run1',
              print_every=10,       # Print loss/avg metrics every 10 steps
              sample_every=200,     # Print sample generated text every 200 steps to see how the model is learning
              save_every=500        # Save a checkpoint every 500 steps in case of crash/interruptions
              )

ValueError: ignored

## (a) Generate Text without a prompt

After you've trained the model or loaded a retrained model from checkpoint, you can now generate text. `generate` generates a single text from the loaded model.

In [None]:
%%time

# NOTE:  0m22s at 5:32 on 20210420 using 355M/fitzgerald_sarahg

# Try generating new text with default model parameters

gpt2.generate(sess, run_name='run1')

## (b) Generate Text with a Custom Prompt

If you're creating an API based on your model and need to pass the generated text elsewhere, you can do `text = gpt2.generate(sess, return_as_list=True)[0]`

You can also pass in a `prefix` to the generate function to force the text to start with a given character sequence and generate text from there (good if you add an indicator when the text starts).

You can also generate multiple texts at a time by specifying `nsamples`. Unique to GPT-2, you can pass a `batch_size` to generate multiple samples in parallel, giving a massive speedup (in Colaboratory, set a maximum of 20 for `batch_size`).

Other optional-but-helpful parameters for `gpt2.generate` and friends:

*  **`length`**: Number of tokens to generate (default 1023, the maximum)
* **`temperature`**: The higher the temperature, the crazier the text (default 0.7, recommended to keep between 0.7 and 1.0)
* **`top_k`**: Limits the generated guesses to the top *k* guesses (default 0 which disables the behavior; if the generated output is super crazy, you may want to set `top_k=40`)
* **`top_p`**: Nucleus sampling: limits the generated guesses to a cumulative probability. (gets good results on a dataset with `top_p=0.9`)
* **`truncate`**: Truncates the input text until a given sequence, excluding that sequence (e.g. if `truncate='<|endoftext|>'`, the returned text will include everything before the first `<|endoftext|>`). It may be useful to combine this with a smaller `length` if the input texts are short.
*  **`include_prefix`**: If using `truncate` and `include_prefix=False`, the specified `prefix` will not be included in the returned text.
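
To make `temperature` and `top_k` more concrete, here is a general illustration of the two techniques in plain Python. It is not `gpt-2-simple`'s internal code; the function name `adjusted_probs` and the example scores are made up for the demonstration.

```python
import math


def adjusted_probs(logits, temperature=0.7, top_k=0):
    """Return sampling probabilities after temperature scaling and optional top-k filtering."""
    # Dividing by a temperature below 1 widens the gaps between scores.
    scaled = [x / temperature for x in logits]
    if top_k > 0:
        # Keep only scores at or above the k-th highest; the rest get zero probability.
        # (Ties at the cutoff are all kept in this simplified version.)
        cutoff = sorted(scaled, reverse=True)[min(top_k, len(scaled)) - 1]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]
    # Softmax, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
print(adjusted_probs(logits, temperature=0.7))
print(adjusted_probs(logits, temperature=1.0))
print(adjusted_probs(logits, temperature=0.7, top_k=2))
```

Lower temperatures concentrate probability on the highest-scoring token, which is why higher values read as "crazier"; `top_k` removes the unlikely candidates altogether.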

## Prompt #1

## Prompt #2

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

FailedPreconditionError: ignored

## Prompt #3

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #4

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #5

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #6

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #7

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #8

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #9

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #10

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="Chicago Cubs win the world series in 2016", # ENTER a custom starting prompt
              nsamples=5,)

## Prompt #1

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #2

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #3

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #4

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #5

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #6

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #7

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #8

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #9

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

## Prompt #10

In [None]:
%%time

# ENTER a custom starting prompt as a starting seed for GPT-2 to begin generating text

gpt2.generate(sess,
              length=1000,
              temperature=0.7,
              # Replace the 'prefix' string variable with your own starting seed prompt
              # prefix="It is time for world leaders to acknowledge the failure of their economic reforms. What is called for now is rather a", # ENTER a custom starting prompt
              prefix="<Enter a custom starting prompt for GPT-2 to use as a beginning seed to generate new text>", # ENTER a custom starting prompt
              nsamples=5,
              batch_size=5
              )

For bulk generation, you can generate a large amount of text to a file and sort out the samples locally on your computer. Each of the following cells writes its samples to a text file whose name includes a unique timestamp.

You can rerun the cells as many times as you want for even more generated texts!
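
Repeated generation can also be expressed as a loop. The sketch below is one possible arrangement, not part of `gpt-2-simple`: `generate_batches` is a hypothetical helper that takes the generating function as an argument, and the two-digit index appended to each file name is an addition to the naming pattern used in this notebook.

```python
from datetime import datetime, timezone


def generate_batches(generate_fn, n_files=10, **gen_kwargs):
    """Call generate_fn once per output file and return the list of file names."""
    paths = []
    for i in range(1, n_files + 1):
        # The index keeps names unique even if two files are created in the same second.
        path = "gpt2_gentext_{:%Y%m%d_%H%M%S}_{:02d}.txt".format(
            datetime.now(timezone.utc), i
        )
        generate_fn(destination_path=path, **gen_kwargs)
        paths.append(path)
    return paths


# Usage in the notebook (sess is the session created for finetuning):
# from functools import partial
# gen_files = generate_batches(partial(gpt2.generate_to_file, sess),
#                              length=500, temperature=0.7,
#                              nsamples=100, batch_size=20)
```

The file names still start with `gpt2_gentext_` and end in `.txt`, so shell patterns such as `gpt2_*.txt` continue to match them.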

## (c) Generate 10 files of Samples without a prompt (each file has 100 Samples for 1,000 Samples total)

### Generate Text #1

In [None]:
%%time

# NOTE:  1m39s at  5:35 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file1 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file1,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

CPU times: user 1min 5s, sys: 4.59 s, total: 1min 10s
Wall time: 2min 44s


### Generate Text #2

In [None]:
%%time

gen_file2 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

# NOTE:  1m40s at  5:37 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gpt2.generate_to_file(sess,
                      destination_path=gen_file2,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

CPU times: user 1min 5s, sys: 4.35 s, total: 1min 10s
Wall time: 2min 44s


### Generate Text #3

In [None]:
%%time

# NOTE:  1m40s at  5:39 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file3 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file3,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #4

In [None]:
%%time

# NOTE:  1m39s at  5:42 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file4 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file4,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #5

In [None]:
%%time

# NOTE:  1m40s at  5:45 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file5 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file5,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #6

In [None]:
%%time

# NOTE:  1m41s at  5:47 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file6 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file6,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #7

In [None]:
%%time

# NOTE:  1m41s at  5:49 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file7 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file7,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #8

In [None]:
%%time

# NOTE:  1m41s at  5:51 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file8 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file8,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #9

In [None]:
%%time

# NOTE:  1m39s at  5:53 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file9 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file9,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Generate Text #10

In [None]:
%%time

# NOTE:  1m39s at  5:42 on 20210420 with 355M model fine-tuned on fitzgerald_sarahg.txt 

gen_file10 = 'gpt2_gentext_{:%Y%m%d_%H%M%S}.txt'.format(datetime.utcnow())

gpt2.generate_to_file(sess,
                      destination_path=gen_file10,
                      length=500,
                      temperature=0.7,
                      nsamples=100,
                      batch_size=20
                      )

### Zip all 10 files of generated samples (100 each) and download the zip file

In [None]:
!ls gpt2_*.txt

In [None]:
zipfile_name = 'gpt2_gentext_' + file_name.split('.')[0].strip() + '.zip'

In [None]:
!zip -r $zipfile_name gpt2_gentext_* # gen_file1 gen_file2 gen_file3 gen_file4 gen_file5 gen_file6 gen_file7 gen_file8 gen_file9 gen_file10

In [None]:
# Download the zip archive with all 1000 Samples (100 per file)

# NOTE: YOU MUST CLICK [OK] IN THE POP-UP DIALOG BOX that appears to permanently save your work
#       if you don't do this you will lose all your work!!!

files.download(zipfile_name)

# **END OF NOTEBOOK**

# Etcetera

If the notebook has errors (e.g. GPU Sync Fail), force-kill the Colaboratory virtual machine and restart it with the command below:

In [None]:
!kill -9 -1

# LICENSE

MIT License

Copyright (c) 2019 Max Woolf

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.