In [1]:
# Downloads: run only the first time

###
# !pip install --user gpt-2-simple
# import gpt_2_simple as gpt2
# model_name = "124M"
# gpt2.download_gpt2(model_name=model_name)

In [2]:
# Imports: run every time
import warnings
warnings.filterwarnings(action="ignore")

import gpt_2_simple as gpt2
model_name="124M"
sess = gpt2.start_tf_sess()

import os
import re

In [3]:
# Corpus generation: run once

###
# !cat text/w_[amn]*.txt >> text/all_amn.txt

done = True

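def clean_corpus(filepath):
    """Strip corpus markup from the file at `filepath`, rewriting it in place."""

    with open(filepath, 'r') as infile:
        lines = infile.readlines()

    cleaned = []

    for line in lines:

        cl = re.sub(r"##\d+", '', line)                # "##" followed by digits
        cl = re.sub(r"@ @ @ @ @ @ @ @ @ @ ", '', cl)   # runs of ten "@ " placeholders
        cl = re.sub(r"[#@]", '', cl)                   # any "#" or "@" still left

        cleaned.append(cl)

    with open(filepath, 'w') as outfile:
        outfile.writelines(cleaned)
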
if not done:
    clean_corpus("text/all_amn.txt")
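A quick way to sanity-check those three substitutions before letting them rewrite the whole corpus file: run them on a single line. This is only a sketch; `clean_line`, `PLACEHOLDER`, and the sample sentence are made up for illustration and are not part of the notebook above.

```python
import re

# The ten-"@ " run that clean_corpus removes, kept in one place so the
# pattern and the sample below cannot drift apart.
PLACEHOLDER = "@ @ @ @ @ @ @ @ @ @ "

def clean_line(line):
    """Apply the same three substitutions as clean_corpus to one line."""
    cl = re.sub(r"##\d+", "", line)    # "##" followed by digits
    cl = re.sub(PLACEHOLDER, "", cl)   # the ten-"@ " placeholder run
    cl = re.sub(r"[#@]", "", cl)       # any "#" or "@" still left
    return cl

# Invented sample line, not taken from the real corpus.
sample = "##4000123 The report was " + PLACEHOLDER + "released on #Monday.\n"
print(repr(clean_line(sample)))
```

Note that the space after the `##...` marker survives, so cleaned lines can start with a leading space.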

In [4]:
# Finetuning, 1 of 3: run once, save, then restart
# Give the model an idea of the structure of a piece of writing.
done = True

if not done:
    gpt2.finetune(
        sess,
        "text/all_amn.txt",
        model_name=model_name,
        run_name="finetuning1",
        restore_from="fresh",
        steps=500,
        print_every=10,
    )

In [5]:
# Now let's restart and try to finetune on a new corpus (fingers crossed!)

In [5]:
# I duplicated the finetuning1 run into finetuning2 to allow me to use overwrite without losing my previous run!

# gpt2.load_gpt2(
#     sess,
#     model_name=model_name,
#     run_name="finetuning2",
# )
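For reference, the duplication step can also be done from Python instead of by hand. This is a minimal sketch, assuming each run lives in `checkpoint/<run_name>` (which matches the "Saving checkpoint/finetuning2/model-10" line in the training output); `duplicate_run` is a hypothetical helper, not something gpt-2-simple provides.

```python
import shutil
from pathlib import Path

def duplicate_run(src_run, dst_run, checkpoint_dir="checkpoint"):
    """Copy checkpoint_dir/src_run to checkpoint_dir/dst_run.

    Refuses to touch an existing destination, so an earlier run is
    never clobbered by accident.
    """
    src = Path(checkpoint_dir) / src_run
    dst = Path(checkpoint_dir) / dst_run
    if not src.is_dir():
        raise FileNotFoundError(f"no such run directory: {src}")
    if dst.exists():
        raise FileExistsError(f"refusing to overwrite existing run: {dst}")
    shutil.copytree(src, dst)
    return dst

# Example (only works once the finetuning1 run exists on disk):
# duplicate_run("finetuning1", "finetuning2")
```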

In [6]:
# Finetuning, 2 of 3: run once, save, then restart
# Show the model the essay to set the topic it should talk about.
done = False

if not done:
    gpt2.finetune(
        sess,
        "text/catcher.txt",
        model_name=model_name,
        run_name="finetuning2",
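        # "fresh" starts from the base 124M weights (see "Loading checkpoint" in the output);
        # restore_from="latest" would instead resume from the checkpoint in this run's folder.
        restore_from="fresh",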
        steps=10,
        print_every=5,
        overwrite=True,
    )

Loading checkpoint models/124M/model.ckpt
INFO:tensorflow:Restoring parameters from models/124M/model.ckpt


100%|██████████| 1/1 [00:00<00:00, 1051.20it/s]

Loading dataset...
dataset has 1868 tokens
Training...





[5 | 13.40] loss=1.91 avg=1.91
[10 | 19.78] loss=1.07 avg=1.49
Saving checkpoint/finetuning2/model-10


In [7]:
gpt2.generate(
    sess,
    model_name="124M",
    run_name="finetuning2",
    nsamples=5,
    length=200,
    top_k=0,
    temperature=1,
    prefix="""The final and arguably most accurate and important connection between Salinger’s life and the book he created is a similarity between Holden and Salinger. The two share a great interest, of which is a definite theme throughout the book, being the desire for the preservation of innocence. Throughout the book, Holden seems to be less grown up than most of the people around him. He seems to dislike many people his age or older, saying that they are all phonies. Kids on the other hand, he likes them a lot due to just how uncorrupted and innocent they are. When Holden is depressed, seeing the innocence of a kid generally makes him feel better, as seen when Holden is depressed before meeting up with Sally, but he sees a child singing “If a body catch a body coming through the rye.” and it made him feel better.""",
)

The final and arguably most accurate and important connection between Salinger’s life and the book he created is a similarity between Holden and Salinger. The two share a great interest, of which is a definite theme throughout the book, being the desire for the preservation of innocence. Throughout the book, Holden seems to be less grown up than most of the people around him. He seems to dislike many people his age or older, saying that they are all phonies. Kids on the other hand, he likes them a lot due to just how uncorrupted and innocent they are. When Holden is depressed, seeing the innocence of a kid generally makes him feel better, as seen when Holden is depressed before meeting up with Sally, but he sees a child singing “If a body catch a body coming through the rye.” and it made him feel better. It made me “feel less depressed” (p. 140) In the last line of the book, Salinger seems to be talking to himself about his life. He writes:”"A few years ago, I stopped by your house. Th

Hilarious results:

> "Did you know? According to legend, Holden married Amy Pascal after his mother passed away. [1] According to the movie, the couple had a little girl named Katherine, and they had a baby. A year after the baby was born, the baby's mother came home from work and saw a child singing “If a body catch a body coming through the rye.” “”
In this story, the baby became the child's mother, and the baby became the father"

In [7]:
# alright... I definitely overfit... but that's great!!!
# that's exactly what I wanted, because now I can just tone it back a bit.

In [8]:
# what I'm also curious about is what would happen if I overfit and then turn the temperature way up...
# let's try that now.
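
Before trying it, here is a small sketch of what "turning the temperature up" does to the sampling distribution. It uses the usual definition (divide the logits by the temperature, then softmax), which I am assuming is what the `temperature` argument of `gpt2.generate` applies. The logits are made-up numbers, not model output, and `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0, 0.5]  # invented scores for four candidate tokens

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Higher temperatures flatten the distribution, so unlikely tokens get picked more often; lower temperatures concentrate probability on the top token. With an overfit model, a higher temperature is one way to pull generations away from the memorized text.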

In [7]:
gpt2.generate(
    sess,
    model_name="124M",
    run_name="finetuning",
    nsamples=5,
    length=200,
    top_k=0,
    temperature=1,
    prefix="""The final and arguably most accurate and important connection between Salinger’s life and the book he created is a similarity between Holden and Salinger. The two share a great interest, of which is a definite theme throughout the book, being the desire for the preservation of innocence. Throughout the book, Holden seems to be less grown up than most of the people around him. He seems to dislike many people his age or older, saying that they are all phonies. Kids on the other hand, he likes them a lot due to just how uncorrupted and innocent they are. When Holden is depressed, seeing the innocence of a kid generally makes him feel better, as seen when Holden is depressed before meeting up with Sally, but he sees a child singing “If a body catch a body coming through the rye.” and it made him feel better.""",
    include_prefix=False,  # only has an effect together with truncate=...; the prefix still shows up below
)

The final and arguably most accurate and important connection between Salinger’s life and the book he created is a similarity between Holden and Salinger. The two share a great interest, of which is a definite theme throughout the book, being the desire for the preservation of innocence. Throughout the book, Holden seems to be less grown up than most of the people around him. He seems to dislike many people his age or older, saying that they are all phonies. Kids on the other hand, he likes them a lot due to just how uncorrupted and innocent they are. When Holden is depressed, seeing the innocence of a kid generally makes him feel better, as seen when Holden is depressed before meeting up with Sally, but he sees a child singing “If a body catch a body coming through the rye.” and it made him feel better. It made me feel not so depressed anymore.” (p. 115) As well, the book even derives its title from a single passage in the book where Holden is talking to Phoebe about what he wishes he

In [13]:
# Okay... totally overfit now. Let's go back.