# Transformer Model (Using GPT-2 / 4.4)
Using Transfer Learning from GPT-2

https://github.com/minimaxir/gpt-2-simple

*   STEP 1 DOWNLOAD GPT-2 MODEL (TO BE FINETUNED)
*   STEP 2 FINETUNE OUR MODEL ON OUR TRAINING DATASET
*   STEP 3 GENERATE SOME SAMPLES!

In [None]:
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


## STEP0 // IMPORT OUR STUFF

In [None]:
# !pip install numpy
import pandas as pd
import numpy as np
import ast
# visualization
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from itertools import compress

import random
import sys
import io

# The gpt-2-simple version used here does not support TensorFlow 2.x,
# so select TensorFlow 1.x (Colab magic) before importing it.
%tensorflow_version 1.x
!pip install -q gpt-2-simple
import gpt_2_simple as gpt2
from datetime import datetime
from google.colab import files

TensorFlow 1.x selected.
  Building wheel for gpt-2-simple (setup.py) ... done
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



In [None]:
import tensorflow as tf

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Concatenate, Masking, Embedding, Dropout
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import GRU, LSTM, Bidirectional
from tensorflow.keras.layers import Conv1D, Activation, Multiply
from tensorflow.keras.losses import CategoricalCrossentropy, SparseCategoricalCrossentropy
from tensorflow.keras.callbacks import LambdaCallback, ModelCheckpoint
from tensorflow.keras.optimizers import RMSprop, Adam, Adamax
from tensorflow.keras import activations

from sklearn.model_selection import train_test_split
import nltk
from nltk.translate.bleu_score import sentence_bleu
from nltk.translate.bleu_score import SmoothingFunction

In [None]:
data = pd.read_csv('/content/drive/My Drive/CS230/finaldata.csv')
train = pd.read_csv('/content/drive/My Drive/CS230/finaldata_train.csv')
test = pd.read_csv('/content/drive/My Drive/CS230/finaldata_test.csv')

## STEP1 // DOWNLOAD GPT-2

In [None]:
gpt2.download_gpt2(model_name="124M")

Fetching checkpoint: 1.05Mit [00:00, 299Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 118Mit/s]                                                    
Fetching hparams.json: 1.05Mit [00:00, 282Mit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 498Mit [00:13, 36.0Mit/s]                                  
Fetching model.ckpt.index: 1.05Mit [00:00, 432Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 181Mit/s]                                                 
Fetching vocab.bpe: 1.05Mit [00:00, 160Mit/s]                                                       


In [None]:
# Write one overview per line; this plain-text file is the finetuning dataset.
with open('x.txt', 'w') as f:
    f.write('\n'.join(str(elem) for elem in train.overview) + '\n')
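One thing to watch when dumping a DataFrame column to text: missing values are turned into the literal string `nan` by `str()`, and an entry containing a newline ends up spread over several lines. The sketch below is one possible way to guard against both. The helper name `write_corpus`, the output file name, and the sample list are hypothetical stand-ins, not part of the original pipeline.

```python
import math
import os
import tempfile


def write_corpus(texts, path):
    """Write one text per line, skipping missing or blank entries.

    Returns the number of lines written.
    """
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            # pandas represents a missing string as float NaN; None is skipped too.
            if text is None or (isinstance(text, float) and math.isnan(text)):
                continue
            # Collapse internal whitespace so one entry never spans several lines.
            line = " ".join(str(text).split())
            if line:
                f.write(line + "\n")
                written += 1
    return written


# Hypothetical sample standing in for `train.overview`.
sample = ["A heist goes wrong.", float("nan"), "  ", "Two friends\nreunite."]
demo_path = os.path.join(tempfile.mkdtemp(), "x_clean.txt")
count = write_corpus(sample, demo_path)
```

In the notebook this could be called as `write_corpus(train.overview, 'x.txt')`, assuming the `overview` column holds strings or missing values.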

## STEP2 // FINETUNE THE MODEL ON OUR DATASET

Parameters for `gpt2.finetune`:


* **`restore_from`**: Set to `fresh` to start training from the base GPT-2 model, or to `latest` to resume training from an existing checkpoint.
* **`sample_every`**: Number of steps between printed example outputs.
* **`print_every`**: Number of steps between printed training-progress updates.
* **`learning_rate`**: Learning rate for training (default `1e-4`; can be lowered to `1e-5` if you have less than 1 MB of input data).
* **`run_name`**: Subfolder within `checkpoint` in which to save the model. Useful when working with multiple models (you will also need to specify `run_name` when loading the model).
* **`overwrite`**: Set to `True` to continue finetuning an existing model (with `restore_from='latest'`) without creating duplicate copies.
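To make the interplay between `restore_from` and `overwrite` concrete, here is a minimal sketch that builds the keyword arguments for a fresh run versus a resumed one. The helper `finetune_options` is hypothetical, and tying `overwrite` to resuming is an assumption drawn from the parameter descriptions above. The keys are existing `gpt2.finetune` parameters, and the values mirror the ones used in this notebook.

```python
def finetune_options(resume, run_name="run1"):
    """Build keyword arguments for `gpt2.finetune` for a fresh or resumed run."""
    return {
        "dataset": "x.txt",
        "model_name": "124M",
        "steps": 800,
        # 'fresh' starts from base GPT-2; 'latest' continues an existing run.
        "restore_from": "latest" if resume else "fresh",
        "run_name": run_name,
        "print_every": 10,
        "sample_every": 500,
        "save_every": 500,
        "learning_rate": 1e-4,
        # Reuse the existing checkpoint files only when continuing a run.
        "overwrite": resume,
    }


fresh = finetune_options(resume=False)
resumed = finetune_options(resume=True)

# In the notebook, either dictionary could be passed along as:
#   sess = gpt2.start_tf_sess()
#   gpt2.finetune(sess, **fresh)
```

Keeping the options in a dictionary makes it easy to log exactly which settings produced a given checkpoint.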

In [None]:
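# Start the TensorFlow session shared by finetuning and generation.
sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset='x.txt',        # one overview per line
              model_name='124M',      # the base model downloaded in STEP1
              steps=800,
              restore_from='latest',  # continue from this run's latest checkpoint
              run_name='run1',
              print_every=10,         # report progress every 10 steps
              sample_every=500,       # print a sample every 500 steps
              save_every=500,         # save a checkpoint every 500 steps
              overwrite=True
              )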

## STEP3 // GENERATING TEXT WITH OUR FINETUNED MODEL


Parameters for `gpt2.generate`:

* **`prefix`**: Forces the text to start with a given character sequence and generates text from there.
* **`length`**: Number of tokens to generate (default 1023, the maximum).
* **`temperature`**: The higher the temperature, the more random the text (default 0.7; recommended to keep between 0.7 and 1.0).
* **`top_k`**: Limits sampling to the *k* most likely tokens (default 0, which disables the behavior; if the generated output is too erratic, try `top_k=40`).
* **`top_p`**: Nucleus sampling: limits sampling to the most likely tokens whose cumulative probability reaches `top_p` (`top_p=0.9` tends to give good results).
* **`truncate`**: Truncates the generated text at a given sequence, excluding that sequence (e.g. if `truncate='<|endoftext|>'`, the returned text will include everything before the first `<|endoftext|>`). It may be useful to combine this with a smaller `length` if the input texts are short.
* **`include_prefix`**: If using `truncate` and `include_prefix=False`, the specified `prefix` will not be included in the returned text.
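The effect of `temperature`, `top_k`, and `top_p` is easier to see on a toy distribution. The sketch below is a generic, pure-Python illustration of the three ideas and is not gpt-2-simple's implementation; the function name `sampling_distribution` and the example logits are hypothetical. It returns the probabilities a sampler would draw from once the filters are applied.

```python
import math


def sampling_distribution(logits, temperature=0.7, top_k=0, top_p=0.0):
    """Return next-token probabilities after temperature, top-k and top-p."""
    # Temperature: divide logits before the softmax; lower values sharpen
    # the distribution, higher values flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    # top_k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        keep &= set(order[:top_k])

    # top_p: keep the smallest set of top tokens whose mass reaches top_p.
    if top_p > 0.0:
        nucleus, mass = set(), 0.0
        for i in order:
            nucleus.add(i)
            mass += probs[i]
            if mass >= top_p:
                break
        keep &= nucleus

    # Zero out the filtered tokens and renormalize the rest.
    kept_mass = sum(probs[i] for i in keep)
    return [probs[i] / kept_mass if i in keep else 0.0
            for i in range(len(probs))]


toy_logits = [2.0, 1.0, 0.0, -1.0]
sharp = sampling_distribution(toy_logits, temperature=0.5)
flat = sampling_distribution(toy_logits, temperature=1.0)
top2 = sampling_distribution(toy_logits, temperature=1.0, top_k=2)
nucleus_only = sampling_distribution(toy_logits, temperature=1.0, top_p=0.5)
```

Here `sharp` puts more probability on the first token than `flat`, `top2` assigns zero probability to the two least likely tokens, and `nucleus_only` keeps just the first token because it alone already exceeds half of the probability mass.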

In [None]:
gpt2.generate(sess,
              length=65,
              temperature=0.7,
              prefix="CS230 students met a year ago for",
              nsamples=5,
              batch_size=5
              )

CS230 students met a year ago for the first time in years. The staff, as well as the students themselves, have been blown away by the passion that these two disparate individuals express for one another. It is a bond that will last a lifetime. The only question is how long will it last?
Folks at a posh New York restaurant find
CS230 students met a year ago for one of the most intimate visits in their lives. The students, their families and the people who love them all took a trip to the new place.
When a mother and a father are kidnapped, a few lucky souls go to the rescue.
A young man is left in a car accident by his friend. He had
CS230 students met a year ago for the first time at the Texas A&M in Austin, Texas. There, they met the same students they had met before. It was a dream meeting. They had never met and never will. As they were the very perfect pair of girls.
The story of a three-year-old boy who is forced to
CS230 students met a year ago for the first time in their lives. 

In [None]:
gpt2.generate(sess,
              length=65,
              temperature=0.7,
              prefix="Professor Andrew goes on a mission to",
              nsamples=5,
              batch_size=5
              )

Professor Andrew goes on a mission to find the rightful heir to the throne.
A young, lonely college professor is put in the position of a special assistant to a mysterious stranger.
The story of the most important love story of all time.
A family saga of a Chinese immigrant and a Frenchman whose house is destroyed in a fire.
David, an
Professor Andrew goes on a mission to find the real killer who engineered the outbreak.
A girl named Rose becomes involved in the murder of the father of a young boy who was abducted by the Skulls and held captive for twenty-three years.
A fisherman in the late 19th century is forced to kill his wife and take a job as a maid to
Professor Andrew goes on a mission to help his estranged brother.
The story of a young Russian boy who is forced to live in a totalitarian dictatorship after witnessing a revolution in Russia, and his battle with psychiatric problems, and his struggle to survive in the country where he is forced for his freedom.
A semi-autobiograph

In [None]:
gpt2.generate(sess,
              length=65,
              temperature=0.7,
              prefix="A group of friends decides to go out for",
              nsamples=5,
              batch_size=5
              )

A group of friends decides to go out for a little fun and fight the giant monster.
A man returns from a vacation in the Bahamas to find the house he left in the middle of the sandstorm. He is greeted by Harold, who is still haunted by a memory of a woman who died in the island.
The events leading up to the Iraq war are
A group of friends decides to go out for a quick run in the beautiful country... and forget about the all important wedding!
A man is scheduled to appear in court for his alleged involvement in a plot to assassinate a prominent American businessman. Instead, the man is brought to the U.S. Attorney's office in Los Angeles for a routine arraignment. On his
A group of friends decides to go out for a weekend getaway and make a quick stop in the remote area of the town.
A group of boys in a small town discover a horrifying secret that changes their lives forever.
A young man is recruited by the US Army to be the special forces commander in Afghanistan, but has to take on a ve