<a href="https://colab.research.google.com/github/ShaneZhong/train-gpt-2-model/blob/master/Train_the_GPT_2_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training the GPT-2 model from an input text

Setup:

1) Make sure the GPU is enabled: go to Edit -> Notebook settings and set Hardware accelerator to GPU.

2) Make a copy in your Google Drive: click "Copy to Drive" in the panel.
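To confirm that step 1 took effect, a quick check like the one below can help. It is a minimal sketch that assumes the `nvidia-smi` tool is on the PATH of a GPU runtime; the helper name `gpu_visible` is made up for this example.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    # `nvidia-smi -L` prints one line per GPU, each starting with "GPU <index>:".
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, revisit the hardware accelerator setting before starting a long training run.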

Note: Colab resets the runtime after 12 hours. Save your model checkpoints to Google Drive at around the 10-11 hour mark or earlier, then go to Runtime -> Reset all runtimes. After the reset, copy your trained model back into Colab and resume training from the previous checkpoint.

## Environment Setup



### Git clone and install dependencies

In [0]:
#@title Clone or pull train-gpt-2-model from Github

%cd /content
Mode = "clone" #@param ["clone", "pull"]

if Mode == "clone":
  !git clone https://github.com/ShaneZhong/train-gpt-2-model.git
else:
  %cd /content/train-gpt-2-model
  !git pull

!pip3 install -r /content/train-gpt-2-model/requirements.txt

In [0]:
cd /content/train-gpt-2-model

In [0]:
ls

### GDrive Mount

In [0]:
# Run this cell to mount your Google Drive.
from google.colab import drive
drive.mount('/content/drive')

In [0]:
# download the models
!python3 download_model.py 117M
!python3 download_model.py 345M

In [0]:
# Set the variable for this kernel and for the `!` commands it launches.
%env PYTHONIOENCODING=UTF-8

### Fetch previous model checkpoints in google drive (optional)

In [0]:
# check if you have any content in your GDrive directory
ls /content/drive/My\ Drive/Models/GPT-2/checkpoint/

In [0]:
# If the above is not empty, you can copy the previously
# saved model to your project directory.
!cp -r /content/drive/My\ Drive/Models/GPT-2/checkpoint/ /content/train-gpt-2-model/ 
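The "check, then copy" step above can also be done in one Python cell. This is a minimal sketch, not part of the repository: the helper name `restore_checkpoints` is hypothetical, and the paths are the ones used in the cells above.

```python
import shutil
from pathlib import Path


def restore_checkpoints(drive_dir: str, project_dir: str) -> bool:
    """Copy drive_dir into project_dir/checkpoint if drive_dir has content.

    Returns True when files were copied, False when there was nothing to copy.
    """
    src = Path(drive_dir)
    if not src.is_dir() or not any(src.iterdir()):
        return False
    dst = Path(project_dir) / "checkpoint"
    # dirs_exist_ok (Python 3.8+) merges into an existing checkpoint folder.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return True


copied = restore_checkpoints(
    "/content/drive/My Drive/Models/GPT-2/checkpoint",
    "/content/train-gpt-2-model",
)
print("Restored previous checkpoints:", copied)
```

On a first run, when nothing has been saved to Drive yet, the function returns `False` and leaves the project directory untouched.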

## Get the training dataset

Let's get some text to train on, in this case A Tale of Two Cities by Charles Dickens, from Project Gutenberg.

In [0]:
!wget https://www.gutenberg.org/files/98/98-0.txt
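Project Gutenberg files usually wrap the book in a license header and footer, which the model would otherwise learn along with the novel. The sketch below trims them, assuming the file marks the book body with lines starting `*** START OF` and `*** END OF`; check the downloaded file to confirm the exact wording. The helper name `strip_gutenberg_boilerplate` is made up for this example.

```python
import re


def strip_gutenberg_boilerplate(text: str) -> str:
    """Return the text between the '*** START OF' and '*** END OF' marker lines.

    If either marker is missing, the input is returned unchanged.
    """
    start = re.search(r"^\*\*\* ?START OF.*$", text, flags=re.MULTILINE)
    end = re.search(r"^\*\*\* ?END OF.*$", text, flags=re.MULTILINE)
    if start is None or end is None or end.start() < start.end():
        return text
    return text[start.end():end.start()].strip() + "\n"


# Tiny stand-in for the downloaded file, to show the behaviour.
sample = (
    "License header\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK A TALE OF TWO CITIES ***\n"
    "It was the best of times, it was the worst of times,\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK A TALE OF TWO CITIES ***\n"
    "License footer\n"
)
print(strip_gutenberg_boilerplate(sample))
```

To apply it to the real download, read `98-0.txt` with `encoding="utf-8"`, pass the contents through the function, and write the result to a new file to use as the dataset.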

In [0]:
ls /content/train-gpt-2-model/

### Alternative: Trump tweets
The Trump tweets dataset is already saved in the repository's `data` directory.

In [0]:
ls /content/train-gpt-2-model/data

trump_tweets.txt


## Training the model

Select either the 117M or 345M model to train. 

**IMPORTANT**: Training does not stop automatically after you run the cell below. To stop it, click the stop button. The saved model is written to the checkpoint directory (look for a log line such as `Saving checkpoint/run1/model-289`).
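Because training runs until you stop it, it helps to know which step was saved last. The sketch below scans a run directory for the highest step number, assuming checkpoint files are named `model-<step>` plus an optional suffix, as the `model-289` example suggests. The helper name `latest_checkpoint_step` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint_step(run_dir: str) -> Optional[int]:
    """Return the highest N among files named model-N or model-N.<suffix>."""
    pattern = re.compile(r"^model-(\d+)(\..*)?$")
    path = Path(run_dir)
    if not path.is_dir():
        return None
    steps = []
    for entry in path.iterdir():
        match = pattern.match(entry.name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


# Prints None until a checkpoint has been written to this run directory.
print(latest_checkpoint_step("/content/train-gpt-2-model/checkpoint/run1"))
```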


Many parameters of this model can be tuned. See [train.py](https://github.com/ShaneZhong/train-gpt-2-model/blob/master/train.py) for reference.



For `input_data`, you can choose either of these two files:
* '/content/train-gpt-2-model/data/trump_tweets.txt'
* '/content/train-gpt-2-model/98-0.txt'

In [0]:
#@title Train the model with the following parameters
input_data = '/content/train-gpt-2-model/data/trump_tweets.txt' #@param
model = "345M" #@param ["117M","345M"]
Samples_per_N_steps = 100 #@param
Folder_Name = 'Trump-tweets' #@param

!PYTHONPATH=src python3 /content/train-gpt-2-model/train.py --dataset $input_data --model_name $model --sample_every $Samples_per_N_steps --run_name $Folder_Name
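IPython substitutes `$variable` into the shell line as plain text, so a dataset path containing spaces would be split into several arguments. The sketch below shows one way to build the same command with shell-safe quoting. It uses only the four flags from the cell above; the helper name `build_train_command`, the example path, and running the script through `python3` are assumptions of this example.

```python
import shlex


def build_train_command(dataset, model_name, sample_every, run_name):
    """Return the train.py argument list for the flags used in the cell above."""
    return [
        "python3", "train.py",
        "--dataset", str(dataset),
        "--model_name", str(model_name),
        "--sample_every", str(sample_every),
        "--run_name", str(run_name),
    ]


cmd = build_train_command(
    "/content/drive/My Drive/my corpus.txt",  # hypothetical path with spaces
    "345M",
    100,
    "my-run",
)
# shlex.join (Python 3.8+) quotes each argument so the shell keeps it intact.
print(shlex.join(cmd))
```

The printed line can be pasted after `!PYTHONPATH=src` in a cell, with the repository as the working directory.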

### Save the trained model to GDrive
The cell below copies the `checkpoint` folder to `Models/GPT-2/` in your Google Drive.

In [0]:
!mkdir -p /content/drive/My\ Drive/Models/GPT-2/
!cp -r /content/train-gpt-2-model/checkpoint/ /content/drive/My\ Drive/Models/GPT-2/

## Apply the trained model

### Fetch the trained model
The trained model (117M or 345M) is copied into the corresponding `models` directory so that the sampling scripts load it.

In [0]:
#!cp -r /content/train-gpt-2-model/checkpoint/run1/* /content/train-gpt-2-model/models/117M/
!cp -r /content/train-gpt-2-model/checkpoint/Trump-tweets/* /content/train-gpt-2-model/models/345M/

In [0]:
# show the script's usage help
!python3 src/interactive_conditional_samples.py -- --help

### Conditional samples

In [0]:
#!python3 src/interactive_conditional_samples.py --model_name='117M' --nsamples=2 --top_k=40 --temperature=0.7
!python3 src/interactive_conditional_samples.py --model_name='345M' --nsamples=2 --top_k=40 --temperature=0.7
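The `--top_k` and `--temperature` flags shape how each next token is drawn. The following is an illustrative sketch of the general technique in plain Python, not the repository's TensorFlow code: logits are divided by the temperature, everything outside the `top_k` highest scores is discarded, and the rest is normalised with a softmax. The function names and the toy logits are made up for this example.

```python
import math
import random


def top_k_temperature_probs(logits, top_k, temperature):
    """Return sampling probabilities after temperature scaling and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Keep the top_k largest logits; in this sketch top_k <= 0 means "keep all".
    k = len(scaled) if top_k <= 0 else min(top_k, len(scaled))
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = set(order[:k])
    peak = max(scaled[i] for i in keep)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) if i in keep else 0.0 for i, s in enumerate(scaled)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_token(logits, top_k=40, temperature=0.7, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = top_k_temperature_probs(logits, top_k, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
print(top_k_temperature_probs(logits, top_k=2, temperature=0.7))
print(sample_token(logits, top_k=2, temperature=0.7))
```

A temperature below 1 sharpens the distribution toward the most likely tokens, while a smaller `top_k` removes unlikely tokens entirely, so lowering either value makes the samples more conservative.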

### Unconditional samples

In [0]:
!python3 src/generate_unconditional_samples.py --model_name='117M' --nsamples=2 --top_k=40 --temperature=0.7
!python3 src/generate_unconditional_samples.py --model_name='345M' --nsamples=2 --top_k=40 --temperature=0.7