<a href="https://colab.research.google.com/github/rahiakela/transfer-learning-for-natural-language-processing/blob/main/6-text-generation-with-gpt-2-and-gpt-3-models/text_completion_with_gpt2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Text completion with GPT-2

This notebook will clone the OpenAI GPT-2 repository, download the 345M parameter GPT-2 transformer model, and interact with it. We will enter context sentences and analyze the text generated by the transformer. The goal is to see how it creates new content.

We must activate the GPU to run our GPT-2 345M parameter transformer model.
OpenAI has begun to offer transformers as a cloud service. However, we will see that a small model can run on a standard machine, without going through a cloud service built around a powerful transformer such as GPT-3.

We will use the GPT-2 model but will not train it. We could not train large GPT models even if we had access to the source code, because most of us lack the computing power to do it: the more recent transformer models require petaflops!

An average developer does not have access to this level of machine power. Cloud providers such as Google Cloud, Microsoft Azure, and Amazon Web Services (AWS) rent out a certain level of machine resources, measured in teraflops, to their customers.

Going a step further, it becomes tougher to train transformers with teraflops alone, and access to petaflops of computing power is limited to a restricted number of teams in the world.

However, we will see that the results produced by the limited power of Google
Colaboratory VMs for our 345M parameter GPT-2 are quite convincing.

## Step 1: Cloning the OpenAI GPT-2 repository

OpenAI is still letting us download GPT-2. This may be discontinued in the future, or maybe we will have access to more resources. At this point, the evolution of transformers and their usage moves so fast that nobody can foresee how the market will evolve, not even the major research labs themselves.

We will clone OpenAI's GitHub repository on our VM.

In [1]:
!git clone https://github.com/openai/gpt-2.git

Cloning into 'gpt-2'...
remote: Enumerating objects: 233, done.
remote: Total 233 (delta 0), reused 0 (delta 0), pack-reused 233
Receiving objects: 100% (233/233), 4.38 MiB | 6.59 MiB/s, done.
Resolving deltas: 100% (124/124), done.


You can see that the clone does not include the Python training files we need. We will install them when we train the GPT-2 model in the "Training a GPT-2 language model" section.
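To check what the clone actually contains, a listing of the Python files can help. The following is a minimal sketch, assuming the repository was cloned to `/content/gpt-2` as on a Colab VM; the helper name `python_files` is ours and is not part of the repository.

```python
import os


def python_files(directory):
    """Return the sorted names of the .py files directly inside `directory`."""
    return sorted(name for name in os.listdir(directory) if name.endswith(".py"))


# Assumed location of the clone on a Colab VM; adjust if you cloned elsewhere.
repo_src = "/content/gpt-2/src"
if os.path.isdir(repo_src):
    print(python_files(repo_src))
```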

## Step 2: Installing the requirements

The requirements will be installed automatically:

In [None]:
import os  # must be imported again whenever the VM restarts

os.chdir("/content/gpt-2")
!pip3 install -r requirements.txt

The requirements for this notebook are:

- `Fire 0.1.3` to generate command-line interfaces (CLIs)
- `regex 2017.4.5` for regex usage
- `Requests 2.21.0`, an HTTP library
- `tqdm 4.31.1` to display a progress meter for loops

You may be asked to restart the notebook.

Do not restart it now. Let's wait until we check the version of TensorFlow.

## Step 3: Checking the version of TensorFlow

The GPT-2 345M transformer model provided by OpenAI uses
TensorFlow 1.x. By 2020, GPT models had reached 175 billion
parameters, making it impossible for us to train them without having access to a supercomputer.

The corporate giants' research labs, such as Facebook AI, OpenAI, and Google
Research/Brain, are speeding towards super-transformers and are leaving us what they can so that we can learn and understand. They do not have time to go back and update all of the models they share.

This is one of the reasons why Google Colaboratory VMs have preinstalled
versions of both TensorFlow 1.x and TensorFlow 2.x.

We will be using TensorFlow 1.x in this notebook:

In [1]:
#Colab has tf 1.x and tf 2.x installed
#Restart runtime using 'Runtime' -> 'Restart runtime...'
%tensorflow_version 1.x
import tensorflow as tf
print(tf.__version__)

TensorFlow 1.x selected.
1.15.2


Whether or not TensorFlow 1.x is displayed, rerun the cell, restart the VM, and then rerun the cell once more to make sure before continuing.

If you encounter a TensorFlow error during the process (ignore the warnings), rerun this cell, restart the VM, and rerun the cell to make sure.

Do this every time you restart the VM. The default version of the VM is TensorFlow 2.x.
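Since the check has to be repeated after every restart, it can be turned into a guard that fails loudly. This is a minimal sketch; `require_tf1` is a hypothetical helper of ours, not a TensorFlow or Colab function, and it only inspects a version string.

```python
def require_tf1(version):
    """Raise a RuntimeError unless `version` is a TensorFlow 1.x version string."""
    major = version.split(".")[0]
    if major != "1":
        raise RuntimeError(
            "TensorFlow 1.x is required, found " + version
            + ". Rerun the %tensorflow_version cell and restart the runtime."
        )
    return version


# In the notebook, after `import tensorflow as tf`, the call would be:
#     require_tf1(tf.__version__)
print(require_tf1("1.15.2"))
```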

## Step 4: Downloading the 345M parameter GPT-2 model

We will now download a trained 345M parameter GPT-2 model:

In [3]:
import os
# Downloading the 345M parameter GPT-2 Model
os.chdir("/content/gpt-2")
!python3 download_model.py "345M"

Fetching checkpoint: 1.00kit [00:00, 726kit/s]                                                      
Fetching encoder.json: 1.04Mit [00:00, 1.73Mit/s]                                                   
Fetching hparams.json: 1.00kit [00:00, 700kit/s]                                                    
Fetching model.ckpt.data-00000-of-00001: 1.42Git [10:59, 2.15Mit/s]                                 
Fetching model.ckpt.index: 11.0kit [00:00, 7.38Mit/s]                                               
Fetching model.ckpt.meta: 927kit [00:00, 1.50Mit/s]                                                 
Fetching vocab.bpe: 457kit [00:00, 926kit/s]                                                        


The downloaded files contain the information we need to run the model.
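Because the download takes several minutes, it can be worth confirming that nothing is missing before going further. The sketch below uses the file names from the download log; the helper `missing_model_files` is ours, and the path assumes the files were stored under `models/345M` inside the cloned repository.

```python
import os

# File names taken from the download log above.
EXPECTED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(model_dir):
    """Return the expected file names that are not present in `model_dir`."""
    return [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


# Assumed location on a Colab VM; an empty list means the download is complete.
print(missing_model_files("/content/gpt-2/models/345M"))
```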

The `hparams.json` file contains the definition of the GPT-2 model:

- `"n_vocab": 50257`, the size of the vocabulary of the model
- `"n_ctx": 1024`, the context size
- `"n_embd": 1024`, the embedding size
- `"n_head": 16`, the number of heads
- `"n_layer": 24`, the number of layers
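These five numbers are enough for a back-of-the-envelope check of the model's size. The sketch below parses the values listed above and estimates the weight count, assuming the usual GPT-2 block layout (about 4·d² weights for attention and 8·d² for the feed-forward block, with d = `n_embd`) and ignoring biases and layer normalization. The function names are ours.

```python
import json

# The values listed above, written as JSON text like that found in hparams.json.
hparams_text = (
    '{"n_vocab": 50257, "n_ctx": 1024, "n_embd": 1024, '
    '"n_head": 16, "n_layer": 24}'
)
hparams = json.loads(hparams_text)


def head_size(hp):
    """Dimension handled by each attention head."""
    return hp["n_embd"] // hp["n_head"]


def estimate_parameters(hp):
    """Rough weight count: token and position embeddings plus 12 * d^2 per layer."""
    embeddings = (hp["n_vocab"] + hp["n_ctx"]) * hp["n_embd"]
    per_layer = 12 * hp["n_embd"] ** 2  # assumed: 4*d^2 attention + 8*d^2 feed-forward
    return embeddings + hp["n_layer"] * per_layer


print(head_size(hparams))            # 64
print(estimate_parameters(hparams))  # 354501632
```

The estimate lands at roughly 354 million weights, in the same range as the 345M in the model's name.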

`encoder.json` and `vocab.bpe` contain the tokenized vocabulary and the BPE merge pairs.
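To give an idea of how a list of merge pairs turns a word into tokens, here is a toy sketch of the byte pair encoding (BPE) merge loop. The three merge rules are invented for illustration and do not come from `vocab.bpe`, and the function `apply_bpe` is ours; the real encoder works on byte-level symbols and a far longer merge list.

```python
def apply_bpe(word, merges):
    """Repeatedly merge the adjacent pair that appears earliest in `merges`."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda pair: ranks.get(pair, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge rules, highest priority first.
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(apply_bpe("lower", toy_merges))  # ['low', 'er']
```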

The `checkpoint` file is a small text file that records which saved checkpoint is the most recent one.

The checkpoint itself is made up of three other important files saved alongside it:

- `model.ckpt.meta` describes the graph structure of the model. It contains `GraphDef`, `SaverDef`, and so on. We can retrieve the information with `tf.train.import_meta_graph([path]+'model.ckpt.meta')`.
- `model.ckpt.index` is a string table. Each key is the name of a tensor, and its value is a `BundleEntryProto`, which contains the metadata of that tensor.
- `model.ckpt.data` contains the values of all of the variables in a `TensorBundle` collection.

We have downloaded our model. We will now go through some intermediate steps
before activating the model.

## Step 5: Intermediate instructions

We want to print UTF-8 encoded text to the console when we are interacting with the model:

In [4]:
# Printing UTF-8 encoded text to the console.
# `!export` would only affect a temporary subshell, so we use the `%env` magic instead.
%env PYTHONIOENCODING=UTF-8

We want to make sure we are in the src directory:

In [5]:
os.chdir("/content/gpt-2/src")

We are ready to interact with the GPT-2 model. First, however, we will go through the main aspects of the code.

`interactive_conditional_samples.py` first imports the modules required to interact with the model:

In [6]:
import json
import os
import numpy as np
import tensorflow as tf

We have gone through the intermediate steps leading to the activation of the model.

## Step 6: Importing and defining the model

We will now activate the interaction with the model with `interactive_conditional_samples.py`.

We need to import three modules that are also in `/content/gpt-2/src`:

In [7]:
import model, sample, encoder

- `model.py` defines the model's structure: the hyperparameters, the multi-head attention `tf.matmul` operations, the activation functions, and all of the other properties.
- `sample.py` processes the interaction and controls the sample that will be
generated. Its sampling options steer the model away from unlikely, low-quality tokens.

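Softmax values can sometimes be blurry, like seeing an image with low
definition. `sample.py` takes a parameter named `temperature` that divides
the logits before the softmax is applied. A temperature below 1 makes the values
sharper, increasing the higher probabilities and lowering the
lower ones, while a temperature above 1 flattens the distribution.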

`sample.py` can activate top-k sampling. Top-k sampling ranks the probability
distribution of the predicted next token and keeps only the `k` tokens with the
highest probabilities (a value of 0 disables the filter). The tail containing the lower probabilities is excluded, preventing the model from predicting low-quality tokens.

`sample.py` can also activate top-p (nucleus) sampling for language modeling. Top-p
sampling does not keep a fixed number of tokens. It selects the tokens with the
highest probabilities until the sum of this subset's probabilities, the nucleus, exceeds `p`.
- `encoder.py` encodes the sample sequence with the defined model, `encoder.json`, and `vocab.bpe`. It contains both a BPE encoder and a text decoder.
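The three sampling controls handled by `sample.py` (temperature, top-k, and top-p) can be illustrated on a tiny example. The following is a NumPy sketch of the general techniques, not the code from `sample.py`, which works on batched TensorFlow tensors. The four logits are hypothetical and the function names are ours.

```python
import numpy as np


def softmax(logits):
    """Turn logits into probabilities."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def top_k_filter(logits, k):
    """Keep the k largest logits and mask the rest; k == 0 disables the filter."""
    if k == 0:
        return logits
    kth_largest = np.sort(logits)[-k]
    return np.where(logits < kth_largest, -np.inf, logits)


def top_p_filter(logits, p):
    """Keep the most likely tokens until their cumulative probability reaches p."""
    order = np.argsort(logits)[::-1]  # token indices, most likely first
    cumulative = np.cumsum(softmax(logits)[order])
    keep = min(int(np.searchsorted(cumulative, p)) + 1, len(logits))
    filtered = np.full(len(logits), -np.inf)
    filtered[order[:keep]] = logits[order[:keep]]
    return filtered


logits = np.array([2.0, 1.0, 0.5, -1.0])  # hypothetical scores for four tokens

sharp = softmax(logits / 0.5)  # temperature below 1 sharpens the distribution
flat = softmax(logits / 2.0)   # temperature above 1 flattens it
print(sharp.max() > softmax(logits).max() > flat.max())  # True

print(softmax(top_k_filter(logits, 2)))    # only two tokens keep a nonzero probability
print(softmax(top_p_filter(logits, 0.8)))  # the nucleus here also holds two tokens
```

Masked logits are set to negative infinity so that the softmax assigns them a probability of exactly zero, and the remaining probabilities are renormalized.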

## Step 7: Interacting with GPT-2

To interact with the model, run the interact_model cell:

In [10]:
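from interactive_conditional_samples import interact_model

# Import the function by name (a bare module import leads to a NameError) and pass keyword arguments
interact_model(model_name="345M", seed=None, nsamples=1, batch_size=1,
               length=300, temperature=1, top_k=0, models_dir="/content/gpt-2/models")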

You will be prompted to enter some context.

You can try any type of context you wish since this is a standard GPT-2 model.

We can try a sentence written by Immanuel Kant:

```
Human reason, in one sphere of its cognition, is called upon to
consider questions, which it cannot decline, as they are presented by
its own nature, but which it cannot answer, as they transcend every
faculty of the mind.
```

You can press Ctrl + M to stop generating text, but doing so may convert the code cell into a text cell, in which case you will have to copy the code back into a code cell.

The output is very rich, and we can observe several facts:
- The context we entered conditioned the output generated by the model.
- The context was a demonstration for the model. It learned what to say from
the context without modifying its parameters.
- Text completion is conditioned by context. This opens the door to
transformer models that do not require fine-tuning.
- From a semantic perspective, the output could be more interesting.
- From a grammatical perspective, the output is convincing.


