# Using GPT-2 to generate Text
This notebook shows how to use the recently published [GPT-2](https://github.com/openai/gpt-2) model to generate new text.

## Notebook Content
1. [Introduction](#1)
    1. [Basic Import](#11)
    2. [Download the model](#12)
    3. [Import the model](#13)
2. [Configuration](#2)
    1. [Path declaration](#21)
    2. [Load model and bpe](#22)
    3. [Define Input and get the Output](#23)
3. [Conclusion](#3)

<a id="1"></a>
## 1. Introduction
This notebook uses the publicly available 117M GPT-2 model to generate new text. The goal is to let you choose a prompt for the model to complete, and so get a quick impression of how 'good' AI already is at text comprehension.

<a id="11"></a>
### 1.1 Basic Import

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.

['model.ckpt.data-00000-of-00001', 'encoder.json', 'model.ckpt.meta', 'model.ckpt.index', 'vocab.bpe', 'hparams.json', 'checkpoint']


<a id="12"></a>
### 1.2 Download the model
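The model files themselves are already available in `../input` (see the directory listing above). What we install with pip is the `keras-gpt-2` package, which is used below to load them.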

In [2]:
!pip install keras-gpt-2

Collecting keras-gpt-2
  Downloading https://files.pythonhosted.org/packages/ce/5d/c540b090c3555a5c9c653c19ac1b633c63249d2c2bed515bddf4b2eca43a/keras-gpt-2-0.7.0.tar.gz
Collecting keras-embed-sim==0.3.0 (from keras-gpt-2)
  Downloading https://files.pythonhosted.org/packages/8e/16/b05954f9578ded225fd1bd56154ade949782c03b668a1fc424d5050e868a/keras-embed-sim-0.3.0.tar.gz
Collecting keras-pos-embd==0.8.0 (from keras-gpt-2)
  Downloading https://files.pythonhosted.org/packages/09/7b/fbb75fd0ab68f9728c1bff197a2cbc5a9a3874af27d44023f92fa32cb12c/keras-pos-embd-0.8.0.tar.gz
Collecting keras-layer-normalization==0.11.0 (from keras-gpt-2)
  Downloading https://files.pythonhosted.org/packages/c6/b0/c786d5a5e79d985281a06da0a1f3f559cf425921464e6b07b9f1cb093a5a/keras-layer-normalization-0.11.0.tar.gz
Collecting keras-transformer==0.19.0 (from keras-gpt-2)
  Downloading https://files.pythonhosted.org/packages/4d/61/4ffb5d3f8fc50f1dd33132af5869f3779052f3e18b0829cc95d4ad2dce7d/keras-transforme

<a id="13"></a>
### 1.3 Import the model

In [3]:
from keras_gpt_2 import load_trained_model_from_checkpoint, get_bpe_from_files, generate

Using TensorFlow backend.


<a id="2"></a>
## 2. Configuration

<a id="21"></a>
### 2.1 Path declaration
The paths below are set up for the Kaggle file system. If you download the notebook and the model to run them elsewhere, change `model_folder` so that it points to the directory containing the model files.

In [4]:
model_folder = "../input"
config_path = os.path.join(model_folder, 'hparams.json')
checkpoint_path = os.path.join(model_folder, 'model.ckpt')
encoder_path = os.path.join(model_folder, 'encoder.json')
vocab_path = os.path.join(model_folder, 'vocab.bpe')
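When running outside Kaggle, it can help to confirm that the model folder is complete before loading anything. The following is a minimal sketch, assuming the folder should contain exactly the file names printed by `os.listdir` earlier in this notebook; `REQUIRED_FILES` and `find_missing_files` are hypothetical helper names, not part of `keras-gpt-2`. Note that `model.ckpt` is a checkpoint prefix rather than a file name, so the check looks for the `model.ckpt.*` files instead.

```python
import os

# File names taken from the directory listing shown earlier in this notebook.
REQUIRED_FILES = [
    "hparams.json",
    "encoder.json",
    "vocab.bpe",
    "checkpoint",
    "model.ckpt.index",
    "model.ckpt.meta",
    "model.ckpt.data-00000-of-00001",
]


def find_missing_files(folder, required=REQUIRED_FILES):
    """Return the names in `required` that are not present in `folder`."""
    # A folder that does not exist is treated as an empty one.
    present = set(os.listdir(folder)) if os.path.isdir(folder) else set()
    return [name for name in required if name not in present]


# Report what is missing from the folder used in this notebook.
missing = find_missing_files("../input")
print("Missing files:", missing if missing else "none")
```

An empty result means every expected file was found. Anything else is the list of files to download or move before running the loading cells.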

<a id="22"></a>
### 2.2 Load model and bpe

In [5]:
print('Load model from checkpoint...')
model = load_trained_model_from_checkpoint(config_path, checkpoint_path)

Load model from checkpoint...
Instructions for updating:
Colocations handled automatically by placer.


In [6]:
print('Load BPE from files...')
bpe = get_bpe_from_files(encoder_path, vocab_path)

Load BPE from files...


<a id="23"></a>
### 2.3 Define Input and get the Output

In [7]:
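# Prompt(s) the model should continue; generate() takes a list of strings.
myInput = ['The meaning of life is']
print('Generate text...')
# length=20 asks for 20 generated tokens; top_k=2 limits each step to the two most likely next tokens.
output = generate(model, bpe, myInput, length=20, top_k=2)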

Generate text...


In [8]:
print(output)

['The meaning of life is not to be taken for granted. It is not to be taken for a given person. Life does']
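To give an intuition for the `top_k=2` argument in the call above, here is a small sketch of top-k sampling in general. It is not the implementation used inside `keras-gpt-2`. The function name `top_k_sample` and the toy scores are made up for illustration, and weighting the candidates with a softmax is an assumption about how such sampling is commonly done.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample an index from the k highest-scoring entries of `logits`."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest scores.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to those k candidates (shifted by the max for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one candidate in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]


# Toy scores for a 5-token vocabulary (made-up numbers).
toy_logits = [0.1, 2.0, -1.0, 1.5, 0.3]
rng = random.Random(0)
samples = {top_k_sample(toy_logits, k=2, rng=rng) for _ in range(100)}
print(samples)  # only indices 1 and 3 can ever appear
```

With `k=1` the sketch always picks the single best-scoring token. Larger values of `k` allow more variety in the generated text, at the cost of occasionally choosing less likely tokens.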


<a id="3"></a>
## 3. Conclusion
After trying different prompts, it is quite clear that the model is far from perfect. However, for most prompts the model seems to pick up the topic and to finish most sentences in a way that carries some meaning. The model that was not published for ethical reasons might already be considerably better than this one. The progress over the last decades is very impressive, and it seems only a matter of time until an AI writes the first bestselling novel.