<a href="https://colab.research.google.com/github/kamatsuoka/gpt-2/blob/master/gpt_2_playground.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text Generation Playground

# Background
In this notebook you can experiment with generating text using the medium (345M-parameter) version of [OpenAI's GPT-2 model](https://openai.com/blog/better-language-models/).

Briefly, GPT-2 is a kind of neural network called a [transformer](https://www.tensorflow.org/alpha/tutorials/text/transformer), trained on millions of web documents linked from Reddit posts with a score of at least 3. The model learns to predict the next word, given a sequence of words. By repeating the prediction process, feeding each predicted word back in as input, the model can generate full sentences and paragraphs. The results are often quite interesting.
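
The repeated-prediction idea can be sketched with a toy example. This is only an illustration, not how GPT-2 is implemented: `TOY_MODEL` and `generate` are hypothetical names, and the lookup table conditions on just the previous word, whereas GPT-2 conditions on the whole preceding sequence.

```python
import random

# Hypothetical toy "model": next-word probabilities given only the previous word.
TOY_MODEL = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.4, "ran": 0.6},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}

def generate(start, max_words=10, seed=None):
    """Extend `start` one word at a time by sampling from the toy model."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words):
        dist = TOY_MODEL.get(words[-1])
        if dist is None:  # unknown word: nothing to predict from
            break
        candidates = list(dist)
        weights = [dist[w] for w in candidates]
        # Sample the next word in proportion to its probability.
        next_word = rng.choices(candidates, weights=weights, k=1)[0]
        if next_word == "<end>":
            break
        # Feed the prediction back in as context for the next step.
        words.append(next_word)
    return " ".join(words)

print(generate("the", seed=0))
```

Each pass through the loop is one prediction step; the text grows because every sampled word becomes part of the input for the next step.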

## 1. Install Code and Data
Download the model data and install Python libraries. This will take a minute.


In [0]:
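import os
import sys

# Work from Colab's default directory and silence TensorFlow's C++ logging.
basedir = '/content'  # specific to Colaboratory
os.chdir(basedir)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# Fetch a fresh copy of the repository and make its src/ directory importable.
!rm -rf gpt-2
!git clone -q https://github.com/kamatsuoka/gpt-2/
os.chdir('gpt-2')
sys.path.append(os.path.join(os.getcwd(), 'src'))

# Install dependencies and download the 345M model data.
!pip3 --quiet install -r requirements-colab.txt
!python download_model.py 345M --quiet

# Restore the model so the sampling cell below can use it.
from src.conditional_samples import restore_model, generate_samples
sess, hparams, sequence_output, enc, placeholders = restore_model()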


INFO:tensorflow:Restoring parameters from models/345M/model.ckpt


## 2. Generate samples conditioned on starting text

Enter starting text and optionally change the numeric parameters below.<br>
Then run the cell to generate samples conditioned on the starting text.
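
To give a feel for what the `temperature` and `top_k` settings do, here is a minimal sketch of temperature scaling followed by top-k filtering over a list of scores. It is not the code this notebook runs: `sample_next` is a hypothetical helper, and two behaviors are assumptions of the sketch, namely treating `top_k = 0` as "no limit" and treating `temperature = 0` as "always pick the highest-scoring candidate".

```python
import math
import random

def sample_next(logits, temperature=0.9, top_k=40, rng=random):
    """Return the index of one candidate chosen from `logits` (raw scores)."""
    # Rank candidate indices from highest to lowest score.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    # Assumption of this sketch: temperature 0 means a greedy choice.
    if temperature <= 0:
        return order[0]
    # Keep only the top_k best candidates (0 = keep all, also an assumption).
    if top_k > 0:
        order = order[:top_k]
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [logits[i] / temperature for i in order]
    peak = max(scaled)
    # Softmax weights, shifted by the maximum for numerical stability.
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(order, weights=weights, k=1)[0]

scores = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate words
print(sample_next(scores, temperature=0.9, top_k=2))
```

With `top_k=2` only the two highest-scoring candidates can ever be chosen; lowering `temperature` makes the top candidate win more often.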

In [0]:
#@markdown ### Starting text:
starting_text = "" #@param {type:"string"}
#@markdown ### Samples to generate:
nsamples = 3 #@param {type:"slider", min:1, max:10, step:1}
#@markdown ### Number of words (tokens) per sample:
length = 200 #@param {type:"slider", min:10, max:500, step:1}
#@markdown ### Randomness in choosing the next word:
temperature = 0.9 #@param {type:"slider", min:0, max:1, step:0.01}
#@markdown ### Number of words to consider for next word:
top_k = 40 #@param {type:"slider", min:0, max:200, step:1}

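from IPython.display import display, HTML
from html import escape

def escape_sample(sample):
    # Drop everything after the end-of-text marker, then HTML-escape each line.
    return map(escape, sample.split('<|endoftext|>')[0].split("\n"))

def text_to_html(sample):
    # Show the starting text in italics, followed by the generated sample.
    display(HTML("<p><i>" + escape(starting_text) + "</i>" +
         "<br/>".join(escape_sample(sample)) + "</p><hr/>"))
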
styles = """
  p { font-size: 150%; margin-top: 1em;  margin-bottom: 1em; }
"""  

display(HTML("<style>" + styles + "</style><h1>Samples</h1>"))  
generate_samples(
        sess,
        hparams,
        sequence_output,
        enc,
        placeholders,
        print_fn=text_to_html,
        starting_text=starting_text,
        nsamples=nsamples,
        length=length,
        temperature=temperature,
        top_k=top_k)

