# Conversational Chatbot with GPT-2

The base code comes from [this blog post](https://dwjbosman.github.io/chatbot-using-open-ai-gpt-2-transformer-model).

## Downloading the GPT-2 model

In [None]:
# First, clone the GPT-2 code (nshepperd's fork)
!git clone https://github.com/nshepperd/gpt-2.git
# %cd changes the notebook's working directory for all later cells
%cd gpt-2

Cloning into 'gpt-2'...
remote: Enumerating objects: 435, done.
remote: Counting objects: 100% (64/64), done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 435 (delta 19), reused 48 (delta 13), pack-reused 371
Receiving objects: 100% (435/435), 4.48 MiB | 16.32 MiB/s, done.
Resolving deltas: 100% (220/220), done.


Install the requirements.

In [None]:
!pip3 install -r requirements.txt

Collecting fire>=0.1.3
  Downloading https://files.pythonhosted.org/packages/11/07/a119a1aa04d37bc819940d95ed7e135a7dcca1c098123a3764a6dcace9e7/fire-0.4.0.tar.gz (87kB)
Collecting regex==2017.4.5
  Downloading https://files.pythonhosted.org/packages/36/62/c0c0d762ffd4ffaf39f372eb8561b8d491a11ace5a7884610424a8b40f95/regex-2017.04.05.tar.gz (601kB)
Collecting requests==2.21.0
  Downloading https://files.pythonhosted.org/packages/7d/e3/20f3d364d6c8e5d2353c72a67778eb189176f08e873c9900e10c0287b84b/requests-2.21.0-py2.py3-none-any.whl (57kB)
Collecting tqdm==4.31.1
  Downloading https://files.pythonhosted.org/packages/6c/4b/c38b5144cf167c4f52288517436ccafefe9dc01b8d1c190e18a6b154cd4a/tqdm-4.31.1-py2.py3-none-any.whl (48kB)


Download the model.

In [None]:
# download the pretrained model
!python download_model.py 117M
# or 345M...

In [None]:
%tensorflow_version 1.x

import tensorflow
print(tensorflow.__version__)

TensorFlow 1.x selected.
1.15.2


In [None]:
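import os
import sys

# `!export` runs in a throwaway subshell and does not persist, so set the
# variable through os.environ; processes started later inherit it.
os.environ["PYTHONIOENCODING"] = "UTF-8"
sys.path.append("/content/gpt-2/src")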

In [None]:
import fire
import json
import os
import numpy as np
import tensorflow as tf
import model, sample, encoder
import generate_unconditional_samples
import interactive_conditional_samples

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
class GPT2:
  
  # extracted from the source code to generate some text based on a prior
  def __init__(
      self,
      model_name='117M',
      seed=None,
      nsamples=1,
      batch_size=1,
      length=None,
      temperature=1,
      top_k=0,
      raw_text="",
  ):
      """
      Interactively run the model
      :model_name=117M : String, which model to use
      :seed=None : Integer seed for random number generators, fix seed to reproduce
       results
      :nsamples=1 : Number of samples to return total
      :batch_size=1 : Batch size (only affects speed/memory).  Must divide nsamples.
      :length=None : Number of tokens in generated text, if None (default), is
       determined by model hyperparameters
      :temperature=1 : Float value controlling randomness in Boltzmann
       distribution. Lower temperature results in less random completions. As the
       temperature approaches zero, the model will become deterministic and
       repetitive. Higher temperature results in more random completions.
      :top_k=0 : Integer value controlling diversity. 1 means only 1 word is
       considered for each step (token), resulting in deterministic completions,
       while 40 means 40 words are considered at each step. 0 (default) is a
       special setting meaning no restrictions. 40 generally is a good value.
      """
      if batch_size is None:
          batch_size = 1
      assert nsamples % batch_size == 0

      self.nsamples = nsamples
      self.batch_size = batch_size

      my_model_path = "/content/gdrive/MyDrive/Colab Notebooks/NLP/project/"   # "/content/gpt-2/models/"
      
      self.enc = encoder.get_encoder(model_name, my_model_path)
      hparams = model.default_hparams()

      with open(os.path.join(my_model_path, model_name, 'hparams.json')) as f:
          hparams.override_from_dict(json.load(f))

      #with open(os.path.join('models', model_name, 'hparams.json')) as f:
      #    hparams.override_from_dict(json.load(f))

      if length is None:
          length = hparams.n_ctx // 2
      elif length > hparams.n_ctx:
          raise ValueError("Can't get samples longer than window size: %s" % hparams.n_ctx)

      self.sess = tf.Session(graph=tf.Graph())
      self.sess.__enter__()
      
      self.context = tf.placeholder(tf.int32, [batch_size, None])
      np.random.seed(seed)
      tf.set_random_seed(seed)
      self.output = sample.sample_sequence(
          hparams=hparams, length=length,
          context=self.context,
          batch_size=batch_size,
          temperature=temperature, top_k=top_k
      )

      saver = tf.train.Saver()
      self.ckpt = tf.train.latest_checkpoint(os.path.join(my_model_path, model_name))
      #self.ckpt = tf.train.latest_checkpoint(os.path.join('models', model_name))
      saver.restore(self.sess, self.ckpt)

  def close(self):
    self.sess.close()
  
  def generate_conditional(self, raw_text):
      """Return one generated continuation of raw_text, without the prompt."""
      context_tokens = self.enc.encode(raw_text)
      # Feed the same prompt to every row of the batch, then slice off the prompt tokens.
      out = self.sess.run(self.output, feed_dict={
          self.context: [context_tokens for _ in range(self.batch_size)]
      })[:, len(context_tokens):]
      # Only the first sample is returned, even when nsamples > 1.
      return self.enc.decode(out[0])
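
The `temperature` and `top_k` arguments described in the docstring can be illustrated on a toy logit vector. The following is a minimal pure-Python sketch of the idea, not the code in `sample.py`; the helper name `sample_next_token` and the example logits are assumptions made for illustration.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=0, rng=random):
    """Sample one token index from raw logits (illustrative helper, not part of gpt-2)."""
    # Lower temperature sharpens the distribution; temperature must be > 0 here.
    scaled = [x / temperature for x in logits]
    if top_k > 0:
        # Keep the k largest logits and mask out the rest (ties at the cutoff are kept).
        cutoff = sorted(scaled, reverse=True)[min(top_k, len(scaled)) - 1]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]
    # Softmax weights, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
greedy = sample_next_token(toy_logits, top_k=1)  # top_k=1 always picks the largest logit
sampled = sample_next_token(toy_logits, temperature=0.7, top_k=3,
                            rng=random.Random(0))  # one of the three best tokens
print(greedy, sampled)
```

With `top_k=1` the result is deterministic, which matches the docstring's description; with `top_k=0` no masking is applied and every token keeps a non-zero probability.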

In [None]:
gpt2 = GPT2(model_name="checkpoint81")

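# "checkpoint81" is a checkpoint directory under my_model_path on Google Drive.
# To use a stock model instead, first download it with download_model.py (see the
# earlier cell) and point my_model_path at "/content/gpt-2/models/".
# Available sizes: 1558M, 774M, 355M, 345M, 124M, and 117M.
# 1558M gives the best results but takes a long time to load.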

Instructions for updating:
Use `tf.cast` instead.
Instructions for updating:
Use `tf.random.categorical` instead.
INFO:tensorflow:Restoring parameters from /content/gdrive/MyDrive/Colab Notebooks/NLP/project/checkpoint81/model.ckpt


In [None]:
result = gpt2.generate_conditional(raw_text="what is two plus two ?")

print(result)


Not a soul.
<|endoftext|>
Not a soul.
I have, uh, His Majesty's orders regarding pipelines. I just not seen anything. I have a message for you, carpenters.
<|endoftext|>
I have, uh, His Majesty's orders regarding pipelines. I just not seen anything. I have a message for you, carpenters.
Oh, I don't know whether you're well connected or not, but I assure you they are. You do know how to use a pipe, don't you. I have some lessons I think you should learn.
<|endoftext|>
Merry Christmas everyone. I'm glad to see you again.
Sure don't look like you been travelling around very much.
<|endoftext|>
Umm, Mr. Justin's office. I was coming by to say that.
Nice to see you again, Mr. Justin.
<|endoftext|>
Nice to see you again, Mr. Justin.
You were very clear on this point quite a bit of talk.
<|endoftext|>
Something like that. I mean, are you strictly cutting both tracks and cutting the results of this test ?
You bet your buck.
<|endoftext|>
Say again?
Yes. I'm making the claim that Canada is usi
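
The raw sample above runs on past several `<|endoftext|>` markers. A minimal sketch of trimming a sample at the first marker and keeping the first non-empty line, assuming that line is the reply of interest; `first_reply` is a hypothetical helper, not part of the gpt-2 repository, and the sample string is a shortened copy of the output above.

```python
END_OF_TEXT = "<|endoftext|>"

def first_reply(generated, marker=END_OF_TEXT):
    """Return the first non-empty line that appears before the first marker."""
    # Everything after the first marker belongs to a new, unrelated document.
    head = generated.split(marker, 1)[0]
    for line in head.splitlines():
        line = line.strip()
        if line:
            return line
    return ""  # nothing was generated before the marker

sample_text = "\nNot a soul.\n<|endoftext|>\nNot a soul.\nI have a message for you, carpenters.\n"
print(first_reply(sample_text))  # Not a soul.
```

An empty return value signals that the model emitted the marker immediately, which a caller can treat as a reason to sample again.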

In [None]:
class Who:
  """Base class for the conversation parties: me and you"""
  def __init__(self):
    self.prefixes = []

  def matches(self,phrase):
    for prefix in self.prefixes:
      if phrase.startswith(prefix):
        #print(f"{phrase} starts with {prefix}")
        return True
      
    #print(f"{phrase} does not start with {self.prefixes}")
    return False

  def get_random_prefix(self):
    return self.prefixes[0]
  
class Me(Who):
  def __init__(self):
    super().__init__()
    self.prefixes = [""]
   
  
class You(Who):
  def __init__(self):
    super().__init__()
    self.prefixes = [""]
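
Both `Me` and `You` use an empty prefix, so `matches` returns `True` for every line. The sketch below shows how prefix-based speaker detection behaves once the prefixes differ. The prefix strings `I said: "` and `You said: "`, the `PREFIXES` table, and the `detect_party` helper are assumptions for illustration and are not used by the notebook.

```python
# Hypothetical prefix table: one list of accepted prefixes per party.
PREFIXES = {
    "me": ['I said: "'],
    "you": ['You said: "'],
}

def detect_party(line, prefixes=PREFIXES):
    """Return the name of the first party whose prefix starts the line, else None."""
    for party, options in prefixes.items():
        if any(line.startswith(prefix) for prefix in options):
            return party
    return None

print(detect_party('I said: "My name is Pete."'))     # me
print(detect_party('You said: "How old are you?"'))   # you
print(detect_party("Just put me back in the other way ."))  # None
```

With non-empty prefixes, a line without a recognized speaker returns `None` instead of silently matching the first party.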

In [None]:
class Conversation:
  
  def __init__(self, prior = None):
    if prior is None:
      prior="""
      You said: "Nice to meet you. What's your name?"
      I said: "My name is Pete."
      You said: "That's an interesting name. How old are you?"
      I said: "I'm 40 years old."
      You said: "Can you tell me something about yourself?"
      I said: "Of course! I like playing video games and eating cake."
      You said: "I like sweet stuff too. What are your plans for tomorrow?"
      """
    self.suggestion = None
    
    self.me = Me()
    self.you = You()
    self.parties  = [ self.me, self.you ]
    
    self.conversation = []
    
    lines = prior.split("\n")
    for line in lines:
      line = line.strip()
      if len(line)!=0:
        # First party whose prefix matches the line, or None if nobody matches.
        party = next((p for p in self.parties if p.matches(line)), None)
        if party is None:
          raise Exception(f"Unknown party: {line}")
                
        self.conversation.append((party,line))
    self.get_suggestion()
    
  
  def get_prior(self):
    conv = ""
    for (party, line) in self.conversation:
      conv+=line+"\n"
    return conv
  
  def get_suggestion(self):
    who, last_line = self.conversation[-1]

    party_index = self.parties.index(who)
    next_party = self.parties[(party_index+1) % len(self.parties)]
      
    conv = self.get_prior()
    conv += next_party.get_random_prefix()
    answer = self.get_answer(next_party, conv)

    if not next_party.matches(answer):
      prefix = next_party.get_random_prefix()
      answer = prefix + answer
    
    self.suggestion = (next_party, answer)
  
  def next(self, party = None, answer = ""):
    """Continue the conversation
    :param party: None -> use the party whose turn it is
    :param answer: "" -> use the suggestion; pass a text to override the
           suggestion
    """
    suggested_party, suggested_answer = self.suggestion
    if party is None:
      party = suggested_party
    
    if answer == "":
      answer = suggested_answer
      
    if not party.matches(answer):
      prefix = party.get_random_prefix()
      answer = prefix + answer
    
    answer = answer.strip()
    # endswith() also handles an empty answer, where answer[-1] would raise IndexError
    if not answer.endswith("\""):
      # add the closing "
      answer += "\""
      
    self.conversation.append((party, answer))    
    self.get_suggestion()
    
  def retry(self):
    self.get_suggestion()
        
  def get_answer(self, party, conv):
    answer = gpt2.generate_conditional(raw_text=conv)
    # Return the first non-empty line of the generated text, or "" if there is none.
    for line in answer.split("\n"):
      if line != "":
        return line
    return ""
      
  def show(self):
    """Return the suggested answer, regenerating it while it is only <|endoftext|>."""
    if self.suggestion is None:
      return None
    party, answer = self.suggestion
    while answer == "<|endoftext|>":
      self.retry()
      print("retrying...")
      # Re-read the suggestion after each retry so the fresh answer is the one checked.
      party, answer = self.suggestion
    return answer
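
`show()` keeps retrying for as long as the suggestion is exactly `<|endoftext|>`, with no upper bound on the number of attempts. A minimal sketch of a bounded retry follows, where `generate` stands in for any zero-argument callable that returns a candidate answer. The names `suggest_with_retries` and `max_retries` and the empty-string fallback are assumptions, not part of the notebook.

```python
def suggest_with_retries(generate, max_retries=5, rejected="<|endoftext|>", fallback=""):
    """Call generate() until it returns something other than `rejected`.

    Returns (answer, retries_used); gives up with `fallback` after max_retries retries.
    """
    for attempt in range(max_retries + 1):  # one initial try plus max_retries retries
        answer = generate()
        if answer != rejected:
            return answer, attempt
    return fallback, max_retries

# Stand-in generator that fails twice before producing a usable line.
canned = iter(["<|endoftext|>", "<|endoftext|>", "Say again?"])
answer, retries = suggest_with_retries(lambda: next(canned))
print(answer, retries)  # Say again? 2
```

Capping the retries keeps a long evaluation run from stalling on a prompt for which the model keeps emitting the end-of-text marker.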

## Generating answers for evaluation

In [None]:
import requests

MODEL_NAME = "checkpoint81"
models = ["ncm", "dbdc", "os", "twitter", "cornell"][:]
urls = ["https://raw.githubusercontent.com/chateval/application/master/assets/prompts/ncm.txt",
        "https://raw.githubusercontent.com/chateval/application/master/assets/prompts/dbdc.txt",
        "https://raw.githubusercontent.com/chateval/application/master/assets/prompts/os.txt",
        "https://raw.githubusercontent.com/chateval/application/master/assets/prompts/twitter.txt",
        "https://raw.githubusercontent.com/abaheti95/DC-NeuralConversation/master/MTurk%20Evaluation/MTurk2%20model%20responses/s_cornell_mturk2_test.txt"][:]


In [None]:
models

['ncm', 'dbdc', 'os', 'twitter', 'cornell']

In [None]:
for dataset, url in zip(models, urls):
  # Named `dataset` rather than `model` so the imported gpt-2 `model` module is not shadowed.
  print("Dataset", dataset)

  response = requests.get(url)
  response.raise_for_status()  # fail early if the prompt file could not be downloaded
  prompts = response.text.splitlines()

  answers = []
  for line in prompts:
    print("Q", line)
    # Each prompt starts a fresh conversation; show() returns the AI's suggested reply.
    c = Conversation(line)
    ans = c.show()
    print("A", ans)
    print()

    answers.append(ans)

  # One answer per line, in the same order as the prompts.
  with open(MODEL_NAME + "_" + dataset + ".txt", 'w', encoding="utf8") as filehandle:
      filehandle.writelines("%s\n" % answer for answer in answers)

Streaming output truncated to the last 5000 lines.
retrying...
retrying...
retrying...
retrying...
A Declining stars ?

Q Just put me back in the other way .
retrying...
retrying...
retrying...
retrying...
retrying...
A Well, here we are. I got together an old mole in my office, I want someone to complain to .

Q Uh , I' m sure all your spaces are filled for tonight .
A Yeah, but don't worry about me staying up. There are a thousand theories around here, but I think I've come up with a hundred. Let's just get started. I finished my design, how's it going?

Q Benita Matkudzi ... . . burned to death during a protest rally .
A And Cadwalader Izzard ... !

Q lnspector Yip , James from the UK .
A We made a good impression, didn't we?

Q All right .
A Does Kimura put him in prison for killing that policeman?

Q I' m dead .
retrying...
retrying...
A Then leave me alone  watch this.

Q Newport , ' 88 .
A Joey  I sensed something.

Q That' s just my style .
A Oh, it's never bad 'c