
# Generative Dialogue System

Let's install the `transformers` library.

In [None]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.20.1-py3-none-any.whl (4.4 MB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.8.1-py3-none-any.whl (101 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.12.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
Installing collected packages: pyyaml, tokenizers, huggingface-hub, transformers
  Attempting uninstall: pyyaml
    Found existing installation: PyYAML 3.13
    Uninstal

Let's import the libraries.

In [None]:
import numpy as np
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    AutoModelForCausalLM,
)

We use DialoGPT ([paper](https://arxiv.org/pdf/1911.00536.pdf), [code](https://github.com/microsoft/DialoGPT)) as the response generator.

We use DialogRPT ([paper](https://arxiv.org/abs/2009.06978), [code](https://github.com/microsoft/DialogRPT)) as the response ranker.

Let's limit the dialogue history to the last 3 utterances.

Let's generate up to 3 hypotheses for each context.

In [None]:
GENERATIVE_MODEL = "microsoft/DialoGPT-medium"
RANKING_MODEL = "microsoft/DialogRPT-updown"

MAX_HISTORY_DEPTH = 3
N_HYPOTHESES_TO_GENERATE = 3

Load the tokenizers and the generative and ranking models.

Move both models to the GPU if CUDA is available.

In [None]:
tokenizer = AutoTokenizer.from_pretrained(GENERATIVE_MODEL)
model = AutoModelForCausalLM.from_pretrained(GENERATIVE_MODEL)

ranker_tokenizer = AutoTokenizer.from_pretrained(RANKING_MODEL)
ranker_model = AutoModelForSequenceClassification.from_pretrained(RANKING_MODEL)

if torch.cuda.is_available():
  model.to("cuda")
  ranker_model.to("cuda")


The following function generates one hypothesis (a candidate response) for a dialogue context.

In [None]:
def generate_response(context, model, tokenizer):
    # Encode the most recent utterances, each terminated by the EOS token.
    encoded_context = []
    for uttr in context[-MAX_HISTORY_DEPTH:]:
        encoded_context += [tokenizer.encode(uttr + tokenizer.eos_token, return_tensors="pt")]
    bot_input_ids = torch.cat(encoded_context, dim=-1)

    with torch.no_grad():
        if torch.cuda.is_available():
            bot_input_ids = bot_input_ids.to("cuda")
        chat_history_ids = model.generate(
            bot_input_ids,
            do_sample=True,  # sample tokens, so repeated calls give different responses
            max_length=100,  # total length in tokens, including the encoded context
            temperature=0.6,
            repetition_penalty=1.3,
            pad_token_id=tokenizer.eos_token_id,
        )
        if torch.cuda.is_available():
            chat_history_ids = chat_history_ids.cpu()
    # Decode only the newly generated tokens that follow the context.
    return tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1] :][0], skip_special_tokens=True)
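
To make the input format concrete, here is a minimal sketch of the text that `generate_response` effectively feeds to the model: the last `MAX_HISTORY_DEPTH` utterances, each followed by the end-of-sequence token. The helper name `build_prompt` is hypothetical, and the literal `<|endoftext|>` is assumed to equal `tokenizer.eos_token` for this tokenizer. The sketch works on strings only and does not load any model.

```python
MAX_HISTORY_DEPTH = 3
EOS_TOKEN = "<|endoftext|>"  # assumption: same string as tokenizer.eos_token


def build_prompt(context, max_depth=MAX_HISTORY_DEPTH, eos_token=EOS_TOKEN):
    """Join the most recent utterances, ending each one with the EOS token."""
    recent = context[-max_depth:]  # older utterances are dropped
    return "".join(uttr + eos_token for uttr in recent)


# Illustrative history with four turns; only the last three are kept.
history = ["a", "b", "c", "d"]
prompt = build_prompt(history)
print(prompt)
```

Because the model's reply is generated after the final EOS token, slicing the output at `bot_input_ids.shape[-1]` isolates the new response from the echoed context.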

Let's try to generate different hypotheses for the considered contexts.

In [None]:
contexts = [
    ["hello! how are you?", "hi! awesome! what about you?", "nice! chilling all day.."],
]
responses = []

for context in contexts:
    curr_responses = []
    for _ in range(N_HYPOTHESES_TO_GENERATE):
        response = generate_response(context, model, tokenizer)
        if len(response) > 3:
            curr_responses += [response]
        else:
            # replace too-short responses with an empty string
            curr_responses += [""]
    responses += [curr_responses]
    for resp in curr_responses:
        for uttr in context:
            print(f"---{uttr}")
        print(f"---{resp}\n\n")

---hello! how are you?
---hi! awesome! what about you?
---nice! chilling all day..
---awesome! i'm at work, but this subreddit is so much fun. :D


---hello! how are you?
---hi! awesome! what about you?
---nice! chilling all day..
---good! i'm trying to stay in shape for when the semester starts.


---hello! how are you?
---hi! awesome! what about you?
---nice! chilling all day..
---oh, cool. I'm going to be at work for the next couple of hours if that's ok with ya? :D
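
The three hypotheses differ because `generate_response` samples tokens (`do_sample=True`) rather than always picking the most likely one. The sketch below illustrates, under simplifying assumptions, how a temperature below 1 (such as the `temperature=0.6` used above) sharpens the sampling distribution: logits are divided by the temperature before the softmax. The function name and the example logits are made up for illustration; this is not the library's internal code.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
p_default = softmax_with_temperature(logits, temperature=1.0)
p_sharp = softmax_with_temperature(logits, temperature=0.6)

print("T=1.0:", [round(p, 3) for p in p_default])
print("T=0.6:", [round(p, 3) for p in p_sharp])
```

With the lower temperature, more probability mass moves to the top-scoring token, so samples become less varied but usually more coherent.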




The following function scores a response for a given context with the ranking model.

In [None]:
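def score(cxt: str, hyp: str):
    model_input = ranker_tokenizer.encode(cxt + "<|endoftext|>" + hyp, return_tensors="pt")
    # Keep the input on the same device as the ranker, which may have been moved to CUDA.
    model_input = model_input.to(ranker_model.device)
    with torch.no_grad():
        result = ranker_model(model_input, return_dict=True)
    # The sigmoid maps the logit to a score in (0, 1).
    return torch.sigmoid(result.logits)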

An example of evaluating a hypothesis for the given context.

In [None]:
print("Context:", " ".join(contexts[0]))
print("Response:", responses[0][0])
result = score(" ".join(contexts[0]), responses[0][0]).squeeze()
print("Score: ", result.item())

Context: hello! how are you? hi! awesome! what about you? nice! chilling all day..
Response: awesome! i'm at work, but this subreddit is so much fun. :D
Score:  0.3088275194168091
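
The score is the sigmoid of the ranker's output logit. As a small illustration in plain Python (the logit values below are made up, not taken from the model), the sigmoid squashes any real number into the interval (0, 1) and preserves order, so ranking hypotheses by score gives the same order as ranking them by raw logit.

```python
import math


def sigmoid(x):
    """Logistic function: maps a real-valued logit to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


example_logits = [-2.0, -0.8, 0.0, 1.5]  # hypothetical ranker outputs
example_scores = [sigmoid(x) for x in example_logits]

for logit, value in zip(example_logits, example_scores):
    print(f"logit={logit:+.1f} -> score={value:.3f}")
```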


Let's score the generated hypotheses for each context and print them in ascending order of score (lowest first).

In [None]:
for context, hypotheses in zip(contexts, responses):
    curr_scores = []
    for hyp in hypotheses:
        result = score(" ".join(context), hyp).squeeze()
        curr_scores += [result.item()]

    # np.argsort sorts in ascending order, so the best hypothesis is printed last
    for i in np.argsort(curr_scores):
        for uttr in context:
            print(f"---{uttr}")
        print(f"---{hypotheses[i]}\n\n")

---hello! how are you?
---hi! awesome! what about you?
---nice! chilling all day..
---good! i'm trying to stay in shape for when the semester starts.


---hello! how are you?
---hi! awesome! what about you?
---nice! chilling all day..
---oh, cool. I'm going to be at work for the next couple of hours if that's ok with ya? :D


---hello! how are you?
---hi! awesome! what about you?
---nice! chilling all day..
---awesome! i'm at work, but this subreddit is so much fun. :D
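
Since `np.argsort` sorts in ascending order, the loop above prints the lowest-scoring hypothesis first. If a best-first listing is preferred, a sketch like the following could be used instead. The helper name `rank_hypotheses` and the example replies and scores are hypothetical.

```python
def rank_hypotheses(hypotheses, scores):
    """Return (score, hypothesis) pairs sorted from highest to lowest score."""
    pairs = zip(scores, hypotheses)
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# Made-up replies and scores, standing in for generated text and ranker output.
example_hyps = ["reply A", "reply B", "reply C"]
example_scores = [0.31, 0.12, 0.25]

ranked = rank_hypotheses(example_hyps, example_scores)
for value, hyp in ranked:
    print(f"{value:.2f}  {hyp}")
```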




Finally, let's talk to our generative chatbot!

In [None]:
context = ["Hi there!"]
print("---", context[0])

while True:
    input_text = input()
    context += [input_text]
    context = context[-MAX_HISTORY_DEPTH:]
    curr_responses = []
    curr_scores = []
    for _ in range(N_HYPOTHESES_TO_GENERATE):
        response = generate_response(context, model, tokenizer)
        result = score(" ".join(context), response).squeeze()
        curr_scores += [result.item()]
        curr_responses += [response]

    # Reply with the highest-scoring hypothesis and keep it in the history,
    # so that the next turn sees both sides of the conversation.
    best_response = curr_responses[np.argmax(curr_scores)]
    print("---", best_response)
    context += [best_response]

--- Hi there!
hi! how are you?
--- I'm doing great! How about yourself? :D
i'm fine. I am working all day long!
--- good! i did the same today. i hope your weekend goes well :D
what movies do you like?
--- I love anything that is comedy or action. It's really boring if you're not into those types of things, though!
what is your favorite movie?
--- The Martian! It's a beautiful film with an amazing soundtrack and great actors.
do you have a sister?
--- no, but i am a female.


KeyboardInterrupt: ignored
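
The generate-then-rank step of the chat loop can also be factored into a function for a single turn, which makes the selection logic testable without loading the models. In this sketch, `chat_turn`, `generate_fn`, and `score_fn` are hypothetical names, and the stubs stand in for `generate_response` and `score`. The sketch keeps the chosen reply in the history so that the next turn sees both speakers, which is a design assumption.

```python
MAX_HISTORY_DEPTH = 3
N_HYPOTHESES_TO_GENERATE = 3


def chat_turn(context, user_text, generate_fn, score_fn):
    """Run one dialogue turn and return (reply, updated_context)."""
    # Add the user's utterance and keep only the most recent turns.
    context = (context + [user_text])[-MAX_HISTORY_DEPTH:]
    candidates = [generate_fn(context) for _ in range(N_HYPOTHESES_TO_GENERATE)]
    scores = [score_fn(" ".join(context), cand) for cand in candidates]
    best = candidates[scores.index(max(scores))]
    return best, context + [best]


# Stubs for illustration: canned replies, scored by length.
canned = iter(["ok", "sounds great, tell me more!", "hm"])


def fake_generate(context):
    return next(canned)


def fake_score(cxt, hyp):
    return len(hyp)


reply, new_context = chat_turn(["Hi there!"], "hi! how are you?", fake_generate, fake_score)
print("---", reply)
```

In the notebook, `generate_fn` could be `lambda ctx: generate_response(ctx, model, tokenizer)` and `score_fn` could be `lambda cxt, hyp: score(cxt, hyp).item()`.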