# Basic dialogue systems

1. Language Understanding
  - Domain identification
  - Intent detection
  - Argument parsing

2. Dialogue Management
  - State tracking
  - Policy
  - Communication with backend units

## Intent detection
- We use semantic information extracted from a pretrained language model.
  - In this simple example we use an English BERT model.
- Fine-tune BERT on the downstream task (intent classification).
- Handle uncertain inputs: find a confidence threshold below which a user input is treated as out-of-domain.

Pros:
- Only a small number of examples is needed.
- Simple and stable; negative examples are not always needed.

Cons:
- A larger model, or fine-tuning on a larger dataset, could give better results.

Additionally:
- Yes and No answers should be recognized, as well as some topic-change indicators
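
The out-of-domain threshold mentioned above can be sketched as a check on the softmax confidence of the classifier output. This is a minimal illustration, not the notebook's implementation: the function names `softmax` and `detect_intent` and the default threshold of 0.5 are assumptions, and a suitable threshold would have to be tuned on held-out data.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw classifier scores into probabilities."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detect_intent(logits: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Return the index of the most likely intent, or None if the model is unsure."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Below the (assumed) threshold the input is treated as out-of-domain.
    return best if probs[best] >= threshold else None


print(detect_intent([4.0, 0.5, 0.2]))    # one class clearly dominates
print(detect_intent([0.3, 0.2, 0.25]))   # nearly flat distribution
```

Returning `None` lets the dialogue manager fall back to a clarification prompt instead of following a low-confidence transition.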


## Dialogue State Tracking

- Graph-based: the dialogue is modelled as a graph of states (nodes).
- Entering a node triggers an action from the chatbot (a system turn).
- Each node has a list of successor nodes, reachable via different actions that mostly depend on the user input.
- The system stores entity-value pairs, which we refer to as parameters or slots. Their values are filled in by the user or by the backend systems.
- Each action has a fixed list of required slots (for example, a password or a reservation date).
- We will not implement complex states. To work around the limitation of single-action states, some states require no user input and step forward in the conversation automatically. These states are labelled as automatic.
- We will not implement dialogue flow estimation either, although more complex systems typically use it.
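
The graph idea, including automatic states, can be sketched without any language model by scripting the user actions. The names `Node` and `walk` and the example graph are hypothetical and only illustrate the mechanism; the classes used in this notebook follow further below.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    successors: Dict[str, str] = field(default_factory=dict)  # action -> next node name
    auto_action: Optional[str] = None  # set only for automatic nodes


def walk(graph: Dict[str, Node], start: str, user_actions: List[str]) -> List[str]:
    """Follow the graph for a scripted list of user actions; return visited node names."""
    actions = iter(user_actions)
    current = graph[start]
    visited = [current.name]
    while current.successors:  # nodes without successors are terminal
        if current.auto_action is not None:
            action = current.auto_action  # automatic node: no user input needed
        else:
            action = next(actions, None)
            if action is None:
                break  # scripted input is exhausted
        target = current.successors.get(action)
        if target is None:
            if current.auto_action is not None:
                raise ValueError(f"automatic node {current.name!r} has no valid transition")
            continue  # unknown action: stay in the node and wait for the next input
        current = graph[target]
        visited.append(current.name)
    return visited


graph = {
    "start": Node("start", {"buy": "cart"}),
    "cart": Node("cart", {"next": "confirm"}, auto_action="next"),
    "confirm": Node("confirm", {"yes": "done", "no": "start"}),
    "done": Node("done"),
}
visited = walk(graph, "start", ["hello", "buy", "yes"])
print(visited)
```

The unknown action `"hello"` leaves the walker in `start`, and the automatic `cart` node advances without consuming any user input.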

Data and schematic model

https://docs.google.com/spreadsheets/d/1tD_up9OyqFE9h2A_ZruT2JGbIYp1pAuj8mQUr2XIzDw/edit?usp=sharing

https://docs.google.com/presentation/d/1B-TKmZ8RFJhk9xQZt2lB6uROH0btOMMWjOoIg-3BgmY/edit?usp=sharing


In [None]:
!pip install transformers
!pip install evaluate
!pip install datasets
!pip install spacy==3.4.*
!pip install accelerate
!pip install cython



In [None]:
from enum import IntEnum

INTENT = IntEnum("Intent",["Yes","No","BuyFruit","BuyCereal","Pickup","Delivery","Juice"])
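
The functional `IntEnum` API numbers its members from 1, whereas classifier labels are 0-based, so the code later subtracts 1 when preparing training labels and adds 1 when decoding a prediction. The two helpers below are hypothetical names that make this conversion explicit; the enum definition is repeated only to keep the sketch self-contained.

```python
from enum import IntEnum

INTENT = IntEnum("Intent", ["Yes", "No", "BuyFruit", "BuyCereal", "Pickup", "Delivery", "Juice"])


def intent_to_label(name: str) -> int:
    """Map an intent name to the 0-based class label used by the model."""
    return INTENT[name] - 1


def label_to_intent(label: int) -> INTENT:
    """Map a 0-based class label back to the corresponding INTENT member."""
    return INTENT(label + 1)


for member in INTENT:
    print(member.name, member.value, intent_to_label(member.name))
```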

In [None]:
def user():
  return input("User: ")

def aiprint(*args, **kwargs):
  print("AI: ",end="")
  print(*args, **kwargs)

In [None]:
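class DialState:
  """A node of the dialogue graph."""
  def __init__(self, name, req_slots, successors, is_automatic, entry_action):
    self.name = name
    self.req_slots = req_slots          # slots that must be filled before entering
    self.successors = successors        # maps an action (intent) to the next state
    self.is_automatic = is_automatic    # True if the state needs no user input
    self.entry_action = entry_action    # callback executed when the node is entered
    self.auto_action = None             # action chosen by the entry callback, if any
    self.response = None                # raw user response collected by the callback

  def get_missing_slots(self):
    """Return the required slots that have no value yet."""
    return [s for s in self.req_slots if s.value is None]

  def propagate_by_action(self, action):
    """Return the successor reached via `action`, or None if there is none."""
    return self.successors.get(action)

  def enter_node(self):
    """Run the entry action; return (next state or None, user response or None)."""
    self.auto_action = None
    self.response = None
    self.entry_action(self)
    return self.propagate_by_action(self.auto_action), self.response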
    
    


In [None]:
class Slot:
  def __init__(self, name, slot_prompt, ent_types, value=None):
    self.name = name
    self.value = value
    self.slot_prompt = slot_prompt
    self.ent_types = ent_types

  def reset(self):
    self.value = None

  def get_slot_information(self, nlp):
    while True:
      aiprint(self.slot_prompt)
      response = user()
      doc = nlp(response)
      possible_ents = []
      for e in doc.ents:
        if e.label_ in self.ent_types:
          possible_ents.append(e.text)
      
      if len(possible_ents)!=1:
        aiprint("Sorry, I couldn't understand properly, please repeat in other words!")
      else:
        self.value = possible_ents[0]
        break
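
The acceptance rule in `get_slot_information` (fill the slot only when exactly one recognized entity has an accepted type) can be isolated from spaCy and tested on plain `(text, label)` pairs. The function name `pick_slot_value` is hypothetical. Note that the accepted labels should be a real tuple such as `("GPE",)`: without the trailing comma, `("GPE")` is just a string and `in` performs a substring test.

```python
from typing import Iterable, Optional, Sequence, Tuple


def pick_slot_value(entities: Iterable[Tuple[str, str]],
                    accepted_labels: Sequence[str]) -> Optional[str]:
    """Return the entity text if exactly one entity has an accepted label, else None."""
    matches = [text for text, label in entities if label in accepted_labels]
    return matches[0] if len(matches) == 1 else None


# Entities as (text, label) pairs, in the shape spaCy exposes via doc.ents.
print(pick_slot_value([("Budapest", "GPE"), ("tomorrow", "DATE")], ("GPE",)))
print(pick_slot_value([("Budapest", "GPE"), ("Vienna", "GPE")], ("GPE",)))

# Pitfall: with a plain string, the label "PE" would be accepted by substring match.
print("PE" in ("GPE"), "PE" in ("GPE",))
```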


In [None]:
import spacy

class DialSystem:
  def __init__(self, states, intent_model, slots, starting_state):
    self.states = states
    self.intent_model = intent_model
    self.slots = slots
    self.starting_state = starting_state
    self.nlp = spacy.load("en_core_web_sm")
  
  def chat(self):
    for s in self.slots:
      s.reset()

    current_state = self.states[self.starting_state]
    while True:
      missing = current_state.get_missing_slots()
      if len(missing)>0:
        aiprint("Before we proceed I would like to request some information!")
        for s in missing:
          s.get_slot_information(self.nlp)
      
      next_state = None
      while next_state is None:
        # Entering the node prints the system turn and may collect a user response.
        next_state, response = current_state.enter_node()
        if next_state is None and response is not None:
          # No automatic transition: choose the successor from the predicted intent.
          next_state = current_state.propagate_by_action(self.intent_model.predict_intent(response))
        if next_state is None:
          aiprint("Sorry, I cannot understand your request.")

      current_state = next_state
      if len(current_state.successors)<=0:
        next_state, response = current_state.enter_node()
        break


In [None]:
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer
from datasets import Dataset, ClassLabel
from transformers import TrainingArguments, Trainer
import numpy as np
import evaluate
import torch
from datasets import load_dataset


class IntentModel:
  def __init__(self, model_string="bert-base-uncased", num_intents=2, max_len=64):
    self.model = AutoModelForSequenceClassification.from_pretrained(model_string,
                                                                    num_labels=num_intents)
    self.tokenizer = AutoTokenizer.from_pretrained(model_string)
    self.max_len = max_len
    self.num_intents = num_intents

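  def predict_intent(self, prompt):
    tokens = self.tokenizer(prompt, return_tensors="pt",
                       padding=True, truncation=True, max_length=self.max_len)

    # Run inference on the CPU in eval mode (dropout off) and without gradients.
    model = self.model.cpu().eval()
    with torch.no_grad():
      logits = model(input_ids=tokens.input_ids, attention_mask=tokens.attention_mask).logits
    # Model labels are 0-based, while INTENT values start at 1.
    actionId = int(np.argmax(logits.numpy())) + 1
    return actionId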

  def train_model(self, dataPath): 
    dataset = load_dataset("csv", data_files=dataPath, split=None)
    def change_label(row):
      row["label"] = INTENT[row["label"]]-1
      return row

    dataset["train"] = dataset["train"].map(change_label)
    

    def tokenize_function(examples):
        # Pad and truncate to max_len, the same limit that predict_intent uses.
        return self.tokenizer(examples["text"], padding="max_length",
                              truncation=True, max_length=self.max_len)
    tokenized_datasets = dataset["train"].map(tokenize_function, batched=True)


    tokenized_datasets = tokenized_datasets.class_encode_column("label")
    
    tokenized_datasets = tokenized_datasets.train_test_split(0.25, stratify_by_column="label")
    
    train_dataset = tokenized_datasets["train"].shuffle(seed=42)
    eval_dataset = tokenized_datasets["test"].shuffle(seed=42)
    metric = evaluate.load("accuracy")

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        predictions = np.argmax(logits, axis=-1)
        return metric.compute(predictions=predictions, references=labels)
    

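    # Evaluate, save, and log once per epoch; the best checkpoint is restored at the end.
    # Note: recent transformers releases rename `evaluation_strategy` to `eval_strategy`.
    training_args = TrainingArguments(output_dir="trainer",
                                      evaluation_strategy="epoch",
                                      num_train_epochs=16,
                                      save_strategy="epoch",
                                      logging_strategy="epoch",
                                      load_best_model_at_end=True,
                                      save_total_limit = 1,
                                      )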

    trainer = Trainer(
        model=self.model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        compute_metrics=compute_metrics,
    )

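    # Stage 1: train only the classification head; the BERT encoder stays frozen.
    # PyTorch ignores a `trainable` attribute, so freezing is done via requires_grad.
    self.model.classifier.requires_grad_(True)
    self.model.bert.requires_grad_(False)

    trainer.train()

    # Stage 2: a fresh Trainer fine-tunes the whole model, encoder included.
    trainer = Trainer(
        model=self.model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        compute_metrics=compute_metrics,
    )

    self.model.classifier.requires_grad_(True)
    self.model.bert.requires_grad_(True)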

    trainer.train()
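
`train_model` reads a CSV file with a `text` column and a `label` column, where each label is looked up by name in `INTENT`. The sketch below shows that layout and a simple label check. The sample rows are hypothetical and are not taken from the real `data.csv`; `validate_rows` is likewise an illustrative helper, not part of the notebook.

```python
import csv
import io

INTENT_NAMES = ["Yes", "No", "BuyFruit", "BuyCereal", "Pickup", "Delivery", "Juice"]

# Hypothetical rows for illustration only; they are not taken from the real data.csv.
SAMPLE_CSV = """text,label
yes please,Yes
no thanks,No
I would like some apples,BuyFruit
a box of cereal please,BuyCereal
"""


def validate_rows(csv_text):
    """Parse the CSV text and return its rows, rejecting unknown intent labels."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        if row["label"] not in INTENT_NAMES:
            raise ValueError(f"unknown intent label: {row['label']!r}")
    return rows


rows = validate_rows(SAMPLE_CSV)
print(len(rows), "rows,", len({r["label"] for r in rows}), "distinct labels")
```

Checking the labels before training surfaces a misspelled intent name as a clear error rather than as a `KeyError` inside the dataset mapping step.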


In [None]:
slots = []

prodType = Slot("ProductType","", None)
address = Slot("DeliveryAddress","Please enter your address!",("GPE",))
slots = [prodType, address]

In [None]:
states = []

def startEntry(state):
  aiprint("Welcome to the GroCereal Bot-shop. What would you like to buy?")
  state.response = user()
startState = DialState("StartState",[],{},False,startEntry)

def cerealEntry(state):
  aiprint("Okay, I've put some cereal into your shopping cart. Do you want me to deliver it, or will you pick it up at my store?")
  prodType.value = "Cereal"
  state.response = user()
cerealState = DialState("CerealState",[],{},False,cerealEntry)

def fruitEntry(state):
  aiprint("Okay, I've put some fruit into your shopping cart. Do you want them to be delivered as is, or do you want me to make some juices from them?")
  prodType.value = "Fruit"
  state.response = user()
fruitState = DialState("FruitState",[],{},False,fruitEntry)

def deliveryEntry(state):
  aiprint("Do you want me to deliver the "+prodType.value+" to "+address.value+" then?")
  state.response = user()
deliveryState = DialState("DeliveryState",[address], {}, False, deliveryEntry)

def juiceEntry(state):
  aiprint("Do you want me to create some juice from your fruits then and deliver it to "+address.value+"?")
  state.response = user()
juiceState = DialState("JuiceState",[address], {}, False, juiceEntry)

def deliveryConfirmEntry(state):
  aiprint("Thank you, the delivery man will arrive tomorrow! Good bye!")
deliveryConfirmState = DialState("DeliveryConfirmState",[], {}, True, deliveryConfirmEntry)

def pickupConfirmEntry(state):
  aiprint("Thank you, the cereals are ready, you can pick them up between 8 and 17 on every workday. Good bye!")
pickupConfirmState = DialState("PickupConfirmState",[], {}, True, pickupConfirmEntry)

startState.successors={INTENT.BuyCereal:cerealState, INTENT.BuyFruit:fruitState}
cerealState.successors={INTENT.Pickup:pickupConfirmState, INTENT.Delivery:deliveryState}
fruitState.successors={INTENT.Juice:juiceState, INTENT.Delivery:deliveryState}
juiceState.successors={INTENT.Yes:deliveryConfirmState, INTENT.No:fruitState}
deliveryState.successors={INTENT.Yes:deliveryConfirmState, INTENT.No:startState}

states = [startState, cerealState, fruitState, pickupConfirmState, deliveryState, juiceState, deliveryConfirmState]


In [None]:
model = IntentModel(num_intents=7)

model.train_model("data.csv")

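| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 1.9983 | 1.897698 | 0.285714 |
| 2 | 1.947 | 1.767573 | 0.285714 |
| 3 | 1.7635 | 1.706979 | 0.571429 |
| 4 | 1.6554 | 1.706585 | 0.285714 |
| 5 | 1.5422 | 1.715536 | 0.428571 |
| 6 | 1.4277 | 1.76542 | 0.285714 |
| 7 | 1.3791 | 1.624543 | 0.285714 |
| 8 | 1.3115 | 1.535716 | 0.428571 |
| 9 | 1.2052 | 1.521679 | 0.428571 |
| 10 | 1.1422 | 1.492824 | 0.428571 |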




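| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.9286 | 1.284512 | 0.714286 |
| 2 | 0.8388 | 1.306607 | 0.571429 |
| 3 | 0.7976 | 1.305689 | 0.571429 |
| 4 | 0.6277 | 1.206641 | 0.571429 |
| 5 | 0.5413 | 1.092681 | 0.714286 |
| 6 | 0.4721 | 0.977069 | 0.714286 |
| 7 | 0.3588 | 0.879787 | 0.571429 |
| 8 | 0.3261 | 0.771893 | 0.857143 |
| 9 | 0.2711 | 0.699324 | 0.857143 |
| 10 | 0.2323 | 0.676976 | 0.857143 |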


In [None]:
system = DialSystem(states, model, slots, 0)

In [None]:
system.chat()

AI: Welcome to the GroCereal Bot-shop. What would you like to buy?
User: I want some cereal!
AI: Okay, I've put some cereal into your shopping cart. Do you want me to deliver it, or will you pick it up at my store?
User: Deliver it to me!
AI: Before we proceed I would like to request some information!
AI: Please enter your address!
User: I live in Budapest, Fő utca 1.
AI: Do you want me to deliver the Cereal to Budapest then?
User: No
AI: Welcome to the GroCereal Bot-shop. What would you like to buy?
User: Cereals
AI: Sorry, I cannot understand your request.
AI: Welcome to the GroCereal Bot-shop. What would you like to buy?
User: I would like to buy cereal
AI: Okay, I've put some fruit into your shopping cart. Do you want them to be delivered as is, or do you want me to make some juices from them?
User: Deliver
AI: Do you want me to deliver the Fruit to Budapest then?
User: No
AI: Welcome to the GroCereal Bot-shop. What would you like to buy?
User: I would like to buy some cereal!
AI: 