# 4.1. Dialogue Management Implementation (Policies)

DataCamp gave me ideas on how the ELIZA bot was traditionally coded. I used that and RASA's chatbot implementation as inspiration; this notebook is my own implementation and reflects how I understand a chatbot to be built.

The goal is to predict the next best action based on the dialogue history. Actions are chosen by policies: each policy outputs a probability (confidence) for the action it predicts.

I drew up a high-level diagram (based on the one RASA drew for their chatbot architecture) that shows my entire pipeline.

<!-- <img src="visualizations/chatbot-framework.png" alt="Drawing" style="width: 500px;"/> -->

In [6]:
# Visualization
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks", color_codes=True)

# Data science
import pandas as pd
print(f"Pandas: {pd.__version__}")
import numpy as np
print(f"Numpy: {np.__version__}")
import random

# Reading in training data
train = pd.read_pickle('objects/train.pkl')

# Getting all the unique intents
intents = train['Intent'].unique()

Pandas: 1.0.1
Numpy: 1.18.1


In [7]:
def main():
    a = Actions()
    print(a.utter_greet())
    # Placeholder message; swap in input("Tell EVE something! ") to make this interactive
    user_input = "Tell EVE something!"
    intents, hardware, app = initialize(user_input)

    # Initializing dialogue history: one row with a zero for every known entity.
    # `entities` is assumed to be a dict with "hardware" and "apps" lists defined elsewhere.
    columns = entities["hardware"] + entities["apps"]
    history = pd.DataFrame([np.zeros(len(columns))], columns=columns)

def initialize(user_input):
    """Takes the user input and returns the predicted intent and the entity representation."""
    # Intent classification (infer_intent is assumed to be defined elsewhere)
    intents = infer_intent(user_input)
    # Further unpacking
    user_input, pred = intents
    pred = {k: round(float(v), 3) for k, v in pred.items()}

    # Visualizing intent classification
    g = sns.barplot(
        x=list(pred.keys()),
        y=list(pred.values()),
        palette=sns.cubehelix_palette(8, reverse=True),
    )
    g.set_xticklabels(g.get_xticklabels(), rotation=90)
    plt.tight_layout()
    plt.show()

    print("Hardware Identified")
    hardware = extract_hardware(user_input, visualize=True)
    print(hardware)

    print("Applications Identified")
    app = extract_app(user_input, visualize=True)
    print(app)

    return (intents, hardware, app)

if __name__ == "__main__":
    main()

In [4]:
# Making a class to define all the actions the bot can take
class Actions:
    def __init__(self):
        # Per-instance memory of the entities mentioned so far
        self.memory = {'hardware': [], 'app': []}

    # If greet
    def utter_greet(self):
        # Storing the bank of responses
        return random.choice(
            [
                "Hi! My name is EVE. How may I assist you today?",
                "Hello. How may I be of help?",
            ]
        )

    # If goodbye
    def utter_goodbye(self):
        # TODO: ask the reaffirm question before saying goodbye
        reaffirm = ["Is there anything else I could help you with?"]
        goodbye = [
            "Thank you for your time. Have a nice day!",
            "Glad I could be of help, have a nice day!",
        ]
        return random.choice(goodbye)

    # Speak to representative
    def link_to_human(self):
        return random.choice(["Alright. Let me direct you to a representative!"])

    def battery(self, hardware=False):
        # Ask for the hardware first if it has not been identified yet
        if not hardware:
            return 'What hardware are you using?'
        # TODO: responses for when the hardware is known
        return random.choice([''])

    def forgot_pass(self):
        pass

    def payment(self):
        # What hardware?
        # TODO: response not written yet
        return ""

    def challenge_robot(self):
        return random.choice(
            [
                "You're funny. Of course I am a robot.",
                "Yes, and I was designed by Matthew to assist you.",
            ]
        )

    def update(self):
        # Affirm hardware
        if not self.memory['hardware']:
            pass


### Next best action prediction
**Inputs**:
* Last interactions
    * Param: Max history
* Intent
* Entities
* Slots: The policy only sees a binary flag indicating whether each slot is filled.

**Outputs**: Next best action
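
To make the inputs above concrete, here is a minimal sketch of how one dialogue state could be turned into a feature matrix. The vocabularies (`INTENTS`, `ENTITIES`, `SLOTS`, `ACTIONS`) and the function names are hypothetical placeholders, not taken from the notebook or from RASA; the real values would come from the training data.

```python
from typing import Dict, List

# Hypothetical vocabularies, for illustration only.
INTENTS = ["greet", "battery", "update", "goodbye"]
ENTITIES = ["iphone", "macbook", "ios"]
SLOTS = ["hardware", "app"]
ACTIONS = ["utter_greet", "utter_default", "utter_goodbye"]


def one_hot(value: str, vocab: List[str]) -> List[int]:
    """Return a one-hot vector; all zeros if the value is not in the vocabulary."""
    return [1 if value == v else 0 for v in vocab]


def featurize_turn(turn: Dict) -> List[int]:
    """Encode one turn as intent + entity flags + slot-filled flags + previous action."""
    intent_vec = one_hot(turn["intent"], INTENTS)
    entity_vec = [1 if e in turn["entities"] else 0 for e in ENTITIES]
    # Slots are binary: the policy only sees whether a slot is filled.
    slot_vec = [1 if turn["slots"].get(s) else 0 for s in SLOTS]
    action_vec = one_hot(turn.get("prev_action", ""), ACTIONS)
    return intent_vec + entity_vec + slot_vec + action_vec


def featurize_history(turns: List[Dict], max_history: int = 3) -> List[List[int]]:
    """Keep the last `max_history` turns, left-padding with zero vectors."""
    width = len(INTENTS) + len(ENTITIES) + len(SLOTS) + len(ACTIONS)
    recent = [featurize_turn(t) for t in turns[-max_history:]]
    padding = [[0] * width for _ in range(max_history - len(recent))]
    return padding + recent
```

Each row is one turn, so `max_history` directly controls how far back the policy can look.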


### Actions
Actions that my chatbot is capable of:

* `utter_greet`: Greets user with a greeting
* `utter_goodbye`: Says farewell
* `fallback`: Links to human if confidence thresholds are not met or the model thinks that you are saying something out of scope
    * Can do this in two stages - Ask the user to affirm an intent at the middle confidence range
    * Ambiguity threshold - Minimum amount the top intent confidence must exceed the second highest intent
* `utter_default`: Utters default phrase for the intent
* `reroute_intent`: Bot misidentified the intent. I can revert to the option menu in this case. Then to a human.
* `search update resolution`: 
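
The two-stage `fallback` idea and the ambiguity threshold from the list above could be sketched roughly as follows. The threshold values, the function name `choose_fallback`, and the returned labels are assumptions made for illustration; none of them come from the notebook or from RASA's implementation.

```python
from typing import Dict, Tuple


def choose_fallback(
    intent_scores: Dict[str, float],
    nlu_threshold: float = 0.4,        # hypothetical: below this, hand over to a human
    affirm_threshold: float = 0.7,     # hypothetical: below this, ask the user to confirm
    ambiguity_threshold: float = 0.1,  # hypothetical: minimum gap between top two intents
) -> Tuple[str, str]:
    """Return (decision, top_intent) for a dict of intent confidences."""
    ranked = sorted(intent_scores.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_score = ranked[0]
    second_score = ranked[1][1] if len(ranked) > 1 else 0.0

    # Too unsure overall, or the top two intents are too close to call.
    if top_score < nlu_threshold or top_score - second_score < ambiguity_threshold:
        return ("fallback", top_intent)
    # Middle confidence range: ask the user to affirm the intent first.
    if top_score < affirm_threshold:
        return ("ask_affirmation", top_intent)
    return ("proceed", top_intent)
```

For example, `choose_fallback({"battery": 0.55, "update": 0.2})` lands in the middle range and asks for affirmation rather than acting immediately.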

My goal is to make each of these actions into a function, and call that function when the particular `next_best_action` is predicted.
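
One possible way to call the function that matches a predicted `next_best_action` is a name-based lookup. This is only a sketch: `DemoActions` is a hypothetical stand-in for the notebook's `Actions` class, and `run_action` is a name introduced here for illustration.

```python
class DemoActions:
    """Hypothetical stand-in for the notebook's Actions class."""

    def utter_greet(self) -> str:
        return "Hello. How may I be of help?"

    def utter_goodbye(self) -> str:
        return "Thank you for your time. Have a nice day!"

    def fallback(self) -> str:
        return "Alright. Let me direct you to a representative!"


def run_action(actions, next_best_action: str) -> str:
    """Look up the method named after the predicted action and call it."""
    handler = getattr(actions, next_best_action, None)
    # Unknown action names fall back to the human hand-off.
    if not callable(handler):
        handler = actions.fallback
    return handler()
```

Dispatching by name avoids a long `if`/`elif` chain, at the cost of requiring action names to match method names exactly.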

    Me: My iOS isn't updating correctly
    Eve: Would you like to 

In [1]:
# Getting my training data
slot = {'Hardware': []}
history = ['hi', 'my iphone is broken', 'are you a robot']
# User inputs
history

['hi', 'my iphone is broken', 'are you a robot']

Can we somehow bias the model to be able to incorporate a certain flow more than others?

# Action Selection
At every turn, each policy defined in your configuration will predict a next action with a certain confidence level. For more information about how each policy makes its decision, read into the policy’s description below. The bot’s next action is then decided by the policy that predicts with the highest confidence.

In the case that two policies predict with equal confidence (for example, the Memoization and Mapping Policies always predict with confidence of either 0 or 1), the priority of the policies is considered. Rasa policies have default priorities that are set to ensure the expected outcome in the case of a tie. They look like this, where higher numbers have higher priority:

1. TEDPolicy, EmbeddingPolicy, KerasPolicy, and SklearnPolicy - Predicts next best action
2. MappingPolicy - Maps an intent to an action
3. MemoizationPolicy and AugmentedMemoizationPolicy - Memorizes conversational data: predicts either confidence of 1 or 0
4. FallbackPolicy and TwoStageFallbackPolicy - Lower conf thresholds
5. FormPolicy - Filling in the required slots

This priority hierarchy ensures that, for example, if there is an intent with a mapped action, but the NLU confidence is not above the nlu_threshold, the bot will still fall back. In general, it is not recommended to have more than one policy per priority level, and some policies on the same priority level, such as the two fallback policies, strictly cannot be used in tandem.

If you create your own policy, use these priorities as a guide for figuring out the priority of your policy. If your policy is a machine learning policy, it should most likely have priority 1, the same as the Rasa machine learning policies.
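
The selection rule described above (highest confidence wins, priority breaks ties) can be sketched in a few lines. This is not Rasa's code: `Prediction` and `select_action` are hypothetical names, and the priority numbers simply follow the list above.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    policy: str
    action: str
    confidence: float
    priority: int  # higher number = higher priority, as in the list above


def select_action(predictions: List[Prediction]) -> Prediction:
    """Pick the most confident prediction; on a tie, the higher priority wins."""
    return max(predictions, key=lambda p: (p.confidence, p.priority))


predictions = [
    Prediction("TEDPolicy", "utter_default", 0.8, 1),
    Prediction("MappingPolicy", "utter_goodbye", 1.0, 2),
    Prediction("MemoizationPolicy", "utter_greet", 1.0, 3),
]
winner = select_action(predictions)
```

Here the Mapping and Memoization predictions tie at confidence 1.0, so the Memoization prediction is chosen because it has the higher priority.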

In [19]:
pd.read_pickle('objects/processed.pkl')['Real Outbound']

# Have to determine how you are going to respond to each message that comes in
# Usually will have good 3 first words, then the rest is bad
# Language generation models are usually trained with HUGE datasets
# Sign up for the GPT-3 API

0         @115854 Lets take a closer look into this issu...
6         @115856 Hey, let's work together to figure out...
12        @115861 You're in the right place; we'll do al...
13        @115863 Go ahead and send us a DM please. Let ...
15        @115864 We'd like to help, but we'll need more...
                                ...                        
106643    @823737 We're happy to help out with your conc...
106644    @689907 We're certainly glad to get you pointe...
106645    @823765 We'd love to help! Which device are yo...
106646    @823779 We'd like to help. Send us a DM and we...
106647    @823796 We'd love to offer our help in making ...
Name: Real Outbound, Length: 76066, dtype: string

In [1]:
intent

NameError: name 'intent' is not defined

In [10]:
# Making a switch statement to see which actions 
if intent == 'battery':
    pass


NameError: name 'intent' is not defined

### Outbound EDA
I want to get insight on how Apple responds to their customers.

In [36]:
# Search by keywords (single keyword filter)
keyword = 'update'

# Seeing what the processed Tweets look like
# (`processed` is assumed to be the objects/processed.pkl DataFrame read above)
filt = [(i,j) for i,j in enumerate(processed['Processed Inbound']) if keyword in j]
filtered = processed.iloc[[i[0] for i in filt]]
print(f'{len(filtered)} Tweets contain the keyword {keyword}')
filtered

19942 Tweets contain the keyword update


| | Processed Inbound | Real Inbound | Real Outbound |
|---|---|---|---|
| 0 | [new, update, i️, make, sure, download, yester... | @AppleSupport The newest update. I️ made sure ... | @115854 Lets take a closer look into this issu... |
| 15 | [thank, update, phone, even, slow, barely, wor... | Thank you @AppleSupport I updated my phone and... | @115864 We'd like to help, but we'll need more... |
| 19 | [need, software, update, urgently, battery, la... | @AppleSupport I need the software update urgen... | @115865 Hi there! What type of device are we w... |
| 25 | [hey, last, time, download, update, freak, pho... | Hey @115858! Last time I downloaded an update ... | @115869 We're here to help. Meet us in DM and ... |
| 38 | [iphone, yes, io, checked, update, none, avail... | @AppleSupport iPhone 6, yes ios11. Checked for... | @116102 To make sure, is iOS 11.1 showing here... |
| ... | ... | ... | ... |
| 106624 | [dear, fuck, wish, iphone, would, stop, crash,... | Dear @115858 I fucking wish my iPhone 7 would ... | @823495 We know it's important for your iPhone... |
| 106630 | [im, upset, update, every, time, type, anythin... | im so upset over this @115858 update, every ti... | @485591 We completely understand being upset a... |
| 106634 | [home, button, work, phone, battery, last, lit... | @115858 My home button does not work. My phone... | @823651 We want your iPhone to work as it shou... |
| 106636 | [whenever, new, iphone, get, launch, old, mode... | @115858 why is it whenever a new iphone gets l... | @823679 Thanks for reaching out. We know how i... |


# TED Policy Documentation

This is taken directly from RASA.

The Transformer Embedding Dialogue (TED) Policy is described in our paper.

This policy has a pre-defined architecture, which comprises the following steps:

* concatenate user input (user intent and entities), previous system actions, slots and active forms for each time step into an input vector to pre-transformer embedding layer;
* feed it to transformer;
* apply a dense layer to the output of the transformer to get embeddings of a dialogue for each time step;
* apply a dense layer to create embeddings for system actions for each time step;
* calculate the similarity between the dialogue embedding and embedded system actions. This step is based on the StarSpace idea.

It is recommended to use state_featurizer=LabelTokenizerSingleStateFeaturizer(...) (see Featurization of Conversations for details).
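
The last step, comparing the dialogue embedding with the embedded system actions, can be illustrated with a toy example. This sketch uses a plain dot product as the similarity, which is an assumption made here for illustration; the embedding values and the names `dot` and `rank_actions` are made up and are not part of TED itself.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_actions(
    dialogue_embedding: List[float],
    action_embeddings: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Score every action against the dialogue embedding, best first."""
    scores = {
        name: dot(dialogue_embedding, emb)
        for name, emb in action_embeddings.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy 2-dimensional embeddings (hypothetical values).
toy_actions = {"utter_greet": [1.0, 0.0], "utter_goodbye": [0.0, 1.0]}
ranked = rank_actions([0.9, 0.1], toy_actions)
```

The action whose embedding lies closest to the dialogue embedding ends up first in `ranked`, and it would be the predicted next action.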

### Configuration:

Configuration parameters can be passed as parameters to the TEDPolicy within the configuration file. If you want to adapt your model, start by modifying the following parameters:

* `epochs`: This parameter sets the number of times the algorithm will see the training data (default: 1). One epoch equals one forward pass and one backward pass of all the training examples. Sometimes the model needs more epochs to properly learn. Sometimes more epochs don’t influence the performance. The lower the number of epochs the faster the model is trained.
* `hidden_layers_sizes`: This parameter allows you to define the number of feed forward layers and their output dimensions for dialogues and intents (default: dialogue: [], label: []). Every entry in the list corresponds to a feed forward layer. For example, if you set dialogue: [256, 128], we will add two feed forward layers in front of the transformer. The vectors of the input tokens (coming from the dialogue) will be passed on to those layers. The first layer will have an output dimension of 256 and the second layer will have an output dimension of 128. If an empty list is used (default behavior), no feed forward layer will be added. Make sure to use only positive integer values. Usually, powers of two are used. Also, it is usual practice to have decreasing values in the list: each value is smaller than or equal to the value before it.
* `number_of_transformer_layers`: This parameter sets the number of transformer layers to use (default: 1). The number of transformer layers corresponds to the transformer blocks to use for the model.
* `transformer_size`: This parameter sets the number of units in the transformer (default: 128). The vectors coming out of the transformers will have the given transformer_size.
* `weight_sparsity`: This parameter defines the fraction of kernel weights that are set to 0 for all feed forward layers in the model (default: 0.8). The value should be between 0 and 1. If you set weight_sparsity to 0, no kernel weights will be set to 0, the layer acts as a standard feed forward layer. You should not set weight_sparsity to 1 as this would result in all kernel weights being 0, i.e. the model is not able to learn.
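
To get a feel for what `weight_sparsity` means, the sketch below zeroes out a given fraction of a small weight matrix. It only illustrates the definition quoted above; it makes no claim about how Rasa applies the mask internally, and `sparsify` is a hypothetical helper name.

```python
import random
from typing import List


def sparsify(
    kernel: List[List[float]], weight_sparsity: float, seed: int = 0
) -> List[List[float]]:
    """Return a copy of `kernel` with a `weight_sparsity` fraction of entries set to 0."""
    if not 0 <= weight_sparsity < 1:
        # A sparsity of 1 would zero every weight, so nothing could be learned.
        raise ValueError("weight_sparsity must be in [0, 1)")
    rng = random.Random(seed)
    positions = [(i, j) for i, row in enumerate(kernel) for j in range(len(row))]
    n_zero = int(round(weight_sparsity * len(positions)))
    zeroed = set(rng.sample(positions, n_zero))
    return [
        [0.0 if (i, j) in zeroed else w for j, w in enumerate(row)]
        for i, row in enumerate(kernel)
    ]


# A 5x4 kernel of ones: with sparsity 0.8, 16 of the 20 weights become 0.
dense = [[1.0] * 4 for _ in range(5)]
sparse = sparsify(dense, weight_sparsity=0.8)
```

With `weight_sparsity=0` the kernel is returned unchanged, which matches the "standard feed forward layer" case described above.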

In [20]:
# Transformer model

# Reads in the question as a contextual question

# Open AI GPT-3 Generates response from scratch. 
# The only issue is that it's a gigantic model.

# One way you could do it
# Pass in inbound examples as the training examples

# Outbound message is what's meant to be predicted. The model will learn that a similar inbound
# message should get a similar outbound response.

# Words go in as documents (word by word). It will find out the context of the sequence that gets passed
# in and it reads the inbound sequence as the inbound context. When it trains it is going to look at the
# outbounds that go with that inbound. Generate a word-by-word output.

There are two options: next-word prediction, or treating the existing responses as a bank of classes and classifying which existing response fits best. For next-word generation, taking advantage of pretrained embeddings reduces the amount of training needed.
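
The "bank of responses" idea could be sketched as a nearest-neighbour lookup over token counts. This is a simplified stand-in for a trained classifier, not the approach the notebook settles on; `TOY_BANK` is invented toy data and `cosine` and `pick_response` are hypothetical helper names.

```python
from collections import Counter
from math import sqrt
from typing import List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


def pick_response(tokens: List[str], bank: List[Tuple[List[str], str]]) -> str:
    """Return the response whose paired inbound tokens are most similar to `tokens`."""
    query = Counter(tokens)
    best = max(bank, key=lambda pair: cosine(query, Counter(pair[0])))
    return best[1]


# Toy (inbound tokens, outbound response) pairs, invented for illustration.
TOY_BANK = [
    (["battery", "drain", "fast"], "Which device are you using?"),
    (["update", "fail", "download"], "Let's look into the update issue."),
]
reply = pick_response(["update", "stuck"], TOY_BANK)
```

Swapping the raw counts for pretrained embeddings would be the natural next step, since it lets messages match on meaning rather than exact tokens.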