# Building Chatbots in Python

<strong>Course Description</strong>

When done well, interacting with a computer through human language is incredibly powerful and also quite fun. Messaging and voice-controlled devices are the next big platforms, and conversational computing has a big role to play in creating engaging augmented and virtual reality experiences. This course will get you started on the path towards building such applications! There are a number of unique challenges to building these kinds of programs. The most obvious one is: how do I turn human language into machine instructions? In this course, you'll tackle this first with rule-based systems and then with machine learning. Some chat systems are designed to be useful, while others are just good fun. You will build one of each, and finally put everything together to make a helpful, friendly chatbot! Once you complete the course, you can learn how to <a href="https://www.datacamp.com/community/tutorials/facebook-chatbot-python-deploy">connect your chatbot to Facebook Messenger!</a>

## Chapter 1 : Chatbots 101

### Introduction

In this chapter, you'll learn how to build your first chatbot! After gaining a bit of historical context, you'll set up a basic structure for receiving text and responding to users, and then learn how to add the basic elements of personality. You'll then build rule-based systems for parsing text.

<video controls src="Intro_ConversationalAI.mp4" width="560" height="315"/>

In [1]:
bot_template = "BOT : {0}"
user_template = "USER : {0}"

# Define a function that responds to a user's message: respond
def respond(message):
    # Concatenate the user's message to the end of a standard bot response
    bot_message = "I can hear you! You said: " + message
    # Return the result
    return bot_message

# Define a function that sends a message to the bot: send_message
def send_message(message):
    # Print user_template including the user_message
    print(user_template.format(message))
    # Get the bot's response to the message
    response = respond(message)
    # Print the bot template including the bot's response.
    print(bot_template.format(response))

# Send a message to the bot
send_message("hello")

USER : hello
BOT : I can hear you! You said: hello


### Creating a personality


In [2]:
# Define variables
name = "Greg"
weather = "cloudy"

# Define a dictionary with the predefined responses
responses = {
  "what's your name?": "my name is {0}".format(name),
  "what's today's weather?": "the weather is {0}".format(weather),
  "default": "default message"
}

# Return the matching response if there is one, default otherwise
def respond(message):
    # Check if the message is in the responses
    if message in responses:
        # Return the matching message
        bot_message = responses[message]
    else:
        # Return the "default" message
        bot_message = responses["default"]
    return bot_message

In [3]:
send_message("what's your favorite color?")

USER : what's your favorite color?
BOT : default message


In [4]:
send_message("what's your name?")

USER : what's your name?
BOT : my name is Greg


In [5]:
send_message("what's today's weather?")

USER : what's today's weather?
BOT : the weather is cloudy


#### Adding variety

It can get a little boring hearing the same old answers over and over. In this exercise, you'll add some variation. If you ask your bot how it's feeling, it may equally well respond with "oh I'm great!" as with "I'm very sad today".

Here, you'll use the random module - specifically random.choice(ls) - which randomly selects an element from a list ls.
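
As a quick illustration of random.choice(), the sketch below draws a few times from a small list. The list contents and the seed value are made up for this example; seeding is only there so that a run can be repeated.

```python
import random

# Hypothetical replies, used only for this illustration
moods = ["oh I'm great!", "I'm very sad today", "not bad at all"]

# Seeding makes the sequence of picks repeatable between runs
random.seed(0)

# Each call returns one element of the list, chosen uniformly at random
picks = [random.choice(moods) for _ in range(5)]
print(picks)
```

Because each pick is independent, the same reply can come up several times in a row.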

A dictionary called responses, which maps each message to a list of possible responses, has been defined for you.

In [6]:
# Import the random module
import random

name = "Greg"
weather = "cloudy"

# Define a dictionary containing a list of responses for each message
responses = {
  "what's your name?": [
      "my name is {0}".format(name),
      "they call me {0}".format(name),
      "I go by {0}".format(name)
   ],
  "what's today's weather?": [
      "the weather is {0}".format(weather),
      "it's {0} today".format(weather)
    ],
  "default": ["default message"]
}

# Use random.choice() to choose a matching response
def respond(message):
    # Check if the message is in the responses
    if message in responses:
        # Return a random matching response
        bot_message = random.choice(responses[message])
    else:
        # Return a random "default" response
        bot_message = random.choice(responses["default"])
    return bot_message

In [7]:
send_message("what's your name?")

USER : what's your name?
BOT : my name is Greg


In [8]:
send_message("what's your name?")

USER : what's your name?
BOT : I go by Greg


In [9]:
send_message("what's your name?")

USER : what's your name?
BOT : they call me Greg


#### ELIZA I: asking questions
Asking questions is a great way to create an engaging conversation. Here, you'll create the very first hint of ELIZA's famous personality, by responding to statements with a question and responding to questions with answers.

A dictionary of responses with "question" and "statement" as keys, and lists of appropriate responses as values has already been defined for you. Explore this in the Shell with responses.keys() and responses["question"].

In [10]:
responses = {'question': ["I don't know :(", 'you tell me!'],
 'statement': ['tell me more!',
  'why do you think that?',
  'how long have you felt this way?',
  'I find that extremely interesting',
  'can you back that up?',
  'oh wow!',
  ':)']}

def respond(message):
    # Check for a question mark
    if message.endswith("?"):
        # Return a random response to a question
        return random.choice(responses["question"])
    # Otherwise return a random response to a statement
    return random.choice(responses["statement"])

In [11]:
# Send messages ending in a question mark
send_message("what's today's weather?")
send_message("what's today's weather?")

USER : what's today's weather?
BOT : I don't know :(
USER : what's today's weather?
BOT : I don't know :(


In [12]:
# Send messages which don't end with a question mark
send_message("I love building chatbots")
send_message("I love building chatbots")

USER : I love building chatbots
BOT : can you back that up?
USER : I love building chatbots
BOT : I find that extremely interesting


### Text Munging with regular expressions


#### ELIZA II: Extracting key phrases
The really clever thing about ELIZA is the way the program appears to understand what you told it, by occasionally including phrases uttered by the user in its responses.

In this exercise, you will match messages against some common patterns and extract phrases using re.search(). A dictionary called rules has already been defined, which matches the following patterns:

- "do you think (.*)"
- "do you remember (.*)"
- "I want (.*)"
- "if (.*)"

Inspect this dictionary in the Shell before starting the exercise.
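
To make the role of the capture group concrete, here is a minimal sketch of re.search() applied to one of the patterns listed above. The variable names are chosen for this illustration only.

```python
import re

# The capture group (.*) keeps whatever follows the fixed prefix
pattern = "do you remember (.*)"
message = "do you remember your last birthday"

match = re.search(pattern, message)
if match is not None:
    # group(0) is the whole match, group(1) the first parenthesised group
    phrase = match.group(1)
else:
    phrase = None
print(phrase)

# re.search() returns None when the pattern does not occur in the string
no_match = re.search(pattern, "I love building chatbots")
print(no_match)
```

Checking for None before calling group() avoids an AttributeError on messages that match no rule.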

In [13]:
import re 
rules = {'I want (.*)': ['What would it mean if you got {0}',
  'Why do you want {0}',
  "What's stopping you from getting {0}"],
 'do you remember (.*)': ['Did you think I would forget {0}',
  "Why haven't you been able to forget {0}",
  'What about {0}',
  'Yes .. and?'],
 'do you think (.*)': ['if {0}? Absolutely.', 'No chance'],
 'if (.*)': ["Do you really think it's likely that {0}",
  'Do you wish that {0}',
  'What do you think about {0}',
  'Really--if {0}']}

# Define match_rule()
def match_rule(rules, message):
    response, phrase = "default", None
    
    # Iterate over the rules dictionary
    for pattern, responses in rules.items():
        # Create a match object
        match = re.search(pattern,message)
        if match is not None:
            # Choose a random response
            response = random.choice(responses)
            if '{0}' in response:
                phrase = match.group(1)
    # Return the response and phrase
    return response, phrase

# Test match_rule
print(match_rule(rules, "do you remember your last birthday"))

('Yes .. and?', None)


#### ELIZA III: Pronouns
To make responses grammatically coherent, you'll want to transform the extracted phrases from first to second person and vice versa. In English, conjugating verbs is easy, and simply swapping "I" and "you", "my" and "your" works in most cases.

In this exercise, you'll define a function called replace_pronouns() which uses re.sub() to map "me" and "my" to "you" and "your" (and vice versa) in a string.

In [15]:
# Define replace_pronouns()
def replace_pronouns(message):

    message = message.lower()
    # \b matches whole words only, so the "me" in "remember" is left alone
    if re.search(r'\bme\b', message):
        # Replace 'me' with 'you'
        return re.sub(r'\bme\b', 'you', message)
    if re.search(r'\bmy\b', message):
        # Replace 'my' with 'your'
        return re.sub(r'\bmy\b', 'your', message)
    if re.search(r'\byour\b', message):
        # Replace 'your' with 'my'
        return re.sub(r'\byour\b', 'my', message)
    if re.search(r'\byou\b', message):
        # Replace 'you' with 'me'
        return re.sub(r'\byou\b', 'me', message)

    return message

print(replace_pronouns("my last birthday"))
print(replace_pronouns("when you went to Florida"))
print(replace_pronouns("I had my own castle"))

your last birthday
when me went to florida
i had your own castle
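
Because replace_pronouns() returns as soon as one rule fires, a message containing both "my" and "you" has only one of them swapped. One possible alternative, sketched here with a hypothetical swap_pronouns() helper and a made-up mapping, performs every substitution in a single pass so that a word swapped once is not swapped back.

```python
import re

# Hypothetical first/second person mapping; extend it as needed
PRONOUN_MAP = {"i": "you", "me": "you", "my": "your", "you": "me", "your": "my"}

# \b restricts matches to whole words, so "home" or "remember" are untouched
PRONOUN_PATTERN = re.compile(r"\b(" + "|".join(PRONOUN_MAP) + r")\b")

def swap_pronouns(message):
    """Swap first and second person pronouns in one pass over the message."""
    # The replacement function looks up each matched word in the mapping
    return PRONOUN_PATTERN.sub(lambda m: PRONOUN_MAP[m.group(1)], message.lower())

print(swap_pronouns("I had my own castle"))
print(swap_pronouns("you remember my home"))
```

The result is still grammatically rough ("me" where "I" is needed), which is the usual limitation of purely lexical swapping.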


#### ELIZA IV: Putting it all together
Now you're going to put it all together and experience the magic! The match_rule(), send_message(), and replace_pronouns() functions have already been defined, and the rules dictionary is available in your workspace.

Your job here is to write a function called respond() with a single argument message which creates an appropriate response to be handled by send_message.

In [16]:
# Define respond()
def respond(message):
    # Call match_rule
    response, phrase = match_rule(rules,message)
    if '{0}' in response:
        # Replace the pronouns in the phrase
        phrase = replace_pronouns(phrase)
        # Include the phrase in the response
        response = response.format(phrase)
    return response

# Send the messages
send_message("do you remember your last birthday")
send_message("do you think humans should be worried about AI")
send_message("I want a robot friend")
send_message("what if you could be anything you wanted")

USER : do you remember your last birthday
BOT : What about my last birthday
USER : do you think humans should be worried about AI
BOT : No chance
USER : I want a robot friend
BOT : Why do you want a robot friend
USER : what if you could be anything you wanted
BOT : Do you wish that me could be anything me wanted


## Chapter 2 : Understanding natural language

Here, you'll use machine learning to turn natural language into structured data using spaCy, scikit-learn, and rasa NLU. You'll start with a refresher on the theoretical foundations, and then move on to building models using the ATIS dataset, which contains thousands of sentences from real people interacting with a flight booking system.

### Understanding Intents and Entities : NLU (Natural Language Understanding)

#### Intent classification with regex I
You'll begin by implementing a very simple way to recognise intents - just looking for the presence of keywords.

A dictionary keywords has already been defined. It has the intents "greet", "goodbye", and "thankyou" as keys, and lists of keywords as the corresponding values. For example, keywords["greet"] is set to ["hello","hi","hey"].

Also defined is a second dictionary, responses, indicating how the bot should respond to each of these intents. It also has a default response with the key "default".

The function send_message(), along with the bot and user templates have also already been defined. Your job in this exercise is to create a dictionary with the intents as keys and regex objects as values.

In [17]:
keywords = {'greet': ['hello', 'hi', 'hey'], 'goodbye': ['bye', 'farewell'], 'thankyou': ['thank', 'thx']}
responses = {'default': 'default message',
 'goodbye': 'goodbye for now',
 'greet': 'Hello you! :)',
 'thankyou': 'you are very welcome'}
# Define a dictionary of patterns
patterns = {}

# Iterate over the keywords dictionary
for intent, keys in keywords.items():
    # Create regular expressions and compile them into pattern objects
    patterns[intent] = re.compile('|'.join(keys))
    
# Print the patterns
print(patterns)

{'greet': re.compile('hello|hi|hey'), 'goodbye': re.compile('bye|farewell'), 'thankyou': re.compile('thank|thx')}


#### Intent classification with regex II
With your patterns dictionary created, it's now time to define a function to find the intent of a message.

In [18]:
# Define a function to find the intent of a message
def match_intent(message):
    matched_intent = None
    for intent, pattern in patterns.items():
        # Check if the pattern occurs in the message 
        if pattern.search(message):
            matched_intent = intent
    return matched_intent

# Define a respond function
def respond(message):
    # Call the match_intent function
    intent = match_intent(message)
    # Fall back to the default response
    key = "default"
    if intent in responses:
        key = intent
    return responses[key]

# Send messages
send_message("hello!")
send_message("bye byeee")
send_message("thanks very much!")

USER : hello!
BOT : Hello you! :)
USER : bye byeee
BOT : goodbye for now
USER : thanks very much!
BOT : you are very welcome
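
Keyword alternation matches anywhere in a message, so a pattern such as hello|hi|hey also fires on the "hi" inside "this". The sketch below is one way to reduce such false positives, assuming that keywords should only match at the start of a word; the names bounded_patterns and match_intent_bounded are invented for this example.

```python
import re

keywords = {
    "greet": ["hello", "hi", "hey"],
    "goodbye": ["bye", "farewell"],
    "thankyou": ["thank", "thx"],
}

# A leading \b requires each keyword to start a word; there is no trailing \b,
# so the prefix "thank" still matches "thanks"
bounded_patterns = {
    intent: re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + ")", re.IGNORECASE)
    for intent, keys in keywords.items()
}

def match_intent_bounded(message):
    """Return the first intent whose pattern occurs in the message, else None."""
    for intent, pattern in bounded_patterns.items():
        if pattern.search(message):
            return intent
    return None

print(match_intent_bounded("this is fine"))
print(match_intent_bounded("thanks very much!"))
```

Words that merely begin with a keyword, such as "hiking", would still be classified as a greeting, so this remains a heuristic.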


#### Entity extraction with regex
Now you'll use another simple method, this time for finding a person's name in a sentence such as "hello, my name is David Copperfield".

You'll look for the keywords "name" or "call(ed)", and find capitalized words using regex and assume those are names. Your job in this exercise is to define a find_name() function to do this.

In [19]:
# Define find_name()
def find_name(message):
    name = None
    # Create a pattern for checking if the keywords occur
    name_keyword = re.compile('name|call')
    # Create a pattern for finding capitalized words
    name_pattern = re.compile('[A-Z]{1}[a-z]*')
    if name_keyword.search(message):
        # Get the matching words in the string
        name_words = name_pattern.findall(message)
        if len(name_words) > 0:
            # Return the name if the keywords are present
            name = ' '.join(name_words)
    return name

# Define respond()
def respond(message):
    # Find the name
    name = find_name(message)
    if name is None:
        return "Hi there!"
    else:
        return "Hello, {0}!".format(name)

# Send messages
send_message("my name is David Copperfield")
send_message("call me Ishmael")
send_message("People call me Cassandra")

USER : my name is David Copperfield
BOT : Hello, David Copperfield!
USER : call me Ishmael
BOT : Hello, Ishmael!
USER : People call me Cassandra
BOT : Hello, People Cassandra!
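
The last reply shows the weakness of collecting every capitalized word: "People" is treated as part of the name. A possible refinement, sketched below under the assumption that the name directly follows the keyword, only captures capitalized words that come right after "name is", "call me", or "called". The function name and the exact pattern are illustrative choices, not part of the exercise.

```python
import re

# Keyword phrase, whitespace, then one or more capitalized words
NAME_AFTER_KEYWORD = re.compile(
    r"\b(?:name is|call(?:ed)?(?: me)?)\s+((?:[A-Z][a-z]*\s*)+)"
)

def find_name_after_keyword(message):
    """Return the capitalized words that directly follow a name keyword."""
    match = NAME_AFTER_KEYWORD.search(message)
    if match is None:
        return None
    # Strip the trailing whitespace the repeated group may have consumed
    return match.group(1).strip()

print(find_name_after_keyword("People call me Cassandra"))
print(find_name_after_keyword("my name is David Copperfield"))
```

This trades recall for precision: a message such as "Cassandra is what people call me" would no longer yield a name.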


### Word Vectors

#### Word vectors with spaCy
In this exercise you'll get your first experience with word vectors! You're going to use the ATIS dataset, which contains thousands of sentences from real people interacting with a flight booking system.

The user utterances are available in the list sentences, and the corresponding intents in labels.

Your job is to create a 2D array X with as many rows as there are sentences in the dataset, where each row is a vector describing that sentence.

In [20]:
##ATIS Dataset
sentences = [' i want to fly from boston at 838 am and arrive in denver at 1110 in the morning',
 ' what flights are available from pittsburgh to baltimore on thursday morning',
 ' what is the arrival time in san francisco for the 755 am flight leaving washington',
 ' cheapest airfare from tacoma to orlando',
 ' round trip fares from pittsburgh to philadelphia under 1000 dollars',
 ' i need a flight tomorrow from columbus to minneapolis',
 ' what kind of aircraft is used on a flight from cleveland to dallas',
 ' show me the flights from pittsburgh to los angeles on thursday',
 ' all flights from boston to washington',
 ' what kind of ground transportation is available in denver',
 ' show me the flights from dallas to san francisco',
 ' show me the flights from san diego to newark by way of houston',
 ' what is the cheapest flight from boston to bwi',
 ' all flights to baltimore after 6 pm',
 ' show me the first class fares from boston to denver',
 ' show me the ground transportation in denver',
 ' all flights from denver to pittsburgh leaving after 6 pm and before 7 pm',
 ' i need information on flights for tuesday leaving baltimore for dallas dallas to boston and boston to baltimore',
 ' please give me the flights from boston to pittsburgh on thursday of next week',
 ' i would like to fly from denver to pittsburgh on united airlines',
 ' show me the flights from san diego to newark',
 ' please list all first class flights on united from denver to baltimore',
 ' what kinds of planes are used by american airlines',
 " i'd like to have some information on a ticket from denver to pittsburgh and atlanta",
 " i'd like to book a flight from atlanta to denver",
 ' which airline serves denver pittsburgh and atlanta',
 " show me all flights from boston to pittsburgh on wednesday of next week which leave boston after 2 o'clock pm",
 ' atlanta ground transportation',
 ' i also need service from dallas to boston arriving by noon',
 ' show me the cheapest round trip fare from baltimore to dallas']

labels = ['atis_flight',
 'atis_flight',
 'atis_flight_time',
 'atis_airfare',
 'atis_airfare',
 'atis_flight',
 'atis_aircraft',
 'atis_flight',
 'atis_flight',
 'atis_ground_service',
 'atis_flight',
 'atis_flight',
 'atis_flight',
 'atis_flight',
 'atis_airfare',
 'atis_ground_service',
 'atis_flight',
 'atis_flight',
 'atis_flight',
 'atis_flight',
 'atis_flight',
 'atis_flight',
 'atis_aircraft',
 'atis_airfare',
 'atis_flight',
 'atis_airline',
 'atis_flight',
 'atis_ground_service',
 'atis_flight',
 'atis_airfare']

In [21]:
import spacy
import numpy as np
# Load the spacy model: nlp
nlp = spacy.load('en_md')

# Calculate the length of sentences
n_sentences = len(sentences)

# Calculate the dimensionality of nlp
embedding_dim = nlp.vocab.vectors_length
print(embedding_dim)

# Initialize the array with zeros: X
X = np.zeros((n_sentences, embedding_dim))
X[0:1]

300


array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 

In [22]:
# Iterate over the sentences
for idx, sentence in enumerate(sentences):
    # Pass each sentence to the nlp object to create a document
    doc = nlp(sentence)
    # Save the document's .vector attribute to the corresponding row in X
    X[idx, :] = doc.vector
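
By default, spaCy computes a document's .vector as the average of its token vectors. The toy sketch below reproduces that averaging in plain Python, using a made-up three-dimensional embedding table in place of the 300-dimensional model; the words, numbers, and names are all invented for illustration.

```python
# Hypothetical 3-dimensional word vectors; real spaCy vectors are much longer
toy_vectors = {
    "show": [0.1, 0.3, 0.0],
    "me": [0.0, 0.1, 0.2],
    "flights": [0.9, 0.2, 0.4],
}
embedding_dim = 3
zero_vector = [0.0] * embedding_dim

def sentence_vector(sentence):
    """Average the word vectors of a sentence; unknown words count as zeros."""
    words = sentence.split()
    if not words:
        return list(zero_vector)
    vectors = [toy_vectors.get(word, zero_vector) for word in words]
    # Average each dimension across all words in the sentence
    return [sum(vec[d] for vec in vectors) / len(vectors) for d in range(embedding_dim)]

# One row per sentence, one column per embedding dimension
toy_sentences = ["show me flights", "flights"]
X_toy = [sentence_vector(s) for s in toy_sentences]
print(X_toy)
```

Averaging discards word order, so "boston to denver" and "denver to boston" end up with the same vector.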

In [23]:
# Troubleshooting: print the site-packages directory of the active environment
from distutils.sysconfig import get_python_lib
print(get_python_lib())

/usr1/anaconda3/envs/jupyter_env/lib/python3.7/site-packages


#### Intent classification with sklearn
An array X containing vectors describing each of the sentences in the ATIS dataset has been created for you, along with a 1D array y containing the labels. The labels are integers corresponding to the intents in the dataset. For example, label 0 corresponds to the intent atis_flight.

Now, you'll use the scikit-learn library to train a classifier on this same dataset. Specifically, you will fit and evaluate a support vector classifier.

In [24]:
import pandas as pd
import os
print(os.listdir("data"))

['atis.zip', 'atis.dict.intent.csv', 'atis.dict.slots.csv', 'atis.dict.vocab.csv', 'atis.test.intent.csv', 'atis.test.pkl', 'atis.test.query.csv', 'atis.test.slots.csv', 'atis.train.intent.csv', 'atis.train.pkl', 'atis.train.query.csv', 'atis.train.slots.csv']


In [25]:
#load ATIS dataset from Kaggle - ATIS = Airline Travel Information System
#https://www.kaggle.com/siddhadev/atis-dataset/notebook


import pickle

DATA_DIR="data"

def load_ds(fname='atis.train.pkl'):
    with open(fname, 'rb') as stream:
        ds,dicts = pickle.load(stream)
    print(ds.keys(),dicts.keys())
    print('Done  loading: ', fname)
    print('      samples: {:4d}'.format(len(ds['query'])))
    print('   vocab_size: {:4d}'.format(len(dicts['token_ids'])))
    print('   slot count: {:4d}'.format(len(dicts['slot_ids'])))
    print(' intent count: {:4d}'.format(len(dicts['intent_ids'])))
    return ds,dicts

In [26]:
# Show some examples from dataset
def extract_sentences(ds,dicts):
    train_sentences = []
    df = pd.DataFrame.from_dict(ds)
    
    t2i, s2i, in2i = map(dicts.get, ['token_ids', 'slot_ids','intent_ids'])
    i2t, i2s, i2in = map(lambda d: {d[k]:k for k in d.keys()}, [t2i,s2i,in2i])
    #i2tv, i2sv, i2inv = map(lambda d: {d[k]:k for k in d.values()}, [t2i,s2i,in2i])
    query, slots, intent =  map(ds.get, ['query', 'slot_labels', 'intent_labels'])

    #for i in range(5):
    #for i in range(len(df)):
    big_add = {}
    
    for i,row in df.iterrows():
        curr = {}
        
        curr['intent_class'] = i2in[intent[i][0]]
        curr['intent_val'] = intent[i][0]
        curr['question'] = ' '.join(map(i2t.get, query[i]))
        curr['slot_ids'] = [ i2s[slots[i][j]] for j in range(len(query[i]))]
        curr['slot_label_list'] = [ i2t[query[i][j]] for j in range(len(query[i]))]
        curr['slots'] = [ (i2t[query[i][j]],i2s[slots[i][j]]) for j in range(len(query[i]))]
        
        big_add[i] = curr
        
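        # NOTE: `sentence` is not defined inside this function. It resolves to
        # the global left over from the earlier loop, so every row receives the
        # same vector. Pass curr['question'] instead to embed each query.
        doc = nlp(sentence)
        curr['vector'] = doc.vector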
        #print(len(doc.vector))
        '''
        sentence = '{:4d}:{:>15}: {}'.format(i, i2in[intent[i][0]],' '.join(map(i2t.get, query[i])))
        for j in range(len(query[i])):
            print('{:>33} {:>40}'.format(i2t[query[i][j]],i2s[slots[i][j]]  ))'''
        #print('*'*74)
    add_df = pd.DataFrame.from_dict(big_add,orient='index')
    result = pd.concat([df, add_df], axis=1, sort=False)
    print(result.head())
    
    #X_shape = (len(result),nlp.vocab.vectors_length)
    #X = np.zeros(X_train_shape)
    return result

In [27]:
train_ds, tr_dicts = load_ds(os.path.join(DATA_DIR,'atis.train.pkl'))
train_df = extract_sentences(train_ds, tr_dicts)
test_ds, te_dicts  = load_ds(os.path.join(DATA_DIR,'atis.test.pkl'))
test_df = extract_sentences(test_ds, te_dicts)

dict_keys(['slot_labels', 'query', 'intent_labels']) dict_keys(['token_ids', 'slot_ids', 'intent_ids'])
Done  loading:  data/atis.train.pkl
      samples: 4978
   vocab_size:  943
   slot count:  129
 intent count:   26
                                         slot_labels  \
0  [128, 128, 128, 128, 128, 128, 48, 128, 35, 10...   
1  [128, 128, 128, 128, 128, 128, 48, 128, 78, 12...   
2  [128, 128, 128, 128, 45, 108, 128, 48, 110, 12...   
3              [128, 21, 128, 128, 48, 128, 78, 128]   
4  [128, 66, 119, 128, 128, 48, 128, 78, 21, 38, ...   

                                               query intent_labels  \
0  [178, 479, 902, 851, 431, 444, 266, 240, 168, ...          [14]   
1  [178, 916, 429, 228, 244, 444, 682, 851, 247, ...          [14]   
2  [178, 916, 498, 827, 234, 849, 482, 739, 440, ...          [19]   
3           [178, 296, 197, 444, 810, 851, 667, 179]           [3]   
4  [178, 730, 870, 415, 444, 682, 851, 678, 886, ...           [3]   

  intent_class  intent

In [28]:
train_df.head()

Unnamed: 0,slot_labels,query,intent_labels,intent_class,intent_val,question,slot_ids,slot_label_list,slots,vector
0,"[128, 128, 128, 128, 128, 128, 48, 128, 35, 10...","[178, 479, 902, 851, 431, 444, 266, 240, 168, ...",[14],flight,14,BOS i want to fly from boston at 838 am and ar...,"[O, O, O, O, O, O, B-fromloc.city_name, O, B-d...","[BOS, i, want, to, fly, from, boston, at, 838,...","[(BOS, O), (i, O), (want, O), (to, O), (fly, O...","[-0.0010358343, 0.017182918, -0.041435994, 0.0..."
1,"[128, 128, 128, 128, 128, 128, 48, 128, 78, 12...","[178, 916, 429, 228, 244, 444, 682, 851, 247, ...",[14],flight,14,BOS what flights are available from pittsburgh...,"[O, O, O, O, O, O, B-fromloc.city_name, O, B-t...","[BOS, what, flights, are, available, from, pit...","[(BOS, O), (what, O), (flights, O), (are, O), ...","[-0.0010358343, 0.017182918, -0.041435994, 0.0..."
2,"[128, 128, 128, 128, 45, 108, 128, 48, 110, 12...","[178, 916, 498, 827, 234, 849, 482, 739, 440, ...",[19],flight_time,19,BOS what is the arrival time in san francisco ...,"[O, O, O, O, B-flight_time, I-flight_time, O, ...","[BOS, what, is, the, arrival, time, in, san, f...","[(BOS, O), (what, O), (is, O), (the, O), (arri...","[-0.0010358343, 0.017182918, -0.041435994, 0.0..."
3,"[128, 21, 128, 128, 48, 128, 78, 128]","[178, 296, 197, 444, 810, 851, 667, 179]",[3],airfare,3,BOS cheapest airfare from tacoma to orlando EOS,"[O, B-cost_relative, O, O, B-fromloc.city_name...","[BOS, cheapest, airfare, from, tacoma, to, orl...","[(BOS, O), (cheapest, B-cost_relative), (airfa...","[-0.0010358343, 0.017182918, -0.041435994, 0.0..."
4,"[128, 66, 119, 128, 128, 48, 128, 78, 21, 38, ...","[178, 730, 870, 415, 444, 682, 851, 678, 886, ...",[3],airfare,3,BOS round trip fares from pittsburgh to philad...,"[O, B-round_trip, I-round_trip, O, O, B-fromlo...","[BOS, round, trip, fares, from, pittsburgh, to...","[(BOS, O), (round, B-round_trip), (trip, I-rou...","[-0.0010358343, 0.017182918, -0.041435994, 0.0..."


In [32]:
X_train_shape = (len(train_df), nlp.vocab.vectors_length)
X_train = np.zeros(X_train_shape)
# Embed each training question and store it as a row of X_train
for i, sentence in enumerate(train_df['question'].values):
    X_train[i, :] = nlp(sentence).vector
y_train = train_df['intent_val'].values

X_test_shape = (len(test_df), nlp.vocab.vectors_length)
X_test = np.zeros(X_test_shape)
# Embed each test question and store it as a row of X_test
for i, sentence in enumerate(test_df['question'].values):
    X_test[i, :] = nlp(sentence).vector
y_test = test_df['intent_val'].values

In [34]:
X_train[0],y_train[0],X_test[0],y_test[0]

(array([ 5.16234040e-02,  2.32205659e-01, -3.59605029e-02, -2.82780938e-02,
         2.63509810e-01,  3.42081189e-02, -3.79210524e-02, -2.16884981e-03,
         1.39851958e-01,  1.66678202e+00, -2.65889585e-01, -1.23920456e-01,
         8.70050937e-02, -7.50032812e-02, -1.43252000e-01,  2.13570893e-02,
        -1.14725254e-01,  1.22758055e+00, -8.22240394e-03,  2.58636475e-02,
         7.57709146e-02,  2.04386469e-02,  1.30797978e-02, -8.30395967e-02,
        -5.63978478e-02,  4.81858626e-02, -2.49703020e-01, -1.11223243e-01,
         9.84954089e-02,  1.17382910e-02, -6.74643889e-02,  6.22969046e-02,
         2.26359032e-02,  2.95755357e-01, -7.27919042e-02, -9.55374688e-02,
        -3.22525501e-02,  8.17344189e-02, -7.67499348e-03,  1.68229286e-02,
        -1.44551426e-01,  5.18489480e-02, -1.71957344e-01,  6.66451678e-02,
         1.13357425e-01,  1.89913243e-01, -7.01958537e-02, -5.28154485e-02,
         1.23662297e-02, -4.70820963e-02, -1.54851954e-02,  7.63327032e-02,
        -6.7

In [35]:
# Import SVC
from sklearn.svm import SVC

# Create a support vector classifier
clf = SVC()

# Fit the classifier using the training data
clf.fit(X_train, y_train)

# Predict the labels of the test set
y_pred = clf.predict(X_test)

# Count the number of correct predictions
n_correct = 0
for i in range(len(y_test)):
    if y_pred[i] == y_test[i]:
        n_correct += 1

print("Predicted {0} correctly out of {1} test examples".format(n_correct, len(y_test)))



Predicted 684 correctly out of 893 test examples
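
The count above corresponds to an accuracy of roughly 77% (684 of 893). A single accuracy figure can hide how the classifier does on rarer intents, so it can help to compare it with a majority-class baseline and to look at per-class recall. The sketch below does this in plain Python on a small hand-written pair of label lists; the toy labels are invented and merely stand in for y_test and y_pred.

```python
from collections import Counter

# Hypothetical integer labels standing in for y_test and y_pred
y_true_toy = [14, 14, 14, 3, 3, 19]
y_pred_toy = [14, 14, 3, 3, 14, 14]

def per_class_recall(y_true, y_pred):
    """Fraction of examples of each true class that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Overall accuracy: share of positions where prediction and truth agree
accuracy = sum(t == p for t, p in zip(y_true_toy, y_pred_toy)) / len(y_true_toy)

# Baseline: accuracy obtained by always predicting the most frequent class
majority_baseline = Counter(y_true_toy).most_common(1)[0][1] / len(y_true_toy)

recall = per_class_recall(y_true_toy, y_pred_toy)
print(accuracy, majority_baseline, recall)
```

If the accuracy is close to the baseline, the classifier has learned little beyond the class frequencies.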


#### Using spaCy's entity recogniser
In this exercise you'll use spaCy's built-in entity recogniser to extract names, dates, and organizations from search queries. The spaCy library has been imported for you, and its English model has been loaded as nlp.

Your job is to define a function called extract_entities() which takes in a single argument message and returns a dictionary with the included entity types as keys, and the extracted entities as values. The included entity types are contained in a list called include_entities.

In [36]:
# Define included entities
include_entities = ['DATE', 'ORG', 'PERSON']

# Define extract_entities()
def extract_entities(message):
    # Create a dict to hold the entities
    ents = dict.fromkeys(include_entities)
    # Create a spacy document
    doc = nlp(message)
    for ent in doc.ents:
        if ent.label_ in include_entities:
            # Save interesting entities
            ents[ent.label_] = ent.text
    return ents

print(extract_entities('friends called Mary who have worked at Google since 2010'))
print(extract_entities('people who graduated from MIT in 1999'))

{'DATE': '2010', 'ORG': 'Google', 'PERSON': 'Mary'}
{'DATE': '1999', 'ORG': 'MIT', 'PERSON': None}
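
Note that extract_entities() keeps a single value per entity type, so a second PERSON in the same message overwrites the first. A variant that keeps every match could collect lists instead. The sketch below shows the idea on hand-written (text, label) pairs, which stand in for the text and label_ of each entity in doc.ents so that the example runs without a spaCy model; collect_entities is a hypothetical name.

```python
include_entities = ["DATE", "ORG", "PERSON"]

def collect_entities(labelled_spans):
    """Group (text, label) pairs by label, keeping every match."""
    ents = {label: [] for label in include_entities}
    for text, label in labelled_spans:
        # Labels outside include_entities are ignored
        if label in ents:
            ents[label].append(text)
    return ents

# Hand-written stand-in for [(ent.text, ent.label_) for ent in doc.ents]
spans = [("Mary", "PERSON"), ("John", "PERSON"), ("Google", "ORG"), ("London", "GPE")]
result = collect_entities(spans)
print(result)
```

An empty list then plays the role that None plays in the original function.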


#### Assigning roles using spaCy's parser
In this exercise you'll use spaCy's powerful syntax parser to assign roles to the entities in your users' messages. To do this, you'll define two functions, find_parent_item() and assign_colors(). In doing so, you'll use a parse tree to assign roles, similar to how Alan did in the video.

Recall that you can access the ancestors of a word using its .ancestors attribute.

In [37]:
# Create the document
doc = nlp("let's see that jacket in red and some blue jeans")

colors = ['black', 'red', 'blue']
items = ['shoes', 'handbag', 'jacket', 'jeans']
def entity_type(word):
    _type = None
    if word.text in colors:
        _type = "color"
    elif word.text in items:
        _type = "item"  
    return _type

# Iterate over parents in parse tree until an item entity is found
def find_parent_item(word):
    # Iterate over the word's ancestors
    for parent in word.ancestors:
        # Check for an "item" entity
        if entity_type(parent) == "item":
            return parent.text
    return None

# For all color entities, find their parent item
def assign_colors(doc):
    # Iterate over the document
    for word in doc:
        # Check for "color" entities
        if entity_type(word) == "color":
            # Find the parent
            item =  find_parent_item(word)
            print("item: {0} has color : {1}".format(item, word))

# Assign the colors
assign_colors(doc)


item: jacket has color : red
item: jeans has color : blue
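
To see why walking up the ancestors works, it helps to picture the parse tree as a mapping from each word to its head. The sketch below uses a hand-written heads dictionary for part of the example sentence; these relations are an assumption made for illustration and were not produced by spaCy.

```python
# Hypothetical word -> head relations, written by hand for this example
heads = {
    "red": "in",
    "in": "jacket",
    "blue": "jeans",
    "jeans": "jacket",
    "jacket": "see",
}
item_words = {"shoes", "handbag", "jacket", "jeans"}

def ancestors(word):
    """Yield the chain of heads from a word up to the root."""
    while word in heads:
        word = heads[word]
        yield word

def find_parent_item_toy(word):
    """Return the nearest ancestor that is an item, or None."""
    for parent in ancestors(word):
        if parent in item_words:
            return parent
    return None

print(list(ancestors("red")))
print(find_parent_item_toy("red"), find_parent_item_toy("blue"))
```

Returning the nearest matching ancestor is what attaches "blue" to "jeans" rather than to an item further up the tree.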


### Rasa NLU
In this exercise you'll use Rasa NLU to create an interpreter, which parses incoming user messages and returns a set of entities. Your job is to train an interpreter using the MITIE entity recognition model in Rasa NLU.

In [38]:
%load_ext autoreload
%autoreload
# Make sure you put the mitielib folder into the python search path.  There are
# a lot of ways to do this, here we do it programmatically with the following
# two statements:
import os,sys
parent = os.path.dirname(os.path.realpath('__file__'))
url='/usr1/datascience/DataCampCourse/4_Building Chatbots in Python/supporting/MITIE-master/mitielib/'
sys.path.append(url)
from mitie import *

In [55]:
# Import necessary modules
%load_ext autoreload
%autoreload
from rasa_nlu.converters import load_data
#from rasa_nlu.training_data import load_data

from rasa_nlu.config import RasaNLUConfig
from rasa_nlu.model import Trainer

# Create args dictionary (not used below; these are templates of pipelines)
args = {"pipeline":"spacy_sklearn"}
#https://rasa.com/docs/nlu/0.11.4/pipeline/?highlight=spacy_sklearn
args = {"pipeline":"mitie"}

# Several pipelines were tried here; only the last assignment takes effect
pipeline = ["nlp_spacy", "tokenizer_spacy", "intent_entity_featurizer_regex", 
 "intent_featurizer_spacy", "ner_crf", "ner_synonyms",  "intent_classifier_sklearn"]

pipeline = ["intent_classifier_sklearn"] # "nlp_spacy","ner_synonyms"

# This final pipeline has no intent classifier or entity extractor,
# so parse() returns an empty intent and no entities
pipeline = ["nlp_spacy","ner_synonyms","tokenizer_spacy","intent_entity_featurizer_regex","nlp_spacy",
           "intent_featurizer_spacy"]

# Create a configuration and trainer
#config = RasaNLUConfig(cmdline_args=args)
config = RasaNLUConfig(cmdline_args={"pipeline": pipeline})
trainer = Trainer(config)

# Load the training data
training_data = load_data("./supporting/rasa_nlu-master/datacamp/training_data.json")

# Create an interpreter by training the model
interpreter = trainer.train(training_data)
print(interpreter.parse("I'm looking for a Mexican restaurant in the North of town"))

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
{'intent': {'name': '', 'confidence': 0.0}, 'entities': [], 'text': "I'm looking for a Mexican restaurant in the North of town"}


In [56]:
# Try it out
#print(interpreter.parse("I am looking for a Mexican restaurant in the North of town"))
#print(interpreter.parse("I 'm looking for a Mexican restaurant in the North of town"))
print(interpreter.parse("I m looking for a Mexican restaurant in the North of town"))

{'intent': {'name': '', 'confidence': 0.0}, 'entities': [], 'text': 'I m looking for a Mexican restaurant in the North of town'}


#### Data-efficient entity recognition
Most systems for extracting entities from text are built to extract 'Universal' things like names, dates, and places. But you probably don't have enough training data for your bot to make these systems perform well!

In this exercise, you'll activate the MITIE entity recogniser inside rasa to extract restaurant-related entities using a very small amount of training data. A dictionary args has already been defined for you, along with a training_data object. Note that the cell below uses the ner_crf extractor in place of MITIE, as the 'extractor' field of each result shows.

In [57]:
# Import necessary modules
from rasa_nlu.config import RasaNLUConfig
from rasa_nlu.model import Trainer

pipeline = [
    "nlp_spacy",
    "tokenizer_spacy",
    "ner_crf"
]

# Create a config that uses this pipeline
config1 = RasaNLUConfig(cmdline_args={"pipeline": pipeline})

# Create a trainer that uses this config
trainer1 = Trainer(config1)

# Create an interpreter by training the model
interpreter1 = trainer1.train(training_data)

# Parse some messages
print(interpreter1.parse("show me Chinese food in the centre of town"))
print(interpreter1.parse("I want an Indian restaurant in the west"))
print(interpreter1.parse("are there any good pizza places in the center?"))

{'intent': {'name': '', 'confidence': 0.0}, 'entities': [{'start': 28, 'end': 34, 'value': 'centre', 'entity': 'location', 'extractor': 'ner_crf'}], 'text': 'show me Chinese food in the centre of town'}
{'intent': {'name': '', 'confidence': 0.0}, 'entities': [{'start': 10, 'end': 16, 'value': 'indian', 'entity': 'cuisine', 'extractor': 'ner_crf'}, {'start': 35, 'end': 39, 'value': 'west', 'entity': 'location', 'extractor': 'ner_crf'}], 'text': 'I want an Indian restaurant in the west'}
{'intent': {'name': '', 'confidence': 0.0}, 'entities': [{'start': 39, 'end': 45, 'value': 'center', 'entity': 'location', 'extractor': 'ner_crf'}], 'text': 'are there any good pizza places in the center?'}
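
Each parse result is a plain dictionary, so its entities can be read with ordinary dictionary access. The following is a minimal sketch that copies the second result above as a literal, so it runs without Rasa installed; `entities_to_params` is a hypothetical helper name, not part of Rasa NLU.

```python
# One parse result, copied from the output above (no Rasa install needed)
parsed = {
    "intent": {"name": "", "confidence": 0.0},
    "entities": [
        {"start": 10, "end": 16, "value": "indian",
         "entity": "cuisine", "extractor": "ner_crf"},
        {"start": 35, "end": 39, "value": "west",
         "entity": "location", "extractor": "ner_crf"},
    ],
    "text": "I want an Indian restaurant in the west",
}

def entities_to_params(parsed):
    """Map each entity type to its extracted value (hypothetical helper)."""
    return {ent["entity"]: ent["value"] for ent in parsed["entities"]}

params = entities_to_params(parsed)
print(params)

# "start" and "end" are character offsets into the original text
spans = [parsed["text"][ent["start"]:ent["end"]] for ent in parsed["entities"]]
print(spans)
```

In this result the extracted `value` is lower-cased ("indian"), while slicing the text by offset keeps the original casing ("Indian").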


## Chapter 3 : Building a virtual assistant

In this chapter, you're going to build a personal assistant to help you plan a trip. It will be able to respond to questions like "are there any cheap hotels in the north of town?" by looking inside a hotels database for matching results.

In [59]:
import sqlite3

### Virtual Assistants and accessing data

#### SQL basics
Time to begin writing queries for your first hotel booking chatbot! The database has been loaded as "hotels.db", and a cursor which has access to the database has already been defined for you as cursor.

Three queries are provided below. Your job is to identify which query returns ONLY the "Hotel California".

You can test each query below by calling the cursor's .execute() method and passing the query in as a string. Then, you can print the results by calling the cursor's .fetchall() method, which takes no arguments.
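
The `.execute()` / `.fetchall()` pattern can be tried without the course database. This is a minimal sketch using an in-memory SQLite table; the column names and all row values except the "Cozy Cottage" row are illustrative assumptions, not the real contents of hotels.db.

```python
import sqlite3

# In-memory stand-in for hotels.db (rows and column names are assumptions)
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE hotels (name TEXT, price TEXT, area TEXT, stars INTEGER)"
)
cursor.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?, ?)",
    [
        ("Hotel for Dogs", "mid", "east", 3),
        ("Hotel California", "mid", "north", 3),
        ("Grand Hotel", "hi", "south", 5),
        ("Cozy Cottage", "lo", "south", 2),
    ],
)

# Run a query, then fetch every matching row as a list of tuples
cursor.execute("SELECT name FROM hotels WHERE area = 'north' AND price = 'mid'")
rows = cursor.fetchall()
print(rows)
conn.close()
```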

#### SQL statements in Python
It's time to begin writing SQL queries! In this exercise, your job is to run a query against the hotels database to find all the expensive hotels in the south. The connection to the database has been created for you, along with a cursor c.

As Alan described in the video, you should be careful about SQL injection. Here, you'll pass parameters the safe way: as an extra tuple argument to the .execute() method. This ensures malicious code can't be injected into your query.
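
To make the difference concrete, here is a small sketch that contrasts a parameterized query with one built by string formatting. The in-memory table and its rows are illustrative assumptions, not the contents of the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE hotels (name TEXT, price TEXT, area TEXT, stars INTEGER)")
c.executemany("INSERT INTO hotels VALUES (?, ?, ?, ?)", [
    ("Grand Hotel", "hi", "south", 5),
    ("Cozy Cottage", "lo", "south", 2),
    ("Hotel California", "mid", "north", 3),
])

# Safe: the values travel separately from the SQL text
area, price = "south", "hi"
c.execute("SELECT name FROM hotels WHERE area=? AND price=?", (area, price))
safe_rows = c.fetchall()

# A malicious value is treated as plain data, so it matches nothing
evil = "south' OR '1'='1"
c.execute("SELECT name FROM hotels WHERE area=?", (evil,))
evil_rows = c.fetchall()

# Unsafe, for contrast: string formatting lets the same value rewrite the query
c.execute("SELECT name FROM hotels WHERE area='{}'".format(evil))
leaked_rows = c.fetchall()
conn.close()

print(safe_rows, evil_rows, len(leaked_rows))
```

With the formatted query, the injected `OR '1'='1'` clause makes the filter always true, so every row in the table is returned.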

### Exploring a DB with natural language

#### Creating queries from parameters
Now you're going to implement a more powerful function for querying the hotels database. The goal is to take arguments that can later be specified by other parts of your code.

Specifically, your job here is to define a find_hotels() function which takes a single argument - a dictionary of column names and values - and returns a list of matching hotels from the database.

In [60]:
# Define find_hotels()
def find_hotels(params):
    # Create the base query
    query = 'SELECT * FROM hotels'
    # Add filter clauses for each of the parameters
    if len(params) > 0:
        filters = ["{}=?".format(k) for k in params]
        query += " WHERE " + " and ".join(filters)
    # Create the tuple of values
    #t = tuple([v for k,v in params.items()])
    t = tuple(params.values())
    print(query, t)
    # Open connection to DB
    conn = sqlite3.connect("./data/hotels.db")
    # Create a cursor
    c = conn.cursor()
    # Execute the query
    c.execute(query, t)
    # Return the results
    return c.fetchall()
    
find_hotels({'area':'south','price':'mid'})

SELECT * FROM hotels WHERE area=? and price=? ('south', 'mid')


[]

#### Using your custom function to find hotels
Here, you're going to put your find_hotels() function into action! Recall that it accepts a single argument, params, which is a dictionary of column names and values.

In [61]:
# Create the dictionary of column names and values
params = {'area':'south','price':'lo'}

# Find the hotels that match the parameters
print(find_hotels(params))

SELECT * FROM hotels WHERE area=? and price=? ('south', 'lo')
[('Cozy Cottage', 'lo', 'south', 2)]


#### Creating SQL from natural language
Now you'll write a respond() function which can handle messages like "I want an expensive hotel in the south of town" and respond appropriately according to the number of matching results in a database. This is important functionality for any database-backed chatbot.

Your find_hotels() function from the previous exercises has already been defined for you, along with a rasa NLU interpreter object which can handle hotel queries and a list of responses, which you can explore in the Shell.

In [68]:
responses = ["I'm sorry :( I couldn't find anything like that",
 '{} is a great hotel!',
 '{} or {} would work!',
 '{} is one option, but I know others too :)']


# Define respond()
def respond(message):
    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    # Initialize an empty params dictionary
    params = {}
    # Fill the dictionary with entities
    for ent in entities:
        params[ent["entity"]] = str(ent["value"])

    # Find hotels that match the dictionary
    results = find_hotels(params)
    print(len(results))
    # Get the names of the hotels and index of the response
    names = [r[0] for r in results]
    
    n = min(len(results),3)
    
    # Select the nth element of the responses array
    return responses[n].format(*names)
    
respond("I want an expensive hotel in the south of town")

SELECT * FROM hotels ()
7


'Hotel for Dogs is one option, but I know others too :)'
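
The response-selection step can be tried on its own, without a database or an interpreter. This is a minimal sketch; `choose_response` is a hypothetical helper that isolates the last lines of respond().

```python
responses = ["I'm sorry :( I couldn't find anything like that",
             '{} is a great hotel!',
             '{} or {} would work!',
             '{} is one option, but I know others too :)']

def choose_response(names):
    """Pick a template by result count (capped at 3) and fill in the names."""
    n = min(len(names), 3)
    # str.format() ignores unused positional arguments, so extra names are dropped
    return responses[n].format(*names)

print(choose_response([]))
print(choose_response(["Cozy Cottage"]))
print(choose_response(["Cozy Cottage", "Grand Hotel"]))
print(choose_response(["Hotel for Dogs", "Grand Hotel", "Cozy Cottage"]))
```

This is why the query above, which matched 7 hotels, produces a reply that names only the first one.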

### Incremental slot filling and negation

#### Refining your search
Now you'll write a bot that allows users to add filters incrementally, in case they don't specify all of their preferences in one message.

To do this, initialize an empty dictionary params outside of your respond() function (rather than inside the function, as in the previous exercise). Your respond() function will take in this dictionary as an argument.

In [69]:
# Define a respond function, taking the message and existing params as input
def respond(message, params):
    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    # Fill the dictionary with entities
    for ent in entities:
        params[ent["entity"]] = str(ent["value"])

    # Find the hotels
    results = find_hotels(params)
    names = [r[0] for r in results]
    n = min(len(results), 3)
    # Return the appropriate response
    return responses[n].format(*names), params

# Initialize params dictionary
params = {}

# Pass the messages to the bot
for message in ["I want an expensive hotel", "in the north of town"]:
    print("USER: {}".format(message))
    response, params = respond(message, params)
    print("BOT: {}".format(response))

USER: I want an expensive hotel
SELECT * FROM hotels ()
BOT: Hotel for Dogs is one option, but I know others too :)
USER: in the north of town
SELECT * FROM hotels ()
BOT: Hotel for Dogs is one option, but I know others too :)


#### Basic negation
Quite often you'll find your users telling you what they don't want - and that's important to understand! In general, negation is a difficult problem in NLP. Here we'll take a very simple approach that works for many cases.

A list of tests called tests has been defined for you. Explore it in the Shell - you'll find that each test is a tuple consisting of:

- A string containing a message with entities
- A dictionary containing the entities as keys, and a Boolean as each value: True if the entity is wanted, False if it is negated

Your job is to define a function called negated_ents() which looks for negated entities in a message.

In [75]:
tests = [("no I don't want to be in the south", {'south': False}),
 ('no it should be in the south', {'south': True}),
 ('no in the south not the north', {'north': False, 'south': True}),
 ('not north', {'north': False})]

# Define negated_ents()
def negated_ents(phrase,ents):
    # Extract the entities using keyword matching
    if len(ents) == 0:
        ents = [e for e in ["south", "north"] if e in phrase]
    # Find the index of the final character of each entity
    ends = sorted([phrase.index(e) +len(e) for e in ents])
    # Initialise a list to store sentence chunks
    chunks = []
    # Take slices of the sentence up to and including each entity
    start = 0
    for end in ends:
        chunks.append(phrase[start:end])
        start = end
    result = {}
    # Iterate over the chunks and look for entities
    for chunk in chunks:
        for ent in ents:
            if ent in chunk:
                # If the entity is preceded by a negation, give it the value False
                if "not" in chunk or "n't" in chunk:
                    result[ent] = False
                else:
                    result[ent] = True
    return result  

# Check that the entities are correctly assigned as True or False
for test in tests:
    print(negated_ents(test[0],[]) == test[1])


True
True
True
True


#### Filtering with excluded slots
Now you're going to put together some of the ideas from previous exercises, and allow users to tell your bot about what they do and what they don't want, split across multiple messages.

The negated_ents() function has already been defined for you. Additionally, a slightly tweaked version of the find_hotels() function, which accepts a neg_params dictionary in addition to a params dictionary, has been defined.

In [77]:
# Define find_hotels()
def find_hotels(params,neg_params):
    # Create the base query
    query = 'SELECT * FROM hotels'
    # Add "=" filters for the wanted values and "!=" filters for the excluded ones
    filters = ["{}=?".format(k) for k in params] + ["{}!=?".format(k) for k in neg_params]
    if len(filters) > 0:
        query += " WHERE " + " and ".join(filters)
    # Create the tuple of values, in the same order as the placeholders
    t = tuple(params.values()) + tuple(neg_params.values())
    print(query, t)
    # Open connection to DB
    conn = sqlite3.connect("./data/hotels.db")
    # Create a cursor
    c = conn.cursor()
    # Execute the query
    c.execute(query, t)
    # Return the results
    return c.fetchall()

# Define the respond function
def respond(message,params,neg_params):
    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    ent_vals = [e["value"] for e in entities]
    # Look for negated entities (negated_ents() maps a negated entity to False)
    negated = negated_ents(message,ent_vals)
    for ent in entities:
        if ent["value"] in negated and not negated[ent["value"]]:
            neg_params[ent["entity"]] = str(ent["value"])
        else:
            params[ent["entity"]] = str(ent["value"])
    # Find the hotels
    results = find_hotels(params,neg_params)
    names = [r[0] for r in results]
    n = min(len(results),3)
    # Return the correct response
    return responses[n].format(*names), params, neg_params

# Initialize params and neg_params
params = {}
neg_params = {}

# Pass the messages to the bot
for message in ["I want a cheap hotel", "but not in the north of town"]:
    print("USER: {}".format(message))
    response, params, neg_params = respond(message, params, neg_params)
    print("BOT: {}".format(response))


USER: I want a cheap hotel
SELECT * FROM hotels ()
BOT: Hotel for Dogs is one option, but I know others too :)
USER: but not in the north of town
SELECT * FROM hotels ()
BOT: Hotel for Dogs is one option, but I know others too :)
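
The recorded run above shows unfiltered queries because the interpreter returned no entities. To see what excluded slots do when entities are found, here is a self-contained sketch with an in-memory table. `build_query` is a hypothetical helper, and the rows (including "Budget Inn") are illustrative assumptions.

```python
import sqlite3

def build_query(params, neg_params):
    """Return (sql, values): '=' filters for params, '!=' filters for neg_params."""
    # Column names are interpolated into the SQL, so they must come from
    # trusted code (here, entity names), never directly from user text
    filters = ["{}=?".format(k) for k in params]
    filters += ["{}!=?".format(k) for k in neg_params]
    query = "SELECT * FROM hotels"
    if filters:
        query += " WHERE " + " AND ".join(filters)
    # Values must follow the same order as the placeholders
    return query, tuple(params.values()) + tuple(neg_params.values())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (name TEXT, price TEXT, area TEXT, stars INTEGER)")
conn.executemany("INSERT INTO hotels VALUES (?, ?, ?, ?)", [
    ("Cozy Cottage", "lo", "south", 2),
    ("Budget Inn", "lo", "north", 1),   # hypothetical row
    ("Grand Hotel", "hi", "north", 5),
])

# "I want a cheap hotel" + "but not in the north of town"
query, values = build_query({"price": "lo"}, {"area": "north"})
print(query, values)
names = [row[0] for row in conn.execute(query, values)]
print(names)
conn.close()
```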


## Chapter 4 : Dialogue

Everything you've built so far has statelessly mapped intents to actions & responses. It's amazing how far you can get with that! But to build more sophisticated bots you will always want to add some statefulness. That's what you'll do here, as you build a chatbot that helps users order coffee. Have fun!


### Why statefulness is key

#### Form filling
You'll often want your bot to guide users through a series of steps, such as when they're placing an order.

In this exercise, you'll begin building a bot that lets users order coffee. They can choose between two types: Colombian and Kenyan. If the user provides unexpected input, your bot will handle this differently depending on where they are in the flow.

Your job here is to identify the appropriate state and next state based on the intents and response messages provided. For example, if the intent is "order", then the state changes from INIT to CHOOSE_COFFEE.

A function send_message(policy, state, message) has already been defined for you. It takes the policy, the current state and message as arguments, and returns the new state as a result. Additionally, an interpret(message) function, similar to the one Alan described in the video, has been pre-defined for you.


In [80]:
def send_message(policy, state, message):
    print("USER : {}".format(message))
    new_state, response = respond(policy, state, message)
    print("BOT : {}".format(response))
    return new_state

def interpret(message):
    msg = message.lower()
    if 'order' in msg:
        return 'order'
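    # Accept both "Colombian" and the misspelling "Columbian" used in the bot's prompt
    if 'kenyan' in msg or 'colombian' in msg or 'columbian' in msg: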
        return 'specify_coffee'
    return 'none'

def respond(policy, state, message): 
    (new_state, response) = policy[(state, interpret(message))] 
    return new_state, response


# Define the INIT state
INIT = 0

# Define the CHOOSE_COFFEE state
CHOOSE_COFFEE = 1

# Define the ORDERED state
ORDERED = 2

# Define the policy rules
policy = {
    (INIT, "order"): (CHOOSE_COFFEE, "ok, Columbian or Kenyan?"),
    (INIT, "none"): (INIT, "I'm sorry - I'm not sure how to help you"),
    (CHOOSE_COFFEE, "specify_coffee"): (ORDERED, "perfect, the beans are on their way!"),
    (CHOOSE_COFFEE, "none"): (CHOOSE_COFFEE, "I'm sorry - would you like Colombian or Kenyan?"),
}

# Create the list of messages
messages = [
    "I'd like to become a professional dancer",
    "well then I'd like to order some coffee",
    "my favourite animal is a zebra",
    "kenyan"
]

# Call send_message() for each message
state = INIT
for message in messages:    
    state = send_message(policy, state, message)

USER : I'd like to become a professional dancer
BOT : I'm sorry - I'm not sure how to help you
USER : well then I'd like to order some coffee
BOT : ok, Columbian or Kenyan?
USER : my favourite animal is a zebra
BOT : I'm sorry - would you like Colombian or Kenyan?
USER : kenyan
BOT : perfect, the beans are on their way!
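
The policy dictionary above has no entry for some (state, intent) pairs, for example any message received once the state is ORDERED, so a direct `policy[...]` lookup would raise a KeyError. One possible way to handle this is sketched below. `respond_safely` and its fallback message are hypothetical additions, not part of the exercise.

```python
INIT, CHOOSE_COFFEE, ORDERED = 0, 1, 2

policy = {
    (INIT, "order"): (CHOOSE_COFFEE, "ok, Columbian or Kenyan?"),
    (INIT, "none"): (INIT, "I'm sorry - I'm not sure how to help you"),
    (CHOOSE_COFFEE, "specify_coffee"): (ORDERED, "perfect, the beans are on their way!"),
    (CHOOSE_COFFEE, "none"): (CHOOSE_COFFEE, "I'm sorry - would you like Colombian or Kenyan?"),
}

def respond_safely(policy, state, intent):
    """Look up (state, intent); stay in the current state if no rule exists."""
    fallback = (state, "I'm sorry - I didn't understand that")
    return policy.get((state, intent), fallback)

print(respond_safely(policy, INIT, "order"))
print(respond_safely(policy, ORDERED, "order"))
```

Staying in the current state on an unknown pair keeps the conversation recoverable instead of crashing the bot.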


#### Asking contextual questions
Sometimes your users need some help! They will have questions and expect the bot to help them.

In this exercise, you'll allow users to ask the coffee bot to explain the steps to them. Like before, the answer they get will depend on where they are in the flow.

In [89]:
# Define the states
INIT=0 
CHOOSE_COFFEE=1
ORDERED=2

# Define the policy rules dictionary
policy_rules = {
    (INIT, "ask_explanation"): (INIT, "I'm a bot to help you order coffee beans"),
    (INIT, "none"): (INIT, "I'm a bot to help you order coffee beans"),
    (INIT, "order"): (CHOOSE_COFFEE, "ok, Columbian or Kenyan?"),
    (CHOOSE_COFFEE, "none"): (CHOOSE_COFFEE, "ok, Columbian or Kenyan?"),
    (CHOOSE_COFFEE, "specify_coffee"): (ORDERED, "perfect, the beans are on their way!"),
    (CHOOSE_COFFEE, "ask_explanation"): (CHOOSE_COFFEE, "We have two kinds of coffee beans - the Kenyan ones make a slightly sweeter coffee, and cost $6. The Brazilian beans make a nutty coffee and cost $5.")    
}

def respond(state, message):
    print(state, message)
    (new_state, response) = policy_rules[(state, interpret(message))]
    return new_state, response


def send_message(state, message):
    print("USER : {}".format(message))
    new_state, response = respond(state, message)
    print("BOT : {}".format(response))
    return new_state

# Define send_messages()
def send_messages(messages):
    state = INIT
    for msg in messages:
        state = send_message(state, msg)
        print(state)

# Send the messages
send_messages([
    "what can you do for me?",
    "well then I'd like to order some coffee",
    "what do you mean by that?",
    "kenyan"
])


USER : what can you do for me?
0 what can you do for me?
BOT : I'm a bot to help you order coffee beans
0
USER : well then I'd like to order some coffee
0 well then I'd like to order some coffee
BOT : ok, Columbian or Kenyan?
1
USER : what do you mean by that?
1 what do you mean by that?
BOT : ok, Columbian or Kenyan?
1
USER : kenyan
1 kenyan
BOT : perfect, the beans are on their way!
2


#### Dealing with rejection
What happens if you make a suggestion to your user, and they don't like it? Your bot will look really silly if it makes the same suggestion again right away.

Here, you're going to modify your respond() function so that it accepts four arguments and returns four values:

- The user message as an argument, and the bot response as the first return value.
- A dictionary params including the entities the user has specified.
- A suggestions list. When passed to respond(), this should contain the suggestions made in the previous bot message. When returned by respond(), it should contain the current suggestions.
- An excluded list, which contains all of the results your user has already explicitly rejected.

Your function should add the previous suggestions to the excluded list whenever it receives a "deny" intent. It should also filter out excluded suggestions from the response.

In [94]:
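import re

def interpret(message):
    data = interpreter.parse(message)
    #print(data)
    # Match "no" as a whole word, so that words like "north" are not read as a denial
    if re.search(r"\bno\b", message.lower()):
        data["intent"]["name"] = "deny"
    return data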

# Define respond()
def respond(message,params,prev_suggestions,excluded):
    # Interpret the message
    parse_data = interpret(message)
    # Extract the intent
    intent = parse_data["intent"]["name"]
    # Extract the entities
    entities = parse_data["entities"]
    # Add the suggestion to the excluded list if intent is "deny"
    if intent == "deny":
        excluded.extend(prev_suggestions)
    # Fill the dictionary with entities
    for ent in entities:
        params[ent["entity"]] = str(ent["value"])
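    # Find matching hotels. find_hotels() expects a dictionary of negated
    # column filters as its second argument, so pass an empty one and
    # drop the rejected hotels by name instead
    results = [
        r 
        for r in find_hotels(params, {}) 
        if r[0] not in excluded
    ]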
    # Extract the suggestions
    names = [r[0] for r in results]
    n = min(len(results), 3)
    suggestions = names[:2]
    return responses[n].format(*names), params, suggestions, excluded

# Initialize the empty dictionary and lists
params, suggestions, excluded = {}, [], []

# Send the messages
for message in ["I want a mid range hotel", "no that doesn't work for me"]:
    print("USER: {}".format(message))
    response, params, suggestions, excluded = respond(message, params, suggestions, excluded)
    print("BOT: {}".format(response))
    print("*"*8)


USER: I want a mid range hotel
SELECT * FROM hotels ()
BOT: Hotel for Dogs is one option, but I know others too :)
********
USER: no that doesn't work for me
SELECT * FROM hotels ()
BOT: Grand Hotel is one option, but I know others too :)
********


### Asking questions & queuing answers

#### Pending actions I
You can really improve the user experience of your bot by asking users simple yes-or-no questions. One easy way to handle these follow-ups is to define pending actions which get executed as soon as the user says "yes", and wiped if the user says "no".

In this exercise, you're going to define a policy function which takes the intent as its sole argument, and returns two values: the next action to take, and a pending action. The policy function should return "do_pending" as the action when it receives an "affirm" intent, so that the pending action is executed, and should wipe the pending action when it receives a "deny" intent.

Here, the interpret(message) function has been defined for you such that if "yes" is in the message, "affirm" is returned, and if "no" is in the message, then "deny" is returned.

In [95]:
# Define policy()
def policy(intent):
    # Return "do_pending" if the intent is "affirm"
    if intent == "affirm":
        return "do_pending", None
    # Return "Ok" if the intent is "deny"
    if intent == "deny":
        return "Ok", None
    if intent == "order":
        # Return the question as the action, and the confirmation as the pending action
        return ("Unfortunately, the Kenyan coffee is currently out of stock, "
                "would you like to order the Brazilian beans?",
                "Alright, I've ordered that for you!")

#### Pending actions II
Having defined your policy function, it's now time to write a send_message() function which takes both a pending action and a message as its arguments and leverages the policy function to determine the bot's response.

Your policy(intent) function from the previous exercise has been pre-loaded.

In [97]:
def interpret(message):
    msg = message.lower()
    if 'order' in msg:
        return 'order'
    elif 'yes' in msg:
        return 'affirm'
    elif 'no' in msg:
        return 'deny'
    else:
        return 'none'

# Define send_message()
def send_message(pending, message):
    print("USER : {}".format(message))
    action, pending_action = policy(interpret(message))
    if action == "do_pending" and pending is not None:
        print("BOT : {}".format(pending))
    else:
        print("BOT : {}".format(action))
    return pending_action
    
# Define send_messages()
def send_messages(messages):
    pending = None
    for msg in messages:
        pending = send_message(pending,msg)

# Send the messages
send_messages([
    "I'd like to order some coffee",
    "ok yes please"
])

USER : I'd like to order some coffee
BOT : Unfortunately, the Kenyan coffee is currently out of stock, would you like to order the Brazilian beans?
USER : ok yes please
BOT : Alright, I've ordered that for you!


#### Pending state transitions
You'll often need to briefly deviate from a flow, for example to authenticate a user, before returning.

In these cases, it's often simpler - and easier to debug - to save some actions/states as pending rather than adding ever more complicated rules.

Here, you're going to define a policy_rules dictionary, where the keys are tuples of the current state and the received intent, while the values are tuples of the next state, the bot's response, and a state for which to set a pending transition.

In [103]:
# Define the states
INIT=0
AUTHED=1
CHOOSE_COFFEE=2
ORDERED=3

# Define the policy rules
policy_rules = {
    (INIT, "order"): (AUTHED, "you'll have to log in first, what's your phone number?", AUTHED),
    (AUTHED, "number"): (AUTHED, "perfect, welcome back!", None),
    (AUTHED, "none"): (AUTHED, "perfect, welcome back!", None),
    (AUTHED, "order"): (CHOOSE_COFFEE, "would you like Columbian or Kenyan?", None),    
    (CHOOSE_COFFEE, "specify_coffee"): (ORDERED, "perfect, the beans are on their way!", None),
    (CHOOSE_COFFEE, "none"): (ORDERED, "perfect, the beans are on their way!", None)
}

def send_message(state, pending, message):
    print("USER : {}".format(message))
    new_state, response, pending_state = policy_rules[(state, interpret(message))]
    print("BOT : {}".format(response))
    if pending is not None:
        new_state, response, pending_state = policy_rules[pending]
        print("BOT : {}".format(response))
        # Clear the pending transition once it has been executed
        pending = None
    if pending_state is not None:
        pending = (pending_state, interpret(message))
    return new_state, pending
        
# Define send_messages()
def send_messages(messages):
    state = INIT
    pending = None
    for msg in messages:
        state, pending = send_message(state, pending, msg)

# Send the messages
send_messages([
    "I'd like to order some coffee",
    "555-12345",
    "kenyan"
])


USER : I'd like to order some coffee
BOT : you'll have to log in first, what's your phone number?
USER : 555-12345
BOT : perfect, welcome back!
BOT : would you like Columbian or Kenyan?
USER : kenyan
BOT : perfect, the beans are on their way!


#### Putting it all together I
It's time to put together everything you've learned in the course by combining the coffee ordering bot with the eliza rules from chapter 1.

To begin, you'll define a function called chitchat_response(), which calls the predefined function match_rule() from back in chapter 1. This returns a response if the message matched an eliza template, and otherwise, None.

The eliza rules are contained in a dictionary called eliza_rules.

In [104]:
eliza_rules = {'I want (.*)': ['What would it mean if you got {0}',
  'Why do you want {0}',
  "What's stopping you from getting {0}"],
 'do you remember (.*)': ['Did you think I would forget {0}',
  "Why haven't you been able to forget {0}",
  'What about {0}',
  'Yes .. and?'],
 'do you think (.*)': ['if {0}? Absolutely.', 'No chance'],
 'if (.*)': ["Do you really think it's likely that {0}",
  'Do you wish that {0}',
  'What do you think about {0}',
  'Really--if {0}']}

# Define chitchat_response()
def chitchat_response(message):
    # Call match_rule()
    response, phrase = match_rule(eliza_rules,message)
    # Return None if the response is "default"
    if response == "default":
        return None
    if '{0}' in response:
        # Replace the pronouns of phrase
        phrase = replace_pronouns(phrase)
        # Calculate the response
        response = response.format(phrase)
    return response
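
chitchat_response() relies on match_rule() and replace_pronouns() from chapter 1, which are not shown in this part of the notes. The sketch below is one plausible way they could look, assuming match_rule() returns the string "default" when nothing matches; it is an illustration, not the course's exact code.

```python
import random
import re

def match_rule(rules, message):
    """Return (response, phrase); response is "default" when nothing matches."""
    response, phrase = "default", None
    for pattern, responses in rules.items():
        match = re.search(pattern, message)
        if match is not None:
            # Pick one of the templates for the first matching pattern
            response = random.choice(responses)
            if "{0}" in response:
                phrase = match.group(1)
            break
    return response, phrase

def replace_pronouns(phrase):
    """Swap first- and second-person words so the reply reads naturally."""
    swaps = {"i": "you", "me": "you", "my": "your", "you": "me", "your": "my"}
    return " ".join(swaps.get(word, word) for word in phrase.lower().split())

# A single-template rule keeps this demo deterministic
rules = {"do you think (.*)": ["if {0}? Absolutely."]}
response, phrase = match_rule(rules, "do you think I should order more")
reply = response.format(replace_pronouns(phrase))
print(reply)
print(match_rule(rules, "kenyan"))
```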


#### Putting it all together II
With your chitchat_response(message) function defined, the next step is to define a send_message() function which first calls chitchat_response(message), and only uses the coffee bot policy if there is no matching message.

In [105]:
# Define send_message()
def send_message(state,pending,message):
    print("USER : {}".format(message))
    response = chitchat_response(message)
    if response is not None:
        print("BOT : {}".format(response))
        # Chit-chat does not advance the flow, so keep any pending transition
        return state, pending
    
    # Calculate the new_state, response, and pending_state
    new_state, response, pending_state = policy_rules[(state, interpret(message))]
    print("BOT : {}".format(response))
    if pending is not None:
        new_state, response, pending_state = policy_rules[pending]
        print("BOT : {}".format(response))
        # Clear the pending transition once it has been executed
        pending = None
    if pending_state is not None:
        pending = (pending_state, interpret(message))
    return new_state, pending

# Define send_messages()
def send_messages(messages):
    state = INIT
    pending = None
    for msg in messages:
        state, pending = send_message(state, pending, msg)

# Send the messages
send_messages([
    "I'd like to order some coffee",
    "555-12345",
    "do you remember when I ordered 1000 kilos by accident?",
    "kenyan"
])  

USER : I'd like to order some coffee
BOT : you'll have to log in first, what's your phone number?
USER : 555-12345
BOT : perfect, welcome back!
BOT : would you like Columbian or Kenyan?
USER : do you remember when I ordered 1000 kilos by accident?
BOT : Yes .. and?
USER : kenyan
BOT : perfect, the beans are on their way!


### Frontiers of dialogue research

#### Generating text with neural networks
In this final exercise of the course, you're going to generate text using a neural network trained on the scripts of every episode of The Simpsons. Specifically, you'll use a simplified version of the sample_text() function that Alan described in the video.

It takes two arguments, seed and temperature. The seed argument is the initial sequence that the network uses to generate the subsequent text, while the temperature argument controls how risky the network is when generating text. At very low temperatures, it just repeats the most common combinations of letters, and at very high temperatures it generates complete gibberish. In order to ensure fast runtimes, the network in this exercise will only work for a subset of temperature values.
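
The internals of sample_text() are not shown here. A common way to implement temperature, sketched below as an assumption rather than the course's actual code, is to raise each predicted probability to the power 1/temperature and renormalise before sampling. The `probs` values are hypothetical next-character probabilities.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a distribution: p_i ** (1 / temperature), then renormalise."""
    scaled = [math.exp(math.log(p) / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

probs = [0.6, 0.3, 0.1]  # hypothetical next-character probabilities
for t in [0.2, 0.5, 1.0, 1.2]:
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

Low temperatures concentrate almost all the probability on the most likely character, which leads to repetitive text. A temperature of 1.0 leaves the distribution unchanged, and values above 1.0 flatten it, so unlikely characters are sampled more often.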

After you finish this exercise, be sure to check out this tutorial by Alan in which he walks you through how to connect a chatbot to Facebook Messenger!

In [106]:
generated = {0.2: "i'm gonna punch lenny in the back of the been a to the on the man to the mother and the father to simpson the father to with the marge in the for the like the fame to the been to the for my bart the don't was in the like the for the father the father a was the father been a say the been to me the do it and the father been to go. i want to the boy i can the from a man to be the for the been a like the father to make my bart of the father",
 0.5: "i'm gonna punch lenny in the back of the kin't she change and i'm all better it and the was the fad a drivera it? what i want to did hey, he would you would in your bus who know is the like and this don't are for your this all for your manset the for it a man is on the see the will they want to know i'm are for one start of that and i got the better this is. it whoce and i don't are on the mater stop in the from a for the be your mileat",
 1.0: "i'm gonna punch lenny in the back of the to to macks how screath. firl done we wouldn't wil that kill. of this torshmobote since, i know i ord did, can give crika of sintenn prescoam.whover my me after may? there's right. that up. there's ruining isay.oh.solls.nan'h those off point chuncing car your anal medion.hey, are exallies a off while bea dolk of sure, hello, no in her, we'll rundems... i'm eventy taving me to too the letberngonce",
 1.2: "i'm gonna punch lenny in the back of the burear prespe-nakes, 'lisa to isn't that godios.and when be the bowniday' would lochs meine, mind crikvin' suhle ovotaci!..... hey, a poielyfd othe flancer, this in are rightplouten of of we doll hurrs, truelturone? rake inswaydan justy!we scrikent.ow.. by back hous, smadge, the lighel irely.yes, homer. wel'e esasmoy ryelalrs all wronencay...... nank. i wenth makedyk. come on help cerzind, now, n"}
def sample_text(seed, temperature):
    return generated[temperature]

# Feed the 'seed' text into the neural network
seed = "i'm gonna punch lenny in the back of the"

# Iterate over the different temperature values
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print("\nGenerating text with riskiness : {}\n".format(temperature))
    # Call the sample_text function
    print(sample_text(seed,temperature))

    


Generating text with riskiness : 0.2

i'm gonna punch lenny in the back of the been a to the on the man to the mother and the father to simpson the father to with the marge in the for the like the fame to the been to the for my bart the don't was in the like the for the father the father a was the father been a say the been to me the do it and the father been to go. i want to the boy i can the from a man to be the for the been a like the father to make my bart of the father

Generating text with riskiness : 0.5

i'm gonna punch lenny in the back of the kin't she change and i'm all better it and the was the fad a drivera it? what i want to did hey, he would you would in your bus who know is the like and this don't are for your this all for your manset the for it a man is on the see the will they want to know i'm are for one start of that and i got the better this is. it whoce and i don't are on the mater stop in the from a for the be your mileat

Generating text with riskiness : 1.0

i