# Building a virtual assistant
>  In this chapter, you'll build a personal assistant to help you plan a trip. It will be able to respond to questions like "Are there any cheap hotels in the north of town?" by searching a hotel database for matching results.

- toc: true 
- badges: true
- comments: true
- author: Lucas Nunes
- categories: [Datacamp]
- image: images/datacamp/___

In [None]:
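# State-machine skeleton: INIT, policy_rules, interpret() and message are
# placeholders that are not defined in this notebook.
state = INIT
def respond(state, message):
  # Look up the next state and the reply for the current state and the interpreted message
  (new_state, response) = policy_rules[(state, interpret(message))]
  return new_state, response
def send_message(state, message):
  # Advance the conversation by one message and return the new state
  new_state, response = respond(state, message)
  return new_state
state = send_message(state, message)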

> Note: This is a summary of the exercises from chapter 3 of the DataCamp course "Building Chatbots in Python". <br>[GitHub repo](https://github.com/lnunesAI/Datacamp/) / [Course link](https://www.datacamp.com/tracks/machine-learning-scientist-with-python)

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams['figure.figsize'] = (8, 8)

In [None]:
import sqlite3

## Virtual assistants and accessing data

### SQL basics

<div class=""><p>Time to begin writing queries for your first hotel booking chatbot! The database has been loaded as <code>"hotels.db"</code> and a cursor, which has access to the database, has already been defined for you as <code>cursor</code>.</p>
<p>Three queries are provided below. Your job is to identify which query returns ONLY the <code>"Hotel California"</code>.</p>
<p>You can test each query below by calling the cursor's <code>.execute()</code> method and passing the query in as a string. 
Then, you can print the results by calling the cursor's <code>.fetchall()</code> method, which takes no arguments.</p></div>

In [None]:
%%capture
!wget https://github.com/lnunesAI/Datacamp/raw/main/3-skill-tracks/building-chatbots-in-python/datasets/hotels.db
conn = sqlite3.connect('hotels.db')
c = conn.cursor()

In [None]:
c.execute("SELECT * from hotels").fetchall()

[('Hotel for Dogs', 'mid', 'east', 3),
 ('Hotel California', 'mid', 'north', 3),
 ('Grand Hotel', 'hi', 'south', 5),
 ('Cozy Cottage', 'lo', 'south', 2),
 ("Ben's BnB", 'hi', 'north', 4),
 ('The Grand', 'hi', 'west', 5),
 ('Central Rooms', 'mid', 'center', 3)]

<pre>
Possible Answers
SELECT name from hotels where price = 'expensive' AND area = 'center'
<b>SELECT name from hotels where price = 'mid' AND area = 'north'</b>
SELECT name from hotels where price = 'expensive'
</pre>

In [None]:
c.execute("SELECT name from hotels where price = 'mid' AND area = 'north'").fetchall()

[('Hotel California',)]
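
If `hotels.db` cannot be downloaded, the same three candidate queries can be tried against an in-memory copy of the table. The sketch below rebuilds it from the rows printed above. The column names `name`, `price`, and `area` come from the queries in this chapter; `stars` for the fourth column is an assumption, since the schema is not shown here.

```python
import sqlite3

# Rows copied from the SELECT * output shown earlier in this section.
HOTEL_ROWS = [
    ("Hotel for Dogs", "mid", "east", 3),
    ("Hotel California", "mid", "north", 3),
    ("Grand Hotel", "hi", "south", 5),
    ("Cozy Cottage", "lo", "south", 2),
    ("Ben's BnB", "hi", "north", 4),
    ("The Grand", "hi", "west", 5),
    ("Central Rooms", "mid", "center", 3),
]

def make_demo_db():
    """Return a connection to an in-memory database holding the hotels table."""
    conn = sqlite3.connect(":memory:")
    # "stars" is a hypothetical name for the fourth column.
    conn.execute(
        "CREATE TABLE hotels (name TEXT, price TEXT, area TEXT, stars INTEGER)"
    )
    conn.executemany("INSERT INTO hotels VALUES (?, ?, ?, ?)", HOTEL_ROWS)
    conn.commit()
    return conn

demo_conn = make_demo_db()
candidate_queries = [
    "SELECT name from hotels where price = 'expensive' AND area = 'center'",
    "SELECT name from hotels where price = 'mid' AND area = 'north'",
    "SELECT name from hotels where price = 'expensive'",
]
# Run each candidate and keep its rows so they can be compared side by side.
query_results = [demo_conn.execute(q).fetchall() for q in candidate_queries]
for q, rows in zip(candidate_queries, query_results):
    print(q, "->", rows)
```

The first and third queries return nothing because the table stores prices as `'lo'`, `'mid'`, and `'hi'`, not `'expensive'`.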

### SQL statements in Python

<div class=""><p>It's time to begin writing SQL queries! In this exercise, your job is to run a query against the hotels database to find all the expensive hotels in the south. 
The connection to the database has been created for you, along with a cursor <code>c</code>.</p>
<p>As Alan described in the video, you should be careful about SQL injection. Here, you'll pass parameters the safe way: as an extra tuple argument to the <code>.execute()</code> method. 
This ensures malicious code can't be injected into your query.</p></div>

Instructions
<ul>
<li>Define a tuple <code>t</code> of strings <code>"south"</code> and <code>"hi"</code> for the <code>area</code> and <code>price</code>.</li>
<li>Execute the query using the cursor's <code>.execute()</code> method. You're looking for <strong>all</strong> of the fields for <strong>all</strong> <code>hotels</code> where the <code>area</code> is <code>"south"</code> and the price is <code>"hi"</code>.</li>
<li>Print the results using the cursor's <code>.fetchall()</code> method.</li>
</ul>

In [None]:
# Import sqlite3
import sqlite3

# Open connection to DB
conn = sqlite3.connect('hotels.db')

# Create a cursor
c = conn.cursor()

# Define area and price
area, price = "south", "hi"
t = (area, price)

# Execute the query
c.execute('SELECT * FROM hotels WHERE area=? AND price=?', t)

# Print the results
print(c.fetchall())

[('Grand Hotel', 'hi', 'south', 5)]


**According to our database, the Grand Hotel is the only high-end hotel in the south.**
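
To see why the tuple form matters, the sketch below runs the same lookup twice against a tiny in-memory table (a stand-in for `hotels.db` with only the three columns used in this chapter's queries): once by pasting the user's text into the SQL string, and once with a `?` placeholder. The `malicious` input is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (name TEXT, price TEXT, area TEXT)")
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?)",
    [("Grand Hotel", "hi", "south"),
     ("Cozy Cottage", "lo", "south"),
     ("Ben's BnB", "hi", "north")],
)

# Hypothetical user input that tries to widen the WHERE clause.
malicious = "south' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR clause is executed.
unsafe_query = "SELECT name FROM hotels WHERE area = '{}'".format(malicious)
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safe: the input is bound as a value and compared literally to the area column.
safe_rows = conn.execute(
    "SELECT name FROM hotels WHERE area = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # every row in the table matches
print(safe_rows)         # no hotel has that literal string as its area
```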

## Exploring a DB with natural language

### Creating queries from parameters

<div class=""><p>Now you're going to implement a more powerful function for querying the hotels database. The goal is for that function to take arguments that can later be specified by other parts of your code.</p>
<p>More specifically, your job is to define a <code>find_hotels()</code> function which takes a single argument - a dictionary of column names and values - and returns a list of matching hotels from the database.</p></div>

Instructions
<ul>
<li>A <code>filters</code> list has been created for you. Join this list together with the strings <code>" WHERE "</code> and <code>" and "</code>.</li>
<li>Create a tuple of the values of the <code>params</code> dictionary. </li>
<li>Create a connection and cursor to <code>"hotels.db"</code> and then execute the <code>query</code>, just as in the previous exercise.</li>
<li>Return the results of the query.</li>
</ul>

In [None]:
# Define find_hotels()
def find_hotels(params):
    # Create the base query
    query = 'SELECT * FROM hotels'
    # Add filter clauses for each of the parameters
    if len(params) > 0:
        filters = ["{}=?".format(k) for k in params]
        query += " WHERE " + " and ".join(filters)
    # Create the tuple of values
    t = tuple(params.values())
    
    # Open connection to DB
    conn = sqlite3.connect('hotels.db')
    # Create a cursor
    c = conn.cursor()
    # Execute the query
    c.execute(query, t)
    # Return the results
    return c.fetchall()

**You've now got a function that can find matching hotels for any area and price range combination.**
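
One caveat: the `?` placeholders protect the values, but the column names in `find_hotels()` are still inserted with `.format()`. Since those names will later come from an NLU model, a cautious variant could check them against a known set first. The sketch below is one way to do that; `ALLOWED_COLUMNS` and `build_query()` are hypothetical names, not part of the course code.

```python
# Hypothetical whitelist: the columns this chapter's queries refer to.
ALLOWED_COLUMNS = {"name", "price", "area"}

def build_query(params):
    """Return (query, values) for a dict of column -> value filters."""
    unknown = set(params) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError("unknown column(s): {}".format(sorted(unknown)))
    query = "SELECT * FROM hotels"
    if params:
        # Column names are safe to interpolate only because they were checked above.
        filters = ["{}=?".format(k) for k in params]
        query += " WHERE " + " and ".join(filters)
    return query, tuple(params.values())

print(build_query({"area": "south", "price": "lo"}))
print(build_query({}))
```

The returned pair can be passed straight to `cursor.execute(query, values)`.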

### Using your custom function to find hotels

<p>Here, you'll see your <code>find_hotels()</code> function in action! Recall that it accepts a single argument, <code>params</code>, which is a dictionary of column names and values.</p>

Instructions
<ul>
<li>Create the <code>params</code> dictionary with the column names (keys) <code>"area"</code> and <code>"price"</code>, with corresponding values <code>"south"</code> and <code>"lo"</code>.</li>
<li>Use the <code>find_hotels()</code> function along with your <code>params</code> dictionary to find all inexpensive hotels in the South.</li>
</ul>

In [None]:
# Create the dictionary of column names and values
params = {"area": "south", "price": "lo"}

# Find the hotels that match the parameters
print(find_hotels(params))

[('Cozy Cottage', 'lo', 'south', 2)]


### Creating SQL from natural language

<div class=""><p>Now you'll write a <code>respond()</code> function that can handle messages like <code>"I want an expensive hotel in the south of town"</code> and respond appropriately according to the number of matching results in a database. Any database-backed chatbot needs this functionality.</p>
<p>Your <code>find_hotels()</code> function from the previous exercises has already been defined for you, along with a Rasa NLU <code>interpreter</code> object, which can handle hotel queries, and a list of <code>responses</code>, which you can explore in the Shell.</p></div>

In [None]:
#https://colab.research.google.com/github/mohammedterry/NLP_for_ML/blob/master/RASA.ipynb#scrollTo=DVGgJ-1fBwTQ
#https://colab.research.google.com/drive/1X5H8csPrM3SQL29GvQSyBw5t32RDJYyZ#scrollTo=XDcWzANZEle0

Instructions 1/2
<ul>
<li>Use the <code>.parse()</code> method of <code>interpreter</code> to extract the <code>"entities"</code> in the <code>message</code>.</li>
<li>Find matching hotels using the <code>params</code> dictionary and <code>find_hotels()</code> function.</li>
<li>Use the <code>min()</code> function to choose the right index for the response to send. In this case, <code>n</code> is the number of results.</li>
<li>Select the appropriate response from the <code>responses</code> list and insert the <code>names</code> of hotels using the <code>.format()</code> method.</li>
</ul>

In [None]:
# Define respond()
def respond(message):
    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    # Initialize an empty params dictionary
    params = {}
    # Fill the dictionary with entities
    for ent in entities:
        params[ent["entity"]] = str(ent["value"])

    # Find hotels that match the dictionary
    results = find_hotels(params)
    # Get the names of the hotels and index of the response
    names = [r[0] for r in results]
    n = min(len(results),3)
    # Select the nth element of the responses array
    return responses[n].format(*names)

Instructions 2/2
<ul>
<li>Excellent! You've built a chatbot that can interpret the results of your hotel DB queries. Now, call the <code>respond()</code> function with the message <code>"I want an expensive hotel in the south of town"</code>. Place it inside a call to <code>print()</code> so that you can see the response of your bot in the shell.</li>
</ul>

In [None]:
# Test the respond() function
print(respond("I want an expensive hotel in the south of town"))
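
`interpreter` and `responses` are supplied by the course environment and are not defined in this notebook, so `respond()` cannot run locally without stand-ins. The sketch below uses a keyword lookup in place of the trained Rasa NLU model. `KeywordInterpreter`, its keyword table, and the wording of `responses` are all assumptions made for illustration; only the return shape (a dict with an `"entities"` list of `entity`/`value` pairs) mirrors what the code above relies on.

```python
class KeywordInterpreter:
    """Toy stand-in for the course's Rasa NLU interpreter (keyword lookup only)."""

    # Hypothetical keyword table: surface word -> (entity name, database value).
    KEYWORDS = {
        "cheap": ("price", "lo"),
        "expensive": ("price", "hi"),
        "north": ("area", "north"),
        "south": ("area", "south"),
        "east": ("area", "east"),
        "west": ("area", "west"),
        "center": ("area", "center"),
    }

    def parse(self, message):
        entities = []
        for word in message.lower().split():
            word = word.strip(".,!?")  # drop trailing punctuation
            if word in self.KEYWORDS:
                entity, value = self.KEYWORDS[word]
                entities.append({"entity": entity, "value": value})
        return {"text": message, "entities": entities}

interpreter = KeywordInterpreter()

# Hypothetical replies for 0, 1, 2, and 3-or-more matching hotels.
responses = [
    "Sorry, I couldn't find anything like that.",
    "{} is a great hotel!",
    "{} or {} would work!",
    "{} is one option, but I know others too.",
]

parsed = interpreter.parse("I want an expensive hotel in the south of town")
print(parsed["entities"])
```

Because `str.format()` ignores surplus positional arguments, `responses[3].format(*names)` works even when `names` holds more than three hotels, which is why `min(len(results), 3)` is enough to pick the template.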

## Incremental slot filling and negation

### Refining your search

<div class=""><p>Now you'll write a bot that allows users to add filters incrementally, just in case they don't specify all of their preferences in one message.</p>
<p>To do this, initialize an empty dictionary <code>params</code> outside of your <code>respond()</code> function (as opposed to inside the function, like in the previous exercise). 
Your <code>respond()</code> function will take in this dictionary as an argument.</p></div>

In [None]:
def find_hotels(params):
    query = 'SELECT * FROM hotels'
    if len(params) > 0:
        filters = ["{}=?".format(k) for k in params]
        query += " WHERE " + " and ".join(filters)
    t = tuple(params.values())
    
    # open connection to DB
    conn = sqlite3.connect('hotels.db')
    # create a cursor
    c = conn.cursor()
    c.execute(query, t)
    return c.fetchall()

Instructions
<ul>
<li>Define a <code>respond()</code> function that accepts two arguments - a <code>message</code> <strong>and</strong> a dictionary of <code>params</code> - and returns two results - the message to send to the user and the updated <code>params</code> dictionary.</li>
<li>Extract <code>"entities"</code> from the <code>message</code> using the <code>.parse()</code> method of the <code>interpreter</code>, exactly like you did in the previous exercise.</li>
<li>Find the hotels that match <code>params</code> using your <code>find_hotels()</code> function. </li>
<li>Initialize the <code>params</code> dictionary outside the <code>respond()</code> function and hit 'Submit Answer' to pass the messages to the bot.</li>
</ul>

In [None]:
# Define a respond function, taking the message and existing params as input
def respond(message, params):
    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    # Fill the dictionary with entities
    for ent in entities:
        params[ent["entity"]] = str(ent["value"])

    # Find the hotels
    results = find_hotels(params)
    names = [r[0] for r in results]
    n = min(len(results), 3)
    # Return the appropriate response
    return responses[n].format(*names), params

# Initialize params dictionary
params = {}

# Pass the messages to the bot
for message in ["I want an expensive hotel", "in the north of town"]:
    print("USER: {}".format(message))
    response, params = respond(message, params)
    print("BOT: {}".format(response))

**Your chatbot can now help users even when they split their preferences over a few messages.**
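
The key idea is that `params` outlives a single message: each turn only adds or overwrites slots. The sketch below isolates that step without any NLU model or database. `update_slots()` and the hand-written entity lists are hypothetical stand-ins for the output of `interpreter.parse()`, and the third turn is an invented correction added to show overwriting.

```python
def update_slots(params, entities):
    """Return a new dict with the entities from one message merged into params."""
    updated = dict(params)  # copy, so the caller's dict is left untouched
    for ent in entities:
        updated[ent["entity"]] = str(ent["value"])
    return updated

turns = [
    [{"entity": "price", "value": "hi"}],    # "I want an expensive hotel"
    [{"entity": "area", "value": "north"}],  # "in the north of town"
    [{"entity": "area", "value": "south"}],  # invented correction: overwrites the area slot
]

slots = {}
history = []
for entities in turns:
    slots = update_slots(slots, entities)
    history.append(slots)
    print(slots)
```

Unlike `respond()` above, which mutates the dictionary it is given, this variant returns a copy. Either approach works as long as the caller keeps the returned value.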

### Basic negation

<div class=""><p>Quite often, you'll find your users telling you what they <em>don't</em> want - and that's important to understand!
In general, negation is a difficult problem in NLP. Here, we'll take a very simple approach that works for many cases.</p>
<p>A list of tests called <code>tests</code> has been defined for you. Explore it in the Shell - you'll find that each test is a tuple consisting of:</p>
<ul>
<li>A string containing a message with entities.</li>
<li>A dictionary with the entities as keys and a Boolean as the value: <code>True</code> if the entity is affirmed, <code>False</code> if it is negated.</li>
</ul>
<p>Your job is to define a function called <code>negated_ents()</code> which looks for negated entities in a message.</p></div>

In [None]:
tests = [("no I don't want to be in the south", {'south': False}),
 ('no it should be in the south', {'south': True}),
 ('no in the south not the north', {'north': False, 'south': True}),
 ('not north', {'north': False})]

Instructions
<ul>
<li>Using list comprehension, check if the words <code>"south"</code> or <code>"north"</code> appear in the message and extract those entities.</li>
<li>Split the sentence into chunks ending with each entity. To do this:<ul>
<li>Use the <code>.index()</code> method of <code>phrase</code> to find the starting index of each entity <code>e</code> and add the entity's length to it to find the index of the end of the entity.</li>
<li>Starting with <code>start=0</code>, take slices of the string from <code>start</code> to <code>end</code> for each <code>end</code> in <code>ends</code>. Append each slice of the sentence to the list, <code>chunks</code>. Ensure you update your starting position with each iteration.</li></ul></li>
<li>For each entity, if <code>"not"</code> or <code>"n't"</code> appears in the chunk, consider this entity negated.</li>
</ul>

In [None]:
# Define negated_ents()
def negated_ents(phrase):
    # Extract the entities using keyword matching
    ents = [e for e in ["south", "north"] if e in phrase]
    # Find the index of the final character of each entity
    ends = sorted([phrase.index(e) + len(e) for e in ents])
    # Initialise a list to store sentence chunks
    chunks = []
    # Take slices of the sentence up to and including each entity
    start = 0
    for end in ends:
        chunks.append(phrase[start:end])
        start = end
    result = {}
    # Iterate over the chunks and look for entities
    for chunk in chunks:
        for ent in ents:
            if ent in chunk:
                # If the chunk contains a negation, mark the entity as False
                if "not" in chunk or "n't" in chunk:
                    result[ent] = False
                else:
                    result[ent] = True
    return result  

# Check that the entities are correctly assigned as True or False
for test in tests:
    print(negated_ents(test[0]) == test[1])

True
True
True
True
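
To make the chunking concrete, the sketch below prints the chunks for one of the test sentences and then shows a limitation of plain substring matching: `"not"` also occurs inside words such as `"nothing"`. The word-boundary regex and the names `split_chunks()` and `has_negation()` are illustrative additions, not part of the exercise.

```python
import re

def split_chunks(phrase, ent_vals):
    """Split phrase into chunks that each end with one of the entities found in it."""
    ents = [e for e in ent_vals if e in phrase]
    ends = sorted(phrase.index(e) + len(e) for e in ents)
    chunks, start = [], 0
    for end in ends:
        chunks.append(phrase[start:end])
        start = end
    return chunks

# Matches "not" as a whole word, or any word ending in "n't".
NEGATION = re.compile(r"\bnot\b|n't\b")

def has_negation(chunk):
    """Return True if the chunk contains a negation word."""
    return bool(NEGATION.search(chunk))

chunks = split_chunks("no in the south not the north", ["south", "north"])
print(chunks)
print([has_negation(c) for c in chunks])

# Substring matching flags this sentence as negated; the regex does not.
tricky = "nothing in the south please"
print("not" in tricky, has_negation(tricky))
```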


### Filtering with excluded slots

<div class=""><p>Now you're going to put together some of the ideas from previous exercises in order to allow users to tell your bot about what they do and do not want, split across multiple messages. </p>
<p>The <code>negated_ents()</code> function has already been defined for you. Additionally, a slightly tweaked version of the <code>find_hotels()</code> function, which accepts a <code>neg_params</code> dictionary in addition to a <code>params</code> dictionary, has been defined.</p></div>

In [None]:
def negated_ents(phrase, ent_vals):
    ents = [e for e in ent_vals if e in phrase]
    ends = sorted([phrase.index(e)+len(e) for e in ents])
    start = 0
    chunks = []
    for end in ends:
        chunks.append(phrase[start:end])
        start = end
    result = {}
    for chunk in chunks:
        for ent in ents:
            if ent in chunk:
                if "not" in chunk or "n't" in chunk:
                    result[ent] = False
                else:
                    result[ent] = True
    return result  

In [None]:
def find_hotels(params, neg_params):
    query = 'SELECT * FROM hotels'
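    # Build equality filters from params and exclusion filters from neg_params
    filters = ["{}=?".format(k) for k in params] + \
              ["{}!=?".format(k) for k in neg_params]
    if len(filters) > 0:
        query += " WHERE " + " and ".join(filters)
    # Bind values for both kinds of filter, in the same order as the placeholders
    t = tuple(params.values()) + tuple(neg_params.values())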
    
    # open connection to DB
    conn = sqlite3.connect('hotels.db')
    # create a cursor
    c = conn.cursor()
    c.execute(query, t)
    return c.fetchall()
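
The exclusion filters need their own bound values: every `?` in the query, whether it belongs to a `=` or a `!=` clause, must have a matching entry in the tuple passed to `.execute()`. The sketch below checks this against an in-memory table built from a few of the rows shown earlier. `find_hotels_in()` and its `conn` parameter are hypothetical, introduced so the example does not depend on `hotels.db`.

```python
import sqlite3

def find_hotels_in(conn, params, neg_params):
    """Query hotels with equality filters (params) and exclusion filters (neg_params)."""
    filters = ["{}=?".format(k) for k in params] + \
              ["{}!=?".format(k) for k in neg_params]
    query = "SELECT * FROM hotels"
    if filters:
        query += " WHERE " + " and ".join(filters)
    # Values must follow the same order as the placeholders above.
    values = tuple(params.values()) + tuple(neg_params.values())
    return conn.execute(query, values).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (name TEXT, price TEXT, area TEXT)")
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?)",
    [("Grand Hotel", "hi", "south"), ("Ben's BnB", "hi", "north"),
     ("The Grand", "hi", "west"), ("Cozy Cottage", "lo", "south")],
)

# Expensive hotels that are not in the north.
matches = find_hotels_in(conn, {"price": "hi"}, {"area": "north"})
print(matches)

# Only an exclusion: any hotel outside the south.
only_excluded = find_hotels_in(conn, {}, {"area": "south"})
print(only_excluded)
```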

Instructions
<ul>
<li>Define a <code>respond()</code> function which accepts a <code>message</code>, <code>params</code>, and <code>neg_params</code> as arguments.</li>
<li>Use the <code>negated_ents()</code> function with <code>message</code> and <code>ent_vals</code> as arguments. Store the result in <code>negated</code>.</li>
<li>Use the tweaked <code>find_hotels()</code> function with the <code>params</code> and <code>neg_params</code> dictionaries as arguments to find matching hotels. Store the result in <code>results</code>.</li>
<li>Initialize the <code>params</code> and <code>neg_params</code> dictionaries outside the <code>respond()</code> function and hit 'Submit Answer' to see the bot's responses!</li>
</ul>

In [None]:
# Define the respond function
def respond(message, params, neg_params):
    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    ent_vals = [e["value"] for e in entities]
    # Look for negated entities
    negated = negated_ents(message, ent_vals)
    for ent in entities:
        # negated_ents() maps an entity to False when it is negated
        if ent["value"] in negated and not negated[ent["value"]]:
            neg_params[ent["entity"]] = str(ent["value"])
        else:
            params[ent["entity"]] = str(ent["value"])
    # Find the hotels
    results = find_hotels(params, neg_params)
    names = [r[0] for r in results]
    n = min(len(results),3)
    # Return the correct response
    return responses[n].format(*names), params, neg_params

# Initialize params and neg_params
params = {}
neg_params = {}

# Pass the messages to the bot
for message in ["I want a cheap hotel", "but not in the north of town"]:
    print("USER: {}".format(message))
    response, params, neg_params = respond(message, params, neg_params)
    print("BOT: {}".format(response))

**Your bot can now handle requests that mix positive and negative preferences across several messages.**