# OC - Parcours Ingénieur IA - Projet 10 - BERKAN Asli Ceren

## Table of Contents

* [1. Préparation des données](#chapter1)
* [2. Création & publication d'une LUIS app](#chapter2)
* [3. Prédiction & évalution d'une LUIS app](#chapter3)

## 1. Préparation des données <a class="anchor" id="chapter1"></a>

In [3]:
# Importer les librairies
import json
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

In [12]:
# Ouverture du fichier 'frames'
with open('frames.json', 'r') as f:
    frames = json.load(f)

In [13]:
# Convertir le fichier sous forme de dictionnaire en dataframe
dataframe = pd.DataFrame.from_dict(frames)

In [14]:
# Affichage du contenu de la dataframe
display(dataframe.head())
print('Il y a ' + str(dataframe.shape[0]) + ' dialogues.')
print('Les différentes colonnes sont : ' + str(dataframe.columns.values))

Unnamed: 0,user_id,turns,wizard_id,id,labels
0,U22HTHYNP,[{'text': 'I'd like to book a trip to Atlantis...,U21DKG18C,e2c0fc6c-2134-4891-8353-ef16d8412c9a,"{'userSurveyRating': 4.0, 'wizardSurveyTaskSuc..."
1,U21E41CQP,"[{'text': 'Hello, I am looking to book a vacat...",U21DMV0KA,4a3bfa39-2c22-42c8-8694-32b4e34415e9,"{'userSurveyRating': 3.0, 'wizardSurveyTaskSuc..."
2,U21RP4FCY,[{'text': 'Hello there i am looking to go on a...,U21E0179B,6e67ed28-e94c-4fab-96b6-68569a92682f,"{'userSurveyRating': 2.0, 'wizardSurveyTaskSuc..."
3,U22HTHYNP,[{'text': 'Hi I'd like to go to Caprica from B...,U21DKG18C,5ae76e50-5b48-4166-9f6d-67aaabd7bcaa,"{'userSurveyRating': 5.0, 'wizardSurveyTaskSuc..."
4,U21E41CQP,"[{'text': 'Hello, I am looking to book a trip ...",U21DMV0KA,24603086-bb53-431e-a0d8-1dcc63518ba9,"{'userSurveyRating': 5.0, 'wizardSurveyTaskSuc..."


Il y a 1369 dialogues.
Les différentes colonnes sont : ['user_id' 'turns' 'wizard_id' 'id' 'labels']


In [15]:
# Supprimer les dialogues n'ayant pas abouti
temp = pd.DataFrame(columns=['user_id', 'turns',
                             'wizard_id', 'id', 'labels'])

for i in dataframe.index :
    if dataframe['labels'][i]['wizardSurveyTaskSuccessful'] == True :
        temp = temp.append(dataframe.iloc[i])

display(temp.head())

Unnamed: 0,user_id,turns,wizard_id,id,labels
0,U22HTHYNP,[{'text': 'I'd like to book a trip to Atlantis...,U21DKG18C,e2c0fc6c-2134-4891-8353-ef16d8412c9a,"{'userSurveyRating': 4.0, 'wizardSurveyTaskSuc..."
1,U21E41CQP,"[{'text': 'Hello, I am looking to book a vacat...",U21DMV0KA,4a3bfa39-2c22-42c8-8694-32b4e34415e9,"{'userSurveyRating': 3.0, 'wizardSurveyTaskSuc..."
3,U22HTHYNP,[{'text': 'Hi I'd like to go to Caprica from B...,U21DKG18C,5ae76e50-5b48-4166-9f6d-67aaabd7bcaa,"{'userSurveyRating': 5.0, 'wizardSurveyTaskSuc..."
4,U21E41CQP,"[{'text': 'Hello, I am looking to book a trip ...",U21DMV0KA,24603086-bb53-431e-a0d8-1dcc63518ba9,"{'userSurveyRating': 5.0, 'wizardSurveyTaskSuc..."
5,U21RP4FCY,"[{'text': 'Hey, i Want to go to St. Louis on t...",U21E0179B,bbd17a54-bc6c-4237-8f72-4778081fab0c,"{'userSurveyRating': 3.0, 'wizardSurveyTaskSuc..."


In [16]:
# Pour le projet, on a besoin que des informations de la colonne 'turns'
turns = temp.turns
turnsDF = pd.DataFrame.from_dict(turns[0])
turnsDF

Unnamed: 0,text,labels,author,timestamp,db
0,I'd like to book a trip to Atlantis from Capri...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,
1,"Hi...I checked a few options for you, and unfo...",{'acts': [{'args': [{'val': [{'annotations': [...,wizard,1471272000000.0,{'result': [[{'trip': {'returning': {'duration...
2,"Yes, how about going to Neverland from Caprica...","{'acts': [{'args': [{'val': 'Neverland', 'key'...",user,1471273000000.0,
3,I checked the availability for this date and t...,{'acts': [{'args': [{'val': [{'annotations': [...,wizard,1471273000000.0,"{'result': [[], [], [], [], [], []], 'search':..."
4,I have no flexibility for dates... but I can l...,"{'acts': [{'args': [{'val': False, 'key': 'fle...",user,1471273000000.0,
5,I checked the availability for that date and t...,{'acts': [{'args': [{'val': [{'annotations': [...,wizard,1471273000000.0,"{'result': [[]], 'search': [{'ORIGIN_CITY': 'A..."
6,I suppose I'll speak with my husband to see if...,"{'acts': [{'args': [], 'name': 'thankyou'}], '...",user,1471273000000.0,


In [17]:
# On crée un nouveau df pour stocker uniquement les textes écrits par 'user'
turnsDF_reduit = pd.DataFrame(columns=['text',
                                        'labels',
                                        'author',
                                        'timestamp',
                                        'db'])

In [18]:
# Boucle pour garder les conversations 'user'
for i in temp.index:
    turns_temps = pd.DataFrame.from_dict(temp.turns[i])
    
    # si on veut le premier texte de la conversation
    turnsDF_reduit = turnsDF_reduit.append(turns_temps.iloc[0])
    
#     # si on veut l'ensemble des textes de la conversation
#     for j in turns_temps.index:
#         if turns_temps.author[j] == 'user':
#             turnsDF_reduit = turnsDF_reduit.append(turns_temps.iloc[j])
display(turnsDF_reduit.head())

Unnamed: 0,text,labels,author,timestamp,db
0,I'd like to book a trip to Atlantis from Capri...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,
0,"Hello, I am looking to book a vacation from Go...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,
0,"Hi I'd like to go to Caprica from Busan, betwe...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471274000000.0,
0,"Hello, I am looking to book a trip for 2 adult...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,
0,"Hey, i Want to go to St. Louis on the 17th of ...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,


In [19]:
# On remet des index
turnsDF_reduit = turnsDF_reduit.reset_index(drop=True)
turnsDF_reduit.head()

Unnamed: 0,text,labels,author,timestamp,db
0,I'd like to book a trip to Atlantis from Capri...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,
1,"Hello, I am looking to book a vacation from Go...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,
2,"Hi I'd like to go to Caprica from Busan, betwe...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471274000000.0,
3,"Hello, I am looking to book a trip for 2 adult...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,
4,"Hey, i Want to go to St. Louis on the 17th of ...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,


In [20]:
# On complète avec les colonnes utiles pour extraire les entités
turnsDF_complet = turnsDF_reduit.copy()
turnsDF_complet["intent"] = np.nan
turnsDF_complet["or_city"] = np.nan
turnsDF_complet["dst_city"] = np.nan
turnsDF_complet["str_date"] = np.nan
turnsDF_complet["end_date"] = np.nan
turnsDF_complet["budget"] = np.nan

In [21]:
label_temp = turnsDF_complet.labels[0]
label_temp

{'acts': [{'args': [{'val': 'book', 'key': 'intent'}], 'name': 'inform'},
  {'args': [{'val': 'Atlantis', 'key': 'dst_city'},
    {'val': 'Caprica', 'key': 'or_city'},
    {'val': 'Saturday, August 13, 2016', 'key': 'str_date'},
    {'val': '8', 'key': 'n_adults'},
    {'val': '1700', 'key': 'budget'}],
   'name': 'inform'}],
 'acts_without_refs': [{'args': [{'val': 'book', 'key': 'intent'}],
   'name': 'inform'},
  {'args': [{'val': 'Atlantis', 'key': 'dst_city'},
    {'val': 'Caprica', 'key': 'or_city'},
    {'val': 'Saturday, August 13, 2016', 'key': 'str_date'},
    {'val': '8', 'key': 'n_adults'},
    {'val': '1700', 'key': 'budget'}],
   'name': 'inform'}],
 'active_frame': 1,
 'frames': [{'info': {'intent': [{'val': 'book', 'negated': False}],
    'budget': [{'val': '1700.0', 'negated': False}],
    'dst_city': [{'val': 'Atlantis', 'negated': False}],
    'or_city': [{'val': 'Caprica', 'negated': False}],
    'str_date': [{'val': 'august 13', 'negated': False}],
    'n_adults': 

In [22]:
# On extrait le intent, si pas disponible on passe
for i in turnsDF_complet.index:
    try :
        label_temp = turnsDF_complet.labels[i]
        intent_temp = label_temp['acts'][0]['args'][0]['val']
        turnsDF_complet.loc[i, 'intent'] = intent_temp 
    except IndexError:
        pass
display(turnsDF_complet.head())

Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget
0,I'd like to book a trip to Atlantis from Capri...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,,book,,,,,
1,"Hello, I am looking to book a vacation from Go...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,,book,,,,,
2,"Hi I'd like to go to Caprica from Busan, betwe...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471274000000.0,,book,,,,,
3,"Hello, I am looking to book a trip for 2 adult...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,,book,,,,,
4,"Hey, i Want to go to St. Louis on the 17th of ...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,,book,,,,,


In [23]:
# Pour compléter les autres colonnes entités
for i in turnsDF_complet.index:
    try :
        label_temp = turnsDF_complet.labels[i]
        or_city = label_temp['acts'][1]['args']
        for x in or_city :
            if x['key'] == 'or_city' :
                or_city = x['val']
                turnsDF_complet.loc[i, 'or_city'] = or_city
            if x['key'] == 'dst_city' :
                dst_city = x['val']
                turnsDF_complet.loc[i, 'dst_city'] = dst_city
            if x['key'] == 'str_date' :
                str_date = x['val']
                turnsDF_complet.loc[i, 'str_date'] = str_date
            if x['key'] == 'end_date' :
                end_date = x['val']
                turnsDF_complet.loc[i, 'end_date'] = end_date
            if x['key'] == 'budget' :
                budget = x['val']
                turnsDF_complet.loc[i, 'budget'] = budget
    except IndexError :
        pass
display(turnsDF_complet.head())

Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget
0,I'd like to book a trip to Atlantis from Capri...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,,book,Caprica,Atlantis,"Saturday, August 13, 2016",,1700
1,"Hello, I am looking to book a vacation from Go...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471272000000.0,,book,Gotham City,Mos Eisley,,,2100
2,"Hi I'd like to go to Caprica from Busan, betwe...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471274000000.0,,book,Busan,Caprica,"Sunday August 21, 2016","Wednesday August 31, 2016",
3,"Hello, I am looking to book a trip for 2 adult...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,,book,Kochi,Denver,,,"$21,300"
4,"Hey, i Want to go to St. Louis on the 17th of ...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471275000000.0,,book,,St. Louis,17th of August,,


In [24]:
# quelques information sur le df
turnsDF_complet.describe(include='all')

Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget
count,1287,1287,1287,1287.0,0.0,955,550,615.0,255,143,161.0
unique,1249,1074,1,,,85,209,211.0,152,111,90.0
top,hi,"{'acts': [{'args': [], 'name': 'greeting'}], '...",user,,,book,Kabul,-1.0,August 27th,24th,-1.0
freq,8,96,1287,,,850,14,18.0,10,3,15.0
mean,,,,1472472000000.0,,,,,,,
std,,,,700018600.0,,,,,,,
min,,,,1471272000000.0,,,,,,,
25%,,,,1471902000000.0,,,,,,,
50%,,,,1472502000000.0,,,,,,,
75%,,,,1473182000000.0,,,,,,,


In [25]:
# On ne garde que les intent 'book'
turnsDF_final = turnsDF_complet.copy()
turnsDF_final = turnsDF_final[turnsDF_final.intent == 'book']
turnsDF_final.shape

(850, 11)

In [26]:
train, test = train_test_split(turnsDF_final, test_size=0.2)
train = train.reset_index(drop=True)
display(train.head())
print('Le jeu de train contient : ' + str(train.shape[0]) + ' textes.')
test = test.reset_index(drop=True)
display(test.head())
print('Le jeu de test contient : ' + str(test.shape[0]) + ' textes.')

Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget
0,I want to go to Kingston from Queenstown with ...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1473430000000.0,,book,Queenstown,Kingston,,,
1,Hi. I'm looking for an adventure from Thursday...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471282000000.0,,book,,,,,
2,my assistant and I want to get to ciudad juare...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1473369000000.0,,book,,ciudad juarez,,,
3,yes. i am going 2 bring my grand daughter with...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1473354000000.0,,book,,,,,
4,"Hey, I need to get to Mannheim asap","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472656000000.0,,book,,Mannheim,,,


Le jeu de train contient : 680 textes.


Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget
0,Seoul to Long Beach. 4 kids 6 adults. 35400 bu...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472852000000.0,,book,Seoul,Long Beach,,,35400.0
1,Hi im from Sydney and i want to go to mannheim,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471628000000.0,,book,Sydney,mannheim,,,
2,Hello I am currently in Tel Aviv on business a...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471963000000.0,,book,,,,,
3,Do you have trips out of Tel Aviv?,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472750000000.0,,book,Tel Aviv,,,,
4,"Hi there, I'd like to book a trip from Boston ...","{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1473181000000.0,,book,Boston,,,,


Le jeu de test contient : 170 textes.


In [36]:
# Save to csv
train.to_csv('train.csv', index=False)
test.to_csv('test.csv', index=False)

In [27]:
def get_example_label(utterance, entity_name, value):
    """Build a EntityLabelObject.
    This will find the "value" start/end index in "utterance", and assign it to "entity name"
    """
    utterance = utterance.lower()
    value = value.lower()
    return {
        'entity_name': entity_name,
        'start_char_index': utterance.find(value),
        'end_char_index': utterance.find(value) + len(value)
    }

In [28]:
# Transforme les df train et test en utterance pour l'app Luis
def utterances_from_df(df):
    utterances = []
    for i in df.index:
        texte = df.text[i]

        temp_labels = []
        if df.or_city[i] != '':
            dict_temp = get_example_label(texte, 'Departure', df.or_city[i])
            temp_labels.append(dict(dict_temp))
        if df.dst_city[i] != '':
            dict_temp = get_example_label(texte, 'Destination', df.dst_city[i])
            temp_labels.append(dict(dict_temp))
        if df.str_date[i] != '':
            dict_temp = get_example_label(texte, 'StartDate', df.str_date[i])
            temp_labels.append(dict(dict_temp))
        if df.end_date[i] != '':
            dict_temp = get_example_label(texte, 'EndDate', df.end_date[i])
            temp_labels.append(dict(dict_temp))
        if df.budget[i] != '':
            dict_temp = get_example_label(texte, 'Budget', df.budget[i])
            temp_labels.append(dict(dict_temp))

        element = {
            'text': texte,
            'intent_name': 'BookFlights',
            'entity_labels': temp_labels
        }

        utterances.append(element)
    return(utterances)

In [29]:
# Et format finale des utterances !
utterances_train = utterances_from_df(train.replace(np.nan, ""))
# display(utterances_train)
utterances_test = utterances_from_df(test.replace(np.nan, ""))
# display(utterances_test)

In [20]:
# Save files
with open('utterances_train.json', 'w') as outfile:
    json.dump(utterances_train, outfile)
    
with open('utterances_test.json', 'w') as outfile:
    json.dump(utterances_test, outfile)

## 2. Création & publication d'une LUIS app <a class="anchor" id="chapter2"></a>

In [6]:
from azure.cognitiveservices.language.luis.authoring import LUISAuthoringClient
from azure.cognitiveservices.language.luis.authoring.models import ApplicationCreateObject
from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient
from msrest.authentication import CognitiveServicesCredentials
from functools import reduce

import os
import json, time
from dotenv import load_dotenv
load_dotenv()

True

In [8]:
# Connection à l'app LUIS
authoringKey = os.environ.get("authoringKey", "")
authoringEndpoint = os.environ.get("authoringEndpoint", "")
predictionKey = os.environ.get("predictionKey", "")
predictionEndpoint = os.environ.get("predictionEndpoint", "")
    
client = LUISAuthoringClient(authoringEndpoint, CognitiveServicesCredentials(authoringKey))

In [23]:
# Create a LUIS app
default_app_name = "Booking"
version_id = "0.1"

print("Creating App {}, version {}".format(
    default_app_name, version_id))

app_id = client.apps.add({
    'name': default_app_name,
    'initial_version_id': version_id,
    'description': "New App created with LUIS Python sample",
    'culture': 'en-us',
})
print("Created app {}".format(app_id))

Creating App Booking, version 0.1
Created app 02e79a1d-9047-45ba-84ca-941a5500c455


In [24]:
# Ajouter un intent, on souhaite que le bot réserve des vols (et uniquement ça pour le moment)
intent_name = "BookFlights"
intent_id = client.model.add_intent(
    app_id,
    version_id,
    intent_name
)
print("{} intent created with id {}".format(
    intent_name,
    intent_id
))

BookFlights intent created with id 26fc7a12-8efd-4ee4-8c8e-31a381e6a67f


In [25]:
# Add information into the model
print("\nWe'll create five new entities.")
print("The \"Departure\" simple entity will hold the flight departure city.")
print("The \"Destination\" simple entity will hold the flight destination.")
print("The \"StartDate\" simple entity will hold the flight start date.")
print("The \"EndDate\" simple entity will hold the flight end date.")
print("The \"Budget\" simple entity will hold the flight budget.")

entities_list = ["Departure", "Destination", "StartDate", "EndDate", "Budget"]

for enum in entities_list:
    entity_id = client.model.add_entity(app_id, version_id, name=enum)
    print("{} simple entity created with id {}".format(enum, entity_id))


We'll create five new entities.
The "Departure" simple entity will hold the flight departure city.
The "Destination" simple entity will hold the flight destination.
The "StartDate" simple entity will hold the flight start date.
The "EndDate" simple entity will hold the flight end date.
The "Budget" simple entity will hold the flight budget.
Departure simple entity created with id 5e27fd16-0648-48c6-9a05-31f91d88d9db
Destination simple entity created with id 00ae0a95-6f44-4c69-b873-bc8148e8d6d8
StartDate simple entity created with id 94d76687-4e50-4dc2-a440-af87f1ddcea5
EndDate simple entity created with id d7684a3c-a2db-4120-97d2-4abaa7f71c19
Budget simple entity created with id 7cd67cc8-a600-4e5e-8737-14aa3ed0d8c1


In [26]:
# Ajouter les utterances
for i in range(len(utterances_train)): 
    utterances_result = client.examples.batch(
                app_id,
                version_id,
                [utterances_train[i]]
            )

print("\nUtterances added to the {} intent".format(intent_name))


Utterances added to the BookFlights intent


In [27]:
# Train the app
client.train.train_version(app_id, version_id)
waiting = True
while waiting:
    info = client.train.get_status(app_id, version_id)

    # get_status returns a list of training statuses, one for each model. Loop through them and make sure all are done.
    waiting = any(map(lambda x: 'Queued' == x.details.status or 'InProgress' == x.details.status, info))
    if waiting:
        print ("Waiting 10 seconds for training to complete...")
        time.sleep(10)
    else: 
        print ("trained")
        waiting = False

Waiting 10 seconds for training to complete...
Waiting 10 seconds for training to complete...
trained


In [28]:
# Publish the app
print("\nWe'll start publishing your app...")

publish_result = client.apps.publish(
    app_id,
    version_id,
    is_staging=False,
    region='westeurope'
)
endpoint = publish_result.endpoint_url + \
    "?subscription-key=" + authoringKey + "&q="
print("Your app is published. You can now go to test it on\n{}".format(endpoint))


We'll start publishing your app...
Your app is published. You can now go to test it on
https://westeurope.api.cognitive.microsoft.com/luis/v2.0/apps/02e79a1d-9047-45ba-84ca-941a5500c455?subscription-key=57c9fae5941d4539a19c2cbd0135bbd0&q=


## 3. Prédiction & évalution d'une LUIS app <a class="anchor" id="chapter3"></a>

In [1]:
# Get the app_id
app_id = '02e79a1d-9047-45ba-84ca-941a5500c455'

In [4]:
# Read files 
with open('utterances_train.json', 'r') as outfile:
    utterances_train = json.load(outfile)

with open('utterances_test.json', 'r') as outfile:
    utterances_test = json.load(outfile)

In [9]:
# Authenticate the prediction runtime client
runtimeCredentials = CognitiveServicesCredentials(predictionKey)
clientRuntime = LUISRuntimeClient(endpoint=predictionEndpoint, credentials=runtimeCredentials)

In [10]:
# Production == slot name
predictionRequest = {"query" : utterances_test[0]['text']}

predictionResponse = clientRuntime.prediction.get_slot_prediction(app_id, "Production", predictionRequest)
print("Top intent: {}".format(predictionResponse.prediction.top_intent))
print("Intents: ")

for intent in predictionResponse.prediction.intents:
    print("\t{}".format (json.dumps (intent)))
print("Entities: {}".format (predictionResponse.prediction.entities))

Top intent: BookFlights
Intents: 
	"BookFlights"
Entities: {'Departure': ['Tel Aviv?']}


In [11]:
# Boucle de prédiction
predictionResponseList = []
for i in range(len(utterances_test)):
    predictionRequest = {"query" : utterances_test[i]['text']}
    predictionResponse = clientRuntime.prediction.get_slot_prediction(app_id, "Production", predictionRequest)
    predictionResponseList.append(predictionResponse.as_dict())
print(len(predictionResponseList))
print(predictionResponseList[0])

170
{'query': 'Do you have trips out of Tel Aviv?', 'prediction': {'top_intent': 'BookFlights', 'intents': {'BookFlights': {'score': 0.9995054}}, 'entities': {'Departure': ['Tel Aviv?']}}}


In [38]:
entities_pred = predictionResponseList[0]['prediction']['entities']
for i in entities_pred:
    print(i)
print(predictionResponseList[0]['prediction']['entities']['Departure'][0])

Departure
Tel Aviv?


In [68]:
predictedDf = test.copy()
predictedDf = predictedDf.reset_index(drop=True)
display(predictedDf.head())
print(predictedDf.shape[0])

Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget
0,Do you have trips out of Tel Aviv?,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472750000000.0,,book,Tel Aviv,,,,
1,I'd like to find a vacation from Kabul to Sant...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472059000000.0,,book,Kabul,Santiago,-1,,1800.0
2,I want to throw my parents on a plane and get ...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1473445000000.0,,book,,,,,
3,Please check if there is a flight to Naples fr...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471880000000.0,,book,Fort Lauderdale,Naples,August 31,,
4,Please find a flight from Beijing to Kochi for...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472577000000.0,,book,Beijing,Kochi,Sept 16,20.0,


170


In [69]:
predictedDf["pred_or_city"] = np.nan
predictedDf["pred_dst_city"] = np.nan
predictedDf["pred_str_date"] = np.nan
predictedDf["pred_end_date"] = np.nan
predictedDf["pred_budget"] = np.nan
for j in predictedDf.index:
    entities_pred = predictionResponseList[j]['prediction']['entities']
    for i in entities_pred:
        if i == 'Departure':
            predictedDf.loc[j, 'pred_or_city'] = entities_pred['Departure'][0]
        if i == 'Destination':
            predictedDf.loc[j, 'pred_dst_city'] = entities_pred['Destination'][0]
        if i == 'StartDate':
            predictedDf.loc[j, 'pred_str_date'] = entities_pred['StartDate'][0]
        if i == 'EndDate':
            predictedDf.loc[j, 'pred_end_date'] = entities_pred['EndDate'][0]
        if i == 'Budget':
            predictedDf.loc[j, 'pred_budget'] = entities_pred['Budget'][0]
predictedDf.head()

Unnamed: 0,text,labels,author,timestamp,db,intent,or_city,dst_city,str_date,end_date,budget,pred_or_city,pred_dst_city,pred_str_date,pred_end_date,pred_budget
0,Do you have trips out of Tel Aviv?,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472750000000.0,,book,Tel Aviv,,,,,Tel Aviv?,,,,
1,I'd like to find a vacation from Kabul to Sant...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472059000000.0,,book,Kabul,Santiago,-1,,1800.0,Kabul,Santiago,,,1800.0
2,I want to throw my parents on a plane and get ...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1473445000000.0,,book,,,,,,,,,,
3,Please check if there is a flight to Naples fr...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1471880000000.0,,book,Fort Lauderdale,Naples,August 31,,,Fort Lauderdale,Naples,August 31,,
4,Please find a flight from Beijing to Kochi for...,"{'acts': [{'args': [{'val': 'book', 'key': 'in...",user,1472577000000.0,,book,Beijing,Kochi,Sept 16,20.0,,Beijing,Kochi,,20.0,


In [41]:
predictedDf.isnull().sum()

text               0
labels             0
author             0
timestamp          0
db               170
intent             0
or_city           68
dst_city          53
str_date         110
end_date         138
budget           136
pred_or_city      76
pred_dst_city     68
pred_str_date    119
pred_end_date    141
pred_budget      145
dtype: int64

In [63]:
from sklearn.metrics import accuracy_score

predictedDf = predictedDf.replace(np.nan, "")
entities_list = ['Origin city', 'Destination city', 'Start date', 'End date', 'Budget']
entities_gt = ['or_city', 'dst_city', 'str_date', 'end_date', 'budget']
entities_pred = ['pred_or_city', 'pred_dst_city', 'pred_str_date', 'pred_end_date', 'pred_budget']

for i in range(len(entities_list)):
    precision = accuracy_score(predictedDf[entities_gt[i]], predictedDf[entities_pred[i]])
    print('Precision for {} is :'.format(entities_list[i]))
    print(precision)

Precision for Origin city is :
0.788235294117647
Precision for Destination city is :
0.6529411764705882
Precision for Start date is :
0.8176470588235294
Precision for End date is :
0.8823529411764706
Precision for Budget is :
0.8764705882352941


In [70]:
# Save the prediction of Luis model
predictedDf.to_csv('predicted_df.csv', index=False)