# Develop a chatbot to book vacations

He identified the following two tools to help build this V1:

- The source code of Microsoft's Python bot development framework, `Microsoft Bot Framework SDK v4 for Python`.
- The `Azure LUIS cognitive service`, which performs a semantic analysis of a message typed by the user and structures it for processing by the bot (it should let you identify the five requested elements).


As this project is iterative, we have limited the features of the chatbot's V1. The V1 must be able to identify the following five elements in the user's request:

- Departure city
- Destination city
- Desired outbound flight date
- Desired return flight date
- Maximum budget for the total ticket price
 

If one of the elements is missing, the chatbot must be able to ask the user the relevant questions (in English) to fully understand the request. Once the chatbot believes it has understood every element of the request, it must rephrase the request and ask the user to confirm that it understood correctly.
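The slot-filling behaviour described above can be sketched in a few lines. This is a minimal illustration, not the bot's actual dialog code: the slot names reuse the entity keys that appear later in this notebook (`or_city`, `dst_city`, `str_date`, `end_date`, `budget`), while the prompt wording and the helper names `next_prompt` and `summarize` are assumptions made for the example.

```python
# Hypothetical prompts, one per required slot (the wording is an assumption).
PROMPTS = {
    "or_city": "From which city will you be leaving?",
    "dst_city": "Where would you like to travel to?",
    "str_date": "On what date would you like to leave?",
    "end_date": "On what date would you like to return?",
    "budget": "What is your maximum budget for the tickets?",
}


def next_prompt(slots):
    """Return the question for the first missing slot, or None if all are filled."""
    for name, question in PROMPTS.items():
        if not slots.get(name):
            return question
    return None


def summarize(slots):
    """Rephrase a complete request so the user can confirm it."""
    return (
        f"You want to fly from {slots['or_city']} to {slots['dst_city']}, "
        f"leaving on {slots['str_date']} and returning on {slots['end_date']}, "
        f"with a maximum budget of {slots['budget']}. Is that correct?"
    )


partial = {"or_city": "Paris", "dst_city": "London", "budget": "2000"}
print(next_prompt(partial))      # asks for the first slot still missing

complete = dict(partial, str_date="2022-09-20", end_date="2022-10-04")
print(next_prompt(complete))     # nothing left to ask
print(summarize(complete))
```

Keeping the prompts in an ordered dictionary means the bot always asks for missing elements in the same order, which makes the conversation predictable and easy to test.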

It is advisable to use Azure Application Insights to monitor the chatbot's activity once it is in production.

https://openclassrooms.com/fr/paths/188/projects/725/assignment


https://medium.com/analytics-and-data/overview-of-the-different-approaches-to-putting-machinelearning-ml-models-in-production-c699b34abf86

Our strategy for creating the chatbot with LUIS is to start from a Bot Builder sample and follow these steps:
* 1. Import the JSON file into luis.ai
* 2. Update the entity and intent names
* 3. Import utterances
* 4. Train the model using the SDK
* 5. Update the bot application using the Bot Framework SDK
* 6. Test it using the SDK
* 7. Configure Application Insights
* 8. Deploy the bot to Azure


# 1. Preparing LUIS examples from the frames.json file

## LUIS Concept

The Bot Framework has built-in LUIS integration.
`LUIS` (Language Understanding Intelligent Service) helps the chatbot it is integrated into understand `Utterances`. Utterances are what people say: they are simply the user's input when interacting with a chatbot.
In LUIS, we need to define `Intents` and `Entities`. An intent is a goal: what the user's input is expected to achieve, such as scheduling a meeting or buying a ticket. `Entities` are facts or parameters linked to an intent. They refine the intent and give the bot the details it needs to act on it.



**Create a schema that matches your real-world scenario**

Create an intent for each action your bot can perform. Use entities to collect data needed to complete that action.

**Use a similar number of examples for each intent**

When LUIS makes a prediction, it gets results from all models and picks the top scoring model. Example utterances assigned to one intent act as negative examples for all other intents.
Intents with significantly more positive examples are more likely to receive positive predictions. This is called data imbalance.
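A quick way to spot data imbalance before uploading examples is to count them per intent. The sketch below is only an illustration: the example utterances, the `Cancel` and `None` intent names, and the `max_ratio` threshold are assumptions, not values taken from this project.

```python
from collections import Counter

# Hypothetical labelled examples: (utterance, intent) pairs.
examples = [
    ("book a flight to paris", "BookFlight"),
    ("i need a trip to london", "BookFlight"),
    ("fly me to rome next week", "BookFlight"),
    ("find me a flight to berlin", "BookFlight"),
    ("cancel my booking", "Cancel"),
    ("hello", "None"),
    ("what is the weather", "None"),
]


def imbalanced_intents(examples, max_ratio=2.0):
    """Return intents whose example count exceeds max_ratio times the smallest count."""
    counts = Counter(intent for _, intent in examples)
    smallest = min(counts.values())
    return {name: n for name, n in counts.items() if n > max_ratio * smallest}


print(imbalanced_intents(examples))
```

Intents reported by this check are candidates for trimming, or the under-represented intents are candidates for more examples.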

**Use examples with distinct vocabulary**

Make sure the vocabulary for each intent is used only in that intent. Intents with overlapping vocabulary can confuse LUIS and cause the scores for the top intents to be very close.
This causes unclear predictions.
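Vocabulary overlap between two intents can be estimated with a simple set comparison. The following sketch assumes whitespace tokenisation and uses the Jaccard similarity as the overlap measure; both choices, and the example utterances, are assumptions made for illustration rather than anything LUIS itself computes.

```python
def vocabulary(utterances):
    """Set of lowercase whitespace-separated tokens across the utterances."""
    return {token for text in utterances for token in text.lower().split()}


def jaccard(a, b):
    """Jaccard similarity between two token sets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical example utterances per intent.
by_intent = {
    "BookFlight": ["book a flight to paris", "book a trip to london"],
    "Cancel": ["cancel my booking", "cancel the trip"],
}

book_vocab = vocabulary(by_intent["BookFlight"])
cancel_vocab = vocabulary(by_intent["Cancel"])
shared = book_vocab & cancel_vocab
overlap = jaccard(book_vocab, cancel_vocab)
print(round(overlap, 2), sorted(shared))
```

A high similarity, or a shared word that carries a lot of meaning, suggests rewording the examples of one of the two intents.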


**Apply active learning to add user utterances**
Use active learning to add real-world user utterances to your model. Real user utterances can reveal patterns of word choice and placement. LUIS is meant to learn quickly with few examples. User utterances will help you find examples with a variety of formats.


**Add phrase lists and patterns**
Phrase lists designate words or phrases as more significant to your app than other words. Patterns can improve your prediction accuracy when word order is important to differentiate between intents. Phrase lists and patterns are weighted more heavily than examples, so they should be added after adding user utterances to your example data.



* A **`turn`** is one interaction between the user and the system, and a conversation is made of at least two turns: a question and an answer.

# Dataset Format

## Global Properties
* **id**	
Refers to a unique identification for the dialogue.
* **user_id**	
Refers to a unique identifier for the user taking part in the dialogue.
* **wizard_id**	
Refers to a unique identifier for the wizard taking part in the dialogue.



## Labels

* **userSurveyRating** 	
A value that represents the user’s satisfaction with the Wizard’s service, ranging from 1 – complete dissatisfaction to 5 – complete satisfaction.
* **wizardSurveyTaskSuccessful**	
A boolean which is true if the wizard thinks at the end of the dialogue that the user’s goal was achieved.


## Turns

* **author**
The author of the message in a dialogue. i.e. “user” or “wizard”.
* **text**
The sentence that the author uttered. It is the exact text that the author of a turn said. E.g. “text”: “Consider it done. Have a great trip!”.
* **labels**
JSON object which has three keys: `active_frame`, `acts`, and `acts_without_refs`. 
 - The **active_frame** is the `id` of the currently active frame. 
 - The **acts** are the dialogue acts for the current utterance. Each act has a `name` and arguments (`args`). The name is the name of the dialogue act, for instance, `offer`, or `inform`. The args contain the slot types (`key`) and slot values (`val`), for instance `budget=2000`. Slot values are optional. An act contains a `ref` tag whenever a user or wizard refers to a past frame. 
 - The **acts_without_refs** are similar to the acts except that they do not have these `ref tags`. 
 - We define the **frame tracking task** as the task that takes as input the `acts_without_refs` and outputs the `acts`.
* **timestamp**	
Unix timestamp denoting the time at which the current turn occurred.
* **frames**
List of frames up to the current turn. Each frame has the following keys: `frame_id`, `frame_parent_id`, `requests`, `binary_questions`, `compare_requests`, and `info`.
* **db**
It can only occur during a wizard’s turn. It is a list of search queries made by the wizard with the associated list of search results.
E.g. “db”: {“search”: [{“ORIGIN_CITY”: “Montreal”}], “result”: []}


In [583]:
import json
import time
import datetime
from pprint import pprint

with open('./data/frames/frames.json') as inputfile:
    frames = json.load(inputfile)

In [584]:
frames[0].keys()

dict_keys(['user_id', 'turns', 'wizard_id', 'id', 'labels'])

In [585]:
len(frames)

1369

In [510]:
frames[0]['turns'][0].keys()

dict_keys(['text', 'labels', 'author', 'timestamp'])

We work on the first turn of each dialogue, and from that turn we keep the `text` and the `labels`.

In [587]:
frames[0]['turns'][0]['labels']['acts']

[{'args': [{'val': 'book', 'key': 'intent'}], 'name': 'inform'},
 {'args': [{'val': 'Atlantis', 'key': 'dst_city'},
   {'val': 'Caprica', 'key': 'or_city'},
   {'val': 'Saturday, August 13, 2016', 'key': 'str_date'},
   {'val': '8', 'key': 'n_adults'},
   {'val': '1700', 'key': 'budget'}],
  'name': 'inform'}]

In [589]:
frames[0]['turns'][0]['text']

"I'd like to book a trip to Atlantis from Caprica on Saturday, August 13, 2016 for 8 adults. I have a tight budget of 1700."

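The `acts` structure shown above can be flattened into a plain `{slot: value}` dictionary. The sketch below hard-codes the first turn of the first dialogue so that it runs without the dataset; the helper name `slots_from_turn` is an assumption introduced for the example.

```python
# First user turn of the first dialogue, copied from the outputs above.
turn = {
    "author": "user",
    "text": "I'd like to book a trip to Atlantis from Caprica on Saturday, "
            "August 13, 2016 for 8 adults. I have a tight budget of 1700.",
    "labels": {
        "acts": [
            {"name": "inform", "args": [{"key": "intent", "val": "book"}]},
            {"name": "inform", "args": [
                {"key": "dst_city", "val": "Atlantis"},
                {"key": "or_city", "val": "Caprica"},
                {"key": "str_date", "val": "Saturday, August 13, 2016"},
                {"key": "n_adults", "val": "8"},
                {"key": "budget", "val": "1700"},
            ]},
        ]
    },
}


def slots_from_turn(turn, wanted):
    """Collect {key: val} for the wanted slot keys; args without a value are skipped."""
    slots = {}
    for act in turn["labels"]["acts"]:
        for arg in act["args"]:
            if arg["key"] in wanted and "val" in arg:
                slots[arg["key"]] = arg["val"]
    return slots


wanted = {"or_city", "dst_city", "str_date", "end_date", "budget"}
print(slots_from_turn(turn, wanted))
```

Slots that are not part of the V1 scope, such as `n_adults`, are ignored, and `end_date` is simply absent because the user did not mention a return date.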
# 2. Creating and training the LUIS app

We created the LUIS model from a JSON file taken from a sample (BookingFlight.json), imported that file into LUIS, and then adapted the model to our case.

Download the sample from Microsoft's GitHub repository:
https://github.com/microsoft/BotBuilder-Samples/tree/main/samples/python/21.corebot-app-insights/cognitiveModels

![image.png](attachment:image.png)


In [None]:
# LUIS CREATION

In [488]:
from azure.cognitiveservices.language.luis.authoring import LUISAuthoringClient
from azure.cognitiveservices.language.luis.authoring.models import ApplicationCreateObject
from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient
from msrest.authentication import CognitiveServicesCredentials
from functools import reduce

import json, time, uuid


import pandas as pd


# AUTH
authoringKey = ''
authoringEndpoint = 'https://openai-authoring.cognitiveservices.azure.com/'


predictionKey=""
predictionEndpoint="https://francecentral.api.cognitive.microsoft.com/"

# DEFINE THE CLIENT
client = LUISAuthoringClient(authoringEndpoint, CognitiveServicesCredentials(authoringKey))

# the app Id of the imported 
app_id="14445580-ac63-4dee-8408-71d066f97912"

version_id = "0.1"

# 4. Train the model using the Bot Framework SDK and LUIS

In [518]:
# UTILS
from IPython.display import clear_output  # used by build_training_list below


def get_example_label(utterance, entity_name, value):
    """
    args:
        utterance : raw input text
        entity_name : the name of the entity
        value : the word or phrase to look up in the utterance
    return:
        dict with the entity name and its start/end character indexes

    """
    utterance = utterance.lower()
    value = value.lower()
    return {
        'entity_name': entity_name,
        'start_char_index': utterance.find(value),
        'end_char_index': utterance.find(value) + len(value)
    }

# slot keys to extract from the dataset
v3_keys=['or_city', 'dst_city', 'budget', 'str_date', 'end_date']


def build_training_list(frames=frames, noref=False):
    """
    args:
        frames: the JSON dataset (list of dialogues)
        noref: if True, use `acts_without_refs` instead of `acts`

    return:
        list of {'text', 'labels'} dicts, one per well-rated dialogue
    """
    training_list = []

    for i, dialogue in enumerate(frames):
        print(f'now_processing_interaction {i} of {len(frames)}')
        clear_output(wait=True)

        # a missing or empty userSurveyRating counts as 0
        rating = dialogue['labels'].get('userSurveyRating') or 0

        # keep only well-rated dialogues: ratings 4 and 5
        if rating >= 4:
            first_turn = dialogue['turns'][0]
            acts_key = 'acts_without_refs' if noref else 'acts'
            training_list.append({
                'text': first_turn['text'],
                'labels': first_turn['labels'][acts_key],
            })

    return training_list

In [519]:
def create_utterances_list(row_id):
    """
    Build one LUIS example utterance from the `frame_list` data.
    args:
        row_id: index of the entry in `frame_list`
    returns:
        an utterance dict ready to be imported into LUIS
    """
    args_list = []
    utt_arg = {}

    for arg in frame_list[row_id]['labels']:
        if len(arg['args'])!=0:
            for elm in arg['args']:
                if elm['key'] in v3_keys:
                    if len(elm.keys())==2 :
                        args_list.append({elm['key'] : elm['val']})
    
    text=frame_list[row_id]['text']
    
    entities_list=[]
    for i in args_list:
        for k, v in i.items():
            entities_list.append(get_example_label(text, k, v))
        
    
    utt_arg['text']=text
    utt_arg['intent_name']='BookFlight'
    utt_arg['entity_labels']=entities_list
    
    
    return  utt_arg

def create_utterance_to_upload(sample_size):
    """ take sample and return a list of utterances
    """
    utterances=[]
    for row_id in range(sample_size):
        utterances.append(create_utterances_list(row_id))
    
    return utterances



def upload_batch_utterances(utterances, batch_size=100):
    """Upload the utterances to LUIS in batches of `batch_size`.

    The last batch may be smaller than `batch_size`.
    """
    for start in range(0, len(utterances), batch_size):
        batch = utterances[start:start + batch_size]
        client.examples.batch(app_id, version_id, batch)
    print("Utteranes are imported!!")

In [523]:
app_id, version_id

('14445580-ac63-4dee-8408-71d066f97912', '0.1')

In [524]:
client.model.add_composite_entity(app_id, version_id, children=['datetimeV2'], name='str_date')

'3a672679-617b-463c-b80d-022c94c3cb61'

In [525]:
client.model.add_composite_entity(app_id, version_id, children=['datetimeV2'], name='end_date')

'de8cd511-38ac-484f-b8ed-f0b11eb8e4c1'

In [527]:
# UPLOAD UTTERANCES

In [520]:
frame_list=build_training_list()

now_processing_interaction 1368 of 1369


In [590]:
# example of the utterance at index 0
create_utterances_list(0)

{'text': "I'd like to book a trip to Atlantis from Caprica on Saturday, August 13, 2016 for 8 adults. I have a tight budget of 1700.",
 'intent_name': 'BookFlight',
 'entity_labels': [{'entity_name': 'dst_city',
   'start_char_index': 27,
   'end_char_index': 35},
  {'entity_name': 'or_city', 'start_char_index': 41, 'end_char_index': 48},
  {'entity_name': 'str_date', 'start_char_index': 52, 'end_char_index': 77},
  {'entity_name': 'budget', 'start_char_index': 117, 'end_char_index': 121}]}
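Because the labels are computed with `str.find`, a value that does not occur verbatim in the text yields a start index of -1. The sketch below shows one way to detect that case before uploading; `find_span` is a hypothetical helper, and `'Gotham'` is a made-up value used only to trigger the not-found branch.

```python
def find_span(utterance, value):
    """Return (start, end) of value in utterance, case-insensitively, or None if absent.

    `end` is computed as start + len(value), as in the notebook's helper.
    """
    start = utterance.lower().find(value.lower())
    if start == -1:
        return None
    return start, start + len(value)


text = ("I'd like to book a trip to Atlantis from Caprica on Saturday, "
        "August 13, 2016 for 8 adults. I have a tight budget of 1700.")

for value in ["Atlantis", "Caprica", "1700", "Gotham"]:
    span = find_span(text, value)
    if span is None:
        print(f"{value!r}: not found, this label should be skipped")
    else:
        start, end = span
        print(f"{value!r}: {span} -> {text[start:end]!r}")
```

Slicing the text with the returned indexes is a cheap sanity check that each label really points at the intended words.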

In [532]:
# 1000 input utterances
utterances=create_utterance_to_upload(1000)

In [533]:
upload_batch_utterances(utterances)

Utteranes are imported!!


In [517]:
# HERE THE LUIS MODEL APP_ID AND VERSION_ID
app_id, version_id

('14445580-ac63-4dee-8408-71d066f97912', '0.1')

In [470]:
# TRAINING
client.train.train_version(app_id, version_id)
waiting = True
while waiting:
    info = client.train.get_status(app_id, version_id)

    # get_status returns a list of training statuses, one for each model. 
    # Loop through them and make sure all are done.
    waiting = any(map(lambda x: 'Queued' == x.details.status or 'InProgress' == x.details.status, info))
    if waiting:
        print ("Waiting 10 seconds for training to complete...")
        time.sleep(10)
    else: 
        print ("trained")
        waiting = False

Waiting 10 seconds for training to complete...
trained
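The polling loop above waits indefinitely if training never finishes. A possible variant with a timeout is sketched below. It is decoupled from the SDK: `get_statuses` is any callable returning a list of status strings, and the function name, parameters, and fake status sequence are assumptions made for the example.

```python
import time


def wait_until_trained(get_statuses, poll_seconds=10, timeout_seconds=600,
                       sleep=time.sleep):
    """Poll get_statuses() until no model is 'Queued' or 'InProgress'.

    Raises TimeoutError if training is still pending after timeout_seconds.
    """
    waited = 0
    while True:
        statuses = get_statuses()
        if not any(s in ("Queued", "InProgress") for s in statuses):
            return statuses
        if waited >= timeout_seconds:
            raise TimeoutError(f"training still pending after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Fake status source standing in for the training service (demo only).
responses = iter([
    ["Queued", "InProgress"],
    ["InProgress", "Success"],
    ["Success", "UpToDate"],
])
result = wait_until_trained(lambda: next(responses), poll_seconds=0,
                            sleep=lambda seconds: None)
print(result)
```

With the real client, the callable could be something like `lambda: [m.details.status for m in client.train.get_status(app_id, version_id)]`, which mirrors the attributes used in the loop above.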


In [471]:
client.apps.update_settings(app_id, is_public=True)

responseEndpointInfo = client.apps.publish(app_id, version_id, is_staging=False)

In [473]:
#responseEndpointInfo

In [592]:
runtimeCredentials = CognitiveServicesCredentials(predictionKey)
clientRuntime = LUISRuntimeClient(endpoint=predictionEndpoint, credentials=runtimeCredentials)

In [596]:
response = clientRuntime.prediction.resolve(CONFIG.LUIS_APP_ID, query=request)

In [598]:
all_entities = response.entities

for i in range(0, len(all_entities)):
    print(all_entities[i])

{'additional_properties': {'score': 0.75221944}, 'entity': '2300', 'type': 'budget', 'start_index': 136, 'end_index': 139}
{'additional_properties': {'score': 0.99885434}, 'entity': 'leon', 'type': 'dst_city', 'start_index': 114, 'end_index': 117}
{'additional_properties': {'score': 0.9999425}, 'entity': 'sao paulo', 'type': 'or_city', 'start_index': 100, 'end_index': 109}
{'additional_properties': {'score': 0.9812809}, 'entity': 'september 20th 2022', 'type': 'str_date', 'start_index': 44, 'end_index': 62}
{'additional_properties': {'score': 0.75831884}, 'entity': 'october 4th 2022', 'type': 'str_date', 'start_index': 78, 'end_index': 93}
{'additional_properties': {'resolution': {'values': [{'timex': '2022-09-20', 'Mod': 'after', 'type': 'daterange', 'sourceEntity': 'datetimepoint', 'start': '2022-09-20'}]}}, 'entity': 'starting on september 20th 2022', 'type': 'builtin.datetimeV2.daterange', 'start_index': 32, 'end_index': 62}
{'additional_properties': {'resolution': {'values': [{'ti

In [601]:
for i in range(0, len(all_entities)):
    print(all_entities[i].type)

budget
dst_city
or_city
str_date
str_date
builtin.datetimeV2.daterange
builtin.datetimeV2.daterange
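In the output above, both dates come back with the `str_date` type, so the bot needs a rule for choosing between candidates of the same type. The sketch below keeps the highest-scoring candidate per type. It works on plain dictionaries copied from the printed entities (in the real objects the score sits under `additional_properties`), and the selection policy itself is an assumption, not the project's actual logic.

```python
# Entities copied from the prediction output above, reduced to plain dicts.
entities = [
    {"type": "budget", "entity": "2300", "score": 0.75221944},
    {"type": "dst_city", "entity": "leon", "score": 0.99885434},
    {"type": "or_city", "entity": "sao paulo", "score": 0.9999425},
    {"type": "str_date", "entity": "september 20th 2022", "score": 0.9812809},
    {"type": "str_date", "entity": "october 4th 2022", "score": 0.75831884},
]


def best_by_type(entities):
    """Keep, for each entity type, the text of the candidate with the highest score."""
    best = {}
    for ent in entities:
        current = best.get(ent["type"])
        if current is None or ent["score"] > current["score"]:
            best[ent["type"]] = ent
    return {etype: ent["entity"] for etype, ent in best.items()}


print(best_by_type(entities))
```

With this policy the second date is dropped, which leaves `end_date` unfilled; the `builtin.datetimeV2.daterange` entities are then a natural fallback for recovering the return date.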


# 5. Unit tests and production deployment of the chatbot

In [534]:
%%writefile config.py

import os


class DefaultConfig:
    """Bot Configuration"""

    ############## Azure Bot Service ###############
    PORT = 3978
    APP_ID = os.environ.get("MicrosoftAppId", "") 
    APP_PASSWORD = os.environ.get("MicrosoftAppPassword", "")

    ############## LUIS Service ###############
    LUIS_APP_ID = os.environ.get("LuisAppId", "14445580-ac63-4dee-8408-71d066f97912")
    LUIS_API_KEY = os.environ.get("LuisAPIKey", "78583ade4d3d41efad110cdbb7018d52")
    LUIS_API_HOST_NAME = os.environ.get("LuisAPIHostName", "https://francecentral.api.cognitive.microsoft.com/")

    ############## App Insights Service ###############
    APPINSIGHTS_INSTRUMENTATION_KEY = os.environ.get(
        "AppInsightsInstrumentationKey", "59f6eb7d-ae5f-4f72-80c4-c3fd8f7c05e1")

Writing config.py


In [537]:
from azure.cognitiveservices.language.luis.runtime import LUISRuntimeClient
from msrest.authentication import CognitiveServicesCredentials
from config import DefaultConfig

CONFIG = DefaultConfig()


def test_luis_intent():
    """Check LUIS non-regression on *Top intent*
    """
    # Instantiate prediction client
    clientRuntime = LUISRuntimeClient(
        CONFIG.LUIS_API_HOST_NAME,
        CognitiveServicesCredentials(CONFIG.LUIS_API_KEY))
    
    # Create request
    #request ='book a flight from Tunis to Toronto between 22 October 2021 to 5 November 2021, for a budget of $3500'
    request = "Travel to Paris"
    # Get response
    response = clientRuntime.prediction.resolve(CONFIG.LUIS_APP_ID, query=request)

    check_top_intent = 'BookFlight'
    is_top_intent = response.top_scoring_intent.intent
    #print(" Intent ? ",  is_top_intent)
    assert check_top_intent == is_top_intent
    print("Intent is checked ", is_top_intent)
    
    
def test_luis_budget():
    """Check LUIS non-regression on *Budget*
    """
    # Instantiate prediction client
    clientRuntime = LUISRuntimeClient(
        CONFIG.LUIS_API_HOST_NAME,
        CognitiveServicesCredentials(CONFIG.LUIS_API_KEY))
    
    # Create request
    request = 'from paris to london with 2000'

    # Get response
    response = clientRuntime.prediction.resolve(CONFIG.LUIS_APP_ID, query=request)
    
    check_budget = '2000'
    all_entities = response.entities

    # None means no budget entity was returned; the assert below then fails cleanly
    is_budget = None
    for entity in all_entities:
        if entity.type == 'budget':
            is_budget = entity.entity

    assert check_budget == is_budget
    print("Budget checked ", is_budget)

In [536]:
test_luis_intent()

Intent is checked  BookFlight


In [538]:
test_luis_budget()

Budget checked  2000
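The two tests above call the live LUIS endpoint, so they need network access and valid keys. The entity lookup can also be tested offline against a stub response, as sketched below. `find_entity` and the stub are hypothetical; the stub only mimics the attributes used in the tests above (`entities`, `type`, `entity`, `top_scoring_intent.intent`).

```python
from types import SimpleNamespace


def find_entity(response, entity_type):
    """Return the text of the first entity of the given type, or None."""
    for ent in response.entities:
        if ent.type == entity_type:
            return ent.entity
    return None


# Stub response with the same attribute names as the prediction result.
stub = SimpleNamespace(
    top_scoring_intent=SimpleNamespace(intent="BookFlight"),
    entities=[
        SimpleNamespace(type="or_city", entity="paris"),
        SimpleNamespace(type="dst_city", entity="london"),
        SimpleNamespace(type="budget", entity="2000"),
    ],
)


def test_find_budget():
    assert find_entity(stub, "budget") == "2000"


def test_missing_entity():
    assert find_entity(stub, "end_date") is None


test_find_budget()
test_missing_entity()
print("stub tests passed")
```

Offline tests like these check the bot's own parsing logic, while the live tests remain useful as non-regression checks on the trained model.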


In [577]:
#request ='i want to go to denver from frankfurt for under 2900 from september 8th to 13th'
request="i would like to find a vacation between september 20th 2022 and october 4th 2022 from sao \
 paulo to leon with a budget of 2300 dollars"

In [580]:
request="i would like to find a vacation starting on september 20th 2022 and ending on october 4th 2022 from sao \
 paulo to London with a budget of 2300 dollars"

In [581]:
response = clientRuntime.prediction.resolve(CONFIG.LUIS_APP_ID, query=request)

In [582]:
all_entities = response.entities
for i in range(0, len(all_entities)):
    #print(all_entities[i].type)
    if all_entities[i].type == 'builtin.datetimeV2.daterange':
        print(all_entities[i])
        d=all_entities[i]
     #   is_budget = all_entities[i].entity

{'additional_properties': {'resolution': {'values': [{'timex': '2022-09-20', 'Mod': 'after', 'type': 'daterange', 'sourceEntity': 'datetimepoint', 'start': '2022-09-20'}]}}, 'entity': 'starting on september 20th 2022', 'type': 'builtin.datetimeV2.daterange', 'start_index': 32, 'end_index': 62}
{'additional_properties': {'resolution': {'values': [{'timex': '2022-10-04', 'Mod': 'before', 'type': 'daterange', 'sourceEntity': 'datetimepoint', 'end': '2022-10-04'}]}}, 'entity': 'ending on october 4th 2022', 'type': 'builtin.datetimeV2.daterange', 'start_index': 68, 'end_index': 93}
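The two `daterange` resolutions above each carry only one bound: a `start` for the departure and an `end` for the return. A minimal sketch of turning them into trip dates follows. The resolution values are copied from the output above, while the helper name `trip_dates` and the mapping to `str_date` and `end_date` are assumptions made for illustration.

```python
from datetime import date

# Resolution values copied from the two daterange entities printed above.
resolutions = [
    {"timex": "2022-09-20", "Mod": "after", "type": "daterange",
     "sourceEntity": "datetimepoint", "start": "2022-09-20"},
    {"timex": "2022-10-04", "Mod": "before", "type": "daterange",
     "sourceEntity": "datetimepoint", "end": "2022-10-04"},
]


def trip_dates(resolutions):
    """Map a start-only range to str_date and an end-only range to end_date."""
    dates = {}
    for value in resolutions:
        if "start" in value and "end" not in value:
            dates["str_date"] = date.fromisoformat(value["start"])
        elif "end" in value and "start" not in value:
            dates["end_date"] = date.fromisoformat(value["end"])
    return dates


dates = trip_dates(resolutions)
print(dates)
print((dates["end_date"] - dates["str_date"]).days, "days")
```

Parsing the ISO strings into `date` objects also makes it easy to validate the request, for example by checking that the return date falls after the departure date.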
