<img src='../images/bildungscampus_logo.png' width="40%" align="left" />
<img src='../images/hhn.png' width="25%" align="right" />

# Step 5: Response Generation
Master's thesis - Sebastian Kahlert | Fakultät Wirtschaft und Verkehr | Wirtschaftsinformatik - Informationsmanagement und Data Science | WS 2021/22

<img src='../images/bar.png'/>

## Generating the Responses for the User

This step implements a new function that takes the output of the action execution and generates a response for the user from it. The responses are predefined templates with placeholders that have to be filled in.

### 5.1. Importing the Required Libraries

In [34]:
# import of necessary libraries
import nltk 
from nltk.stem.lancaster import LancasterStemmer
stemmer = LancasterStemmer()

import datefinder
import numpy
import tflearn 
import tensorflow
import random 
import json
import pickle
import spacy
import sqlite3
import re
import os
from difflib import SequenceMatcher

### 5.2. Intent Classification

The neural network was created, trained, and saved beforehand in a separate notebook. The following code tries to load the saved neural network. If that is not possible, a new neural network is created and trained. To this end, the training data "data.pickle" is loaded and the layers of the network are defined.

In [35]:
# open training data for neural net
with open("../2. Intent Classififcation/data.pickle", "rb") as f:
    words, labels, training, output = pickle.load(f)

In [36]:
# open saved model or implement new one if model does not exist yet
tensorflow.compat.v1.reset_default_graph()
    
# Create the neural network
net = tflearn.input_data(shape=[None, len(training[0])])  # input layer: one neuron per word in the vocabulary
net = tflearn.fully_connected(net, 8)  # hidden layer: fully connected, 8 neurons
net = tflearn.fully_connected(net, len(output[0]), activation="softmax")  # output layer: one neuron per intent label
net = tflearn.regression(net)

model = tflearn.DNN(net)

try:
    model.load(r"..\2. Intent Classififcation\model.tflearn")
except Exception:
    # the saved model could not be loaded: train a new one and save it
    model.fit(training, output, n_epoch=1000, batch_size=8, show_metric=True)
    model.save(r"..\2. Intent Classififcation\model.tflearn")

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Restoring parameters from C:\Users\Sebi\OneDrive\Studium\Thesis_Chatbot\2. Intent Classififcation\model.tflearn


In [40]:
model

<tflearn.models.dnn.DNN at 0x2787b7f53c8>

In [41]:
# convert the input sentence into a bag-of-words vector that the neural net can read
def bag_of_words(s, words):
    bag = [0 for _ in range(len(words))]
    
    s_words = nltk.word_tokenize(s)
    s_words = [stemmer.stem(word.lower()) for word in s_words]
    
    for se in s_words:
        for i, w in enumerate(words):
            if w == se:
                bag[i] = 1
    
    return numpy.array(bag)
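
To illustrate the encoding, here is a minimal sketch of a bag-of-words vector in plain Python. It assumes a simple lowercase and whitespace split instead of the NLTK tokenizer and Lancaster stemmer used above; `simple_bag_of_words` and `vocabulary` are hypothetical names that do not appear elsewhere in the notebook.

```python
def simple_bag_of_words(sentence, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in the sentence."""
    # assumption: lowercase + whitespace split stands in for tokenizing and stemming
    tokens = set(sentence.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]


vocabulary = ["email", "hello", "office", "phone"]  # hypothetical vocabulary
vector = simple_bag_of_words("Hello where is the office", vocabulary)
print(vector)  # [0, 1, 1, 0]
```

The vector always has the length of the vocabulary, which is why the input layer of the network is sized by the number of known words.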

### 5.3. Slot Filling

To extract the various pieces of additional information, the already known names must, among other things, be read from the database and stored in lists. A function compares the user's input with these names to check whether the professor mentioned by the user is known. In addition, the previously trained custom NER models are loaded and the slot filling function is defined.
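
The name comparison can be sketched as follows. This is only an illustration under assumptions: `match_known_name` and `known_names` are hypothetical, the names are made-up stand-ins for the database content, and unlike the notebook's function this sketch lowercases both sides and returns the first match instead of the last one.

```python
from difflib import SequenceMatcher


def match_known_name(user_input, known_names, threshold=0.7):
    """Return the first known name that is similar enough to a word of the input, else None."""
    for name in known_names:
        for word in user_input.split():
            # ratio() is 1.0 for identical strings and drops as they diverge
            if SequenceMatcher(None, name.lower(), word.lower()).ratio() >= threshold:
                return name
    return None


known_names = ["Stern", "Lanquillon"]  # hypothetical stand-in for the names from the database
print(match_known_name("what is the office of herr stern", known_names))
print(match_known_name("what is the email of lanquilon", known_names))
```

Because the comparison is fuzzy, a misspelled name such as "lanquilon" still resolves to the known entry, at the cost of possible false matches for short or similar words.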

#### 5.3.1. First Names

In [42]:
# get first names from database
conn = sqlite3.connect("../PROF_INFO_DB.db")
cur = conn.cursor()
cur.execute("select first_name FROM PROF_INFO_TABLE")
conn.commit()
first_names = cur.fetchall()
conn.close()

#first_names

In [43]:
# store first names in list
# each row is a tuple; join its fields and strip surrounding whitespace
first_names_list = [" ".join(row).strip() for row in first_names]

#first_names_list

#### 5.3.2. Last Names

In [44]:
# get last names from database
conn = sqlite3.connect("../PROF_INFO_DB.db")
cur = conn.cursor()
cur.execute("select last_name FROM PROF_INFO_TABLE")
conn.commit()
last_names = cur.fetchall()
conn.close()

#last_names

In [45]:
# store last names in list
# each row is a tuple; join its fields and strip surrounding whitespace
last_names_list = [" ".join(row).strip() for row in last_names]

#last_names_list

#### 5.3.3. Research Areas and Degree Programmes

In [46]:
#load the custom NER models
nlp_research_area = spacy.load(r"../3. Slot Filling/custom_ner_model_research_area")
nlp_study = spacy.load(r"../3. Slot Filling/custom_ner_model_study")

#### 5.3.4. Slot Filling Function

In [47]:
def slot_filling(inp, intent):
    # lists of intents, used to decide how information is extracted
    intent_prof_contact=["prof_telephone_query_name", "prof_email_query_name", "prof_office_query_name", "prof_research_area_query_name", "prof_study_query_name"]
    intent_generic_conversation=["greeting","greeting_response","courtesy_greeting","courtesy_greeting_response","real_name_query", "goodbye","task_response"]
        
    if intent == "prof_name_query_telephone":
        re_number_1 = r"\D[\d]{2}? [\d]{4} [\d]{3} [\d]{4}"
        re_number_2 = r"\D[\d]{2}? [\d]{4} [\d]{3} [\d]{3}"
        re_number_3 = r"\D[\d]{2} [\(][\d]{1}[\)] [\d]{4} [\d]{3} [\d]{4}"
        extracted_info = re.compile("(%s|%s|%s)" % (re_number_1, re_number_2, re_number_3)).findall(inp) 

    elif intent == "prof_name_query_email":
        extracted_info = re.findall(r"\S+@\S+", inp)

    elif intent == "prof_name_query_office":
        re_office_1 = r"[A-Z, a-z].\d{1}.\d{2}"
        re_office_2 = r"[A-Z, a-z].\d{3}"
        extracted_info = re.compile("(%s|%s)" % (re_office_1, re_office_2)).findall(inp)

    elif intent == "prof_name_query_research_area":
        doc = nlp_research_area(inp)
        for ent in doc.ents:
            if ent.label_ == "RESEARCH_AREA":
                extracted_info = ent.text    

    elif intent == "prof_name_query_study":
        doc = nlp_study(inp)
        for ent in doc.ents:
            if ent.label_ == "STUDY":
                extracted_info = ent.text  

    elif intent == "prof_name_query_lastname":
        for first_name in first_names_list:
            for word in inp.split():
                if SequenceMatcher(None, first_name, word).ratio() >= 0.7:
                    extracted_info = first_name

    elif intent == "prof_name_query_firstname":
        for last_name in last_names_list:
            for word in inp.split():
                if SequenceMatcher(None, last_name, word).ratio() >= 0.7:
                    extracted_info = last_name

    elif intent in intent_prof_contact:
        for last_name in last_names_list:
            for word in inp.split():
                if SequenceMatcher(None, last_name, word).ratio() >= 0.7:
                    extracted_info = last_name

    elif intent in intent_generic_conversation:
        extracted_info = "generic intent no need for info extraction"

    else:
        extracted_info = "nobody/ nothing"
    
    return extracted_info
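
As a small illustration of the regex-based branches, the following sketch applies the same e-mail pattern to two made-up inputs. The address and the helper names `extract_emails` and `extract_emails_clean` are hypothetical; stripping trailing punctuation is an optional addition and not part of the function above.

```python
import re

EMAIL_PATTERN = re.compile(r"\S+@\S+")  # same pattern as in slot_filling


def extract_emails(text):
    """Return every whitespace-delimited token that contains an '@'."""
    return EMAIL_PATTERN.findall(text)


def extract_emails_clean(text):
    """Assumption: trailing sentence punctuation is not part of the address."""
    return [match.rstrip("?.!,") for match in extract_emails(text)]


print(extract_emails("who has the email erika.mustermann@example.org"))
# ['erika.mustermann@example.org']
print(extract_emails("is it erika.mustermann@example.org?"))
# ['erika.mustermann@example.org?']  (the question mark is captured too)
print(extract_emails_clean("is it erika.mustermann@example.org?"))
# ['erika.mustermann@example.org']
```

The second call shows why a pattern this permissive can miss the database entry when the user ends the sentence with punctuation.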

### 5.4. Action Execution

Depending on the intent, a predefined SQL query with a placeholder is executed. Before that, the placeholder must be filled with the information extracted during slot filling. The output is the information retrieved from the database.
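
The placeholder mechanism can be sketched with an in-memory database. Everything in this example is an assumption for illustration: the table is reduced to three columns, the row is made-up data, and `lookup_office` is a hypothetical helper rather than part of the notebook.

```python
import sqlite3

# assumption: a tiny in-memory table with made-up data stands in for PROF_INFO_DB.db
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE PROF_INFO_TABLE (first_name TEXT, last_name TEXT, office TEXT)")
cur.execute("INSERT INTO PROF_INFO_TABLE VALUES (?, ?, ?)", ("Erika", "Mustermann", "A.1.01"))
conn.commit()


def lookup_office(last_name):
    """Fill the '?' placeholder with the extracted last name and run the query."""
    cur.execute(
        "SELECT office FROM PROF_INFO_TABLE WHERE last_name = ? COLLATE NOCASE",
        (last_name,),
    )
    return cur.fetchall()


rows = lookup_office("mustermann")
print(rows)  # [('A.1.01',)]
```

Passing the value as a parameter tuple lets `sqlite3` handle quoting, and `COLLATE NOCASE` makes the comparison case-insensitive, so lowercase user input still matches.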

In [51]:
def action_execution_retrieval(intent, condition=None):
    # list of intents that do not need any data from the database
    intent_generic_conversation=["greeting","greeting_response","courtesy_greeting","courtesy_greeting_response","real_name_query", "goodbye","task_response"]

    
    #connect to database
    conn = sqlite3.connect("../PROF_INFO_DB.db")
    cur = conn.cursor()
    
    #TELEPHONE
    if intent == "prof_name_query_telephone":
        cur.execute("select title, first_name, last_name FROM PROF_INFO_TABLE where telephone = ? COLLATE NOCASE", (condition[0],))
        conn.commit()
        answer = cur.fetchall()

    elif intent == "prof_name_query_email": 
        cur.execute("select title, first_name, last_name FROM PROF_INFO_TABLE where email = ? COLLATE NOCASE", (condition[0],))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_name_query_office":
        cur.execute("select title, first_name, last_name FROM PROF_INFO_TABLE where office = ? COLLATE NOCASE", (condition[0],))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_name_query_research_area":
        cur.execute("SELECT title, first_name, last_name FROM PROF_INFO_TABLE INNER JOIN PROF_RESEARCH_AREA_TABLE on PROF_RESEARCH_AREA_TABLE.prof_id = PROF_INFO_TABLE.prof_id WHERE research_area = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchone()
    
    elif intent == "prof_name_query_study":
        cur.execute("SELECT title, first_name, last_name FROM PROF_INFO_TABLE INNER JOIN PROF_STUDY_TABLE on PROF_STUDY_TABLE.prof_id = PROF_INFO_TABLE.prof_id WHERE study = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchone()
        
    elif intent == "prof_name_query_lastname":
        cur.execute("select last_name FROM PROF_INFO_TABLE where first_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_name_query_firstname":
        cur.execute("select first_name FROM PROF_INFO_TABLE where last_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_telephone_query_name":
        cur.execute("select telephone FROM PROF_INFO_TABLE where last_name =? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
    
    elif intent == "prof_email_query_name":
        cur.execute("select email FROM PROF_INFO_TABLE where last_name =? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_office_query_name":
        cur.execute("select office FROM PROF_INFO_TABLE where last_name =? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_research_area_query_name":
        cur.execute("SELECT research_area FROM PROF_RESEARCH_AREA_TABLE INNER JOIN PROF_INFO_TABLE on PROF_INFO_TABLE.prof_id = PROF_RESEARCH_AREA_TABLE.prof_id WHERE last_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_study_query_name":
        cur.execute("SELECT study FROM PROF_STUDY_TABLE INNER JOIN PROF_INFO_TABLE on PROF_INFO_TABLE.prof_id = PROF_STUDY_TABLE.prof_id WHERE last_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    
    if intent in intent_generic_conversation:
        answer = "generic conversation no data retrieval necessary"

    conn.close()  # release the database connection
    return answer

### 5.5. Response Generation

To generate the chatbot's responses, predefined templates with placeholders are used, which only need to be filled with the information extracted from the user's input and from the database. First, a JSON file is read that contains several predefined responses with placeholders for each intent. When a response is output, one of these responses is chosen at random to prevent the chatbot from giving the same answer over and over again. This is meant to make the conversation feel more natural and human.
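
A minimal sketch of this idea, assuming a made-up excerpt in the shape the loading code expects (a `responses` list whose entries have a `tag` and a list of `sentence` templates). The two template sentences, the sample values, and the helper `fill_template` are hypothetical; only the placeholder names `_condition_` and `_answer_` are taken from the notebook.

```python
import json
import random

# assumption: made-up excerpt in the shape of responses.json
raw = """
{"responses": [
  {"tag": "prof_office_query_name",
   "sentence": ["The office of _condition_ is _answer_",
                "_condition_ can be found in office _answer_"]}
]}
"""
data = json.loads(raw)
templates = {entry["tag"]: entry["sentence"] for entry in data["responses"]}


def fill_template(intent, condition, answer):
    """Pick a random template for the intent and fill both placeholders."""
    template = random.choice(templates[intent])
    return template.replace("_condition_", condition).replace("_answer_", answer)


print(fill_template("prof_office_query_name", "Mustermann", "A.1.01"))
```

Repeated calls return different sentences for the same facts, which is the variation the random choice is meant to provide.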

In [93]:
# open and load json file with responses and intents
with open("responses.json") as file:
    data = json.load(file)

In [94]:
# map each intent tag to its list of predefined response sentences
tags = []  # all intents in the json file
responses_dict = {}

for response in data["responses"]:
    # keep the first entry per tag so that tags and sentences cannot get misaligned
    if response["tag"] not in responses_dict:
        tags.append(response["tag"])
        responses_dict[response["tag"]] = response["sentence"]

In [117]:
def reponse_generation(intent, answer, condition):
    
    # generic intents need no placeholders: just pick one of the predefined responses
    generic_intents = ["greeting", "greeting_response", "courtesy_greeting", "courtesy_greeting_response", "real_name_query", "goodbye", "task_response"]

    if intent in generic_intents:
        final_response = random.choice(responses_dict[intent])
        
    elif intent == "prof_name_query_lastname":
        responses = responses_dict["prof_name_query_lastname"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0][0]) 
        
        #response = "The last name of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_name_query_firstname":
        responses = responses_dict["prof_name_query_firstname"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0][0]) 
       
        #final_response = "The first name of " + condition + " is " + answer[0][0]
    
    elif intent == "prof_name_query_telephone":
        responses = responses_dict["prof_name_query_telephone"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition[0]) 
        final_response = new_reponse.replace("_answer_", answer[0][0] + " " + answer[0][1] + " " + answer[0][2]) 
        #"The phone number " + condition[0] + " belongs to " + answer[0][0] + " " + answer[0][1] + " " + answer[0]
        
    elif intent == "prof_name_query_email": 
        responses = responses_dict["prof_name_query_email"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition[0]) 
        final_response = new_reponse.replace("_answer_", answer[0][0] + " " + answer[0][1] + " " + answer[0][2])       
        #final_response = "The email " + condition[0] + " belongs to " + answer[0][0] + " " + answer[0][1] + " " + answer[0][2]
        
    elif intent == "prof_name_query_office":
        responses = responses_dict["prof_name_query_office"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition[0]) 
        final_response = new_reponse.replace("_answer_", answer[0][0] + " " + answer[0][1] + " " + answer[0][2])      
        #final_response = "The office " + condition[0] + " belongs to " + answer[0][0] + " " + answer[0][1] + " " + answer[0][2]
        
    elif intent == "prof_name_query_research_area":
        responses = responses_dict["prof_name_query_research_area"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0] + " " + answer[1] + " " + answer[2])  
        #final_response = answer[0] + " " + answer[1] + " " + answer[2] +  " is an expert in "+ condition
    
    elif intent == "prof_name_query_study":
        responses = responses_dict["prof_name_query_study"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0] + " " + answer[1] + " " + answer[2])  
        #final_response = answer[0] + " " + answer[1] + " " + answer[2] + " is a professor in "+ condition
        
    elif intent == "prof_telephone_query_name":
        responses = responses_dict["prof_telephone_query_name"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0][0])  
        #final_response = "The phone number of " + condition + " is " + answer[0][0]
    
    elif intent == "prof_email_query_name":
        responses = responses_dict["prof_email_query_name"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0][0])  
        #final_response = "The email of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_office_query_name":
        responses = responses_dict["prof_office_query_name"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", answer[0][0])  
        #final_response = "The office of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_research_area_query_name":
        # join all research areas into one comma-separated string (no trailing comma)
        new_answer = ", ".join(row[0] for row in answer)
            
        responses = responses_dict["prof_research_area_query_name"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", new_answer)  
        #final_response = "The research areas of " + condition + " are " + new_answer
        
    elif intent == "prof_study_query_name":
        # join all degree programmes into one comma-separated string (no trailing comma)
        new_answer = ", ".join(row[0] for row in answer)
        
        responses = responses_dict["prof_study_query_name"]
        response = random.choice(responses)
        new_reponse = response.replace("_condition_", condition) 
        final_response = new_reponse.replace("_answer_", new_answer)  
        
        #final_response = "The studies of " + condition + " are " + new_answer
        
    return final_response

In [118]:
results = model.predict([bag_of_words("hello", words)])[0] #output is just a probability for each label
results_index = numpy.argmax(results) #index of greatest value
intent = labels[results_index] #output is the most probable label
print(intent)

greeting
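
The step from the network output to the intent label can be sketched in plain Python. The label list and the probabilities below are made-up values, and `most_probable_label` is a hypothetical helper that mirrors what `numpy.argmax` does in the cell above.

```python
labels = ["goodbye", "greeting", "prof_office_query_name"]  # hypothetical label list
probabilities = [0.05, 0.90, 0.05]  # hypothetical softmax output for one input


def most_probable_label(probabilities, labels):
    """Return the label with the highest probability together with that probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index], probabilities[best_index]


intent, confidence = most_probable_label(probabilities, labels)
print(intent, confidence)  # greeting 0.9
```

Returning the probability alongside the label makes it easy to inspect how certain the classifier was for a given input.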


### 5.6. Testing the Response Generation Function

In [119]:
def chat():
    log = []
    print("Start talking with me!(type quit to stop):")
    while True:
        inp = input("You: ")
        log.append(inp) #saves all input of user in list
        if inp.lower() == "quit":
            print("Goodbye :)")
            break
        
        try:
            # predict the intent
            results = model.predict([bag_of_words(inp, words)])[0]  # one probability per label
            results_index = numpy.argmax(results)  # index of the highest probability
            intent = labels[results_index]  # the most probable label

            # extract the slot value, query the database, and build the response
            extracted_info = slot_filling(inp, intent)
            answer = action_execution_retrieval(intent, extracted_info)
            response = reponse_generation(intent, answer, extracted_info)
            print(response)

        except Exception:
            # any failure in the pipeline (e.g. no slot value found) ends up here
            print("intent not found")

    return log  # returns the list of all user inputs

In [120]:
chat()

Start talking with me!(type quit to stop):
You: hello
Good Morning
You: who is an expert in machine learning
Prof. Dr.-Ing. Carsten Lanquillon should know a lot about machine learning
You: who is an expert in datenökonomie
intent not found
You: who is an expert in machine learning
machine learning is ome of the research areas of Prof. Dr.-Ing. Carsten Lanquillon
You: who is an expert in machine learning
machine learning is ome of the research areas of Prof. Dr.-Ing. Carsten Lanquillon
You: who is an expert in machine learning
machine learning is ome of the research areas of Prof. Dr.-Ing. Carsten Lanquillon
You: who is an expert in machine learning
machine learning is ome of the research areas of Prof. Dr.-Ing. Carsten Lanquillon
You: who is an expert in machine learning
machine learning is ome of the research areas of Prof. Dr.-Ing. Carsten Lanquillon
You: who is an expert in machine learning
Prof. Dr.-Ing. Carsten Lanquillon is doing research in machine learning
You: who is an expert

You: where is the office of herr stern
The research areas of Stern are Agile Methoden, Computer Supported Cooperative Work, Projektmanagement, Softwarearchitektur, Softwaretechnik, Verteilte / dezentrale Kollaboration, Wirtschaftsinformatik , 
You: what is the office numbr of herr stern 
Agile Methoden, Computer Supported Cooperative Work, Projektmanagement, Softwarearchitektur, Softwaretechnik, Verteilte / dezentrale Kollaboration, Wirtschaftsinformatik ,  are the reseach areas of Stern
You: what is the office numbr of herr stern 
Here are the reaerch areas of Stern : Agile Methoden, Computer Supported Cooperative Work, Projektmanagement, Softwarearchitektur, Softwaretechnik, Verteilte / dezentrale Kollaboration, Wirtschaftsinformatik , 
You: what is the office numbr of herr stern 
The research areas of Stern are Agile Methoden, Computer Supported Cooperative Work, Projektmanagement, Softwarearchitektur, Softwaretechnik, Verteilte / dezentrale Kollaboration, Wirtschaftsinformatik , 
Y

['hello',
 'who is an expert in machine learning',
 'who is an expert in datenökonomie',
 'who is an expert in machine learning',
 'who is an expert in machine learning',
 'who is an expert in machine learning',
 'who is an expert in machine learning',
 'who is an expert in machine learning',
 'who is an expert in machine learning',
 'who is an expert in data science',
 'who is an expert in softwarearchitekturen',
 'who is an expert in softwaretechnik',
 'hello',
 'hello',
 'im sebi',
 'my name is sebi',
 'my name is sebi',
 'my name is sebi',
 'my name is sebi',
 'my name is sebi',
 'how are you',
 'how are you',
 'how are you',
 'im good ',
 'im good thank you',
 '',
 '',
 'what is your name',
 '',
 'what is your name',
 'what is your name',
 'what is your name',
 'what is your name',
 'what is your name',
 'what is your name',
 'goodbye',
 'goodbye',
 'goodbye',
 'what are you doing',
 'what are you doing',
 'what are you doing',
 'what are you doing',
 'what is the last name of tho