# Chatbot Prototype

This notebook combines all components that were developed, trained, and saved in the other notebooks into a fully functional chatbot.

A few preparatory steps come first: loading the models and extracting some information from the database that the chatbot needs in order to work.

## Importing the required libraries

In [1]:
# import of necessary libraries
import nltk 
from nltk.stem.lancaster import LancasterStemmer
stemmer = LancasterStemmer()

import datefinder
import numpy
import tflearn 
import tensorflow
import random 
import json
import pickle
import spacy
import sqlite3
import re
import os
from difflib import SequenceMatcher

Instructions for updating:
non-resource variables are not supported in the long term
curses is not supported on this machine (please install/reinstall curses for an optimal experience)
Scipy not supported!


## Intent Classification

The neural network was created, trained, and saved beforehand in a separate notebook. The following code tries to load that saved network. If loading fails, a new network is built and trained instead. For this purpose the training data in "data.pickle" is loaded and the layers of the network are defined.
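
The load-or-train fallback described above can be sketched independently of tflearn. This is only a minimal illustration under stated assumptions: `load_or_train`, `toy_train`, and the file name `toy_model.pickle` are hypothetical, and `pickle` stands in for tflearn's own `model.load` / `model.save`.

```python
import os
import pickle
import tempfile


def load_or_train(model_path, train_fn):
    """Return (model, was_trained): load a saved model if present, else train and save."""
    if os.path.exists(model_path):
        with open(model_path, "rb") as f:
            return pickle.load(f), False
    model = train_fn()
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    return model, True


def toy_train():
    # hypothetical stand-in for the real training step
    return {"weights": [0.1, 0.2, 0.3]}


with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_model.pickle")
    first = load_or_train(toy_path, toy_train)   # nothing saved yet, so it trains
    second = load_or_train(toy_path, toy_train)  # the saved file is found and loaded
```

Checking for the file explicitly, rather than catching every exception, keeps real loading errors visible. Whether that is preferable here depends on how tflearn reports a missing checkpoint.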

In [2]:
path = r"C:\Users\Sebi\OneDrive\Studium\Thesis_Chatbot\2. Intent Classififcation"
# build the file path with os.path.join instead of concatenating backslash literals
with open(os.path.join(path, "data.pickle"), "rb") as f:
    words, labels, training, output = pickle.load(f)

In [3]:
tensorflow.compat.v1.reset_default_graph()
    
# Creating the neural network
net = tflearn.input_data(shape=[None, len(training[0])])  # input layer: one neuron per word in the training vocabulary
net = tflearn.fully_connected(net, 8)  # hidden layer: fully connected, 8 neurons
net = tflearn.fully_connected(net, len(output[0]), activation="softmax")  # output layer: one neuron per intent label
net = tflearn.regression(net)

model = tflearn.DNN(net)

model_path = r"C:\Users\Sebi\OneDrive\Studium\Thesis_Chatbot\2. Intent Classififcation\model.tflearn"
try:
    # reuse the weights saved by the training notebook
    model.load(model_path)
except Exception:
    # no usable saved model: train from scratch and save the result
    model.fit(training, output, n_epoch=1000, batch_size=8, show_metric=True)
    model.save(model_path)

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Restoring parameters from C:\Users\Sebi\OneDrive\Studium\Thesis_Chatbot\2. Intent Classififcation\model.tflearn


In [4]:
model

<tflearn.models.dnn.DNN at 0x1cafaf2c748>

In [5]:
# Convert the input into a numeric vector so that the neural network can read it
def bag_of_words(s, words):
    bag = [0 for _ in range(len(words))]
    
    s_words = nltk.word_tokenize(s)
    s_words = [stemmer.stem(word.lower()) for word in s_words]
    
    for se in s_words:
        for i, w in enumerate(words):
            if w == se:
                bag[i] = 1
    
    return numpy.array(bag)
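
To make the encoding concrete, here is a simplified sketch of the same idea. It assumes plain lowercase whitespace tokenisation instead of NLTK tokenisation and Lancaster stemming, returns a plain list instead of a numpy array, and uses a hypothetical `toy_vocabulary` in place of the `words` list loaded from "data.pickle".

```python
def bag_of_words_simple(sentence, vocabulary):
    """Return a 0/1 list marking which vocabulary entries occur in the sentence."""
    # Assumption: no stemming, tokens are just the lowercase whitespace-separated words
    tokens = set(sentence.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]


# hypothetical vocabulary, standing in for the `words` list
toy_vocabulary = ["email", "hello", "office", "phone"]

known_vector = bag_of_words_simple("Hello what is the email", toy_vocabulary)
unknown_vector = bag_of_words_simple("jdjfs", toy_vocabulary)

print(known_vector)    # [1, 1, 0, 0]
print(unknown_vector)  # [0, 0, 0, 0]
```

An input without any known word yields an all-zero vector. The network still returns a probability for every label in that case, which may explain why nonsense input in the transcript at the end of this notebook is answered with a greeting.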

## Slot filling

To extract the various pieces of additional information, the names that are already known must, among other things, be read from the database and stored in lists. A function compares the user's input with these names to check whether the professor mentioned by the user is known. In addition, the previously trained custom NER models are loaded and the slot filling function is defined.
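
The name comparison relies on `difflib.SequenceMatcher` with a similarity threshold of 0.7. The sketch below shows the idea in isolation. It is a variant under assumptions, not the notebook's exact logic: `best_name_match` and `known_names` are hypothetical, the comparison is made case-insensitive, and the best-scoring name is kept rather than the last one that passes the threshold.

```python
from difflib import SequenceMatcher


def best_name_match(text, known_names, threshold=0.7):
    """Return the known name most similar to any word in text, or None."""
    best_name, best_ratio = None, threshold
    for word in text.split():
        for name in known_names:
            ratio = SequenceMatcher(None, name.lower(), word.lower()).ratio()
            if ratio >= best_ratio:
                best_name, best_ratio = name, ratio
    return best_name


# hypothetical list, standing in for last_names_list from the database
known_names = ["Lanquillon", "Beckmann"]

exact = best_name_match("where sits herr lanquillon", known_names)
misspelled = best_name_match("what is the email of herr backmann", known_names)
unknown = best_name_match("hello there", known_names)
```

Here `exact` is `"Lanquillon"`, `misspelled` is `"Beckmann"` (the similarity of "backmann" and "beckmann" is 0.875), and `unknown` is `None`.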

In [6]:
conn = sqlite3.connect("PROF_INFO_DB.db")
cur = conn.cursor()
cur.execute("select first_name FROM PROF_INFO_TABLE")
conn.commit()
first_names = cur.fetchall()
conn.close()

#first_names

In [7]:
# list of all first names
first_names_list = []
for row in first_names:
    # each row is a one-element tuple; strip() returns a new string, so its result must be kept
    first_names_list.append(" ".join(row).strip())

#first_names_list

In [8]:
conn = sqlite3.connect("PROF_INFO_DB.db")
cur = conn.cursor()
cur.execute("select last_name FROM PROF_INFO_TABLE")
conn.commit()
last_names = cur.fetchall()
conn.close()

#last_names

In [10]:
# list of all last names
last_names_list = []
for row in last_names:
    # each row is a one-element tuple; strip() returns a new string, so its result must be kept
    last_names_list.append(" ".join(row).strip())

#last_names_list

In [11]:
# load the custom NER models
nlp_research_area = spacy.load(r"C:\Users\Sebi\OneDrive\Studium\Thesis_Chatbot\3. Slot Filling/custom_ner_model_research_area")
nlp_study = spacy.load(r"C:\Users\Sebi\OneDrive\Studium\Thesis_Chatbot\3. Slot Filling/custom_ner_model_study")

In [12]:
def slot_filling(inp, intent):
    # intents whose slot value is a professor's last name
    intent_prof_contact = ["prof_telephone_query_name", "prof_email_query_name", "prof_office_query_name", "prof_research_area_query_name", "prof_study_query_name"]
    # small-talk intents that need no slot value
    intent_generic_conversation = ["greeting", "greeting_response", "courtesy_greeting", "courtesy_greeting_response", "real_name_query", "goodbye", "task_response"]

    # default, so the return below never hits an unassigned name when nothing matches
    extracted_info = None

    if intent == "prof_name_query_telephone":
        re_number_1 = r"\D[\d]{2}? [\d]{4} [\d]{3} [\d]{4}"
        re_number_2 = r"\D[\d]{2}? [\d]{4} [\d]{3} [\d]{3}"
        re_number_3 = r"\D[\d]{2} [\(][\d]{1}[\)] [\d]{4} [\d]{3} [\d]{4}"
        extracted_info = re.compile("(%s|%s|%s)" % (re_number_1, re_number_2, re_number_3)).findall(inp) 

    elif intent == "prof_name_query_email":
        extracted_info = re.findall(r'\S+@\S+', inp)

    elif intent == "prof_name_query_office":
        re_office_1 = r"[A-Z, a-z].\d{1}.\d{2}"
        re_office_2 = r"[A-Z, a-z].\d{3}"
        extracted_info = re.compile("(%s|%s)" % (re_office_1, re_office_2)).findall(inp)

    elif intent == "prof_name_query_research_area":
        doc = nlp_research_area(inp)
        for ent in doc.ents:
            if ent.label_ == "RESEARCH_AREA":
                extracted_info = ent.text    

    elif intent == "prof_name_query_study":
        doc = nlp_study(inp)
        for ent in doc.ents:
            if ent.label_ == "STUDY":
                extracted_info = ent.text  

    elif intent == "prof_name_query_lastname":
        for first_name in first_names_list:
            for word in inp.split():
                if SequenceMatcher(None, first_name, word).ratio() >= 0.7:
                    extracted_info = first_name

    elif intent == "prof_name_query_firstname":
        for last_name in last_names_list:
            for word in inp.split():
                if SequenceMatcher(None, last_name, word).ratio() >= 0.7:
                    extracted_info = last_name

    elif intent in intent_prof_contact:
        for last_name in last_names_list:
            for word in inp.split():
                if SequenceMatcher(None, last_name, word).ratio() >= 0.7:
                    extracted_info = last_name

    elif intent in intent_generic_conversation:
        extracted_info = "generic intent no need for info extraction"

    else:
        extracted_info = "nobody/ nothing"
    
    return extracted_info
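
The regular expressions used for the rule-based slots can be tried out on their own. The sketch below copies the email pattern and the first phone and office variants from `slot_filling` and applies them to sentences from the transcript at the end of this notebook. The constant names and the `ada@example.org` address are assumptions made for illustration.

```python
import re

# patterns copied from slot_filling (first variant of the phone and office patterns)
EMAIL_PATTERN = re.compile(r"\S+@\S+")
PHONE_PATTERN = re.compile(r"\D[\d]{2}? [\d]{4} [\d]{3} [\d]{4}")
OFFICE_PATTERN = re.compile(r"[A-Z, a-z].\d{1}.\d{2}")

emails = EMAIL_PATTERN.findall("who has this email helmut.beckmann@hs-heilbronn.de")
phones = PHONE_PATTERN.findall("who has the following phone number? +49 7131 504 6942")
offices = OFFICE_PATTERN.findall("who sits in S.3.45?")

# caveat: \S+ also swallows punctuation that directly follows the address
with_punctuation = EMAIL_PATTERN.findall("is it ada@example.org?")

print(emails)            # ['helmut.beckmann@hs-heilbronn.de']
print(phones)            # ['+49 7131 504 6942']
print(offices)           # ['S.3.45']
print(with_punctuation)  # ['ada@example.org?']
```

The leading `\D` in the phone pattern consumes the `+`, so the extracted value includes it. A number written after a plain space would be returned with that space in front, which would presumably not match the stored value in the database.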

## Action Execution

### Information Retrieval

In [15]:
def action_execution_retrieval(intent, condition=None):
    # small-talk intents that need no database lookup
    intent_generic_conversation = ["greeting", "greeting_response", "courtesy_greeting", "courtesy_greeting_response", "real_name_query", "goodbye", "task_response"]

    # connect to the database
    conn = sqlite3.connect("PROF_INFO_DB.db")
    cur = conn.cursor()
    
    #TELEPHONE
    if intent == "prof_name_query_telephone":
        cur.execute("select title, first_name, last_name FROM PROF_INFO_TABLE where telephone = ? COLLATE NOCASE", (condition[0],))
        conn.commit()
        answer = cur.fetchall()

    elif intent == "prof_name_query_email": 
        cur.execute("select title, first_name, last_name FROM PROF_INFO_TABLE where email = ? COLLATE NOCASE", (condition[0],))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_name_query_office":
        cur.execute("select title, first_name, last_name FROM PROF_INFO_TABLE where office = ? COLLATE NOCASE", (condition[0],))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_name_query_research_area":
        cur.execute("SELECT title, first_name, last_name FROM PROF_INFO_TABLE INNER JOIN PROF_RESEARCH_AREA_TABLE on PROF_RESEARCH_AREA_TABLE.prof_id = PROF_INFO_TABLE.prof_id WHERE research_area = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchone()
    
    elif intent == "prof_name_query_study":
        cur.execute("SELECT title, first_name, last_name FROM PROF_INFO_TABLE INNER JOIN PROF_STUDY_TABLE on PROF_STUDY_TABLE.prof_id = PROF_INFO_TABLE.prof_id WHERE study = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchone()
        
    elif intent == "prof_name_query_lastname":
        cur.execute("select last_name FROM PROF_INFO_TABLE where first_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_name_query_firstname":
        cur.execute("select first_name FROM PROF_INFO_TABLE where last_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_telephone_query_name":
        cur.execute("select telephone FROM PROF_INFO_TABLE where last_name =? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
    
    elif intent == "prof_email_query_name":
        cur.execute("select email FROM PROF_INFO_TABLE where last_name =? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_office_query_name":
        cur.execute("select office FROM PROF_INFO_TABLE where last_name =? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_research_area_query_name":
        cur.execute("SELECT research_area FROM PROF_RESEARCH_AREA_TABLE INNER JOIN PROF_INFO_TABLE on PROF_INFO_TABLE.prof_id = PROF_RESEARCH_AREA_TABLE.prof_id WHERE last_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    elif intent == "prof_study_query_name":
        cur.execute("SELECT study FROM PROF_STUDY_TABLE INNER JOIN PROF_INFO_TABLE on PROF_INFO_TABLE.prof_id = PROF_STUDY_TABLE.prof_id WHERE last_name = ? COLLATE NOCASE", (condition,))
        conn.commit()
        answer = cur.fetchall()
        
    
    # release the database connection before returning
    conn.close()

    if intent in intent_generic_conversation:
        answer = "generic conversation no data retrieval necessary"

    return answer
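
Every lookup above follows the same pattern: a parameterised query with `COLLATE NOCASE`, so that the user's spelling does not have to match the stored capitalisation. The following sketch shows one such lookup against an in-memory database. The helper `find_email`, the reduced table layout, and the inserted row are hypothetical placeholders, not the contents of `PROF_INFO_DB.db`.

```python
import sqlite3


def find_email(conn, last_name):
    """Return all emails stored for a last name, ignoring case."""
    cur = conn.execute(
        "SELECT email FROM PROF_INFO_TABLE WHERE last_name = ? COLLATE NOCASE",
        (last_name,),  # the placeholder keeps user input out of the SQL text
    )
    return [row[0] for row in cur.fetchall()]


# in-memory database with a made-up row, only for illustration
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PROF_INFO_TABLE (title TEXT, first_name TEXT, last_name TEXT, email TEXT)"
)
conn.execute(
    "INSERT INTO PROF_INFO_TABLE VALUES (?, ?, ?, ?)",
    ("Prof.", "Ada", "Example", "ada.example@example.org"),
)

result_lower = find_email(conn, "example")   # ['ada.example@example.org']
result_missing = find_email(conn, "nobody")  # []
conn.close()
```

An unknown name yields an empty list rather than an error, so the caller has to handle that case before indexing into the result.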

## Response Generation

In [16]:
def reponse_generation(intent, answer, condition):
    
    if intent == "greeting":
        response = "Hello :)"
    
    elif intent == "greeting_response":
        response = "Nice to meet you"
        
    elif intent == "courtesy_greeting":
        response = "I'm great. Thanks for asking. How are you?"
        
    elif intent == "courtesy_greeting_response":
        response = "Nice to meet you"
        
    elif intent == "real_name_query":
        response = "My name is Chatbot. Whats your name?"
        
    elif intent == "goodbye":
        response = "Goodbye :)"
        
    elif intent == "task_response":
        response = "I know almost everything about the professors of HHN. Just ask me :)"      
    
    elif intent == "prof_name_query_telephone":
        response = "The phone number " + condition[0] + " belongs to " + answer[0][0] + " " + answer[0][1] + " " + answer[0][2]

    elif intent == "prof_name_query_email": 
        response = "The email " + condition[0] + " belongs to " + answer[0][0] + " " + answer[0][1] + " " + answer[0][2]
        
    elif intent == "prof_name_query_office":
        response = "The office " + condition[0] + " belongs to " + answer[0][0] + " " + answer[0][1] + " " + answer[0][2]
        
    elif intent == "prof_name_query_research_area":
        response = answer[0] + " " + answer[1] + " " + answer[2] +  " is an expert in "+ condition
    
    elif intent == "prof_name_query_study":
        response = answer[0] + " " + answer[1] + " " + answer[2] + " is a professor in "+ condition
        
    elif intent == "prof_name_query_lastname":
        response = "The last name of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_name_query_firstname":
        response = "The first name of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_telephone_query_name":
        response = "The phone number of " + condition + " is " + answer[0][0]
    
    elif intent == "prof_email_query_name":
        response = "The email of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_office_query_name":
        response = "The office of " + condition + " is " + answer[0][0]
        
    elif intent == "prof_research_area_query_name":
        # join the single-column rows into one comma-separated string without a trailing comma
        new_answer = ", ".join(row[0] for row in answer)
        response = "The research areas of " + condition + " are " + new_answer
        
    elif intent == "prof_study_query_name":
        # join the single-column rows into one comma-separated string without a trailing comma
        new_answer = ", ".join(row[0] for row in answer)
        response = "The studies of " + condition + " are " + new_answer
        
    return response
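
The response templates index directly into the database result, so an empty result raises an exception. One possible way to handle that is sketched below. The helper `list_response`, its fallback sentence, and the sample rows are assumptions for illustration and not part of the notebook.

```python
def list_response(prefix, name, rows):
    """Build a sentence from single-column result rows, or a fallback if there are none."""
    if not rows:
        return "Sorry, I could not find anything about " + name
    values = ", ".join(row[0] for row in rows)
    return prefix + name + " are " + values


# hypothetical rows in the shape returned by cur.fetchall()
toy_rows = [("Topic A",), ("Topic B",)]

filled = list_response("The research areas of ", "Example", toy_rows)
empty = list_response("The research areas of ", "Example", [])

print(filled)  # The research areas of Example are Topic A, Topic B
print(empty)   # Sorry, I could not find anything about Example
```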

## Chat

In [19]:
def chat():
    log = []
    print("Start talking with me!(type quit to stop):")
    while True:
        inp = input("You: ")
        log.append(inp) #saves all input of user in list
        if inp.lower() == "quit":
            print("Goodbye :)")
            break
        
        # predict the intent
        
        try:
            results = model.predict([bag_of_words(inp, words)])[0] #output is just a probability for each label
            results_index = numpy.argmax(results) #index of greatest value
            intent = labels[results_index] #output is the most probable label
            #print(intent)
        
            extracted_info = slot_filling(inp, intent)
            
            answer = action_execution_retrieval(intent, extracted_info)
        
            response = reponse_generation(intent, answer, extracted_info)
            
            print(response)
            
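        except Exception:
            # any failure in prediction, slot filling, retrieval, or response generation ends up here
            print("intent not found")

        # print("condition-->", extracted_info)
        # print("answer from db-->", answer)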

    return log # returns list of the inputs
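
`chat` always takes the label with the highest probability, however low that probability is. A confidence threshold is one possible way to reject unclear input. The sketch below is an assumption, not part of the notebook: `pick_intent`, the threshold of 0.6, and the sample probabilities are made up for illustration.

```python
def pick_intent(probabilities, labels, threshold=0.6):
    """Return the most probable label, or None if the model is not confident enough."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best_index] < threshold:
        return None
    return labels[best_index]


# hypothetical labels and model outputs
toy_labels = ["goodbye", "greeting", "task_response"]

confident = pick_intent([0.05, 0.90, 0.05], toy_labels)  # 'greeting'
unsure = pick_intent([0.30, 0.40, 0.30], toy_labels)     # None
```

In `chat`, a result of `None` could be answered with a request to rephrase instead of running slot filling and retrieval. A suitable threshold would have to be determined from the actual model outputs.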

In [20]:
chat()

Start talking with me!(type quit to stop):
You: hi
Hello :)
You: who are you
intent not found
You: whats your name
My name is Chatbot. Whats your name?
You: my name is sebi
Nice to meet you
You: what are you doing?
I know almost everything about the professors of HHN. Just ask me :)
You: ok
Hello :)
You: ok
Hello :)
You: jdjfs
Hello :)
You: sföksdfödsjas
Hello :)
You: who is an expert in data science
Prof. Dr.-Ing. Carsten Lanquillon is an expert in data science
You: wher is the office of herr lanquillon
intent not found
You: where sits herr lanquillon
The office of Lanquillon is S.3.45
You: what is the phone number of herr lanquillon
The phone number of Lanquillon is +49 7131 504 6942
You: what is the email of herr beckmann
The email of Beckmann is helmut.beckmann@hs-heilbronn.de
You: who has this email helmut.beckmann@hs-heilbronn.de
The email helmut.beckmann@hs-heilbronn.de belongs to Prof. Dr. Helmut Beckmann
You: who has the following phone number? +49 7131 504 6942
The phone numb

['hi',
 'who are you',
 'whats your name',
 'my name is sebi',
 'what are you doing?',
 'ok',
 'ok',
 'jdjfs',
 'sföksdfödsjas',
 'who is an expert in data science',
 'wher is the office of herr lanquillon',
 'where sits herr lanquillon',
 'what is the phone number of herr lanquillon',
 'what is the email of herr beckmann',
 'who has this email helmut.beckmann@hs-heilbronn.de',
 'who has the following phone number? +49 7131 504 6942',
 'what is the last name of tomas',
 'what is the first name of herr beckmann',
 'who sits in S.3.45?',
 'what are the research areas of herr backmann?',
 'what are the research areas of herr günter',
 'what are the studies of herr stern',
 'Goodbye',
 'thank you',
 'quit']