# <span style='color:darkblue'> Praktische Anwendungen in Berufsfeldern: Dialogsystem </span>

## *Professor Burkhardt*

### *Shushen Manakhimova*
#### Sommersemester 2021 01.09.2021

In this project we use code by Tobias Wendel (`file_update`) as well as the Eliza project from https://github.com/codeanticode/eliza

We begin by importing all the modules that the different parts of the project need. A module is a file of Python code (a library or a set of functions) that can be imported into another Python program.

Any Python source file can be used as a module: another Python source file loads it with an `import` statement.
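
Since the notebook depends on several third-party modules, it can help to check which of them are available before running the import cell. The following is a minimal sketch; the helper name `missing_modules` is hypothetical and not part of the project, and it only handles top-level module names.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Standard-library modules used in this notebook; third-party names
# such as "requests" or "sounddevice" can be checked the same way.
print(missing_modules(["os", "sys", "io", "json"]))
```

Any name that appears in the returned list has to be installed before the import cell will run.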

In [1]:
import os, sys  # functions for interacting with the operating system and the Python runtime
import io  # file-related input and output operations
import requests  # accessing data from the web
from eliza import eliza  # Eliza chatbot implementation
import json  # working with JSON structures
import sox  # sound processing
from wendel_util import file_update  # updates the vaccination data from the API
from incidence import incidence  # updates the incidence data from the API
import emorec  # emotion recognition
from google.cloud import speech  # speech recognition from Google Cloud
import sounddevice as sd  # recording audio
import soundfile as sf  # saving audio
import numpy as np  # numerical data
from scipy.io.wavfile import write  # writing WAV files at a given sample rate
import pyttsx3  # text-to-speech conversion library

## Data Update

Here we fetch and update the vaccination and 7-day incidence data from the API.

In [2]:
file_update()
with open('vaccinations.json') as f:  # the file is closed once it has been parsed
    vaccinations = json.load(f)

Up To Date


In [3]:
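incidence()  # refresh incidence.json from the API
with open('incidence.json') as f:  # separate variable name, so the incidence() function is not overwritten
    incidence_data = json.load(f)
inc = round(incidence_data["data"][0]["weekIncidence"])  # 7-day incidence of the first entry, rounded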

Up To Date
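
The lookup in the cell above fails with an exception if the file does not have the expected shape. A defensive variant could look like the sketch below. The function name `week_incidence` and the sample payload are hypothetical; only the key path `data` → first entry → `weekIncidence` is taken from the notebook.

```python
import json


def week_incidence(payload):
    """Return the rounded 7-day incidence from a parsed payload, or None if it is missing."""
    try:
        return round(payload["data"][0]["weekIncidence"])
    except (KeyError, IndexError, TypeError):
        return None


# Hypothetical sample; the real file contains more fields.
sample = json.loads('{"data": [{"weekIncidence": 74.6}]}')
print(week_incidence(sample))
```

Returning `None` lets the caller decide how to react to missing data instead of stopping the dialog with a traceback.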


In [4]:
joke = requests.get("https://v2.jokeapi.dev/joke/Any?lang=de&blacklistFlags=nsfw,racist,sexist&type=single").json()['joke']
joke

'Treffen sich ein Informatiker und ein Wirtschaftsinformatiker.\nInformatiker: "Hast Du schon das neue Ubuntu?"\nDer Wirtschaftsinformatiker: "Nein, ich steh nicht auf Pokemon."'

## Input

### 1. Audio Recording

Here we define the parameters of the recording that serves as our input, then record and save it. Google Cloud Speech-to-Text later transcribes the recording.

### 2. Speech-to-Text

Transform the audio input into text.

In [5]:
sr = 16000  # sample rate in Hz
duration = 5  # duration of the recording in seconds
filename = 'myfile.wav'  # file that stores the recorded speech

In [6]:
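def record_file():
    # Record `duration` seconds of mono audio from the default input device
    data = sd.rec(int(duration * sr), samplerate=sr, channels=1)
    sd.wait()  # block until the recording has finished
    # Peak-normalize and convert to 16-bit integers (LINEAR16, as configured in transcribe());
    # a silent recording has a peak of 0, so the division is skipped in that case
    peak = np.abs(data).max()
    if peak > 0:
        data = data / peak
    y = (np.iinfo(np.int16).max * data).astype(np.int16)
    write(filename, sr, y)  # save as a 16-bit WAV file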
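
The conversion to 16-bit integers can also be illustrated without NumPy and SciPy. The sketch below uses only the standard library (`wave` and `struct`) as a stand-in for the calls above; the helper names `to_int16` and `write_wav` are hypothetical.

```python
import struct
import wave

INT16_MAX = 32767


def to_int16(samples):
    """Peak-normalize float samples and scale them to the 16-bit integer range."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return [0] * len(samples)  # silence: avoid dividing by zero
    return [int(INT16_MAX * s / peak) for s in samples]


def write_wav(path, samples, sample_rate=16000):
    """Write mono 16-bit PCM audio with the standard-library wave module."""
    pcm = to_int16(samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<{}h".format(len(pcm)), *pcm))
```

Peak normalization makes the loudest sample use the full 16-bit range, which is why a completely silent recording needs the explicit guard.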

In [7]:
def init_google():  # point the Google Cloud client at the credentials file
    credentials='/Users/shushanamanakhimova/S_Dialog/.json'
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"]=credentials

In [8]:
init_google() #initialize

In [9]:
def normalize(in_s):  # currently only lowercases the text; converting numbers into words, stop-word removal, lemmatization or stemming could be added here
    return in_s.lower()
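
As an illustration of one further normalization step, the sketch below also removes punctuation and collapses whitespace. The name `normalize_text` is hypothetical, and the choice to keep hyphens is an assumption made so that compound names such as "baden-württemberg" stay intact.

```python
import re


def normalize_text(text):
    """Lowercase the text, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s-]", " ", text)  # keep letters, digits, underscores and hyphens
    return " ".join(text.split())


print(normalize_text("  Wie ist die Inzidenz in Berlin?  "))
```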

In [10]:
def transcribe():  # send the recorded file to Google Cloud Speech-to-Text and return the transcript
    client = speech.SpeechClient()
    with io.open(filename, "rb") as audio_file:
        content = audio_file.read()
    audio = speech.RecognitionAudio(content=content)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        language_code="de-DE",
    )
    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        for index, alternative in enumerate(result.alternatives):
            print("Human_Shushana {}: {}".format(index, alternative.transcript))
            return alternative.transcript  # only the first alternative is used
    return None  # nothing was recognized

In [11]:
def speech_input():  # records an audio file, transcribes it and returns the text
    record_file()
    text = transcribe()
    return text

In [12]:
def do_input():
    return speech_input()

## Semantic Parsing

Semantic parsing turns natural language into a formal meaning representation.

We create a set of phrases and keywords for the dialog manager. Keywords detected in the transcribed speech input determine which answer is formed.

In [13]:
phrases = {'hello':'Willkommen bei der Corona Impfauskunft. Fragen Sie!', 
    'continue':'Weiter!', 
    'goodbye':'Vielen Dank für Ihren Besuch!', 
    'done':'fertig'}




# Order matters: semantic() stops at the first keyword found, so 'anhalt' must come before 'sachsen'
states_d = {'schleswig':'SH', 'hamburg':'HH', 'berlin':'BE', 'bayern':'BY', 
            'niedersachsen': 'NI', 'bremen': 'HB', 
            'nordrhein':'NW', 'hessen':'HE', 'rheinland':'RP', 'baden':'BW', 
            'saarland': 'SL', 'brandenburg':'BB', 'mecklenburg':'MV', 'anhalt':'ST',
            'sachsen':'SN', 'thüringen':'TH', 'deutschland':'DE', 'hier':'DE'}
state_names = {'SH':'Schleswig-Holstein', 'HH':'Hamburg', 'BE':'Berlin', 'BY':'Bayern', 
            'NI':'Niedersachsen', 'HB':'Bremen', 
            'NW': 'Nordrhein-Westfalen', 'HE':'Hessen', 'RP':'Rheinland-Pfalz', 'BW':'Baden-Württemberg', 
            'SL':'Saarland', 'BB':'Brandenburg', 'MV':'Mecklenburg-Vorpommern', 
            'SN': 'Sachsen', 'ST':'Sachsen-Anhalt', 'TH':'Thüringen', 'DE':'Deutschland'}
# Keys are lowercase because the input is lowercased by normalize(); values are the keys used in the vaccination data
vaccines_d = {'biontech':'biontech', 'biontec':'biontech', 
              'moderna':'moderna', 
              'janssen':'janssen', 'jansen':'janssen',
              'delta':'delta',
              'astrazeneca':'astraZeneca', 'astra':'astraZeneca', 'zeneca':'astraZeneca'}
vaccine_names = {'biontech':'Biontech', 'moderna':'Moderna', 'janssen':'Janssen', 'delta':'Delta',
              'astraZeneca':'AstraZeneca'}
# Only lowercase keys can match, because the input is lowercased by normalize()
incidence_p = {'lockdown': 'lockdown', 'geschlossen': 'geschlossen'}
joke_p = {'witze': 'witze', 'witz': 'witze'}

In [15]:
# this function looks for the keywords in the (transcribed) input; when one is found, its value is stored in the semantics dictionary
def semantic(input_s):
    semantics = {'state':'', 'vaccine':'', 'incidence': '', 'joke':'', 'answer':0} 
    for key in states_d.keys():
        if key in input_s:
            semantics['state'] = states_d[key]
            break
    for key in vaccines_d.keys():
        if key in input_s:
            semantics['vaccine'] = vaccines_d[key]
            break
    for key in incidence_p.keys():
        if key in input_s:
            semantics['incidence'] = incidence_p[key]
            break
    for key in joke_p.keys():
        if key in input_s:
            semantics['joke'] = joke
            break
    return semantics
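
Because each loop in `semantic` stops at the first keyword that occurs as a substring, the order of the dictionary keys decides which entry wins when keywords overlap (for example, `sachsen` is contained in `niedersachsen`). One possible way to make the result independent of that order is to try longer keywords first. The sketch below assumes a hypothetical table that contains full names such as `sachsen-anhalt`; `find_code` is not part of the project.

```python
def find_code(text, table):
    """Return the code of the longest keyword contained in text, or '' if none matches."""
    for key in sorted(table, key=len, reverse=True):
        if key in text:
            return table[key]
    return ""


# Hypothetical table with overlapping keywords.
overlapping = {"sachsen": "SN", "sachsen-anhalt": "ST", "niedersachsen": "NI"}
print(find_code("wie viele impfungen gibt es in sachsen-anhalt", overlapping))
```

This only helps when the longer keyword is the more specific one; short fragments such as `anhalt` still depend on their position in the table.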

In [16]:
# checks which information was found in the input (incidence, state, vaccine, joke) and stores the matching data under semantics['answer']
# expects the dictionary returned by semantic(): semantics['state'] is the Bundesland code, semantics['vaccine'] the Impfstoff key
def data(semantics):
    s = semantics['state']
    v = semantics['vaccine']
    i = semantics['incidence']
    j = semantics['joke']
    if i:
        if inc > 30:
            semantics['answer'] = 'Ja'
        else:
            semantics['answer'] = 'Nein'
    elif s: # state given
        if s != 'DE':
            if v: # and vaccine given
                semantics['answer'] = vaccinations["data"]["states"][s]['vaccination'][v]
            else: # all vaccines for state
                semantics['answer'] = vaccinations["data"]["states"][s]['vaccinated']
        else:
            if v: # and vaccine given
                semantics['answer'] = vaccinations["data"]['vaccination'][v]
            else: # all vaccines for Germany
                semantics['answer'] = vaccinations['data']['vaccinated']
    elif j: 
        semantics['answer'] = joke    
    else: # no state
        if v: # but vaccine
            semantics['answer'] = vaccinations["data"]['vaccination'][v]
        else: # nothing given
            semantics['answer'] = None
    
    return semantics
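
The chained lookups in `data` raise a `KeyError` as soon as one level is missing from the file. A small helper that walks a key path and returns `None` instead is sketched below. The name `lookup` and the sample numbers are hypothetical; only the nesting mirrors the keys used in `data`.

```python
def lookup(tree, *path):
    """Follow path through nested dictionaries; return None if any key is missing."""
    node = tree
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical excerpt shaped like the keys used in data(); the numbers are made up.
sample = {"data": {"vaccinated": 100,
                   "states": {"BE": {"vaccinated": 10,
                                     "vaccination": {"biontech": 7}}}}}
print(lookup(sample, "data", "states", "BE", "vaccination", "biontech"))
```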

### Output

The `output` function turns the semantics into a German answer sentence based on the data from the RKI API. 
If the input contains no COVID-19 question, the answer is generated by Eliza instead. 

In [17]:
def output(semantics, inputs, elz):
    ret = ''

    s = semantics['state']
    v = semantics['vaccine']
    a = semantics['answer']
    i = semantics['incidence']
    j = semantics['joke']
    if i:
        ret = '{},  denn die 7-Tage Inzidenz ist {}'.format(a, inc)
    elif s: # state given
        s = state_names[s]
        if v: # and vaccine given
            v = vaccine_names[v]
            ret = 'Die Impfungen für {} mit {} sind {}'.format(s, v, a)
        else: # all vaccines for state
            ret = 'Die Impfungen für {} sind {}'.format(s, a)
    elif j: 
        ret = 'Hier ist der Witz: {}'.format(a) 
    else: # no state
        if v: # but vaccine
            v = vaccine_names[v]
            ret = 'Die Impfungen in Deutschland mit {} sind {}'.format(v, a)
        else: # nothing given
            ret =  elz.respond(inputs)
    
    return ret

### Eliza 

Here we initialize Eliza, which generates output that is not connected to COVID-19 and keeps the dialog going by asking questions.

We need a Python file with the code and a text file with the German dialog script.

In [18]:
def init_eliza():
    root = r'/Users/shushanamanakhimova/S_Dialog/'
    elz = eliza.Eliza()
    elz.load(root+'eliza/deutsch.txt')
    return elz

### Text-to-Speech

We generate the bot's speech

In [19]:
def tts(text):
    engine = pyttsx3.init()
    engine.setProperty('voice', 'german')
    engine.setProperty('rate', 200)
    engine.say(text)
    engine.runAndWait()
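
Which voices are installed, and what their ids are, depends on the platform, so a fixed voice name may not match anything. The sketch below selects a voice id by searching the `id` and `name` attributes of voice objects, as returned by pyttsx3's `engine.getProperty('voices')`. The helper `pick_voice_id` is hypothetical, and the block uses stand-in objects instead of a real engine so that it runs without audio hardware.

```python
from types import SimpleNamespace


def pick_voice_id(voices, needle):
    """Return the id of the first voice whose id or name contains needle, else None."""
    needle = needle.lower()
    for voice in voices:
        label = "{} {}".format(voice.id, voice.name).lower()
        if needle in label:
            return voice.id
    return None


# Stand-in objects for illustration; real ids and names depend on the platform.
fake_voices = [SimpleNamespace(id="voice.en", name="English"),
               SimpleNamespace(id="voice.de", name="German")]
print(pick_voice_id(fake_voices, "german"))
```

With a real engine, one could pass `engine.getProperty('voices')` to the helper and call `engine.setProperty('voice', ...)` only when the result is not `None`.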

In [20]:
def output_s(text):
    print('Alexbot: '+text)  # show the answer as text
    tts(text)  # and speak it

### Dialogmanager 

In [21]:
# German wording for the emotion labels returned by the emotion recognition
emo_dict = {'happiness':'glücklich', 'neutral': 'wie immer', 'anger': 'irritiert', 'sadness': 'traurig', 
            'fear': 'ängstlich', 'boredom':'gelangweilt', 'disgust':'angeekelt'}

In [None]:
# the dialog manager controls the dialog and combines all the other functions (input, emotion recognition, answer generation)
def dialogmanager(elz):
    output_s(phrases['hello'])
    input_s = do_input()
    if (input_s):
        input_s = normalize(input_s)
    while input_s and input_s != phrases['done']: 
        emotion = emoRec.classify(filename)[0]
        emotion_g = emo_dict[emotion]
        output_s('ich merke du bist '+emotion_g)
        if emotion_g in ('traurig', 'gelangweilt'):  # cheer up sad or bored users with a joke
            output_s('Hier ist ein Witz: '+joke)
        semantics = semantic(input_s)
        semantics = data(semantics)
        out_string = output(semantics, input_s, elz)
        output_s(out_string)
        input_s = do_input()
        if (input_s):
            input_s = normalize(input_s)
    output_s(phrases['goodbye'])
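
The control flow of the dialog manager can be tried out without a microphone or speakers by passing the input and output functions in as parameters. The following is a simplified sketch under that assumption: `run_dialog` and its parameters are hypothetical, emotion recognition is left out, and only the stop word `fertig` is taken from `phrases['done']`.

```python
def run_dialog(get_input, respond, emit, stop_word="fertig"):
    """Run a greet / answer / goodbye loop until the input is empty or equals stop_word."""
    emit("hello")
    text = get_input()
    while text and text.lower() != stop_word:
        emit(respond(text.lower()))
        text = get_input()
    emit("goodbye")


# Scripted inputs replace the microphone; a list collects what would be spoken.
scripted = iter(["Berlin", "fertig"])
spoken = []
run_dialog(lambda: next(scripted, None), lambda t: "echo: " + t, spoken.append)
print(spoken)
```

Separating the loop from the audio functions makes it possible to test the dialog logic with plain text.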

In [29]:
#finally :) 
elz = init_eliza() 
emoRec = emorec.EmoRec() #using the library and code from Emorec to tell emotions in the dialog system
dialogmanager(elz)

Alexbot: Willkommen bei der Corona Impfauskunft. Fragen Sie!
Human_Shushana 0: Berlin
Alexbot: ich merke du bist glücklich
Alexbot: Die Impfungen für Berlin sind 2382412
Human_Shushana 0: geschlossen
Alexbot: ich merke du bist irritiert
Alexbot: Ja,  denn die 7-Tage Inzidenz ist 75
Human_Shushana 0: Hessen Hessen
Alexbot: ich merke du bist traurig
Alexbot: Hier ist ein Witz: Treffen sich ein Informatiker und ein Wirtschaftsinformatiker.
Informatiker: "Hast Du schon das neue Ubuntu?"
Der Wirtschaftsinformatiker: "Nein, ich steh nicht auf Pokemon."
Alexbot: ich merke du bist traurig
Alexbot: Die Impfungen für Hessen sind 4077902
Alexbot: Vielen Dank für Ihren Besuch!
