# Dialogue System Pepper 
The code aims to give Pepper basic conversation abilities, this includes a speech recognition module, a conversational engine to formulate the answers and the speech synthesis. 
The dialogue is purpose-less, this means that no task is pursued except a normal and pleasant interaction (this is the objective). 
It can be used as a fall-back system also for a task oriented interaction, often the counterpart tends to ramble or to test the IQ of the robot asking general questions. <br>

Author: Igor Lirussi <br>
Email: igor.lirussi(at)studio.unibo.it


## Table of Contents:
- [Requirements](#req)
- [CONVERSATIONAL ENGINE](#conv)
- [PEPPER PART](#pepper)
    - [SPEECH SYNTHESIS](#synth)
    - [SPEECH RECOGNITION](#rec)
- [Closing](#close)

## Requirements <a class="anchor" id="req"></a>
The Speech Synthesis works with
* **Python 2.7** ,  because it uses
* [Pepper API (NAOqi 2.5) ](https://developer.softbankrobotics.com/pepper-naoqi-25/naoqi-developer-guide/naoqi-apis)


The Conversational Engine works with
* **Java** (because no AIML-2.0 systems in Python 2 were found)

The Speech Recognition module was built to be able to run ON Pepper computer (in the head) it's only dependencies are
* **Python 2.7** ,  because it uses
* [Pepper API (NAOqi 2.5) ](https://developer.softbankrobotics.com/pepper-naoqi-25/naoqi-developer-guide/naoqi-apis)
* **numpy**

All of them are pre-installed on Pepper, if you want to run on your computer just create an environment that has all them.

In [1]:
#It has been used python 2.7.18, the cell will give you your current verison.
import sys
print("Python version:")
print (sys.version)

Python version:
3.7.7 (default, May  6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)]


## CONVERSATIONAL ENGINE <a class="anchor" id="conv"></a>
There should be a "lib" folder with the program Ab.jar, the files retrieved from the engine are in another folder "bots/en/" <br>
It starts a process to which it can be passes a string in input to generate a response. 


In [2]:
import subprocess
from subprocess import Popen, PIPE, STDOUT

pobj = subprocess.Popen(['java', '-jar', 'lib/Ab.jar', 'Main', 'bot=en'],
                            stdin =subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)



In [3]:
import subprocess as sp
from threading import Thread
from Queue import Queue,Empty
import time

def getabit(o,q):
    for c in iter(lambda:o.read(1),b''):
        q.put(c)
    o.close()

def getdata(q):
    r = b''
    while True:
        try:
            c = q.get(False)
        except Empty:
            break
        else:
            r += c
    return r



q = Queue()
t = Thread(target=getabit,args=(pobj.stdout,q))
t.daemon = True
t.start()

while True:
    print('Sleep for 2 seconds...')
    time.sleep(2)#to ensure that the data will be processed completely
    print('Data received:' + getdata(q).decode())
    if not t.isAlive():
        break
    #in_dat = input('Your data to input:')
    pobj.stdin.write(b'hello\n')
    #when human says nothing
    #pobj.stdin.write(b'\n')
    pobj.stdin.flush()
    break
    

Sleep for 2 seconds...
Data received:Working Directory = /home/chloe/Desktop/Pepper Conversation
Program AB 0.0.4.2 beta -- AI Foundation Reference AIML 2.0 implementation
Main
bot=en
trace mode = false
Name = en Path = /home/chloe/Desktop/Pepper Conversation/bots/en
/home/chloe/Desktop/Pepper Conversation
/home/chloe/Desktop/Pepper Conversation/bots
/home/chloe/Desktop/Pepper Conversation/bots/en
/home/chloe/Desktop/Pepper Conversation/bots/en/aiml
/home/chloe/Desktop/Pepper Conversation/bots/en/aimlif
/home/chloe/Desktop/Pepper Conversation/bots/en/config
/home/chloe/Desktop/Pepper Conversation/bots/en/logs
/home/chloe/Desktop/Pepper Conversation/bots/en/sets
/home/chloe/Desktop/Pepper Conversation/bots/en/maps
Preprocessor: 416 norms 56 persons 9 person2 
Get Properties: /home/chloe/Desktop/Pepper Conversation/bots/en/config/properties.txt
Exists: /home/chloe/Desktop/Pepper Conversation/bots/en/config/properties.txt
Loading AIML Sets files from /home/chloe/Desktop/Pepper Conversatio

In [13]:
print('DATA RECEIVED:\n' + getdata(q).decode())

DATA RECEIVED:
udc.aiml
horoscope.aiml
categoryProcessor: unexpected think
copyme.aiml
jokes.aiml
familiar.aiml
favorites.aiml
reductions_update.aiml
recommendations.aiml
utilities.aiml
that.aiml
reductions1.aiml
yomama.aiml
filterinsults.aiml
happy.aiml
inquiry.aiml
train.aiml
currency.aiml
limericks.aiml
filerinappropriate.aiml
date.aiml
knockknock.aiml
bot_profile.aiml
seasons.aiml
binary.aiml
onthisday.aiml
gender.aiml
sraix.aiml
shutup.aiml
dialog.aiml
client_profile.aiml
update.aiml
filterprofanity.aiml
ontology.aiml
drphil.aiml
Loaded 9255 categories in 6.147 sec
--> Bot en 9255 completed 0 deleted 0 unfinished
50909 nodes 40177 singletons 9255 leaves 0 shortcuts 1477 n-ary 50908 branches 0.99998033 average branching 
Human: Robot: Hi!  I can really feel your smile today.
Human: 


### Process response-string 
this function processes the data that has been received: it retrieves just the string of the answer 

In [5]:
def processResponse(raw):
    response = raw.replace("\n", " ") # changes new-line with space 
    #response = response[7:-7]  # cuts beginning and end
    temp = response.partition('Robot:')[-1].rpartition('Human:')[0] #takes response between "Robot:" and "Human:"
    if not temp:
        return response
    return temp

In [6]:
#test
classic_response = "Robot: Hi nice to see you! \nHuman: "
error_response = "[Error string lenght can vary] Robot: I don't have an answer for that. \nHuman: "
print '-----RAW:-----'
print error_response
print '-----PROCESSED:-----'
print processResponse(error_response)

-----RAW:-----
[Error string lenght can vary] Robot: I don't have an answer for that. 
Human: 
-----PROCESSED:-----
 I don't have an answer for that.  


## PEPPER PART  <a class="anchor" id="pepper"></a>

In [7]:
################### Adjusting IP and ports ###########################
IP_number = "192.168.0.118" #this is local one, use the real robot ip
port_number = 9559 #this is local one, use the real robot port number

In [8]:
#IMPORTS
import naoqi
from naoqi import ALProxy
import qi
import os
import time
from random import randint

In [9]:
#SESSION OPENING
session = qi.Session()
try:
    session.connect("tcp://" + IP_number + ":" + str(port_number))
except RuntimeError:
    print ("Can't connect to Naoqi at ip \"" + args.ip + "\" on port " + str(args.port) +".\n"
               "Please check your script arguments. Run with -h option for help.")
    sys.exit(1)

### SPEECH SYNTHESIS <a class="anchor" id="synth"></a>
The text-to-speech is the one integrated in the robot to keep the Pepper-voice and to use the gestures at the same time. <br>
We need the animated-speech service, but the parameters can be set in the normal text-to-speech service, it will influence the animated one.
Multiple voices are available, "naoenu" is the best, "paola" for a "litte bit" of italianity, it sounds good while gesticulating and reflects more the author of the code :)



In [26]:
#ASKING A SERVICE from the session
#we are using the animated speech, to set the parameters we need to set them in the text to speech service
aup = session.service("ALAnimatedSpeech") #aup = ALProxy("ALAnimatedSpeech",  IP_number, port_number)
tts = session.service("ALTextToSpeech")

#available voices
print( "voices available: "+str(tts.getAvailableVoices()) )

voices available: ['naoenu', 'naomnc', 'paola']


In [27]:
#PARAMETERS 
tts.setVoice("naoenu")
#tts.setParameter("speed", 100) #Acceptable range is [50 - 400]. 100 default.
#tts.setParameter("pitchShift", 1.1) #Acceptable range is [0.5 - 4]. 0 disables the effect. 1 default.
tts.setParameter("volume", 70)#[0 - 100] 70 is ok if robot volume is 60

#reset Speed
#tts.resetSpeed()

In [28]:
#test string
string1="Hello, I am Pepper robot! The speech synthesis is working fine."
string2="Hello! ^start(animations/Stand/Gestures/Hey_1) Nice to meet you ^wait(animations/Stand/Gestures/Hey_1)"
string2="Hello. Look I can stop moving ^mode(disabled) and after I can resume moving ^mode(contextual), you see ?"
wake="^pCall(ALMotion.wakeUp()) Ok, I wake up."

aup.say(string1)

### SPEECH RECOGNITION <a class="anchor" id="rec"></a>
For this part it's mandatory to use a service to record audio on pepper and process it with another method, the integrated speech recognition il limited to a bunch of words. The code of the service will analyse the level of sound intensity and, based on the parameters in the code below, decide when start recording and when stopping. <br>
NOTE: since Pepper cannot process the recognition, but just the amount of noise in the environment this is a [really challenging problem of turn-taking.](https://en.wikipedia.org/wiki/Turn-taking) <br>
Be careful changing the parameters because it could happen that the audio file is stopped too early for a long pause in the speech, the audio file is stopped after the initial silence cause it has been detected that nobody is speaking.
Nevertheless, the ideal thing is to minimize the parameters to reduce the amount of time to recognize the sentence of the person. <br>
The service will send the audio file to Google speech recognition API and generate an event when it receives the response. <br><br>
We create modules that subscribe to this event, the Base just writes the result recognized, the Dialogue one is a litte more complicated:
When the result is received the module if there is nothing recognized forced the robot to ask multiple times to repeat, then it will just listen in loop. If the result is intelligible, it sends recognized string to the conversational engine, processes the response and it passes it to the speech synthesis. At the end it starts listening again. 

>**REMEMBER TO TURN ON THE RECOGNITION SERVICE** with a shell in the speech-recognition folder: <br>
>     use python 2.7 or activate the environment with: *conda activate python2* <br>
>     run the service: <br>
>     *python module_speechrecognition.py --pip (your robot IP)* <br>

In [17]:
#COMPUTER MICROPHONE? 
'''
import speech_recognition as sr
with sr.Microphone() as source:
    try:
        r = sr.Recognizer()
        audio = r.listen(source, timeout = 30)
        catched = r.recognize_google(audio,key = None, language = "en-US", show_all = True)
        print catched
    except:
        print("It didn't work")
'''

'\nimport speech_recognition as sr\nwith sr.Microphone() as source:\n    try:\n        r = sr.Recognizer()\n        audio = r.listen(source, timeout = 30)\n        catched = r.recognize_google(audio,key = None, language = "en-US", show_all = True)\n        print catched\n    except:\n        print("It didn\'t work")\n'

In [29]:
class BaseSpeechReceiverModule(naoqi.ALModule):
    """
    Use this object to get call back from the ALMemory of the naoqi world.
    Your callback needs to be a method with two parameter (variable name, value).
    """

    def __init__( self, strModuleName ):
        try:
            naoqi.ALModule.__init__(self, strModuleName )
            self.BIND_PYTHON( self.getName(),"callback" )

        except BaseException, err:
            print( "ERR: ReceiverModule: loading error: %s" % str(err) )

    # __init__ - end
    def __del__( self ):
        print( "INF: ReceiverModule.__del__: cleaning everything" )
        self.stop()

    def start( self ):
        memory = naoqi.ALProxy("ALMemory", IP_number, port_number)
        memory.subscribeToEvent("SpeechRecognition", self.getName(), "processRemote")
        print( "INF: ReceiverModule: started!" )


    def stop( self ):
        print( "INF: ReceiverModule: stopping..." )
        memory = naoqi.ALProxy("ALMemory", IP_number, port_number)
        memory.unsubscribe(self.getName())

        print( "INF: ReceiverModule: stopped!" )

    def version( self ):
        return "1.1"

    def processRemote(self, signalName, message):
        # Do something with the received speech recognition result
        print(message)


In [30]:
class DialogueSpeechReceiverModule(naoqi.ALModule):
    """
    Use this object to get call back from the ALMemory of the naoqi world.
    Your callback needs to be a method with two parameter (variable name, value).
    """
    
    
    def __init__( self, strModuleName ):
        self.misunderstandings=0
        try:
            naoqi.ALModule.__init__(self, strModuleName )
            self.BIND_PYTHON( self.getName(),"callback" )
            

        except BaseException, err:
            print( "ERR: ReceiverModule: loading error: %s" % str(err) )

    # __init__ - end
    def __del__( self ):
        print( "INF: ReceiverModule.__del__: cleaning everything" )
        self.stop()

    def start( self ):
        memory = naoqi.ALProxy("ALMemory", IP_number, port_number)
        memory.subscribeToEvent("SpeechRecognition", self.getName(), "processRemote")
        print( "INF: ReceiverModule: started!" )
        


    def stop( self ):
        print( "INF: ReceiverModule: stopping..." )
        memory = naoqi.ALProxy("ALMemory", IP_number, port_number)
        memory.unsubscribe(self.getName())

        print( "INF: ReceiverModule: stopped!" )

    def version( self ):
        return "2.0"

    def processRemote(self, signalName, message):
        if autodec:
            #always disable to not detect its own speech
            SpeechRecognition.disableAutoDetection()
            #and stop if it was already recording another time
            SpeechRecognition.pause()
        # received speech recognition result
        print("INPUT RECOGNIZED: \n"+message)
        #computing answer
        if message=='error':
            self.misunderstandings +=1
            if self.misunderstandings ==1:
                answer="I didn't understand, can you repeat?"
            elif self.misunderstandings ==0:
                answer="Sorry I didn't get it, can you say it one more time?"
            elif self.misunderstandings ==0:
                answer="Today I'm having troubles uderstanding what you are saying, I'm sorry"
            else:
                answer=" "
            print('ERROR, DEFAULT ANSWER:\n'+answer)
        else:
            self.misunderstandings = 0
            #sending recognized input to conversational engine
            pobj.stdin.write(b''+message+'\n')
            pobj.stdin.flush()
            #getting answer
            time.sleep(1)#to ensure that the data will be processed completely
            answer = getdata(q).decode()
            answer = processResponse(answer)
            print('DATA RECEIVED AS ANSWER:\n'+answer)
        #text to speech the answer
        aup.say(answer)
        
        if autodec:
            print("starting service speech-rec again")
            SpeechRecognition.start()
            print("autodec enabled")
            SpeechRecognition.enableAutoDetection()
        else:
            #asking the Speech Recognition to LISTEN AGAIN
            SpeechRecognition.startRecording()


INF: ReceiverModule.__del__: cleaning everything
INF: ReceiverModule: stopping...


Exception RuntimeError: RuntimeError("\tALMemory::unsubscribe\n\tCan't find method: unsubscribe (resolved to '(s)')\n",) in <bound method DialogueSpeechReceiverModule.__del__ of <__main__.DialogueSpeechReceiverModule; proxy of <Swig Object of type 'AL::module *' at 0x7ffa00113750> >> ignored


In [31]:
# We need this broker to be able to construct
# NAOqi modules and subscribe to other modules
# The broker must stay alive until the program exists
myBroker = naoqi.ALBroker("myBroker",
   "0.0.0.0",   # listen to anyone
   0,           # find a free port and use it
   IP_number,         # parent broker IP
   port_number)       # parent broker port

try:
    p = ALProxy("DialogueSpeechReceiverModule", "192.168.0.118", 9559)
    p.exit()  # kill previous instance
except:
    pass
# Reinstantiate module

# Warning: ReceiverModule must be a global variable
# The name given to the constructor must be the name of the
# variable
'''
global BaseSpeechReceiverModule
BaseSpeechReceiverModule = BaseSpeechReceiverModule("BaseSpeechReceiverModule")
BaseSpeechReceiverModule.start()
'''

global DialogueSpeechReceiverModule
DialogueSpeechReceiverModule = DialogueSpeechReceiverModule("DialogueSpeechReceiverModule")
DialogueSpeechReceiverModule.start()



SpeechRecognition = ALProxy("SpeechRecognition")
SpeechRecognition.start()
SpeechRecognition.calibrate()
#SpeechRecognition.setLanguage("de-de")

#autodetection
autodec=False #to know if we have to enable again after the robot speech
if autodec:
    SpeechRecognition.enableAutoDetection()
    print("waiting calibration to finish")
    time.sleep(6)
    SpeechRecognition.setAutoDetectionThreshold(20) #to avoid movement of the head to trigger the listening
    #the human speech starts from 20, but the head movement sounds can reach 25, there is no perfect value
    print("threshold updated successfully")
    
#NOTES for autodetection:
#1. for the autodetection the threshold should be high, or it recognizes 
# the head movement as a sound high enough to start listening
# --> it will start to say that he does't understand
# --> it can be possible to deactivate the sententence if no words are recognized
# ----> but it will not react if something is not recognized
#2. the auto-detection should be deactivated when the robot speaks and activated again
# when the sentence is finished, or it will pick up his own speech and answer to himself




# /!\ IF THERE IS ERROR 
#  "Can't find service: SpeechRecognition"
#     REMEMBER TO TURN ON THE SERVICE with a shell with python2:
#     conda activate python2
#     python module_speechrecognition.py --pip (your robot IP)
#  "... object is not callable"
#     execute again the cell of the module (es: the definition of the class 'DialogueSpeechReceiverModule' )

INF: ReceiverModule: started!


In [37]:
SpeechRecognition.printInfo()
SpeechRecognition.setAutoDetectionThreshold(10)

INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 


In [22]:
SpeechRecognition.setLookaheadDuration(2)
#amount of seconds, before the threshold trigger, that will be included in the request
#default is 1

In [33]:
SpeechRecognition.setIdleReleaseTime(3)
#idle time (RMS below threshold) after which we stop recording
#default is 2
#NOTE: too short can cut the sentence in a pause between words

INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 


In [35]:
SpeechRecognition.setHoldTime(4) 
#waits at least these sec to stop from the beginning
#default is 3
#NOTE: too short and while the person thinks what to say the recognition stops

INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 


In [38]:
#manual ask for start recording
SpeechRecognition.startRecording()

INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
error
ERROR, DEFAULT ANSWER:
 
INPUT RECOGNIZED: 
hello
DATA RECEIVED AS ANSWER:
 Hi!  I can really feel your smile today. 
INPUT RECOGNIZED: 
what's your name
DATA RECEIVED AS ANSWER:
 I am Pepper. 
INPUT RECOGNIZED: 
how are you
DATA RECEIVED AS ANSWER:
 Feeling really joyful today. 
INPUT RECOGNIZED: 
what can you do
DATA RECEIVED AS ANSWER:
 I can tell the currency of every nation, tell you a joke, dialog with people  and do other tasks. 
INPUT RECOGNIZED: 
do you have parents
DATA RECEIVED AS ANSWER:
 I have no parents, but I have a creator. 
INPUT RECOGNIZED: 
where ar

## Closing process <a class="anchor" id="close"></a>

In [105]:
#conversational engine closing
pobj.stdin.close()
pobj.terminate()

In [124]:
#speech recognition closing
SpeechRecognition.pause()

RuntimeError: 	SpeechRecognition::pause
	module destroyed

In [97]:
DialogueSpeechReceiverModule.stop()

INF: ReceiverModule: stopping...


RuntimeError: 	ALMemory::unsubscribe
	Can't find method: unsubscribe (resolved to '(s)')


In [25]:
myBroker.shutdown()

TODO:
bloccare la testa, 
mettere timer di risposta