# Speech Understanding 
# Lecture 14: Personal Assistant


### Mark Hasegawa-Johnson, KCGI, January 28, 2023

1. <a href="#section1">Internationalize the speech_package</a>
1. <a href="#section2">What time is it?</a>
1. <a href="#section3">Tell me a joke!</a>
1. <a href="#section4">Show me my calendar for today</a>
1. <a href="#section5">Personal assistant</a>
1. <a href="#homework">Homework</a>


<a id='section1'></a>

## 1. Internationalize the speech_package

In order to make this week's exercise work, it's necessary to internationalize the speech recognizers in `speech_package`.  Rather than re-creating the whole `speech_package`, let's just create a local `speech_package.py` in your current working directory.

Open a text file, and name it `speech_package.py`.  Copy into it the following text.  This will be a local module for only the current directory, but otherwise, it will behave the same as the `speech_package` that you created in week 11, except that it includes the `lang` option for the speech recognizer.

In [1]:
import speech_recognition as sr

def transcribe_wavefile(filename, lang):
    '''
    Use sr.AudioFile(filename) as the source,
    recognize from that source,
    and return the recognized text.
    
    Parameters:
    filename (str) - filename from which to recognize audio
    lang (str) - language code for the language to be recognized
    
    Output:
    text (str) - recognized text
    '''
    speech = sr.Recognizer()
    with sr.AudioFile(filename) as source:
        audio = speech.record(source)
        text = speech.recognize_google(audio, language=lang)
    return text

import gtts

def synthesize(text, lang, filename):
    '''
    Use gtts.gTTS(text=text, lang=lang) to synthesize speech, then write it to filename.
    '''
    tts = gtts.gTTS(text=text, lang=lang)
    with open(filename, "wb") as f:
        tts.write_to_fp(f)

def recognize_microphone(lang):
    '''
    Use sr.Microphone() as the source,
    recognize from that source,
    and return the recognized text.
    
    Parameters:
    lang (str) - language code for the language to be recognized
    
    Output:
    text (str) - recognized text
    '''
    speech = sr.Recognizer()
    while True:
        print('Python is listening...')
        with sr.Microphone() as source:
            speech.adjust_for_ambient_noise(source)
            try:
                audio = speech.listen(source)
                text = speech.recognize_google(audio, language=lang)
                break
            except sr.UnknownValueError:
                continue
            except sr.RequestError:
                continue
            except sr.WaitTimeoutError:
                continue
    return text
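
`recognize_microphone` retries forever. That is convenient when the recognizer simply failed to understand you, but it also loops forever if the network is down (`sr.RequestError`). Below is a minimal sketch of a bounded retry. The helper `call_with_retries` is hypothetical and is not part of `speech_package` or `speech_recognition`; `listen_once` stands in for any function that either returns text or raises an exception.

```python
def call_with_retries(listen_once, max_attempts=3, retry_on=(Exception,)):
    '''
    Hypothetical helper (not part of speech_package): call listen_once()
    until it succeeds, but give up after max_attempts failures.

    Parameters:
    listen_once (callable) - function with no arguments that returns text or raises
    max_attempts (int) - maximum number of times to call listen_once
    retry_on (tuple) - exception types that should trigger another attempt

    Output:
    text (str or None) - the returned text, or None if every attempt failed
    '''
    for attempt in range(max_attempts):
        try:
            return listen_once()
        except retry_on:
            print('Attempt', attempt + 1, 'failed')
    return None


# Stand-in for the microphone: fails twice, then "recognizes" a phrase.
results = iter([ValueError, ValueError, 'what time is it'])

def fake_listen():
    item = next(results)
    if item is ValueError:
        raise ValueError('could not understand audio')
    return item

text = call_with_retries(fake_listen, max_attempts=3, retry_on=(ValueError,))
print(text)
```

With the real recognizer, `retry_on` would presumably be the three exceptions already caught above: `sr.UnknownValueError`, `sr.RequestError`, and `sr.WaitTimeoutError`.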

<a id='section2'></a>

## 2. What time is it?

We will use the `datetime` package to find out what time it is.  We will get the date and time in ISO standard format, then parse them, and read the result back to the user.  First, let's see what the ISO standard format looks like.

In [2]:
import datetime

print(datetime.datetime.now().isoformat())

2023-01-22T16:56:20.592976


You can see that the ISO standard format has the date, followed by the hour, minutes, and seconds.  Our personal assistant will only read to us the hour and minutes.  We can use `split` to split the string at the `T` character, then split the time at every `:` character, and read out only the relevant part:


In [3]:
(date, time) = datetime.datetime.now().isoformat().split("T")

(hour, minutes, seconds) = time.split(":")

print(hour+"時"+minutes+"分です")

16時56分です
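
As a cross-check on the string-splitting approach, the same values are also available as integer attributes of the `datetime` object. The sketch below uses a fixed example time (the one printed earlier) so that the result is reproducible; the variable names `split_text` and `attr_text` are only for illustration.

```python
import datetime

# A fixed example time, so the result is the same on every run.
now = datetime.datetime(2023, 1, 22, 16, 56, 20, 592976)

# Approach from the text: split the ISO string.
(date, time) = now.isoformat().split("T")
(hour, minutes, seconds) = time.split(":")
split_text = hour + "時" + minutes + "分です"

# Alternative: read the integer attributes directly.
attr_text = str(now.hour) + "時" + str(now.minute) + "分です"

print(split_text)
print(attr_text)
```

The two approaches differ in zero padding: at five past nine in the morning, splitting the ISO string gives `09` and `05`, while the integer attributes give `9` and `5`.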


Create a text file, and name it `week14.py`.  Copy into it the following function.

In [4]:
import datetime, speech_package

def what_time_is_it(lang, filename):
    '''
    Tell me what time it is.
    
    Parameters:
    lang (str) - language in which to speak
    filename (str) - the filename into which the audio should be recorded
    '''
    (date, time) = datetime.datetime.now().isoformat().split("T")
    (hour, minutes, seconds) = time.split(":")
    if lang=="en":
        text = hour+" hours and "+minutes+" minutes"
    elif lang=="ja":
        text = hour+"時"+minutes+"分です"
    elif lang=="zh":
        text = "现在是"+hour+"点"+minutes+"分"
    else:
        # Unknown language: fall back to an English message spoken by the English voice
        text="I'm sorry, I don't know that language"
        lang="en"
    speech_package.synthesize(text,lang,filename)
 

If you've created that file, you can now test it by running the following block:

In [14]:
import week14, importlib, librosa, IPython
importlib.reload(week14)

week14.what_time_is_it("ja", "time.mp3")

x, fs = librosa.load('time.mp3')
IPython.display.Audio(data=x, rate=fs)
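
Because `what_time_is_it` calls gTTS, it needs a network connection to run. One way to check the wording by itself is to build the sentence in a separate function. This is only a sketch: the helper `time_text` is hypothetical and is not required for the homework.

```python
import datetime

def time_text(lang, now):
    '''
    Hypothetical helper: build the sentence that what_time_is_it would speak.

    Parameters:
    lang (str) - language code: "en", "ja", or "zh"
    now (datetime.datetime) - the time to describe

    Output:
    text (str) - the sentence, or None if the language is unknown
    '''
    (date, time) = now.isoformat().split("T")
    (hour, minutes, seconds) = time.split(":")
    templates = {
        "en": "{h} hours and {m} minutes",
        "ja": "{h}時{m}分です",
        "zh": "现在是{h}点{m}分",
    }
    if lang not in templates:
        return None
    return templates[lang].format(h=hour, m=minutes)

example = datetime.datetime(2023, 1, 22, 16, 56, 20)
for lang in ["en", "ja", "zh"]:
    print(time_text(lang, example))
```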



<a id='section3'></a>

## 3. Tell me a joke!

If you want your personal assistant to tell jokes, you need a good source of jokes.

Go to http://www.gujo-tv.ne.jp/~circleband/oyaji%20gag.htm.  Look at this beautiful list of jokes!  

Open the page source, and look at the `content` attribute of the `Content-Type` tag near the top.  Notice that this file is stored in `Shift_JIS`, which is a pre-Unicode encoding for Japanese characters.  So the first thing we need to do is to convert it to Unicode.  One way to do that is by saving the raw binary to a file, and then reading it back in using the `Shift_JIS` decoder:


In [6]:
import requests

rawbinary = requests.get("http://www.gujo-tv.ne.jp/~circleband/oyaji%20gag.htm").content
with open('jokes_ja.htm','wb') as f:
    f.write(rawbinary)
    
with open('jokes_ja.htm', 'r', encoding='shiftjis') as f:
    text = f.read()

print(text[0:1000])


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META name="GENERATOR" content="IBM WebSphere Homepage Builder V6.0.1 for Windows">
<META http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
<META http-equiv="Content-Style-Type" content="text/css">
<TITLE>おやじギャグ</TITLE>
</HEAD>
<BODY text="#FFFF00" link="#00FFFF" vlink="#00FF00" background="st09_bg.gif">
<P align="center"><IMG alt="直線上に配置" width="512" height="50" border="0" src="st09_l1.gif"><FONT color="#00cccc"><B><FONT size="+3"><BR>
おやじギャグ</FONT></B></FONT></P>
<CENTER>
<TABLE border="1">
  <TBODY>
    <TR>
      <TD align="center">1</TD>
      <TD align="center">１年生</TD>
      <TD align="center">トマトを食べるの　ちょっと待っとって</TD>
    
    <TR>
      <TD align="center">2</TD>
      <TD align="center">１年生</TD>
      <TD align="center">お金を取られた　おっかねー</TD>
    
    <TR>
      <TD align="center">3</TD>
      <TD align="center">１年生</TD>
      <TD align="center">スイカを積んだ　せんすいかん</TD>
    
    <TR>
      <TD a
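
Saving to a file is useful because later code reads `jokes_ja.htm` from disk, but the conversion itself can also be done in memory with `bytes.decode`. The sketch below uses a short stand-in for the downloaded bytes so that it runs without a network connection; the variable `valid_utf8` is only for illustration.

```python
# Stand-in for the bytes returned by requests.get(...).content
rawbinary = "<TITLE>おやじギャグ</TITLE>".encode("shift_jis")

# Decode the bytes directly instead of writing them to a file first
text = rawbinary.decode("shift_jis")
print(text)

# The same bytes are not valid UTF-8, which is why the encoding must be given
try:
    rawbinary.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```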

Notice that each joke is the third `<td>` tag under a `<tr>` tag.  Let's use BeautifulSoup to find those.


In [7]:
import bs4
soup = bs4.BeautifulSoup(text, "html.parser")

jokes = []
for tr in soup.find_all('tr'):
    tdlist = tr.find_all('td')
    jokes.append(tdlist[2].text)

for n in range(5):
    print(jokes[n])


トマトを食べるの　ちょっと待っとって
お金を取られた　おっかねー
スイカを積んだ　せんすいかん
トイレでバッタが　ふんばった
このアジ　とっても味がある
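
The rule "take the third `<td>` of every `<tr>`" can also be expressed with only the Python standard library. The sketch below is a swapped-in technique, not the BeautifulSoup code used in this lecture: it uses `html.parser.HTMLParser`, and the class name `ThirdCellParser` and the sample table are made up for illustration. Unlike `tdlist[2]`, which raises an `IndexError` on a row with fewer than three cells, this version simply skips such rows.

```python
from html.parser import HTMLParser

class ThirdCellParser(HTMLParser):
    '''Collect the text of the third <td> in every <tr>.'''
    def __init__(self):
        super().__init__()
        self.jokes = []
        self.cell_index = -1    # index of the current <td> within its row
        self.in_cell = False    # True while we are between <td> and </td>
        self.buffer = []        # pieces of text seen inside the current cell

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase
        if tag == "tr":
            self.cell_index = -1
        elif tag == "td":
            self.cell_index += 1
            self.in_cell = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "td" and self.in_cell:
            self.in_cell = False
            if self.cell_index == 2:
                self.jokes.append("".join(self.buffer).strip())

    def handle_data(self, data):
        if self.in_cell:
            self.buffer.append(data)

sample = '''
<TABLE border="1">
    <TR>
      <TD align="center">1</TD>
      <TD align="center">１年生</TD>
      <TD align="center">トマトを食べるの　ちょっと待っとって</TD>
    <TR>
      <TD align="center">2</TD>
      <TD align="center">１年生</TD>
      <TD align="center">お金を取られた　おっかねー</TD>
    <TR>
      <TD colspan="3">a row with only one cell</TD>
</TABLE>
'''

parser = ThirdCellParser()
parser.feed(sample)
for joke in parser.jokes:
    print(joke)
```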


Let's also download some jokes in English and Chinese.

#### Warning:
I have no idea if these jokes in Chinese are funny or not.  I am quite sure that the jokes in English and Japanese are not very funny.

In [8]:
rawbinary = requests.get("https://raw.githubusercontent.com/yesinteractive/dadjokes/master/controllers/jokes.txt").content
with open('jokes_en.txt','wb') as f:
    f.write(rawbinary)
    
jokes_en = []
with open('jokes_en.txt', 'r', encoding='utf-8') as f:
    for line in f:
        jokes_en.append(line.replace("<>"," --- "))

for n in range(6):
    print(jokes_en[n])


What did one pirate say to the other when he beat him at chess? --- Checkmatey.

I burned 2000 calories today --- I left my food in the oven for too long.

I startled my next-door neighbor with my new electric power tool.  --- I had to calm him down by saying “Don’t worry, this is just a drill!”

I broke my arm in two places.  --- My doctor told me to stop going to those places.

I quit my job at the coffee shop the other day.  --- It was just the same old grind over and over.

I never buy anything that has Velcro with it... --- it’s a total rip-off.
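
Each line of the English file separates the setup from the punchline with `<>`. Instead of replacing the separator, you could split on it and keep the two halves apart, for example to pause between them. A minimal sketch, using two hand-typed lines in that format rather than the downloaded file; the names `lines` and `pairs` are only for illustration.

```python
# Two hand-typed lines in the "setup<>punchline" format
lines = [
    "I burned 2000 calories today<>I left my food in the oven for too long.\n",
    "I broke my arm in two places. <>My doctor told me to stop going to those places.\n",
]

pairs = []
for line in lines:
    # Remove the trailing newline, then split at the first "<>" only
    setup, punchline = line.rstrip("\n").split("<>", 1)
    pairs.append((setup.strip(), punchline.strip()))

for setup, punchline in pairs:
    print(setup, "---", punchline)
```

A line that contains no `<>` would raise a `ValueError` in the unpacking step, so a real loader would need to skip or handle such lines.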



In [9]:
jokes_zh = [
    '人们都说我没有方向感，我一直不承认，直到他们让我买西瓜我买成了南瓜后',
    '狗是人类最好的朋友。如果你不相信的话，可以试试这样做：放你的狗狗和妻子一起关进汽车的后备箱里。一小时后你再打开后备箱，哪一个会因为见到了你真心地高兴呢?',
    '昨晚家里停电了，可是邻居家有电，所以我打电话叫电工来看看，等了半天到最后也没来，第二天遇到了问他：“昨晚怎么没来？”他说：“昨晚去了，看你家黑灯瞎火的以为没人，我就走了……”',
    '我的WIFI密码设置成了2444666668888888，有人问你时, 直接告诉他密码就是： 12345678。',
    '和同事一起去吃自助。 说好aa制，结果付钱的时候 他却掏出一张五折券！',
    '你遇到的99%的问题都可以 用钱解决，剩下的1%， 需要用更多的钱。'
]
for n in range(5):
    print(jokes_zh[n],"\n")

人们都说我没有方向感，我一直不承认，直到他们让我买西瓜我买成了南瓜后 

狗是人类最好的朋友。如果你不相信的话，可以试试这样做：放你的狗狗和妻子一起关进汽车的后备箱里。一小时后你再打开后备箱，哪一个会因为见到了你真心地高兴呢? 

昨晚家里停电了，可是邻居家有电，所以我打电话叫电工来看看，等了半天到最后也没来，第二天遇到了问他：“昨晚怎么没来？”他说：“昨晚去了，看你家黑灯瞎火的以为没人，我就走了……” 

我的WIFI密码设置成了2444666668888888，有人问你时, 直接告诉他密码就是： 12345678。 

和同事一起去吃自助。 说好aa制，结果付钱的时候 他却掏出一张五折券！ 



Let's ask gtts to tell us a joke chosen at random from one of these lists.  Add the following to your `week14.py` file:

In [10]:
import speech_package, bs4, random

jokes_zh = [
    '人们都说我没有方向感，我一直不承认，直到他们让我买西瓜我买成了南瓜后',
    '狗是人类最好的朋友。如果你不相信的话，可以试试这样做：放你的狗狗和妻子一起关进汽车的后备箱里。一小时后你再打开后备箱，哪一个会因为见到了你真心地高兴呢?',
    '昨晚家里停电了，可是邻居家有电，所以我打电话叫电工来看看，等了半天到最后也没来，第二天遇到了问他：“昨晚怎么没来？”他说：“昨晚去了，看你家黑灯瞎火的以为没人，我就走了……”',
    '我的WIFI密码设置成了2444666668888888，有人问你时, 直接告诉他密码就是： 12345678。',
    '和同事一起去吃自助。 说好aa制，结果付钱的时候 他却掏出一张五折券！',
    '你遇到的99%的问题都可以 用钱解决，剩下的1%， 需要用更多的钱。'
    ]

def tell_me_a_joke(lang, audiofile):
    '''
    Tell me a joke.
    
    Parameters:
    lang (str) - language in which to tell the joke
    audiofile (str) - audiofile in which to record the joke
    '''
    jokes = []
    if lang=="ja":
        with open('jokes_ja.htm', 'r', encoding='shiftjis') as f:
            text = f.read()
        soup = bs4.BeautifulSoup(text, "html.parser")
        for tr in soup.find_all('tr'):
            tdlist = tr.find_all('td')
            jokes.append(tdlist[2].text)
        speech_package.synthesize(random.choice(jokes),lang,audiofile)
    elif lang=="en":
        with open('jokes_en.txt', 'r', encoding='utf-8') as f:
            for line in f:
                jokes.append(line.replace("<>"," --- "))
        speech_package.synthesize(random.choice(jokes),lang,audiofile)
    elif lang=="zh":
        jokes = jokes_zh
        speech_package.synthesize(random.choice(jokes),lang,audiofile)
    else:
        speech_package.synthesize("I'm sorry, I don't know that language","en",audiofile)
        return
  

If you've copied that code into `week14.py`, you can test it using the following block:

In [15]:
import librosa, IPython
importlib.reload(week14)

week14.tell_me_a_joke("ja", 'joke.mp3')

x, fs = librosa.load('joke.mp3')
IPython.display.Audio(data=x, rate=fs)
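
Because `tell_me_a_joke` uses `random.choice`, you will usually hear a different joke each time you run it. If you want a repeatable choice while debugging, one option is a private `random.Random` generator with a fixed seed. This is a sketch with a made-up joke list and an arbitrary seed, not a change you need to make in `week14.py`.

```python
import random

jokes = ["joke A", "joke B", "joke C"]

# A private generator with a fixed seed gives the same sequence on every run,
# and does not disturb the global random state used elsewhere.
rng = random.Random(14)
first = rng.choice(jokes)

# Re-creating the generator with the same seed repeats the same choice.
rng = random.Random(14)
second = rng.choice(jokes)

print(first == second)
```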



<a id='section4'></a>

## 4. Show me my calendar for today

Let's ask our assistant to show us today's calendar.  To give her something to say, we will also ask her to read out today's date, using the Python `datetime` package.  

You can use `webbrowser` to open your personal calendar, but to keep this generic, let's use a general calendar website.  The website https://www.timeanddate.com/calendar/monthly.html?year=2023&month=1&country=26 also has the nice feature that you can specify the year, month, and country whose holidays you want to see:

In [13]:
import webbrowser
webbrowser.open("https://www.timeanddate.com/calendar/monthly.html?year=2023&month=1&country=26")

True
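
The part of the URL after the `?` is a list of `name=value` pairs joined by `&`. One way to build it without hand-written string concatenation is `urllib.parse.urlencode`. The sketch below assumes a hypothetical helper `calendar_url`; the country number 26 is simply the one from the example URL.

```python
import datetime
from urllib.parse import urlencode

def calendar_url(today, country):
    '''
    Hypothetical helper: build the monthly-calendar URL for a given date.

    Parameters:
    today (datetime.date) - the date whose month should be shown
    country (int) - country number used by the website (26 in the example URL)

    Output:
    url (str) - URL of the calendar page for that month and year
    '''
    query = urlencode({"year": today.year, "month": today.month, "country": country})
    return "https://www.timeanddate.com/calendar/monthly.html?" + query

print(calendar_url(datetime.date(2023, 1, 28), 26))
```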

Try adding the following function to your code in `week14.py`:

In [None]:
def show_me_my_calendar(lang, audiofile):
    '''
    Show me my calendar for today.
    
    Parameters:
    lang (str) - language in which to speak the date
    audiofile (str) - filename into which the spoken date is recorded
    
    Returns:
    url (str) - URL that you can look up in order to see the calendar for this month and year
    '''
    (date, time) = datetime.datetime.now().isoformat().split("T")
    (year, month, day) = date.split("-")
    url = "https://www.timeanddate.com/calendar/monthly.html?year="+year+"&month="+month
    if lang=="en":
        speech_package.synthesize("Today is day "+day+" of month "+month,"en",audiofile)
        return url+"&country=1"
    elif lang=="ja":
        speech_package.synthesize("今日は"+month+"月"+day+"日","ja",audiofile)
        return url+"&country=26"
    elif lang=="zh":
        speech_package.synthesize("今天是"+month+"月"+day+"日","zh",audiofile)
        return url+"&country=41"
    else:
        # Unknown language: apologize in English and return the URL without a country
        speech_package.synthesize("I'm sorry, I don't know that language","en",audiofile)
        return url


If that worked, then you should be able to open your calendar by running the following block of code:

In [21]:
import librosa, IPython, webbrowser
importlib.reload(week14)

url = week14.show_me_my_calendar("en","date.mp3")
webbrowser.open(url)

x, fs = librosa.load('date.mp3')
IPython.display.Audio(data=x, rate=fs)



<a id='section5'></a>

## 5. Personal assistant

Now let's put it all together in a personal assistant app.  Your personal assistant will listen to you, and respond when you make any of the following four types of requests:

* If you say anything containing "What time," it will tell you the time
* If you say anything containing the word "joke," it will tell you a joke
* If you say anything containing the word "calendar," it will open your calendar
* If you say something else, it will say "I'm sorry, I didn't understand you!"

Copy the following into `week14.py`:

In [22]:
def personal_assistant(lang, filename):
    if lang=="en":
        keywords = ["what time", "joke", "calendar", "I'm sorry, I didn't understand you"]
    elif lang=="ja":
        keywords = ["何時","冗談","カレンダー","すみません、よくわかりませんでした"]
    elif lang=="zh":
        keywords = ["几点","玩笑","日历","对不起，我没听懂你的话"]
    else:
        speech_package.synthesize("I don't know that language!","en",filename)
        return

    # Lowercase the recognized text so that "What time" also matches "what time"
    inp = speech_package.recognize_microphone(lang).lower()
    if keywords[0] in inp:
        what_time_is_it(lang, filename)
    elif keywords[1] in inp:
        tell_me_a_joke(lang, filename)
    elif keywords[2] in inp:
        show_me_my_calendar(lang, filename)
    else:
        speech_package.synthesize(keywords[3], lang, filename)

Now let's try running the personal assistant:

In [28]:
importlib.reload(week14)

week14.personal_assistant("ja", "test.mp3")

x, fs = librosa.load("test.mp3")
IPython.display.Audio(data=x, rate=fs)

Python is listening...
result2:
{   'alternative': [   {'confidence': 0.95782542, 'transcript': 'すみません 今何時ですか'},
                       {'transcript': '見ません 今何時ですか'},
                       {'transcript': 'みません 今何時ですか'}],
    'final': True}
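
The keyword matching can be tested without a microphone by moving it into a function of its own. The sketch below assumes a hypothetical helper `find_intent` that is not part of `week14.py`; it takes the recognized text and the first three keywords, and returns the name of the matching request.

```python
def find_intent(inp, keywords):
    '''
    Hypothetical helper: decide which request the recognized text contains.

    Parameters:
    inp (str) - recognized text
    keywords (list of str) - keywords for time, joke, and calendar, in that order

    Output:
    intent (str) - "time", "joke", "calendar", or "unknown"
    '''
    intents = ["time", "joke", "calendar"]
    text = inp.lower()
    # Check the keywords in order, so "time" wins if several keywords appear
    for intent, keyword in zip(intents, keywords):
        if keyword.lower() in text:
            return intent
    return "unknown"

keywords_ja = ["何時", "冗談", "カレンダー"]
keywords_en = ["what time", "joke", "calendar"]
print(find_intent("すみません 今何時ですか", keywords_ja))
print(find_intent("What time is it", keywords_en))
print(find_intent("hello", keywords_en))
```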




<a id='homework'></a>

## Homework for Week 14

Once you have all of the sections above working, try submitting the file `week14.py` to Gradescope.  If everything in the sections above was successful, then your file should also work in Gradescope.