# Speech Understanding 
# Lecture 10: Internationalization; Speech Synthesis


### Mark Hasegawa-Johnson, KCGI, December 24, 2022

1. <a href="#section_1">Internationalization</a>
1. <a href="#section_2">Installing gTTS, the Google speech synthesizer</a>
1. <a href="#section_3">Using gTTS</a>
1. <a href="#homework">Homework</a>


<a id='section_1'></a>

## 1. Internationalization

To run the Google speech recognizer in a language other than English, you only need to pass the language code as the `language` argument of the `recognize_google` function, like this:

In [1]:
import speech_recognition as sr
speech = sr.Recognizer()

print('Python はリッスンしています...')
while True:
    with sr.Microphone() as source:
        speech.adjust_for_ambient_noise(source)
        try:
            audio = speech.listen(source)
            inp = speech.recognize_google(audio, language="ja")
            print('"',inp,'" と言いました.')
            break
        except sr.UnknownValueError:
            continue
        except sr.RequestError:
            continue
        except sr.WaitTimeoutError:
            continue


Python はリッスンしています...

In [None]:
import speech_recognition as sr
speech = sr.Recognizer()

print('Python 正在倾听...')
while True:
    with sr.Microphone() as source:
        speech.adjust_for_ambient_noise(source)
        try:
            audio = speech.listen(source)
            inp = speech.recognize_google(audio, language="zh")
            print('你说 "',inp,'".')
            break
        except sr.UnknownValueError:
            continue
        except sr.RequestError:
            continue
        except sr.WaitTimeoutError:
            continue


The language codes you can use are listed here: https://cloud.google.com/speech-to-text/docs/speech-to-text-supported-languages
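The two cells above differ only in the language code and the messages they print, so they can be folded into one function. The following is a minimal sketch, not a definitive implementation: the names `MESSAGES`, `format_reply`, and `listen_once` are hypothetical, and `speech_recognition` is imported inside `listen_once` so that the formatting helper can be tried without a microphone.

```python
# Prompt and reply template for each language, taken from the two cells above.
MESSAGES = {
    "ja": ("Python はリッスンしています...", '"{}" と言いました.'),
    "zh": ("Python 正在倾听...", '你说 "{}".'),
}

def format_reply(language, transcript):
    """Fill the reply template for `language` with the recognized text."""
    _, template = MESSAGES[language]
    return template.format(transcript)

def listen_once(language):
    """Record one utterance from the microphone and return its transcript.

    Needs the speech_recognition package, PyAudio, and an internet connection.
    """
    import speech_recognition as sr  # imported here so format_reply works without it
    recognizer = sr.Recognizer()
    print(MESSAGES[language][0])
    while True:
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source)
            try:
                audio = recognizer.listen(source)
                return recognizer.recognize_google(audio, language=language)
            except (sr.UnknownValueError, sr.RequestError, sr.WaitTimeoutError):
                continue  # same retry behavior as the cells above

print(format_reply("ja", "こんにちは"))
```

With a working microphone, `print(format_reply("zh", listen_once("zh")))` would reproduce the second cell.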

<a id='section_2'></a>

## 2. Installing gTTS, the Google speech synthesizer

For speech synthesis, we will use gTTS, a Python library that interfaces with Google Translate's text-to-speech service.  You need to be connected to the internet in order to use it. Documentation for gTTS is here: https://gtts.readthedocs.io/en/latest/

gTTS is installed like this (either in the cell below, or in a terminal):

In [2]:
!pip install gTTs

Collecting gTTs
  Downloading gTTS-2.3.0-py3-none-any.whl (26 kB)
Collecting click~=8.1.3
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Installing collected packages: click, gTTs
Successfully installed click-8.1.3 gTTs-2.3.0


You will also need some way to read `mp3` files into Python.  <a href="https://librosa.org/doc/latest/index.html">Librosa</a> will do that.  You should already have librosa installed, but if you don't, install it like this:

In [3]:
!pip install librosa
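To see which packages still need installing, one option is to ask Python whether each module can be found. This is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list includes `pyaudio` because `sr.Microphone` in Section 1 depends on it.

```python
import importlib.util

def missing_packages(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names (not pip names) of the packages used in this lecture
print(missing_packages(["speech_recognition", "pyaudio", "gtts", "librosa"]))
```

An empty list means everything is available; any name that is printed needs a `pip install`.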



<a id="section_3"></a>

## 3. Using gTTS

gTTS does not play audio itself.  We need to synthesize the audio, save it to a file, and then load the file and play it back.

In [4]:
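import gtts, librosa, IPython.display

# Synthesize the Japanese sentence "This is speech recognition"
tts = gtts.gTTS(text="これが音声認識です", lang="ja")
# Write the MP3 data to a file opened in binary mode
with open("speech.mp3", "wb") as f:
    tts.write_to_fp(f)

# Load the MP3 as a waveform plus its sampling rate, then play it
speech_wave, speech_rate = librosa.load("speech.mp3")
IPython.display.Audio(data=speech_wave, rate=speech_rate)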

The `wb` mode in `open` is important.  It specifies that the file is opened
* in binary mode (`b`), because MP3 data is a sequence of bytes rather than text
* for writing (`w`)
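The difference between binary and text mode can be seen without gTTS. In this sketch, a few arbitrary bytes stand in for MP3 data (they are not a real audio file), and the file is written to a temporary directory.

```python
import os, tempfile

# A few arbitrary bytes standing in for MP3 data (not a playable file)
data = bytes([0xFF, 0xFB, 0x90, 0x00])
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

# Binary mode accepts bytes and stores them unchanged
with open(path, "wb") as f:
    f.write(data)
with open(path, "rb") as f:
    round_trip = f.read()
print(round_trip == data)

# Text mode expects str, so writing bytes raises TypeError
try:
    with open(path, "w") as f:
        f.write(data)
    error_name = None
except TypeError as err:
    error_name = type(err).__name__
print(error_name)
```

The first `print` shows `True` and the second shows `TypeError`, which is why the file must be opened with `wb` before bytes are written to it.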

<a id='homework'></a>

## Homework for Week 10

Create a text file called `week10.py`.

This file should `def` a function called `synthesize`, with the following input parameters:
* text = a string specifying the text that you want to synthesize
* lang = a language code, specifying the language in which you want it synthesized
* filename = name of the file to which the audio should be written
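The codes accepted for `lang` are the ones gTTS knows about. As a rough sketch, they can be looked up with `gtts.lang.tts_langs()`, which returns a dictionary mapping language codes to language names. The helper `is_supported` and its `supported` parameter are hypothetical, introduced only for illustration, and are not part of the homework.

```python
def is_supported(lang, supported=None):
    """Return True if `lang` is one of the codes in `supported`.

    If `supported` is omitted, ask gTTS for its table of languages.
    """
    if supported is None:
        import gtts.lang  # imported lazily; requires gTTS to be installed
        supported = gtts.lang.tts_langs()
    return lang in supported

# Example with a hand-written table (an illustrative subset, not the full list)
example_table = {"en": "English", "ja": "Japanese"}
print(is_supported("ja", example_table))
print(is_supported("xx", example_table))
```

Calling `is_supported("ja")` with no table checks against whatever the installed version of gTTS reports.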

Here is a template that you can cut and paste, to get started:

In [None]:
import gtts

def synthesize(text, lang, filename):
    '''
    Use gtts.gTTS(text=text, lang=lang) to synthesize speech, then write it to filename.
    '''
    raise RuntimeError("You need to write this part!")


Test whether your code works by running the following block:

In [7]:
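import week10, librosa, IPython.display, importlib
importlib.reload(week10)  # pick up any edits made to week10.py since the last import

week10.synthesize("You will be able to become a Google engineer","en","english.mp3")
y, rate = librosa.load("english.mp3")  # `rate` avoids shadowing the `sr` alias used in Section 1
IPython.display.Audio(data=y, rate=rate)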

When the block above is working, try uploading your text file `week10.py` to <a href="https://www.gradescope.com/">Gradescope</a>.  The autograder checks the following things:

1. Did you submit a text file called `week10.py`?
1. Does your text file contain a function called `synthesize`?
1. Does `week10.synthesize("This is speech synthesis","en","english.mp3")` create a file called `english.mp3`?
1. If so, does the file `english.mp3` have the right content?
1. Does `week10.synthesize` also work if applied to a secret text string, in a secret language, with a secret filename?
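Before uploading, the first two checks can be approximated locally. The sketch below is not the Gradescope autograder, whose implementation is not shown here; the function `check_submission` is hypothetical and only verifies that the file exists and defines `synthesize` with the three parameters named above.

```python
import importlib.util, inspect, os

def check_submission(path="week10.py"):
    """Return a list of problems found; an empty list means the basic checks passed."""
    if not os.path.isfile(path):
        return [f"{path} not found"]
    # Load the file as a module without requiring it to be on sys.path
    spec = importlib.util.spec_from_file_location("week10_under_test", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    func = getattr(module, "synthesize", None)
    if not callable(func):
        return ["no function called synthesize"]
    params = list(inspect.signature(func).parameters)
    if params != ["text", "lang", "filename"]:
        return [f"unexpected parameters: {params}"]
    return []

print(check_submission("week10.py"))
```

An empty list says nothing about whether the synthesized audio is correct; that still has to be checked by listening to the output.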