# Chat & TTS
This notebook surveys text-to-speech, speech recognition, and chat techniques in Python.

Each of these tasks has both online and offline solutions. Here we compare packages and approaches that work with and without an Internet connection.

### 1. Text-To-Speech (TTS)
Text-To-Speech (TTS) converts written text into spoken audio.

- Offline:
pyttsx3 is an offline text-to-speech conversion library for Python. Install version 2.71, because newer versions did not work for this notebook.
    ```
    pip install pyttsx3==2.71
    ```

In [54]:
import pyttsx3
engine = pyttsx3.init()

In [55]:
rate = engine.getProperty('rate')   # 200 by default
engine.setProperty('rate', 150)

Check the available languages and genders.
On Windows, language and gender are not exposed as key-value pairs; they are encoded in the voice ID and name instead. For example:
```xml
<Voice id=HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens\TTS_MS_EN-US_DAVID_11.0
          name=Microsoft David Desktop - English (United States)
          languages=[]
          gender=None
          age=None>
```

In [56]:
import os
available_languages = {}

print(os.name)
if os.name == "nt":
    print('Windows')
else:
    print('Linux')

for index, voice in enumerate(engine.getProperty('voices')):
    # print(index, voice)  # uncomment to inspect every installed voice
    if os.name == "nt":
        # Windows: the locale is only visible inside the voice ID
        if 'EN-US' in voice.id:
            available_languages['English'] = voice.id
        elif 'DE-DE' in voice.id:
            available_languages['German'] = voice.id
    else:
        # Other platforms: check the voice's languages list
        if 'en_US' in voice.languages:
            available_languages['English'] = voice.id
        elif 'de_DE' in voice.languages:
            available_languages['German'] = voice.id

print(available_languages)
engine.setProperty('voice', available_languages['English'])
engine.say("Which language do you want to speak?")
engine.runAndWait()

options = list(available_languages.keys())
if len(options) > 1:
    options.insert(-1, 'or')
options_text = ' '.join(options)
engine.say(options_text)
engine.runAndWait()

nt
Windows
{'English': 'HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Speech\\Voices\\Tokens\\TTS_MS_EN-US_ZIRA_11.0', 'German': 'HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Speech\\Voices\\Tokens\\TTS_MS_DE-DE_HEDDA_11.0'}
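The cell above builds the spoken prompt by inserting `'or'` before the last option, which reads well for two languages. A minimal sketch of a variant that also handles three or more options is shown here; `spoken_options` is a hypothetical helper name, and the comma style is an assumption rather than what the notebook does.

```python
def spoken_options(options):
    """Join option names into a phrase such as 'English, German or French'."""
    options = list(options)
    if len(options) <= 1:
        # Zero or one option: nothing to join
        return "".join(options)
    # Comma-separate all but the last option, then add "or <last>"
    return ", ".join(options[:-1]) + " or " + options[-1]


print(spoken_options(["English", "German"]))
print(spoken_options(["English", "German", "French"]))
```

The returned string can be passed to `engine.say(...)` in place of `options_text`.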


In [57]:
german_text = '''
Bilder eines TV-Senders zeigen, 
wie die Unterstützer des abgewählten rechtsextremem Präsidenten Fensterscheiben einschlagen 
und in die Eingangshalle vordringen. 
Auch auf den Parkplatz des Präsidentenpalastes sollen die Randalierer vorgedrungen sein. 
'''
english_text = '''
I will speak this text.
'''

In [58]:
engine.setProperty('voice', available_languages['German'])
engine.say(german_text)
engine.runAndWait()
engine.save_to_file('Hello World', 'test.mp3')
engine.runAndWait()

In [59]:
engine.stop()

### 2. Speech Recognition
```bash
pip install PyAudio
pip install pocketsphinx
pip install SpeechRecognition
```

In [97]:
import speech_recognition as sr
import time
r = sr.Recognizer()
m = sr.Microphone()
recognizer_type = 'google'  # sphinx, wit, bing, houndify, ibm, whisper

In [98]:
def sphinx_callback(recognizer: sr.Recognizer, audio: sr.AudioData):
    # Use the recognizer passed to the callback rather than the global `r`
    try:
        print("Sphinx thinks you said '" + recognizer.recognize_sphinx(audio) + "'")
    except sr.UnknownValueError:
        print("Sphinx could not understand audio")
    except sr.RequestError as e:
        print("Sphinx error; {0}".format(e))

In [101]:
def google_callback(recognizer: sr.Recognizer, audio: sr.AudioData):
    # Use the recognizer passed to the callback rather than the global `r`
    try:
        print("Google thinks you said '" + recognizer.recognize_google(audio) + "'")
    except sr.UnknownValueError:
        print("Google could not understand audio")
    except sr.RequestError as e:
        print("Google error; {0}".format(e))
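The two callbacks differ only in which `recognize_*` method they call, and `recognizer_type` is defined above but not yet used. One possible way to connect them is sketched below. `make_callback` and `FakeRecognizer` are hypothetical names introduced for illustration; the sketch omits the `sr.UnknownValueError` and `sr.RequestError` handling, and it passes no API keys, which several online engines require.

```python
def make_callback(engine_name):
    """Build a listen_in_background-style callback for one engine name."""
    method_name = "recognize_" + engine_name

    def callback(recognizer, audio):
        # Look up e.g. recognizer.recognize_google by name
        recognize = getattr(recognizer, method_name, None)
        if recognize is None:
            print("No recognizer method named " + method_name)
            return None
        text = recognize(audio)
        print(engine_name + " thinks you said '" + text + "'")
        return text

    return callback


class FakeRecognizer:
    """Stand-in for sr.Recognizer so the sketch runs without a microphone."""

    def recognize_google(self, audio):
        return "what can I do for you"


callback = make_callback("google")
callback(FakeRecognizer(), None)
```

With the real library, `make_callback(recognizer_type)` could be passed to `r.listen_in_background(m, ...)` instead of a hand-written callback.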

In [102]:
with m as source:
    r.adjust_for_ambient_noise(source)

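print('Please speak ...')
time.sleep(0.1)
# Start a background thread that calls google_callback for each phrase heard
stop_listening = r.listen_in_background(m, google_callback)

# Keep listening for about 5 seconds, then stop the background listener
for _ in range(50):
    time.sleep(0.1)
stop_listening(wait_for_stop=False)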

Please speak ...
result2:
{   'alternative': [   {   'confidence': 0.90512341,
                           'transcript': 'what can I do for you'}],
    'final': True}
Google thinks you said 'what can I do for you'
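The raw result printed above contains a list of alternatives, each with a transcript and sometimes a confidence score. The sketch below picks the most confident alternative from a dictionary of that shape. `best_transcript` is a hypothetical helper; treating a missing confidence as 0.0 is an assumption, and so is obtaining the dictionary through `recognize_google(audio, show_all=True)`.

```python
def best_transcript(result):
    """Return (transcript, confidence) for the most confident alternative."""
    # Anything that is not a dict with alternatives counts as "no result"
    alternatives = result.get("alternative", []) if isinstance(result, dict) else []
    if not alternatives:
        return None
    # Assumption: an alternative without a confidence score ranks lowest
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return best["transcript"], best.get("confidence")


raw = {
    "alternative": [
        {"confidence": 0.90512341, "transcript": "what can I do for you"},
    ],
    "final": True,
}
print(best_transcript(raw))
```

A confidence threshold could be applied to the returned pair before a chat component acts on the text.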
