# Voiceshell Work Notebook

*Adapted from Allison Parrish's Understanding Word Vectors notebook*

#### Python Package Installation for Speech Recognition

*Run in a terminal to install the SpeechRecognition package and the PocketSphinx engine:*

`pip install SpeechRecognition`

`pip install pocketsphinx`

#### Importing required libraries:

In [13]:
# IMPORTS
# __future__ imports must come before any other statement in the cell
from __future__ import unicode_literals
import math
import random
import os
import csv
import spacy
import speech_recognition as sr
import numpy as np
from numpy import dot
from numpy.linalg import norm

#### Defining basic mathematical functions for vector math:

In [2]:
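def meanv(coords):
    """Return the component-wise mean of a list of equal-length vectors."""
    # running total for each dimension
    sumv = [0] * len(coords[0])
    for item in coords:
        for i in range(len(item)):
            sumv[i] += item[i]
    # divide each total by the number of vectors
    mean = [0] * len(sumv)
    for i in range(len(sumv)):
        mean[i] = float(sumv[i]) / len(coords)
    return mean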

In [3]:
def cosine(v1, v2):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    if norm(v1) > 0 and norm(v2) > 0:
        return dot(v1, v2) / (norm(v1) * norm(v2))
    else:
        return 0.0
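
A quick sanity check of the two helpers on toy vectors. This is a minimal sketch in plain Python, with no NumPy, so the arithmetic can be followed by hand; `mean_vector` and `cosine_similarity` are illustrative stand-ins for `meanv` and `cosine`, not the notebook's own definitions.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def cosine_similarity(v1, v2):
    """Cosine of the angle between v1 and v2; 0.0 if either is all zeros."""
    dot_product = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 > 0 and norm2 > 0:
        return dot_product / (norm1 * norm2)
    return 0.0

print(mean_vector([[1, 2], [3, 4]]))        # [2.0, 3.0]
print(cosine_similarity([1, 0], [0, 1]))    # 0.0 (perpendicular)
print(cosine_similarity([0, 0], [1, 1]))    # 0.0 (zero-vector guard)
```

Vectors pointing the same way, such as `[1, 2]` and `[2, 4]`, score approximately 1 regardless of their lengths, which is why cosine similarity suits comparing averaged word vectors.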

#### Defining low-level spaCy word-vector functions:

In [4]:
def sentvec(s):
    sent = nlp(s)
    return meanv([w.vector for w in sent])

In [5]:
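def spacy_closest_sent(space, input_str, n=1):
    """Return the n sentences in `space` whose mean word vector is closest to input_str."""
    input_vec = sentvec(input_str)
    # rank every sentence by cosine similarity to the input, best match first
    return sorted(space,
                  key=lambda x: cosine(np.mean([w.vector for w in x], axis=0), input_vec),
                  reverse=True)[:n]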
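
The ranking step can be tried without loading spaCy. The sketch below assumes a hand-made two-dimensional "vector space" in place of real word vectors; `closest`, `toy_space`, and the labels are hypothetical and only illustrate the sort-by-similarity-then-slice pattern used in `spacy_closest_sent`.

```python
import math

def cosine_similarity(v1, v2):
    """Cosine of the angle between v1 and v2; 0.0 if either is all zeros."""
    dot_product = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 > 0 and norm2 > 0:
        return dot_product / (norm1 * norm2)
    return 0.0

def closest(space, query_vec, n=1):
    """Return the n (label, vector) pairs most similar to query_vec."""
    ranked = sorted(space,
                    key=lambda item: cosine_similarity(item[1], query_vec),
                    reverse=True)
    return ranked[:n]

# Toy stand-ins for sentence vectors (illustrative values only)
toy_space = [
    ("north", [0.0, 1.0]),
    ("east", [1.0, 0.0]),
    ("north-east", [1.0, 1.0]),
]

top_two = [label for label, _ in closest(toy_space, [0.9, 0.1], n=2)]
print(top_two)  # ['east', 'north-east']
```

Sorting the whole space is linear-logarithmic in the number of sentences, which is fine for a single poem; a much larger corpus would probably call for precomputed sentence vectors.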

#### Creating the spaCy processor and loading a corpus:

In [6]:
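# 'en' is a spaCy 2.x shortcut link; spaCy 3 needs a full package name such as 'en_core_web_md'
nlp = spacy.load('en')
# read the corpus, flatten line breaks, and close the file afterwards
with open("corpus/andrews1.txt") as corpus_file:
    doc = nlp(corpus_file.read().replace('\n', ' ').replace('  ', ' '))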

#### Splitting corpus into individual, complete sentences:

In [7]:
sentences = list(doc.sents)

#### Defining the list of prompts for the user:

In [8]:
inquiries = [
    "How are you feeling?",
    "Tell me a quote you like.",
    "What are the first three words that pop into your head?",
    "What day is it today?",
    "What's your favourite colour?",
    "Where are you right now?",
    "What's your job?",
    "What do you like to eat?",
    "What animal would you want as a pet?",
]

In [9]:
print(sentences[2])

A buckling dent.


#### Prompting the user, reading typed input, and returning the closest sentence:

In [10]:
inquiry = random.choice(inquiries)
print(inquiry)
user_input = input()

for sent in spacy_closest_sent(sentences, user_input):
    sentNum = sentences.index(sent)
    print("---")
    print(sentNum)
    print(sent.text)
    print("---")

What day is it today?
Tuesday
---
0
White Saris.
---


#### Setting up basic SpeechRecognition input:
[Source](http://www.codesofinterest.com/2017/03/python-speech-recognition-pocketsphinx.html)

In [11]:
# get microphone input
import subprocess

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Just a moment, calibrating the microphone...")
    # listen for 5 seconds to measure the ambient noise level
    r.adjust_for_ambient_noise(source, duration=5)
    inquiry = random.choice(inquiries)

    # read the question aloud (macOS `say` command) and print it;
    # an argument list avoids shell-quoting problems with apostrophes in the prompt
    subprocess.run(["say", inquiry])
    print(inquiry, "(Speak aloud, please.)")
    audio = r.listen(source)
    
# recognise speech with Sphinx
try:
    recogSpeech = r.recognize_sphinx(audio)
    print("I heard you say: ", recogSpeech)
    for sent in spacy_closest_sent(sentences, recogSpeech):
        print("----------")
        print("...")
        print(sent.text)
        print("----------")
except sr.UnknownValueError:
    print("Sorry, I couldn't understand you.")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))

Just a moment, calibrating the microphone...


KeyboardInterrupt: 
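
Several prompts contain an apostrophe (for example "What's your job?"), which matters whenever the text is spliced into a shell command for `say`. The sketch below uses only the standard library's `shlex` to show how naive single-quote wrapping breaks and how `shlex.quote` avoids it; `say_command` is a hypothetical helper, not part of the notebook, and nothing is actually spoken.

```python
import shlex

def say_command(text):
    """Build a shell command string for the macOS `say` tool with safe quoting."""
    return "say " + shlex.quote(text)

prompt = "What's your job?"

# Naive wrapping: the apostrophe in the prompt ends the quoted string early
naive = "say '" + prompt + "'"
try:
    shlex.split(naive)
    naive_parses = True
except ValueError:
    # shlex reports the unbalanced quote, much as a POSIX shell would
    naive_parses = False

safe_args = shlex.split(say_command(prompt))

print(naive_parses)  # False
print(safe_args)     # ['say', "What's your job?"]
```

Passing an argument list to `subprocess.run` sidesteps quoting altogether, since no shell is involved.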

In [61]:
os.system("say 'White Saris, by Angela Andrews'")

0

#### Testing lookup of audio file

In [29]:
with open('revisedCorpus/voiceshell_audio_LUT.csv', 'r') as f:
    reader = csv.reader(f)
    all_lines = list(reader)
    
# map each sentence (column 0) to its audio filename (column 1)
audio_lookup_table = {}
for line in all_lines:
    sentence = line[0]
    filename = line[1]
    audio_lookup_table[sentence] = filename
    
example_sentence = all_lines[0][0]
print("First sentence is {}".format(example_sentence))
example_audiofile = audio_lookup_table[example_sentence]
print("Look for that sentence in the file {}".format(example_audiofile))

First sentence is White Saris
Look for that sentence in the file PoemsAudio/Andrews/andrews0.wav
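
One thing to watch: the CSV key printed above is "White Saris", while the spaCy sentence printed earlier was "White Saris." with a trailing full stop. If the keys and the sentence text really do differ this way (an assumption based only on those two outputs), an exact dictionary lookup would miss. A minimal sketch of a normalized lookup follows; `normalize`, `build_lookup`, and the in-memory `SAMPLE_CSV` are hypothetical, and only the single row shown in the output above is taken from the notebook.

```python
import csv
import io

# Stand-in for the CSV file; this one row mirrors the output shown above
SAMPLE_CSV = "White Saris,PoemsAudio/Andrews/andrews0.wav\n"

def normalize(text):
    """Lower-case and drop surrounding whitespace and trailing sentence punctuation."""
    return text.strip().rstrip(".!?").lower()

def build_lookup(csv_file):
    """Map normalized sentence text (column 0) to audio filename (column 1)."""
    return {normalize(row[0]): row[1]
            for row in csv.reader(csv_file)
            if len(row) >= 2}

lookup = build_lookup(io.StringIO(SAMPLE_CSV))

found = lookup.get(normalize("White Saris."))
missing = lookup.get(normalize("A buckling dent."))
print(found)    # PoemsAudio/Andrews/andrews0.wav
print(missing)  # None
```

Using `dict.get` returns `None` for a sentence with no recording instead of raising `KeyError`, so the caller can decide whether to fall back to text-to-speech.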


#### Playing with Python WAV Audio playback

In [12]:
import wave
import pyaudio

inquiry = random.choice(inquiries)
print(inquiry)
user_input = input()

for sent in spacy_closest_sent(sentences, user_input):
    sentNum = sentences.index(sent)
    print("---")
    print(sentNum)
    print(sent.text)
    print("---")

    audioFile = 'PoemsAudio/Andrews/andrews{}.wav'.format(sentNum)

print(audioFile)

# stream the WAV file to the default output device in fixed-size chunks
sound = wave.open(audioFile)
p = pyaudio.PyAudio()
chunk = 1024
stream = p.open(format=p.get_format_from_width(sound.getsampwidth()),
                channels=sound.getnchannels(),
                rate=sound.getframerate(),
                output=True)
data = sound.readframes(chunk)
while len(data) > 0:
    stream.write(data)
    data = sound.readframes(chunk)

# release the audio stream, the PyAudio instance, and the file handle
stream.stop_stream()
stream.close()
p.terminate()
sound.close()

What animal would you want as a pet?
cat
---
10
How silk might hang in a cold wardrobe.
---
PoemsAudio/Andrews/andrews10.wav
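
The chunked `readframes` loop can be exercised without a sound card or the poem recordings. The sketch below builds a short silent WAV in memory with the standard `wave` module and reads it back 1024 frames at a time; `make_silent_wav` and `iter_chunks` are hypothetical helpers, and no audio is played.

```python
import io
import wave

def make_silent_wav(n_frames, framerate=8000):
    """Return an in-memory mono 16-bit WAV containing n_frames of silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        writer.setframerate(framerate)
        writer.writeframes(b"\x00\x00" * n_frames)
    buf.seek(0)
    return buf

def iter_chunks(wav_file, chunk=1024):
    """Yield successive blocks of up to `chunk` frames until the file is exhausted."""
    data = wav_file.readframes(chunk)
    while len(data) > 0:
        yield data
        data = wav_file.readframes(chunk)

with wave.open(make_silent_wav(2500), "rb") as sound:
    bytes_per_frame = sound.getsampwidth() * sound.getnchannels()
    chunk_sizes = [len(block) // bytes_per_frame for block in iter_chunks(sound)]

print(chunk_sizes)  # [1024, 1024, 452]
```

The final block is shorter than the others, and the loop ends when `readframes` returns an empty bytes object, which is the same stopping condition the playback cell relies on.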


In [60]:
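lookup_table = {}
# build the sentence -> audio filename table from the first two CSV columns
# (assuming the same CSV file as in the lookup test above)
with open('revisedCorpus/voiceshell_audio_LUT.csv', 'r') as f:
    for row in csv.reader(f):
        text = row[0]
        filename = row[1]
        lookup_table[text] = filename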

In [None]:
# closest_sentence is assumed to hold the text of the best match (e.g. sent.text)
audio_file = lookup_table[closest_sentence]