
Installing Voice Interface [English]

Lewis edited this page Jun 13, 2024 · 1 revision

Voice Interface Setup Guide

Important Notice

Note: Even if you do not intend to use the Voice Interface, it is necessary to install Porcupine to compile carmen_lcad.

 - Python3 >= 3.5.2
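
One way to confirm that the interpreter meets this minimum is sketched below. The version_ge helper is a hypothetical name introduced for illustration, and it assumes a sort that supports -V (GNU coreutils does).

```shell
#!/usr/bin/env bash
# Sketch: check that python3 satisfies the 3.5.2 minimum listed above.
# version_ge is a hypothetical helper, not part of carmen_lcad.

# Succeeds when dotted version $1 is greater than or equal to $2.
# Assumes a sort with -V (version sort), as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v python3 >/dev/null 2>&1; then
    found=$(python3 -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])')
    if version_ge "$found" 3.5.2; then
        echo "python3 $found meets the 3.5.2 minimum"
    else
        echo "python3 $found is older than 3.5.2"
    fi
else
    echo "python3 not found on PATH"
fi
```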

Google Cloud Setup

Create a Google Cloud Account

  • Select or create a GCP project named voice_interface on the Manage Resources page.

  • Make sure that billing is enabled for your project. Learn how to enable Billing.

  • Enable the APIs. First, open the search page, select the project in the top bar, and search for "Cloud Text-to-Speech API". Click the banner and select Activate on the next page. Repeat the process, this time searching for "Cloud Speech API", to activate the Speech-to-Text API.

  • Go to the Create Service Account Key page in the GCP Console. (Menu(≡) -> API&Services -> Credentials)
      - From the Service account drop-down list, select New service account.
      - Enter a name into the Service account name field: voice_interface_account (the Service account name will come with a number series, you'll rename it later).
      - From the Role drop-down list, set Owner.
      - Click Create.
      - A JSON file that contains your key will start to download with a default name: project-name-number-series.json
      - Rename it to our default name (voice_interface_credentials.json) and save it at ~/credentials/

cd ~
mkdir -p credentials
mv ~/Downloads/voice_interface_number_series.json ~/credentials/voice_interface_credentials.json
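
Before moving on, it can help to confirm that the renamed file really is a service account key. The sketch below is one way to do that; check_credentials is a hypothetical helper, and the chmod step is an optional precaution that this guide does not require.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a downloaded service account key file.
# check_credentials is a hypothetical helper, not part of carmen_lcad.
# Usage: check_credentials ~/credentials/voice_interface_credentials.json

check_credentials() {
    key_file=$1
    if [ ! -f "$key_file" ]; then
        echo "missing: $key_file"
        return 1
    fi
    # Service account keys downloaded as JSON carry a "type" field set to
    # "service_account"; a plain grep is enough for a quick sanity check.
    if ! grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file"; then
        echo "not a service account key: $key_file"
        return 1
    fi
    # Optional precaution: the file holds a private key, so restrict it
    # to the owner.
    chmod 600 "$key_file"
    echo "ok: $key_file"
}
```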

If You Already Have a Google Cloud Account and Switched Machines

  • Create a new service account key for the same service account and project: Create NEW Service Account Key.
  • Verify that the right project is selected at the top of the page.
  • Select the service account: voice_interface_account.
  • Select 'JSON' as key type.
      - A JSON file that contains your key will start to download with a default name: project-name-number-series.json
      - Rename it to our default name (voice_interface_credentials.json) and save it at ~/credentials/
 cd ~
 mkdir -p credentials
 mv ~/Downloads/voice_interface_......json ~/credentials/voice_interface_credentials.json

Install Google Cloud Text-to-Speech and Speech-to-Text Libraries

 pip3 install --upgrade google-cloud-texttospeech
 pip3 install --upgrade google-cloud-speech
 pip3 install pyaudio
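
To see whether the three packages ended up importable by python3, a check along these lines may help. The check_py_module helper is a hypothetical name; the import names are the ones these pip packages provide.

```shell
#!/usr/bin/env bash
# Sketch: report which Python modules from the packages above can be imported.
# check_py_module is a hypothetical helper, not part of carmen_lcad.

check_py_module() {
    if python3 -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Import names for google-cloud-texttospeech, google-cloud-speech and pyaudio.
for module in google.cloud.texttospeech google.cloud.speech pyaudio; do
    check_py_module "$module"
done
```

A "missing" line usually means the package was installed for a different interpreter than the python3 on PATH, or that the install failed.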

Install Porcupine:

 cd ~/carmen_packages
 git clone https://github.com/Picovoice/Porcupine.git
 cd Porcupine
 tools/optimizer/linux/x86_64/pv_porcupine_optimizer -r resources/ -p linux -o . -w "ok e ara"
 export SYSTEM=linux
 export MACHINE=x86_64
 cd demo/alsa
 g++ -O3 -o alsademo -I../../include -L../../lib/${SYSTEM}/$MACHINE -Wl,-rpath ../../lib/${SYSTEM}/$MACHINE main.cpp -lpv_porcupine -lasound
 cp ../../ok\ e\ ara_linux.ppn ../../resources/keyword_files/pineapple_linux.ppn
 ./alsademo
  • Test by saying the hotword/wake-word: "Ok, Iara!" When it is recognized, the demo prints:

"Hotword detected!"

  • Using Porcupine with CARMEN:
 cp ~/carmen_packages/Porcupine/ok\ e\ ara_linux.ppn $CARMEN_HOME/data/voice_interface_hotword_data/hotword_oi_iara.ppn
 cp ~/carmen_packages/Porcupine/lib/common/porcupine_params.pv $CARMEN_HOME/data/voice_interface_hotword_data/
 cp ~/carmen_packages/Porcupine/include/picovoice.h $CARMEN_HOME/src/voice_interface/
 cp ~/carmen_packages/Porcupine/lib/linux/x86_64/libpv_porcupine.a ~/carmen_packages/Porcupine/libpv_porcupine.a.copy
 ln -s ~/carmen_packages/Porcupine/libpv_porcupine.a.copy $CARMEN_HOME/lib/libpv_porcupine.a
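
The copy and link commands above can fail silently if a path is off, so a quick check of the four destinations may save a confusing compile error later. This is only a sketch; verify_hotword_files is a hypothetical helper that takes the CARMEN_HOME directory as its argument.

```shell
#!/usr/bin/env bash
# Sketch: confirm the Porcupine files landed where the copy commands put them.
# verify_hotword_files is a hypothetical helper, not part of carmen_lcad.
# Usage: verify_hotword_files "$CARMEN_HOME"

verify_hotword_files() {
    carmen_home=$1
    status=0
    for path in \
        "$carmen_home/data/voice_interface_hotword_data/hotword_oi_iara.ppn" \
        "$carmen_home/data/voice_interface_hotword_data/porcupine_params.pv" \
        "$carmen_home/src/voice_interface/picovoice.h" \
        "$carmen_home/lib/libpv_porcupine.a"
    do
        # -e follows symbolic links, so a dangling libpv_porcupine.a link
        # is reported as missing.
        if [ -e "$path" ]; then
            echo "ok: $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}
```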

Install RASA NLU

  • Tensorflow == 1.5
 sudo pip3 install -U spacy
 sudo pip3 install rasa_nlu
 sudo pip3 install rasa_nlu[spacy]
 sudo python3 -m spacy download en_core_web_md
 sudo python3 -m spacy link en_core_web_md en
 sudo pip3 install rasa_nlu[tensorflow]

To test:

In Python:

Option 1: Command Line

 cd 
 cd $CARMEN_HOME/src/voice_interface
 python3 -m rasa_nlu.train -c nlu_config.yml --data iara_nlu.md -o models --fixed_model_name nlu --project current --verbose

Option 2: Web Server

 python3 -m rasa_nlu.server --path models --response_log logs

In another terminal...

 curl -XPOST localhost:5000/parse -d '{"q":"I would like to find a Mexican restaurant in the north", "project":"current", "model":"nlu"}'
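
To try other sentences without hand-editing the JSON, the request can be wrapped in a small function. The sketch below is illustrative only: build_parse_payload and parse_text are hypothetical helpers, and the server is assumed to be listening on localhost:5000 with the project and model names used above.

```shell
#!/usr/bin/env bash
# Sketch: send arbitrary sentences to the parse endpoint shown above.
# build_parse_payload and parse_text are hypothetical helpers.

# Print the JSON body for one sentence, escaping backslashes and double
# quotes so the sentence stays a valid JSON string.
build_parse_payload() {
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"q":"%s", "project":"current", "model":"nlu"}\n' "$escaped"
}

# POST the sentence to a server assumed to be listening on localhost:5000.
# Usage: parse_text "I would like to find a Mexican restaurant in the north"
parse_text() {
    curl -s -XPOST localhost:5000/parse -d "$(build_parse_payload "$1")"
}
```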

In C++:

 cd 
 cd $CARMEN_HOME/src/voice_interface
 g++ c_post_example.cpp -lcurl -ljsoncpp
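 python3 -m rasa_nlu.server --path models --response_log logs

In another terminal (the server keeps running in the foreground)...

 cd $CARMEN_HOME/src/voice_interface
 ./a.out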