An interactive automaton for a domotic system that controls a wireless plug and interacts with the user through synthesized voice and automatic speech recognition. It was tested on Raspbian but is developed in pure Java, and it internally uses MaryTTS for Linux. This project requires the MaryTTSManager, WholeWordSpeechRecognizer, and PlugsController projects.
The distribution targets Linux (it also runs on Windows if MaryTTS for Windows is installed). The distro folder contains a compiled version of the software, together with scripts to start and stop the interactive automaton.
    sudo apt-get update
    sudo apt-get install git
    mkdir /home/pi/workspace/
    cd /home/pi/workspace/
    git clone firstname.lastname@example.org:gianpaolocoro/InteractiveAutomaton.git
    git clone email@example.com:gianpaolocoro/MaryTTSManager.git
    git clone firstname.lastname@example.org:gianpaolocoro/WholeWordAutomaticSpeechRecognizer.git
    git clone email@example.com:gianpaolocoro/PlugsController.git
    cd /home/pi/workspace/InteractiveAutomaton/distro
    ./startInteractionAutomaton.sh (wait until you hear the voice)
    ./stopInteractionAutomaton.sh (to stop the automaton)
Visual Communication via LED Blinking
On Raspberry Pi 3, the green LED blinks three times right after a word is correctly recognized, and once when the word is not recognized. This lets you follow the interaction even if you don't want to (or can't) hear the voice.
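The blinking pattern described above can be sketched in Java as follows. This is only an illustration, not the project's actual code: it assumes the green LED is exposed through a sysfs brightness file (the path `/sys/class/leds/led0/brightness` is an assumption that depends on the kernel and image, and writing to it usually requires root and the LED trigger set to `none`). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: blink an LED by writing "1"/"0" to a brightness file.
class LedBlinkSketch {

    // Assumption: sysfs file of the green LED; not confirmed by this project.
    static final Path ASSUMED_GREEN_LED = Paths.get("/sys/class/leds/led0/brightness");

    // Switches the LED on and off 'times' times; returns the completed blinks.
    static int blink(Path brightnessFile, int times, long halfPeriodMillis) {
        int done = 0;
        try {
            for (int i = 0; i < times; i++) {
                Files.write(brightnessFile, "1".getBytes(StandardCharsets.US_ASCII));
                Thread.sleep(halfPeriodMillis);
                Files.write(brightnessFile, "0".getBytes(StandardCharsets.US_ASCII));
                Thread.sleep(halfPeriodMillis);
                done++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop blinking early.
            Thread.currentThread().interrupt();
        }
        return done;
    }

    // Three blinks for a recognized word, one blink otherwise.
    static int signalRecognition(Path brightnessFile, boolean recognized, long halfPeriodMillis) {
        return blink(brightnessFile, recognized ? 3 : 1, halfPeriodMillis);
    }

    public static void main(String[] args) {
        // Pass a file path as first argument to try the sketch off the Pi.
        Path target = args.length > 0 ? Paths.get(args[0]) : ASSUMED_GREEN_LED;
        signalRecognition(target, true, 200);
    }
}
```

Passing the file as a parameter keeps the sketch testable on a desktop machine, where an ordinary file can stand in for the LED.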
The process.properties file allows tuning some parameters, like the sensitivity of the microphone and the accuracy of the Speech recognizer:
address of the remote plug to control
path to the MaryTTS installation (change this when running on Windows or when the Text-To-Speech engine is installed in another folder)
voice to use
hotword to activate the assistant (choose alternative words from /home/pi/workspace/WholeWordSpeechRecognizer/MODELS/IT/WORDS/)
hotword to confirm the activation of the plug
Speech recognition score thresholds for the hotword and the activation word. Raising a threshold towards 0 makes the automaton stricter, i.e. the word must be uttered more clearly to be accepted
sensitivity of the microphone (0,100) - higher is less sensitive
maximum silence (in sec) while expecting a word to be uttered
maximum silence (in sec) to wait after a word is uttered
maximum time length (in sec) of an uttered word
tolerance on a word: the word is accepted if it appears among the first nBest recognized words
path to the Speech recognition models
use Bluetooth speakers (experimental)
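To illustrate how the score threshold and the nBest tolerance could interact, here is a minimal Java sketch. It is not the project's implementation. The property keys (`hotword`, `hotword_threshold`, `nbest`), the sample word `ciao`, and the class name are hypothetical placeholders, so check process.properties for the real names. The sketch also assumes that scores are negative numbers where values closer to 0 are better, and that a word must satisfy both the nBest and the threshold conditions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch: keys and acceptance rule are assumptions.
class AutomatonConfigSketch {
    final String hotword;          // word that activates the assistant
    final double hotwordThreshold; // assumed <= 0; closer to 0 is stricter
    final int nBest;               // how many top hypotheses are inspected

    AutomatonConfigSketch(Properties p) {
        hotword = p.getProperty("hotword", "");
        hotwordThreshold = Double.parseDouble(p.getProperty("hotword_threshold", "-100"));
        nBest = Integer.parseInt(p.getProperty("nbest", "1"));
    }

    // Reads key=value pairs in the standard java.util.Properties format.
    static AutomatonConfigSketch load(Reader in) {
        Properties p = new Properties();
        try {
            p.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new AutomatonConfigSketch(p);
    }

    // True if the hotword is among the first nBest hypotheses (best first)
    // and its score is at least the configured threshold.
    boolean accepts(List<String> rankedWords, List<Double> scores) {
        int limit = Math.min(nBest, Math.min(rankedWords.size(), scores.size()));
        for (int i = 0; i < limit; i++) {
            if (rankedWords.get(i).equalsIgnoreCase(hotword)
                    && scores.get(i) >= hotwordThreshold) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String sample = "hotword=ciao\nhotword_threshold=-50\nnbest=2\n";
        AutomatonConfigSketch cfg = load(new StringReader(sample));
        // "ciao" is second best with score -40, inside nBest and above -50.
        System.out.println(cfg.accepts(Arrays.asList("casa", "ciao"),
                                       Arrays.asList(-30.0, -40.0)));
    }
}
```

Under these assumptions, raising `hotword_threshold` from -50 to -35 would reject the same utterance, which matches the description of a threshold closer to 0 being more rigid, while a larger `nbest` makes recognition more tolerant.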