Raspberry PI 2 always returns no translation. #55

Open
craql opened this issue Jan 30, 2016 · 16 comments

Comments

@craql

craql commented Jan 30, 2016

Hey Steve, it seems like you've got a great plugin, but I can't get it working. I'm using a Pi 2. Is it not compatible with that model? The USB headset I'm using records and plays back audio just fine, but whenever I run voicecommand it returns "No translation" and gives no audio response.

Thanks for your help.

@alx5962

alx5962 commented Feb 19, 2016

I had a similar problem, but on Banana Pi.
The default sample rate used in speech-recog.sh is 16000 Hz, and that does not work for me. I tried several rates, and 32000 and 48000 both work fine. The sample format also matters (mine is S16_LE).
Try something like:
arecord -D "plughw:2,0" -f S16_LE -d 2 -r 32000 /dev/shm/out.wav
then
aplay /dev/shm/out.wav

If everything works, update speech-recog.sh (located in /usr/bin) to match. If not, try a different sample rate or sample format.
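
The trial and error above can be scripted. This is only a sketch: it assumes the capture device is `plughw:2,0` as in the example, and `try_rates` and the `DRY_RUN` variable are hypothetical names, not part of PiAUISuite.

```shell
#!/bin/sh
# Record a 2-second clip at each candidate sample rate and play it back.
# With DRY_RUN=1 the commands are printed instead of executed.
try_rates() {
    device="$1"; shift
    for rate in "$@"; do
        rec="arecord -D $device -f S16_LE -d 2 -r $rate /dev/shm/out.wav"
        play="aplay /dev/shm/out.wav"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$rec"
            echo "$play"
        else
            echo "Testing $rate Hz..."
            # Unquoted on purpose so the shell splits the string into arguments.
            $rec && $play
        fi
    done
}
```

For example, `DRY_RUN=1 try_rates plughw:2,0 16000 32000 48000` lists the commands, and dropping `DRY_RUN=1` runs them one rate at a time.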

@patrickreidglennon

I'm wondering whether the Google speech API is working at the moment, or whether it now requires a key or something. I'm also getting no translations, even though the noise.wav files are clear and perfect.

@ripleyXLR8

It seems that Google doesn't accept stereo recordings.

A solution is to use the following in speechrecognition.sh:

arecord -D $hardware -f S16_LE -d $duration -r 16000 |
  flac - -f --best --sample-rate 16000 -o /dev/shm/out.flac 1>/dev/shm/voice.log 2>/dev/shm/voice.log
curl -X POST --data-binary @/dev/shm/out.flac --user-agent 'Mozilla/5.0' \
  --header 'Content-Type: audio/x-flac; rate=16000;' \
  "https://www.google.com/speech-api/v2/recognize?output=json&lang=$lang&key=AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw&client=Mozilla/5.0" |
  sed -e 's/[{}]/''/g' | awk -F":" '{print $4}' | awk -F"," '{print $1}' | tr -d '\n'

Then recompile voicecommand.cpp with the following modification to the GetVolume function:

inline float GetVolume(string recordHW, string com_duration, bool nullout) {
    FILE *cmd;
    float vol = 0.0f;
    // Record the noise sample as S16_LE at 16 kHz (arecord defaults to one channel).
    string run = "arecord -D ";
    run += recordHW;
    run += " -f S16_LE -d ";
    run += com_duration;
    run += " -r 16000 /dev/shm/noise.wav";
    if(nullout)
        run += " 1>>/dev/shm/voice.log 2>>/dev/shm/voice.log";
    system(run.c_str());
    cmd = popen("sox /dev/shm/noise.wav -n stats -s 16 2>&1 | awk '/^Max\\ level/ {print $3}'","r");
    fscanf(cmd,"%f",&vol);
    pclose(cmd); // a stream opened with popen must be closed with pclose, not fclose
    return vol;
}

@galuhboy123

Where can I find voicecommand.cpp?
Is it part of the PiAUISuite installation, and should we recompile that file?
Thanks for the help, @ripleyXLR8.

@krist-jin

The answer of @ripleyXLR8 works for me!

How to fix:

  1. Go to the VoiceCommand folder.
  2. Edit speechrecognition.sh as @ripleyXLR8 described above (basically, change "-f cd -t wav" to "-f S16_LE").
  3. Edit voicecommand.cpp as @ripleyXLR8 described above (again, change "-f cd -t wav" to "-f S16_LE").
  4. sudo apt-get install g++-4.8
  5. make
  6. Go to the install folder and run sudo ./InstallAUISuite.sh
  7. Change the keyword in the configuration to something else, because "Pi" is easily recognized as "pie".

And then you are good to go!

Why:
I think the problem is that the Google speech API doesn't support multi-channel recordings and returns an empty result for them.
"arecord -f cd" is equivalent to "arecord -f S16_LE -c2 -r44100", which means the recording has two channels. If you record a single channel instead, it works.

You can run an experiment to confirm this:
First, record with arecord -D plughw:1,0 -f cd -t wav -r 16000 test.wav, then call the speech API with curl -X POST --data-binary @'test.wav' --header 'Content-Type: audio/l16; rate=16000;' 'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw'
Then record with arecord -D plughw:1,0 -f S16_LE -r 16000 test.wav and call the Google API again.
You will find that the second recording works, while the first returns nothing.

However, I have no idea why Google does not support multi-channel audio...
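
To check how many channels a recording actually has before uploading it, the count can be read straight from the WAV header. A minimal sketch, assuming a canonical RIFF header in which the "fmt " chunk directly follows the "WAVE" tag, so the channel count sits at byte offset 22; `wav_channels` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the channel count stored in a canonical WAV (RIFF) header.
# The count is a 16-bit little-endian integer at byte offset 22.
wav_channels() {
    # Read the two bytes as unsigned decimals, e.g. "   2   0".
    bytes=$(od -An -t u1 -j 22 -N 2 "$1")
    # Split them into $1 (low byte) and $2 (high byte).
    set -- $bytes
    echo $(( $1 + $2 * 256 ))
}
```

With this, `wav_channels test.wav` should print 2 for a recording made with `-f cd` and 1 for one made with `-f S16_LE` and no `-c` option.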

@Colin1964

@krist-jin

I tried your step-by-step version of the ripleyXLR8 fix, but TTS is still not working for me. Did this fix only apply before Google stopped the TTS service?

@krist-jin

@Colin1964 Sorry for the confusion, but this fix is for the Google speech recognition API, not the text-to-speech API. I just posted another fix for TTS in #56. Hope it helps.

@Colin1964

@krist-jin My bad. I was having both TTS and STT problems and was following this issue and #56 at the same time. It's all sorted now (see my update on #56), or almost: she can translate what I say and speaks back to me, but I still can't get her to respond to the keyword.

@pinftv

pinftv commented Apr 10, 2016

@krist-jin When I recompile voicecommand, install it again, and then try to run it, I get the error message "Illegal instruction". I need some help.

@krist-jin

@pinftv Can you post the commands you ran and the full error message?

@pinftv

pinftv commented Apr 16, 2016

@krist-jin I followed your instructions, but running voicecommand gave me "Illegal instruction". I have solved it now: instead of changing the format to S16_LE, I added -c 1 after -r 16000, and it works normally :)

@HangLoooose

I followed all the instructions given by @krist-jin.
Now speech-recog.sh can translate my voice into the correct text, and tts says whatever you want it to say. But as soon as I start voicecommand and say my keyword, the following lines appear:

Found audio
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 25258 0 14 100 25244 9 16760 0:00:01 0:00:01 --:--:-- 16762
No translation

Can anyone help me to fix this issue?
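
One observation on the curl progress table above: only 14 bytes were received, which would match an empty `{"result":[]}` reply plus a newline. That suggests the upload reached the server and the server found no speech, rather than the script failing to parse a real answer. A small sketch for telling the two cases apart, assuming the reply is JSON with a "transcript" field when recognition succeeds; `has_transcript` is a hypothetical helper, not part of PiAUISuite.

```shell
#!/bin/sh
# Succeed only if the JSON reply read from stdin contains a "transcript" field.
has_transcript() {
    grep -q '"transcript"'
}
```

Piping the raw curl output through it, before the sed and awk stages, shows whether the server returned any text at all: `curl ... | has_transcript || echo "server returned no transcript"`.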

@siddharthksuri

This thread is a lifesaver. I incorporated the fixes described by ripleyXLR8 and clarified by krist-jin, and voice recognition (i.e. speech to text) now works. I'll try to get TTS working over the next week, using thread #56.

I'm using a Pi 3 running Raspbian Jessie and a cheap $2 mic from Amazon.

@derrapf

derrapf commented Feb 8, 2017

Hi, maybe somebody can help me too.
I'm using a Pi 3 with Raspbian.
When I run voicecommand -s, at some point it asks me whether I can hear a sound. Unfortunately I can't. Then I tried speech-recog.sh and only got the following output:
Aufnahme: WAVE 'stdin' : Unsigned 8 bit, Rate: 16000 Hz, mono
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 21618 0 14 100 21604 9 15385 0:00:01 0:00:01 --:--:-- 15398

I found that the flac conversion does not work for some reason.
When I run:
arecord -D "plughw:0,0" -t wav -d 3 -r 16000 test.wav
I get
arecord: main:722: Fehler beim Öffnen des Gerätes: Datei oder Verzeichnis nicht gefunden
which means "Error opening device: File or directory not found".
I then tried
arecord -t wav -d 3 -r 16000 test.wav
and I got a nice recording which I can play with aplay. There is some hum but I can live with it.

Then I tried
arecord -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o /dev/shm/out.flac
But when I try to play it with
aplay /dev/shm/out.flac
I can only hear terrible white noise.

Now I have no idea how I can fix that.

Any help appreciated
Rlaf
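
A note on the white noise: aplay handles only the voc, wav, raw and au file types, so when it is given a FLAC file it plays the compressed bytes as if they were raw samples. The FLAC file may therefore be fine. A minimal sketch for checking what a file really contains by its first four bytes; `audio_kind` is a hypothetical helper name.

```shell
#!/bin/sh
# Report the container type from the first four bytes of a file.
audio_kind() {
    case "$(head -c 4 "$1")" in
        RIFF) echo wav ;;      # WAV files start with the RIFF tag
        fLaC) echo flac ;;     # FLAC streams start with the fLaC marker
        *)    echo unknown ;;
    esac
}
```

If `audio_kind /dev/shm/out.flac` prints flac, the file can be auditioned by decoding it first, for example with `flac -d -c /dev/shm/out.flac | aplay`.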

TimoSchm added a commit to TimoSchm/PiAUISuite that referenced this issue Mar 7, 2017
Fixed the Bug from "StevenHickson#55" with solution from "krist-jin"
@disheet

disheet commented Mar 22, 2017

For me it gives the error below.
Please help.

pi@raspberrypi:~/PiAUISuite/VoiceCommand $ sudo voicecommand -c
Opening config file...
running in continuous mode
keyword duration is 2 and duration is 3
Found audio
arecord: main:556: unrecognized file format S16_LE
Warning: Couldn't read data from file "/dev/shm/out.flac", this makes an empty
Warning: POST.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 14 0 14 0 0 11 0 --:--:-- 0:00:01 --:--:-- 11
rm: cannot remove ‘/dev/shm/out.flac’: No such file or directory
No translation

@disheet

disheet commented Mar 22, 2017

Do you want to permanently change the default duration of the speech recognition (3 seconds)? (y/n)
n
Do you want to permanently change the default command duration of the speech recognition (2 seconds)? (y/n)
y
Type the number of seconds you want it to run: ex 2
2
Do you want to set up and check the text to speech options? (y/n)
y
First I'm going to say something and see if you hear it
/usr/bin/tts: line 13: /dev/shm/speak.mp3: Permission denied
/dev/shm/tmp.mp3: Permission denied
/usr/bin/tts: line 40: /dev/shm/speak.mp3: Permission denied
/usr/bin/tts: line 42: /dev/shm/voice.log: Permission denied

Running voicecommand -s gives the errors shown above.
