==============================================
    Speech recognition script for Asterisk
==============================================

This script uses Google's speech recognition engine
to render speech to text and return it to the dialplan
as an Asterisk channel variable.

------------
Requirements
------------
Perl          The Perl Programming Language
perl-libwww   The World-Wide Web library for Perl
flac          Free Lossless Audio Codec
Internet access in order to contact Google and get the speech data.

The script can optionally use sox for sound conversion. This requires a recent
version of sox (it will not work on RHEL/CentOS 5).

------------
Installation
------------
To install, copy speech-recog.agi to your agi-bin directory.
Usually this is /var/lib/asterisk/agi-bin/
To make sure, check the astagidir setting in your /etc/asterisk/asterisk.conf file.

-----
Usage
-----
agi(speech-recog.agi,[lang],[timeout])
Records from the current channel until the pound key (#) is pressed or the
timeout (10 seconds by default, -1 for no timeout) is reached.
The recording is sent to Google's speech recognition service and the
returned text string is assigned to the channel variable 'utterance'.
The script sets the following channel variables:
status     : Return status. 0 means success, non-zero values indicate different errors.
id         : An id string returned by Google's engine, not very useful(?).
utterance  : The generated text string.
confidence : A value between 0 and 1 indicating the probability of a correct recognition.
             Values greater than 0.95 usually mean that the resulting text is correct.

--------
Examples
--------
Sample dialplan code for your extensions.conf:

;Simple speech recognition
exten => 1234,1,Answer()
exten => 1234,n,agi(speech-recog.agi,en-US)
exten => 1234,n,Noop(== The text you just said was: ${utterance} ==)
exten => 1234,n,Noop(== The probability to be right is: ${confidence} ==)
exten => 1234,n,Hangup()
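
;Speech recognition with an explicit timeout and a status check.
;This is only a sketch: the extension number 1236, the 5 second timeout and
;the label names are arbitrary choices for illustration.
exten => 1236,1,Answer()
exten => 1236,n,agi(speech-recog.agi,en-US,5)
exten => 1236,n,GotoIf($["${status}" = "0"]?ok:failed)
exten => 1236,n(ok),Noop(== Recognized: ${utterance} ==)
exten => 1236,n,Hangup()
exten => 1236,n(failed),Noop(== Recognition failed with status: ${status} ==)
exten => 1236,n,Hangup()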

;Speech recognition demo also using googletts.agi for text to speech synthesis:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Please say something in English. When done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Noop(== Script returned: ${status} , ${id} , ${confidence} , ${utterance} ==)
exten => 1235,n,GotoIf($["${status}" = "0"]?success:fail)

exten => 1235,n(success),GotoIf($["${confidence}" > "0.9"]?playback:retry)

exten => 1235,n(playback),agi(googletts.agi,"The text you just said was...",en)
exten => 1235,n,agi(googletts.agi,"${utterance}",en)
exten => 1235,n,goto(end)

exten => 1235,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1235,n,goto(record)

exten => 1235,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1235,n(end),Hangup()
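
;Branching on the recognized text, also using googletts.agi.
;This is only a sketch: the extension number 1237, the label names and the
;word "yes" are assumptions, and the capitalization of the text returned in
;${utterance} may differ, so adjust the comparison to what you actually get.
exten => 1237,1,Answer()
exten => 1237,n,agi(googletts.agi,"Please say yes or no. When done press the pound key.",en)
exten => 1237,n,agi(speech-recog.agi,en-US)
exten => 1237,n,GotoIf($["${status}" = "0"]?check:fail)

exten => 1237,n(check),GotoIf($["${utterance}" = "yes"]?saidyes:saidother)

exten => 1237,n(saidyes),agi(googletts.agi,"You said yes.",en)
exten => 1237,n,goto(end)

exten => 1237,n(saidother),agi(googletts.agi,"You did not say yes.",en)
exten => 1237,n,goto(end)

exten => 1237,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1237,n(end),Hangup()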

-------------------
Supported Languages
-------------------
English
Afrikaans
Arabic
Chinese
Czech
Dutch
French
German
Hebrew
Italian
Indonesian
Japanese
Korean
Latin
Malaysian
Polish
Portuguese
Russian
Spanish
Turkish
Yue Chinese (Traditional Hong Kong)
Zulu

-------
License
-------
The speech-recog script for Asterisk is distributed under the GNU General Public
License v2. See COPYING for details.