
documenting classes to clarify TODOs to turn project into a general purpose speech recognition service which is offline, eyes-free and also accepts audio file input rather than recorder input
cesine committed May 29, 2011
1 parent 16c7885 commit 58ebe780ab131c0dafe2a2970a7c9c10efbc3089
.classpath
@@ -10,5 +10,6 @@
<classpathentry kind="con" path="com.android.ide.eclipse.adt.ANDROID_FRAMEWORK"/>
<classpathentry kind="con" path="org.eclipse.jdt.junit.JUNIT_CONTAINER/4"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/J2SE-1.3"/>
+ <classpathentry kind="lib" path="libs/LIUM_SpkDiarization-3.1.jar"/>
<classpathentry kind="output" path="bin"/>
</classpath>
README
@@ -2,19 +2,40 @@ About
This is currently a demo of the Sphinx (specifically PocketSphinx) Automatic Speech Recognition system, which runs on any Android device 2.2 or higher. The Sphinx service actually runs on the Android device, so no internet or server connection is needed.
+TODOs
+
+
+   TODO currently this project only demos the PocketSphinx speech recognizer; it doesn't make the recognizer available for other developers to call, or for the user to use generally.
+
+   TODO implement service.SpeechRecognizerViaFilePocketSphinx so that developers can pass a file to the speech recognizer and get back an array of arrays of hypotheses for the utterances in the audio (a rough interface sketch appears after this README section).
+
+   TODO implement service.SpeechRecognizerViaRecorderSphinx so that users can do speech recognition offline, without a network connection (the default speech recognizer provided by com.google.android.voicesearch has to be online, only accepts short utterances, and cannot be used eyes-free).
+
+
+
+History of this Demo:
+   Created by David Huggins-Daines <dhuggins@cs.cmu.edu> (sourceforge: dhdfu) and other contributors on the cmusphinx project
+   Turned into a very user-friendly demo app and APK with very few dependencies by Aasish Pappu (sourceforge: aasishp, github: aasish)
+ Infrastructure laid out for eyes-free offline speech recognition by github: cesine
+ Eyes-free offline speech recognition implemented by: maybe someone who knows pocketsphinx... ;)
-History
+The original source of the Android PocketSphinx Demo was started by the folks at cmusphinx; updates and improvements of the Demo should ultimately appear there.
-This is a fork of aasish, who added quite a bit of work to a fork of either
-* zachrattner pre Oct 27 2010, which was a fork of cmusphinx pre Oct 27 2010
- http://www.zachrattner.com/PocketSphinxDemo.tar.gz
-* or the original project on sourceforge
- http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/PocketSphinxAndroidDemo/
+How to contribute
-The original source of the Android PocketSphinx Demo was started by the folks at cmusphinx updates and improvements of the Demo should ultimately appear there.
+* Contributors needed:
+
+** A service to annotate utterances in audio using the SRT or WebVTT format, to be compatible with future ASR services such as Google Listen or the YouTube Captioning API http://www.youtube.com/watch?v=tua3DdacgOo&feature=player_embedded (a rough cue-formatting sketch follows this list)
+
+*** Potential directions:
+ 1. Java Sound API
+ 2. LIUM tools (already used with Sphinx)
+ http://liumtools.univ-lemans.fr//index.php?option=com_content&task=blogcategory&id=32&Itemid=60
+   3. Praat phonetic toolkit port to Android (allows for complex phonetic analysis of utterance-final features, more than just silence detection)
+** Turning the RecognizerTask and PocketSphinxAndroidDemo code into a SpeechRecognizer to allow it to register for the RecognizerIntent so that it can be used outside the demo
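
A rough cue-formatting sketch (the helper class below is hypothetical, not code that exists in this project; WebVTT cues pair HH:MM:SS.mmm --> HH:MM:SS.mmm timestamps with the recognized text):

    // Hypothetical helper, not part of this commit: formats one recognized
    // utterance (hypothesis text plus start/end times in milliseconds) as a WebVTT cue.
    public class WebVttCueFormatter {
        public static String formatCue(long startMs, long endMs, String hypothesis) {
            return timestamp(startMs) + " --> " + timestamp(endMs) + "\n" + hypothesis + "\n\n";
        }

        private static String timestamp(long ms) {
            long h = ms / 3600000;
            long m = (ms / 60000) % 60;
            long s = (ms / 1000) % 60;
            return String.format("%02d:%02d:%02d.%03d", h, m, s, ms % 1000);
        }
    }
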
-How to use this project
+How to set up this project on your machine
Do the following steps to set up your environment in Eclipse and run the Demo.
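
To make the file-input TODO above more concrete, here is a minimal sketch of what service.SpeechRecognizerViaFilePocketSphinx could expose; the method name and return type are assumptions based only on the README wording (a file in, an array of arrays of hypotheses out), not the project's actual API:

    package ca.ilanguage.labs.pocketsphinx.service;

    import java.io.File;
    import java.util.ArrayList;

    // Sketch only: this interface is an assumption drawn from the README TODO,
    // not code that exists in this commit.
    public interface SpeechRecognizerViaFilePocketSphinx {
        // One inner list of hypotheses per utterance detected in the audio file.
        ArrayList<ArrayList<String>> recognizeFile(File audioFile);
    }
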
libs/LIUM_SpkDiarization-3.1.jar (binary file, contents not shown)
RecognitionListenerPocketSphinx.java
@@ -2,61 +2,51 @@
import android.os.Bundle;
import android.speech.RecognitionListener;
-
-public class RecognitionListenerPocketSphinx implements RecognitionListener {
-
- @Override
- public void onBeginningOfSpeech() {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onBufferReceived(byte[] arg0) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onEndOfSpeech() {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onError(int error) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onEvent(int eventType, Bundle params) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onPartialResults(Bundle partialResults) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onReadyForSpeech(Bundle params) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onResults(Bundle results) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onRmsChanged(float rmsdB) {
- // TODO Auto-generated method stub
-
- }
+/**
+ * In theory this should be identical to {@link android.speech.RecognitionListener} and therefore unnecessary.
+ * However, using that interface directly would require implementing a lot of methods in any class that
+ * implements it (i.e., the SpeechRecognizer for PocketSphinx).
+ *
+ * TODO decide if this should be a reduced version of the RecognitionListener interface (as was done in the original
+ * by David Huggins-Daines <dhuggins@cs.cmu.edu> (dhdfu)), or a middle-ground version of the RecognitionListener,
+ * enough so that it is possible to register PocketSphinx with the device as a speech recognition system that is
+ * capable of handling the full functionality of the RecognizerIntent.
+ *
+ * TODO refer to {@link RecognitionListenerReduced} for an implementation of a subset of
+ * {@link android.speech.RecognitionListener}, which dhdfu did to avoid dependencies on Froyo and methods we don't
+ * need or can't provide. But the original manifest has since been changed to target Froyo, so maybe this design
+ * consideration is deprecated.
+ *
+ * The full list of methods follows.
+ */
+public interface RecognitionListenerPocketSphinx {
+
+	/** Called when the user has started to speak. */
+	public void onBeginningOfSpeech();
+
+	/** Called when more audio has been received from the microphone. */
+	public void onBufferReceived(byte[] buffer);
+
+	/** Called after the user stops speaking. */
+	public void onEndOfSpeech();
+
+	/** Called when a recognition or audio error occurred. */
+	public void onError(int error);
+
+	/** Reserved for future recognizer events. */
+	public void onEvent(int eventType, Bundle params);
+
+	/** Called when partial recognition results are available. */
+	public void onPartialResults(Bundle partialResults);
+
+	/** Called when the recognizer is ready for the user to start speaking. */
+	public void onReadyForSpeech(Bundle params);
+
+	/** Called when final recognition results are ready. */
+	public void onResults(Bundle results);
+
+	/** Called when the sound level of the audio stream changes. */
+	public void onRmsChanged(float rmsdB);
}
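
For reference, a minimal client of the interface above could look like the following sketch; the class name and logging are illustrative only and not part of the commit:

    import android.os.Bundle;
    import android.util.Log;

    // Illustrative sketch: implements every callback of RecognitionListenerPocketSphinx,
    // logging the ones that carry results and ignoring the rest.
    public class LoggingRecognitionListener implements RecognitionListenerPocketSphinx {
        private static final String TAG = "PocketSphinxDemo";

        public void onReadyForSpeech(Bundle params) { Log.d(TAG, "ready for speech"); }
        public void onBeginningOfSpeech() { Log.d(TAG, "beginning of speech"); }
        public void onBufferReceived(byte[] buffer) { /* raw audio chunks, ignored here */ }
        public void onRmsChanged(float rmsdB) { /* could drive a microphone level meter */ }
        public void onPartialResults(Bundle partialResults) { Log.d(TAG, "partial: " + partialResults); }
        public void onResults(Bundle results) { Log.d(TAG, "results: " + results); }
        public void onEndOfSpeech() { Log.d(TAG, "end of speech"); }
        public void onError(int error) { Log.e(TAG, "error code " + error); }
        public void onEvent(int eventType, Bundle params) { /* reserved */ }
    }
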
RecognizerServicePocketSphinx.java
@@ -2,7 +2,47 @@
import android.content.Intent;
import android.speech.RecognitionService;
-
+/**
+ * TODO Implementing and registering this service in the Manifest will allow it to show up in the device settings as a
+ * speech recognizer service. The user can then choose which service will be the default. The service should be callable
+ * in two ways:
+ *
+ * 1. As the default speech recognition service device-wide (configured by the user in settings).
+   Example registration in the manifest:
+   <service android:name="ca.ilanguage.labs.pocketsphinx.service.RecognizerServicePocketSphinx"
+           android:label="@string/service_name">
+
+       <intent-filter>
+           <!-- Here we identify that we are a RecognitionService by specifying that
+                we satisfy RecognitionService's interface intent.
+                The constant value is defined at RecognitionService.SERVICE_INTERFACE. -->
+           <action android:name="android.speech.RecognitionService" />
+           <category android:name="android.intent.category.DEFAULT" />
+       </intent-filter>
+
+       <!-- This points to a metadata xml file that contains information about this
+            RecognitionService - specifically, the name of the settings activity to
+            expose in system settings.
+            The constant value is defined at RecognitionService.SERVICE_META_DATA. -->
+       <meta-data android:name="android.speech" android:resource="@xml/recognizer" />
+
+   </service>
+ * 2. As a speech recognition service called directly via an Intent (hard-coded by developers in their code; they can also
+ *    check whether the package manager has this package installed and, if not, prompt the user to install it).
+ * Example client code:
+   PackageManager pm = getPackageManager();
+   List<ResolveInfo> activities = pm.queryIntentActivities(
+           new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH), 0);
+   if (activities.size() == 0) {
+       Intent goToMarket = new Intent(Intent.ACTION_VIEW)
+               .setData(Uri.parse("market://details?id=ca.ilanguage.labs.pocketsphinx"));
+       startActivity(goToMarket);
+   }
+
+ *
+ * @author cesine
+ *
+ */
public class RecognizerServicePocketSphinx extends RecognitionService {
@Override
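
The rest of the class body is not shown above. For context, android.speech.RecognitionService requires exactly three callbacks; a minimal skeleton, with placeholder bodies rather than the commit's actual implementation, would look roughly like this:

    import android.content.Intent;
    import android.speech.RecognitionService;

    // Sketch only: placeholder bodies; wiring these callbacks to the PocketSphinx
    // RecognizerTask is what the TODOs in this commit describe.
    public class RecognizerServicePocketSphinxSketch extends RecognitionService {
        @Override
        protected void onStartListening(Intent recognizerIntent, Callback listener) {
            // start decoding audio and forward partial/final results to listener
        }

        @Override
        protected void onStopListening(Callback listener) {
            // stop capturing audio, but let decoding of buffered audio finish
        }

        @Override
        protected void onCancel(Callback listener) {
            // abort any in-progress recognition
        }
    }
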
