
documenting classes to clarify TODOs to turn project into a general purpose speech recognition service which is offline, eyes-free and also accepts audio file input rather than recorder input
1 parent 16c7885 commit 58ebe780ab131c0dafe2a2970a7c9c10efbc3089 @cesine committed
1 .classpath
@@ -10,5 +10,6 @@
<classpathentry kind="con" path="com.android.ide.eclipse.adt.ANDROID_FRAMEWORK"/>
<classpathentry kind="con" path="org.eclipse.jdt.junit.JUNIT_CONTAINER/4"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/J2SE-1.3"/>
+ <classpathentry kind="lib" path="libs/LIUM_SpkDiarization-3.1.jar"/>
<classpathentry kind="output" path="bin"/>
</classpath>
37 README
@@ -2,19 +2,40 @@ About
This is currently a demo of the Sphinx (specifically PocketSphinx) Automatic Speech Recognition system, which runs on any Android device 2.2 or higher. The Sphinx service actually runs on the device, so no internet or server connection is needed.
+TODOs
+
+
+ TODO Currently this demo only showcases the PocketSphinx speech recognizer; it does not make it available for other developers to call, or for the user to use generally.
+
+ TODO Implement service.SpeechRecognizerViaFilePocketSphinx so that developers can pass a file to the speech recognizer and get back an array of arrays of hypotheses for the utterances in the audio.
+
+ TODO Implement service.SpeechRecognizerViaRecorderSphinx so that users can do speech recognition offline, without a network connection. (The default speech recognizer provided by com.google.android.voicesearch has to be online, only accepts short utterances, and cannot be used eyes-free.)
+
+
+
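The file-based recognizer TODO above implies a result shape of an array of arrays of hypotheses: one inner array of n-best guesses per utterance, best first. A minimal plain-Java sketch of such a result holder (the class name RecognitionResults is hypothetical; nothing like it exists in the project yet):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical result holder: one inner list of ranked hypotheses per utterance.
public class RecognitionResults {
    private final List<List<String>> utterances = new ArrayList<List<String>>();

    // Append the n-best hypotheses for one utterance, best first.
    public void addUtterance(String... rankedHypotheses) {
        utterances.add(Arrays.asList(rankedHypotheses));
    }

    // Best hypothesis for utterance i (the first element of its list).
    public String bestHypothesis(int i) {
        return utterances.get(i).get(0);
    }

    public int utteranceCount() {
        return utterances.size();
    }

    public static void main(String[] args) {
        RecognitionResults results = new RecognitionResults();
        results.addUtterance("hello world", "hollow world");
        results.addUtterance("good bye");
        System.out.println(results.utteranceCount());  // 2
        System.out.println(results.bestHypothesis(0)); // hello world
    }
}
```

A real implementation would attach timestamps and confidence scores to each hypothesis; this only pins down the nested-array shape the TODO describes.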
+History of this Demo:
+ Created by David Huggins-Daines <dhuggins@cs.cmu.edu> (sourceforge: dhdfu) and other contributors at the cmusphinx project
+ Turned into a very user-friendly Demo app and apk with very few dependencies by Aasish Pappu (sourceforge: aasishp, github: aasish)
+ Infrastructure laid out for eyes-free offline speech recognition by github: cesine
+ Eyes-free offline speech recognition implemented by: maybe someone who knows pocketsphinx... ;)
-History
+The original source of the Android PocketSphinx Demo was started by the folks at cmusphinx; updates and improvements of the Demo should ultimately appear there.
-This is a fork of aasish, who added quite a bit of work to a fork of either
-* zachrattner pre Oct 27 2010, which was a fork of cmusphinx pre Oct 27 2010
- http://www.zachrattner.com/PocketSphinxDemo.tar.gz
-* or the original project on sourceforge
- http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/PocketSphinxAndroidDemo/
+How to contribute
-The original source of the Android PocketSphinx Demo was started by the folks at cmusphinx updates and improvements of the Demo should ultimately appear there.
+* Contributors needed:
+
+** Service to annotate utterances in audio using the srt format or the WebVTT format, to be compatible with future ASR recognition services, e.g. Google Listen or the YouTube Captioning API http://www.youtube.com/watch?v=tua3DdacgOo&feature=player_embedded
+
+*** Potential directions:
+ 1. Java Sound API
+ 2. LIUM tools (already used with Sphinx)
+ http://liumtools.univ-lemans.fr//index.php?option=com_content&task=blogcategory&id=32&Itemid=60
+ 3. Praat Phonetic toolkit port to Android (allows for complex phonetic analysis (more than just silence detection) of utterance-final features)
+** Turning the RecognizerTask and PocketSphinxAndroidDemo code into a SpeechRecognizer to allow it to register for the RecognizerIntent so that it can be used outside the demo
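For the srt/WebVTT annotation service mentioned above, the core output is one timed cue per utterance. A rough sketch of WebVTT cue formatting in plain Java (timestamps use the HH:MM:SS.mmm layout defined by the WebVTT format; the class and method names here are made up for illustration):

```java
// Minimal sketch of emitting a WebVTT cue for one detected utterance.
// Timestamp layout per the WebVTT format: HH:MM:SS.mmm
public class WebVttCue {
    static String timestamp(long millis) {
        long h = millis / 3600000;
        long m = (millis % 3600000) / 60000;
        long s = (millis % 60000) / 1000;
        long ms = millis % 1000;
        return String.format("%02d:%02d:%02d.%03d", h, m, s, ms);
    }

    // One cue: a start/end timing line followed by the recognized text.
    static String cue(long startMs, long endMs, String text) {
        return timestamp(startMs) + " --> " + timestamp(endMs) + "\n" + text;
    }

    public static void main(String[] args) {
        System.out.println("WEBVTT\n");
        System.out.println(cue(1500, 4250, "hello world"));
        // prints:
        // 00:00:01.500 --> 00:00:04.250
        // hello world
    }
}
```

srt output would be nearly identical, with a comma instead of a period before the milliseconds and a numeric index above each cue.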
-How to use this project
+How to set up this project on your machine
Do the following steps to set up your environment in Eclipse and run the Demo.
BIN libs/LIUM_SpkDiarization-3.1.jar
Binary file not shown.
102 src/ca/ilanguage/labs/pocketsphinx/service/RecognitionListenerPocketSphinx.java
@@ -2,61 +2,51 @@
import android.os.Bundle;
import android.speech.RecognitionListener;
-
-public class RecognitionListenerPocketSphinx implements RecognitionListener {
-
- @Override
- public void onBeginningOfSpeech() {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onBufferReceived(byte[] arg0) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onEndOfSpeech() {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onError(int error) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onEvent(int eventType, Bundle params) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onPartialResults(Bundle partialResults) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onReadyForSpeech(Bundle params) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onResults(Bundle results) {
- // TODO Auto-generated method stub
-
- }
-
- @Override
- public void onRmsChanged(float rmsdB) {
- // TODO Auto-generated method stub
-
- }
+/**
+ *
+ * In theory this should be identical to {@link android.speech.RecognitionListener} and therefore unnecessary.
+ * However, using that interface would require implementing a lot of methods in any class that extends it (i.e., the SpeechRecognizer for PocketSphinx).
+ *
+ * TODO Decide if this should be a reduced version of the RecognitionListener interface (as was done in the original
+ * by David Huggins-Daines <dhuggins@cs.cmu.edu> (dhdfu)), or a middle-ground version of the RecognitionListener, enough that it is possible to register
+ * PocketSphinx with the device as a speech recognition system capable of handling the full functionality of the RecognizerIntent.
+ *
+ *
+ * TODO Refer to {@link RecognitionListenerReduced} for an implementation of a subset of
+ * {@link android.speech.RecognitionListener}, which dhdfu wrote to avoid dependencies on Froyo and on methods we don't need or can't provide. But the
+ * original manifest was changed to Froyo, so this design consideration may be deprecated.
+ *
+ * The full list of methods is below.
+ *
+ *
+ */
+public interface RecognitionListenerPocketSphinx {
+
+ public void onBeginningOfSpeech();
+
+ public void onBufferReceived(byte[] arg0);
+
+ public void onEndOfSpeech();
+
+ public void onError(int error);
+
+ public void onEvent(int eventType, Bundle params);
+
+ public void onPartialResults(Bundle partialResults);
+
+ public void onReadyForSpeech(Bundle params);
+
+ public void onResults(Bundle results);
+
+ public void onRmsChanged(float rmsdB);
}
42 src/ca/ilanguage/labs/pocketsphinx/service/RecognizerServicePocketSphinx.java
@@ -2,7 +2,47 @@
import android.content.Intent;
import android.speech.RecognitionService;
-
+/**
+ * TODO Implementation and registration of this service in the Manifest will allow it to appear in the device settings as a
+ * speech recognizer service. The user can then choose which service will be the default. The service should be callable in two ways:
+ *
+ *
+ * 1. As the default speech recognition service device-wide (configured by the user in settings).
+ Example registration in the manifest:
+ <service android:name="ca.ilanguage.labs.pocketsphinx.service.RecognizerServicePocketSphinx"
+ android:label="@string/service_name">
+
+ <intent-filter>
+ <!-- Here we identify that we are a RecognitionService by specifying that
+ we satisfy RecognitionService's interface intent.
+ The constant value is defined at RecognitionService.SERVICE_INTERFACE. -->
+ <action android:name="android.speech.RecognitionService" />
+ <category android:name="android.intent.category.DEFAULT" />
+ </intent-filter>
+
+ <!-- This points to a metadata xml file that contains information about this
+ RecognitionService - specifically, the name of the settings activity to
+ expose in system settings.
+ The constant value is defined at RecognitionService.SERVICE_META_DATA. -->
+ <meta-data android:name="android.speech" android:resource="@xml/recognizer" />
+
+ </service>
+ * 2. As a speech recognition service called by an open Intent (hardcoded by developers in their code; they can also check whether the package manager
+ * has this package installed and, if not, prompt the user to install it).
+ * Example client code:
+ PackageManager pm = getPackageManager();
+ List<ResolveInfo> activities = pm.queryIntentActivities(
+ new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH), 0);
+ if (activities.size() == 0) {
+ Intent goToMarket = new Intent(Intent.ACTION_VIEW)
+ .setData(Uri.parse("market://details?id=ca.ilanguage.labs.pocketsphinx"));
+ startActivity(goToMarket);
+ }
+
+ *
+ * @author cesine
+ *
+ */
public class RecognizerServicePocketSphinx extends RecognitionService {
@Override
482 src/ca/ilanguage/labs/pocketsphinx/service/SpeechRecognizerViaFilePocketSphinx.java
@@ -18,471 +18,35 @@
* limitations under the License.
*/
-
-import android.content.ComponentName;
import android.content.Context;
-import android.content.Intent;
-import android.content.ServiceConnection;
-import android.content.pm.ResolveInfo;
-import android.os.Bundle;
-import android.os.Handler;
-import android.os.IBinder;
-import android.os.Looper;
-import android.os.Message;
-import android.os.RemoteException;
-import android.provider.Settings;
-import android.speech.RecognitionListener;
-import android.speech.RecognitionService;
-import android.text.TextUtils;
-import android.util.Log;
-
-import java.util.LinkedList;
-import java.util.List;
-import java.util.Queue;
/**
- * This class provides access to the speech recognition service. This service allows access to the
+ * This class provides access to the speech recognition service, running on an MP3 file already present on the device. This service allows access to the
* speech recognizer. Do not instantiate this class directly, instead, call
* {@link SpeechRecognizerViaFilePocketSphinx#createSpeechRecognizer(Context)}. This class's methods must be
- * invoked only from the main application thread. Please note that the application must have
+ * invoked only from the main application thread.
+ *
+ * TODO This class should take an audio file and:
+ * 1. chunk it on pauses
+ * 2. send each chunk to be recognized (either on the device or on a server)
+ * 3. return the results as an array of arrays of hypotheses
+ *
+ *
+ * TODO For inspiration on how this can be done see a combination of android.speech.SpeechRecognizer, PocketSphinxAndroidDemo, RecognizerTask
+ *
+ * TODO Learn how PocketSphinx works and how to change PocketSphinxAndroidDemo to work on a file instead of a recording.
+ * Some info might be here: http://sourceforge.net/projects/cmusphinx/forums/forum/5471/topic/4023606
+ * "In recent version pocketsphinx_continuous has -infile argument to pass file to decode."
+ *
+ * TODO How to do the chunking; two options:
+ * 1. The LIUM tools allow for speech stream segmentation and speaker recognition.
+ * Documentation is for command line use (or perl); TODO figure out how to use it programmatically.
+ * 2. Port Praat to Android.
+ * Praat is written in C++ and is a powerful phonetic analysis tool used by phoneticians; we might need it anyway to use additional prosodic information to improve Sphinx's recognition.
+ *
+ * Please note that the application DOES NOT NEED
* {@link android.Manifest.permission#RECORD_AUDIO} permission to use this class.
*/
public class SpeechRecognizerViaFilePocketSphinx {
-// /** DEBUG value to enable verbose debug prints */
-// private final static boolean DBG = false;
-// /*workaround for Settings.Secure. missing this constant */
-// public static final String VOICE_RECOGNITION_SERVICE = "voice_recognition_service";
-//
-//
-// /** Log messages identifier */
-// private static final String TAG = "SpeechRecognizer";
-//
-// /**
-// * Used to retrieve an {@code ArrayList&lt;String&gt;} from the {@link Bundle} passed to the
-// * {@link RecognitionListener#onResults(Bundle)} and
-// * {@link RecognitionListener#onPartialResults(Bundle)} methods. These strings are the possible
-// * recognition results, where the first element is the most likely candidate.
-// */
-// public static final String RESULTS_RECOGNITION = "results_recognition";
-//
-// /** Network operation timed out. */
-// public static final int ERROR_NETWORK_TIMEOUT = 1;
-//
-// /** Other network related errors. */
-// public static final int ERROR_NETWORK = 2;
-//
-// /** Audio recording error. */
-// public static final int ERROR_AUDIO = 3;
-//
-// /** Server sends error status. */
-// public static final int ERROR_SERVER = 4;
-//
-// /** Other client side errors. */
-// public static final int ERROR_CLIENT = 5;
-//
-// /** No speech input */
-// public static final int ERROR_SPEECH_TIMEOUT = 6;
-//
-// /** No recognition result matched. */
-// public static final int ERROR_NO_MATCH = 7;
-//
-// /** RecognitionService busy. */
-// public static final int ERROR_RECOGNIZER_BUSY = 8;
-//
-// /** Insufficient permissions */
-// public static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;
-//
-// /** action codes */
-// private final static int MSG_START = 1;
-// private final static int MSG_STOP = 2;
-// private final static int MSG_CANCEL = 3;
-// private final static int MSG_CHANGE_LISTENER = 4;
-//
-// /** The actual RecognitionService endpoint */
-// private IRecognitionService mService;
-//
-// /** The connection to the actual service */
-// private Connection mConnection;
-//
-// /** Context with which the manager was created */
-// private final Context mContext;
-//
-// /** Component to direct service intent to */
-// private final ComponentName mServiceComponent;
-//
-// /** Handler that will execute the main tasks */
-// private Handler mHandler = new Handler() {
-// @Override
-// public void handleMessage(Message msg) {
-// switch (msg.what) {
-// case MSG_START:
-// handleStartListening((Intent) msg.obj);
-// break;
-// case MSG_STOP:
-// handleStopMessage();
-// break;
-// case MSG_CANCEL:
-// handleCancelMessage();
-// break;
-// case MSG_CHANGE_LISTENER:
-// handleChangeListener((RecognitionListener) msg.obj);
-// break;
-// }
-// }
-// };
-//
-// /**
-// * Temporary queue, saving the messages until the connection will be established, afterwards,
-// * only mHandler will receive the messages
-// */
-// private final Queue<Message> mPendingTasks = new LinkedList<Message>();
-//
-// /** The Listener that will receive all the callbacks */
-// private final InternalListener mListener = new InternalListener();
-//
-// /**
-// * The right way to create a {@code SpeechRecognizer} is by using
-// * {@link #createSpeechRecognizer} static factory method
-// */
-// private SpeechRecognizerViaFilePocketSphinx(final Context context, final ComponentName serviceComponent) {
-// mContext = context;
-// mServiceComponent = serviceComponent;
-// }
-//
-// /**
-// * Basic ServiceConnection that records the mService variable. Additionally, on creation it
-// * invokes the {@link IRecognitionService#startListening(Intent, IRecognitionListener)}.
-// */
-// private class Connection implements ServiceConnection {
-//
-// public void onServiceConnected(final ComponentName name, final IBinder service) {
-// // always done on the application main thread, so no need to send message to mHandler
-// mService = IRecognitionService.Stub.asInterface(service);
-// if (DBG) Log.d(TAG, "onServiceConnected - Success");
-// while (!mPendingTasks.isEmpty()) {
-// mHandler.sendMessage(mPendingTasks.poll());
-// }
-// }
-//
-// public void onServiceDisconnected(final ComponentName name) {
-// // always done on the application main thread, so no need to send message to mHandler
-// mService = null;
-// mConnection = null;
-// mPendingTasks.clear();
-// if (DBG) Log.d(TAG, "onServiceDisconnected - Success");
-// }
-// }
-//
-// /**
-// * Checks whether a speech recognition service is available on the system. If this method
-// * returns {@code false}, {@link SpeechRecognizerViaFilePocketSphinx#createSpeechRecognizer(Context)} will
-// * fail.
-// *
-// * @param context with which {@code SpeechRecognizer} will be created
-// * @return {@code true} if recognition is available, {@code false} otherwise
-// */
-// public static boolean isRecognitionAvailable(final Context context) {
-// final List<ResolveInfo> list = context.getPackageManager().queryIntentServices(
-// new Intent(RecognitionService.SERVICE_INTERFACE), 0);
-// return list != null && list.size() != 0;
-// }
-//
-// /**
-// * Factory method to create a new {@code SpeechRecognizer}. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called before dispatching any
-// * command to the created {@code SpeechRecognizer}, otherwise no notifications will be
-// * received.
-// *
-// * @param context in which to create {@code SpeechRecognizer}
-// * @return a new {@code SpeechRecognizer}
-// */
-// public static SpeechRecognizerViaFilePocketSphinx createSpeechRecognizer(final Context context) {
-// return createSpeechRecognizer(context, null);
-// }
-//
-// /**
-// * Factory method to create a new {@code SpeechRecognizer}. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called before dispatching any
-// * command to the created {@code SpeechRecognizer}, otherwise no notifications will be
-// * received.
-// *
-// * Use this version of the method to specify a specific service to direct this
-// * {@link SpeechRecognizerViaFilePocketSphinx} to. Normally you would not use this; use
-// * {@link #createSpeechRecognizer(Context)} instead to use the system default recognition
-// * service.
-// *
-// * @param context in which to create {@code SpeechRecognizer}
-// * @param serviceComponent the {@link ComponentName} of a specific service to direct this
-// * {@code SpeechRecognizer} to
-// * @return a new {@code SpeechRecognizer}
-// */
-// public static SpeechRecognizerViaFilePocketSphinx createSpeechRecognizer(final Context context,
-// final ComponentName serviceComponent) {
-// if (context == null) {
-// throw new IllegalArgumentException("Context cannot be null)");
-// }
-// checkIsCalledFromMainThread();
-// return new SpeechRecognizerViaFilePocketSphinx(context, serviceComponent);
-// }
-//
-// /**
-// * Sets the listener that will receive all the callbacks. The previous unfinished commands will
-// * be executed with the old listener, while any following command will be executed with the new
-// * listener.
-// *
-// * @param listener listener that will receive all the callbacks from the created
-// * {@link SpeechRecognizerViaFilePocketSphinx}, this must not be null.
-// */
-// public void setRecognitionListener(RecognitionListener listener) {
-// checkIsCalledFromMainThread();
-// putMessage(Message.obtain(mHandler, MSG_CHANGE_LISTENER, listener));
-// }
-//
-// /**
-// * Starts listening for speech. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called beforehand, otherwise
-// * no notifications will be received.
-// *
-// * @param recognizerIntent contains parameters for the recognition to be performed. The intent
-// * may also contain optional extras, see {@link RecognizerIntent}. If these values are
-// * not set explicitly, default values will be used by the recognizer.
-// */
-// public void startListening(final Intent recognizerIntent) {
-// if (recognizerIntent == null) {
-// throw new IllegalArgumentException("intent must not be null");
-// }
-// checkIsCalledFromMainThread();
-// if (mConnection == null) { // first time connection
-// mConnection = new Connection();
-//
-// Intent serviceIntent = new Intent(RecognitionService.SERVICE_INTERFACE);
-//
-// if (mServiceComponent == null) {
-// String serviceComponent = Settings.Secure.getString(mContext.getContentResolver(),
-// VOICE_RECOGNITION_SERVICE);
-//
-// if (TextUtils.isEmpty(serviceComponent)) {
-// Log.e(TAG, "no selected voice recognition service");
-// mListener.onError(ERROR_CLIENT);
-// return;
-// }
-//
-// serviceIntent.setComponent(ComponentName.unflattenFromString(serviceComponent));
-// } else {
-// serviceIntent.setComponent(mServiceComponent);
-// }
-//
-// if (!mContext.bindService(serviceIntent, mConnection, Context.BIND_AUTO_CREATE)) {
-// Log.e(TAG, "bind to recognition service failed");
-// mConnection = null;
-// mService = null;
-// mListener.onError(ERROR_CLIENT);
-// return;
-// }
-// }
-// putMessage(Message.obtain(mHandler, MSG_START, recognizerIntent));
-// }
-//
-// /**
-// * Stops listening for speech. Speech captured so far will be recognized as if the user had
-// * stopped speaking at this point. Note that in the default case, this does not need to be
-// * called, as the speech endpointer will automatically stop the recognizer listening when it
-// * determines speech has completed. However, you can manipulate endpointer parameters directly
-// * using the intent extras defined in {@link RecognizerIntent}, in which case you may sometimes
-// * want to manually call this method to stop listening sooner. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called beforehand, otherwise
-// * no notifications will be received.
-// */
-// public void stopListening() {
-// checkIsCalledFromMainThread();
-// putMessage(Message.obtain(mHandler, MSG_STOP));
-// }
-//
-// /**
-// * Cancels the speech recognition. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called beforehand, otherwise
-// * no notifications will be received.
-// */
-// public void cancel() {
-// checkIsCalledFromMainThread();
-// putMessage(Message.obtain(mHandler, MSG_CANCEL));
-// }
-//
-// private static void checkIsCalledFromMainThread() {
-// if (Looper.myLooper() != Looper.getMainLooper()) {
-// throw new RuntimeException(
-// "SpeechRecognizer should be used only from the application's main thread");
-// }
-// }
-//
-// private void putMessage(Message msg) {
-// if (mService == null) {
-// mPendingTasks.offer(msg);
-// } else {
-// mHandler.sendMessage(msg);
-// }
-// }
-//
-// /** sends the actual message to the service */
-// private void handleStartListening(Intent recognizerIntent) {
-// if (!checkOpenConnection()) {
-// return;
-// }
-// try {
-// mService.startListening(recognizerIntent, mListener);
-// if (DBG) Log.d(TAG, "service start listening command succeded");
-// } catch (final RemoteException e) {
-// Log.e(TAG, "startListening() failed", e);
-// mListener.onError(ERROR_CLIENT);
-// }
-// }
-//
-// /** sends the actual message to the service */
-// private void handleStopMessage() {
-// if (!checkOpenConnection()) {
-// return;
-// }
-// try {
-// mService.stopListening(mListener);
-// if (DBG) Log.d(TAG, "service stop listening command succeded");
-// } catch (final RemoteException e) {
-// Log.e(TAG, "stopListening() failed", e);
-// mListener.onError(ERROR_CLIENT);
-// }
-// }
-//
-// /** sends the actual message to the service */
-// private void handleCancelMessage() {
-// if (!checkOpenConnection()) {
-// return;
-// }
-// try {
-// mService.cancel(mListener);
-// if (DBG) Log.d(TAG, "service cancel command succeded");
-// } catch (final RemoteException e) {
-// Log.e(TAG, "cancel() failed", e);
-// mListener.onError(ERROR_CLIENT);
-// }
-// }
-//
-// private boolean checkOpenConnection() {
-// if (mService != null) {
-// return true;
-// }
-// mListener.onError(ERROR_CLIENT);
-// Log.e(TAG, "not connected to the recognition service");
-// return false;
-// }
-//
-// /** changes the listener */
-// private void handleChangeListener(RecognitionListener listener) {
-// if (DBG) Log.d(TAG, "handleChangeListener, listener=" + listener);
-// mListener.mInternalListener = listener;
-// }
-//
-// /**
-// * Destroys the {@code SpeechRecognizer} object.
-// */
-// public void destroy() {
-// if (mConnection != null) {
-// mContext.unbindService(mConnection);
-// }
-// mPendingTasks.clear();
-// mService = null;
-// mConnection = null;
-// mListener.mInternalListener = null;
-// }
-//
-// /**
-// * Internal wrapper of IRecognitionListener which will propagate the results to
-// * RecognitionListener
-// */
-// private class InternalListener extends IRecognitionListener.Stub {
-// private RecognitionListener mInternalListener;
-//
-// private final static int MSG_BEGINNING_OF_SPEECH = 1;
-// private final static int MSG_BUFFER_RECEIVED = 2;
-// private final static int MSG_END_OF_SPEECH = 3;
-// private final static int MSG_ERROR = 4;
-// private final static int MSG_READY_FOR_SPEECH = 5;
-// private final static int MSG_RESULTS = 6;
-// private final static int MSG_PARTIAL_RESULTS = 7;
-// private final static int MSG_RMS_CHANGED = 8;
-// private final static int MSG_ON_EVENT = 9;
-//
-// private final Handler mInternalHandler = new Handler() {
-// @Override
-// public void handleMessage(Message msg) {
-// if (mInternalListener == null) {
-// return;
-// }
-// switch (msg.what) {
-// case MSG_BEGINNING_OF_SPEECH:
-// mInternalListener.onBeginningOfSpeech();
-// break;
-// case MSG_BUFFER_RECEIVED:
-// mInternalListener.onBufferReceived((byte[]) msg.obj);
-// break;
-// case MSG_END_OF_SPEECH:
-// mInternalListener.onEndOfSpeech();
-// break;
-// case MSG_ERROR:
-// mInternalListener.onError((Integer) msg.obj);
-// break;
-// case MSG_READY_FOR_SPEECH:
-// mInternalListener.onReadyForSpeech((Bundle) msg.obj);
-// break;
-// case MSG_RESULTS:
-// mInternalListener.onResults((Bundle) msg.obj);
-// break;
-// case MSG_PARTIAL_RESULTS:
-// mInternalListener.onPartialResults((Bundle) msg.obj);
-// break;
-// case MSG_RMS_CHANGED:
-// mInternalListener.onRmsChanged((Float) msg.obj);
-// break;
-// case MSG_ON_EVENT:
-// mInternalListener.onEvent(msg.arg1, (Bundle) msg.obj);
-// break;
-// }
-// }
-// };
-//
-// public void onBeginningOfSpeech() {
-// Message.obtain(mInternalHandler, MSG_BEGINNING_OF_SPEECH).sendToTarget();
-// }
-//
-// public void onBufferReceived(final byte[] buffer) {
-// Message.obtain(mInternalHandler, MSG_BUFFER_RECEIVED, buffer).sendToTarget();
-// }
-//
-// public void onEndOfSpeech() {
-// Message.obtain(mInternalHandler, MSG_END_OF_SPEECH).sendToTarget();
-// }
-//
-// public void onError(final int error) {
-// Message.obtain(mInternalHandler, MSG_ERROR, error).sendToTarget();
-// }
-//
-// public void onReadyForSpeech(final Bundle noiseParams) {
-// Message.obtain(mInternalHandler, MSG_READY_FOR_SPEECH, noiseParams).sendToTarget();
-// }
-//
-// public void onResults(final Bundle results) {
-// Message.obtain(mInternalHandler, MSG_RESULTS, results).sendToTarget();
-// }
-//
-// public void onPartialResults(final Bundle results) {
-// Message.obtain(mInternalHandler, MSG_PARTIAL_RESULTS, results).sendToTarget();
-// }
-//
-// public void onRmsChanged(final float rmsdB) {
-// Message.obtain(mInternalHandler, MSG_RMS_CHANGED, rmsdB).sendToTarget();
-// }
-//
-// public void onEvent(final int eventType, final Bundle params) {
-// Message.obtain(mInternalHandler, MSG_ON_EVENT, eventType, eventType, params)
-// .sendToTarget();
-// }
-// }
+
}
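The "chunk on pauses" step described in the TODOs above could, in the simplest case, be a naive energy threshold over PCM frames; the LIUM tools or Praat would segment far more robustly. A hedged plain-Java sketch of that simplest case (SilenceChunker is hypothetical and not part of this project):

```java
import java.util.ArrayList;
import java.util.List;

// Naive energy-based utterance chunker: a sketch of "chunk on pauses" only.
// Real segmentation (LIUM, Praat) handles noise and speakers far better.
public class SilenceChunker {
    // Returns {start, end} sample-index pairs for regions whose per-frame
    // mean absolute amplitude stays above the threshold.
    static List<int[]> chunk(short[] pcm, int frameSize, double threshold) {
        List<int[]> segments = new ArrayList<int[]>();
        int start = -1;
        for (int i = 0; i < pcm.length; i += frameSize) {
            int end = Math.min(i + frameSize, pcm.length);
            double sum = 0;
            for (int j = i; j < end; j++) sum += Math.abs(pcm[j]);
            boolean voiced = (sum / (end - i)) > threshold;
            if (voiced && start < 0) start = i;  // speech begins
            if (!voiced && start >= 0) {         // pause: close the segment
                segments.add(new int[] { start, i });
                start = -1;
            }
        }
        if (start >= 0) segments.add(new int[] { start, pcm.length });
        return segments;
    }

    public static void main(String[] args) {
        short[] pcm = new short[400];
        for (int i = 100; i < 200; i++) pcm[i] = 1000; // one "utterance"
        List<int[]> segs = chunk(pcm, 50, 100.0);
        System.out.println(segs.size() + " segment(s): "
                + segs.get(0)[0] + ".." + segs.get(0)[1]); // 1 segment(s): 100..200
    }
}
```

Each returned segment would then be handed to the recognizer in turn, yielding one inner hypothesis array per segment.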
470 src/ca/ilanguage/labs/pocketsphinx/service/SpeechRecognizerViaRecordPocketSphinx.java
@@ -19,25 +19,7 @@
*/
-import android.content.ComponentName;
import android.content.Context;
-import android.content.Intent;
-import android.content.ServiceConnection;
-import android.content.pm.ResolveInfo;
-import android.os.Bundle;
-import android.os.Handler;
-import android.os.IBinder;
-import android.os.Looper;
-import android.os.Message;
-import android.os.RemoteException;
-import android.provider.Settings;
-import android.speech.RecognitionService;
-import android.text.TextUtils;
-import android.util.Log;
-
-import java.util.LinkedList;
-import java.util.List;
-import java.util.Queue;
/**
* This class provides access to the speech recognition service. This service allows access to the
@@ -45,443 +27,21 @@
* {@link SpeechRecognizerViaRecordPocketSphinx#createSpeechRecognizer(Context)}. This class's methods must be
* invoked only from the main application thread. Please note that the application must have
* {@link android.Manifest.permission#RECORD_AUDIO} permission to use this class.
+ *
+ * TODO This class differs from the default system android.speech.SpeechRecognizer in that:
+ * 1. It runs with no network connection (using PocketSphinx running on the device).
+ * 2. It allows long audio recordings (it does not automatically stop listening when it detects silence).
+ * It stops recording based on the user's preference:
+ * a. on back button push
+ * b. on screen touch
+ * c. on screen swipe from top to bottom
+ * d. on voice command (more difficult to implement, but preferred for eyes-free use, and for recording while the user is doing something else with the screen)
+ *
+ *
+ * TODO For inspiration on how this can be done, see a combination of android.speech.SpeechRecognizer, PocketSphinxAndroidDemo, RecognizerTask
+ *
+ *
*/
public class SpeechRecognizerViaRecordPocketSphinx {
-// /** DEBUG value to enable verbose debug prints */
-// private final static boolean DBG = false;
-// /*workaround for Settings.Secure. missing this constant */
-// public static final String VOICE_RECOGNITION_SERVICE = "voice_recognition_service";
-//
-//
-// /** Log messages identifier */
-// private static final String TAG = "SpeechRecognizer";
-//
-// /**
-// * Used to retrieve an {@code ArrayList&lt;String&gt;} from the {@link Bundle} passed to the
-// * {@link RecognitionListener#onResults(Bundle)} and
-// * {@link RecognitionListener#onPartialResults(Bundle)} methods. These strings are the possible
-// * recognition results, where the first element is the most likely candidate.
-// */
-// public static final String RESULTS_RECOGNITION = "results_recognition";
-//
-// /** Network operation timed out. */
-// public static final int ERROR_NETWORK_TIMEOUT = 1;
-//
-// /** Other network related errors. */
-// public static final int ERROR_NETWORK = 2;
-//
-// /** Audio recording error. */
-// public static final int ERROR_AUDIO = 3;
-//
-// /** Server sends error status. */
-// public static final int ERROR_SERVER = 4;
-//
-// /** Other client side errors. */
-// public static final int ERROR_CLIENT = 5;
-//
-// /** No speech input */
-// public static final int ERROR_SPEECH_TIMEOUT = 6;
-//
-// /** No recognition result matched. */
-// public static final int ERROR_NO_MATCH = 7;
-//
-// /** RecognitionService busy. */
-// public static final int ERROR_RECOGNIZER_BUSY = 8;
-//
-// /** Insufficient permissions */
-// public static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;
-//
-// /** action codes */
-// private final static int MSG_START = 1;
-// private final static int MSG_STOP = 2;
-// private final static int MSG_CANCEL = 3;
-// private final static int MSG_CHANGE_LISTENER = 4;
-//
-// /** The actual RecognitionService endpoint */
-// private IRecognitionService mService;
-//
-// /** The connection to the actual service */
-// private Connection mConnection;
-//
-// /** Context with which the manager was created */
-// private final Context mContext;
-//
-// /** Component to direct service intent to */
-// private final ComponentName mServiceComponent;
-//
-// /** Handler that will execute the main tasks */
-// private Handler mHandler = new Handler() {
-// @Override
-// public void handleMessage(Message msg) {
-// switch (msg.what) {
-// case MSG_START:
-// handleStartListening((Intent) msg.obj);
-// break;
-// case MSG_STOP:
-// handleStopMessage();
-// break;
-// case MSG_CANCEL:
-// handleCancelMessage();
-// break;
-// case MSG_CHANGE_LISTENER:
-// handleChangeListener((RecognitionListenerPocketSphinx) msg.obj);
-// break;
-// }
-// }
-// };
-//
-// /**
-// * Temporary queue, saving the messages until the connection will be established, afterwards,
-// * only mHandler will receive the messages
-// */
-// private final Queue<Message> mPendingTasks = new LinkedList<Message>();
-//
-// /** The Listener that will receive all the callbacks */
-// private final InternalListener mListener = new InternalListener();
-//
-// /**
-// * The right way to create a {@code SpeechRecognizer} is by using
-// * {@link #createSpeechRecognizer} static factory method
-// */
-// private SpeechRecognizerViaRecordPocketSphinx(final Context context, final ComponentName serviceComponent) {
-// mContext = context;
-// mServiceComponent = serviceComponent;
-// }
-//
-// /**
-// * Basic ServiceConnection that records the mService variable. Additionally, on creation it
-// * invokes the {@link IRecognitionService#startListening(Intent, IRecognitionListener)}.
-// */
-// private class Connection implements ServiceConnection {
-//
-// public void onServiceConnected(final ComponentName name, final IBinder service) {
-// // always done on the application main thread, so no need to send message to mHandler
-// mService = IRecognitionService.Stub.asInterface(service);
-// if (DBG) Log.d(TAG, "onServiceConnected - Success");
-// while (!mPendingTasks.isEmpty()) {
-// mHandler.sendMessage(mPendingTasks.poll());
-// }
-// }
-//
-// public void onServiceDisconnected(final ComponentName name) {
-// // always done on the application main thread, so no need to send message to mHandler
-// mService = null;
-// mConnection = null;
-// mPendingTasks.clear();
-// if (DBG) Log.d(TAG, "onServiceDisconnected - Success");
-// }
-// }
-//
-// /**
-// * Checks whether a speech recognition service is available on the system. If this method
-// * returns {@code false}, {@link SpeechRecognizerViaRecordPocketSphinx#createSpeechRecognizer(Context)} will
-// * fail.
-// *
-// * @param context with which {@code SpeechRecognizer} will be created
-// * @return {@code true} if recognition is available, {@code false} otherwise
-// */
-// public static boolean isRecognitionAvailable(final Context context) {
-// final List<ResolveInfo> list = context.getPackageManager().queryIntentServices(
-// new Intent(RecognitionService.SERVICE_INTERFACE), 0);
-// return list != null && list.size() != 0;
-// }
-//
-// /**
-// * Factory method to create a new {@code SpeechRecognizer}. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called before dispatching any
-// * command to the created {@code SpeechRecognizer}, otherwise no notifications will be
-// * received.
-// *
-// * @param context in which to create {@code SpeechRecognizer}
-// * @return a new {@code SpeechRecognizer}
-// */
-// public static SpeechRecognizerViaRecordPocketSphinx createSpeechRecognizer(final Context context) {
-// return createSpeechRecognizer(context, null);
-// }
-//
-// /**
-// * Factory method to create a new {@code SpeechRecognizer}. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called before dispatching any
-// * command to the created {@code SpeechRecognizer}, otherwise no notifications will be
-// * received.
-// *
-// * Use this version of the method to specify a specific service to direct this
-// * {@link SpeechRecognizerViaRecordPocketSphinx} to. Normally you would not use this; use
-// * {@link #createSpeechRecognizer(Context)} instead to use the system default recognition
-// * service.
-// *
-// * @param context in which to create {@code SpeechRecognizer}
-// * @param serviceComponent the {@link ComponentName} of a specific service to direct this
-// * {@code SpeechRecognizer} to
-// * @return a new {@code SpeechRecognizer}
-// */
-// public static SpeechRecognizerViaRecordPocketSphinx createSpeechRecognizer(final Context context,
-// final ComponentName serviceComponent) {
-// if (context == null) {
-// throw new IllegalArgumentException("Context cannot be null)");
-// }
-// checkIsCalledFromMainThread();
-// return new SpeechRecognizerViaRecordPocketSphinx(context, serviceComponent);
-// }
-//
-// /**
-// * Sets the listener that will receive all the callbacks. The previous unfinished commands will
-// * be executed with the old listener, while any following command will be executed with the new
-// * listener.
-// *
-// * @param listener listener that will receive all the callbacks from the created
-// * {@link SpeechRecognizerViaRecordPocketSphinx}, this must not be null.
-// */
-// public void setRecognitionListener(RecognitionListenerPocketSphinx listener) {
-// checkIsCalledFromMainThread();
-// putMessage(Message.obtain(mHandler, MSG_CHANGE_LISTENER, listener));
-// }
-//
-// /**
-// * Starts listening for speech. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called beforehand, otherwise
-// * no notifications will be received.
-// *
-// * @param recognizerIntent contains parameters for the recognition to be performed. The intent
-// * may also contain optional extras, see {@link RecognizerIntent}. If these values are
-// * not set explicitly, default values will be used by the recognizer.
-// */
-// public void startListening(final Intent recognizerIntent) {
-// if (recognizerIntent == null) {
-// throw new IllegalArgumentException("intent must not be null");
-// }
-// checkIsCalledFromMainThread();
-// if (mConnection == null) { // first time connection
-// mConnection = new Connection();
-//
-// Intent serviceIntent = new Intent(RecognitionService.SERVICE_INTERFACE);
-//
-// if (mServiceComponent == null) {
-// String serviceComponent = Settings.Secure.getString(mContext.getContentResolver(),
-// VOICE_RECOGNITION_SERVICE);
-//
-// if (TextUtils.isEmpty(serviceComponent)) {
-// Log.e(TAG, "no selected voice recognition service");
-// mListener.onError(ERROR_CLIENT);
-// return;
-// }
-//
-// serviceIntent.setComponent(ComponentName.unflattenFromString(serviceComponent));
-// } else {
-// serviceIntent.setComponent(mServiceComponent);
-// }
-//
-// if (!mContext.bindService(serviceIntent, mConnection, Context.BIND_AUTO_CREATE)) {
-// Log.e(TAG, "bind to recognition service failed");
-// mConnection = null;
-// mService = null;
-// mListener.onError(ERROR_CLIENT);
-// return;
-// }
-// }
-// putMessage(Message.obtain(mHandler, MSG_START, recognizerIntent));
-// }
-//
-// /**
-// * Stops listening for speech. Speech captured so far will be recognized as if the user had
-// * stopped speaking at this point. Note that in the default case, this does not need to be
-// * called, as the speech endpointer will automatically stop the recognizer listening when it
-// * determines speech has completed. However, you can manipulate endpointer parameters directly
-// * using the intent extras defined in {@link RecognizerIntent}, in which case you may sometimes
-// * want to manually call this method to stop listening sooner. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called beforehand, otherwise
-// * no notifications will be received.
-// */
-// public void stopListening() {
-// checkIsCalledFromMainThread();
-// putMessage(Message.obtain(mHandler, MSG_STOP));
-// }
-//
-// /**
-// * Cancels the speech recognition. Please note that
-// * {@link #setRecognitionListener(RecognitionListener)} should be called beforehand, otherwise
-// * no notifications will be received.
-// */
-// public void cancel() {
-// checkIsCalledFromMainThread();
-// putMessage(Message.obtain(mHandler, MSG_CANCEL));
-// }
-//
-// private static void checkIsCalledFromMainThread() {
-// if (Looper.myLooper() != Looper.getMainLooper()) {
-// throw new RuntimeException(
-// "SpeechRecognizer should be used only from the application's main thread");
-// }
-// }
-//
-// private void putMessage(Message msg) {
-// if (mService == null) {
-// mPendingTasks.offer(msg);
-// } else {
-// mHandler.sendMessage(msg);
-// }
-// }
-//
-// /** sends the actual message to the service */
-// private void handleStartListening(Intent recognizerIntent) {
-// if (!checkOpenConnection()) {
-// return;
-// }
-// try {
-// mService.startListening(recognizerIntent, mListener);
-// if (DBG) Log.d(TAG, "service start listening command succeded");
-// } catch (final RemoteException e) {
-// Log.e(TAG, "startListening() failed", e);
-// mListener.onError(ERROR_CLIENT);
-// }
-// }
-//
-// /** sends the actual message to the service */
-// private void handleStopMessage() {
-// if (!checkOpenConnection()) {
-// return;
-// }
-// try {
-// mService.stopListening(mListener);
-// if (DBG) Log.d(TAG, "service stop listening command succeded");
-// } catch (final RemoteException e) {
-// Log.e(TAG, "stopListening() failed", e);
-// mListener.onError(ERROR_CLIENT);
-// }
-// }
-//
-// /** sends the actual message to the service */
-// private void handleCancelMessage() {
-// if (!checkOpenConnection()) {
-// return;
-// }
-// try {
-// mService.cancel(mListener);
-// if (DBG) Log.d(TAG, "service cancel command succeded");
-// } catch (final RemoteException e) {
-// Log.e(TAG, "cancel() failed", e);
-// mListener.onError(ERROR_CLIENT);
-// }
-// }
-//
-// private boolean checkOpenConnection() {
-// if (mService != null) {
-// return true;
-// }
-// mListener.onError(ERROR_CLIENT);
-// Log.e(TAG, "not connected to the recognition service");
-// return false;
-// }
-//
-// /** changes the listener */
-// private void handleChangeListener(RecognitionListener listener) {
-// if (DBG) Log.d(TAG, "handleChangeListener, listener=" + listener);
-// mListener.mInternalListener = listener;
-// }
-//
-// /**
-// * Destroys the {@code SpeechRecognizer} object.
-// */
-// public void destroy() {
-// if (mConnection != null) {
-// mContext.unbindService(mConnection);
-// }
-// mPendingTasks.clear();
-// mService = null;
-// mConnection = null;
-// mListener.mInternalListener = null;
-// }
-//
-// /**
-// * Internal wrapper of IRecognitionListener which will propagate the results to
-// * RecognitionListener
-// */
-// private class InternalListener extends IRecognitionListener.Stub {
-// private RecognitionListener mInternalListener;
-//
-// private final static int MSG_BEGINNING_OF_SPEECH = 1;
-// private final static int MSG_BUFFER_RECEIVED = 2;
-// private final static int MSG_END_OF_SPEECH = 3;
-// private final static int MSG_ERROR = 4;
-// private final static int MSG_READY_FOR_SPEECH = 5;
-// private final static int MSG_RESULTS = 6;
-// private final static int MSG_PARTIAL_RESULTS = 7;
-// private final static int MSG_RMS_CHANGED = 8;
-// private final static int MSG_ON_EVENT = 9;
-//
-// private final Handler mInternalHandler = new Handler() {
-// @Override
-// public void handleMessage(Message msg) {
-// if (mInternalListener == null) {
-// return;
-// }
-// switch (msg.what) {
-// case MSG_BEGINNING_OF_SPEECH:
-// mInternalListener.onBeginningOfSpeech();
-// break;
-// case MSG_BUFFER_RECEIVED:
-// mInternalListener.onBufferReceived((byte[]) msg.obj);
-// break;
-// case MSG_END_OF_SPEECH:
-// mInternalListener.onEndOfSpeech();
-// break;
-// case MSG_ERROR:
-// mInternalListener.onError((Integer) msg.obj);
-// break;
-// case MSG_READY_FOR_SPEECH:
-// mInternalListener.onReadyForSpeech((Bundle) msg.obj);
-// break;
-// case MSG_RESULTS:
-// mInternalListener.onResults((Bundle) msg.obj);
-// break;
-// case MSG_PARTIAL_RESULTS:
-// mInternalListener.onPartialResults((Bundle) msg.obj);
-// break;
-// case MSG_RMS_CHANGED:
-// mInternalListener.onRmsChanged((Float) msg.obj);
-// break;
-// case MSG_ON_EVENT:
-// mInternalListener.onEvent(msg.arg1, (Bundle) msg.obj);
-// break;
-// }
-// }
-// };
-//
-// public void onBeginningOfSpeech() {
-// Message.obtain(mInternalHandler, MSG_BEGINNING_OF_SPEECH).sendToTarget();
-// }
-//
-// public void onBufferReceived(final byte[] buffer) {
-// Message.obtain(mInternalHandler, MSG_BUFFER_RECEIVED, buffer).sendToTarget();
-// }
-//
-// public void onEndOfSpeech() {
-// Message.obtain(mInternalHandler, MSG_END_OF_SPEECH).sendToTarget();
-// }
-//
-// public void onError(final int error) {
-// Message.obtain(mInternalHandler, MSG_ERROR, error).sendToTarget();
-// }
-//
-// public void onReadyForSpeech(final Bundle noiseParams) {
-// Message.obtain(mInternalHandler, MSG_READY_FOR_SPEECH, noiseParams).sendToTarget();
-// }
-//
-// public void onResults(final Bundle results) {
-// Message.obtain(mInternalHandler, MSG_RESULTS, results).sendToTarget();
-// }
-//
-// public void onPartialResults(final Bundle results) {
-// Message.obtain(mInternalHandler, MSG_PARTIAL_RESULTS, results).sendToTarget();
-// }
-//
-// public void onRmsChanged(final float rmsdB) {
-// Message.obtain(mInternalHandler, MSG_RMS_CHANGED, rmsdB).sendToTarget();
-// }
-//
-// public void onEvent(final int eventType, final Bundle params) {
-// Message.obtain(mInternalHandler, MSG_ON_EVENT, eventType, eventType, params)
-// .sendToTarget();
-// }
-// }
+
}
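The deleted client above follows the standard Android service-binding pattern: commands issued before the `ServiceConnection` is established are parked in `mPendingTasks` and flushed to the handler once `onServiceConnected` fires. Stripped of Android classes, that queue-until-connected dispatch can be sketched as follows (class and method names here are hypothetical, for illustration only):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of the queue-until-connected dispatch used by the deleted
// SpeechRecognizerViaRecordPocketSphinx client: commands sent before the
// service connection exists are buffered, then flushed in order on connect.
public class PendingDispatch {
    private final Queue<String> pending = new ArrayDeque<>();
    private final StringBuilder delivered = new StringBuilder();
    private boolean connected = false;

    /** Buffer the command if not yet connected, otherwise deliver it now. */
    public void put(String command) {
        if (!connected) {
            pending.offer(command);
        } else {
            deliver(command);
        }
    }

    /** Plays the role of ServiceConnection.onServiceConnected: flush backlog. */
    public void onConnected() {
        connected = true;
        while (!pending.isEmpty()) {
            deliver(pending.poll());
        }
    }

    private void deliver(String command) {
        delivered.append(command).append(';');
    }

    public String deliveredLog() {
        return delivered.toString();
    }

    public static void main(String[] args) {
        PendingDispatch d = new PendingDispatch();
        d.put("START");   // buffered: not connected yet
        d.put("STOP");    // buffered
        d.onConnected();  // flushes START then STOP, in order
        d.put("CANCEL");  // delivered immediately
        System.out.println(d.deliveredLog()); // START;STOP;CANCEL;
    }
}
```

The real class additionally marshals each command through a `Handler` so everything runs on the main thread; the buffering logic itself is the same.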
View
21 src/ca/ilanguage/labs/pocketsphinx/ui/PocketSphinxAndroidDemo.java
@@ -56,6 +56,27 @@
* this is called the First Pass. Then after the users releases the Hold and Speak button the app
* goes through and tries to improve the quality of the recognized text now that it has all the context.
*
+ * TODO currently this app only demos the PocketSphinx speech recognizer; it doesn't make it available for other
+ * developers to call, or for the user to use generally.
+ *
+ * TODO implement service.SpeechRecognizerViaFilePocketSphinx so that developers can pass a file to the speech recognizer and get back
+ * an array of arrays of hypotheses for utterances in the audio
+ *
+ * TODO implement service.SpeechRecognizerViaRecorderSphinx so that users can do speech recognition offline, without a network connection
+ * (the default speech recognizer provided by com.google.android.voicesearch has to be online and only accepts short utterances; it cannot be used eyes-free).
+ *
+ *
+ * TODO once the two speech recognizers are implemented, edit the ui.TestPocketSphinxAndAndroidASR so that if PocketSphinx is enabled in the settings,
+ * it will run the service.SpeechRecognizerViaRecorderSphinx
+ *
+ *
+ *
+ * History of this Demo:
+ * Created by David Huggins-Daines <dhuggins@cs.cmu.edu> sourceforge:dhdfu and other contributors at the cmusphinx project
+ * Turned into a very user-friendly demo app and apk with very few dependencies by Aasish Pappu (sourceforge: aasishp, github: aasish)
+ * Infrastructure laid out for eyes-free offline speech recognition by github: cesine
+ * Eyes-free offline speech recognition implemented by: maybe someone who knows PocketSphinx while I learn how to use it.. ;)
+ *
* @author aasish
*
*/
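The first TODO above describes a file-input recognizer returning an array of arrays of hypotheses: one inner list per utterance found in the audio, best hypothesis first. As a rough sketch of the shape such an API could take — the class name comes from the TODO, but the method, its signature, and the stubbed results are all hypothetical, not part of PocketSphinx:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the contract the TODO describes for
// service.SpeechRecognizerViaFilePocketSphinx: an audio file in, and per
// utterance a list of N-best hypotheses out (best first). The decoder is
// stubbed; a real version would segment the file and run PocketSphinx on
// each utterance.
public class FileRecognizerSketch {

    public static List<List<String>> recognizeFile(File audio) {
        List<List<String>> utterances = new ArrayList<>();
        // Stub: these canned hypotheses only illustrate the return shape.
        utterances.add(Arrays.asList("hello world", "hello word"));
        utterances.add(Arrays.asList("good bye", "good buy"));
        return utterances;
    }

    public static void main(String[] args) {
        List<List<String>> result = recognizeFile(new File("utterances.wav"));
        System.out.println("utterances: " + result.size());
        // Best hypothesis of the first utterance:
        System.out.println(result.get(0).get(0)); // hello world
    }
}
```

Returning hypotheses per utterance (rather than one flat list) is what lets callers keep long recordings aligned with their transcripts, which the eyes-free use case depends on.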
View
4 src/ca/ilanguage/labs/pocketsphinx/ui/TestPocketSphinxAndAndroidASR.java
@@ -32,9 +32,9 @@ public void onCreate(Bundle savedInstanceState) {
PreferenceConstants.PREFERENCE_NAME, MODE_PRIVATE);
mUsePocektSphinxASR = prefs.getBoolean(PreferenceConstants.PREFERENCE_USE_POCKETSPHINX_ASR, false);
if (mUsePocektSphinxASR){
- Toast.makeText(TestPocketSphinxAndAndroidASR.this, "Working offline, using PocketSphinx Speech recognizer. ", Toast.LENGTH_LONG).show();
+ Toast.makeText(TestPocketSphinxAndAndroidASR.this, "Would be working offline, using the PocketSphinx speech recognizer. However, it is not yet implemented and registered with the system as a speech recognizer.", Toast.LENGTH_LONG).show();
}else{
- Toast.makeText(TestPocketSphinxAndAndroidASR.this, "Working online, uUsing system speech recognizer (Google speech recognition server). ", Toast.LENGTH_LONG).show();
+ Toast.makeText(TestPocketSphinxAndAndroidASR.this, "Working online, using the system speech recognizer (Google speech recognition server).", Toast.LENGTH_LONG).show();
}
PackageManager pm = getPackageManager();
List<ResolveInfo> activities = pm.queryIntentActivities(

0 comments on commit 58ebe78
