Technical Documentation
MeMa's Facial Recognition uses the cv2 and face_recognition libraries. The facial-recognition code was built upon this example.
MeMa's Speech Recognition uses the SpeechRecognition library. The module at codebank/mema_speech_recognition wraps that library (imported as speech_recognition and aliased as sr) to capture audio from a microphone and convert it to text. It also uses other Python modules for handling C data types, creating context managers, and managing threads. Most of that extra machinery exists only to stop ALSA errors from spamming the terminal, so don't worry too much about it.
Here is a description of the functions supplied by the Speech Recognition class. They should not be called directly; use the callback function on the mema page instance instead. The descriptions are here only in case someone needs to modify the internal workings of the speech recognition process.
⚠️ Please do not call any of these functions directly
recognize_speech_internal() -> str|None: This function performs speech recognition using the Google Speech Recognition API. It listens on the microphone for speech and returns the recognized text as a string. If no speech is recognized or there is an error during the recognition process, it returns None.
recognize_speech_thread(input_queue: Queue, stop: bool) -> None: This function runs the recognize_speech_internal() function inside a context manager (noalsaerr) to suppress ALSA warnings. It continuously recognizes speech and adds the recognized phrases to the provided input_queue. It stops when the stop flag becomes True.
listen(input_queue: Queue, stop: bool) -> None: This function starts a thread that recognizes speech with the speech_recognition library. Recognized phrases are added to the input queue that was passed in.
noalsaerr() -> None: This context manager switches the libasound.so error handler to the py_error_handler function (which is empty), essentially blocking the ALSA warnings that would otherwise be spammed in the terminal.
py_error_handler(filename, line, function, err, fmt) -> None: This empty error handler is used by the noalsaerr() context manager to block ALSA warnings.
ERROR_HANDLER_FUNC: This variable represents the C type for the error handler function used to handle ALSA errors.
ERROR_HANDLER_FUNC(py_error_handler): This call wraps the py_error_handler function in the C function type, producing the callback that is installed as the handler for ALSA errors.
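The ALSA-muting pieces described above can be sketched roughly as follows. This is a minimal illustration of the technique, not a copy of MeMa's module: the C signature given to ERROR_HANDLER_FUNC, the use of ctypes.util.find_library to locate libasound, and the silent fallback when the library is missing are all assumptions made for the example.

```python
import ctypes
import ctypes.util
from contextlib import contextmanager

# Assumed C signature of an ALSA error handler (fixed arguments only):
# void handler(const char *file, int line, const char *function,
#              int err, const char *fmt, ...)
ERROR_HANDLER_FUNC = ctypes.CFUNCTYPE(
    None,
    ctypes.c_char_p,
    ctypes.c_int,
    ctypes.c_char_p,
    ctypes.c_int,
    ctypes.c_char_p,
)


def py_error_handler(filename, line, function, err, fmt):
    """Deliberately empty: swallow every ALSA error message."""


# Keep a module-level reference so the C callback is not garbage collected
# while ALSA still holds a pointer to it.
c_error_handler = ERROR_HANDLER_FUNC(py_error_handler)


@contextmanager
def noalsaerr():
    """Silence ALSA errors inside the `with` block, then restore the default."""
    asound = None
    library_name = ctypes.util.find_library("asound")
    if library_name is not None:
        try:
            asound = ctypes.CDLL(library_name)
        except OSError:
            asound = None  # Assumption: without libasound there is nothing to mute.
    if asound is not None:
        asound.snd_lib_error_set_handler(c_error_handler)
    try:
        yield
    finally:
        if asound is not None:
            # Passing None (a NULL pointer) restores ALSA's default handler.
            asound.snd_lib_error_set_handler(None)
```

Any code that opens the microphone can then run inside `with noalsaerr():` so that ALSA's messages stay out of the terminal.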
recognize_speech_internal() -> str|None: Uses the SpeechRecognition library to listen on the microphone and convert speech to text. Returns a string, or None if no speech is recognized.
recognize_speech_thread(input_queue: Queue, stop: bool) -> None: Runs the speech recognition function inside a noalsaerr context manager, which blocks ALSA warnings. When the function returns a phrase, it is added to the input_queue.
listen(input_queue: Queue, stop: bool) -> None: Starts a thread that recognizes speech with the speech_recognition library. Recognized phrases are added to the input queue that was passed in.
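The thread-and-queue pattern behind listen() and recognize_speech_thread() might look something like the sketch below. It is not MeMa's code. The `_sketch` function names, the extra `recognize` parameter, and make_fake_recognizer are hypothetical, and a threading.Event stands in for the documented stop: bool flag because a plain bool passed as an argument cannot be changed from outside the thread. The fake recognizer replaces the microphone so the example runs without audio hardware.

```python
import queue
import threading
import time
from typing import Callable, List, Optional


def recognize_speech_thread_sketch(
    input_queue: "queue.Queue[str]",
    stop: threading.Event,
    recognize: Callable[[], Optional[str]],
) -> None:
    """Call `recognize` repeatedly and queue each phrase until `stop` is set."""
    while not stop.is_set():
        phrase = recognize()
        if phrase is not None:
            input_queue.put(phrase)


def listen_sketch(
    input_queue: "queue.Queue[str]",
    stop: threading.Event,
    recognize: Callable[[], Optional[str]],
) -> threading.Thread:
    """Start the recognition loop on a daemon thread and return that thread."""
    worker = threading.Thread(
        target=recognize_speech_thread_sketch,
        args=(input_queue, stop, recognize),
        daemon=True,
    )
    worker.start()
    return worker


def make_fake_recognizer(phrases: List[str]) -> Callable[[], Optional[str]]:
    """Hypothetical stand-in for the microphone: yields phrases, then None."""
    pending = list(phrases)

    def recognize() -> Optional[str]:
        time.sleep(0.01)  # Pretend that listening takes a moment.
        return pending.pop(0) if pending else None

    return recognize


if __name__ == "__main__":
    phrases: "queue.Queue[str]" = queue.Queue()
    stop_flag = threading.Event()
    thread = listen_sketch(phrases, stop_flag, make_fake_recognizer(["hello", "mema"]))
    print(phrases.get(timeout=2))
    print(phrases.get(timeout=2))
    stop_flag.set()
    thread.join(timeout=2)
```

The consumer only ever touches the queue and the stop flag, which is why the page-level callback can stay unaware of how recognition is performed.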
MeMa's Text to Speech uses the gTTS (Google Text To Speech) library.