Add basic speech-to-text functionality #33

Merged
merged 21 commits into from
Apr 22, 2020

Owner

@hauptdigital hauptdigital commented Apr 21, 2020

This pull request adds basic speech-to-text functionality to the website. Users can now start and stop recording notes and speak English text, and the app outputs the transcribed text.

Test it here:
https://dev.deepspeech-notes.haupt.digital/

You have to speak very clearly, and a good microphone also improves the accuracy of the transcription.

To create this feature, I developed this process using different technologies:

To document the functionality, I added this table:

|  | File | Order | Task |
| --- | --- | --- | --- |
| Back end | server.js, model.js | 1 | Create DeepSpeech language model on Express server start |
| Back end | server.js, socket.js | 2 | Start socket on Express server start |
| Front end | Notes.js, audio.js | 3 | Create media stream source from browser microphone and start socket on user interaction with microphone |
| Front end | audio.js, voice-processor.js | 4 | Create mono audio buffer from microphone media stream |
| Front end | audio.js, downsampler.js | 5 | Resample audio buffer to a 16,000 Hz sample rate |
| Front end | audio.js | 6 | Send buffer to back end via socket |
| Back end | audio.js | 7 | Identify voice bits in received audio with voice activity detection module |
| Back end | audio.js, model.js | 8 | Transcribe identified voice bits into text |
| Back end | audio.js, socket.js | 9 | Emit transcribed text to front end via socket |
| Front end | Notes.js, audio.js | 10 | Display transcribed text on page |
| Front end | Notes.js, audio.js | 11 | Close media stream, audio processing and socket on user interaction (deactivate microphone) |
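
To make step 5 of the table more concrete, here is a minimal sketch of how a mono buffer could be resampled to 16,000 Hz and converted to 16-bit PCM before it is sent over the socket. This is not the code from `downsampler.js`: the function names `downsampleBuffer` and `floatTo16BitPCM` are hypothetical, the averaging approach is just one simple option, and the 16-bit conversion assumes the back end expects 16-bit PCM samples.

```javascript
// Sketch only: average-based downsampling of mono Float32 samples.
// Function names are hypothetical and not taken from the pull request.
function downsampleBuffer(samples, inputRate, outputRate = 16000) {
  if (outputRate > inputRate) {
    throw new RangeError('outputRate must not exceed inputRate');
  }
  if (outputRate === inputRate) {
    return Float32Array.from(samples);
  }
  const ratio = inputRate / outputRate;
  const outputLength = Math.floor(samples.length / ratio);
  const output = new Float32Array(outputLength);
  for (let i = 0; i < outputLength; i++) {
    // Average all input samples that fall into this output sample's window.
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), samples.length);
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += samples[j];
    }
    output[i] = sum / (end - start);
  }
  return output;
}

// Assumption: the receiver wants signed 16-bit PCM instead of floats in [-1, 1].
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
  }
  return pcm;
}
```

Averaging each window acts as a crude low-pass filter, which is usually good enough for speech. A production resampler would likely use a proper filter.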

@hauptdigital hauptdigital self-assigned this Apr 21, 2020
@hauptdigital hauptdigital added the backend, frontend and react labels Apr 21, 2020
@hauptdigital hauptdigital added this to In progress in Release 1 (MVP) via automation Apr 21, 2020
@hauptdigital hauptdigital added this to the Sprint 3 milestone Apr 21, 2020
@hauptdigital hauptdigital changed the title Recordaudio Add basic speech-to-text functionality Apr 21, 2020
@hauptdigital hauptdigital marked this pull request as ready for review April 22, 2020 16:28
@lmachens lmachens left a comment

Awesome 🎉
Recording works for me too.

I reviewed most of the files and have to say that I am very impressed 👍. Keep up the good work.
Some of my comments are code style-related. It's up to you :).

client/src/components/RecordButton.js (outdated, resolved)
client/src/pages/Notes.js (resolved)
Comment on lines 11 to 19
    if (!isRecording) {
      setIsRecording(startRecording());
      const socket = getSocket();
      socket.on('recognize', (results) => {
        updateNoteContent(results.text);
      });
    } else {
      setIsRecording(stopRecording().isRecording);
    }

Contributor

Looks like you have a memory leak here.
This event listener is never destroyed.
Every time you start a new recording, it will create a new listener on the recognize event. This listener (or callback) remains in memory and might be executed.

It's important to remove that listener when you stop recording:
https://socket.io/docs/server-api/#socket-removeListener-eventName-listener
This issue happens in multiple files.

    function handleRecognize(recognized) {
      updateNoteContent(recognized.text);
    }

    function handleRecordButtonClick() {
      const socket = getSocket();
      if (!isRecording) {
        setIsRecording(startRecording());  
        socket.on('recognize', handleRecognize);
      } else {
        socket.removeListener('recognize', handleRecognize);
        setIsRecording(stopRecording().isRecording);
      }
    }
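
The effect of the leak can be reproduced outside the app. The sketch below uses a hypothetical `FakeSocket` class as a stand-in for the real socket (it only mimics `on`, `removeListener` and `emit`) and counts how often the handler runs for a single `recognize` event after three recordings, with and without removing the listener.

```javascript
// Hypothetical stand-in for the socket, for illustration only.
class FakeSocket {
  constructor() {
    this.listeners = new Map();
  }
  on(eventName, listener) {
    const list = this.listeners.get(eventName) || [];
    this.listeners.set(eventName, [...list, listener]);
    return this;
  }
  removeListener(eventName, listener) {
    const list = this.listeners.get(eventName) || [];
    // Listeners are matched by function reference.
    this.listeners.set(eventName, list.filter((l) => l !== listener));
    return this;
  }
  emit(eventName, payload) {
    (this.listeners.get(eventName) || []).forEach((listener) => listener(payload));
  }
}

// Leaky variant: every "start recording" registers another anonymous listener.
function countCallsWithLeak(recordings) {
  const socket = new FakeSocket();
  let calls = 0;
  for (let i = 0; i < recordings; i++) {
    socket.on('recognize', () => { calls += 1; });
    // "stop recording" would happen here, but nothing removes the listener.
  }
  socket.emit('recognize', { text: 'hello' });
  return calls;
}

// Fixed variant: the same handler reference is removed when recording stops.
function countCallsWithCleanup(recordings) {
  const socket = new FakeSocket();
  let calls = 0;
  const handleRecognize = () => { calls += 1; };
  for (let i = 0; i < recordings; i++) {
    socket.on('recognize', handleRecognize); // start recording
    socket.removeListener('recognize', handleRecognize); // stop recording
  }
  socket.on('recognize', handleRecognize); // one recording currently active
  socket.emit('recognize', { text: 'hello' });
  return calls;
}

console.log(countCallsWithLeak(3)); // 3
console.log(countCallsWithCleanup(3)); // 1
```

Removal only works when the exact same function reference is passed to both calls. If `handleRecognize` is declared inside a function component, each render creates a new function, so the reference passed to `removeListener` may no longer match the one that was registered.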

This would be a good case for a useEffect if you want to have it more React-like (I recommend this solution):

    React.useEffect(() => {
      if (!isRecording) {
        return;
      }

      function handleRecognize(recognized) {
        updateNoteContent(recognized.text);
      }

      const socket = getSocket();
      socket.on('recognize', handleRecognize);

      return () => {
        socket.removeListener('recognize', handleRecognize);
      };
    }, [isRecording]);

    function handleRecordButtonClick() {
      if (!isRecording) {
        setIsRecording(startRecording());
      } else {
        setIsRecording(stopRecording().isRecording);
      }
    }

Owner Author

I applied your solution with useEffect and it works. Thanks! But I don't know why it works 😅

What is the meaning of this:

    return () => {
      socket.removeListener('recognize', handleRecognize);
    }

To me it looks like this would directly remove the socket.on event listener. But when I run my application, I see that everything works as intended.

Contributor

Take a look at the useEffect documentation to understand what the return function is used for :)
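
In short: the function returned from an effect is not called immediately. React stores it and calls it later, before the effect runs again because a dependency changed, or when the component unmounts. The sketch below imitates that timing in plain JavaScript. `createEffectRunner` is a hypothetical helper written for this illustration and is not how React is implemented internally.

```javascript
// Tiny simulation of the effect lifecycle (not React's actual implementation).
function createEffectRunner(effect) {
  let cleanup;
  return {
    // A dependency changed: clean up the previous run, then run the effect again.
    run(value) {
      if (typeof cleanup === 'function') cleanup();
      cleanup = effect(value);
    },
    // The component unmounts: only the stored cleanup runs.
    unmount() {
      if (typeof cleanup === 'function') cleanup();
      cleanup = undefined;
    },
  };
}

const log = [];
const runner = createEffectRunner((isRecording) => {
  if (!isRecording) {
    return undefined; // nothing was subscribed, so there is nothing to clean up
  }
  log.push('subscribe');
  return () => log.push('unsubscribe'); // stored now, called later
});

runner.run(false); // initial render, not recording
runner.run(true); // recording starts: the listener is added
runner.run(false); // recording stops: the stored cleanup removes the listener
runner.unmount();

console.log(log); // [ 'subscribe', 'unsubscribe' ]
```

This is why the listener stays active while `isRecording` is `true` and is only removed once `isRecording` changes back to `false`.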

client/src/utils/audio.js (outdated, resolved)
client/src/utils/audio.js (resolved)
src/audio.js (outdated, resolved)
src/audio.js (resolved)
src/audio.js (outdated, resolved)
src/audio.js (outdated, resolved)
src/audio.js (outdated, resolved)
@hauptdigital hauptdigital merged commit 44503f9 into master Apr 22, 2020
Release 1 (MVP) automation moved this from In progress to Done Apr 22, 2020
@hauptdigital hauptdigital deleted the recordaudio branch April 22, 2020 20:27