Google Cloud Speech Node with Socket Playground
An easy-to-set-up playground for cross device real-time Google Speech Recognition with a Node server and socket.io.
- get a free test key from Google
- place it into the src folder and update the path in the
- open the terminal and go to the
node app.js or with nodemon:
- go to
Run on Server
Same as running locally.
- set the .env Port to a port that you've opened on the server. I'm using 1337 here, too.
- go to your server address
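As a minimal sketch of the port setting above, the server could read the port from the environment and fall back to 1337. The variable name `PORT` and the helper `resolvePort` are assumptions for illustration; the key actually used in this project's .env file may differ.

```javascript
// Hypothetical helper: pick the listening port from an env-like object.
// `PORT` is an assumed variable name; 1337 is the default mentioned above.
function resolvePort(env, fallback = 1337) {
  const parsed = Number.parseInt(env.PORT, 10);
  // Accept only valid TCP port numbers; otherwise use the fallback.
  const isValid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return isValid ? parsed : fallback;
}

const port = resolvePort(process.env);
console.log(`Server would listen on port ${port}`);
```

Validating the value before use avoids starting the server on a port that a typo in the configuration has made unusable.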
I recommend using pm2 or something similar to keep the process running after you close the terminal connection.
Made by Vinzenz Aubry
How Does the Client Process the Stream?
Google Cloud sends intermittent responses to the uploaded audio stream. Each response contains the current estimate of the full sentence for the streamed audio.
When Google Cloud senses that the audio has reached the end of a sentence, it issues a response with the isFinal flag set to true. Once this flag is issued, the client finalizes the sentence and writes it to the document.
This process is repeated until the user ends the recording.
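The loop described above can be sketched as a small state holder. This is an illustration, not the project's actual client code: `createTranscript` and `handleResponse` are hypothetical names, and the response shape (`results[0].alternatives[0].transcript` plus `results[0].isFinal`) is assumed to match what the server forwards from Google Cloud.

```javascript
// Hypothetical sketch: keep finalized sentences and the current interim guess.
function createTranscript() {
  const finalized = []; // sentences already confirmed with isFinal
  let interim = '';     // latest estimate of the sentence in progress

  return {
    // Assumed response shape: { results: [{ alternatives: [{ transcript }], isFinal }] }
    handleResponse(response) {
      const result = response && response.results && response.results[0];
      if (!result || !result.alternatives || result.alternatives.length === 0) {
        return; // nothing usable in this response
      }
      const text = result.alternatives[0].transcript;
      if (result.isFinal) {
        finalized.push(text); // sentence is done: store it permanently
        interim = '';
      } else {
        interim = text;       // replace the previous estimate
      }
    },
    // Full text as it would be shown: finalized sentences plus the interim one.
    text() {
      return [...finalized, interim].filter(Boolean).join(' ');
    },
  };
}

const demo = createTranscript();
demo.handleResponse({ results: [{ alternatives: [{ transcript: 'hello wor' }], isFinal: false }] });
demo.handleResponse({ results: [{ alternatives: [{ transcript: 'hello world' }], isFinal: true }] });
console.log(demo.text());
```

Each interim response overwrites the previous estimate rather than appending to it, because every response carries the full sentence so far.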
Interim Natural Language Processing
The client application highlights different parts of speech, such as nouns and verbs, by using this natural language processing library.
The client communicates with the server using Socket.io.
- If you have delays in calls, check if IPv6 is disabled on your server