Resonite Voice Bridge

Logo by DALL-E, edited by Zetaphor. Application written in collaboration with GPT-4

This application enables the use of Speech-To-Text in Resonite, by bridging Google Chrome's STT API with a Websocket server.

This enables the creation of tools like real-time captioning or voice-controlled objects.

Additionally, it includes a simple visual command editor (Google Blockly) for easily turning natural language into commands with parameters. This makes it easier to develop more complex use cases, such as a voice assistant.
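As a rough illustration of what "turning natural language into commands with parameters" can mean, the sketch below matches a transcript against a template and pulls out the parameters. It is not how the Blockly editor is implemented; the `{name}` template syntax and the `compile_template` and `match_command` helpers are hypothetical names made up for this example.

```python
import re


def compile_template(template: str) -> re.Pattern:
    """Turn a template such as 'set light to {color}' into a regex with named groups."""
    # re.split with one capture group alternates: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)       # literal text must match exactly
        else:
            pattern += f"(?P<{part}>.+?)"    # a parameter captures one or more characters
    return re.compile(f"^{pattern}$", re.IGNORECASE)


def match_command(template: str, transcript: str):
    """Return the extracted parameters as a dict, or None if the transcript does not match."""
    match = compile_template(template).match(transcript.strip())
    return match.groupdict() if match else None
```

For example, `match_command("set light to {color}", "Set light to blue")` yields `{"color": "blue"}`, while a transcript that does not fit the template yields `None`.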

Features

Real-time speech-to-text transcription (Google Chrome STT API)
Websocket server for Resonite. Includes full control and status of every feature using websocket commands and events
Visual command editor to create complex voice commands, reducing the need for Protoflux string parsing
Word replacement, punctuation removal, and more.
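To give an idea of what word replacement and punctuation removal do to a transcript, here is a minimal sketch. It is an assumption about the general technique, not the application's own code; `clean_transcript` and its parameters are hypothetical.

```python
import string


def clean_transcript(text: str, replacements: dict[str, str],
                     strip_punctuation: bool = True) -> str:
    """Optionally strip ASCII punctuation, then swap whole words using a lookup table."""
    if strip_punctuation:
        # Removes every character in string.punctuation, including apostrophes.
        text = text.translate(str.maketrans("", "", string.punctuation))
    # Lookup is case-insensitive; words without a replacement are kept unchanged.
    words = [replacements.get(word.lower(), word) for word in text.split()]
    return " ".join(words)
```

Word replacement like this is handy for names a speech recognizer tends to mishear, for example mapping "resident" to "Resonite".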

Download

The latest version can be found on the releases page.

Running the server

This application requires Google Chrome, as it uses the Web Speech API. Please note that the speech recognition API in use is provided by Google.

I plan to add Whisper STT running in the browser via WebGPU for completely local offline inference, but this is waiting on the release of Transformers.js version 3.

Launch the server executable, and make sure to allow the application through the Windows Firewall. Then open http://localhost:5000/ in Google Chrome. Grant the microphone permission and test the interface by speaking. You should see the page reporting that the microphone is listening and the websocket is connected, with your spoken text appearing as you talk.

In Resonite, use the Websocket Connect node to create a websocket connection to ws://localhost:6789. Any speech the page detects will be sent to this connection.

Use the Websocket Message Received node to receive real-time updates from the speech recognition.
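To check the connection without launching Resonite, a small script can stand in for the Websocket Connect and Websocket Message Received nodes. This is a sketch that assumes the third-party `websockets` package is installed (`pip install websockets`); the message format is not specified here, so it simply prints whatever arrives.

```python
import asyncio

BRIDGE_URL = "ws://localhost:6789"  # websocket address from the instructions above


async def listen(url: str = BRIDGE_URL) -> None:
    """Connect to the bridge and print every message it sends."""
    # Imported here so the file still loads if the package is missing.
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as connection:
        async for message in connection:
            print("received:", message)


# With the server running and the Chrome page open, start listening with:
#     asyncio.run(listen())
```

If nothing is printed while you speak, the issue lies between Chrome and the server rather than in Resonite.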

How it works

Internally the script is hosting both a webserver for the interface and a websocket server for Resonite to connect to.

The page you load uses JavaScript to call Google's SpeechRecognition API via Chrome, and then sends the recognized text to the websocket server.

The websocket server is configured to echo any message it receives back to all other connected clients.
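The relay behaviour described above can be sketched without any networking. In this illustration each connected client is represented by an `asyncio.Queue`; the `Hub` class and its method names are hypothetical and are not taken from server.py.

```python
import asyncio


class Hub:
    """Tracks connected clients and relays each message to everyone except its sender."""

    def __init__(self) -> None:
        self.clients: set[asyncio.Queue] = set()

    def register(self) -> asyncio.Queue:
        inbox: asyncio.Queue = asyncio.Queue()  # stand-in for one websocket connection
        self.clients.add(inbox)
        return inbox

    def unregister(self, inbox: asyncio.Queue) -> None:
        self.clients.discard(inbox)

    async def relay(self, sender: asyncio.Queue, message: str) -> int:
        """Deliver the message to all clients other than the sender; return the count."""
        delivered = 0
        for client in list(self.clients):  # copy so the set may change while relaying
            if client is not sender:
                await client.put(message)
                delivered += 1
        return delivered


async def demo():
    hub = Hub()
    browser = hub.register()   # the Chrome page sending transcripts
    resonite = hub.register()  # the Resonite websocket connection
    await hub.relay(browser, "hello world")
    # Resonite receives the text; the browser does not get its own message back.
    return resonite.get_nowait(), browser.empty()
```

Because the sender is skipped, the page that produced a transcript is not sent its own message, while every other client, such as Resonite, receives it.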

Troubleshooting

Make sure you've granted the application internet access via the Windows Firewall. Additionally, make sure you've granted the microphone permission in Chrome.

If you're not getting speech transcription in the webpage, make sure Chrome is listening to the correct input device by clicking the microphone icon in the address bar.

Additionally try testing the Chrome speech API on a different site to verify it's working: https://mdn.github.io/dom-examples/web-speech-api/speech-color-changer/

Building the executable

Install the pyinstaller package and then run it against server.py

pip install pyinstaller && pyinstaller server.py

Then copy the static and templates folders into the _internal folder in the dist output
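The copy step can be scripted. The sketch below is one possible way to do it; `copy_assets` is a hypothetical helper, and the bundle path `dist/server` in the usage comment is an assumption based on PyInstaller's default one-folder output being named after the script.

```python
import shutil
from pathlib import Path


def copy_assets(project_dir, bundle_dir):
    """Copy the static and templates folders into <bundle_dir>/_internal."""
    project_dir, bundle_dir = Path(project_dir), Path(bundle_dir)
    internal = bundle_dir / "_internal"
    copied = []
    for name in ("static", "templates"):
        target = internal / name
        # dirs_exist_ok lets the script be re-run after a rebuild (Python 3.8+).
        shutil.copytree(project_dir / name, target, dirs_exist_ok=True)
        copied.append(target)
    return copied


# Example, run from the repository root after building (bundle path is an assumption):
#     copy_assets(".", "dist/server")
```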

Disclaimer

This project is in no way affiliated with Resonite or any member of its staff.

TODO:

Implement custom timeouts instead of relying on the end event

