mdundek/yava

A customizable and open source private voice assistant

YAVA (Yet Another Voice Assistant)

YAVA is a full-fledged, open source, extensible voice assistant solution with a focus on flexibility. It is built from popular components that make up the various parts of a complete voice assistant, and it is designed to run on a Raspberry Pi 2/3. I have not tested it on a Raspberry Pi 4 or Zero, so if you do test it on those platforms, please let me know how it goes. Other platforms will be supported soon.

To read more about why yet another voice assistant, check out the section "Why another voice assistant?".

Why another voice assistant?

There are plenty of voice assistants out there, even a couple of open source ones. The work they have done is great, and some of them were a source of inspiration for this project.

Nevertheless, sometimes you need more out of your assistant than the usual hotword => command capture => intent matching => action kind of workflow.
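That usual workflow can be pictured as a fixed chain of stages. The sketch below is purely illustrative — every function name in it is a hypothetical stand-in, not part of YAVA's actual API:

```javascript
// Illustrative sketch of the classic assistant pipeline:
// hotword => command capture => intent matching => action.
// All names here are hypothetical stand-ins, not YAVA's API.

async function detectHotword(audio) {
  // Stub: real engines score audio frames against a wake-word model.
  return audio.includes("hey assistant");
}

async function captureCommand(audio) {
  // Stub speech-to-text: strip the hotword and return the "transcript".
  return audio.replace("hey assistant", "").trim();
}

async function matchIntent(text) {
  // Stub NLU: map a transcript to an intent and its entities.
  if (text.includes("light")) return { intent: "lights.on", entities: {} };
  return { intent: "unknown", entities: {} };
}

async function runAction(match) {
  return match.intent === "lights.on" ? "lights turned on" : "no action";
}

// The rigid workflow: every stage always runs, in this order.
async function assistantLoop(audio) {
  if (!(await detectHotword(audio))) return "ignored";
  const text = await captureCommand(audio);
  const match = await matchIntent(text);
  return runAction(match);
}

assistantLoop("hey assistant turn on the light").then(console.log); // "lights turned on"
```

The point of the sketch is the rigidity: there is no way to enter the chain at `captureCommand`, or to stop before `matchIntent` — which is exactly the limitation the paragraphs below are about.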

Maybe you would like to skip the hotword detection part and use the assistant only through specific components of the overall solution. Or let's say you are building a robot that needs to initiate interaction with you (for example on environmental sensor triggers, or simply because of a custom event you defined), rather than waiting for you to call a hotword before it listens to your command. Or let's say you want to capture the transcribed text from your speech without it going through the NLU component for intent and entity recognition. Or even better, you want to use two speech to text engines in your solution: one running offline on the device for privacy, and one in the cloud for accurate transcription of specific commands within your application flow.

You see, there are a lot of situations where you need flexibility from the solution in order to achieve certain goals, and that's what this project is focusing on.

Some of the key features

  • Plug & play composable architecture, flexible and extensible
  • NodeJs, Python & Java client libraries (Python & Java will be available soon)
  • Possibility to transcribe user commands to text without NLU matching
  • Possibility to hijack and control the assistant from your application, rather than having it triggered by a hotword
  • Possibility to use 2 separate speech to text engines for optimised use cases

What's next

This project is fairly recent and should be considered a work in progress. If you come across bugs or unwanted behaviour, please let me know. At the moment the focus is on stabilizing the platform and finishing some components, such as client libraries for other programming languages.

Work in progress:

  • Add support for languages other than English
  • Automated mandatory entity slot filling
  • Python & Java client libraries
  • Add a client library function to query the NLU engine with provided text programmatically, rather than capturing from the microphone
  • Add Espeak & Google TTS images as alternatives for the TTS engines
  • Write documentation on how to extend and build new images for the YAVA platform
