CoVoX

Cloud-enabled library providing a customizable voice interface for your application or device

Covox lets you interact with an application or device through voice.
You provide a list of commands, i.e. operations that can be invoked via the voice interface. Covox then listens to the audio and, when the spoken words match a command, that command is executed. It also has multi-language support!

With some imagination you could speak to a calculator, a virtual assistant, or a CRM application!

How it works

Define commands and provide them to a CovoxEngine instance.
Start the audio capture by calling the method covox.StartAsync.
Covox translates and recognizes the input, then emits the Recognized event.
Execute the logic connected to the detected command.
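
The matching step in the flow above can be pictured with a small, self-contained sketch. This is not how Covox implements matching internally (the source does not describe that); it only illustrates the idea of comparing recognized text against a command's voice triggers. The names `VoiceCommand`, `TriggerMatcher`, `Normalize`, and `Match` are hypothetical stand-ins, and exact matching after normalization is an assumption made for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a command: an ID plus the phrases that trigger it.
public sealed record VoiceCommand(string Id, IReadOnlyList<string> VoiceTriggers);

public static class TriggerMatcher
{
    // Lower-cases the text, replaces punctuation with spaces, and collapses
    // whitespace, so "Light on!" and "light  on" normalize to the same string.
    public static string Normalize(string text)
    {
        var chars = text
            .ToLowerInvariant()
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray();
        var words = new string(chars)
            .Split(' ', StringSplitOptions.RemoveEmptyEntries);
        return string.Join(' ', words);
    }

    // Returns the first command that has a trigger equal to the recognized
    // text after normalization, or null when no command matches.
    public static VoiceCommand? Match(
        string recognizedText, IEnumerable<VoiceCommand> commands)
    {
        var spoken = Normalize(recognizedText);
        return commands.FirstOrDefault(
            c => c.VoiceTriggers.Any(t => Normalize(t) == spoken));
    }
}
```

With a command whose triggers include "light on", calling `TriggerMatcher.Match("Light on!", commands)` returns that command, while an unrelated phrase returns null. A real engine would likely be more tolerant (partial or fuzzy matches); exact comparison keeps the sketch short.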

Getting started

Covox is offered as a .NET library and relies on Azure Cognitive Services, so to use it you will need:

an Azure Cognitive Services subscription key (follow this guideline)
a .NET project or application
a device connected to the internet
a device with a working microphone

In order to get started, take a look at the samples.

How to use

Consider a simple use case: a voice-controlled light-switching application.

Define the available commands, each with a unique ID and one or more voice triggers (in English):

var turnOnLightCmd = new Command
{
    Id = "TurnOnLight",
    VoiceTriggers = new[] { "turn on the light", "light on", "on" }
};

var turnOffLightCmd = new Command
{
    Id = "TurnOffLight",
    VoiceTriggers = new[] { "turn off the light", "light off", "off" }
};

Create an instance of CovoxEngine:

var covox = new CovoxEngine(new Configuration
{
    AzureConfiguration = AzureConfiguration.FromSubscription(
        subscriptionKey: YOUR_SUBSCRIPTION_KEY,
        region: YOUR_REGION),

    // Define all the languages that can be recognized
    InputLanguages = new[] { "en-US", "de-DE", "it-IT", "es-ES" },
});

covox.RegisterCommands(turnOnLightCmd, turnOffLightCmd);

Register a handler for when a command is recognized, then start listening:

covox.Recognized += (cmd, ctx) =>
{
    if (cmd == turnOnLightCmd) { /* ... */ }
    else if (cmd == turnOffLightCmd) { /* ... */ }
};

await covox.StartAsync();

Use case scenarios

Basic

LightSwitch

(source) Basic showcase of the engine and of command invocation.

Commands

turn on the lights
output: "Light on"
turn off the lights
output: "Light off"

Web application

Pac-Scream

Pac-Scream is a variant of the popular game Pac-Man, in which movements are controlled via voice commands instead of key presses.

Commands

left / move left
right / move right
up / move up
down / move down
stop / cancel / no
to cancel the previous command
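
As a rough illustration, the command list above could be written down as a table of IDs and trigger phrases, together with a check that no phrase is claimed by two commands. The IDs (`MoveLeft`, `Cancel`, and so on) and the helper `FindAmbiguousTriggers` are hypothetical and are not taken from the Pac-Scream source; only the trigger phrases come from the list above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class PacScreamCommands
{
    // Command IDs are hypothetical; the trigger phrases are the ones listed above.
    public static readonly IReadOnlyDictionary<string, string[]> Triggers =
        new Dictionary<string, string[]>
        {
            ["MoveLeft"]  = new[] { "left", "move left" },
            ["MoveRight"] = new[] { "right", "move right" },
            ["MoveUp"]    = new[] { "up", "move up" },
            ["MoveDown"]  = new[] { "down", "move down" },
            ["Cancel"]    = new[] { "stop", "cancel", "no" },
        };

    // Returns every trigger phrase that is claimed by more than one command.
    // Phrases are compared after trimming and lower-casing.
    public static IReadOnlyList<string> FindAmbiguousTriggers(
        IReadOnlyDictionary<string, string[]> triggers)
    {
        return triggers
            .SelectMany(kv => kv.Value.Select(
                t => (Phrase: t.Trim().ToLowerInvariant(), Id: kv.Key)))
            .GroupBy(x => x.Phrase)
            .Where(g => g.Select(x => x.Id).Distinct().Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }
}
```

Running `FindAmbiguousTriggers` over the table above yields an empty list, since each phrase belongs to exactly one command. A check like this is useful with short triggers, where it is easy to assign the same word to two commands by accident.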

Technologies

CoVoX engine
ASP.NET Core 5
SignalR
WebGL

Mobile application

Find-it

Find-it is a mobile app that recognizes objects in an image or a video in response to a user's voice request. Given an image or a video, if the user asks to see a particular object, the application draws a box around the object that matches the description.

Technologies

AI/Machine Learning

Guess-Who

Guess-Who is a game for two players. Each player has a "playing field" with different people and one fixed person, which the opponent must guess by asking elimination questions. Via voice commands you should be able to ask a question such as "Does the woman have red hair?" Image recognition should then return the answer (yes / no).

Procedure

Ask a question via voice command
Recognize and process the question
Inspect the image and detect the answer
Return the answer (yes / no)

Technologies

CoVoX engine
Python / Tensorflow
Face

Security

Voice-Unlock

Voice-Unlock showcases the voice recognition service from Azure. The application displays a closed lock. If the authorized user says "Unlock", the lock opens. If an unauthorized user says "Unlock", the background flashes red for a few seconds.

Technologies

CoVoX engine
Speaker Recognition
VueJS application

External device

Robobutler

Robobutler is a robot capable of executing voice-triggered actions based on its perception of the current environment. The idea is that an operator can tell the robot to "Bring me the yellow box", and the robot will then do the following:

Confirm/Repeat the task the robot was told to do
Go to the yellow box
Pick it up
Bring it to the operator

Other possible scenarios

Placing a box on top of another
Basic movements (stop, rotate, etc.)
Spatial awareness (e.g. go to the nearest corner)

Benefit to the real world

In the real world you could have a warehouse with a lot of heavy packages. Working in a human-robot collaboration environment, the human would be able to control the robot either with a controller or by voice. Adding intelligence to the robot simplifies the interaction with it, increasing the overall productivity and performance of the human and the facility. Furthermore, it enables the human to multitask.

Robot to use

https://www.dji.com/de/robomaster-s1

The desired configuration would be an industrial arm on top of a body with wheels to represent a valid scenario for the industry.

Technologies

CoVoX engine
Azure computer vision
Python (to control the robot)

Technologies

The library is developed in .NET 5 and uses Azure Cognitive Services.


artiso-solutions/CoVoX
