Offline Speech Recognition and Command Execution

The meta-offline-voice-agent is an AGL Layer that enables Offline Speech Recognition and Command Execution capabilities for Automotive Grade Linux.

Introduction

The meta-offline-voice-agent layer integrates the Vosk API, Snips (Inference Only), and RASA to provide offline speech recognition and command execution for Automotive Grade Linux. The layer is based on the Kaldi ASR Toolkit, which allows for accurate and efficient speech recognition in the AGL platform, Snips which provides us with a lightweight Natural Language Intent Engine, and RASA a complete end-to-end ML framework for developing and creating chatbots and voice assistants.

Layer Status

Status: WIP (Work In Progress)

This layer is currently in development and integrates the Vosk, Snips (Inference Only), and RASA libraries. Speech recognition has been verified using test scripts from the vosk-api python examples. Additionally, both Snips and RASA have undergone testing and verification to ensure their functionality. Currently, work on command execution is still in progress.

External Dependencies

This layer depends on the following external layers:

meta-tensorflow

Working Features

The following features are currently working in the meta-offline-voice-agent layer:

Testing Features on AGL

Build Layer

In order to test the features of this layer you first need to build it as part of your final AGL image. First of all ensure that you have all the external layer dependencies included. Then you can use the following set of commands to initialize and build this layer into the agl-demo-platform qemux86_64 image:

$ source master/meta-agl/scripts/aglsetup.sh -m qemux86-64 -b build-master agl-demo agl-devel agl-offline-voice-agent
$ source agl-init-build-env
$ bitbake agl-demo-platform

The build can take anywhere from 6 hours to 24 hours or even more depending upon compute power of your machine. After the build completes you can use the following command to boot into your AGL image: (you need to install QEMU if not already for the command to work)

$ runqemu tmp/deploy/images/qemux86-64/agl-demo-platform-qemux86-64.qemuboot.conf kvm serialstdio slirp publicvnc audio

Test Vosk

(Not Recommended) The simplest way to test Vosk API is by using the following command:

$ ptest-runner python3-vosk-api

In order for the above command to work you need to turn on ptests by adding the following lines to your local.conf that can be found at meta-agl-devel/templates/feature/agl-offline-voice-agent/50_local.conf.inc:

DISTRO_FEATURES:append = " ptest"
EXTRA_IMAGE_FEATURES += "ptest-pkgs"

The above method may be the easiest one but it's not recommended because ptests increase the image build times by a substantial amount. You can look into the official vosk-api docs for usage and other ways of testing.

Test Snips

(Important) Currently, there are some library linking issues between NumPy, SciPy, and OpenBLAS. While we investigate and fix them you need to use LD_PRELOAD method as a workaround for Snips to work properly. Input the following command as soon as you boot into the target image:

$ export LD_PRELOAD=/usr/lib/libopenblas.so.0

In order to test the Snips NLU Intent Engine you can use the sample pre-trained model, by default it automatically gets built into the target image when you include this layer. To perform inference using this model you can run the following command inside your target image:

$ nlu-inference-agl parse /usr/share/nlu/snips/model/ -q "your command here"

This is just a sample model and may not be able to handle all types of commands. You can always train your own intent engine model using your custom dataset, for more details on how to do that you can look into the README files of snips-sdk-agl, snips-model-agl, and snips-inference-agl.

Test RASA

(Important) Currently, there are some library linking issues between NumPy, SciPy, and OpenBLAS. While we investigate and fix them you need to use LD_PRELOAD method as a workaround for RASA to work properly. Input the following command as soon as you boot into the target image:

$ export LD_PRELOAD=/usr/lib/libopenblas.so.0

In order to test the RASA NLU Intent Engine you can use the sample pre-trained model, by default it automatically gets built into the target image when you include this layer. To perform inference using this model you can run the following command inside your target image:

$ rasa shell -m /usr/share/nlu/rasa/models/

This will open an interactive shell where you can issue commands and get related results. This is just a sample model and may not be able to handle all types of commands. You can always train your own RASA intent engine model using your custom dataset by following the official RASA docs.

Supported Targets

Currently, the following targets are fully supported:

QEMU x86-64 (work in progress)

Maintainers

Malik Talha talhamalik727x@gmail.com
Aman Arora aman.arora9848@gmail.com

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
conf		conf
dynamic-layers/meta-agl-demo/recipes-demo/flutter-vosk-demo		dynamic-layers/meta-agl-demo/recipes-demo/flutter-vosk-demo
recipes-blas/openblas		recipes-blas/openblas
recipes-lapack/lapack		recipes-lapack/lapack
recipes-nlu/nlu-model		recipes-nlu/nlu-model
recipes-python		recipes-python
recipes-vosk		recipes-vosk
COPYING.MIT		COPYING.MIT
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

conf

conf

dynamic-layers/meta-agl-demo/recipes-demo/flutter-vosk-demo

dynamic-layers/meta-agl-demo/recipes-demo/flutter-vosk-demo

recipes-blas/openblas

recipes-blas/openblas

recipes-lapack/lapack

recipes-lapack/lapack

recipes-nlu/nlu-model

recipes-nlu/nlu-model

recipes-python

recipes-python

recipes-vosk

recipes-vosk

COPYING.MIT

COPYING.MIT

README.md

README.md

Repository files navigation

Offline Speech Recognition and Command Execution

Table of Contents

Introduction

Layer Status

External Dependencies

Working Features

Testing Features on AGL

Build Layer

Test Vosk

Test Snips

Test RASA

Supported Targets

Maintainers

About

Releases

Packages

Languages

License

malik727/meta-offline-voice-agent

Folders and files

Latest commit

History

Repository files navigation

Offline Speech Recognition and Command Execution

Table of Contents

Introduction

Layer Status

External Dependencies

Working Features

Testing Features on AGL

Build Layer

Test Vosk

Test Snips

Test RASA

Supported Targets

Maintainers

About

Resources

License

Stars

Watchers

Forks

Languages