Edge AI Audio Visual Demonstration

This repository hosts source code for an Edge AI demo on TI processors, focused on audio and visual processing.

This demo implements an audio-visual system in the form of a video conferencing front end. It runs audio and vision processing side by side to infer meaning from the raw incoming signals. Audio data is processed on the CPU. Video and image data are processed using several hardware accelerators, including an ISP (VISS), a downscaling engine (MSC), and a deep learning accelerator (C7xMMA).

This demo has been validated on the AM62A running the 9.0.0 Edge AI Linux SDK. It is expected to run equally well on other AM6xA / Edge AI processors from TI, such as the TDA4VM, AM68A, and AM69A.

How to run this demo

Note: this demo borrows heavily from the edgeai-keyword-spotting and edgeai-gst-apps-retail-checkout projects, which it uses for running a keyword spotting neural network on microphone audio and for constructing an image-processing GStreamer pipeline, respectively.

Obtain an EVM for the AM6xA processor of choice, e.g. the AM62A Starter Kit.
Flash an SD card with the Edge AI SDK (Linux) by following the quick start guide (Quick start for AM62A).
Log in to the device over a serial or SSH connection. A network connection is required to set up this demo.
Clone this repository to the device using git.

If the EVM is behind a proxy, first set the HTTPS_PROXY environment variable and then add it to git: git config --global https.proxy $HTTPS_PROXY

Run the audio_setup.sh script to download, build, and install libportaudio, pyaudio, and librosa on the device. This will fail if the network or proxy is not configured.
Plug in a USB microphone.
Run the detect_microphone.py script to identify which device index Linux assigned to the microphone. If the index is not 1, pass it with the -a flag when running the run_demo.sh script later.

This may print many additional lines and warnings -- these can be safely ignored if audio dependencies were installed.
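For readers who want to see what such a device lookup involves, here is a minimal sketch of enumerating capture devices with PyAudio. It is not the repository's detect_microphone.py, whose contents are not shown here. The function names input_devices and query_pyaudio_devices are hypothetical, and the sketch assumes PyAudio was installed by audio_setup.sh.

```python
def input_devices(device_infos):
    """Return (index, name) for every device that can capture audio.

    device_infos is a list of dicts shaped like the ones returned by
    PyAudio's get_device_info_by_index().
    """
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_pyaudio_devices():
    """Collect the device descriptions that PortAudio reports (hypothetical helper)."""
    import pyaudio  # imported lazily; assumed to be installed on the target

    pa = pyaudio.PyAudio()
    try:
        return [
            pa.get_device_info_by_index(i)
            for i in range(pa.get_device_count())
        ]
    finally:
        pa.terminate()  # always release PortAudio resources
```

On the board, calling input_devices(query_pyaudio_devices()) would list the candidate indices. A USB microphone should appear as an entry with at least one input channel.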

If using an IMX219 camera, enable the DTBO for this camera type by adding a line to uEnv.txt, following the instructions on the dev.ti.com page for "Evaluating Linux -> Camera" in the AM62A academy.
Reboot the board so the device tree overlay is applied.
Once you have confirmed that the IMX219 is enabled (it is reported in the terminal when a shell is opened), run the setup_imx219-2mp.sh script to set 2 MP mode at 1640x1232 resolution.
Run the run_demo.sh script.

Choosing the wrong device_index for the microphone causes errors such as a segmentation fault.
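Because a wrong index can crash the demo, it may help to build the launch command programmatically from the index reported earlier. The sketch below is an illustration only. The helper names are hypothetical, and it assumes that run_demo.sh is started from the repository root and that -a takes the index as a separate following argument.

```python
import subprocess

DEFAULT_MIC_INDEX = 1  # per the steps above, index 1 needs no extra argument


def run_demo_command(mic_index, script="./run_demo.sh"):
    """Build the argument list for run_demo.sh, adding -a only when needed."""
    if mic_index < 0:
        raise ValueError("microphone device index must be non-negative")
    command = [script]
    if mic_index != DEFAULT_MIC_INDEX:
        # Assumption: the flag and its value are passed as two arguments.
        command += ["-a", str(mic_index)]
    return command


def launch_demo(mic_index):
    """Run the demo; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(run_demo_command(mic_index), check=True)
```

For example, run_demo_command(2) yields ["./run_demo.sh", "-a", "2"], while run_demo_command(1) yields just ["./run_demo.sh"].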

TexasInstruments/edgeai-demo-audio-visual

Folders and files

Latest commit

History

Repository files navigation

Edge AI Audio Visual Demonstration

How to run this demo

Resources and Help

About

Resources

License

Stars

Watchers

Forks

Languages