stm32-speech-recognition-and-traduction

Table of Contents

About The Project
- Built With
Getting Started
- Prerequisites
Usage
License
Contact

About The Project

stm32-speech-recognition-and-traduction is a project developed for the Operating Systems Design exam at the University of Milan (academic year 2020-2021). It implements a speech recognition and speech-to-text translation system based on a pre-trained machine learning model. The system recognizes only a restricted set of words, which keeps the final model small enough to run on the STM32F407VG microcontroller, where memory and computing power are limited. The speech recognition system handles simple commands, such as switching the LEDs on and off and enabling the translation system. The translation system converts the voice signal into text, which is then displayed on the screen.
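As a rough illustration of how a classifier's output can be turned into one of the recognized words, the sketch below picks the highest-scoring class (argmax) and maps it to a label. The label list, its order, and the names `LABELS`, `argmax` and `decode_word` are assumptions made for this example; the actual model output layout is not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical label set: the words named in the Usage section.
 * The real model's class order (and any extra classes) is an assumption. */
static const char *const LABELS[] = {
    "ON", "OFF", "ONE", "TWO", "THREE", "FOUR", "VISUAL", "STOP"
};
#define NUM_LABELS (sizeof LABELS / sizeof LABELS[0])

/* Return the index of the largest score; ties go to the lowest index. */
static size_t argmax(const float *scores, size_t n)
{
    assert(scores != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (scores[i] > scores[best]) {
            best = i;
        }
    }
    return best;
}

/* Map a score vector of NUM_LABELS entries to its most likely word. */
static const char *decode_word(const float *scores)
{
    return LABELS[argmax(scores, NUM_LABELS)];
}
```

Calling `decode_word` on a score vector whose third entry is the largest would return `"ONE"` under the label order assumed above.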

Built With

Getting Started

Prerequisites

STM32F407VG board, equipped with a UART/USART interface and an ST MEMS digital microphone
USART-to-USB dongle to connect the pins of the UART/USART interface to a PC via USB

Usage

To have the microphone mounted on the microcontroller acquire a new word, press the user button B1. The board then acquires a one-second audio signal and performs an action that depends on the word spoken.

Say the word ON twice in succession; this word enables the printing system on the terminal.

Say the word OFF after the word ON has been pronounced.

Say various words after the word ON has been pronounced, then say the word OFF.

Say ONE to turn on the green LED, after the word ON has been pronounced.

Say TWO to turn on the blue LED, after the words ON and ONE have been pronounced.

Say THREE to turn on the red LED, after the words ON, ONE and TWO have been pronounced.

Say FOUR to turn on the orange LED, after the words ON, ONE, TWO and THREE have been pronounced, then say the word OFF.
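The ON/OFF and LED commands above can be pictured as a small state machine. The following is a minimal sketch, not the project's actual code: the type and function names are hypothetical, and it assumes that the LED commands work independently of the printing flag and that OFF only disables printing, neither of which is stated explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers for the recognized words. */
typedef enum {
    WORD_ON, WORD_OFF, WORD_ONE, WORD_TWO, WORD_THREE, WORD_FOUR, WORD_OTHER
} word_t;

/* One bit per LED; on hardware this mask would drive the GPIO pins. */
enum {
    LED_GREEN  = 1u << 0,
    LED_BLUE   = 1u << 1,
    LED_RED    = 1u << 2,
    LED_ORANGE = 1u << 3
};

typedef struct {
    bool    printing; /* true after ON, false after OFF */
    uint8_t leds;     /* bitmask of LEDs that are lit */
} command_state_t;

/* Apply one recognized word to the state.
 * Returns true if terminal printing is enabled after this word. */
static bool handle_word(command_state_t *st, word_t w)
{
    assert(st != NULL);
    switch (w) {
    case WORD_ON:    st->printing = true;      break;
    case WORD_OFF:   st->printing = false;     break;
    case WORD_ONE:   st->leds |= LED_GREEN;    break;
    case WORD_TWO:   st->leds |= LED_BLUE;     break;
    case WORD_THREE: st->leds |= LED_RED;      break;
    case WORD_FOUR:  st->leds |= LED_ORANGE;   break;
    default:         /* other words change nothing */ break;
    }
    return st->printing;
}
```

Keeping the state in a plain struct means the command logic can be exercised on a PC without the board; only the step that copies `leds` to the pins would be hardware-specific.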

Say the word VISUAL three times in succession to show the computed execution time statistics. Then say the word STOP, followed by the word VISUAL, to show that the execution times (expressed in clock cycles) have been reset.
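To make the idea of execution time statistics in clock cycles concrete, here is a minimal sketch of a running min/max/mean accumulator with a reset. All names are hypothetical, and which statistics the project actually reports is not stated. On a Cortex-M4 the start and end timestamps could come from the DWT cycle counter (`DWT->CYCCNT` in CMSIS), but whether the project uses it is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running statistics over elapsed clock-cycle counts. */
typedef struct {
    uint32_t count; /* number of measurements */
    uint32_t min;   /* smallest elapsed value seen */
    uint32_t max;   /* largest elapsed value seen */
    uint64_t sum;   /* total cycles, 64-bit to avoid overflow */
} cycle_stats_t;

/* Clear all statistics (the kind of reset STOP might trigger). */
static void stats_reset(cycle_stats_t *s)
{
    assert(s != NULL);
    s->count = 0;
    s->min = UINT32_MAX;
    s->max = 0;
    s->sum = 0;
}

/* Record one measurement from two 32-bit cycle-counter readings. */
static void stats_add(cycle_stats_t *s, uint32_t start, uint32_t end)
{
    assert(s != NULL);
    /* Unsigned subtraction stays correct across one counter wraparound. */
    uint32_t elapsed = end - start;
    if (elapsed < s->min) s->min = elapsed;
    if (elapsed > s->max) s->max = elapsed;
    s->sum += elapsed;
    s->count++;
}

/* Mean elapsed cycles, or 0 when nothing has been recorded. */
static uint32_t stats_mean(const cycle_stats_t *s)
{
    assert(s != NULL);
    return s->count ? (uint32_t)(s->sum / s->count) : 0;
}
```

The 64-bit sum is a deliberate choice: at typical microcontroller clock rates a 32-bit total would overflow after only a few one-second measurements.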

License

Distributed under the BSD 2-Clause License. See LICENSE for more information.

Contact

Federica Paoli' - federicapaoli1@gmail.com
Stefano Taverni - ste.taverni@gmail.com

Project Link: https://github.com/FedericaPaoli1/stm32-speech-recognition-and-traduction
