SpeechCommAI

SpeechCommAI is a console program that recognizes up to 35 spoken words.

Introduction

This project recognizes speech commands recorded in real time. Predictions are made by a convolutional neural network built with Keras and trained on the TensorFlow speech commands dataset. Using the options described under Usage, users can also download the dataset, preprocess it, and train the model themselves.

List of the speech commands

backward
bed
bird
cat
dog
down
eight
five
follow
forward
four
go
happy
house
learn
left
marvin
nine
no
off
on
one
right
seven
sheila
six
stop
three
tree
two
up
visual
wow
yes
zero
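
A classifier over these words typically outputs one score per word, and the highest score is taken as the recognized command. The sketch below illustrates that decoding step only. The alphabetical label order and the `decode_prediction` helper are assumptions made for illustration, not something taken from the project's code.

```python
# Assumption: class indices follow the alphabetical order of the list above.
WORDS = [
    "backward", "bed", "bird", "cat", "dog", "down", "eight", "five",
    "follow", "forward", "four", "go", "happy", "house", "learn", "left",
    "marvin", "nine", "no", "off", "on", "one", "right", "seven", "sheila",
    "six", "stop", "three", "tree", "two", "up", "visual", "wow", "yes",
    "zero",
]


def decode_prediction(scores):
    """Return the (word, score) pair with the highest score.

    Hypothetical helper: `scores` holds one number per entry in WORDS.
    """
    if len(scores) != len(WORDS):
        raise ValueError(f"expected {len(WORDS)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=scores.__getitem__)
    return WORDS[best], scores[best]


if __name__ == "__main__":
    # Fake score vector in which "stop" is the most likely word.
    scores = [0.0] * len(WORDS)
    scores[WORDS.index("stop")] = 0.9
    print(decode_prediction(scores))
```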

Requirements

python (3.7+)
pip
the required libraries listed in the requirements.txt file

Setup

Install the requirements with the following command:

pip install -r requirements.txt

Usage

To use this program, run main.py from the command line:

python main.py <option>

As an <option>, one of the following can be selected:

D – Download the raw dataset
P – Pre-process the dataset
T – Train the model
L – Live record

In addition, the T and L options accept one more optional argument naming the set of words to use. The word sets are defined in the config.toml file. If no set name is given, the program uses the entire set of 35 words by default.
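
The command-line rules described above can be sketched with `argparse`. This is only an illustration of the option and word-set logic, not the project's actual main.py: the `WORD_SETS` dictionary is a hypothetical stand-in for config.toml, and the set names `directions` and `digits` are invented for the example.

```python
import argparse

# Hypothetical stand-in for the word sets defined in config.toml.
# The set names and their contents are illustrative assumptions.
WORD_SETS = {
    "directions": ["up", "down", "left", "right"],
    "digits": ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"],
}

# The four options listed in the Usage section.
ACTIONS = {
    "D": "download the raw dataset",
    "P": "pre-process the dataset",
    "T": "train the model",
    "L": "live record",
}


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("option", choices=sorted(ACTIONS),
                        help="one of D, P, T, L")
    parser.add_argument("word_set", nargs="?", default=None,
                        help="optional word set name (T and L only)")
    return parser


def parse_command(argv):
    """Return (option, word_set); word_set is None for the full 35 words."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.word_set is not None:
        if args.option not in ("T", "L"):
            parser.error("a word set is only accepted with the T and L options")
        if args.word_set not in WORD_SETS:
            parser.error(f"unknown word set: {args.word_set}")
    return args.option, args.word_set


if __name__ == "__main__":
    print(parse_command(["T", "digits"]))
    print(parse_command(["L"]))
```

Returning `None` when no set is named keeps the "all 35 words" default in one place, so the caller decides how to expand it.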

Technologies used

Machine learning:

TensorFlow - version 2.9.1
Keras - version 2.9.0

Audio processing:

librosa - version 0.9.1
PyAudio - version 0.2.12

Plots:

matplotlib - version 3.5.1
scikit-learn - version 1.1.2
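
To confirm that an environment matches the versions listed above, a small check along the following lines may help. It is a sketch under two assumptions: Python 3.8 or newer (for `importlib.metadata`), and the PyPI distribution names used as keys in `PINNED`. The `find_mismatches` helper is hypothetical and not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; the keys are assumed PyPI names.
PINNED = {
    "tensorflow": "2.9.1",
    "keras": "2.9.0",
    "librosa": "0.9.1",
    "PyAudio": "0.2.12",
    "matplotlib": "3.5.1",
    "scikit-learn": "1.1.2",
}


def find_mismatches(pinned, lookup=version):
    """Map each package whose installed version differs to that version.

    A missing package is reported with the value None.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {installed}")
```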

Screenshots

Visualization of the trained network (confusion matrix):

The model's training history:
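
For readers unfamiliar with the confusion matrix mentioned above: each row corresponds to the word that was actually spoken and each column to the word the model predicted, so correct predictions fall on the diagonal. The plain-Python sketch below shows the idea on three words with made-up labels. scikit-learn, listed under Technologies used, offers an equivalent `sklearn.metrics.confusion_matrix` function. Whether the project computes its plot that way is not stated here.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions: rows are true classes, columns are predicted ones."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


if __name__ == "__main__":
    # Made-up labels for illustration only; not results from the project.
    classes = ["yes", "no", "stop"]
    truth = ["yes", "yes", "no", "stop", "stop", "no"]
    guess = ["yes", "no", "no", "stop", "yes", "no"]
    matrix = confusion_matrix(truth, guess, classes)
    for name, row in zip(classes, matrix):
        print(name, row)
    print("accuracy:", accuracy(matrix))
```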

Contact

Created by @miolows - feel free to contact me!
