An end to end speech recognition system for Swahili audio

SameC137/SpeechToText

African language Speech Recognition - Speech-to-Text

The World Food Programme wants to deploy an intelligent form that collects nutritional information about food bought and sold at markets in two African countries, Ethiopia and Kenya. The design requires selected people to install an app on their mobile phones; whenever they buy food, they activate the app by voice and register the list of items they just bought in their own language. The app's intelligent systems are expected to transcribe the speech to text live and organize the information in a database in an easy-to-process way.

Folder/File structure for branch

  • artifacts - contains meta files and other artifacts generated during the project
  • notebook - contains notebooks that describe how the classes for meta generation and preprocessing work
  • scripts - contains scripts for meta generation, preprocessing and feature extraction
  • data.dvc - DVC file for versioning the data
  • requirements.txt - dependencies for the code in this branch
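
As a rough illustration of what a meta generation step could look like, the sketch below scans a folder of WAV files and writes one CSV row per clip. It is not the code in the scripts folder: the function names `describe_wav` and `write_meta`, the column names, and the CSV format are all assumptions made for this example, and it only uses Python's standard library.

```python
import csv
import wave
from pathlib import Path
from typing import Dict, List

# Hypothetical column layout; the project's real meta files may differ.
FIELDS = ["file", "channels", "sample_rate", "duration_s"]


def describe_wav(path: Path) -> Dict[str, object]:
    """Read basic properties of one WAV file from its header."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "file": path.name,
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_s": round(frames / rate, 3),
        }


def write_meta(audio_dir: Path, meta_path: Path) -> List[Dict[str, object]]:
    """Describe every .wav file in audio_dir and save the rows as CSV."""
    rows = [describe_wav(p) for p in sorted(audio_dir.glob("*.wav"))]
    with meta_path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `write_meta(Path("some_audio_dir"), Path("meta.csv"))` would produce a file that later preprocessing steps can filter on, for example to drop clips that are too short or too long.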

Data


Data Features

Input features (X): audio clips of spoken words
Target labels (y):  text transcript of what was spoken
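
Before a model can be trained on these pairs, the text transcripts are usually turned into integer sequences. The sketch below shows one common way to do that at the character level. It assumes character-level targets, which this README does not confirm, and the helper names `build_vocab`, `encode` and `decode` as well as the sample sentences are made up for this example.

```python
from typing import Dict, List


def build_vocab(transcripts: List[str]) -> Dict[str, int]:
    """Map every character seen in the transcripts to an integer id.

    Ids start at 1 so that 0 stays free, e.g. for padding or a blank symbol.
    """
    chars = sorted({ch for text in transcripts for ch in text.lower()})
    return {ch: idx for idx, ch in enumerate(chars, start=1)}


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a transcript into character ids, skipping unknown characters."""
    return [vocab[ch] for ch in text.lower() if ch in vocab]


def decode(ids: List[int], vocab: Dict[str, int]) -> str:
    """Turn character ids back into text, ignoring ids not in the vocabulary."""
    inverse = {idx: ch for ch, idx in vocab.items()}
    return "".join(inverse[i] for i in ids if i in inverse)


# Illustrative transcripts only; they are not taken from the project's data.
transcripts = ["habari yako", "asante sana"]
vocab = build_vocab(transcripts)
labels = [encode(t, vocab) for t in transcripts]
```

Each entry in `labels` is the target sequence y for the matching audio clip, and `decode` reverses the mapping when model predictions need to be read as text.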

Models Used

Simple RNN



Bidirectional RNN



CNN RNN model



Deep Speech 2 Layers



Deep Speech with RNN layer Bidirectional



Deep Speech 3 Layers
