In this project we build a deep learning model that converts African-language (Amharic) speech to text.
The World Food Program wants to deploy an intelligent form that collects nutritional information on food bought and sold at markets in two different countries in Africa - Ethiopia and Kenya.
The design of this intelligent form requires selected people to install an app on their mobile phones; whenever they buy food, they use their voice to activate the app and register the list of items they just bought in their own language. The intelligent systems in the app are expected to live-transcribe the speech to text and organize the information in an easy-to-process way in a database.
Our responsibility is to build a deep learning model that is capable of transcribing speech to text. The model we produce should be accurate and robust against background noise. This project was built during the fourth week of the 10Academy Machine Learning training session.
- Install Required Python Modules

```shell
git clone https://github.com/10acad-group3/speech_recognition
cd speech_recognition
pip install -r requirements.txt
```
- Jupyter Notebook

```shell
cd notebooks
jupyter notebook
```
- Model Training UI (not implemented)

```shell
mlflow ui
```
- Dashboard (not implemented)

```shell
streamlit run app.py
```
The folder is tracked with DVC, so the files only appear after cloning and setting up the project locally. The sub-folder AMHARIC contains the training and testing files for our model. Both share the same file structure:
- `wav/` : folder containing all audio files
- `text` : file containing the metadata (audio file name and corresponding transcription)
- `spk2utt`, `trsTest.txt`, `utt2spk`, `wav.scp` : files provided with the dataset; they currently have no purpose in this project but could be used for future analysis.
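Since `spk2utt`, `utt2spk` and `wav.scp` follow the Kaldi data-preparation convention, the `text` file most likely maps an utterance id to its transcription, one utterance per line. A minimal sketch of loading it (the single-space separator and the function name are assumptions, not the repository's actual API):

```python
def load_transcriptions(path):
    """Parse a Kaldi-style `text` file with `<utt_id> <transcription>` per line."""
    transcripts = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split only on the first space: everything after it is the transcription.
            utt_id, _, text = line.partition(" ")
            transcripts[utt_id] = text
    return transcripts
```

The resulting dict can then be joined against the audio files in `wav/` by utterance id.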
- `1.0 preprocessing.ipynb` : notebook showing metadata generation, new features, data exploration, outlier removal, audio cleaning and text cleaning
- `1.0 acoustic_modeling_v2.ipynb` : notebook similar to `1.0 preprocessing.ipynb`, but with more analysis on audio visualization and text data
- `2.0 outliers.ipynb` : visualizes the effect of outlier removal on features of the dataset, then saves the outlier-cleaned file
- `3.0 speech_recognition.ipynb` : notebook showing how to tokenize, augment and generate data from the outlier-cleaned data
- `audio_visualization` : Google Colab notebook showing how to visualize an audio file using its waveform and spectrogram
- `4.0 acoustic_modeling` : in progress ...
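The tokenization step in `3.0 speech_recognition.ipynb` can be illustrated with a character-level tokenizer, a common choice for speech-to-text models over Amharic text. This is a hedged sketch, not the notebook's actual code; the class name and the convention of reserving id 0 for a CTC blank token are assumptions:

```python
class CharTokenizer:
    """Map characters to integer ids and back, e.g. for a CTC acoustic model."""

    def __init__(self, transcriptions):
        # Build the vocabulary from every character seen in the corpus.
        chars = sorted(set("".join(transcriptions)))
        # Reserve id 0 for the CTC blank token.
        self.char_to_id = {c: i + 1 for i, c in enumerate(chars)}
        self.id_to_char = {i: c for c, i in self.char_to_id.items()}

    def encode(self, text):
        return [self.char_to_id[c] for c in text]

    def decode(self, ids):
        return "".join(self.id_to_char[i] for i in ids)
```

Encoding and then decoding a string recovers it exactly, which makes the mapping easy to sanity-check during preprocessing.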
- `audio_vis.py` : helper class for visualizing and playing audio files
- `clean_audio.py` : helper class for cleaning audio files
- `config.py` : project configuration and file paths
- `file_handler.py` : helper class for reading files
- `log.py` : helper class for logging
- `script.py` : utility functions
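As an illustration of the kind of cleaning `clean_audio.py` could perform (the project aims for robustness against background noise), here is a minimal NumPy sketch that normalizes amplitude and trims leading/trailing silence. The function names and the threshold value are assumptions for illustration, not the repository's actual API:

```python
import numpy as np

def normalize(signal):
    """Scale a mono signal so its peak amplitude is 1.0."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def trim_silence(signal, threshold=0.02):
    """Drop leading and trailing samples whose amplitude is below `threshold`."""
    voiced = np.where(np.abs(signal) >= threshold)[0]
    if voiced.size == 0:
        return signal[:0]  # the whole clip is silence
    return signal[voiced[0]:voiced[-1] + 1]
```

Normalizing before thresholding keeps the silence cutoff meaningful across recordings made at different input gains.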