S_PROJECT (CMPE 195F Group 1 Project U19: Profanity Filtering Microphone)

Getting Started

This project is written in Python 3 and runs on Ubuntu 18.04 and Windows. To get it working on your computer, download all files in this repository, then install the prerequisites and follow the installation instructions below.

Note that this repo contains both the front end and the back end, which should be placed in different locations. The front end should be installed locally and the back end should be installed on an AWS instance.

Prerequisites

For Ubuntu 18.04: Make sure you are a non-root user with sudo privileges. Ubuntu comes with Python 3, so run the following commands to bring the system packages, including Python, up to date.

$ sudo apt update
$ sudo apt -y upgrade

Then run the following command to check which version of Python 3 is installed on your system. The version should be Python 3.7 or above.

$ python3 -V
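
The same check can also be done from inside Python, which is handy if several interpreters are installed. This is a minimal sketch, assuming the 3.7 minimum stated above; the helper name python_is_supported is hypothetical and not part of the project.

```python
import sys

# Minimum version taken from this README.
MINIMUM = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    status = "OK" if python_is_supported() else "too old"
    print(sys.version.split()[0], status)
```

Run it with the same python3 executable you plan to use for the UI.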

Then install the following libraries. For the GUI, install wxPython using the following command.

$ sudo pip install -U wxPython

To access the microphone for audio input and functions related to audio, install the PyAudio library.

$ sudo pip install PyAudio

For the speech recognition used by the front end, install Python's SpeechRecognition library. This library records and processes the audio, and also creates the RAW file that will be sent to the back end.

$ sudo pip install SpeechRecognition

Lastly, for the media player, install python-vlc, the Python bindings for the VLC media player.

$ sudo pip install python-vlc
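
After the four installs, it can help to confirm that each library is importable, since the import names differ from the pip package names. The following is a sketch only: the import names (wx, pyaudio, speech_recognition, vlc) are the standard ones for these packages, while the helper name missing_packages is hypothetical.

```python
import importlib.util

# pip package name -> module name used in "import ..." statements.
FRONT_END_MODULES = {
    "wxPython": "wx",
    "PyAudio": "pyaudio",
    "SpeechRecognition": "speech_recognition",
    "python-vlc": "vlc",
}


def missing_packages(modules=FRONT_END_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

An empty result means all four front-end libraries are visible to the interpreter that ran the script.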

After adding the front-end components locally, install the following components to connect to the back end.

LibCurl with nghttp2

Run the following commands to build libcurl with nghttp2 (HTTP/2) support, which is used to access AWS S3 and the Transcribe API.

$ sudo apt-get -y install build-essential nghttp2 libnghttp2-dev libssl-dev   
$ wget https://curl.haxx.se/download/curl-7.63.0.tar.gz   
$ tar xzf curl-7.63.0.tar.gz    
$ cd curl-7.63.0    
$ ./configure --with-nghttp2 --prefix=/usr/local --with-ssl   
$ make && sudo make install   
$ sudo ldconfig
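
To confirm that the freshly built curl picked up nghttp2, look for HTTP2 on the Features line of curl --version. The sketch below automates that check. The helper names are hypothetical, and it assumes the curl found on your PATH is the one installed under /usr/local by the commands above.

```python
import shutil
import subprocess


def supports_http2(version_output):
    """Return True if the 'Features:' line of `curl --version` lists HTTP2."""
    for line in version_output.splitlines():
        if line.startswith("Features:"):
            return "HTTP2" in line.split()
    return False


def curl_version_output(curl="curl"):
    """Run `curl --version` and return its standard output as text."""
    result = subprocess.run(
        [curl, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    if shutil.which("curl") is None:
        print("curl was not found on PATH")
    else:
        print("HTTP/2 support:", supports_http2(curl_version_output()))
```

If this reports False, an older system curl may be shadowing the new build; check which curl is first on your PATH.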

PortAudio

Download the latest stable release of PortAudio from the PortAudio website.
Unzip the archive.
Run the following terminal commands.

$ cd portaudio  
$ mkdir build  
$ cd build  
$ cmake .. -DCMAKE_BUILD_TYPE=Release  
$ cmake --build .  
$ cmake --build . --target install

AWS Transcribe

Input the following terminal commands.

$ git clone https://github.com/aws/aws-sdk-cpp.git  
$ cd aws-sdk-cpp  
$ mkdir build  
$ cd build  
$ cmake .. -DBUILD_ONLY="transcribestreaming" -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF  
$ cmake --build . --config Release  
$ sudo cmake --build . --config Release --target install

CMake compile (Back-end)

Input the following terminal commands after doing all of the above.

$ mkdir build
$ cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Release
$ cmake --build . --config Release

The full instructions for setting up the AWS instance are too long for this README, so they are summarized below.

Amazon Web Services

Follow the instructions found here: aws.amazon.com/ec2/getting-started. Make sure to choose a Linux server instance of size "micro" or larger.

Once the instance is set up, follow the instructions here for testing: docs.aws.amazon.com/transcribe/latest/dg/getting-started.html

Pytorch-Kaldi

The following steps are done on the AWS instance.

First, install Kaldi, following the Kaldi installation directions.
Next, install PyTorch, following the PyTorch installation directions.
Then, install PyTorch-Kaldi, following the PyTorch-Kaldi installation directions.

After this, the instance should be set up with the default training model.

Last, we will need training data. Any data set with around 1000 hours of English data is sufficient. The one used for this project is LibriSpeech.

Installing

To run the project, navigate to the directory the project folder is in. Example:

$ cd (your path here)/S_PROJECT

Then run the following command to open the UI.

$ python3 senior_project_app.py

Press the Record button. The terminal will print the names of your system's microphones. Copy the first name that appears and paste it in place of the default microphone name in the initializer in audio.py. Example:

def __init__(self, mic_name="Intel 82801AA-ICH: - (hw:0,0)"):
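
If you would rather look up the microphone name without opening the UI, the SpeechRecognition library can list the available input devices directly. This is a minimal sketch, not the code in audio.py: the helper names list_microphones and find_microphone_index are hypothetical, and it assumes the name you paste into audio.py is matched against the names this call returns.

```python
def find_microphone_index(names, wanted):
    """Return the index of the first device whose name contains `wanted`."""
    for index, name in enumerate(names):
        if wanted in name:
            return index
    return None


def list_microphones():
    """Return the device names reported by SpeechRecognition (via PyAudio)."""
    # Imported here so the helper above works without the library installed.
    import speech_recognition as sr

    return sr.Microphone.list_microphone_names()


if __name__ == "__main__":
    try:
        for index, name in enumerate(list_microphones()):
            print(index, name)
    except ImportError:
        print("SpeechRecognition is not installed")
```

The printed index can be passed to sr.Microphone(device_index=...) if you want to select a device by position instead of by name.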

Running Tests

Open the UI again and select your settings with the check boxes. The controls behave as follows.

Speech to Text - Shows the transcription of what you said in the transcription box.
Filter Profanity - Prints to the terminal whether profanity filtering was selected.
Volume - Adjusts the playback volume of the media player when you press the Play button.
Record - Listens to the microphone for speech input. The terminal shows "Say something" and then prints either what you said or a message that it could not understand the audio. After either result appears, you can press the Stop button. Depending on your settings, the output will show in the UI.
Play - Plays back what you said and creates a RAW file to send to the back end.

Back-end tests can be found in the SENIOR_PROJECT and pytorch-kaldi folders once the binaries have been recompiled on the AWS instance.
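
For reference, the record-and-transcribe flow described above can be sketched with the SpeechRecognition library as follows. This is an illustration under stated assumptions, not the project's audio.py: the function name record_once, the default file name recording.raw, and the choice of recognize_google as the recognizer are all assumptions, and the sr parameter exists only so the function can be exercised without a microphone.

```python
def record_once(raw_path="recording.raw", sr=None):
    """Record one utterance, save it as RAW audio, and return the transcript.

    Returns None if the speech could not be understood or the
    recognition service could not be reached.
    """
    if sr is None:
        # Installed in Prerequisites; imported lazily on purpose.
        import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something")
        audio = recognizer.listen(source)

    # RAW means headerless PCM samples, written exactly as captured.
    with open(raw_path, "wb") as raw_file:
        raw_file.write(audio.get_raw_data())

    try:
        # Assumption: the README does not say which recognizer is used.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as error:
        print("Recognition service error:", error)
    return None
```

Calling record_once() with no arguments uses the default system microphone and writes recording.raw to the current directory.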

Built With

wxPython - For GUI
PyAudio - For microphone input and functions
SpeechRecognition - For the front end's speech recognition
python-vlc - For the media player
PortAudio - Audio API
Libcurl - URL transfer library for AWS Instance connection
nghttp2 - HTTP/2 implementation used by libcurl for efficiency and for multiplexing requests over TCP between the AWS instance and the local project
Kaldi - Speech recognition toolkit, used to handle finite-state transducers (FSTs)
PyTorch - Open source machine learning framework
PyTorch-Kaldi - DNN/RNN hybrid speech recognition system.

Authors

Arturo Reyes - Front-end and UI/UX, audio.py, and senior_project_app.py
Michelle Roque - Front-end functionality, senior_project_app2.py
Hangbo Gu - Back-end, audio-capture.cpp and main.cpp
David Tran - Back-end, AWS Instance, pytorch-kaldi (custom)

Acknowledgements

wxPython Tutorial from (http://zetcode.com/wxpython/)
SpeechRecognition Tutorial for audio module from (https://pypi.org/project/SpeechRecognition/)
Dan Povey from (https://danielpovey.com)
README template from here
M. Ravanelli, T. Parcollet, Y. Bengio, "The PyTorch-Kaldi Speech Recognition Toolkit", arXiv
