This project is built to help people with hearing disabilities. The idea is to provide real-time translation based on mouth movement and the accompanying sound.
```mermaid
flowchart TD;
    Image-->Haarcascade
    Sound-->Librosa
    Haarcascade-->CNN
    Librosa-->CNN
```
Haar cascade and librosa are used here for preprocessing: the Haar cascade detects the mouth region in a face image, while librosa extracts acoustic features, such as Mel-frequency cepstral coefficients (MFCCs), from audio waveforms.
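As a rough illustration of this preprocessing, the sketch below crops a mouth region with an OpenCV Haar cascade and computes MFCCs with librosa. The cascade file name, input file names, and MFCC parameters are assumptions for illustration, not the repository's actual values.

```python
# Minimal preprocessing sketch; file names and parameters are assumptions.
import cv2
import librosa

# Mouth detection with a Haar cascade (cascade file name is hypothetical)
mouth_cascade = cv2.CascadeClassifier("haarcascade_mcs_mouth.xml")
gray = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in mouth_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
    mouth_crop = gray[y:y + h, x:x + w]  # region passed on to the image CNN

# Acoustic feature extraction (n_mfcc=13 is a common choice, assumed here)
waveform, sr = librosa.load("vowel.wav", sr=None)
mfccs = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=13)  # shape: (13, n_frames)
```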
Note: the trained models are currently limited to vowels, and the image and voice recognition models are separate.
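For orientation, a vowel classifier over mouth crops could look like the Keras sketch below. This is only an assumed shape of such a model: the input size, layer choices, and use of Keras are guesses, not the architecture actually trained in this repository.

```python
# Hypothetical image-branch CNN for five vowel classes; every layer choice
# here is an assumption, not the repository's trained architecture.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),        # grayscale mouth crop (size assumed)
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(5, activation="softmax"),  # one output per vowel class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```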
To get started, clone the repository and move into it:

```bash
git clone https://github.com/dipp-12/teman-disabilitas
cd teman-disabilitas
```
A virtual environment creates an isolated Python environment with its own installation directories that doesn't share libraries with other environments or the system Python installation, which helps avoid dependency errors. Create one with:

```bash
python -m venv venv
```
You can then activate the environment with the following command:
- For Linux and macOS:

  ```bash
  source venv/bin/activate
  ```

- For Windows:

  ```bash
  venv\Scripts\activate
  ```
Install the required dependencies:

```bash
pip install -r requirements.txt
```

Then run the Flask app:

```bash
flask run
```
Note: voice recordings need to be uploaded manually.
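As a hedged sketch of what that manual upload could look like from a script, assuming the app exposes an upload route (the `/predict` path and `file` field name below are assumptions, not confirmed routes of this app):

```python
# Hypothetical upload of a recorded vowel to the locally running Flask app;
# the "/predict" route and "file" field name are assumptions.
import requests

with open("vowel.wav", "rb") as f:
    response = requests.post("http://127.0.0.1:5000/predict", files={"file": f})
print(response.text)
```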
There is also a web version of the Flask app, but it is still limited to image recognition: https://dipp-12.github.io/teman-disabilitas/