A bachelor's thesis for ranking articulation disparity using 3D facial features generated by deep learning models
- Python
- Conda
- Run git clone on this repository
- Run git submodule init and git submodule update to initialize the submodules
- Create conda environment voca & resolve dependencies from voca directory
- Create conda environment deca & install dependencies from deca directory
- Create conda environment autoeditor & run pip install auto-editor
- Create conda environment pyqt & run pip install pyqt5
- Activate pyqt environment and execute speech app
- The user provides a video as input
- The frame rate of the input video is changed to 24 fps
- Silent parts are removed
- The duration is adjusted
- Audio is extracted from the video
- The video is converted to frames
- The frames are converted to 3D meshes
- The 3D meshes are compared with the standard
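The frame rate, audio extraction and frame extraction steps above can be sketched as ffmpeg invocations. This is a minimal sketch, not the repository's actual code: it assumes ffmpeg is on the PATH, and the function names and the file names (resampled.mp4, audio.wav, frames) are hypothetical. Silence removal and duration adjustment are left out. The functions only build the command lines; nothing is executed.

```python
from __future__ import annotations

import shlex
from pathlib import Path

TARGET_FPS = 24  # frame rate named in the pipeline steps


def fps_command(src: Path, dst: Path, fps: int = TARGET_FPS) -> list[str]:
    """Re-encode src so that the output video has a frame rate of fps."""
    return ["ffmpeg", "-y", "-i", str(src), "-r", str(fps), str(dst)]


def audio_command(src: Path, dst: Path) -> list[str]:
    """Drop the video stream (-vn) and write the audio track to dst."""
    return ["ffmpeg", "-y", "-i", str(src), "-vn", str(dst)]


def frames_command(src: Path, frames_dir: Path) -> list[str]:
    """Write every frame of src as a numbered PNG inside frames_dir."""
    pattern = frames_dir / "%06d.png"
    return ["ffmpeg", "-y", "-i", str(src), str(pattern)]


def build_commands(video: Path, work_dir: Path) -> list[list[str]]:
    """Return the three commands in pipeline order (hypothetical file names)."""
    resampled = work_dir / "resampled.mp4"
    return [
        fps_command(video, resampled),
        audio_command(resampled, work_dir / "audio.wav"),
        frames_command(resampled, work_dir / "frames"),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in build_commands(Path("input.mp4"), Path("work")):
        print(shlex.join(cmd))
```

To actually run a command, pass it to subprocess.run(cmd, check=True) after creating the work and frames directories; ffmpeg does not create missing output directories itself.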
By default, this repository contains only one standard stream. If you wish to add more standard words, perform the following steps:
- Make sure selenium is installed
- Place the words you want to scrape from the online dictionary in the words.txt file
- Run audio_scraper.py
- Place the scraped mp3 files into the standard audios folder
- Activate voca environment
- Run preprocess_audios.py
- Run generate_vertices.py
- You should see your standard's 3D meshes generated by the VOCA model
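Before running the preprocessing scripts, it can help to check that every word in words.txt has a scraped audio file. The helper below is a hypothetical sketch and not part of the repository. It assumes that words.txt holds one word per line and that each scraped file is named after its word with an .mp3 extension; neither assumption is confirmed by the scraper.

```python
from __future__ import annotations

from pathlib import Path


def read_words(words_file: Path) -> list[str]:
    """Return the non-empty lines of words_file, stripped (assumes one word per line)."""
    text = words_file.read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def missing_audios(words: list[str], audio_dir: Path) -> list[str]:
    """Return the words without a matching <word>.mp3 in audio_dir (assumed naming)."""
    available = {p.stem.lower() for p in audio_dir.glob("*.mp3")}
    return [w for w in words if w.lower() not in available]
```

Calling missing_audios(read_words(Path("words.txt")), audio_dir), with audio_dir pointing at the standard audios folder, returns the words that still need to be scraped. An empty list means every word has an audio file.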
