Speech Vision

Speech Vision (SV) is a dysarthric automatic speech recognition (ASR) system. It takes a novel approach to dysarthric ASR: speech features are extracted visually, and SV learns to see the shape of the words pronounced by dysarthric individuals.
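To give a rough idea of what "visual" speech features can look like, here is a minimal sketch that turns a waveform into a spectrogram, a 2D time-by-frequency grid that can be treated like an image. This is a generic illustration only: the function name, frame sizes, and synthetic test tone are assumptions made for this example, and it is not claimed to be the feature pipeline used by SV (see the paper and notebooks for that).

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each a list of magnitudes per frequency bin.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    bins = frame_size // 2 + 1  # keep non-negative frequencies only
    image = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(bins):
            # Naive DFT: fine for a demo, far too slow for real audio.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            row.append(abs(acc))
        image.append(row)
    return image


# Synthetic input: a 1 kHz sine sampled at 8 kHz (made-up data, not UA Speech).
rate, freq = 8000, 1000
tone = [math.sin(2 * math.pi * freq * t / rate) for t in range(256)]
image = spectrogram(tone)  # rows are time frames, columns are frequency bins
```

Each row of `image` is one time frame and each column one frequency bin, so the result can be fed to an image model in the same way as a grayscale picture. For real recordings a library FFT would be the practical choice.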

There are two folders:

DysarthricCNNRezaTransferLearning includes the models trained with normal speech.
DysarthricCNNRezaTransferLearningSD includes the transfer learning and dysarthric models. This is where you re-train the models with additional synthetically generated data. Start with the “Train_Test.ipynb” notebook.

Please note: if you get errors when running or testing the pre-trained models, the trained models are not compatible with your CPU/GPU, and both the control and dysarthric models need to be retrained from scratch.

To set up the environment, create a Python 3.6 environment and install Jupyter Notebook. Then run the “Install packages.ipynb” notebook from that environment; it installs the required packages for you.
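Before opening the notebooks, it can help to confirm that the active interpreter matches what this README asks for. The following is a small sanity-check sketch, not part of the repository: the function name `check_environment` is hypothetical, and it only checks the Python version and whether the `notebook` package (the import name of Jupyter Notebook) can be found.

```python
import importlib.util
import sys

# Assumption: the README asks for Python 3.6; adjust if your setup differs.
REQUIRED = (3, 6)


def check_environment(version_info=sys.version_info, required=REQUIRED):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(version_info[:2]) != required:
        problems.append(
            "Python {}.{} expected, found {}.{}".format(
                required[0], required[1], version_info[0], version_info[1]
            )
        )
    # "notebook" is the import name of the Jupyter Notebook package.
    if importlib.util.find_spec("notebook") is None:
        problems.append("Jupyter Notebook is not installed in this environment")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it with the interpreter from the environment you created. If it prints nothing, the version and Jupyter checks passed; the remaining packages are still installed by the “Install packages.ipynb” notebook.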

The dysarthric speech samples are from UA Speech (http://www.isle.illinois.edu/sst/data/UASpeech/).

Speech Vision's paper is available at https://ieeexplore.ieee.org/document/9419963. Please cite the paper if you use this repo.

Also see: https://github.com/rshahamiri/SpeechVisionResidualSeparable for the updated model information.
