GitHub - madhu1995-oss/Pronunciation-and-Fluency-evaluation-using-machne-learning-and-DeepLearning

Version 0

Fluency

1. Extracted features such as MFCCs, zero-crossing rate, spectral flux, and root mean square energy from the audio dataset.
2. Trained machine learning models (SVM, RF) and deep learning models (MLP, CNN, RNN) for classification, using 70 percent of the data for training and 30 percent for testing.
3. The random forest seemed to perform best: although the other models also reached high accuracy on the test set, they produced many false positives and false negatives, whereas the random forest had only 4 FP and FN samples.
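
The steps above can be sketched with NumPy alone. This is a minimal illustration, not the code from the notebook: the function names, the frame length and hop size, and the mean/std summary are assumptions, and MFCCs are left out because they are usually taken from an audio library such as librosa. The last helper counts false positives and false negatives, the quantity the model comparison relies on.

```python
import numpy as np

def frame_signal(signal, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def frame_features(signal, frame_len=1024, hop=512):
    """Return per-frame [zero-crossing rate, RMS energy, spectral flux]."""
    frames = frame_signal(signal, frame_len, hop)
    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    # Root mean square energy of each frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Spectral flux: size of the change in magnitude spectrum between frames.
    mag = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    flux = np.sqrt(np.sum(np.diff(mag, axis=0) ** 2, axis=1))
    flux = np.concatenate([[0.0], flux])  # the first frame has no predecessor
    return np.column_stack([zcr, rms, flux])

def clip_features(signal):
    """Summarise a clip as the mean and std of its frame-level features."""
    f = frame_features(signal)
    return np.concatenate([f.mean(axis=0), f.std(axis=0)])

def split_70_30(X, y, seed=0):
    """Shuffle, then use 70 percent for training and 30 percent for testing."""
    order = np.random.default_rng(seed).permutation(len(X))
    cut = int(0.7 * len(X))
    return X[order[:cut]], X[order[cut:]], y[order[:cut]], y[order[cut:]]

def fp_fn_counts(y_true, y_pred):
    """Count false positives and false negatives for binary labels 0/1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return fp, fn
```

Reporting the FP and FN counts next to accuracy makes it visible when a model with high accuracy still makes more of one kind of error than another.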

Pronunciation

When a word is mispronounced, its spectrogram differs from the spectrogram of the correct pronunciation of the same word. The difference between the two spectrograms can be computed to detect whether the speaker has mispronounced the word.
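
A rough sketch of this comparison, assuming both recordings are mono NumPy arrays at the same sample rate. The function names, window parameters, and the mean absolute difference are illustrative choices, and truncating both signals to a common length is a crude stand-in for proper time alignment.

```python
import numpy as np

def log_spectrogram(signal, frame_len=512, hop=256):
    """Log-magnitude spectrogram from a short-time Fourier transform."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)  # log compression tames the dynamic range

def spectrogram_distance(spoken, reference):
    """Mean absolute difference between two log spectrograms."""
    n = min(len(spoken), len(reference))  # assumption: truncate to align
    a = log_spectrogram(spoken[:n])
    b = log_spectrogram(reference[:n])
    return float(np.mean(np.abs(a - b)))
```

A distance of zero means the two spectrograms are identical; larger values indicate a larger deviation from the reference pronunciation.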

Grammatical errors

By converting speech to text:
1. Convert the speech to text with autocorrection.
2. Convert the text output back to speech.
3. Compute the dynamic time warping (DTW) similarity between the original speech and the regenerated speech using the fast dynamic time warping algorithm.
4. If the difference exceeds a threshold, the word is mispronounced.
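
Steps 3 and 4 can be illustrated as follows. Note that this sketch uses the classic quadratic-time DTW recurrence rather than the fast (approximate) variant named above, which computes a similar distance more cheaply. The function names, the normalisation by sequence length, and the threshold value are assumptions.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic dynamic time warping distance between two feature sequences.

    x and y are arrays of shape (frames,) or (frames, features).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def is_mispronounced(spoken, reference, threshold):
    """Flag a word when the length-normalised DTW distance exceeds threshold."""
    score = dtw_distance(spoken, reference) / (len(spoken) + len(reference))
    return score > threshold
```

For example, `is_mispronounced([1, 2, 3], [1, 2, 2, 3], threshold=0.1)` returns `False`, because the repeated frame is absorbed by the warping. A suitable threshold would have to be tuned on labelled recordings.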

Version 1

This version requires neither separate methods for pronunciation and grammatical errors nor conversion from speech to text.
1. Convert the actual texts and the perturbed texts to speech, say speech_actual and speech_perturbed.
2. Train a sequence-to-sequence model using speech_perturbed as input and speech_actual as output.
3. To handle two utterances of the same sentence with different sequence lengths, dynamic time warping or a similar method can be used.
4. When the user provides incorrect speech, the system corrects it directly without converting it to text.
5. One can calculate a similarity or difference score between the input voice and the corrected voice using a sequence correlation coefficient.
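
The scoring in step 5 might look like the sketch below. It assumes the two voices are represented as 1-D sequences (for example an energy or pitch contour), brings them to a common length by linear interpolation as one of the "similar methods" to dynamic time warping, and uses the Pearson correlation coefficient as the sequence correlation. All names here are hypothetical.

```python
import numpy as np

def resample_to(seq, length):
    """Linearly interpolate a 1-D sequence to the requested length."""
    seq = np.asarray(seq, dtype=float)
    old_positions = np.linspace(0.0, 1.0, num=len(seq))
    new_positions = np.linspace(0.0, 1.0, num=length)
    return np.interp(new_positions, old_positions, seq)

def similarity_score(input_seq, corrected_seq):
    """Pearson correlation between two sequences after length matching.

    Returns a value in [-1, 1]; values near 1 mean the input voice
    already follows the corrected voice closely.
    """
    n = max(len(input_seq), len(corrected_seq))
    a = resample_to(input_seq, n)
    b = resample_to(corrected_seq, n)
    if a.std() == 0.0 or b.std() == 0.0:
        return 0.0  # correlation is undefined for a constant sequence
    return float(np.corrcoef(a, b)[0, 1])
```

A difference score can be derived as `1 - similarity_score(...)` if a "lower is better" measure is more convenient.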

Version 3

One can use iterative learning, where the models are trained continuously, and federated learning, where training happens on the users' devices so that data privacy is not an issue.
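
To make the federated idea concrete, here is a toy sketch of federated averaging on a linear model. It is only an illustration under stated assumptions: the model, learning rate, number of rounds, and all function names are hypothetical, and a real system would exchange neural network weights instead. Each client trains on its own data, and only the weights are sent back and averaged, so raw recordings never leave the device.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """A few gradient-descent steps on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_average(global_w, client_data, rounds=50):
    """Repeat: clients train locally, the server averages their weights."""
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in client_data]
        sizes = np.array([len(y) for _, y in client_data], dtype=float)
        # Weight each client's model by how much data it trained on.
        global_w = np.average(updates, axis=0, weights=sizes)
    return global_w
```

Running further rounds as new data arrives on the devices gives the continuous, iterative training described above.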

Files (21 commits):
Fast_DTW .ipynb
Fluency (1).ipynb
Readme.md
feat (3).npy
label (3).npy
stt_tts.py
