grade-pronounce

This program performs pronunciation assessment through asynchronous communication with the Azure Speech SDK.

It supports WAV audio longer than 15 seconds, processes multiple files in one run, and writes each assessment to a CSV file.

Specification

Required

token.json ... example: {"key1":"*********", "region":"eastasia"}
a folder named submit ... contains the scripts (.txt) and voices (.wav)
each voice and its script must have the same file name, like sample.wav and sample.txt
a folder named output ... create it before running; the grades are written to this folder
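The setup above can be sketched as follows. This is only an illustration of the layout described here, not the code in main.py: the function names load_token and pair_submissions are hypothetical, and the sketch assumes token.json holds exactly the key1 and region fields shown in the example.

```python
import json
from pathlib import Path


def load_token(path="token.json"):
    """Read the key and region from token.json (fields as in the example above)."""
    with open(path, encoding="utf-8") as f:
        token = json.load(f)
    return token["key1"], token["region"]


def pair_submissions(submit_dir="submit"):
    """Return (wav_path, txt_path) pairs that share the same file name stem.

    A .wav file without a matching .txt script is skipped (an assumption;
    the real program may handle this case differently).
    """
    pairs = []
    for wav in sorted(Path(submit_dir).glob("*.wav")):
        txt = wav.with_suffix(".txt")
        if txt.exists():
            pairs.append((wav, txt))
    return pairs
```

With the sample files in place, pair_submissions("submit") would return one pair for sample.wav and sample.txt.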

Run

python ./main.py

Result

Azure Cognitive Services grades each voice sentence by sentence. To evaluate the whole paragraph, this program recalculates the scores as follows:

Accuracy score: weighted average of each sentence's accuracy score
Pronunciation score: weighted average of each sentence's pronunciation score
Completeness score: percentage of words whose error_type is None, as opposed to Insertion or Omission
Fluency score: percentage of the time that is actually spoken
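The four recalculations above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the logic in main.py: the README does not say what the weights are, so the example assumes each sentence is weighted by its word count; it also assumes completeness is taken over all listed words, and that fluency is spoken time divided by total time. All function names are hypothetical.

```python
def weighted_average(scores, weights):
    """Weighted mean of per-sentence scores; returns 0.0 if all weights are zero."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / total


def completeness(error_types):
    """Percentage of words whose error_type is "None" (assumed denominator: all words)."""
    if not error_types:
        return 0.0
    correct = sum(1 for e in error_types if e == "None")
    return 100.0 * correct / len(error_types)


def fluency(spoken_seconds, total_seconds):
    """Percentage of the total time that is actually spoken."""
    if total_seconds <= 0:
        return 0.0
    return 100.0 * spoken_seconds / total_seconds


# Made-up numbers for illustration only, not real assessment results.
print(weighted_average([80, 90], [1, 3]))                    # 87.5
print(completeness(["None", "None", "Omission", "Insertion"]))  # 50.0
print(fluency(6.0, 8.0))                                     # 75.0
```

Weighting by word count keeps a short sentence from counting as much as a long one; a different weight, such as sentence duration, would fit the same function.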

Input
- submit/sample.wav: the speaker says "What time is it?"
  ref: Sample Voice
- submit/sample.txt: "What time is it now in Japan?" (deliberate mistake)
Output: grade-sample.csv
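Writing a result file with this naming pattern could look like the sketch below. The column layout (metric, score) and the function name write_grade are assumptions for illustration; the actual columns produced by main.py are not documented here. The sketch expects the output folder to exist already, as stated under Required.

```python
import csv
from pathlib import Path


def write_grade(stem, scores, output_dir="output"):
    """Write scores to <output_dir>/grade-<stem>.csv and return the path.

    The two-column layout is hypothetical, not the format used by main.py.
    """
    out_path = Path(output_dir) / f"grade-{stem}.csv"
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["metric", "score"])
        for metric, score in scores.items():
            writer.writerow([metric, score])
    return out_path
```

Calling write_grade("sample", scores) with a dictionary of the four recalculated scores would produce output/grade-sample.csv.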
