
Speech Recognition for Uyghur using Speech Transformer

Training:

This model is trained using CTC loss and cross-entropy loss.
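
A minimal sketch of how a joint CTC and cross-entropy objective can be combined in PyTorch. The tensor shapes, the 0.3/0.7 weighting, and the blank/padding ids are illustrative assumptions, not values taken from train.py:

```python
import torch.nn.functional as F

def joint_loss(enc_logits, dec_logits, targets, input_lengths, target_lengths,
               ctc_weight=0.3, blank_id=0, pad_id=0):
    # CTC branch: enc_logits are frame-level encoder outputs of shape (T, N, vocab)
    ctc = F.ctc_loss(enc_logits.log_softmax(-1), targets,
                     input_lengths, target_lengths,
                     blank=blank_id, zero_infinity=True)
    # Cross-entropy branch: dec_logits are decoder outputs of shape (N, L, vocab),
    # targets are padded token ids of shape (N, L)
    ce = F.cross_entropy(dec_logits.transpose(1, 2), targets, ignore_index=pad_id)
    # Interpolate the two losses; the weight would be set by the training script.
    return ctc_weight * ctc + (1 - ctc_weight) * ce
```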

Download the pretrained model and the dataset.

Unzip results.7z and thuyg20_data.7z into the same folder where the Python source files are located, then run:

python train.py

Recognition:

For recognition, download only the pretrained model, then run:

python .\tonu.py .\test6.wav

The result will be:

        Model loaded: results/UFormer_last.pth
            Best CER: 4.16%
             Trained: 276 epochs
The model has 36,418,306 trainable parameters
 Feature  has 25,869,058 trainable parameters
  Encoder has 4,205,568 trainable parameters
  Decoder has 6,343,680 trainable parameters

======================
Recognizing file .\test6.wav
test6.wav -> u qizlarning resimi chiqip qalsa bilekchila sinchilap qaraytti
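
To transcribe many files, one simple approach is to call the recognition script in a loop. A minimal sketch, assuming tonu.py keeps the single-argument command-line interface shown above:

```python
import subprocess
import sys
from pathlib import Path

def recognize_folder(folder="."):
    """Run tonu.py on every .wav file in a folder (assumes the CLI shown above)."""
    for wav in sorted(Path(folder).glob("*.wav")):
        # Each call reloads the model; acceptable for a handful of files.
        result = subprocess.run(
            [sys.executable, "tonu.py", str(wav)],
            capture_output=True, text=True, check=True,
        )
        print(result.stdout)

if __name__ == "__main__":
    recognize_folder(sys.argv[1] if len(sys.argv) > 1 else ".")
```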

This project uses

THUYG-20, a free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University.

Reference

https://github.com/gentaiscool/end2end-asr-pytorch