Update AIShell recipe result #140

Merged: 13 commits into k2-fsa:master, Dec 4, 2021

@pingfengluo (Contributor) commented Dec 4, 2021

1. Update the conformer-mmi result with a slightly better CER: from 5.12 (50 epochs) to 4.94 (85 epochs, as in conformer-ctc training).

2. Fix the training text: only the training-set transcripts in aishell_transcript_v0.8.txt should be used. They are selected by the utterance IDs found under 'data/aishell/data_aishell/wav/train', using the shell snippet below from prepare.sh, the same way Kaldi does it:

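    # Full AISHELL transcript, one "<uid> <words...>" line per utterance
    aishell_text=$dl_dir/aishell/data_aishell/transcript/aishell_transcript_v0.8.txt
    # Utterance IDs of the training set, one per line (derived from the wav file names)
    aishell_train_uid=$dl_dir/aishell/data_aishell/transcript/aishell_train_uid
    find data/aishell/data_aishell/wav/train -name "*.wav" | sed 's/\.wav//g' | awk -F '/' '{print $NF}' > $aishell_train_uid
    # Keep only transcript lines whose uid is in the training set, then drop the uid column
    awk 'NR==FNR{uid[$1]=$1} NR!=FNR{if($1 in uid) print $0}' $aishell_train_uid $aishell_text | cut -d " " -f 2- > $lang_phone_dir/transcript_words.txt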
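To see what the filter does, here is a minimal sketch that runs the same find/awk/cut pipeline on toy data. The temporary directory layout, the placeholder words, and the `work` variable are assumptions made up for illustration; they are not part of prepare.sh. Only the utterance-ID format mimics AISHELL.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy stand-ins for the real AISHELL files (hypothetical paths and contents).
work=$(mktemp -d)
mkdir -p "$work/wav/train/S0002" "$work/wav/test/S0764"
touch "$work/wav/train/S0002/BAC009S0002W0122.wav"
touch "$work/wav/train/S0002/BAC009S0002W0123.wav"
touch "$work/wav/test/S0764/BAC009S0764W0121.wav"

# Transcript covering train and test utterances: "<uid> <word> <word> ..."
cat > "$work/transcript.txt" <<'EOF'
BAC009S0002W0122 a b c
BAC009S0764W0121 d e
BAC009S0002W0123 f g
EOF

# Step 1: collect training utterance IDs from the wav file names.
find "$work/wav/train" -name "*.wav" | sed 's/\.wav$//' | awk -F '/' '{print $NF}' > "$work/train_uid"

# Step 2: while reading the first file (NR==FNR) fill the uid table;
# while reading the second file, print only lines whose first field is a known uid.
# Step 3: cut drops the uid column, leaving only the words.
awk 'NR==FNR{uid[$1]=$1} NR!=FNR{if($1 in uid) print $0}' "$work/train_uid" "$work/transcript.txt" \
  | cut -d " " -f 2- > "$work/transcript_words.txt"

cat "$work/transcript_words.txt"
```

With this toy input the test utterance is dropped and the two training lines remain, in transcript order. Note that `cut -d " "` assumes the uid and the words are separated by a single space.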

@pingfengluo pingfengluo closed this Dec 4, 2021
@pingfengluo pingfengluo reopened this Dec 4, 2021
@pingfengluo (Contributor, Author)

ready for review, thanks @csukuangfj

@csukuangfj (Collaborator)

Thanks! Merging.

@csukuangfj csukuangfj merged commit d1adc25 into k2-fsa:master Dec 4, 2021