Performance Record

This is a Chinese speech recognition recipe that trains jointly on multiple Chinese corpora:

  • Aidatatang (140 hours)
  • Aishell (151 hours)
  • MagicData (712 hours)
  • Primewords (99 hours)
  • ST-CMDS (110 hours)
  • THCHS-30 (26 hours)
  • optional AISHELL2 (~1000 hours) if available
  • optional TAL ASR (~600 hours) if available
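
As a quick sanity check on the amount of training data, the hours listed above can be summed. This is only a sketch: the figures for the two optional corpora are approximate in the list, so the second total is approximate as well, and the variable names are illustrative.

```python
# Hours per corpus, copied from the list above.
core_hours = {
    "Aidatatang": 140,
    "Aishell": 151,
    "MagicData": 712,
    "Primewords": 99,
    "ST-CMDS": 110,
    "THCHS-30": 26,
}
# Optional corpora; the list gives these only as approximate figures.
optional_hours = {"AISHELL2": 1000, "TAL ASR": 600}

core_total = sum(core_hours.values())
full_total = core_total + sum(optional_hours.values())

print(f"core corpora: {core_total} hours")
print(f"with optional corpora: ~{full_total} hours")
```

The six core corpora add up to 1238 hours, and roughly 2838 hours when both optional corpora are available.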

Unified Transformer Result

data info:

aishell results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 4.69 | 4.97 |
| ctc greedy search        | 5.80 | 6.75 |
| ctc prefix beam search   | 5.80 | 6.75 |
| attention rescoring      | 4.64 | 5.37 |

aidatatang results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 4.23 | 4.59 |
| ctc greedy search        | 5.82 | 6.99 |
| ctc prefix beam search   | 5.82 | 6.99 |
| attention rescoring      | 4.71 | 5.29 |

thchs30 results

| decoding mode/chunk size | full  | 16    |
|--------------------------|-------|-------|
| attention decoder        | 16.68 | 17.47 |
| ctc greedy search        | 15.46 | 16.81 |
| ctc prefix beam search   | 15.46 | 16.82 |
| attention rescoring      | 14.38 | 15.63 |

magic results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 2.86 | 3.10 |
| ctc greedy search        | 4.01 | 5.02 |
| ctc prefix beam search   | 4.00 | 5.02 |
| attention rescoring      | 3.07 | 3.68 |

Unified Conformer Result

data info:

  • Aidatatang (140 hours)
  • Aishell (151 hours)
  • MagicData (712 hours)
  • Primewords (99 hours)
  • ST-CMDS (110 hours)
  • THCHS-30 (26 hours)
  • AISHELL2 (~1000 hours)
  • TAL ASR (~600 hours)
  • Feature info: using fbank feature, dither=0, cmvn, speed perturb
  • Training info: lr 0.001, batch size 12, 16 gpu, acc_grad 4, 48 epochs, dither 0.1
  • Decoding info: ctc_weight 0.5, average_num 10
  • Git hash: 18a75244f39b3403ff7b39f791e7af4fb93d4d03
  • Model link:
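
The `ctc_weight 0.5` setting in the decoding info is used by the attention rescoring mode. As a rough sketch, and not necessarily the exact implementation behind these results, rescoring can be thought of as re-ranking the n-best hypotheses from CTC prefix beam search by the attention decoder's log-probability plus `ctc_weight` times the CTC log-probability. The function name `rescore`, the tuple layout, and the toy scores below are hypothetical.

```python
def rescore(hyps, ctc_weight=0.5):
    """Return the best (text, ctc_logp, att_logp) triple.

    Assumption: combined score = att_logp + ctc_weight * ctc_logp.
    """
    return max(hyps, key=lambda h: h[2] + ctc_weight * h[1])


# Toy n-best list with made-up log-probabilities (illustration only).
nbest = [
    ("你好世界", -4.0, -3.5),
    ("你好视界", -3.8, -5.0),
]

best = rescore(nbest)
print(best[0])
```

With a small `ctc_weight` the attention decoder dominates the ranking; a larger weight lets the CTC score decide between hypotheses.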

aishell results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 1.65 | 1.79 |
| ctc greedy search        | 2.14 | 2.79 |
| ctc prefix beam search   | 2.13 | 2.79 |
| attention rescoring      | 1.65 | 2.01 |

aidatatang results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 3.88 | 4.21 |
| ctc greedy search        | 5.39 | 5.88 |
| ctc prefix beam search   | 5.39 | 5.87 |
| attention rescoring      | 4.02 | 4.54 |

thchs30 results

| decoding mode/chunk size | full  | 16    |
|--------------------------|-------|-------|
| attention decoder        | 9.67  | 10.07 |
| ctc greedy search        | 10.94 | 11.95 |
| ctc prefix beam search   | 10.94 | 11.96 |
| attention rescoring      | 9.90  | 10.74 |

magic results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 2.66 | 2.98 |
| ctc greedy search        | 3.18 | 3.96 |
| ctc prefix beam search   | 3.17 | 3.95 |
| attention rescoring      | 2.71 | 3.23 |

aishell-2 results

| decoding mode/chunk size | full | 16   |
|--------------------------|------|------|
| attention decoder        | 5.37 | 5.67 |
| ctc greedy search        | 5.87 | 6.56 |
| ctc prefix beam search   | 5.88 | 6.57 |
| attention rescoring      | 5.27 | 5.78 |

tal results

| decoding mode/chunk size | full  | 16    |
|--------------------------|-------|-------|
| attention decoder        | 10.49 | 11.09 |
| ctc greedy search        | 11.11 | 12.14 |
| ctc prefix beam search   | 11.05 | 12.06 |
| attention rescoring      | 10.49 | 11.32 |
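
To see how much chunk size 16 costs relative to full attention, the attention rescoring rows of the Unified Conformer tables can be compared directly. The sketch below assumes the figures are error rates in percent where lower is better (the tables do not state the metric); the helper name `relative_increase` is illustrative.

```python
# (full, chunk 16) attention rescoring figures from the Unified Conformer tables.
rescoring = {
    "aishell": (1.65, 2.01),
    "aidatatang": (4.02, 4.54),
    "THCHS-30": (9.90, 10.74),
    "magic": (2.71, 3.23),
    "aishell-2": (5.27, 5.78),
    "tal": (10.49, 11.32),
}


def relative_increase(full, chunk):
    """Relative change from full attention to chunked decoding, in percent."""
    return (chunk - full) / full * 100.0


for name, (full, chunk) in rescoring.items():
    delta = relative_increase(full, chunk)
    print(f"{name:12s} {full:6.2f} -> {chunk:6.2f}  ({delta:+.1f}%)")
```

In every test set the chunk size 16 figure is higher than the full attention figure, so the relative change is always positive.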