Skip to content

Latest commit

 

History

History
187 lines (158 loc) · 9.72 KB

RESULTS.md

File metadata and controls

187 lines (158 loc) · 9.72 KB

Conformer-Transducer with auxiliary task (CTC weight = 0.5)

Environments

  • Same as RNN-Transducer (see below)

Config files

  • preprocess config: conf/specaug.yaml
  • train config: conf/tuning/transducer/train_conformer-rnn_transducer_aux_ngpu4.yaml
  • lm config: - (LM was not used)
  • decode config: conf/tuning/transducer/decode_default.yaml
  • ngpu: 4

Results (CER)

dataset Snt Wrd Corr Sub Del Ins Err S.Err
decode_dev_decode_default 14326 205341 95.8 4.0 0.2 0.1 4.3 33.6
decode_test_decode_default 7176 104765 95.3 4.4 0.2 0.1 4.8 36.3

Conformer-Transducer

Environments

  • Same as RNN-Transducer (see below)

Config files

  • preprocess config: conf/specaug.yaml
  • train config: conf/tuning/transducer/train_conformer-rnn_transducer.yaml
  • lm config: - (LM was not used)
  • decode config: conf/tuning/transducer/decode_default.yaml

Results (CER)

dataset Snt Wrd Corr Sub Del Ins Err S.Err
decode_dev_decode_default 14326 205341 95.6 4.2 0.2 0.1 4.5 34.0
decode_test_decode_default 7176 104765 95.0 4.7 0.3 0.1 5.0 37.1

RNN-Transducer with auxiliary task (CTC weight = 0.1)

Environments

  • Same as RNN-Transducer (see below)

Config files

  • preprocess config: conf/specaug.yaml
  • train config: conf/tuning/transducer/train_transducer_aux.yaml
  • lm config: - (LM was not used)
  • decode config: conf/tuning/transducer/decode_default.yaml

Results (CER)

dataset Snt Wrd Corr Sub Del Ins Err S.Err
decode_dev_decode_default 14326 205341 93.9 5.8 0.3 0.1 6.3 41.9
decode_test_decode_default 7176 104765 93.2 6.5 0.4 0.1 6.9 44.5

RNN-Transducer

Environments

  • date: Thu May 20 05:29:03 UTC 2021
  • python version: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
  • espnet version: espnet 0.9.8
  • chainer version: chainer 6.0.0
  • pytorch version: pytorch 1.6.0
  • Git hash: 95b3008cdcc2247e781a048bc999243dc7f45fe7
    • Commit date: Sat Mar 6 00:48:29 2021 +0000

Config files

  • preprocess config: conf/specaug.yaml
  • train config: conf/tuning/transducer/train_transducer.yaml
  • lm config: - (LM was not used)
  • decode config: conf/tuning/transducer/decode_default.yaml

Results (CER)

dataset Snt Wrd Corr Sub Del Ins Err S.Err
decode_dev_decode_default 14326 205341 93.8 5.9 0.3 0.1 6.3 42.0
decode_test_decode_default 7176 104765 92.9 6.7 0.3 0.1 7.2 45.9

Conformer (kernel size = 15) + SpecAugment + LM weight = 0.0 result

exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_dev_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.6       36.0   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_test_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   95.0        4.9         0.1        0.1        5.1       38.6   |

Conformer (kernel size = 31) + SpecAugment + LM weight = 0.0 result

  • training config file: conf/tuning/train_pytorch_conformer_kernel31.yaml
  • preprocess config file: conf/specaug.yaml
  • decoding config file: conf/decode.yaml, set lm-weight = 0.0
exp/train_sp_pytorch_train_pytorch_conformer_kernel31_specaug/decode_dev_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.7       36.2   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel31_specaug/decode_test_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   94.9        5.0         0.1        0.1        5.2       39.1   |

Conformer (kernel size = 31) result

  • training config file: conf/tuning/train_pytorch_conformer_kernel31.yaml
  • decoding config file: conf/decode.yaml
exp/train_sp_pytorch_train_pytorch_conformer_kernel31/decode_dev_decode/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.9        5.0        0.1        0.1        5.2       38.3   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel31/decode_test_decode/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   94.2        5.4         0.2        0.1        5.8       41.0   |

Transformer result (default transformer with initial learning rate = 1.0 and epochs = 50)

  • Environments (obtained by $ get_sys_info.sh)
    • date: Mon Jun 10 12:34:41 EDT 2019
    • system information: Linux b14 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
    • python version: Python 3.7.3
    • espnet version: espnet 0.3.1
    • chainer version: chainer 6.0.0
    • pytorch version: pytorch 1.0.1.post2
    • Git hash: 82e9b7eb7ccae61e11af28981734ea1c2b315a98
  • Model files (archived to model.v1.tar.gz by $ pack_model.sh)
    • model link: https://drive.google.com/open?id=1BIQBpLRRy3XSMT5IRxnLcgLMirGzu8dg
    • training config file: conf/train.yaml
    • decoding config file: conf/decode.yaml
    • cmvn file: data/train_sp/cmvn.ark
    • e2e file: exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/results/model.last10.avg.best
    • e2e JSON file: exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/results/model.json
    • lm file: exp/train_rnnlm_pytorch_lm/rnnlm.model.best
    • lm JSON file: exp/train_rnnlm_pytorch_lm/model.json
  • Results (paste them by yourself or obtained by $ pack_model.sh --results <results>)
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.1        5.7        0.2        0.1        6.0       42.0   |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   93.4        6.4         0.2        0.1        6.7       45.1   |

First result (no tuning, but already very good. cf. Kaldi chain best 7.43% and nnet3 8.64% while ESPnet 8.0%)

exp/train_sp_pytorch_no_patience/decode_dev_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.6_rnnlm0.3_2layer_unit650_sgd_bs64/result.txt:
|    SPKR       |     # Snt         # Wrd     |    Corr            Sub           Del           Ins            Err         S.Err     |
|    Sum/Avg    |    14326         205341     |    93.3            6.5           0.2           0.1            6.8          45.2     |
exp/train_sp_pytorch_no_patience/decode_test_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.6_rnnlm0.3_2layer_unit650_sgd_bs64/result.txt:
|    SPKR       |     # Snt         # Wrd     |     Corr           Sub            Del           Ins            Err         S.Err     |
|    Sum/Avg    |     7176         104765     |     92.2           7.6            0.2           0.2            8.0          50.2     |

Ngram related

  • decoding with ngram and RNNLM
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm0.7_4gramfull0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326      205341  |   94.1        5.7        0.2        0.1        6.0      41.7    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4gramfull0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   93.5        6.3        0.2        0.1        6.6      44.6    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm0.7_4grampart0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326      205341  |   94.1        5.7        0.2        0.1        6.0      41.7    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4grampart0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   93.5        6.3        0.2        0.1        6.6      44.6    |
  • only e2e model
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326       205341 |   93.6        6.2        0.2        0.1        6.5      45.6    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   92.7        7.1        0.2        0.1        7.4      49.8    |