Conformer-Transducer with auxiliary task (CTC weight = 0.5)

Environments

Same as RNN-Transducer (see below)

Config files

preprocess config: conf/specaug.yaml
train config: conf/tuning/transducer/train_conformer-rnn_transducer_aux_ngpu4.yaml
lm config: - (LM was not used)
decode config: conf/tuning/transducer/decode_default.yaml
ngpu: 4

Results (CER)

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
decode_dev_decode_default	14326	205341	95.8	4.0	0.2	0.1	4.3	33.6
decode_test_decode_default	7176	104765	95.3	4.4	0.2	0.1	4.8	36.3
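
The columns appear to follow the usual sclite convention (an assumption, since the scoring tool is not named in this section): Err = Sub + Del + Ins and Corr = 100 - Sub - Del, all as percentages of the reference tokens. A minimal Python sketch that checks the two rows above against those identities, allowing for rounding to one decimal:

```python
# Rows copied from the table above: (name, Corr, Sub, Del, Ins, Err).
ROWS = [
    ("decode_dev_decode_default", 95.8, 4.0, 0.2, 0.1, 4.3),
    ("decode_test_decode_default", 95.3, 4.4, 0.2, 0.1, 4.8),
]

# Each published value is rounded to one decimal, so an identity can be off
# by up to 0.05 per term; 0.2 covers four rounded terms.
TOLERANCE = 0.2 + 1e-9


def is_consistent(corr, sub, dele, ins, err):
    """Check Err = Sub + Del + Ins and Corr = 100 - Sub - Del within rounding."""
    err_ok = abs((sub + dele + ins) - err) <= TOLERANCE
    corr_ok = abs((100.0 - sub - dele) - corr) <= TOLERANCE
    return err_ok and corr_ok


for name, corr, sub, dele, ins, err in ROWS:
    status = "consistent" if is_consistent(corr, sub, dele, ins, err) else "inconsistent"
    print(name, status)
```

The test row sums to 4.7 rather than the reported 4.8, which is within what rounding of the individual columns can produce.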

Conformer-Transducer

Environments

Same as RNN-Transducer (see below)

Config files

preprocess config: conf/specaug.yaml
train config: conf/tuning/transducer/train_conformer-rnn_transducer.yaml
lm config: - (LM was not used)
decode config: conf/tuning/transducer/decode_default.yaml

Results (CER)

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
decode_dev_decode_default	14326	205341	95.6	4.2	0.2	0.1	4.5	34.0
decode_test_decode_default	7176	104765	95.0	4.7	0.3	0.1	5.0	37.1

RNN-Transducer with auxiliary task (CTC weight = 0.1)

Environments

Same as RNN-Transducer (see below)

Config files

preprocess config: conf/specaug.yaml
train config: conf/tuning/transducer/train_transducer_aux.yaml
lm config: - (LM was not used)
decode config: conf/tuning/transducer/decode_default.yaml

Results (CER)

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
decode_dev_decode_default	14326	205341	93.9	5.8	0.3	0.1	6.3	41.9
decode_test_decode_default	7176	104765	93.2	6.5	0.4	0.1	6.9	44.5

RNN-Transducer

Environments

date: Thu May 20 05:29:03 UTC 2021
python version: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
espnet version: espnet 0.9.8
chainer version: chainer 6.0.0
pytorch version: pytorch 1.6.0
Git hash: 95b3008cdcc2247e781a048bc999243dc7f45fe7
Commit date: Sat Mar 6 00:48:29 2021 +0000

Config files

preprocess config: conf/specaug.yaml
train config: conf/tuning/transducer/train_transducer.yaml
lm config: - (LM was not used)
decode config: conf/tuning/transducer/decode_default.yaml

Results (CER)

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
decode_dev_decode_default	14326	205341	93.8	5.9	0.3	0.1	6.3	42.0
decode_test_decode_default	7176	104765	92.9	6.7	0.3	0.1	7.2	45.9
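
To put the four transducer systems on one scale, the test-set Err values above can be expressed as relative reductions against the plain RNN-Transducer. A small Python sketch; the helper name `relative_reduction` is illustrative and not part of ESPnet:

```python
# Test-set Err (%) copied from the four transducer tables above.
TEST_ERR = {
    "Conformer-Transducer + aux (CTC weight = 0.5)": 4.8,
    "Conformer-Transducer": 5.0,
    "RNN-Transducer + aux (CTC weight = 0.1)": 6.9,
    "RNN-Transducer": 7.2,
}


def relative_reduction(baseline, system):
    """Relative error reduction of `system` over `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


baseline = TEST_ERR["RNN-Transducer"]
for name, err in TEST_ERR.items():
    reduction = relative_reduction(baseline, err)
    print(f"{name}: {err:.1f}% Err, {reduction:.1f}% relative reduction")
```

Because the inputs are rounded to one decimal, the relative figures are only indicative.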

Conformer (kernel size = 15) + SpecAugment + LM weight = 0.0 result

training config file: conf/tuning/train_pytorch_conformer_kernel15.yaml
preprocess config file: conf/specaug.yaml
decoding config file: conf/decode.yaml, with lm-weight set to 0.0
model link: https://drive.google.com/file/d/1pOhwj6JFqVyt5quW7BKWfJ3vfPFRoxpQ/view?usp=sharing

exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_dev_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.6       36.0   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_test_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   95.0        4.9         0.1        0.1        5.1       38.6   |
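
The `Sum/Avg` rows in these result.txt excerpts are easy to read programmatically. A minimal Python sketch, assuming the three-cell pipe layout shown above (label, counts, scores); `parse_sum_avg` is a hypothetical helper, not an ESPnet function:

```python
FIELDS = ("corr", "sub", "del", "ins", "err", "s_err")


def parse_sum_avg(line):
    """Parse one '| Sum/Avg | <Snt> <Wrd> | <Corr> ... <S.Err> |' row."""
    # Drop the outer pipes, then split into label, counts, and scores cells.
    cells = line.strip().strip("|").split("|")
    label, counts, scores = (cell.split() for cell in cells)
    if label != ["Sum/Avg"]:
        raise ValueError(f"not a Sum/Avg row: {line!r}")
    n_snt, n_wrd = (int(x) for x in counts)
    result = {"snt": n_snt, "wrd": n_wrd}
    result.update(zip(FIELDS, (float(x) for x in scores)))
    return result


row = "|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.6       36.0   |"
print(parse_sum_avg(row))
```

Splitting on whitespace inside each cell makes the parser tolerant of the small column-alignment differences between tables.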

Conformer (kernel size = 31) + SpecAugment + LM weight = 0.0 result

training config file: conf/tuning/train_pytorch_conformer_kernel31.yaml
preprocess config file: conf/specaug.yaml
decoding config file: conf/decode.yaml, with lm-weight set to 0.0

exp/train_sp_pytorch_train_pytorch_conformer_kernel31_specaug/decode_dev_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.7       36.2   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel31_specaug/decode_test_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   94.9        5.0         0.1        0.1        5.2       39.1   |

Conformer (kernel size = 31) result

training config file: conf/tuning/train_pytorch_conformer_kernel31.yaml
decoding config file: conf/decode.yaml

exp/train_sp_pytorch_train_pytorch_conformer_kernel31/decode_dev_decode/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.9        5.0        0.1        0.1        5.2       38.3   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel31/decode_test_decode/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   94.2        5.4         0.2        0.1        5.8       41.0   |

Transformer result (default transformer with initial learning rate = 1.0 and epochs = 50)

Environments (obtained by $ get_sys_info.sh)
- date: Mon Jun 10 12:34:41 EDT 2019
- system information: Linux b14 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
- python version: Python 3.7.3
- espnet version: espnet 0.3.1
- chainer version: chainer 6.0.0
- pytorch version: pytorch 1.0.1.post2
- Git hash: 82e9b7eb7ccae61e11af28981734ea1c2b315a98
Model files (archived to model.v1.tar.gz by $ pack_model.sh)
- model link: https://drive.google.com/open?id=1BIQBpLRRy3XSMT5IRxnLcgLMirGzu8dg
- training config file: conf/train.yaml
- decoding config file: conf/decode.yaml
- cmvn file: data/train_sp/cmvn.ark
- e2e file: exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/results/model.last10.avg.best
- e2e JSON file: exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/results/model.json
- lm file: exp/train_rnnlm_pytorch_lm/rnnlm.model.best
- lm JSON file: exp/train_rnnlm_pytorch_lm/model.json
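
The e2e file name `model.last10.avg.best` suggests parameter averaging over the last 10 snapshots; that reading is an assumption, since the averaging step is not described here. A minimal sketch of the idea in plain Python, with toy parameter dictionaries standing in for real checkpoints:

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of parameter dicts that share the same keys."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    averaged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(values) / n for values in columns]
    return averaged


# Toy stand-ins for three snapshots (hypothetical values, not real weights).
snapshots = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [2.0, 4.0], "b": [3.0]},
    {"w": [3.0, 6.0], "b": [6.0]},
]
print(average_checkpoints(snapshots))
```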
Results (pasted manually or obtained by $ pack_model.sh --results <results>)

exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.1        5.7        0.2        0.1        6.0       42.0   |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   93.4        6.4         0.2        0.1        6.7       45.1   |

First result (no tuning, but already very good: ESPnet test CER 8.0%, cf. Kaldi chain best 7.43% and nnet3 8.64%)

exp/train_sp_pytorch_no_patience/decode_dev_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.6_rnnlm0.3_2layer_unit650_sgd_bs64/result.txt:
|    SPKR       |     # Snt         # Wrd     |    Corr            Sub           Del           Ins            Err         S.Err     |
|    Sum/Avg    |    14326         205341     |    93.3            6.5           0.2           0.1            6.8          45.2     |
exp/train_sp_pytorch_no_patience/decode_test_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.6_rnnlm0.3_2layer_unit650_sgd_bs64/result.txt:
|    SPKR       |     # Snt         # Wrd     |     Corr           Sub            Del           Ins            Err         S.Err     |
|    Sum/Avg    |     7176         104765     |     92.2           7.6            0.2           0.2            8.0          50.2     |

Ngram related

Decoding with n-gram and RNNLM

exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm0.7_4gramfull0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.1        5.7        0.2        0.1        6.0       41.7   |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4gramfull0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   93.5        6.3        0.2        0.1        6.6       44.6   |

exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm0.7_4grampart0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.1        5.7        0.2        0.1        6.0       41.7   |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4grampart0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   93.5        6.3        0.2        0.1        6.6       44.6   |

E2E model only

exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   93.6        6.2        0.2        0.1        6.5       45.6   |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   92.7        7.1        0.2        0.1        7.4       49.8   |
