Update result for full libri + GigaSpeech using transducer_stateless. #231

csukuangfj · 2022-03-01T08:46:17Z

This PR provides the WER for full libri + GigaSpeech. See #213 for more details.

The following tables compare the WERs with and without using multiple datasets:

Baseline (without using multiple datasets)

Time per epoch (~2 hours 46 minutes, using 4 GPUs)

| decoding method | test-clean | test-other | comment |
|---|---|---|---|
| greedy search (max sym per frame 1) | 2.67 | 6.67 | --epoch 63, --avg 19, --max-duration 100 |
| modified beam search (beam size 4) | 2.67 | 6.57 | --epoch 63, --avg 19, --max-duration 100 |

(tensorboard log: https://tensorboard.dev/experiment/qgvWkbF2R46FYA6ZMNmOjA/#scalars)

With multiple datasets (--giga-prob 0.2)

| decoding method | test-clean | test-other | comment |
|---|---|---|---|
| greedy search (max sym per frame 1) | 2.64 | 6.55 | --epoch 39, --avg 15, --max-duration 100 |
| modified beam search (beam size 4) | 2.61 | 6.46 | --epoch 39, --avg 15, --max-duration 100 |
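
For a quick read of the two result sets, the relative WER reduction from adding GigaSpeech can be computed directly from the reported numbers. This is only an illustrative sketch: the dictionary layout and the helper name `relative_reduction` are not part of the PR, and the two systems were decoded at different epochs, so the comparison is approximate.

```python
# WERs (%) copied from the tables above: (test-clean, test-other).
baseline = {
    "greedy search": (2.67, 6.67),
    "modified beam search": (2.67, 6.57),
}
multi_dataset = {
    "greedy search": (2.64, 6.55),
    "modified beam search": (2.61, 6.46),
}


def relative_reduction(old: float, new: float) -> float:
    """Relative WER reduction in percent (hypothetical helper)."""
    return 100.0 * (old - new) / old


reductions = {}
for method, (base_clean, base_other) in baseline.items():
    new_clean, new_other = multi_dataset[method]
    reductions[method] = (
        relative_reduction(base_clean, new_clean),
        relative_reduction(base_other, new_other),
    )
    clean_rel, other_rel = reductions[method]
    print(f"{method}: test-clean {clean_rel:.2f}%, test-other {other_rel:.2f}%")
```

All four relative reductions come out between roughly 1% and 2.5%.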

(tensorboard log: https://tensorboard.dev/experiment/xmo5oCgrRVelH9dCeOkYBg/)

Time per epoch (~4 hours 15 minutes, using 4 GPUs)
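
To put the two per-epoch times in perspective, the rough total wall-clock cost can be estimated by multiplying them by the epoch numbers used for decoding. This is a back-of-the-envelope sketch, assuming the `--epoch` values (63 and 39) approximate the number of epochs trained and that the time per epoch stays constant; `total_minutes` is a hypothetical helper.

```python
def total_minutes(epochs: int, hours: int, minutes: int) -> int:
    """Total training time in minutes for a fixed per-epoch duration."""
    return epochs * (hours * 60 + minutes)


# Baseline: ~2 h 46 min per epoch, decoded at epoch 63.
baseline_min = total_minutes(63, 2, 46)
# Multiple datasets: ~4 h 15 min per epoch, decoded at epoch 39.
multi_min = total_minutes(39, 4, 15)

print(f"baseline:          {baseline_min / 60:.1f} hours")
print(f"multiple datasets: {multi_min / 60:.1f} hours")
print(f"ratio:             {multi_min / baseline_min:.2f}")
```

Under these assumptions the two runs cost about the same overall (roughly 174 vs. 166 hours on 4 GPUs), even though each multi-dataset epoch is about 1.5 times longer.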

The training time per epoch increases because more data is used in training. However, the model converges in fewer epochs (39 vs. 63). Lowering the probability of selecting data from GigaSpeech would reduce the training time, but more experiments are needed to see how that affects the WER.
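
As a rough illustration of what a selection probability such as `--giga-prob 0.2` means, the sketch below draws the source of each batch at random, picking GigaSpeech with probability 0.2 and LibriSpeech otherwise. It is not the recipe's actual training loop: the function name `pick_sources`, the labels, and the per-batch granularity are assumptions made for the example.

```python
import random


def pick_sources(num_batches: int, giga_prob: float, seed: int = 0) -> list:
    """Choose a data source for each batch (hypothetical helper).

    Each batch comes from "giga" with probability giga_prob and from
    "libri" with probability 1 - giga_prob.
    """
    rng = random.Random(seed)
    return rng.choices(
        ["libri", "giga"],
        weights=[1.0 - giga_prob, giga_prob],
        k=num_batches,
    )


sources = pick_sources(num_batches=10_000, giga_prob=0.2)
giga_fraction = sources.count("giga") / len(sources)
print(f"fraction of batches from GigaSpeech: {giga_fraction:.3f}")
```

With a smaller `giga_prob`, fewer batches come from GigaSpeech, which is why lowering it is expected to shorten each epoch.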

danpovey · 2022-03-01T08:52:24Z

Cool!
Hopefully it will give more improvement in situations where we have less training data available (or where the model is larger).

Update result for full libri + GigaSpeech using transducer_stateless.

cc2628f

csukuangfj added the ready label Mar 1, 2022

csukuangfj merged commit 05cb297 into k2-fsa:master Mar 1, 2022

csukuangfj deleted the update-results-2 branch March 1, 2022 09:01
