Name		Name	Last commit message	Last commit date
parent directory ..
conformer_ctc		conformer_ctc
conformer_mmi		conformer_mmi
local		local
pruned_transducer_stateless2		pruned_transducer_stateless2
pruned_transducer_stateless3		pruned_transducer_stateless3
pruned_transducer_stateless7		pruned_transducer_stateless7
pruned_transducer_stateless7_bbpe		pruned_transducer_stateless7_bbpe
pruned_transducer_stateless7_streaming		pruned_transducer_stateless7_streaming
tdnn_lstm_ctc		tdnn_lstm_ctc
transducer_stateless		transducer_stateless
transducer_stateless_modified-2		transducer_stateless_modified-2
transducer_stateless_modified		transducer_stateless_modified
whisper		whisper
zipformer		zipformer
README.md		README.md
RESULTS.md		RESULTS.md
prepare.sh		prepare.sh
prepare_aidatatang_200zh.sh		prepare_aidatatang_200zh.sh
shared		shared

README.md

Introduction

Please refer to https://k2-fsa.github.io/icefall/recipes/Non-streaming-ASR/aishell/index.html for how to run models in this recipe.

Aishell is an open-source Chinese Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd. 400 people from different accent areas in China are invited to participate in the recording, which is conducted in a quiet indoor environment using high fidelity microphone and downsampled to 16kHz. The manual transcription accuracy is above 95%, through professional speech annotation and strict quality inspection. The data is free for academic use. We hope to provide moderate amount of data for new researchers in the field of speech recognition.

(From Open Speech and Language Resources)

Transducers

There are various folders containing the name transducer in this folder. The following table lists the differences among them.

	Encoder	Decoder	Comment
`transducer_stateless`	Conformer	Embedding + Conv1d	with `k2.rnnt_loss`
`transducer_stateless_modified`	Conformer	Embedding + Conv1d	with modified transducer from `optimized_transducer`
`transducer_stateless_modified-2`	Conformer	Embedding + Conv1d	with modified transducer from `optimized_transducer` + extra data
`pruned_transducer_stateless3`	Conformer (reworked)	Embedding + Conv1d	pruned RNN-T + reworked model with random combiner + using aidatatang_20zh as extra data
`pruned_transducer_stateless7`	Zipformer	Embedding	pruned RNN-T + zipformer encoder + stateless decoder with context-size set to 1
`zipformer`	Upgraded Zipformer	Embedding + Conv1d	The latest recipe with context-size set to 1

The decoder in transducer_stateless is modified from the paper Rnn-Transducer with Stateless Prediction Network. We place an additional Conv1d layer right after the input embedding layer.

Whisper

Recipe to finetune large pretrained models

	Encoder	Decoder	Comment
`whisper`	Transformer	Transformer	support fine-tuning using deepspeed

Files

ASR

Directory actions

More options

Directory actions

More options

Latest commit

History

ASR

Folders and files

parent directory

Introduction

Transducers

Whisper