Zipformer for Common Voice #997

Merged
merged 14 commits into from Apr 17, 2023

@yfyeung commented Apr 12, 2023

Data Preparation

By default, prepare.sh prepares the English (en) dataset of Common Voice Corpus 13.0.

| Version | Date | Size | Recorded Hours | Validated Hours | License | Number of Voices | Audio Format |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Common Voice Corpus 13.0 | 3/15/2023 | 76.39 GB | 3,209 | 2,429 | CC-0 | 86,942 | MP3 |

| Stage | Time | Comment |
| --- | --- | --- |
| Stage 0: Download data | - | Use a machine in America to download the dataset manually |
| Stage 1: Prepare CommonVoice manifest | 50 minutes | num_jobs=8 |
| Stage 2: Prepare musan manifest | Very little | - |
| Stage 3: Preprocess CommonVoice manifest | 5 minutes | - |
| Stage 4: Compute fbank for dev and test subsets of CommonVoice | 2 minutes | 1 GPU |
| Stage 5: Split train subset into 1000 pieces | 10 minutes | - |
| Stage 6: Compute features for train subset of CommonVoice | 3 hours | 4 processes with 2 GPUs |
| Stage 7: Combine features for train | 15 minutes | - |
| Stage 8: Compute fbank for musan | 15 minutes | - |
| Stage 9: Prepare BPE based lang | 5 minutes | - |
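
To get a feel for the overall preparation cost, the timed stages from the table can be added up. This is only a rough sketch: Stage 0 (manual download) and Stage 2 ("Very little") have no numeric time in the table, so they are left out, and the dictionary keys are shortened labels chosen here for readability.

```python
# Stage durations in minutes, copied from the stage table.
# Stage 0 (manual download) and Stage 2 ("Very little") have no numeric
# time in the table, so they are excluded from the sum.
stage_minutes = {
    "Stage 1: Prepare CommonVoice manifest": 50,
    "Stage 3: Preprocess CommonVoice manifest": 5,
    "Stage 4: Compute fbank for dev and test": 2,
    "Stage 5: Split train subset": 10,
    "Stage 6: Compute features for train": 180,  # 3 hours
    "Stage 7: Combine features for train": 15,
    "Stage 8: Compute fbank for musan": 15,
    "Stage 9: Prepare BPE based lang": 5,
}

total_minutes = sum(stage_minutes.values())
hours, minutes = divmod(total_minutes, 60)
print(f"Timed stages: {total_minutes} min (about {hours} h {minutes} min)")

# Find the stage that dominates the timed total.
longest = max(stage_minutes, key=stage_minutes.get)
share = stage_minutes[longest] / total_minutes
print(f"Longest: {longest} ({share:.0%} of the timed total)")
```

The timed stages add up to 282 minutes (about 4 h 42 min), and Stage 6 alone accounts for roughly two thirds of that, so feature extraction for the train subset is the step where extra processes or GPUs matter most.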

Result

| Decoding method | Dev | Test |
| --- | --- | --- |
| greedy search | 9.96 | 12.54 |
| modified beam search | 9.86 | 12.48 |
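
The gap between the two decoding methods is easier to judge as a relative figure. The sketch below assumes the numbers are word error rates in percent (the post does not state the metric explicitly); `relative_reduction` is a helper name introduced here for illustration.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction of `improved` over `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# (greedy search, modified beam search) per subset, copied from the result table.
results = {"Dev": (9.96, 9.86), "Test": (12.54, 12.48)}

for subset, (greedy, beam) in results.items():
    absolute = greedy - beam
    relative = relative_reduction(greedy, beam)
    print(f"{subset}: absolute {absolute:.2f}, relative {relative:.2f}%")
```

Under that assumption, modified beam search with beam size 4 improves on greedy search by about 1% relative on Dev and about 0.5% relative on Test.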

To reproduce the above result, use the following commands for training:

export CUDA_VISIBLE_DEVICES="0,1,2,3"
./pruned_transducer_stateless7/train.py \
  --world-size 4 \
  --num-epochs 30 \
  --start-epoch 1 \
  --use-fp16 1 \
  --exp-dir pruned_transducer_stateless7/exp \
  --max-duration 550

and the following commands for decoding:

# greedy search
./pruned_transducer_stateless7/decode.py \
  --epoch 30 \
  --avg 5 \
  --decoding-method greedy_search \
  --exp-dir pruned_transducer_stateless7/exp \
  --bpe-model data/lang_bpe_500/bpe.model \
  --max-duration 600

# modified beam search
./pruned_transducer_stateless7/decode.py \
  --epoch 30 \
  --avg 5 \
  --decoding-method modified_beam_search \
  --beam-size 4 \
  --exp-dir pruned_transducer_stateless7/exp \
  --bpe-model data/lang_bpe_500/bpe.model \
  --max-duration 600
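
The flags in the training and decoding commands above imply two quantities that are easy to work out by hand. The sketch below assumes that `--max-duration` is a per-GPU cap on the seconds of audio in one batch and that `--avg N` averages over the N most recent epochs ending at `--epoch`. Neither is spelled out in this thread, so treat both as assumptions. The function names are made up for illustration and are not part of icefall.

```python
def audio_seconds_per_step(world_size: int, max_duration: float) -> float:
    """Upper bound on seconds of audio seen per optimizer step across all GPUs.

    Assumes max_duration is a per-GPU cap (an assumption, see the lead-in).
    """
    return world_size * max_duration


def averaged_epochs(epoch: int, avg: int) -> list[int]:
    """Epochs covered by the average, assuming the `avg` latest epochs up to `epoch`."""
    if avg < 1 or avg > epoch:
        raise ValueError("avg must be between 1 and epoch")
    return list(range(epoch - avg + 1, epoch + 1))


# Values taken from the commands above.
train_cap = audio_seconds_per_step(world_size=4, max_duration=550)
print(f"Training: at most {train_cap:.0f} s of audio per step "
      f"(about {train_cap / 60:.1f} min)")
print(f"Decoding: average covers epochs {averaged_epochs(epoch=30, avg=5)}")
```

With the values from the commands, the training cap works out to 2200 seconds of audio per step, and the decoding average covers epochs 26 through 30.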

The pretrained model is available at
https://huggingface.co/yfyeung/icefall-asr-cv-corpus-13.0-2023-03-09-en-pruned-transducer-stateless7-2023-04-17

The tensorboard log for training is available at
https://tensorboard.dev/experiment/j4pJQty6RMOkMJtRySREKw/

@desh2608
Collaborator

BTW it seems that num_jobs > 1 is currently not supported for the CommonVoice Lhotse recipe (see here). It may be worth implementing this option to speed up the manifest creation time.

@csukuangfj
Collaborator

> BTW it seems that num_jobs > 1 is currently not supported for the CommonVoice Lhotse recipe (see here). It may be worth implementing this option to speed up the manifest creation time.

There's an ongoing PR about that

lhotse-speech/lhotse#1025

@yfyeung changed the title from "[WIP] zipformer for Common Voice" to "zipformer for Common Voice" Apr 17, 2023
@@ -90,7 +90,7 @@ def compute_fbank_commonvoice_splits(args):
subset = "train"
Collaborator

Could you please add a RESULTS.md to document the results, pre-trained models, tensorboard logs?

Collaborator Author

Sure

@yfyeung changed the title from "zipformer for Common Voice" to "Zipformer for Common Voice" Apr 17, 2023
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
@@ -0,0 +1,105 @@
# Copyright 2021 Xiaomi Corp. (authors: Fangjun Kuang)
Collaborator

Could you replace it with a symlink?

Collaborator

Also for other files like joiner.py and model.py.

Collaborator Author

Sure.

Collaborator

@csukuangfj left a comment

Thanks!

@yfyeung merged commit 8838fe0 into k2-fsa:master Apr 17, 2023
3 checks passed
@yfyeung deleted the cv branch April 17, 2023 09:48