[WIP] Smaller models #745
base: master
Conversation
These numbers look pretty good. Would love to see some streaming model results too!
Updated the results with a streaming model of 15M parameters.
@pkufool Did you try any experiments with smaller Zipformer models?
I think @yaozengwei tried a smaller Zipformer. Can you share some results, @yaozengwei?
I have trained a smaller version of the merged Zipformer (pruned_transducer_stateless7) on full LibriSpeech for 30 epochs, with model args:

```
--num-encoder-layers 2,2,2,2,2 \
--feedforward-dims 768,768,768,768,768 \
--nhead 8,8,8,8,8 \
--encoder-dims 256,256,256,256,256 \
--attention-dims 192,192,192,192,192 \
--encoder-unmasked-dims 192,192,192,192,192 \
--zipformer-downsampling-factors 1,2,4,8,2 \
--cnn-module-kernels 31,31,31,31,31 \
--decoder-dim 512 \
--joiner-dim 512
```

Number of model parameters: 20697573
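For reference, flags like these would be passed to the recipe's training script roughly as below. This is only a sketch, not the exact command that was run: the non-model flags (`--world-size`, `--num-epochs`, `--exp-dir`, `--full-libri`) are illustrative assumptions.

```shell
# Sketch of a training invocation for the small Zipformer config above.
# The model flags are the ones quoted in the comment; the remaining
# flags are illustrative assumptions, not the exact command used.
./pruned_transducer_stateless7/train.py \
  --world-size 4 \
  --num-epochs 30 \
  --full-libri 1 \
  --exp-dir pruned_transducer_stateless7/exp-small \
  --num-encoder-layers 2,2,2,2,2 \
  --feedforward-dims 768,768,768,768,768 \
  --nhead 8,8,8,8,8 \
  --encoder-dims 256,256,256,256,256 \
  --attention-dims 192,192,192,192,192 \
  --encoder-unmasked-dims 192,192,192,192,192 \
  --zipformer-downsampling-factors 1,2,4,8,2 \
  --cnn-module-kernels 31,31,31,31,31 \
  --decoder-dim 512 \
  --joiner-dim 512
```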
@yaozengwei Thanks for your reply, those numbers look very good! Are you planning to upload the pretrained model to HF?
I have uploaded the pretrained model at https://huggingface.co/Zengwei/icefall-asr-librispeech-pruned-transducer-stateless7-20M-2023-01-28. It is a tiny version of the Zipformer-Transducer (https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless7). Number of model parameters: 20697573. Decoding results at epoch-30-avg-9:
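The epoch-30-avg-9 naming refers to decoding with checkpoint averaging (the recipe's `--epoch 30 --avg 9` options), where the weights of the last 9 epoch checkpoints are averaged element-wise before decoding. A minimal pure-Python sketch of that averaging step, with plain dicts of floats standing in for real PyTorch state dicts (all names here are illustrative):

```python
# Sketch of checkpoint averaging as used by `--epoch 30 --avg 9`:
# the parameters of the last N epoch checkpoints are averaged
# element-wise before decoding. Plain dicts stand in for state_dicts.

def average_checkpoints(state_dicts):
    """Element-wise average of a list of state dicts with identical keys."""
    n = len(state_dicts)
    keys = state_dicts[0].keys()
    return {k: sum(sd[k] for sd in state_dicts) / n for k in keys}

# Toy "checkpoints" for three consecutive epochs (avg=3 for brevity).
ckpts = [
    {"encoder.w": 1.0, "decoder.w": 4.0},
    {"encoder.w": 2.0, "decoder.w": 5.0},
    {"encoder.w": 3.0, "decoder.w": 6.0},
]
avg = average_checkpoints(ckpts)
print(avg)  # {'encoder.w': 2.0, 'decoder.w': 5.0}
```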
@yaozengwei The results you got with the 20M-parameter model are better than those with the 70M model, according to the results posted at https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/RESULTS.md. Isn't that unexpected?
You're probably comparing it with the streaming ASR results. For non-streaming results, see https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/RESULTS.md#pruned_transducer_stateless7-zipformer
BTW, have you also trained a streaming version of this smaller Zipformer model?
As a follow-up, I trained a small streaming Zipformer model based on the configuration provided by @yaozengwei, using the recipe. The training logs, TensorBoard, and pretrained model are available at: https://huggingface.co/desh2608/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-small
Could you update RESULTS.md to include this small model?
Why are these results still far behind Conformer-S with 10M params from the Conformer paper (2.7/6.3)?
I think it is difficult, if not impossible, to reproduce the results listed in the Conformer paper.
You can't really compare streaming vs. non-streaming results; our 20M Zipformer is about the same as the reported 10M Conformer. But no one has really been able to reproduce that result; see, for example, https://arxiv.org/pdf/2207.02971.pdf
I created this PR to post my current results for smaller models (i.e. models with fewer parameters). I'd say the parameters used to construct the models were chosen arbitrarily; the aim is to tune a good model with around 15M parameters. The following results are based on pruned_transducer_stateless5; I will try Zipformer and update the results once available. These models are all non-streaming. If someone needs them, I can upload them to Hugging Face.