Using a BTC/OTC in the training Zipformer instead of Conformer. #1589
Comments
Sorry, this is not planned so far.
Best Regards
Jin
…On Thu, 11 Apr 2024 at 18:31 Kerolos ghobrial ***@***.***> wrote:
Is there any script available to train the latest good model *Zipformer*
model using Bypass Temporal Classification(*BTC*)/Omni-temporal
Classification (*OTC*) (
https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/WSASR) to
align speech with text instead of CTC (
https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/zipformer_ctc)
?
|
@DongjiGao any interest? |
Thanks, Dr. Daniel Povey and JinZr. Is there any script to create the OTC lang directory from a phone lexicon instead of BPE (k2-fsa/icefall/tree/master/egs/librispeech/WSASR/local/prepare_otc_lang_bpe.py), since the Bypass Temporal Classification paper showed better performance with the phone-based lexicon? Thanks in advance, |
Thank you for your interest.
Dongji |
Thanks for the quick response, @DongjiGao |
Sorry @DongjiGao for bothering you again: |
I will submit a PR by the end of this week. |
Hello @DongjiGao,
Phone results:
4-05-10 13:32:00,131 INFO [decode_phone.py:410] batch 0/?, cuts processed until now is 14
2024-05-10 13:32:19,673 INFO [decode_phone.py:410] batch 0/?, cuts processed until now is 18
BPE results:
2024-04-14 14:21:47,136 INFO [decode.py:476] batch 0/?, cuts processed until now is 14
2024-04-14 14:22:23,962 INFO [decode.py:476] batch 0/?, cuts processed until now is 18
|
Please use subsampling_factor = 2 for SSL features. |
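One possible reading of this advice, offered as an assumption and not something stated in the thread: phone sequences are longer than BPE sequences, so a large subsampling factor on already coarse SSL features can leave fewer frames than a CTC-style alignment needs. The sketch below is only a rough check. The 50 frames-per-second feature rate, the `num_frames // factor` length rule, and all function names are hypothetical and are not taken from the icefall recipe.

```python
from typing import Sequence


def frames_after_subsampling(num_frames: int, subsampling_factor: int) -> int:
    # Assumption: output length is roughly num_frames // factor.
    # Real convolutional front ends may drop a few more frames at the edges.
    return num_frames // subsampling_factor


def min_frames_for_ctc(tokens: Sequence[str]) -> int:
    # Plain CTC needs one frame per token, plus one blank frame
    # between every pair of adjacent identical tokens.
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return len(tokens) + repeats


def is_alignable(num_frames: int, subsampling_factor: int, tokens: Sequence[str]) -> bool:
    # True if the subsampled sequence is long enough for a plain CTC alignment.
    return frames_after_subsampling(num_frames, subsampling_factor) >= min_frames_for_ctc(tokens)


# Hypothetical example: 1 second of SSL features at an assumed 50 frames per second,
# paired with a made-up transcript of 15 distinct phone tokens.
phones = [f"p{i}" for i in range(15)]
for factor in (2, 4):
    print(factor, frames_after_subsampling(50, factor), is_alignable(50, factor, phones))
```

Under these assumptions, a factor of 4 leaves 12 frames for 15 phones, which is too few, while a factor of 2 leaves 25. BTC/OTC graphs add extra arcs on top of the CTC topology, so the exact length requirement in the WSASR recipe may differ from this plain-CTC bound.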
Thanks @DongjiGao for your support, and sorry for bothering you again.
- The parameters used for decoding with the phone-based lexicon:
- The training loss in TensorBoard (BPE-based lexicon in white vs. phone-based lexicon in black): |
Is there any script available to train the latest Zipformer model using Bypass Temporal Classification (BTC) / Omni-temporal Classification (OTC) (https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/WSASR) to align speech with text, instead of CTC (https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/zipformer_ctc)?