RNNT, Hybrid CTC RNNT and model API compatibility question #7839
FredSRichardson started this conversation in General
I think @titu1994 might be able to answer this - I just didn't want to clutter the issue thread I started with this tangent 😉
Please correct me where I've got this wrong - I'm sure there are more than a few things I'm not quite piecing together correctly!
With the `EncDecCTCModel`, computing posteriors is as easy as calling the `forward()` method, and the `transcribe()` method also has the option of returning posteriors.
For the `EncDecRNNTModel`, it's a bit more complicated. From looking at the `EncDecRNNTModel.training_step()` method, I believe posteriors can be computed in two steps.
For the `EncDecHybridRNNTCTCModel`, things are a bit different. I believe you can compute the RNNT posteriors using the same steps as above, but for CTC posteriors you would call the `ctc_decoder` member (which I think is a `ConvASRDecoder` created from the 'aux_ctc' config of the hybrid model).
From what I can tell, the method `EncDecHybridRNNTCTCModel.change_decoding_strategy()` doesn't significantly impact either of the posterior computation steps above, but it definitely impacts how the `EncDecHybridRNNTCTCModel.transcribe()` method works. If the decoding strategy is set to `ctc`, then the `transcribe()` method can return log posteriors (like the CTC model's method does); otherwise it calls the RNNT version of `transcribe()`, which does not return posteriors.
Does that sound about right? Honestly, it would be nice if there were a simple way to "view" a HybridRNNTCTC model as a CTC model and have all the APIs work the same way they do with a CTC model... It would also be nice if there were a single method to return log posteriors from the RNNT models when those are needed. I understand I could be wrong about all of that though... 🤣
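To make the two paths above concrete, here is a rough sketch of what I have in mind. The helper names `rnnt_joint_output` and `hybrid_ctc_log_probs` are made up for illustration, and the keyword arguments (`input_signal`, `targets`, `encoder_outputs`, `encoder_output`, and so on) are my reading of the `training_step()` code, so please treat them as assumptions to check against the installed NeMo version rather than as the official API.

```python
def rnnt_joint_output(model, signal, signal_len, transcript, transcript_len):
    """Sketch of the RNNT path: encoder, then prediction network plus joint.

    `model` is assumed to behave like an EncDecRNNTModel (or the hybrid model).
    """
    # Step 1: run the acoustic encoder (assumed to return encoded features
    # and their lengths, as training_step() appears to do).
    encoded, encoded_len = model.forward(
        input_signal=signal, input_signal_length=signal_len
    )
    # Step 2: run the prediction network on the target tokens, then combine
    # both outputs in the joint network.
    decoder_out, target_len, _states = model.decoder(
        targets=transcript, target_length=transcript_len
    )
    joint = model.joint(encoder_outputs=encoded, decoder_outputs=decoder_out)
    # Caveat (assumption): depending on the joint's configuration this may be
    # unnormalized, so a log-softmax over the last dimension may still be needed.
    return joint, encoded_len, target_len


def hybrid_ctc_log_probs(model, signal, signal_len):
    """Sketch of the CTC path on a hybrid model via its `ctc_decoder` member."""
    encoded, encoded_len = model.forward(
        input_signal=signal, input_signal_length=signal_len
    )
    # The auxiliary CTC head is applied directly to the encoder output.
    log_probs = model.ctc_decoder(encoder_output=encoded)
    return log_probs, encoded_len
```

I would expect to call these with the model in eval mode and inside `torch.no_grad()`, but I have not verified that this matches what `transcribe()` does internally.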