How to use an external RNN-LM (mono-lingual) with a bilingual ASR? #1569

sangeet2020 · 2024-03-26T15:14:52Z

Hi K2 team,

Thank you so much for your amazingly efficient toolkit for streaming-focused ASR.

I have trained an EN-DE bilingual streaming ASR model using this recipe.
However, I am not really satisfied with the performance on the English side, and I want to use an externally trained RNN LM (trained using this recipe) to improve the WER on the English side only.

I tried using --decoding-method modified_beam_search_lm_shallow_fusion with an English RNN-LM, but ran into errors because the vocab sizes differ.
The vocab size for bilingual ASR training is 1000 (500 for EN and 500 for DE), while the vocab size of the English RNN-LM is 500.

I wonder if it's possible to use a monolingual RNN LM with a bilingual ASR model.

Alternatively, is it possible to combine two RNN-LMs, or somehow interpolate them?
I saw some related discussions here: kaldi-asr/kaldi#2069.

Thank You


marcoyang1998 · 2024-03-28T09:17:18Z

I think it's possible as long as the German BPE tokens and the English BPE tokens are distinguishable.

You also need to know which language you are decoding; otherwise you might end up rescoring a German utterance with the English RNNLM.
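To illustrate the two conditions above, here is a minimal sketch of mapping bilingual ASR token IDs into the English LM's ID space. It assumes that the English pieces in the ASR vocabulary can be identified (here via a given set of IDs) and that each one also exists in the English LM vocabulary; neither is confirmed in this thread. The helper names `build_id_map` and `map_hypothesis` and the toy vocabularies are hypothetical and not part of icefall.

```python
from typing import Dict, Iterable, List, Optional


def build_id_map(
    asr_pieces: List[str],
    lm_pieces: List[str],
    english_asr_ids: Iterable[int],
) -> Dict[int, int]:
    """Map ASR token IDs that are known to be English to English LM IDs.

    Pieces are matched by their string form. ASR IDs outside
    `english_asr_ids`, or whose piece is missing from the LM vocab,
    get no entry in the map.
    """
    lm_index = {piece: i for i, piece in enumerate(lm_pieces)}
    id_map: Dict[int, int] = {}
    for asr_id in english_asr_ids:
        piece = asr_pieces[asr_id]
        if piece in lm_index:
            id_map[asr_id] = lm_index[piece]
    return id_map


def map_hypothesis(
    asr_ids: List[int], id_map: Dict[int, int]
) -> Optional[List[int]]:
    """Return the hypothesis in LM ID space, or None if any token is unmappable.

    A None result means the caller should skip English LM scoring,
    for example because the hypothesis contains German tokens.
    """
    mapped: List[int] = []
    for token in asr_ids:
        if token not in id_map:
            return None
        mapped.append(id_map[token])
    return mapped


# Toy vocabularies (hypothetical): ASR IDs 0..2 are English, 3..4 are German.
asr_pieces = ["<blk>", "▁the", "▁cat", "▁der", "▁hund"]
lm_pieces = ["<blk>", "▁cat", "▁the"]
id_map = build_id_map(asr_pieces, lm_pieces, english_asr_ids=range(3))

english_hyp = map_hypothesis([1, 2], id_map)  # all tokens are English
mixed_hyp = map_hypothesis([1, 3], id_map)    # contains a German token
```

If the bilingual BPE model was trained jointly, identical pieces can be shared by both languages, and a mapping like this is no longer sufficient on its own; the language of the utterance then has to be known from elsewhere.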

sangeet2020 · 2024-03-28T09:25:06Z

But wouldn't the different vocab sizes of the BPE models for the ASR and the RNN-LM create an issue in the first place?

When loading the RNN LM:

            model = RnnLmModel(
                vocab_size=params.vocab_size,
                embedding_dim=params.rnn_lm_embedding_dim,
                hidden_dim=params.rnn_lm_hidden_dim,
                num_layers=params.rnn_lm_num_layers,
                tie_weights=params.rnn_lm_tie_weights,
            )

params.vocab_size is the vocab size of the SentencePiece tokenizer from the ASR model (1000 in my case), which differs from the actual RNN LM vocab size (500 in my case). How can I overcome this?

marcoyang1998 · 2024-03-28T09:55:34Z

You need to change the code. I only meant that it is theoretically possible to use a monolingual RNNLM to rescore a multilingual ASR model.
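One possible shape of such a code change, as a rough sketch only: decouple the LM vocab size from the ASR vocab size when building the constructor arguments. The field `lm_vocab_size` and the helper `rnn_lm_kwargs` are hypothetical and are not existing icefall options; the remaining argument names are taken from the snippet quoted above, and the toy dimension values are made up.

```python
from types import SimpleNamespace
from typing import Any, Dict


def rnn_lm_kwargs(params: Any) -> Dict[str, Any]:
    """Collect constructor arguments for the RNN-LM.

    `lm_vocab_size` is a hypothetical field. When it is absent or None,
    the ASR vocab size is used, which matches the behaviour of the
    snippet quoted earlier in the thread.
    """
    lm_vocab_size = getattr(params, "lm_vocab_size", None)
    if lm_vocab_size is None:
        lm_vocab_size = params.vocab_size
    return {
        "vocab_size": lm_vocab_size,
        "embedding_dim": params.rnn_lm_embedding_dim,
        "hidden_dim": params.rnn_lm_hidden_dim,
        "num_layers": params.rnn_lm_num_layers,
        "tie_weights": params.rnn_lm_tie_weights,
    }


# Toy parameters for illustration; only the two vocab sizes come from the thread.
params = SimpleNamespace(
    vocab_size=1000,      # bilingual ASR vocab
    lm_vocab_size=500,    # English RNN-LM vocab (hypothetical field)
    rnn_lm_embedding_dim=2048,
    rnn_lm_hidden_dim=2048,
    rnn_lm_num_layers=3,
    rnn_lm_tie_weights=True,
)
kwargs = rnn_lm_kwargs(params)
# The model would then be built as: model = RnnLmModel(**kwargs)
```

This only lets the checkpoint load with matching shapes. The token IDs passed to the LM during decoding still have to be in the LM's own ID space, and the LM score should only be applied to English hypotheses.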

JinZr closed this as completed Apr 20, 2024
