Improve recognition parameter handling #125

Open · christophmluscher opened this issue Apr 6, 2023 · 1 comment

@christophmluscher (Contributor)
When doing a grid search over the decoding parameters of a hybrid model, a full cartesian product over all parameters does not make much sense. One should first tune the model-related scales, i.e. the prior scale and the tdp scale, then the tdp values, and only with the optimal values of these parameters tune the LM and pronunciation scales. Moreover, the first two steps should always be run with a high altas and a small beam; the LM scale is the only parameter that must not be tuned together with altas.

Please also note that experience shows the only tdp value worth tuning is the exit penalty for silence and non-word.

Originally posted by @Marvin84 in #110 (comment)
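A minimal sketch of this staged procedure, as opposed to a full cartesian product. The `run_recognition` helper and all concrete parameter values are hypothetical placeholders, not part of any existing API:

```python
# Sketch of the staged tuning described above. `run_recognition` is a
# hypothetical helper that runs one decoding and returns its WER; all
# parameter values below are illustrative only.

def run_recognition(prior_scale, tdp_scale, exit_penalty, lm_scale, beam, altas):
    """Placeholder: launch a recognition with these parameters, return its WER."""
    raise NotImplementedError


def staged_tuning():
    # Steps 1-2: small beam, high altas -> cheap, fast decodings.
    beam, altas = 14.0, 12.0

    # Step 1: tune the model-related scales (prior and tdp) jointly.
    best_prior, best_tdp = min(
        ((p, t) for p in (0.3, 0.5, 0.7) for t in (0.1, 0.3, 0.5)),
        key=lambda pt: run_recognition(pt[0], pt[1], exit_penalty=20.0,
                                       lm_scale=10.0, beam=beam, altas=altas),
    )

    # Step 2: with the optimal scales fixed, tune only the exit penalty
    # for silence and non-word.
    best_exit = min(
        (0.0, 10.0, 20.0),
        key=lambda e: run_recognition(best_prior, best_tdp, e,
                                      lm_scale=10.0, beam=beam, altas=altas),
    )

    # Step 3: the LM scale must not be tuned together with altas, so it is
    # tuned at the final beam with altas disabled.
    best_lm = min(
        (8.0, 10.0, 12.0),
        key=lambda s: run_recognition(best_prior, best_tdp, best_exit,
                                      lm_scale=s, beam=16.0, altas=0.0),
    )
    return best_prior, best_tdp, best_exit, best_lm
```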

@Marvin84 (Contributor)

The idea is to have three tuning steps (plus an optional fourth), implemented as independent functions of the HybridDecoder class; a sketch follows the list below.

Set a small beam (e.g. 14) and a high altas (e.g. 12), then run the next two steps:

  1. Tune the tdp and prior scales.
  2. With the optimal values from step 1, tune the exit penalties of silence and non-word.

Set the final beam (e.g. 16 or 18) and a lower altas (e.g. 2.0 or 4.0), then:

  3. Tune the LM scale.
  4. Optionally, with all optimal values fixed, tune the search-space size (word-end pruning, beam limit, beam, altas).
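A sketch of how these steps could be laid out as independent methods. All method names, signatures, and the `_decode` helper are hypothetical and not the existing HybridDecoder API:

```python
from typing import Sequence, Tuple


class HybridDecoder:
    """Sketch of the proposed staged tuning as independent methods (hypothetical API)."""

    def _decode(self, **params) -> float:
        """Placeholder: run one recognition with the given parameters, return its WER."""
        raise NotImplementedError

    def tune_scales(self, prior_scales: Sequence[float], tdp_scales: Sequence[float],
                    beam: float = 14.0, altas: float = 12.0) -> Tuple[float, float]:
        """Step 1: tune the tdp and prior scales with a small beam and high altas."""
        grid = [(p, t) for p in prior_scales for t in tdp_scales]
        return min(grid, key=lambda pt: self._decode(
            prior_scale=pt[0], tdp_scale=pt[1], beam=beam, altas=altas))

    def tune_exit_penalties(self, prior_scale: float, tdp_scale: float,
                            exits: Sequence[float],
                            beam: float = 14.0, altas: float = 12.0) -> float:
        """Step 2: with the optimal scales fixed, tune the silence/non-word exit penalty."""
        return min(exits, key=lambda e: self._decode(
            prior_scale=prior_scale, tdp_scale=tdp_scale,
            silence_exit=e, nonword_exit=e, beam=beam, altas=altas))

    def tune_lm_scale(self, lm_scales: Sequence[float], beam: float = 16.0,
                      altas: float = 2.0, **fixed) -> float:
        """Step 3: tune the LM scale at the final beam with a lowered altas."""
        return min(lm_scales, key=lambda s: self._decode(
            lm_scale=s, beam=beam, altas=altas, **fixed))

    def tune_search_space(self, beams: Sequence[float], beam_limits: Sequence[int],
                          **fixed) -> Tuple[float, int]:
        """Step 4 (optional): tune the search space (word-end pruning, beam limit, beam, altas)."""
        grid = [(b, l) for b in beams for l in beam_limits]
        return min(grid, key=lambda bl: self._decode(
            beam=bl[0], beam_limit=bl[1], **fixed))
```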
