Improve recognition parameter handling #125

Open · christophmluscher opened this issue Apr 6, 2023 · 1 comment

@christophmluscher (Contributor)
When doing a grid search over the decoding parameters of a hybrid model, a full cartesian product over all parameters does not make much sense. One should first tune the model-related scales, i.e. the prior scale and the tdp scale, then the tdp values, and only with the optimal values of these parameters tune the LM and pronunciation scales. Moreover, the first two steps should always be run with a high altas and a small beam; the LM scale is the only parameter that must not be tuned together with altas.

Please also note that experience shows the only tdp value worth tuning is the exit penalty for silence and non-word.

Originally posted by @Marvin84 in #110 (comment)
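A minimal sketch of this staged procedure, as opposed to a full cartesian product. The `run_recognition` helper and all concrete parameter values are hypothetical placeholders, not part of any existing API:

```python
# Sketch of the staged tuning described above. `run_recognition` is a
# hypothetical helper that runs one decoding and returns its WER; all
# parameter values below are illustrative only.

def run_recognition(prior_scale, tdp_scale, exit_penalty, lm_scale, beam, altas):
    """Placeholder: launch a recognition with these parameters, return its WER."""
    raise NotImplementedError


def staged_tuning():
    # Steps 1-2: small beam, high altas -> cheap, fast decodings.
    beam, altas = 14.0, 12.0

    # Step 1: tune the model-related scales (prior and tdp) jointly.
    best_prior, best_tdp = min(
        ((p, t) for p in (0.3, 0.5, 0.7) for t in (0.1, 0.3, 0.5)),
        key=lambda pt: run_recognition(pt[0], pt[1], exit_penalty=20.0,
                                       lm_scale=10.0, beam=beam, altas=altas),
    )

    # Step 2: with the optimal scales fixed, tune only the exit penalty
    # for silence and non-word.
    best_exit = min(
        (0.0, 10.0, 20.0),
        key=lambda e: run_recognition(best_prior, best_tdp, e,
                                      lm_scale=10.0, beam=beam, altas=altas),
    )

    # Step 3: the LM scale must not be tuned together with altas, so it is
    # tuned at the final beam with altas disabled.
    best_lm = min(
        (8.0, 10.0, 12.0),
        key=lambda s: run_recognition(best_prior, best_tdp, best_exit,
                                      lm_scale=s, beam=16.0, altas=0.0),
    )
    return best_prior, best_tdp, best_exit, best_lm
```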

@Marvin84 (Contributor)

The idea is to have three tuning steps (plus an optional fourth), implemented as independent functions of the HybridDecoder class; a sketch follows the list below.

Set a small beam (e.g. 14) and a high altas (e.g. 12), then run the next two steps:

  1. Tune the tdp and prior scales.
  2. With the optimal values from step 1, tune the exit penalties of silence and non-word.

Set the final beam (e.g. 16 or 18) and a lower altas (e.g. 2.0 or 4.0), then:

  3. Tune the LM scale.
  4. Optionally, with all optimal values fixed, tune the search-space size (word-end pruning, beam limit, beam, altas).
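A sketch of how these steps could be laid out as independent methods. All method names, signatures, and the `_decode` helper are hypothetical and not the existing HybridDecoder API:

```python
from typing import Sequence, Tuple


class HybridDecoder:
    """Sketch of the proposed staged tuning as independent methods (hypothetical API)."""

    def _decode(self, **params) -> float:
        """Placeholder: run one recognition with the given parameters, return its WER."""
        raise NotImplementedError

    def tune_scales(self, prior_scales: Sequence[float], tdp_scales: Sequence[float],
                    beam: float = 14.0, altas: float = 12.0) -> Tuple[float, float]:
        """Step 1: tune the tdp and prior scales with a small beam and high altas."""
        grid = [(p, t) for p in prior_scales for t in tdp_scales]
        return min(grid, key=lambda pt: self._decode(
            prior_scale=pt[0], tdp_scale=pt[1], beam=beam, altas=altas))

    def tune_exit_penalties(self, prior_scale: float, tdp_scale: float,
                            exits: Sequence[float],
                            beam: float = 14.0, altas: float = 12.0) -> float:
        """Step 2: with the optimal scales fixed, tune the silence/non-word exit penalty."""
        return min(exits, key=lambda e: self._decode(
            prior_scale=prior_scale, tdp_scale=tdp_scale,
            silence_exit=e, nonword_exit=e, beam=beam, altas=altas))

    def tune_lm_scale(self, lm_scales: Sequence[float], beam: float = 16.0,
                      altas: float = 2.0, **fixed) -> float:
        """Step 3: tune the LM scale at the final beam with a lowered altas."""
        return min(lm_scales, key=lambda s: self._decode(
            lm_scale=s, beam=beam, altas=altas, **fixed))

    def tune_search_space(self, beams: Sequence[float], beam_limits: Sequence[int],
                          **fixed) -> Tuple[float, int]:
        """Step 4 (optional): tune the search space (word-end pruning, beam limit, beam, altas)."""
        grid = [(b, l) for b in beams for l in beam_limits]
        return min(grid, key=lambda bl: self._decode(
            beam=bl[0], beam_limit=bl[1], **fixed))
```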
