Hybrid and Base Decoder Classes #110
Conversation
I have some comments. Requesting changes w.r.t. a mixup of am_scale and pronunciation_scale.
There are many docstrings missing.
```python
elif self.value == 2:
    return "cart"
elif self.value == 3:
    return "dense"
```
For dense tying there are three different cases (monophone, diphone, and triphone), and one also needs to specify the boundary class and word-end class.
We can add other options (in fact, all that are available in RASR) to the enum.
> we can add other options

Certainly, we can add different dense tyings, but then the word-end class and boundary class that are specific to the dense tying need a separate enum class. I will take care of this when I integrate FH; I am working on it now.
The options to add for now are: monophone-dense, diphone-dense, and no-tying-dense. @christophmluscher
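For illustration, a minimal sketch of what the extended enum might look like, continuing the value-to-string mapping quoted above; the class name StateTyingType, the member names, and the numeric values are assumptions rather than the actual PR code:

```python
from enum import Enum


class StateTyingType(Enum):
    MONOPHONE = 1
    CART = 2
    DENSE = 3
    MONOPHONE_DENSE = 4
    DIPHONE_DENSE = 5
    NO_TYING_DENSE = 6

    def __str__(self):
        # map each member to the string RASR expects for the state-tying type
        return {
            1: "monophone",
            2: "cart",
            3: "dense",
            4: "monophone-dense",
            5: "diphone-dense",
            6: "no-tying-dense",
        }[self.value]
```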
I added all options from RASR
@JackTemaki @Marvin84 @vieting @michelwi can you review please? :)
Looks good, thanks. I will later submit, together with the FH recipes, something for the word-end and boundary classes for no-tying-dense.
I think I would just add nitpicks, and I am not sure that is helpful right now. I'm fine with merging.
For me it would take a longer time to read into the code for a proper review, so I just quickly went over it.
So here I am, the only one not using this code... All my concerns were addressed, so I approve now. Have fun :)
```python
@dataclass()
class Tdp:
```
It is true that for Viterbi alignment and decoding we generally use the non-normalized default tdp values. However, for full-sum training we generally introduce normalized loop/forward values that might even be estimated from an alignment or defined based on some heuristic, e.g. the average phoneme length. In addition to this class, we could also introduce an enum class TdpType with the values default, heuristic, and alignment-based. For the two latter types one could also have jobs that estimate the values from an alignment or from a transcription; Daniel Mann already has such jobs.
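A rough sketch of how such a TdpType enum could sit next to the Tdp dataclass; the member names, their string values, and the fields of Tdp are illustrative assumptions, not the code of this PR:

```python
from dataclasses import dataclass
from enum import Enum


class TdpType(Enum):
    # non-normalized defaults, as used for Viterbi alignment and decoding
    DEFAULT = "default"
    # normalized loop/forward values from a heuristic, e.g. average phoneme length
    HEURISTIC = "heuristic"
    # normalized loop/forward values estimated from an existing alignment
    ALIGNMENT_BASED = "alignment-based"


@dataclass()
class Tdp:
    # the set of fields is an assumption for illustration
    loop: float
    forward: float
    skip: float
    exit: float
```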
I would consider this as a new PR, since it extends the functionality.
Please also address the other comment before merging the current commit. In my honest opinion, that type of search over the parameters is not correct.
I created two new issues.
Currently you have a lot of flexibility; I rather like that...
```python
def _get_iter(self):
    return [
        (am, lm, pri, pron, tdp, speech, sil, nonspeech, al)
```
When doing a grid search over the decoding parameters of a hybrid model, such a cartesian product does not make much sense. One should first tune the model-related scales, i.e. the prior scale and the tdp scale, then the tdp values, and only given the optimal values of these parameters tune the lm and pronunciation scales. Moreover, we should definitely consider the obligatory use of a high altas and a small beam for the first two steps; only the lm scale should not be tuned together with altas.
Please also consider that experience shows that the only tdp value worth tuning is the exit penalty for silence and non-word.
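A rough sketch of the staged tuning suggested in these two comments, as an alternative to one big cartesian product; the recognize callback, the keyword names, and the altas/beam values are assumptions for illustration only:

```python
import itertools


def staged_tuning(recognize, prior_scales, tdp_scales, exit_penalties,
                  lm_scales, pron_scales):
    """`recognize` is a hypothetical callback that runs one recognition pass
    and returns the WER; all keyword names below are assumptions."""
    # step 1: model-related scales, searched cheaply with high altas and a small beam
    best_prior, best_tdp_scale = min(
        itertools.product(prior_scales, tdp_scales),
        key=lambda p: recognize(prior=p[0], tdp_scale=p[1], altas=12.0, beam=14.0),
    )

    # step 2: tdp values; in practice mainly the silence/non-word exit penalty
    best_exit = min(
        exit_penalties,
        key=lambda e: recognize(prior=best_prior, tdp_scale=best_tdp_scale,
                                exit_penalty=e, altas=12.0, beam=14.0),
    )

    # step 3: lm and pronunciation scales, now without altas and with a full beam
    best_lm, best_pron = min(
        itertools.product(lm_scales, pron_scales),
        key=lambda p: recognize(prior=best_prior, tdp_scale=best_tdp_scale,
                                exit_penalty=best_exit, lm=p[0], pron=p[1],
                                beam=18.0),
    )
    return best_prior, best_tdp_scale, best_exit, best_lm, best_pron
```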
You have the option to pass a list of RecogParams, or a single RecogParams with lists from which the cartesian product is taken. We can always change or extend the RecogParams class later on, or add more versions; that is why I tried to put it all into the RecogParams class.
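For illustration, a minimal sketch of the two ways of passing parameters described here; the field names of RecogParams and the expand helper are hypothetical, not the PR's actual API:

```python
import itertools
from dataclasses import dataclass
from typing import List, Union


@dataclass
class RecogParams:
    # list-valued fields whose combinations define the recognition runs
    am_scales: List[float]
    lm_scales: List[float]
    prior_scales: List[float]


def expand(params: Union[RecogParams, List[RecogParams]]):
    """Yield one (am_scale, lm_scale, prior_scale) tuple per recognition run."""
    if isinstance(params, list):
        for p in params:
            yield from expand(p)
    else:
        yield from itertools.product(
            params.am_scales, params.lm_scales, params.prior_scales
        )


# a single RecogParams with lists expands to the full cartesian product ...
runs = list(expand(RecogParams(am_scales=[1.0], lm_scales=[8.0, 10.0], prior_scales=[0.3, 0.5])))
# ... while a list of RecogParams yields the union of their expansions
```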