# Welcome to ExKaldi

In this section, we will further process the Kaldi decoding lattice and score the results.

In [None]:
import exkaldi

import os
dataDir = "librispeech_dummy"

Load the lattice file (generated in 09_decode_back_HMM-GMM_and_WFST).

In [None]:
latFile = os.path.join(dataDir, "exp", "train_delta", "decode_test", "test.lat")

lat = exkaldi.decode.wfst.load_lat(latFile)

lat

For simplicity, we take the 1-best result from the lattice. This requires a word-ID table and an HMM model.

The word-ID table can be a __words.txt__ file (if decoding was done at the word level), a __phones.txt__ file (if decoding was done at the phone level), or an ExKaldi __ListTable__ object.

Ideally, a __LexiconBank__ object is also available, because you can get both "words" and "phones" from it.

In [None]:
wordsFile = os.path.join(dataDir, "exp", "words.txt")

hmmFile = os.path.join(dataDir, "exp", "train_delta", "final.mdl")

In [None]:
result = lat.get_1best(symbolTable=wordsFile, hmm=hmmFile, lmwt=1, acwt=0.5)

result.subset(nHead=1)

___result___ is an ExKaldi __Transcription__ object.

The decoding result is in int-ID format. To get it in text format, try this:

In [None]:
textResult = exkaldi.hmm.transcription_from_int(result, wordsFile)

textResult.subset(nHead=1)
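Conceptually, this conversion is a table lookup. The following is a minimal sketch in plain Python, not ExKaldi's implementation. It assumes a Kaldi-style words table with one `word id` pair per line; the table contents, the utterance IDs, and the helper names `load_id2word` and `ints_to_text` are all made up for illustration.

```python
# Hypothetical contents of a Kaldi-style words.txt: one "word id" pair per line.
words_txt = """<eps> 0
<UNK> 1
hello 2
world 3
"""

def load_id2word(text):
    """Parse 'word id' lines into an {id: word} dictionary."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        word, idx = line.split()
        table[int(idx)] = word
    return table

def ints_to_text(int_transcription, id2word, unk="<UNK>"):
    """Map each utterance's space-separated IDs to words; unknown IDs become `unk`."""
    return {
        utt: " ".join(id2word.get(int(tok), unk) for tok in ids.split())
        for utt, ids in int_transcription.items()
    }

id2word = load_id2word(words_txt)
decoded = {"utt-001": "2 3", "utt-002": "3 99"}  # invented int-ID results
text = ints_to_text(decoded, id2word)
print(text)
```

The ID `99` is not in the table, so it falls back to the unknown-word symbol instead of raising an error.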

For convenience, we restore the lexicons from the saved file.

In [None]:
lexFile = os.path.join(dataDir, "exp", "lexicons.lex")

lexicons = exkaldi.load_lex(lexFile)

In [None]:
del textResult

Besides the __transcription_from_int__ function, we can also convert a transcription with the __Transcription__ object's own method, like this:

In [None]:
word2id = lexicons("words")
oovID = word2id[lexicons("oov")]
id2word = word2id.reverse()

textResult = result.convert(symbolTable=id2word, unkSymbol=oovID)

textResult.subset(nHead=1)

In [None]:
del result

Now we can score the decoding result. Typically, you compute the WER (word error rate).

In [None]:
refFile = os.path.join(dataDir, "test", "text")

score = exkaldi.decode.score.wer(ref=refFile, hyp=textResult, mode="present")

score
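To make the metric concrete, here is a minimal sketch of a corpus-level WER in plain Python. It is not ExKaldi's scorer: it assumes WER is the word-level Levenshtein distance divided by the number of reference words, and it skips reference utterances that have no hypothesis, which is how we read `mode="present"`. The function names and the toy transcriptions are invented.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance; every edit costs 1."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def corpus_wer(ref, hyp):
    """WER in percent over utterances that appear in both dictionaries."""
    errors = words = 0
    for utt, ref_text in ref.items():
        if utt not in hyp:
            continue  # assumption: unscored when no hypothesis exists
        ref_words = ref_text.split()
        errors += edit_distance(ref_words, hyp[utt].split())
        words += len(ref_words)
    return 100.0 * errors / words

ref = {"utt-001": "the cat sat", "utt-002": "hello"}
hyp = {"utt-001": "the cat sat down", "utt-002": "oh hello there big world"}
wer = corpus_wer(ref, hyp)
print(wer)
```

Because insertions count as errors, a hypothesis that is much longer than the reference can push the WER above 100, as it does in this toy example.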

Sometimes you may instead want to compute the edit distance score.

In [None]:
score = exkaldi.decode.score.edit_distance(ref=refFile, hyp=textResult, mode="present")

score

Then compute the word-level accuracy.

In [None]:
1 - score.editDistance/score.words

In our test, we only got a WER of 134.37, and the word accuracy was 27.6%.

ExKaldi also supports further processing of the lattice, for example, adding a penalty or scaling it.

Here is an example that tries different language model weights (LMWT) and penalties. (Instead of the text-format result, we use an int-format reference file.)

In [None]:
refInt = exkaldi.hmm.transcription_to_int(refFile, lexicons("words"), unkSymbol=lexicons("oov"))
refIntFile = os.path.join(dataDir, "exp", "train_delta", "decode_test", "text.int")
refInt.save(refIntFile)

refInt.subset(nHead=1)
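Scoring on int IDs only works if the reference and the hypothesis share the same word table, so reference words missing from the table must map to the OOV ID. The sketch below shows that idea in plain Python. It is an illustration under our own assumptions, not ExKaldi's code; the table, the `<UNK>` symbol, and the helper name `text_to_ints` are hypothetical.

```python
# Hypothetical word-to-ID table; "<UNK>" plays the role of the OOV symbol.
word2id = {"<UNK>": 1, "hello": 2, "world": 3}

def text_to_ints(transcription, word2id, unk="<UNK>"):
    """Replace each word with its ID; out-of-vocabulary words get the ID of `unk`."""
    unk_id = word2id[unk]
    return {
        utt: " ".join(str(word2id.get(word, unk_id)) for word in text.split())
        for utt, text in transcription.items()
    }

ref_text = {"utt-001": "hello brave world"}  # "brave" is not in the table
ref_int = text_to_ints(ref_text, word2id)
print(ref_int)
```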

In [None]:
for penalty in [0., 0.5, 1.0]:
    # The penalty does not depend on LMWT, so apply it once per penalty value.
    newLat = lat.add_penalty(penalty)

    for LMWT in range(10, 15):
        result = newLat.get_1best(lexicons("words"), hmmFile, lmwt=LMWT, acwt=0.5)

        score = exkaldi.decode.score.wer(ref=refInt, hyp=result, mode="present")
        
        print(f"Penalty {penalty}, LMWT {LMWT}: WER {score.WER}")
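To see why LMWT and the penalty can change the 1-best path, the sketch below rescores a few hypothetical paths. It assumes a simplified cost model, `lmwt * LM cost + acwt * AM cost + penalty * number of words`; the actual order in which Kaldi scales the lattice and applies the penalty may differ. All paths, costs, and function names are invented.

```python
# Hypothetical lattice paths: (words, LM cost, AM cost). Lower cost is better.
paths = [
    ("the cat sat", 9.0, 40.0),
    ("the cat sat at", 11.5, 34.0),
    ("a cat", 6.5, 53.0),
]

def total_cost(path, lmwt, acwt, penalty):
    """Simplified cost model (an assumption, not Kaldi's exact formula)."""
    words, lm_cost, am_cost = path
    return lmwt * lm_cost + acwt * am_cost + penalty * len(words.split())

def best_path(paths, lmwt, acwt, penalty):
    """Return the word sequence of the cheapest path."""
    return min(paths, key=lambda p: total_cost(p, lmwt, acwt, penalty))[0]

for penalty in (0.0, 1.0):
    for lmwt in (1, 2, 4):
        winner = best_path(paths, lmwt, acwt=0.5, penalty=penalty)
        print(f"Penalty {penalty}, LMWT {lmwt}: {winner}")
```

In this toy setup, a larger LMWT favors paths the language model likes, and a positive penalty discourages longer word sequences, which is why sweeping both values is worthwhile.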

From the lattice, you can get the phone-level result.

In [None]:
phoneResult = lat.get_1best(lexicons("phones"), hmmFile, lmwt=1, acwt=0.5, phoneLevel=True)

phoneResult = exkaldi.hmm.transcription_from_int(phoneResult, lexicons("phones"))

phoneResult.subset(nHead=1)

N-best results can also be extracted from the lattice.

In [None]:
result = lat.get_nbest(
                        n=3,
                        symbolTable=lexicons("words"),
                        hmm=hmmFile, 
                        acwt=0.5, 
                        phoneLevel=False,
                        requireCost=False,
                )

for re in result:
    print(re.name, type(re))

___result___ is a list of N-best __Transcription__ objects. If ___requireCost___ is True, the LM score and AM score are returned as well.

In [None]:
result = lat.get_nbest(
                        n=3,
                        symbolTable=lexicons("words"),
                        hmm=hmmFile, 
                        acwt=0.5, 
                        phoneLevel=False,
                        requireCost=True,
                )

for re in result[0]:
    print(re.name, type(re))
    
for re in result[1]:
    print(re.name, type(re))

for re in result[2]:
    print(re.name, type(re))
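As a rough mental model of N-best extraction with costs, the sketch below ranks hypothetical candidate paths for one utterance and optionally returns their LM and AM costs. It is plain Python under our own assumptions: the combined cost `LM cost + acwt * AM cost`, the return layout, and the name `n_best` are illustrative and not necessarily what ExKaldi does.

```python
import heapq

# Hypothetical candidate paths for one utterance: (words, LM cost, AM cost).
candidates = [
    ("the cat sat", 9.0, 40.0),
    ("the cat sat at", 11.5, 34.0),
    ("a cat", 6.5, 53.0),
    ("the cat", 7.0, 48.0),
]

def n_best(candidates, n, acwt=0.5, require_cost=False):
    """Return the n cheapest paths, best first; optionally keep their costs."""
    ranked = heapq.nsmallest(n, candidates, key=lambda c: c[1] + acwt * c[2])
    if require_cost:
        return ranked                     # (words, LM cost, AM cost) per rank
    return [words for words, _, _ in ranked]

print(n_best(candidates, 3))
print(n_best(candidates, 2, require_cost=True))
```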

Importantly, the alignment can also be returned.

In [None]:
result = lat.get_nbest(
                        n=3,
                        symbolTable=lexicons("words"),
                        hmm=hmmFile, 
                        acwt=0.5, 
                        phoneLevel=False,
                        requireCost=False,
                        requireAli=True,
                )

for re in result[1]:
    print(re.name, type(re))

We will not train __LDA+MLLT__ and __SAT__ in this tutorial. If you need tutorials on them, please look in the `examples` directory. We have prepared some actual recipes, for example, for the __TIMIT__ corpus.