# Welcome to Exkaldi

In this section, we decode the test data with an HMM-GMM model and a WFST decoding graph.

In [1]:
import os
dataDir = "librispeech_dummy"

os.environ["LD_LIBRARY_PATH"] = "/home/khanh/workspace/miniconda3/envs/kaldi/lib/;/home/khanh/workspace/miniconda3/envs/test/lib/"

import exkaldi
exkaldi.info.reset_kaldi_root("/home/khanh/workspace/projects/kaldi")

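If Exkaldi cannot find your Kaldi root directory automatically, set it manually as in the cell above:

exkaldi.info.reset_kaldi_root( yourPath )

Otherwise, an error will occur when calling some core functions.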


Prepare a word-id table. We reuse the lexicons generated in an earlier step, so load them from file.

In [2]:
lexFile = os.path.join(dataDir, "exp", "lexicons.lex")

lexicons = exkaldi.decode.graph.load_lex(lexFile)

Call lexicons("words") to get the word-id table if you want to decode at the word level, or lexicons("phones") to get the phone-id table when decoding at the phone level. Both calls return an Exkaldi __ListTable__ object.
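To make the role of the word-id table concrete, here is a minimal sketch that uses a plain Python dict in place of the __ListTable__. The four-entry vocabulary, its ids, and the helper names `text_to_int` and `int_to_text` are invented for illustration; the real table comes from lexicons("words").

```python
# Hypothetical word-id table; the real one is returned by lexicons("words").
word2id = {"<eps>": 0, "hello": 1, "world": 2, "<unk>": 3}
id2word = {i: w for w, i in word2id.items()}

def text_to_int(words, table, unk="<unk>"):
    """Map words to ids, falling back to the unknown-word id."""
    return [table.get(w, table[unk]) for w in words]

def int_to_text(ids, table):
    """Map ids back to words."""
    return [table[i] for i in ids]

ids = text_to_int("hello there world".split(), word2id)
words = int_to_text(ids, id2word)
print(ids)
print(words)
```

The decoder works on integer ids internally, which is why a symbol table is needed to turn its output back into readable words.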

Then prepare the acoustic features for the test set. We compute them in the same way as for the training data.

In [3]:
scpFile = os.path.join(dataDir, "test", "wav.scp")
utt2spkFile = os.path.join(dataDir, "test", "utt2spk")
spk2uttFile = os.path.join(dataDir, "test", "spk2utt")

feat = exkaldi.compute_mfcc(scpFile, name="mfcc")
cmvn = exkaldi.compute_cmvn_stats(feat, spk2utt=spk2uttFile, name="cmvn")
feat = exkaldi.use_cmvn(feat, cmvn, utt2spk=utt2spkFile)

feat.dim

13
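The cell above computes per-speaker CMVN statistics and applies them to every utterance of that speaker. The sketch below illustrates the idea in plain Python on toy data. It assumes mean normalization only (whether variance is also normalized depends on the options used), and the function names `toy_cmvn_stats` and `toy_apply_cmvn` are hypothetical, not part of Exkaldi.

```python
from collections import defaultdict

def toy_cmvn_stats(feats, utt2spk):
    """Accumulate per-speaker frame count and per-dimension sum."""
    stats = defaultdict(lambda: {"count": 0, "sum": None})
    for utt, frames in feats.items():
        s = stats[utt2spk[utt]]
        for frame in frames:
            if s["sum"] is None:
                s["sum"] = [0.0] * len(frame)
            s["sum"] = [a + b for a, b in zip(s["sum"], frame)]
            s["count"] += 1
    return stats

def toy_apply_cmvn(feats, stats, utt2spk):
    """Subtract the speaker mean from every frame of every utterance."""
    out = {}
    for utt, frames in feats.items():
        s = stats[utt2spk[utt]]
        mean = [v / s["count"] for v in s["sum"]]
        out[utt] = [[x - m for x, m in zip(frame, mean)] for frame in frames]
    return out

# Toy data: two utterances from one speaker, 2-dimensional features.
feats = {"utt1": [[1.0, 2.0], [3.0, 4.0]], "utt2": [[5.0, 6.0]]}
utt2spk = {"utt1": "spkA", "utt2": "spkA"}

stats = toy_cmvn_stats(feats, utt2spk)
normed = toy_apply_cmvn(feats, stats, utt2spk)
print(normed)
```

Because the statistics are pooled over all utterances of a speaker, both the spk2utt and utt2spk mappings are needed: one to accumulate, one to look up the right statistics when normalizing.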

Save the normalized features to file.

In [4]:
featFile = os.path.join(dataDir, "exp", "test_mfcc_cmvn.ark")

feat.save(featFile)

'librispeech_dummy/exp/test_mfcc_cmvn.ark'

In [5]:
feat = feat.add_delta(order=2)

feat.dim

39
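Adding deltas of order 2 appends first- and second-order time derivatives to each frame, which is why the dimension grows from 13 to 39. The following is a simplified sketch in plain Python, assuming a regression window of 2 frames with edge frames repeated. The names `toy_delta` and `toy_add_delta` are hypothetical, and Kaldi's own implementation may treat utterance edges slightly differently.

```python
def toy_delta(frames, window=2):
    """Regression-based delta; frames beyond the edges are repeated."""
    T = len(frames)
    norm = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(T):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for n in range(1, window + 1):
                nxt = frames[min(t + n, T - 1)][d]
                prv = frames[max(t - n, 0)][d]
                acc += n * (nxt - prv)
            row.append(acc / norm)
        out.append(row)
    return out

def toy_add_delta(frames, order=2):
    """Concatenate static features with deltas up to the given order."""
    blocks = [frames]
    for _ in range(order):
        blocks.append(toy_delta(blocks[-1]))
    return [sum((b[t] for b in blocks), []) for t in range(len(frames))]

# Toy input: 5 frames of 13-dimensional features that rise by 1 per frame.
frames = [[float(t)] * 13 for t in range(5)]
out = toy_add_delta(frames, order=2)
print(len(out[0]))
```

For the middle frame of this linear ramp, the first-order delta is 1 and the second-order delta is 0, as expected for a constant slope.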

Prepare the HMM-GMM model and the WFST decoding graph. Both were generated in earlier steps.

In [6]:
HCLGFile = os.path.join(dataDir, "exp", "train_delta", "graph", "HCLG.fst")

hmmFile = os.path.join(dataDir, "exp", "train_delta", "final.mdl")

Then decode. You can set decoding parameters such as __beam__ and __acwt__. Here we use only the default configuration.

In [7]:
lat = exkaldi.decode.wfst.gmm_decode(feat, hmmFile, HCLGFile, symbolTable=lexicons("words"))

lat

<exkaldi.decode.wfst.Lattice at 0x7f00a91b4ca0>

___lat___ is an Exkaldi __Lattice__ object. We will introduce its properties in detail in the next step. For now, save it to file in Kaldi format.

In [8]:
outDir = os.path.join(dataDir, "exp", "train_delta", "decode_test")

exkaldi.utils.make_dependent_dirs(outDir, pathIsFile=False)

In [9]:
latFile = os.path.join(outDir,"test.lat")

lat.save(latFile)

'librispeech_dummy/exp/train_delta/decode_test/test.lat'
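A lattice keeps several competing hypotheses per utterance, and which one becomes the 1-best depends on how the language-model cost, the acoustic cost, and a per-word penalty are weighted. The toy below assumes the total cost is a simple weighted sum of these three terms. The hypotheses, costs, and the helpers `total_cost` and `best_path` are invented for illustration, and the exact way Exkaldi and Kaldi apply these scales may differ.

```python
# Invented hypotheses: (words, lm_cost, acoustic_cost). Lower cost is better.
hyps = [
    (["the", "cat", "sat"], 4.0, 42.0),
    (["the", "cat", "is", "at"], 6.0, 36.0),
]

def total_cost(hyp, lmwt, acwt, penalty):
    """Assumed cost: weighted LM cost + weighted acoustic cost + per-word penalty."""
    words, lm_cost, ac_cost = hyp
    return lmwt * lm_cost + acwt * ac_cost + penalty * len(words)

def best_path(hyps, lmwt, acwt=0.5, penalty=0.0):
    """Return the word sequence with the lowest total cost."""
    return min(hyps, key=lambda h: total_cost(h, lmwt, acwt, penalty))[0]

print(best_path(hyps, lmwt=1))               # acoustics dominate
print(best_path(hyps, lmwt=2))               # stronger LM weight
print(best_path(hyps, lmwt=1, penalty=2.0))  # longer hypothesis is penalized
```

Since the best setting is not known in advance, a common approach is to sweep a small grid of penalties and LM weights and compare the resulting error rates.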

In [10]:
refIntFile = os.path.join(dataDir, "exp", "train_delta", "decode_test", "text.int")

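# Sweep the word insertion penalty and the LM weight, and report the WER of the 1-best result for each pair.
for penalty in [0., 0.5, 1.0]:
    # The penalized lattice depends only on the penalty, so build it once per penalty.
    newLat = lat.add_penalty(penalty)

    for LMWT in range(10, 15):
        # Extract the 1-best path with the given LM weight and acoustic weight.
        result = newLat.get_1best(lexicons("words"), hmmFile, lmwt=LMWT, acwt=0.5)

        # Score the result against the reference transcription in int format.
        score = exkaldi.decode.score.wer(ref=refIntFile, hyp=result, mode="present")

        print(f"Penalty {penalty}, LMWT {LMWT}: WER {score.WER}")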

Penalty 0.0, LMWT 10: WER 135.08
Penalty 0.0, LMWT 11: WER 135.08
Penalty 0.0, LMWT 12: WER 134.84
Penalty 0.0, LMWT 13: WER 134.84
Penalty 0.0, LMWT 14: WER 134.84
Penalty 0.5, LMWT 10: WER 134.61
Penalty 0.5, LMWT 11: WER 134.61
Penalty 0.5, LMWT 12: WER 134.61
Penalty 0.5, LMWT 13: WER 134.61
Penalty 0.5, LMWT 14: WER 134.61
Penalty 1.0, LMWT 10: WER 134.61
Penalty 1.0, LMWT 11: WER 134.13
Penalty 1.0, LMWT 12: WER 134.13
Penalty 1.0, LMWT 13: WER 134.13
Penalty 1.0, LMWT 14: WER 134.13
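WER values above 100, as in the output above, are possible because inserted words count as errors while the denominator is only the number of reference words. The sketch below computes WER from the word-level edit distance in plain Python. The helpers `edit_distance` and `toy_wer` and the toy transcriptions are illustrative assumptions, not Exkaldi's implementation; the sketch scores only utterances that appear in both the reference and the hypothesis.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]

def toy_wer(refs, hyps):
    """WER in percent over utterances present in both dicts."""
    shared = [u for u in hyps if u in refs]
    errors = sum(edit_distance(refs[u], hyps[u]) for u in shared)
    words = sum(len(refs[u]) for u in shared)
    return 100.0 * errors / words

# Toy int-format transcriptions: three inserted words against a two-word reference.
refs = {"utt1": "1 2".split()}
hyps = {"utt1": "1 7 8 9 2".split()}
print(toy_wer(refs, hyps))
```

Here three insertions against a two-word reference give a WER of 150.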
