# Welcome to Exkaldi

In this section, we will decode with an HMM-GMM model and a WFST decoding graph.

In [1]:
import exkaldi

import os
dataDir = os.path.join("..","examplesdata","librispeech_dummy")

Prepare a word-ID table. We reuse the lexicons generated in an earlier step, so load them from file.

In [3]:
lexFile = os.path.join(dataDir, "exp", "lexicons.lex")

lexicons = exkaldi.decode.graph.load_lex(lexFile)

lexicons

<exkaldi.decode.graph.LexiconBank at 0x7fa9292597f0>

From the lexicons, call "words" to get the word-ID table if you want to decode at the word level, or call "phones" to get the phone-ID table if you want to decode at the phone level. Both return an Exkaldi ListTable object.

In [None]:
type(lexicons("words"))
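To make the idea of a word-ID table concrete, here is a minimal sketch that parses a Kaldi-style symbol table, where each line holds a symbol and its integer ID. The function name `parse_symbol_table` and the table contents are made up for illustration, and a plain dict stands in for the ListTable object; none of this is Exkaldi's own code.

```python
def parse_symbol_table(text):
    """Parse Kaldi-style symbol table text ("<symbol> <id>" per line) into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        symbol, index = line.split()
        table[symbol] = int(index)
    return table

# Hypothetical table contents, invented for this example.
example_text = """<eps> 0
HELLO 1
WORLD 2
"""

word2id = parse_symbol_table(example_text)
# The decoder emits integer IDs, so the reverse mapping turns them back into words.
id2word = {index: word for word, index in word2id.items()}

decoded_ids = [1, 2]
print(" ".join(id2word[i] for i in decoded_ids))
```

The reverse mapping is what lets a decoding result, which is a sequence of integer IDs, be read as text.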

Prepare the acoustic features for the test data. We compute them in the same way as for the training data.

In [None]:
scpFile = os.path.join(dataDir, "test", "wav.scp")
utt2spkFile = os.path.join(dataDir, "test", "utt2spk")
spk2uttFile = os.path.join(dataDir, "test", "spk2utt")

# Compute MFCC features, then per-speaker CMVN statistics, then apply them to the features.
feat = exkaldi.compute_mfcc(scpFile, name="mfcc")
cmvn = exkaldi.compute_cmvn_stats(feat, spk2utt=spk2uttFile, name="cmvn")
feat = exkaldi.use_cmvn(feat, cmvn, utt2spk=utt2spkFile)

feat
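The CMVN step above normalizes features per speaker. As a rough illustration of what that means, the following pure-Python sketch pools the frames of each speaker, computes a per-dimension mean and standard deviation, and normalizes every utterance with its speaker's statistics. The helper names `speaker_stats` and `normalize` and the toy data are assumptions for this example; this is not how Exkaldi or Kaldi implement it internally.

```python
import math

def speaker_stats(frames):
    """Return per-dimension (means, stds) over a list of feature frames."""
    count = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / count for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / count
        # Guard against zero variance so the division below is always defined.
        stds.append(math.sqrt(var) if var > 0 else 1.0)
    return means, stds

def normalize(frames, means, stds):
    """Subtract the mean and divide by the standard deviation in each dimension."""
    return [[(f[d] - means[d]) / stds[d] for d in range(len(f))] for f in frames]

# Toy two-dimensional features and speaker map, invented for this example.
feats = {
    "utt1": [[1.0, 10.0], [3.0, 30.0]],
    "utt2": [[5.0, 50.0], [7.0, 70.0]],
}
utt2spk = {"utt1": "spkA", "utt2": "spkA"}

# Pool frames by speaker (the role played by the spk2utt file).
spk_frames = {}
for utt, frames in feats.items():
    spk_frames.setdefault(utt2spk[utt], []).extend(frames)

stats = {spk: speaker_stats(frames) for spk, frames in spk_frames.items()}

# Apply each speaker's statistics to that speaker's utterances (the role of utt2spk).
normalized = {
    utt: normalize(frames, *stats[utt2spk[utt]]) for utt, frames in feats.items()
}
```

After normalization, the pooled frames of each speaker have zero mean and unit variance in every dimension.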

Save it to file.

In [4]:
featFile = os.path.join(dataDir, "exp", "test_mfcc.ark")

feat.save(featFile, outScpFile=False)

<exkaldi.core.achivements.BytesFeature at 0x7fa929072e48>

Prepare the HMM-GMM model and the WFST decoding graph. They were generated in earlier steps.

In [5]:
HCLGFile = os.path.join(dataDir, "exp", "graph", "HCLG.fst")
hmmFile = os.path.join(dataDir, "exp", "train_delta", "final.mdl")

Then decode. You can set decoding parameters such as __beam__ and __acwt__. Here we use only the default configuration.

In [6]:
lat = exkaldi.decode.wfst.gmm_decode(feat, hmmFile, HCLGFile, wordSymbolTable=lexicons("words"))

lat

<exkaldi.decode.wfst.Lattice at 0x7fa9820323c8>
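For intuition about the __beam__ and __acwt__ parameters mentioned above, here is a conceptual sketch rather than the decoder's real implementation. It assumes each hypothesis has a graph cost and an acoustic cost (lower is better), that the acoustic cost is scaled by `acwt` before the two are added, and that hypotheses whose total cost exceeds the best one by more than `beam` are discarded. The function name `prune` and all numbers are invented for illustration.

```python
def prune(hypotheses, acwt, beam):
    """Keep hypotheses whose combined cost is within `beam` of the best one.

    hypotheses: list of (label, graph_cost, acoustic_cost) tuples.
    """
    scored = [
        (label, graph_cost + acwt * acoustic_cost)
        for label, graph_cost, acoustic_cost in hypotheses
    ]
    best = min(cost for _, cost in scored)
    return [(label, cost) for label, cost in scored if cost <= best + beam]

# Toy hypotheses: (label, graph cost, acoustic cost).
hyps = [("a", 2.0, 30.0), ("b", 1.0, 50.0), ("c", 5.0, 100.0)]

# A small acoustic scale shrinks acoustic differences, so more hypotheses survive.
kept_small_acwt = [label for label, _ in prune(hyps, acwt=0.1, beam=4.0)]

# With a larger acoustic scale and the same beam, only the best hypothesis survives.
kept_large_acwt = [label for label, _ in prune(hyps, acwt=1.0, beam=4.0)]

print(kept_small_acwt, kept_large_acwt)
```

In general, a wider beam keeps more hypotheses and so trades speed for a lower risk of pruning away the correct path.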

___lat___ is an Exkaldi __Lattice__ object. We will introduce its properties in detail in the next step. For now, save it to file in Kaldi format.

In [None]:
outDir = os.path.join(dataDir, "exp", "decode_test")

exkaldi.utils.make_dependent_dirs(outDir, False)

In [7]:
latFile = os.path.join(outDir, "test.lat")

lat.save(latFile)

'/misc/Work19/wangyu/exkaldi-1.0/examplesdata/librispeech_light/exp/test.lat'