# Welcome to ExKaldi

In this section, we will decode the test data using the HMM-DNN model and a WFST graph.

In [1]:
import os
dataDir = "librispeech_dummy"

# Entries in LD_LIBRARY_PATH are separated by ":" on Linux, not ";".
os.environ["LD_LIBRARY_PATH"] = "/home/khanh/workspace/miniconda3/envs/kaldi/lib/:/home/khanh/workspace/miniconda3/envs/test/lib/"

import exkaldi
exkaldi.info.reset_kaldi_root("/home/khanh/workspace/projects/kaldi")

Point ExKaldi at your own Kaldi root directory in the same way:
exkaldi.info.reset_kaldi_root( yourPath )
If the root is not set correctly, errors will occur when calling some core functions.
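The `LD_LIBRARY_PATH` value set in the first cell is a list of directories joined by the platform's path-list separator, which Python exposes as `os.pathsep`. A minimal sketch of building such a value without hard-coding the separator; the two directories are placeholders, not paths from this tutorial:

```python
import os

# Placeholder directories; substitute the lib folders of your own environments.
lib_dirs = ["/path/to/envs/kaldi/lib", "/path/to/envs/test/lib"]

# os.pathsep is the separator for path lists (":" on Linux and macOS).
ld_library_path = os.pathsep.join(lib_dirs)

# This string is what would be assigned to os.environ["LD_LIBRARY_PATH"].
print(ld_library_path)
```

Joining with `os.pathsep` avoids typing the separator by hand, which is where a `;` can slip in instead of `:`.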


Restore the posterior probabilities of the acoustic model (AM) from file (generated in 11_train_DNN_acoustic_model_with_tensorflow).

In [2]:
probFile = os.path.join(dataDir, "exp", "train_DNN", "amp.npy")

prob = exkaldi.load_prob(probFile)

prob

<exkaldi.core.archive.NumpyProb at 0x7febdcc496d0>

In [3]:
prob.subset(nHead=1).data

{'1272-128104-0000': array([[-3.316089  ,  9.444235  , -1.2055417 , ..., -1.5940963 ,
         -1.4743445 ,  4.3728447 ],
        [-2.947546  ,  9.233662  , -1.4251244 , ..., -1.2616441 ,
         -0.6829376 ,  3.4242752 ],
        [-3.9679208 , 10.093671  , -1.6254816 , ..., -1.0757825 ,
         -1.0036131 ,  2.437086  ],
        ...,
        [-3.5702353 , 10.735809  , -1.5427482 , ..., -0.08101523,
          1.3123363 ,  4.8648405 ],
        [-4.2810836 , 11.132116  , -1.5070189 , ...,  0.84956515,
          1.4962223 ,  4.3466806 ],
        [-4.9594646 , 11.140017  , -1.9769467 , ...,  1.0206281 ,
          1.0107785 ,  3.685708  ]], dtype=float32)}

As shown above, this is the raw network output, with no log-softmax activation applied. We need to apply log-softmax ourselves.

ExKaldi NumPy archives have a method __.map(...)__. We use it to apply a log-softmax function to every matrix.

In [4]:
prob = prob.map( lambda x: exkaldi.nn.log_softmax(x, axis=1) )

prob.subset(nHead=1).data

{'1272-128104-0000': array([[-14.927118 ,  -2.1667948, -12.816571 , ..., -13.205126 ,
         -13.085374 ,  -7.238185 ],
        [-14.330314 ,  -2.149106 , -12.807892 , ..., -12.644412 ,
         -12.065705 ,  -7.9584923],
        [-16.05725  ,  -1.995657 , -13.714809 , ..., -13.165111 ,
         -13.092941 ,  -9.652242 ],
        ...,
        [-15.551076 ,  -1.2450314, -13.523589 , ..., -12.061856 ,
         -10.668505 ,  -7.116    ],
        [-16.484016 ,  -1.070816 , -13.709951 , ..., -11.353367 ,
         -10.70671  ,  -7.8562517],
        [-17.505625 ,  -1.4061432, -14.523107 , ..., -11.525532 ,
         -11.535381 ,  -8.860452 ]], dtype=float32)}
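To make the transformation concrete, here is a minimal pure-Python sketch of a row-wise log-softmax, which is what `axis=1` selects: each frame (row) is normalized over its classes. The helper name `log_softmax_row` and the toy matrix are illustrative only and are not part of ExKaldi.

```python
import math

def log_softmax_row(row):
    """Return log(softmax(row)) for one frame of raw network outputs."""
    # Subtract the maximum first so exp() cannot overflow.
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
    # Every entry of the row is shifted by the same constant.
    return [v - log_sum for v in row]

# A made-up 2-frame, 3-class matrix of raw outputs (not taken from amp.npy).
logits = [[1.0, 2.0, 3.0],
          [0.5, 0.5, 0.5]]

log_probs = [log_softmax_row(frame) for frame in logits]

for frame in log_probs:
    # Exponentiating a row gives probabilities that sum to 1.
    print([round(v, 4) for v in frame], sum(math.exp(v) for v in frame))
```

The same constant shift is visible in the outputs above: in the first frame, both -3.316089 → -14.927118 and 9.444235 → -2.1667948 differ by about 11.611.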

Then decode based on the WFST. The HCLG graph file and the HMM model file were already generated (07_train_triphone_HMM-GMM_delta).

In [5]:
HCLGFile = os.path.join(dataDir, "exp", "train_delta", "graph", "HCLG.fst")

hmmFile = os.path.join(dataDir, "exp", "train_delta", "final.mdl")

For convenience, also load the lexicons.

In [6]:
lexFile = os.path.join(dataDir, "exp", "lexicons.lex")

lexicons = exkaldi.decode.graph.load_lex(lexFile)

lexicons

<exkaldi.decode.graph.LexiconBank at 0x7febdcc495e0>

Use the __nn_decode__ function.

In [7]:
lat = exkaldi.decode.wfst.nn_decode(prob, hmmFile, HCLGFile, symbolTable=lexicons("words"))

lat

<exkaldi.decode.wfst.Lattice at 0x7febc5f5c3a0>

In [8]:
outDir = os.path.join(dataDir, "exp", "train_DNN", "decode_test")

exkaldi.utils.make_dependent_dirs(outDir, False)

lat.save( os.path.join(outDir,"test.lat") )

'librispeech_dummy/exp/train_DNN/decode_test/test.lat'

Get the 1-best result from the lattice and score it.

In [9]:
refIntFile = os.path.join(dataDir, "exp", "train_delta", "decode_test", "text.int")

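# Try each combination of word insertion penalty and language model weight.
for penalty in [0., 0.5, 1.0]:
    for LMWT in range(10, 15):

        newLat = lat.add_penalty(penalty)
        # Extract the 1-best path at this LM weight.
        result = newLat.get_1best(lexicons("words"), hmmFile, lmwt=LMWT, acwt=0.5)

        # Score the hypothesis against the reference transcription.
        score = exkaldi.decode.score.wer(ref=refIntFile, hyp=result, mode="present")

        print(f"Penalty {penalty}, LMWT {LMWT}: WER {score.WER}")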

Penalty 0.0, LMWT 10: WER 110.98
Penalty 0.0, LMWT 11: WER 110.98
Penalty 0.0, LMWT 12: WER 110.98
Penalty 0.0, LMWT 13: WER 110.98
Penalty 0.0, LMWT 14: WER 110.98
Penalty 0.5, LMWT 10: WER 110.98
Penalty 0.5, LMWT 11: WER 110.98
Penalty 0.5, LMWT 12: WER 110.98
Penalty 0.5, LMWT 13: WER 110.98
Penalty 0.5, LMWT 14: WER 110.98
Penalty 1.0, LMWT 10: WER 110.98
Penalty 1.0, LMWT 11: WER 110.98
Penalty 1.0, LMWT 12: WER 110.98
Penalty 1.0, LMWT 13: WER 110.98
Penalty 1.0, LMWT 14: WER 110.98
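The following toy sketch illustrates how an LM weight, an acoustic weight, and a per-word penalty can change which path counts as 1-best. It is an assumption-laden illustration: the candidate paths and their costs are invented, and the weighted-sum formula in `path_cost` is a simplification, not the exact computation performed by Kaldi or ExKaldi.

```python
# Hypothetical candidate paths: (words, acoustic cost, language-model cost).
candidates = [
    (["the", "cat", "sat"], 30.0, 9.0),
    (["the", "cats", "at"], 29.0, 12.0),
    (["a", "cat", "sat", "at"], 28.5, 12.5),
]

def path_cost(words, ac_cost, lm_cost, lmwt, acwt, penalty):
    # Assumed combination: scaled acoustic cost + scaled LM cost
    # + a fixed penalty for every word on the path.
    return acwt * ac_cost + lmwt * lm_cost + penalty * len(words)

def one_best(lmwt, acwt=0.5, penalty=0.0):
    # Pick the candidate with the lowest combined cost.
    best = min(candidates,
               key=lambda c: path_cost(c[0], c[1], c[2], lmwt, acwt, penalty))
    return best[0]

for penalty in [0.0, 0.5, 1.0]:
    for lmwt in range(10, 15):
        print(penalty, lmwt, " ".join(one_best(lmwt, penalty=penalty)))
```

In this toy setup the same path wins for every setting in the grid, so every setting would yield the same WER; a much smaller weight such as `one_best(0.1)` selects a different path. One path dominating the whole grid is one possible reason for identical WER values across a sweep.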


In step 10_process_lattice_and_score, the best WER of the HMM-GMM model was about 135%, while here it was 107% in our experiment (110.98% in the run shown above). So the HMM-DNN model does perform better.
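A WER above 100% is possible because insertions count as errors while the denominator is only the number of reference words. The sketch below uses the standard edit-distance definition of WER; it is not ExKaldi's implementation, it does not model the `mode="present"` option, and the function name `word_error_rate` is illustrative.

```python
def word_error_rate(ref, hyp):
    """WER in percent: (substitutions + deletions + insertions) / len(ref) * 100.

    Assumes ref is a non-empty list of words.
    """
    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j].
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution
    return 100.0 * dist[len(ref)][len(hyp)] / len(ref)

ref = "the cat sat".split()
hyp = "a dog stood up on it".split()
print(word_error_rate(ref, hyp))  # 3 substitutions + 3 insertions over 3 words
```

Here a three-word reference against a six-word hypothesis with no matching words gives 6 errors over 3 reference words, that is, a WER of 200%.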

This concludes the simple tutorial.