# Welcome to Exkaldi

In this section, we decode the test data using an HMM-DNN model and a WFST graph.

In [1]:
import exkaldi

import os
dataDir = os.path.join("..","examplesdata","librispeech_dummy")

Load the posterior probabilities of the acoustic model (AM) from file (generated in 11_train_DNN_acoustic_model_with_tensorflow).

In [3]:
probFile = os.path.join(dataDir, "exp", "train_DNN", "amp.npy")

prob = exkaldi.load_prob(probFile)

prob

<exkaldi.core.achivements.NumpyProbability at 0x7f06d3bddac8>

In [4]:
prob.subset(nHead=1).arrays

[array([[  9.898075  ,   1.0953686 ,  -7.4839864 , ..., -16.370773  ,
         -12.213672  , -18.83996   ],
        [ 10.67855   ,   2.26591   , -10.3419895 , ..., -11.463517  ,
          -9.06819   , -15.469423  ],
        [ 12.087365  ,   2.3868787 ,  -7.8134317 , ..., -10.210643  ,
          -8.51814   , -13.351054  ],
        ...,
        [ 12.916887  ,   8.691092  ,  -6.6198387 , ...,  -6.230239  ,
           0.23029727, -13.86903   ],
        [ 14.433498  ,  10.299255  ,  -6.050156  , ...,  -7.0731497 ,
          -2.3821042 , -15.21595   ],
        [ 12.779198  ,  10.429777  ,  -7.52954   , ...,  -7.4322987 ,
          -3.2171364 , -13.760274  ]], dtype=float32)]

As shown above, this is the raw network output, with no log-softmax activation applied, so we need to apply log-softmax ourselves.

Exkaldi's NumPy-based objects (such as the NumpyProbability above) provide a __.map()__ method, which applies a function to every matrix. We use it to apply log-softmax to all matrices.

In [5]:
prob = prob.map( lambda x: exkaldi.nn.log_softmax(x, axis=1) )

prob.subset(nHead=1).arrays

[array([[ -6.8242016, -15.626908 , -24.206263 , ..., -33.09305  ,
         -28.935947 , -35.562237 ],
        [ -5.423871 , -13.836511 , -26.44441  , ..., -27.565937 ,
         -25.17061  , -31.571844 ],
        [ -5.6012163, -15.3017025, -25.502014 , ..., -27.899223 ,
         -26.206722 , -31.039635 ],
        ...,
        [ -8.47839  , -12.7041855, -28.015116 , ..., -27.625515 ,
         -21.16498  , -35.264305 ],
        [ -8.655146 , -12.789389 , -29.1388   , ..., -30.161793 ,
         -25.470749 , -38.304596 ],
        [ -9.562988 , -11.912409 , -29.871727 , ..., -29.774485 ,
         -25.559322 , -36.10246  ]], dtype=float32)]
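For reference, the log-softmax transform applied above can be sketched in plain NumPy. This is only a minimal illustration, not Exkaldi's own implementation of `exkaldi.nn.log_softmax`; the helper name `log_softmax_sketch` and the toy logits are assumptions made for this example.

```python
import numpy as np

def log_softmax_sketch(x, axis=1):
    """Numerically stable log-softmax along `axis` (illustrative helper)."""
    # Subtract the per-row maximum so that exp() cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    # log(softmax(x)) = shifted - log(sum(exp(shifted)))
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))

# Toy logits (made-up values): 2 frames x 3 classes.
logits = np.array([[9.9, 1.1, -7.5],
                   [10.7, 2.3, -10.3]], dtype=np.float32)

log_probs = log_softmax_sketch(logits, axis=1)

# After the transform, every value is <= 0 and each row of
# exp(log_probs) sums to 1, i.e. each frame is a valid distribution.
row_sums = np.exp(log_probs).sum(axis=1)
print(log_probs)
print(row_sums)
```

A quick way to tell raw network output from log-probabilities is the same check: exponentiate a frame and see whether it sums to 1.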

Next, decode with a WFST graph. The HCLG graph file and the HMM model file were generated earlier (in 07_train_triphone_HMM-GMM_delta).

In [6]:
HCLGFile = os.path.join(dataDir, "exp", "graph", "HCLG.fst")

hmmFile = os.path.join(dataDir, "exp", "train_delta", "final.mdl")

For convenience, also load the lexicons.

In [7]:
lexFile = os.path.join(dataDir, "exp", "lexicons.lex")

lexicons = exkaldi.decode.graph.load_lex(lexFile)

lexicons

<exkaldi.decode.graph.LexiconBank at 0x7f06d3bf6cf8>

Use the __nn_decode__ function.

In [8]:
lat = exkaldi.decode.wfst.nn_decode(prob, hmmFile, HCLGFile, wordSymbolTable=lexicons("words"))

lat

<exkaldi.decode.wfst.Lattice at 0x7f07311a25f8>

Get the 1-best result from the lattice.

In [9]:
result = lat.get_1best(wordSymbolTable=lexicons("words"), hmm=hmmFile, lmwt=1, acwt=0.5)

result.subset(nHead=1)

{'103-1240-0000': '182 790 718 908 670 589 1120 718 908 670 654 605 1259 1146 89 676 965 298 314 584 4 653 529 449 1279 35 51 619 329 51 1187 161 4 153'}
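The 1-best result above is a sequence of integer word IDs. As a rough sketch of how such IDs could be mapped back to words, the snippet below parses a Kaldi-style symbol table (one `symbol id` pair per line) and converts a transcription. The helper names, the toy table, and the toy utterance are all hypothetical and not part of Exkaldi's API; the real mapping would come from the words symbol table in the lexicons.

```python
def load_symbol_table(lines):
    """Parse 'symbol id' lines into an id -> symbol dict (illustrative helper)."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        symbol, idx = line.split()
        table[int(idx)] = symbol
    return table

def ids_to_words(transcript, table, unk="<UNK>"):
    """Replace each integer ID in a space-separated transcript with its word."""
    return " ".join(table.get(int(tok), unk) for tok in transcript.split())

# Toy data (made up for this example).
toy_table_lines = ["<eps> 0", "hello 1", "world 2"]
toy_result = {"utt-0001": "1 2 1"}

table = load_symbol_table(toy_table_lines)
decoded = {utt: ids_to_words(text, table) for utt, text in toy_result.items()}
print(decoded)
```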

Score the result against the reference transcription.

In [11]:
refFile = os.path.join(dataDir, "exp", "text.int")

score = exkaldi.decode.score.wer(ref=refFile, hyp=result, mode="present")

score

Score(WER=1.86, words=3707, insErr=27, delErr=6, subErr=36, SER=9.0, sentences=9, wrongSentences=100, missedSentences=0)
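To make the WER figure concrete: it is the number of insertions, deletions, and substitutions divided by the number of reference words, as a percentage. The sketch below counts those errors with a standard edit-distance alignment on a toy sentence pair and then recomputes the WER from the counts printed above. It is an illustration only, not Exkaldi's scoring code; the function name `align_counts` and the toy sentences are assumptions.

```python
def align_counts(ref, hyp):
    """Return (insertions, deletions, substitutions) of a minimum-edit alignment."""
    # table[i][j] = (total, ins, dels, subs) for ref[:i] versus hyp[:j]
    table = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        table[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, len(hyp) + 1):
        table[0][j] = (j, j, 0, 0)      # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, a, b, c = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, a, b, c)             # match, no cost
            else:
                diag = (t + 1, a, b, c + 1)     # substitution
            t, a, b, c = table[i][j - 1]
            ins = (t + 1, a + 1, b, c)          # insertion
            t, a, b, c = table[i - 1][j]
            dele = (t + 1, a, b + 1, c)         # deletion
            table[i][j] = min(diag, ins, dele)  # lowest total cost wins
    return table[-1][-1][1:]

# Toy example (made-up sentences).
ref = "a b c d".split()
hyp = "a x c".split()
n_ins, n_del, n_sub = align_counts(ref, hyp)
toy_wer = 100.0 * (n_ins + n_del + n_sub) / len(ref)
print(n_ins, n_del, n_sub, toy_wer)

# Recompute the WER from the counts reported in the Score output above.
reported_wer = round(100.0 * (27 + 6 + 36) / 3707, 2)
print(reported_wer)
```

The second calculation reproduces the reported value: 69 errors over 3707 reference words is about 1.86%.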

This concludes the simple tutorial.