# Welcome to ExKaldi

In this section, we will build a decision tree, which is required in order to train a triphone model.

In [None]:
import exkaldi

import os
dataDir = "librispeech_dummy"

Restore the lexicons generated in an earlier step (3_prepare_lexicons).

In [None]:
lexFile = os.path.join(dataDir, "exp", "lexicons.lex")

lexicons = exkaldi.load_lex(lexFile)

Then instantiate a __DecisionTree__ object. ___lexicons___ can be provided as a parameter.

In [None]:
tree = exkaldi.hmm.DecisionTree(lexicons=lexicons, contextWidth=3, centralPosition=1)

tree
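To illustrate what `contextWidth=3` and `centralPosition=1` mean, here is a minimal sketch in plain Python that does not use ExKaldi. It assumes that each phone is viewed together with its neighbours in a fixed-width window, with the phone of interest at the zero-based central position. The helper name `context_windows` and the padding symbol `<eps>` are hypothetical and chosen only for this illustration.

```python
def context_windows(phones, context_width=3, central_position=1, pad="<eps>"):
    """Return one fixed-width phone window per position in `phones`.

    Hypothetical helper for illustration only; it is not part of ExKaldi.
    """
    left = central_position                        # phones of left context
    right = context_width - central_position - 1   # phones of right context
    padded = [pad] * left + list(phones) + [pad] * right
    return [tuple(padded[i:i + context_width]) for i in range(len(phones))]


# A width of 3 with the centre at index 1 gives the usual triphone view:
# one phone of left context and one phone of right context.
windows = context_windows(["k", "ae", "t"])
for window in windows:
    print(window)
```

With a width of 1 and a central position of 0, the same helper yields single-phone windows, which corresponds to the monophone case.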

Then prepare the acoustic features, the HMM model, and the alignment.

In [None]:
featFile = os.path.join(dataDir, "exp", "train_mfcc_cmvn.ark")
feat = exkaldi.load_feat(featFile)
feat = feat.add_delta(order=2)

feat.dim
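As a rough picture of what appending second-order deltas does to the feature dimension, the sketch below implements the common regression-based delta formula in plain Python. It is an assumption that this matches the behaviour of `add_delta`; the window size of 2, the edge-frame replication, and the function names are illustrative choices, and the exact numerics in ExKaldi may differ.

```python
def compute_deltas(frames, window=2):
    """Regression-based deltas over +/- `window` frames, replicating edge frames."""
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, num_frames - 1)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def append_deltas(frames, order=2, window=2):
    """Concatenate the static features with deltas up to the given order."""
    blocks = [frames]
    for _ in range(order):
        # Each higher order is the delta of the previous block.
        blocks.append(compute_deltas(blocks[-1], window))
    return [sum((block[t] for block in blocks), []) for t in range(len(frames))]


# Toy input: 5 frames of 2-dimensional features that rise linearly over time.
frames = [[float(t), 2.0 * t] for t in range(5)]
expanded = append_deltas(frames, order=2)
print(len(expanded[0]))  # static + delta + delta-delta dimensions
```

With `order=2` the dimension is tripled, so a 13-dimensional feature, for example, would become 39-dimensional.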

The monophone HMM model and alignment were generated in the previous step (5_train_mono_HMM-GMM), so we use them directly here. For all archives, that is, feature, CMVN, probability, fMLLR and alignment, ExKaldi does not allow you to use their files directly, so you need to load them first.

You can load the data or only load the index table.

In [None]:
hmmFile = os.path.join(dataDir, "exp", "train_mono", "final.mdl")

aliFile = os.path.join(dataDir, "exp", "train_mono", "final.ali")
ali = exkaldi.load_index_table(aliFile, useSuffix="ark")
ali

As with training the HMM model, we provide a high-level API to train the tree, but here we first walk through the training steps in detail.

### Train Decision Tree in detail

#### 1. Accumulate statistics data

In [None]:
outDir = os.path.join(dataDir, "exp", "train_delta")

exkaldi.utils.make_dependent_dirs(outDir, False)

In [None]:
treeStatsFile = os.path.join(outDir, "treeStats.acc")

tree.accumulate_stats(feat, hmmFile, ali, outFile=treeStatsFile)
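To give a feel for why statistics are accumulated before the tree is built, the sketch below shows how Gaussian sufficient statistics (count, sum, and sum of squares) are enough to score a candidate split by its gain in log-likelihood, which is the usual criterion in Kaldi-style tree building. This is a simplified one-dimensional illustration under that assumption, not ExKaldi's implementation; all function names and the toy data are hypothetical.

```python
import math


def gaussian_loglike(count, total, total_sq, var_floor=1e-4):
    """Log-likelihood of `count` scalar samples under their own ML Gaussian."""
    mean = total / count
    var = max(total_sq / count - mean * mean, var_floor)
    return -0.5 * count * (math.log(2.0 * math.pi * var) + 1.0)


def split_gain(left, right):
    """Log-likelihood gained by modelling two groups separately instead of pooled."""
    parent = tuple(l + r for l, r in zip(left, right))
    return gaussian_loglike(*left) + gaussian_loglike(*right) - gaussian_loglike(*parent)


def stats(samples):
    """Sufficient statistics (count, sum, sum of squares) of scalar samples."""
    return (len(samples), sum(samples), sum(x * x for x in samples))


# A split that separates two acoustically different groups...
good_gain = split_gain(stats([0.9, 1.0, 1.1, 1.0]), stats([4.9, 5.0, 5.1, 5.0]))
# ...compared with a split that leaves both groups mixed.
mixed_gain = split_gain(stats([0.9, 1.0, 5.1, 5.0]), stats([1.1, 1.0, 4.9, 5.0]))
print(good_gain > mixed_gain)
```

Because the sufficient statistics of a merged group are simply the sums of its parts, many candidate splits can be compared without revisiting the feature frames.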

#### 2. Cluster phones and compile questions.

In [None]:
topoFile = os.path.join(dataDir, "exp", "topo")

questionsFile = os.path.join(outDir, "questions.qst")

tree.compile_questions(treeStatsFile, topoFile, outFile=questionsFile)

#### 3. Build tree.

In [None]:
targetLeaves = 300

tree.build(treeStatsFile, questionsFile, topoFile, numLeaves=targetLeaves)

The decision tree has now been built. Take a look at it.

In [None]:
tree.info

Save the tree to file.

In [None]:
treeFile = os.path.join(outDir, "tree")

tree.save(treeFile)

As mentioned above, we also provide a high-level API to build the tree directly.

### Train Decision Tree with the high-level API

In [None]:
# Discard the tree and the files produced by the detailed steps above
del tree
os.remove(treeStatsFile)
os.remove(questionsFile)
os.remove(treeFile)

In [None]:
tree = exkaldi.hmm.DecisionTree(lexicons=lexicons, contextWidth=3, centralPosition=1)

tree.train(feat=feat, hmm=hmmFile, ali=ali, topoFile=topoFile, numLeaves=300, tempDir=outDir)

The tree has been saved to the directory automatically.

In [None]:
tree.info