# 6.3: Inspecting the `exp` directories

`run_train_phones.sh` generates a new directory in `exp` for each layer of the acoustic model.  We will inspect their contents below.

In [1]:
ls exp

monophones          triphones          triphones_lda          triphones_sat
monophones_aligned  triphones_aligned  triphones_lda_aligned


## `exp/monophones`

This directory contains the files generated from the *first* layer of training: `monophones`.

In [8]:
ls exp/monophones

0.mdl    ali.1.gz  ali.4.gz   final.occs  fsts.3.gz               log
40.mdl   ali.2.gz  cmvn_opts  fsts.1.gz   fsts.4.gz               num_jobs
40.occs  ali.3.gz  final.mdl  fsts.2.gz   kaldi_config_args.json  tree


### `exp/monophones/log`

This directory contains the logs of every step run while training `monophones`.  You'll notice that each log name carries one or more numeric suffixes (`.[0-9]`).  The **last** of these refers to the parallel job (thread) that produced the log.  The **first** refers to a particular iteration (for those steps that are iterative).  Some of these logs are more useful than others, but they are **always** useful when an error occurs.

In [84]:
ls exp/monophones/log

acc.1.1.log   acc.26.3.log  acc.8.1.log     align.4.3.log
acc.1.2.log   acc.26.4.log  acc.8.2.log     align.4.4.log
acc.1.3.log   acc.27.1.log  acc.8.3.log     align.5.1.log
acc.1.4.log   acc.27.2.log  acc.8.4.log     align.5.2.log
acc.10.1.log  acc.27.3.log  acc.9.1.log     align.5.3.log
acc.10.2.log  acc.27.4.log  acc.9.2.log     align.5.4.log
acc.10.3.log  acc.28.1.log  acc.9.3.log     align.6.1.log
acc.10.4.log  acc.28.2.log  acc.9.4.log     align.6.2.log
acc.11.1.log  acc.28.3.log  align.0.1.log   align.6.3.log
acc.11.2.log  acc.28.4.log  align.0.2.log   align.6.4.log
acc.11.3.log  acc.29.1.log  align.0.3.log   align.7.1.log
acc.11.4.log  acc.29.2.log  align.0.4.log   align.7.2.log
acc.12.1.log  acc.29.3.log  align.1.1.log   align.7.3.log
acc.12.2.log  acc.29.4.log  align.1.2.log   align.7.4.log
acc.12.3.log  acc.3.1.log   align.1.3.log   align.8.1.log
acc.12.4.log  acc.3.2.log   align.1.4.log   align.8.2.log
acc.13.1.log  acc.3.3.log   align.10.1.log  align.8.3.log
acc.13.2.log  
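
The naming convention described above can be unpacked with plain shell string handling.  The sketch below is only an illustration: `parse_log_name` is a hypothetical helper (not part of `kaldi`), and it assumes names of the form `<step>.<iteration>.<job>.log`, as in the listing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kaldi): split a log name such as
# acc.26.3.log into its step, iteration, and job number.
parse_log_name() {
    local name="${1%.log}"      # drop the .log extension -> acc.26.3
    local job="${name##*.}"     # text after the last dot  -> 3
    local rest="${name%.*}"     # drop the job suffix      -> acc.26
    local iter="${rest##*.}"    # text after the last dot  -> 26
    local step="${rest%.*}"     # what remains             -> acc
    echo "step=${step} iteration=${iter} job=${job}"
}

parse_log_name acc.26.3.log
parse_log_name align.0.1.log
```

Logs with only one numeric suffix would need a separate case, which this sketch does not handle.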

### `num_jobs`

There will often be a `num_jobs` file in `kaldi` directories.  It contains a single integer: the number of parallel jobs (threads) used for that step.

In [6]:
cat exp/monophones/num_jobs

4
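
Because outputs such as `ali.*.gz` and `fsts.*.gz` are written once per job, `num_jobs` tells a script how many of them to expect.  A minimal sketch, using a throwaway directory as a stand-in for `exp/monophones` (the variable names are illustrative):

```shell
#!/usr/bin/env bash
# Stand-in for exp/monophones so the sketch runs without kaldi.
dir="$(mktemp -d)"
echo 4 > "${dir}/num_jobs"

# Read the job count, then build the list of per-job alignment archives.
nj="$(cat "${dir}/num_jobs")"
expected=""
for job in $(seq 1 "${nj}"); do
    expected="${expected}ali.${job}.gz "
done
echo "${expected}"

rm -rf "${dir}"
```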


### `cmvn_opts`

You will often see a file ending in `_opts`.  This is an options file that *sometimes* contains hyperparameter settings that will be read by scripts.  Its contents take the same format as the `non_vanilla_*` arguments we can set in `kaldi_config.json`:

```
--variable_name [variable_value]
```

In this case, `cmvn_opts` is empty.


In [4]:
cat exp/monophones/cmvn_opts
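
A common pattern is for a script to read such a file into a variable and splice its contents into a command line, so an empty file simply adds no options.  The sketch below imitates that pattern in a temporary directory.  The file `some_opts`, the option value `--norm-vars=false`, and `some-command` are placeholders chosen for illustration, not settings taken from this project.

```shell
#!/usr/bin/env bash
dir="$(mktemp -d)"

# An options file with one setting, and an empty one like cmvn_opts above.
echo "--norm-vars=false" > "${dir}/some_opts"
: > "${dir}/cmvn_opts"

# Read each file; a missing or empty file yields an empty string.
some_opts="$(cat "${dir}/some_opts" 2>/dev/null)"
cmvn_opts="$(cat "${dir}/cmvn_opts" 2>/dev/null)"

# Unquoted on purpose so the options split into separate arguments.
# `echo` stands in for the real command that would receive them.
with_opts="$(echo some-command ${some_opts} input output)"
without_opts="$(echo some-command ${cmvn_opts} input output)"
echo "${with_opts}"
echo "${without_opts}"

rm -rf "${dir}"
```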




### `{40,final}.occs`

This file contains the "per-transition-id occupation counts" and is "rarely needed" (quotes from a post by the main author of `kaldi`).  So we will ignore this file.  

In this case, you see a `40.occs` and a `final.occs`.  This implies that the file was updated iteratively and that every version except the one from the last iteration (in this case, `40`) was deleted.  `final.occs` is then a symbolic link to the highest-numbered file left in the directory.  You can see this represented by the `->` in the output of the `ls -lah` command below.

**Note:** `kaldi` will utilize this structure often, including below with the `.mdl` files.

In [10]:
ls -lah exp/monophones | grep occs

-rw-r--r--  1 root root  811 Nov 29 20:12 40.occs
lrwxrwxrwx  1 root root    7 Nov 29 20:12 final.occs -> 40.occs
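
The same layout is easy to reproduce and inspect by hand.  The sketch below builds it in a temporary directory with a placeholder file (not a real `.occs` file) and uses `readlink` to print the link target.

```shell
#!/usr/bin/env bash
dir="$(mktemp -d)"

# Pretend training kept only the file from the last iteration...
echo "placeholder" > "${dir}/40.occs"
# ...and pointed final.occs at it with a relative symbolic link.
ln -s 40.occs "${dir}/final.occs"

# readlink prints the link target, i.e. what follows `->` in ls -l.
# The size ls reports for the link (7 above) is the length of this name.
target="$(readlink "${dir}/final.occs")"
echo "final.occs -> ${target}"

# Reading through the link returns the contents of the target file.
contents="$(cat "${dir}/final.occs")"
echo "${contents}"

rm -rf "${dir}"
```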


### `{40,final}.mdl`

The `.mdl` file is the actual acoustic model file for this step.  If we were so inclined, we could use this `.mdl` file as one of the arguments passed to our decoding step.  Each "layer" of our acoustic training will generate a `.mdl` file.

This file stores the HMM transition model together with the Gaussian mixture parameters, and we'll look at these `.mdl` files in more detail in the next notebook.  Its transition model can be printed in "human-readable" form using `show-transitions` (as long as you have `source`d `path.sh`).


In [4]:
. ${KALDI_INSTRUCTIONAL_PATH}/path.sh
show-transitions

show-transitions 

Print debugging info from transition model, in human-readable form
Usage:  show-transitions <phones-symbol-table> <transition/model-file> [<occs-file>]
e.g.: 
 show-transitions phones.txt 1.mdl 1.occs

Standard options:
  --config                    : Configuration file to read (this option may be repeated) (string, default = "")
  --help                      : Print out usage message (bool, default = false)
  --print-args                : Print the command line arguments (to stderr) (bool, default = true)
  --verbose                   : Verbose level (higher->more logging) (int, default = 0)



: 1

In [79]:
show-transitions \
    data/lang/phones.txt \
    exp/monophones/final.mdl \
    | head

show-transitions data/lang/phones.txt exp/monophones/final.mdl 
Transition-state 1: phone = SIL hmm-state = 0 pdf = 0
 Transition-id = 1 p = 0.825838 [self-loop]
 Transition-id = 2 p = 0.01 [0 -> 1]
 Transition-id = 3 p = 0.154166 [0 -> 2]
 Transition-id = 4 p = 0.01 [0 -> 3]
Transition-state 2: phone = SIL hmm-state = 1 pdf = 1
 Transition-id = 5 p = 0.951921 [self-loop]
 Transition-id = 6 p = 0.01 [1 -> 2]
 Transition-id = 7 p = 0.01 [1 -> 3]
 Transition-id = 8 p = 0.0280863 [1 -> 4]
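
One quick sanity check on this output is that the probabilities leaving each transition-state sum to approximately 1.  The sketch below runs `awk` over the lines printed above, pasted into a here-document so that it runs without `kaldi`; `sum_transitions` is a hypothetical helper name.  With a real model, the output of `show-transitions` could be piped into the same function.

```shell
#!/usr/bin/env bash
# Sum the transition probabilities within each transition-state.
sum_transitions() {
    awk '
        /^Transition-state/ { state = $2; sub(/:/, "", state) }
        /Transition-id/ {
            # The probability is the field two places after the "p" token.
            for (i = 1; i <= NF; i++)
                if ($i == "p") total[state] += $(i + 2)
        }
        END { for (s in total) printf "state %s sum %.3f\n", s, total[s] }
    ' | sort
}

sums="$(sum_transitions <<'EOF'
Transition-state 1: phone = SIL hmm-state = 0 pdf = 0
 Transition-id = 1 p = 0.825838 [self-loop]
 Transition-id = 2 p = 0.01 [0 -> 1]
 Transition-id = 3 p = 0.154166 [0 -> 2]
 Transition-id = 4 p = 0.01 [0 -> 3]
Transition-state 2: phone = SIL hmm-state = 1 pdf = 1
 Transition-id = 5 p = 0.951921 [self-loop]
 Transition-id = 6 p = 0.01 [1 -> 2]
 Transition-id = 7 p = 0.01 [1 -> 3]
 Transition-id = 8 p = 0.0280863 [1 -> 4]
EOF
)"
echo "${sums}"
```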


### `fsts.*.gz`

These files (one per parallel job) contain the `FST`s representing our training data.  We will look at similar `FST`s used at **test** time later, so for now we'll ignore these files.

### `ali.*.gz`

These files contain the alignment information mapping each frame to a phone.  You may recall that we used a similar `ali.*.gz` file in `4_3-examining_mfccs.ipynb`.  We can use `ali-to-phones` to convert these alignments into a sequence of phones.  We will look at these alignments in more detail later.

In [81]:
ali-to-phones

ali-to-phones 

Convert model-level alignments to phone-sequences (in integer, not text, form)
Usage:  ali-to-phones  [options] <model> <alignments-rspecifier> <phone-transcript-wspecifier|ctm-wxfilename>
e.g.: 
 ali-to-phones 1.mdl ark:1.ali ark:-
or:
 ali-to-phones --ctm-output 1.mdl ark:1.ali 1.ctm
See also: show-alignments lattice-align-phones

Options:
  --ctm-output                : If true, output the alignments in ctm format (the confidences will be set to 1) (bool, default = false)
  --frame-shift               : frame shift used to control the times of the ctm output (float, default = 0.01)
  --per-frame                 : If true, write out the frame-level phone alignment (else phone sequence) (bool, default = false)
  --write-lengths             : If true, write the #frames for each phone (different format) (bool, default = false)

Standard options:
  --config                    : Configuration file to read (this option may be repeated) (string, default = "")
  --help       

: 1

**Note:** Notice that they are `gzipped` (compressed).  So, in order to access the "actual" binary file, you'll need to decompress the file, either in a separate, initial step or via a `piped` step.  Below you can see how you can decompress "on-the-fly" using `gzip -cd`.

**Note:** You'll also notice we're piping the output to `int2sym.pl`, since `ali-to-phones` writes integer indexes.  This converts those indexes to their corresponding phone symbols.
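
Both notes can be tried without `kaldi`.  The sketch below is only an imitation: it gzips a small text file of integer IDs, decompresses it on the fly with `gzip -cd`, and uses `awk` to mimic what `int2sym.pl -f 2-` does (replace every field from the second onward with its symbol).  The symbol table, its IDs, and the utterance name `utt1` are made up for illustration; a real `ali.*.gz` file is a compressed `kaldi` archive that needs `ali-to-phones` to be read.

```shell
#!/usr/bin/env bash
dir="$(mktemp -d)"

# Made-up symbol table in the "<symbol> <integer>" layout of phones.txt.
printf 'SIL 1\nHH_B 2\nIY1_E 3\n' > "${dir}/phones.txt"

# Made-up integer sequence: utterance id, then one phone id per frame.
echo "utt1 1 1 2 2 3" | gzip -c > "${dir}/ints.gz"

# Decompress on the fly, then map fields 2 onward through the table.
mapped="$(gzip -cd "${dir}/ints.gz" | awk '
    NR == FNR { sym[$2] = $1; next }    # first file: load the table
    { for (i = 2; i <= NF; i++) $i = sym[$i]; print }
' "${dir}/phones.txt" -)"
echo "${mapped}"

rm -rf "${dir}"
```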

In [83]:
ali-to-phones \
    --per-frame=true \
    exp/monophones/final.mdl \
    "ark:gzip -cd exp/monophones/ali.1.gz|" \
    "ark,t:|int2sym.pl -f 2- data/lang/phones.txt" \
    | head -n1

ali-to-phones --per-frame=true exp/monophones/final.mdl 'ark:gzip -cd exp/monophones/ali.1.gz|' 'ark,t:|int2sym.pl -f 2- data/lang/phones.txt' 
1272-128104-0009 SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL SIL HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B HH_B IY1_E IY1_E IY1_E IY1_E IY1_E IY1_E IY1_E L_B L_B L_B L_B L_B L_B L_B L_B L_B L_B L_B L_B AH0_I AH0_I AH0_I AH0_I M_I M_I M_I M_I M_I M_I M_I M_I M_I M_I M_I EH1_I EH1_I EH1_I EH1_I EH1_I N_I N_I N_I N_I N_I N_I T_I T_I T_I S_E S_E S_E S_E S_E S_E S_E S_E S_E S_E S_E M_B M_B M_B M_B M_B M_B M_B M_B M_B M_B M_B OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I OW1_I S_I S_I S_I S_I S_I S_I S_I S_I S_I S_I S_I S_I S_I S_I S_I T_E T_E T_E T_E T_E SIL SIL SIL SIL SIL SIL SIL SIL SIL B_B B_B B_B B_B B_B B_B B_B B_B IH1_I IH1_I IH

LOG (ali-to-phones[5.2.191~1-48be1]:main():ali-to-phones.cc:134) Done 68 utterances.
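
Since `--per-frame=true` prints one phone label per frame, the length of each run of identical labels is that phone's duration in frames.  Multiplying by the frame shift (0.01 by default according to the usage message above, assumed here to be in seconds) gives a duration in time.  The usage message lists `--write-lengths` for writing the number of frames per phone directly; the `awk` sketch below just makes the idea explicit on a shortened, made-up line.  `run_lengths` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Collapse runs of identical per-frame labels into
# "<phone> <frames> <seconds>" lines, one per run.
run_lengths() {
    awk -v frame_shift=0.01 '
        {
            prev = $2; n = 1            # field 1 is the utterance id
            for (i = 3; i <= NF; i++) {
                if ($i == prev) {
                    n++
                } else {
                    printf "%s %d %.2f\n", prev, n, n * frame_shift
                    prev = $i; n = 1
                }
            }
            printf "%s %d %.2f\n", prev, n, n * frame_shift
        }
    '
}

durations="$(echo "utt1 SIL SIL SIL HH_B HH_B IY1_E" | run_lengths)"
echo "${durations}"
```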


### `tree`

This file is a representation of the decision tree that will be used to cluster the phones.  We will go into much more detail about this later and we will look at a visual representation of this tree generated by `draw-tree`.

In [96]:
draw-tree

draw-tree 

Outputs a decision tree description in GraphViz format
Usage: draw-tree [options] <phone-symbols> <tree>
e.g.: draw-tree phones.txt tree | dot -Gsize=8,10.5 -Tps | ps2pdf - tree.pdf

Options:
  --gen-html                  : generates HTML boilerplate(useful with SVG) (bool, default = false)
  --query                     : a query to trace through the tree(format: pdf-class/ctx-phone1/.../ctx-phoneN) (string, default = "")
  --use-tooltips              : use tooltips instead of labels (bool, default = false)

Standard options:
  --config                    : Configuration file to read (this option may be repeated) (string, default = "")
  --help                      : Print out usage message (bool, default = false)
  --print-args                : Print the command line arguments (to stderr) (bool, default = true)
  --verbose                   : Verbose level (higher->more logging) (int, default = 0)



: 255

The command below will save a `.png` of the tree to `exp/monophones/tree.png`, and the next cell will render that `.png` using `Markdown` (if you want to see how to render images in `Markdown`, click on the next cell and the `Markdown` command will be revealed).  Obviously, it's not too easy to see.  We'll look at some close-ups in the next notebook.

In [105]:
draw-tree \
    data/lang/phones.txt \
    exp/monophones/tree \
    | dot -Tpng -Gsize=8,10.5 > exp/monophones/tree.png

draw-tree --query=0/SIL_B data/lang/phones.txt exp/monophones/tree 


![tree](exp/monophones/tree.png)