# Sequence Labeling with Weighted Finite State Transducers

- Language Understanding Systems Lab
- Evgeny A. Stepanov
- stepanov.evgeny.a@gmail.com

## (Spoken/Natural) Language Understanding with WFST

$$\lambda_{LU} = \lambda_{W} \circ \lambda_{G} \circ \lambda_{W2T} \circ \lambda_{SCLM}$$

Where:
- $\lambda_{W}$
- $\lambda_{G}$
- $\lambda_{W2T}$
- $\lambda_{SCLM}$
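The cascade above can be run with the OpenFst command-line tools. What follows is a minimal sketch, assuming the four machines are already compiled and arc-sorted, and assuming the best analysis is read off as the shortest path through the composed machine (the text above does not state the decoding step). The function name `decode_lu` and the file names in the usage comment are hypothetical.

```shell
# Sketch only: compose the four machines left to right, then keep the best path.
# Arguments (all hypothetical file names): sentence FSA, G, W2T, SCLM.
decode_lu() {
    local w="$1" g="$2" w2t="$3" sclm="$4"
    fstcompose "$w" "$g" |
        fstcompose - "$w2t" |
        fstcompose - "$sclm" |
        fstrmepsilon |       # drop epsilon arcs introduced by composition
        fstshortestpath |    # keep the single lowest-cost path
        fsttopsort           # order states so the path prints left to right
}

# Usage (assumes the files exist and share compatible symbol tables):
#   decode_lu sent.fsa g.fst w2t.fst sclm.fst | fstprint
```

Piping between the tools avoids writing intermediate machines to disk; passing `-` as the first argument makes `fstcompose` read the left operand from standard input.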


## Joint Distribution Modeling (Dinarelli et al.)

As we have seen, sequence labeling for Natural Language Understanding can be approached using Hidden Markov Models (similar to Part-of-Speech Tagging, with modifications, i.e. O --> w).



| Model   | Equation |
|:--------|:---------|
| __HMM__ | $$p(w_{1}^{n},t_{1}^{n}) \approx \prod_{i=1}^{n}{p(w_i \mid t_i)\,p(t_i \mid t_{i-N+1}^{i-1})}$$ |
| __R&R__ | $$p(w_{1}^{n},t_{1}^{n}) \approx \prod_{i=1}^{n}{p(w_{i}t_{i} \mid w_{i-N+1}^{i-1}t_{i-N+1}^{i-1})}$$ |
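The HMM factorization in the table follows from the chain rule plus two independence assumptions. A short derivation, using only the symbols of the table, makes the steps explicit. Since $p(w_{1}^{n})$ is fixed for a given input, maximizing the posterior over tags is the same as maximizing the joint:

$$
\begin{aligned}
\hat{t}_{1}^{n} &= \arg\max_{t_{1}^{n}} p(t_{1}^{n} \mid w_{1}^{n}) = \arg\max_{t_{1}^{n}} p(w_{1}^{n}, t_{1}^{n}) \\
p(w_{1}^{n}, t_{1}^{n}) &= p(w_{1}^{n} \mid t_{1}^{n})\, p(t_{1}^{n}) \\
&\approx \prod_{i=1}^{n} p(w_i \mid t_i)\, p(t_i \mid t_{i-N+1}^{i-1})
\end{aligned}
$$

The last step assumes that each word depends only on its own tag and that each tag depends only on the previous $N-1$ tags. The R&R row drops the first assumption: each word-tag pair is conditioned on the previous $N-1$ word-tag pairs.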





Dinarelli et al. propose to model $p(t_{1}^{n})$ jointly with the words, as $p(w_{1}^{n}, t_{1}^{n})$:

- join each word and its tag into a single token
- train a language model (LM) over the joined tokens
- use them together
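The first step of the list, joining words and tags, can be sketched in a few lines of shell. The sketch assumes a token-per-line input of the form `word<TAB>tag` with a blank line between sentences; that format, the function name `join_words_tags`, the `+` separator, and the example tags are illustrative assumptions, not taken from the original data.

```shell
# Sketch only: turn token-per-line "word<TAB>tag" data into one sentence per
# line of joined "word+tag" tokens, ready to be fed to an n-gram LM toolkit.
join_words_tags() {
    # Hypothetical separator; choose one that never occurs in words or tags.
    local sep="${1:-+}"
    awk -F'\t' -v sep="$sep" '
        NF == 0 { if (line != "") print line; line = ""; next }  # sentence boundary
        { line = (line == "" ? "" : line " ") $1 sep $2 }         # append word+tag
        END { if (line != "") print line }                        # flush last sentence
    '
}

# Made-up example: two short sentences with IOB-style tags.
printf 'movies\tO\nby\tO\nspielberg\tB-director\n\nshow\tO\nthrillers\tB-genre\n' |
    join_words_tags
```

Each output line is then an ordinary "sentence" over the joined vocabulary, so a standard n-gram toolkit (for example OpenGrm NGram) could estimate the R&R-style distribution over word-tag pairs without any special handling.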



## TM + LM

In [4]:
%%bash

function char_lex() {
    # epsilon, unknown, space
    echo -e "<eps>\t0"
    echo -e "<unk>\t1"
    echo -e "<space>\t2"

    cnt=3

    # lowercase letters
    for c in {a..z}
    do
        echo -e "$c\t$cnt"
        ((cnt++))
    done

    # capital letters
    for c in {A..Z}
    do
        echo -e "$c\t$cnt"
        ((cnt++))
    done
}

function fsa_e1a() {
    # acceptor for exactly one character: one arc 0 -> 1 per symbol
    # lowercase letters
    for c in {a..z}
    do
        echo "0 1 $c"
    done

    # uppercase letters
    for c in {A..Z}
    do
        echo "0 1 $c"
    done

    # space
    echo "0 1 <space>"

    # final state is 1
    echo "1"
}

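# write the symbol table and the textual description of the acceptor
char_lex > e1.sym
fsa_e1a > e1a.txt

# compile to a binary acceptor, storing the symbol table inside the FST
fstcompile --acceptor --isymbols=e1.sym --keep_isymbols e1a.txt e1a.bin
# print the first 10 arcs; symbols are read from the stored table
fstprint --acceptor e1a.bin | head -n 10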

0	1	a
0	1	b
0	1	c
0	1	d
0	1	e
0	1	f
0	1	g
0	1	h
0	1	i
0	1	j
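Before compiling against a generated symbol table such as `e1.sym`, it can help to check that the table is well formed. The sketch below is one possible check: every line has two tab-separated fields, no symbol or id is repeated, and `<eps>` is mapped to 0 (OpenFst reserves label 0 for epsilon). The function name `check_symtab` is hypothetical.

```shell
# Sketch only: validate a "symbol<TAB>id" table; the exit status is 0 when no
# problem is found and 1 otherwise, with one message per problem on stdout.
check_symtab() {
    awk -F'\t' '
        NF != 2 { printf "line %d: expected 2 tab-separated fields\n", NR; bad = 1; next }
        $1 in sym { printf "line %d: duplicate symbol %s\n", NR, $1; bad = 1 }
        $2 in id  { printf "line %d: duplicate id %s\n", NR, $2; bad = 1 }
        $1 == "<eps>" && $2 != 0 { printf "line %d: <eps> must have id 0\n", NR; bad = 1 }
        { sym[$1] = 1; id[$2] = 1 }   # remember what has been seen so far
        END { exit bad }
    ' "$1"
}

# Usage (assumes e1.sym was written by char_lex):
#   check_symtab e1.sym && echo "e1.sym looks consistent"
```

A check like this catches counter mistakes in the generator (for example, two loops reusing the same id range) before they show up as confusing output from `fstprint`.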
