# Hidden Markov Model

Assumptions (where $Z$ stands for any other variables in the model):
- $P(y_{t+1}|y_{t}, Z) = P(y_{t+1}|y_{t})$
- $P(x_{t}|y_{t}, Z) = P(x_{t}|y_{t})$
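
Taken together, these two assumptions let the joint distribution factor into per-step terms. As a sketch, indexing time from $0$ to $T$ (a convention assumed here for illustration):

$$ P(x_0, \dots, x_T, y_0, \dots, y_T) = P(y_0)\,P(x_0|y_0)\prod_{t=1}^{T} P(y_t|y_{t-1})\,P(x_t|y_t) $$

Each factor is either an initial, a transition, or an emission probability, which is why only those three tables are needed.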

There are three kinds of probabilities:
- Transition $P(y_{t+1}|y_{t})$ (named `transmission` in the code): probability of moving from one hidden state to another
- Emission $P(x_t|y_t)$: probability of an observation given the hidden state
- Initial $\pi$: probability distribution over the initial hidden state
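
To illustrate how these three distributions fit together, here is a minimal sketch that samples a short sequence from an HMM. The 2-state, 3-symbol model and the helper name `sample_hmm` are hypothetical and chosen only for illustration; the emission matrix uses the same (observations x hidden states) layout as the code in this notebook.

```python
import numpy as np

# Hypothetical toy model (values chosen for illustration only).
pi = np.array([0.6, 0.4])            # pi[i]   = P(y_0 = i)
T = np.array([[0.7, 0.3],            # T[i, j] = P(y_{t+1} = j | y_t = i)
              [0.2, 0.8]])
E = np.array([[0.5, 0.1],            # E[o, j] = P(x_t = o | y_t = j)
              [0.4, 0.3],
              [0.1, 0.6]])

def sample_hmm(pi, T, E, length, seed=0):
    """Draw hidden states and observations step by step."""
    rng = np.random.default_rng(seed)
    states, observations = [], []
    state = rng.choice(len(pi), p=pi)                 # initial state from pi
    for _ in range(length):
        states.append(int(state))
        # emit an observation from the current state's emission column
        observations.append(int(rng.choice(E.shape[0], p=E[:, state])))
        # move to the next state using the current state's transition row
        state = rng.choice(len(pi), p=T[state])
    return states, observations

states, observations = sample_hmm(pi, T, E, length=5)
```

Each row of `T` and each column of `E` must sum to 1 for the sampling calls to be valid.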

## Forward Algorithm

$$ \alpha_{t}(j) = P(x_0, x_1, \dots, x_t, y_t = j)
= P(x_t|y_t = j)\sum_{i}[P(y_t = j|y_{t-1} = i)P(x_0, x_1, \dots, x_{t-1}, y_{t-1} = i)] $$
$$
= P(x_t|y_t = j)\sum_{i}[P(y_t = j|y_{t-1} = i)\alpha_{t-1}(i)]
$$

dimension($\alpha_{t}$) = (# hidden states x 1) = dimension($y_t$), with one entry $\alpha_{t}(j)$ per hidden state $j$

dimension($x_t$) = (#observations x 1)

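$$ \alpha_{0}(j) = P(x_0, y_0 = j) = P(x_0|y_0 = j)P(y_0 = j) $$

$P(x_0|y_0)$ comes from the emission matrix and $P(y_0)$ is the initial distribution $\pi$; both are known at initialization.

$$ \alpha_{1}(j) = P(x_0, x_1, y_1 = j) = P(x_1|y_1 = j) \sum_{i}[P(y_1 = j|y_0 = i)\alpha_{0}(i)] $$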
...
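
The recursion can also be written as one matrix-vector product per step. The sketch below is not the notebook's implementation: it assumes an explicit initial distribution `pi`, uses a hypothetical toy model, and returns the sequence likelihood obtained by summing the final $\alpha$ over hidden states. A brute-force sum over all hidden paths is included only to check the result on a short sequence.

```python
import itertools
import numpy as np

def forward_likelihood(obs, pi, T, E):
    """Return P(x_0, ..., x_N) by summing the final alpha over hidden states."""
    alpha = E[obs[0]] * pi                # alpha_0(j) = P(x_0|y_0=j) P(y_0=j)
    for o in obs[1:]:
        # alpha_t(j) = P(x_t|y_t=j) * sum_i T[i, j] * alpha_{t-1}(i)
        alpha = E[o] * (T.T @ alpha)
    return alpha.sum()

def brute_force_likelihood(obs, pi, T, E):
    """Sum the joint probability over every hidden path (exponential cost)."""
    total = 0.0
    for path in itertools.product(range(len(pi)), repeat=len(obs)):
        p = pi[path[0]] * E[obs[0], path[0]]
        for t in range(1, len(obs)):
            p *= T[path[t - 1], path[t]] * E[obs[t], path[t]]
        total += p
    return total

# Hypothetical toy model: 2 hidden states, 3 observation symbols.
pi = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3],
              [0.2, 0.8]])                # T[i, j] = P(y_t = j | y_{t-1} = i)
E = np.array([[0.5, 0.1],
              [0.4, 0.3],
              [0.1, 0.6]])                # E[o, j] = P(x_t = o | y_t = j)
obs = [0, 2, 1]

likelihood = forward_likelihood(obs, pi, T, E)
```

The forward pass costs O(N h^2) for N observations and h hidden states, whereas the brute-force sum enumerates h^N paths.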

`Note`: In my code, I used T.T\[i\] and E\[i\].
* T.T\[i\]: the transmission probabilities $P(y_t = i|y_{t-1})$ from all hidden states into hidden state `i` (1 x h)
* E\[i\]: the emission probabilities $P(x_t = i|y_t)$ of observation `i` from all hidden states (1 x h)
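
As a quick check of this indexing, the sketch below uses a hypothetical 2-state transition matrix and a made-up previous $\alpha$ vector. It shows that `T.T[i]` is column `i` of `T`, so its dot product with the previous $\alpha$ is the sum over previous states in the forward recursion.

```python
import numpy as np

# Hypothetical values, for illustration only.
T = np.array([[0.7, 0.3],
              [0.2, 0.8]])           # T[k, i] = P(y_t = i | y_{t-1} = k)
alpha_prev = np.array([0.3, 0.2])    # made-up alpha_{t-1}

i = 1
incoming = T.T[i]                    # probabilities of entering state i from every state k
step = incoming.dot(alpha_prev)      # sum_k T[k, i] * alpha_prev[k]
```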

In [1]:
import numpy as np

In [2]:
# transmission[i][j] = P(y_j|y_i); each row sums to 1
transmission = np.array([[0.05, 0.05, 0.7, 0.2],
                         [0.1,  0.05, 0.6, 0.25],
                         [0.1,  0.3,  0.4, 0.2],
                         [0.25, 0.4,  0.3, 0.05]])
# emission[i][j] = P(x_i|y_j); each column sums to 1
emission = np.array([[0.3, 0.4, 0.2, 0.3],
                     [0.4, 0.2, 0.1, 0.05],
                     [0.2, 0.1, 0.2, 0.3],
                     [0.1, 0.3, 0.5, 0.35]])

observations = ['2','3','3','2','3','2','3','2','2','3','1','3','3','1','1',
                '1','2','1','1','1','3','1','2','1','1','1','2','3','3','2',
                '3','2','2']

def str_to_index(name): # Convert 1-based string labels to 0-based indices
    return list(map(lambda x: int(x) - 1, name))
def index_to_str(idx): # Convert 0-based indices back to 1-based string labels
    return list(map(lambda x: str(x+1), idx))

In [3]:
def HMM(T, E):
    """
    Build model
    """
    pass
def train(observations):
    """
    Find T and E
    """
    pass

def forward_initialize(E_obs):
    """
    Initialize alpha_0 = P(x_0, y_0) = P(x_0|y_0) P(y_0).
    E_obs is the emission row E[x_0] (length h).
    Assume no prior information (it could be estimated instead), so
    P(y_0) = 1/h for all states.
    """
    P_y0 = 1/len(E_obs)
    return [E_obs*P_y0]

def forward(new_seq, T, E):
    """
    Args:
        new_seq: (1 x N) observation indices
        T: (h x h) transmission - T[i][j] = P(y_j|y_i) is the probability
            that state i moves to state j
        E: (o x h) emission - E[i][j] = P(x_i|y_j) is the probability of
            emitting the i-th observation from the j-th hidden state
    Return: (1 x N) array holding, for each step t, the hidden state
        that maximizes alpha_t
    """

    obs = new_seq[0]
    alphas = forward_initialize(E[obs]) # initialize alpha_0 = P(x0, y0)

    for t in range(1, len(new_seq)):
        # sum over previous states of P(y_t = j|y_{t-1}) * alpha_{t-1}
        Sum = np.array([T.T[j].dot(alphas[t-1]) for j in range(len(T))])
        obs = new_seq[t]
        alpha = Sum * E[obs]
        alphas.append(alpha)

    # most likely hidden state at each step (argmax of alpha_t)
    best_hidden_seq = np.argmax(alphas, axis = 1)
    return best_hidden_seq

new_seq = ['1', '1', '4']
new_seq = str_to_index(new_seq)

best_state_idx = forward(new_seq, transmission, emission)
print(index_to_str(best_state_idx))

['2', '3', '3']


## Viterbi Algorithm
The Viterbi algorithm finds the most likely hidden state sequence $Y^* = \{y_1, \dots, y_T\}$ given a sequence of observations $X = \{x_1, \dots, x_T\}$:

$$Y^* = \arg\max_Y P(Y|X) = \arg\max_Y P(Y,X)$$

The second equality holds because $P(X)$ does not depend on $Y$. Define $Viterbi_T(y_T)$ as the probability of the best path that ends in state $y_T$, so that $\max_{y_{1:T}} p(y_1, \dots, y_T, x_1, \dots, x_T) = \max_{y_T} Viterbi_T(y_T)$. Then:

$$
\begin{aligned}
Viterbi_T(y_T) &= \max_{y_{1:T-1}} p(y_1, \dots, y_T, x_1, \dots, x_T)\\
&= \max_{y_{1:T-1}} \{p(x_T|y_T)\,p(y_T|y_{T-1})\,p(y_1, \dots, y_{T-1}, x_1, \dots, x_{T-1})\}\\
&= p(x_T|y_T)\max_{y_{T-1}} \{p(y_T|y_{T-1})\max_{y_{1:T-2}} p(y_1, \dots, y_{T-1}, x_1, \dots, x_{T-1})\}\\
&= p(x_T|y_T)\max_{y_{T-1}} \{p(y_T|y_{T-1})\,Viterbi_{T-1}(y_{T-1})\}
\end{aligned}
$$
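
For long sequences the products in this recursion can underflow, so a common variant works with log probabilities. The sketch below is one such variant, not the notebook's implementation: it assumes an explicit initial distribution `pi`, strictly positive probabilities (so every logarithm is finite), and a hypothetical toy model.

```python
import numpy as np

def viterbi_log(obs, pi, T, E):
    """Most likely hidden path, computed with log probabilities."""
    log_T, log_E = np.log(T), np.log(E)
    delta = np.log(pi) + log_E[obs[0]]           # log score of each initial state
    backptr = []
    for o in obs[1:]:
        # scores[i, j]: best log score of a path ending in i, followed by i -> j
        scores = delta[:, None] + log_T
        backptr.append(scores.argmax(axis=0))    # best predecessor of each state j
        delta = scores.max(axis=0) + log_E[o]
    # Backtrack from the best final state.
    path = [int(delta.argmax())]
    for bp in reversed(backptr):
        path.append(int(bp[path[-1]]))
    return path[::-1]

# Hypothetical toy model: 2 hidden states, 3 observation symbols.
pi = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3],
              [0.2, 0.8]])                       # T[i, j] = P(y_t = j | y_{t-1} = i)
E = np.array([[0.5, 0.1],
              [0.4, 0.3],
              [0.1, 0.6]])                       # E[o, j] = P(x_t = o | y_t = j)
obs = [0, 2, 1]

best_path = viterbi_log(obs, pi, T, E)
```

Because the logarithm is monotonic, maximizing the sum of log probabilities selects the same path as maximizing the product.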


In [4]:
def viterbi_algorithm(new_seq, T, E):
    """
    Args:
        new_seq: (1 x N) observation indices
        T: (h x h) transmission - T[i][j] = P(y_j|y_i) is the probability
            that state i moves to state j
        E: (o x h) emission - E[i][j] = P(x_i|y_j) is the probability of
            emitting the i-th observation from the j-th hidden state
    Return:
        list of N hidden state indices forming the most likely
        hidden state sequence
    """

    obs_idx = new_seq[0]
    max_viterbi = forward_initialize(E[obs_idx]) # initialize probability P(x0, y0)
    viterbi = []

    # Compute viterbi ((N-1) x h x h) and max_viterbi (N x h)
    for t in range(1, len(new_seq)):
        obs_idx = new_seq[t]
        # row i: score of reaching state i at step t from each previous state
        viterbi.append(np.array([max_viterbi[t-1]  # (1 x h)
                                *T.T[i]            # (1 x h)
                                *E[obs_idx][i]     # scalar
                                for i in range(len(T))
                                ]))# (h x h)
        max_viterbi.append(np.max(viterbi[t-1], axis = 1)) # (1 x h)

    # Backtrack from the best final state to recover the most likely path
    best_hidden_seq = [np.argmax(max_viterbi[-1])]
    for t in reversed(range(1, len(max_viterbi))):
        argmax_viterbi = best_hidden_seq[-1]
        best_hidden_seq.append(np.argmax(viterbi[t-1][argmax_viterbi]))
    return [ele for ele in reversed(best_hidden_seq)]

new_seq = ['1', '1', '3', '4']
new_seq = str_to_index(new_seq)
best_state_idx = viterbi_algorithm(new_seq, transmission, emission)
print(index_to_str(best_state_idx))


['4', '2', '3', '3']


## Refs

https://github.com/aldengolab/hidden-markov-model