In [1]:
import numpy as np
import itertools

# Problem 1
(https://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf)

Suppose that you have trained an HMM and you obtain the model $λ = (A, B, π)$ where
$A =
\begin{bmatrix}
0.7 & 0.3 \\
0.4 & 0.6
\end{bmatrix}$
, $B =
\begin{bmatrix}
0.1 & 0.4 & 0.5 \\
0.7 & 0.2 & 0.1
\end{bmatrix}$
, $π =
\begin{bmatrix}
0.0 & 1.0
\end{bmatrix}$.

Furthermore, suppose the hidden states correspond to $H$ and $C$, respectively, and the observations are $S$, $M$, and $L$, respectively. In this problem, we consider the observation sequence $O = (O_0, O_1, O_2) = (M, S, L)$.


a) Directly compute $P(O | λ)$. Since
$P(O | λ) = \sum_X P(O, X | λ)$, we use the probabilities in $λ = (A, B, π)$ to compute each of the following for the given observation sequence:

$P(O, X = HHH)$

$P(O, X = HHC)$

$P(O, X = HCH)$

$P(O, X = HCC)$

$P(O, X = CHH)$

$P(O, X = CHC)$

$P(O, X = CCH)$

$P(O, X = CCC)$

The desired probability is the sum of these 8 probabilities.
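
As a worked instance of one term (a sketch using the values of $A$, $B$ and $π$ above, with $S$, $M$, $L$ taken as columns 0, 1, 2 of $B$), the sequence $X = CCH$ gives

$$P(O, X = CCH) = π_C \, b_C(M) \, a_{CC} \, b_C(S) \, a_{CH} \, b_H(L) = 1.0 \cdot 0.2 \cdot 0.6 \cdot 0.7 \cdot 0.4 \cdot 0.5 = 0.0168$$

Every sequence that starts in $H$ has probability 0, because $π_H = 0$.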

In [2]:
A1 = np.array([[.7,.3],
              [.4,.6]])
B1 = np.array([[.1,.4,.5],
              [.7,.2,.1]])
pi1 = np.array([0., 1.])
Obs = np.array([1,0,2])

prob = 0.
for s1 in [0,1]:
    for s2 in [0,1]:
        for s3 in [0,1]:
            # pi1[0] = 0, so paths starting in H contribute nothing;
            # skipping them also avoids taking log(0)
            if s1:
                logprob = np.log(pi1[s1]) + np.log(B1[s1,Obs[0]]) \
                    + np.log(A1[s1,s2]) + np.log(B1[s2,Obs[1]]) \
                    + np.log(A1[s2,s3]) + np.log(B1[s3,Obs[2]])
                prob += np.exp(logprob)
print(prob)

0.02488


b) Compute $P(O|λ)$ using the α pass. That is, compute

$α_0(0)$

$α_0(1)$

$α_1(0)$

$α_1(1)$

$α_2(0)$

$α_2(1)$

where the recurrence for $α_t(i)$ is

$α_0(i) = π_i b_i(O_0)$,    for $i = 0, 1, \ldots, N - 1$

and

$α_t(i) = (\sum_{j=0}^{N-1} α_{t-1}(j)a_{ji})b_i(O_t)$   for $t = 1, 2, \ldots, T - 1$ and $i = 0, 1, \ldots, N - 1$.

The desired probability is given by
$P(O | λ) = \sum_{i=0}^{N-1} \alpha_{T-1}(i)$.

In [3]:
N = 2
M = 3
T = 3

alpha = np.zeros((T,N))
for i in range(N):
    alpha[0,i] = pi1[i]*B1[i,Obs[0]]
    
for t in range(1,T):
    for i in range(N):
        alpha[t,i] = 0
        for j in range(N):
            alpha[t,i] = alpha[t,i] + alpha[t-1,j]*A1[j,i]
        alpha[t,i] = alpha[t,i]*B1[i,Obs[t]]
prob = alpha[-1].sum()
print(prob)

0.02488


c) Explain the results you obtained for parts a) and b). Be sure to explain why you obtained the results you did.

Parts a) and b) give the same value, 0.02488, because they compute the same sum $\sum_X P(O, X | λ)$. Part a) evaluates it naively, one state sequence at a time. Part b) evaluates it recursively: $α_t(i)$ is the probability of seeing $O_0, \ldots, O_t$ and being in state $i$ at time $t$, so partial products shared by many sequences are computed once and reused instead of being recalculated.

d) In terms of $N$ and $T$, and counting only multiplications, what is the work factor for the method in part a)? The method in part b)?

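Part a) sums over $N^T$ state sequences, and each term is a product of $2T$ factors, i.e. $2T - 1$ multiplications, for $(2T - 1)N^T \approx 2TN^T$ in total. My code skips the half of the sequences that start in $H$, since $π_0 = 0$. <br>
Part b) takes $N$ multiplications for $α_0$ and $N^2 + N$ for each later step, for $N + (T-1)(N^2 + N) \approx N^2T$ in total.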
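
To get a feel for how quickly the two counts diverge, here is a small sketch. The helper names `direct_multiplications` and `forward_multiplications` are made up for illustration, and the direct count assumes every one of the $N^T$ sequences is evaluated as a product of $2T$ factors (so $2T - 1$ multiplications each).

```python
def direct_multiplications(N, T):
    """Multiplications for the direct sum over all N**T state sequences."""
    # Assumption: each sequence is a product of 2T factors, i.e. 2T - 1 multiplications.
    return (2 * T - 1) * N ** T


def forward_multiplications(N, T):
    """Multiplications for the alpha pass."""
    # alpha_0 costs N; each later step costs N*N for the sums plus N for b_i(O_t).
    return N + (T - 1) * (N * N + N)


for N, T in [(2, 3), (2, 10), (5, 20)]:
    print(N, T, direct_multiplications(N, T), forward_multiplications(N, T))
```

For the model in Problem 1 ($N = 2$, $T = 3$) this gives 40 versus 14; the gap grows exponentially in $T$.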

# Problem 2
b) Determine the “best” hidden state sequence $(X_0, X_1, X_2)$ in the HMM sense.

In [4]:
beta = np.zeros((T,N))
beta[-1] = np.ones(N)
for t in range(T-2,-1,-1):
    for i in range(N):
        for j in range(N):
            beta[t,i] += A1[i,j]*B1[j,Obs[t+1]]*beta[t+1,j]
gamma = np.zeros((T,N))
for t in range(T):
    for i in range(N):
        gamma[t,i] = alpha[t,i]*beta[t,i]/alpha[-1].sum()
opt_state = np.argmax(gamma, axis=1)
print(opt_state)

[1 1 0]
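
As a sanity check on the result above, the posteriors $γ_t(i) = P(X_t = i | O, λ)$ can also be obtained by brute force: add the probability of every state sequence to the states it visits, then normalize by $P(O | λ)$. This is only a sketch for tiny models like this one; it restates the Problem 1 parameters so that it runs on its own, and the variable names are chosen for illustration.

```python
import itertools
import numpy as np

# Problem 1 model: rows of A and B are indexed by the current state (H = 0, C = 1)
A = np.array([[.7, .3],
              [.4, .6]])
B = np.array([[.1, .4, .5],
              [.7, .2, .1]])
pi = np.array([0., 1.])
obs = [1, 0, 2]  # (M, S, L)

T, N = len(obs), A.shape[0]
gamma = np.zeros((T, N))
for states in itertools.product(range(N), repeat=T):
    # joint probability P(O, X = states)
    p = pi[states[0]] * B[states[0], obs[0]]
    for t in range(1, T):
        p *= A[states[t-1], states[t]] * B[states[t], obs[t]]
    # credit this path's probability to the state it occupies at each time
    for t, s in enumerate(states):
        gamma[t, s] += p

gamma /= gamma[0].sum()  # any row sums to P(O | lambda) before normalizing
best = gamma.argmax(axis=1)
print(best)
```

This reproduces `[1 1 0]`, i.e. the sequence $(C, C, H)$.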


# Problem 3
Summing the numbers in the “probability” column of Table 1, we find
$P(O = (0, 1, 0, 2)) = 0.009629$.

a) By a similar direct calculation, compute $P(O = (O_0, O_1, O_2, O_3))$, where each $O_i \in \{0, 1, 2\}$, and verify that $\sum P(O) = 1$. You will use the probabilities for $A, B$ and $π$ given in equations (3), (4) and (5) in Section 1, respectively.

In [5]:
A2 = np.array([[.7,.3],
               [.4,.6]])
B2 = np.array([[.1,.4,.5],
               [.7,.2,.1]])
pi2 = np.array([.6,.4])

probabilities = []
for Obs in itertools.product([0,1,2],repeat=4):
    prob = 0.
    for s1 in [0,1]:
        for s2 in [0,1]:
            for s3 in [0,1]:
                for s4 in [0,1]:
                    logprob = 0
                    logprob += np.log(pi2[s1]) + np.log(B2[s1,Obs[0]]) \
                        + np.log(A2[s1,s2]) + np.log(B2[s2,Obs[1]]) \
                        + np.log(A2[s2,s3]) + np.log(B2[s3,Obs[2]]) \
                        + np.log(A2[s3,s4]) + np.log(B2[s4,Obs[3]])
                    prob += np.exp(logprob)
    probabilities.append(prob)
print(np.sum(probabilities))

1.0
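
One way to see why the total has to be 1 (a sketch, assuming $π$ and every row of $A$ and $B$ sum to 1, as they do here): write each $P(O)$ as a sum over state sequences $X = (x_0, x_1, x_2, x_3)$ and sum over the observations first.

$$\sum_O P(O) = \sum_X π_{x_0} \prod_{t=1}^{3} a_{x_{t-1} x_t} \prod_{t=0}^{3} \Big( \sum_{O_t} b_{x_t}(O_t) \Big) = \sum_X π_{x_0} \prod_{t=1}^{3} a_{x_{t-1} x_t} = 1$$

Each inner sum over $O_t$ is a row sum of $B$, and the remaining sum over $X$ is the total probability of all state sequences.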


In [6]:
# Part B, verify using forward algorithm
N = 2
M = 3
T = 4

probabilities = []
for Obs in itertools.product([0,1,2], repeat=4):
    alpha = np.zeros((T,N))
    for i in range(N):
        alpha[0,i] = pi2[i]*B2[i,Obs[0]]

    for t in range(1,T):
        for i in range(N):
            alpha[t,i] = 0
            for j in range(N):
                alpha[t,i] = alpha[t,i] + alpha[t-1,j]*A2[j,i]
            alpha[t,i] = alpha[t,i]*B2[i,Obs[t]]
    probabilities.append(alpha[-1].sum())
print(np.sum(probabilities))

1.0


# Problem 4
(https://learningsuite.byu.edu/plugins/Upload/fileDownload.php?fileId=6834f119-LTeZ-19w5-WOrg-g780632e31d3)

To start your implementation of the HMM, define a class called “hmm”. Then add the initialization method, which should set the attributes $A$, $B$, and $\pi$ to `None` by default. You will be adding methods throughout the remainder of the lab.

In [11]:
class hmm():
    def __init__(self, A=None, B=None, pi=None):
        self.A = A
        self.B = B
        self.pi = pi
        
    def _forward(self, obs):
        """
        Compute the scaled forward probability matrix and scaling factors.

        Assumes A is column-stochastic (A[i,j] = P(next state i | state j))
        and B has shape (M,N), so B[k] holds P(observation k | state) per state.

        Parameters
        ----------
        obs : ndarray of shape (T,)
            The observation sequence

        Returns
        -------
        alpha : ndarray of shape (T,N)
            The scaled forward probability matrix
        c : ndarray of shape (T,)
            The scaling factors c = [c_1,c_2,...,c_T]
        """
        A = self.A
        B = self.B
        pi = self.pi
        T = len(obs)
        N = self.A.shape[0]
        alpha = np.zeros((T,N))
        c = np.zeros(T)
        c[0] = 1./np.dot(pi,B[obs[0]])
        alpha[0] = c[0]*(pi*B[obs[0]])
        for t in range(1,T):
            c[t] = 1./np.dot(A.dot(alpha[t-1]),B[obs[t]])
            alpha[t] = c[t]*(A.dot(alpha[t-1])*B[obs[t]])
        return alpha, c
            
        
    def _backward(self, obs, c):
        """
        Compute the scaled backward probability matrix.

        Parameters
        ----------
        obs : ndarray of shape (T,)
            The observation sequence
        c : ndarray of shape (T,)
            The scaling factors from the forward pass

        Returns
        -------
        beta : ndarray of shape (T,N)
            The scaled backward probability matrix
        """
        A = self.A
        B = self.B
        T = len(obs)
        N = A.shape[0]
        beta = np.zeros((T,N))
        beta[-1] = c[-1]
        for t in range(T-2,-1,-1):
            beta[t] = c[t]*(A.T).dot(B[obs[t+1]]*beta[t+1])
        return beta

In [12]:
# toy HMM example to be used to check answers
A = np.array([[.7, .4],
              [.3, .6]])
B = np.array([[.1,.7],
              [.4, .2],
              [.5, .1]])
pi = np.array([.6, .4])
obs = np.array([0, 1, 0, 2])

# Problem 5
Implement the forward pass by adding the above method to your class. To verify that your code works, you should get the following output using the toy HMM:

In [13]:
h = hmm()
h.A = A
h.B = B
h.pi = pi
alpha, c = h._forward(obs)
print(-1*(np.log(c)).sum()) # the log prob of observation
# Expected output should be -4.6429135909

-4.6429135909
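
The expected value can be cross-checked against Problem 3: since each $c_t$ is the reciprocal of a normalizer, $-\sum_t \log c_t = \log P(O)$, and $\log(0.009629) \approx -4.643$. The sketch below runs an unscaled forward pass on the toy HMM, restating its parameters so that it is self-contained. It is fine for $T = 4$ but would underflow on long sequences, which is the reason for scaling in the first place.

```python
import numpy as np

# Toy HMM in the lab's convention: columns of A sum to 1, B has shape (M, N)
A = np.array([[.7, .4],
              [.3, .6]])
B = np.array([[.1, .7],
              [.4, .2],
              [.5, .1]])
pi = np.array([.6, .4])
obs = [0, 1, 0, 2]

# Unscaled alpha pass: alpha_t = (A alpha_{t-1}) * B[O_t]
alpha_raw = pi * B[obs[0]]
for o in obs[1:]:
    alpha_raw = A.dot(alpha_raw) * B[o]

log_prob = np.log(alpha_raw.sum())
print(log_prob)
```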


# Problem 6
Implement the backward pass by adding the above method to your class. Using the same toy example as before, your code should produce the following output:

In [14]:
beta = h._backward(obs, c)
print(beta)
# Expected output:
# [[ 3.1361635 2.89939354]
# [ 2.86699344 4.39229044]
# [ 3.898812 2.66760821]
# [ 3.56816483 3.56816483]]

[[ 3.1361635   2.89939354]
 [ 2.86699344  4.39229044]
 [ 3.898812    2.66760821]
 [ 3.56816483  3.56816483]]
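
One further check on the two passes together: with this scaling, the state posteriors at time $t$ should be `alpha[t] * beta[t] / c[t]`, and each row should sum to 1. The sketch below re-implements both passes as standalone functions so that it runs on its own; the names `scaled_forward` and `scaled_backward` are chosen for illustration and mirror the `_forward` and `_backward` methods of the class.

```python
import numpy as np

def scaled_forward(A, B, pi, obs):
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    c = np.zeros(T)
    a = pi * B[obs[0]]
    c[0] = 1.0 / a.sum()
    alpha[0] = c[0] * a
    for t in range(1, T):
        a = A.dot(alpha[t-1]) * B[obs[t]]
        c[t] = 1.0 / a.sum()          # normalizer so that alpha[t] sums to 1
        alpha[t] = c[t] * a
    return alpha, c

def scaled_backward(A, B, obs, c):
    T, N = len(obs), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = c[-1]
    for t in range(T-2, -1, -1):
        beta[t] = c[t] * A.T.dot(B[obs[t+1]] * beta[t+1])
    return beta

# Toy HMM from the lab (columns of A sum to 1, B has shape (M, N))
A = np.array([[.7, .4], [.3, .6]])
B = np.array([[.1, .7], [.4, .2], [.5, .1]])
pi = np.array([.6, .4])
obs = np.array([0, 1, 0, 2])

alpha, c = scaled_forward(A, B, pi, obs)
beta = scaled_backward(A, B, obs, c)
gamma = alpha * beta / c[:, None]     # posterior P(X_t = i | O) under this scaling
print(gamma.sum(axis=1))
```

The division by `c[t]` is needed because both `alpha[t]` and `beta[t]` carry the factor $c_t$.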
