## Hidden Markov Model 

A Hidden Markov Model (HMM) is a statistical model for sequential data. It assumes the data is generated by a 
Markov chain whose states cannot be observed directly (the hidden states).

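In an HMM, there are two types of variables: 
- Observed variables: the values that can be measured at each time step
- Hidden variables: the underlying states, which cannot be measured directly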

An HMM consists of the following components:
- State Space: 
    - This is a set of possible states that the model can be in at any given time. 
    For example, in speech recognition, the states could correspond to different phonemes.

- Observation Space: 
    - This is a set of possible observations that can be made at each state. 
    For example, in speech recognition, the observations could correspond to acoustic features of the speech signal.

- Transition Probabilities: 
    - These are the probabilities of moving from one state to another. 
    These probabilities are usually modeled as a matrix, where the (i,j)-th entry represents the probability of 
    moving from state i to state j.

- Emission Probabilities: 
    - These are the probabilities of seeing a particular observation given the current state. 
    These probabilities are also usually modeled as a matrix, where the (i,j)-th entry represents the probability of 
    observing observation j given that the model is in state i.

- Initial Probabilities: 
    - These are the probabilities of the model starting in each state, before any observation is made.

The basic idea behind HMMs is to use the observed data to infer the hidden states that generated the data. 
Two standard algorithms do this. The forward-backward algorithm computes the probability of each hidden state at each 
time step given all the observations, and the Viterbi algorithm computes the single most likely sequence of hidden states.
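
To make the forward-backward half of that statement concrete, here is a minimal sketch in NumPy. The function name `forward_backward`, the variable names, and the two-state parameters are hypothetical and chosen only for illustration; observations are assumed to be integer-coded column indices into the emission matrix.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Return gamma[t, i] = P(state at time t is i | all observations)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))  # forward messages
    beta = np.ones((T, N))    # backward messages
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta
    # Normalise each time step so the state probabilities sum to 1
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical parameters, for illustration only
pi = np.array([0.5, 0.5])               # initial probabilities
A = np.array([[0.7, 0.3], [0.3, 0.7]])  # A[i, j] = P(state j next | state i now)
B = np.array([[0.9, 0.1], [0.2, 0.8]])  # B[i, k] = P(observation k | state i)
obs = [0, 0, 1]

gamma = forward_backward(pi, A, B, obs)
print(gamma.round(3))
```

Each row of `gamma` is a distribution over states for one time step. This sketch multiplies raw probabilities, so for long sequences a scaled or log-space variant would be needed to avoid underflow.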

HMMs have a wide range of applications, including 
- speech recognition
- handwriting recognition
- bioinformatics

In [1]:
import numpy as np

In [7]:
# Transition probabilities: p_<yesterday>_<today> = P(today's weather | yesterday's weather)
p_sun_sun = 0.8
p_sun_rain = 0.2
p_rain_sun = 0.4
p_rain_rain = 0.6

# Emission probabilities: p_<weather>_<played> = P(golf played: yes/no | today's weather)
p_sun_yes = 0.8
p_sun_no = 0.2
p_rain_yes = 0.4
p_rain_no = 0.6
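
The scalar values above can also be collected into the matrices described in the component list. This is a small sketch; the index order (sun before rain, yes before no) and the names `A`, `B`, `states`, and `observations` are assumptions made here, not part of the original notebook.

```python
import numpy as np

# Index order is an assumption of this sketch
states = ["sun", "rain"]
observations = ["yes", "no"]

# A[i, j] = P(state j today | state i yesterday)
A = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# B[i, k] = P(observation k | state i)
B = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# Every row is a probability distribution, so it must sum to 1
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)

# Look up P(sun today | rain yesterday)
print(A[states.index("rain"), states.index("sun")])  # 0.4
```

Keeping the parameters in arrays makes the row-sum check a one-liner and lets the later recursion be written with matrix operations instead of one variable per probability.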

In [14]:
# Initial state probabilities, rounded to two decimals
p_sunny = np.round(2/3, 2)
p_sunny

0.67

In [15]:
p_rain = np.round(1/3, 2)
p_rain

0.33
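
The notebook does not say where 2/3 and 1/3 come from. One plausible reading, which is an assumption here and not stated in the source, is that they are the stationary distribution of the transition probabilities defined earlier. The sketch below checks this numerically; the names `A` and `pi` are chosen for this example.

```python
import numpy as np

# Transition matrix built from the values above:
# rows are yesterday's state (sun, rain), columns are today's state (sun, rain)
A = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# A stationary distribution pi satisfies pi @ A == pi, so it is a left
# eigenvector of A with eigenvalue 1. np.linalg.eig returns right
# eigenvectors, so decompose A.T instead.
eigvals, eigvecs = np.linalg.eig(A.T)
k = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()  # normalise so the entries sum to 1 (also fixes the sign)

print(pi.round(2))  # [0.67 0.33]
```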

In [16]:
# Observed sequence: whether golf was played on each of six consecutive days (Y = yes, N = no)
played_golf = ['Y', 'Y', 'N', 'N', 'N', 'Y']

In [17]:
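prob = []     # one (sunny_score, rainy_score) tuple per day
weather = []  # decoded weather, one label per day

# Day 1: initial probability times the emission probability of the first observation
if played_golf[0] == 'Y':
    prob.append((p_sunny * p_sun_yes, p_rain * p_rain_yes))
else:
    prob.append((p_sunny * p_sun_no, p_rain * p_rain_no))

# prob[0] is the first entry of the list
print(prob[0])
# prob[-1] is the last entry; with a single entry so far it is the same tuple
print(prob[-1])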

(0.536, 0.132)
(0.536, 0.132)


In [18]:
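# Viterbi recursion: for each day keep the best score of any path ending in
# each state, plus a back-pointer to the previous state on that best path.
back = []
for i in range(1, len(played_golf)):
    yest_sunny, yest_rainy = prob[-1]
    if played_golf[i] == 'Y':
        e_sun, e_rain = p_sun_yes, p_rain_yes
    else:
        e_sun, e_rain = p_sun_no, p_rain_no
    # Candidate scores for reaching each state from (sunny, rainy) yesterday
    to_sunny = (yest_sunny * p_sun_sun * e_sun, yest_rainy * p_rain_sun * e_sun)
    to_rainy = (yest_sunny * p_sun_rain * e_rain, yest_rainy * p_rain_rain * e_rain)
    prob.append((max(to_sunny), max(to_rainy)))
    back.append(('S' if to_sunny[0] >= to_sunny[1] else 'R',
                 'S' if to_rainy[0] >= to_rainy[1] else 'R'))

# Backtrack from the best final state. Taking the larger score on each day
# independently does not, in general, give the most likely sequence.
state = 'S' if prob[-1][0] > prob[-1][1] else 'R'
weather = [state]
for b in reversed(back):
    state = b[0] if state == 'S' else b[1]
    weather.append(state)
weather.reverse()
weather

['S', 'S', 'R', 'R', 'R', 'S']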
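
The scores above shrink quickly (the last pair is already below 0.002 after six days), so for longer sequences a common variant works with log-probabilities. The following is a minimal matrix-based sketch of Viterbi decoding with explicit back-pointers. The function name `viterbi`, the arrays `delta` and `back`, the state order (S, R), and the coding Y = 0, N = 1 are all assumptions made for this example.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely state index sequence for integer-coded observations."""
    T, N = len(obs), len(pi)
    log_A, log_B = np.log(A), np.log(B)
    delta = np.zeros((T, N))            # best log-score of a path ending in each state
    back = np.zeros((T, N), dtype=int)  # best predecessor of each state
    delta[0] = np.log(pi) + log_B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in i at t-1, followed by the move i -> j
        scores = delta[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Follow the back-pointers from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

states = ["S", "R"]
pi = np.array([0.67, 0.33])
A = np.array([[0.8, 0.2], [0.4, 0.6]])
B = np.array([[0.8, 0.2], [0.4, 0.6]])  # columns: Y, N
obs = [0 if g == "Y" else 1 for g in ["Y", "Y", "N", "N", "N", "Y"]]

decoded = [states[i] for i in viterbi(pi, A, B, obs)]
print(decoded)  # ['S', 'S', 'R', 'R', 'R', 'S']
```

Following the back-pointers matters: choosing the state with the larger score on each day independently is not guaranteed to produce the most likely sequence. For this data the back-pointer path is S, S, R, R, R, S.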