## Hidden Markov Models (HMMs)

A **Hidden Markov Model (HMM)** is a statistical model for systems that evolve over time and whose internal state cannot be observed directly.  
It assumes a sequence of **hidden states** that are never seen themselves but that generate **observable outputs** according to state-dependent probabilities.

An HMM is defined by five key components:

1. **Hidden states**  
   The possible internal conditions of the system, which are not directly visible.  
   Example: grammatical categories like *Noun*, *Verb*, *Adjective*.

2. **Observations**  
   The visible outputs generated by the hidden states.  
   Example: the actual words in a sentence.

3. **Transition probabilities**  

   $$
   A_{ij} = P(s_t = j \mid s_{t-1} = i)
   $$

   The probability of moving from hidden state $i$ to hidden state $j$ between consecutive time steps; each row of $A$ sums to 1.

4. **Emission probabilities**  

   $$
   B_j(k) = P(o_t = k \mid s_t = j)
   $$

   The probability of producing observation $k$ given that the current hidden state is $j$; each row of $B$ sums to 1.

5. **Initial state distribution**  

   $$
   \pi_i = P(s_1 = i)
   $$

   The probability that the system starts in hidden state $i$; the entries of $\pi$ sum to 1.

The joint probability of a sequence of hidden states $s_1, s_2, \dots, s_T$ and observations $o_1, o_2, \dots, o_T$ is given by:

$$
P(s_{1:T}, o_{1:T}) = P(s_1) \prod_{t=2}^{T} P(s_t \mid s_{t-1}) \prod_{t=1}^{T} P(o_t \mid s_t)
$$
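
As a worked illustration, for a sequence of length $T = 3$ the product expands into one initial term, two transition terms, and three emission terms:

$$
P(s_{1:3}, o_{1:3}) = P(s_1)\, P(s_2 \mid s_1)\, P(s_3 \mid s_2)\, P(o_1 \mid s_1)\, P(o_2 \mid s_2)\, P(o_3 \mid s_3)
$$

Each transition factor conditions only on the immediately preceding state, and each emission factor only on the current state, which reflects the model's first-order Markov and output-independence assumptions.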

Hidden Markov Models are widely used in **natural language processing**, **speech recognition**, and **sequence analysis**, where the true underlying process is not directly observable.


In [1]:
import numpy as np
# Define the HMM's hidden states and observation symbols

states = ["Sunny", "Rainy"]
observations = ["Walk", "Shop", "Clean"]

In [20]:
# Transition probabilities: row = current state, column = next state
transitions_probs = np.array([
    [0.8, 0.2],  # Sunny -> [Sunny, Rainy]
    [0.4, 0.6],  # Rainy -> [Sunny, Rainy]
])

transitions_probs.shape

(2, 2)

In [19]:
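# Emission probabilities: row = hidden state, column = observation
emission_probs = np.array([
    [0.6, 0.3, 0.1],  # Sunny -> [Walk, Shop, Clean]
    [0.1, 0.4, 0.5],  # Rainy -> [Walk, Shop, Clean]
])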

emission_probs.shape

(2, 3)

In [None]:
# Initial state distribution
initial_probs = np.array([0.7, 0.3])  # [Sunny, Rainy]
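
Because every row of the transition and emission matrices, and the initial vector, is a probability distribution, a quick sanity check can catch typos in the numbers before they are used. The following is a minimal sketch, not part of the original notebook: `is_valid_hmm` is a hypothetical helper name, and the arrays are restated so the snippet runs on its own.

```python
import numpy as np

# Parameters restated from the notebook cells so the snippet is self-contained.
transitions_probs = np.array([[0.8, 0.2], [0.4, 0.6]])
emission_probs = np.array([[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]])
initial_probs = np.array([0.7, 0.3])


def is_valid_hmm(transitions, emissions, initial):
    """Hypothetical helper: check shapes and that every distribution sums to 1."""
    n_states = initial.shape[0]
    # The transition matrix must be square; emissions need one row per state.
    shapes_ok = (
        transitions.shape == (n_states, n_states)
        and emissions.shape[0] == n_states
    )
    # Probabilities cannot be negative.
    non_negative = all((p >= 0).all() for p in (transitions, emissions, initial))
    # Each row (and the initial vector) must sum to 1 up to rounding error.
    normalized = (
        np.allclose(transitions.sum(axis=1), 1.0)
        and np.allclose(emissions.sum(axis=1), 1.0)
        and np.isclose(initial.sum(), 1.0)
    )
    return bool(shapes_ok and non_negative and normalized)


print(is_valid_hmm(transitions_probs, emission_probs, initial_probs))
```

With the values used in this notebook, the check should print `True`.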

In [13]:
#Observed sequence
observed_sequence = ["Walk", "Shop", "Clean"]

In [14]:
# Map each observation symbol to its column index in emission_probs
obs_to_idx = {obs: idx for idx, obs in enumerate(observations)}

In [15]:
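# Function to collect the emission probabilities for each observation in a sequence

def calculate_emission_likelihood(observed_seq, emission_probs):
    """Return a (T, n_states) array; row t holds P(o_t | s) for every hidden state s.

    Relies on the module-level obs_to_idx mapping defined above.
    """
    likelihood = []
    for obs in observed_seq:
        # Column of emission_probs for this observation: one value per hidden state
        likelihood.append(emission_probs[:, obs_to_idx[obs]])
    return np.array(likelihood)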


#Calculate emission likelihood
emission_likelihood = calculate_emission_likelihood(observed_sequence, emission_probs)

#Display results
print("Emission likelihood for the observed sequence: ")
print(emission_likelihood)

Emission likelihood for the observed sequence: 
[[0.6 0.1]
 [0.3 0.4]
 [0.1 0.5]]
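
The joint probability formula given earlier can be evaluated for one candidate hidden path by combining these emission terms with the initial and transition probabilities. The sketch below is an illustration rather than part of the original notebook: `joint_probability` is a hypothetical helper, the hidden path `Sunny, Sunny, Rainy` is an assumed example, and the parameters are restated so the snippet runs on its own.

```python
import numpy as np

# Parameters restated from the notebook cells so the snippet is self-contained.
states = ["Sunny", "Rainy"]
observations = ["Walk", "Shop", "Clean"]
transitions_probs = np.array([[0.8, 0.2], [0.4, 0.6]])
emission_probs = np.array([[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]])
initial_probs = np.array([0.7, 0.3])


def joint_probability(state_path, observed_seq):
    """Hypothetical helper: P(s_{1:T}, o_{1:T}) for one given hidden path."""
    s = [states.index(name) for name in state_path]
    o = [observations.index(name) for name in observed_seq]
    # Initial-state term times the first emission term.
    prob = initial_probs[s[0]] * emission_probs[s[0], o[0]]
    # One transition term and one emission term per remaining time step.
    for t in range(1, len(s)):
        prob *= transitions_probs[s[t - 1], s[t]] * emission_probs[s[t], o[t]]
    return float(prob)


# Assumed example path; the notebook itself does not specify a hidden path.
path = ["Sunny", "Sunny", "Rainy"]
p = joint_probability(path, ["Walk", "Shop", "Clean"])
print(f"P(path, observations) = {p:.5f}")
```

For this path the product is 0.7 × 0.6 × 0.8 × 0.3 × 0.2 × 0.5 = 0.01008. Other paths can be scored the same way by changing `path`.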
