## The Forward Algorithm

### Mathematical formulation

The Forward Algorithm relies on the **Markov property**, which states that the next state depends only on the current state, not on the entire past, and on the assumption that each observation depends only on the hidden state that emitted it.  
Together, these assumptions allow us to compute the probability of an observation sequence efficiently through recursion.

Let:

- $N$ be the number of hidden states  
- $A_{ij} = P(s_t = j \mid s_{t-1} = i)$ : the state transition probabilities  
- $B_j(o_t) = P(o_t \mid s_t = j)$ : the emission probabilities  
- $\pi_i = P(s_1 = i)$ : the initial state distribution  

We define the **forward variable** $\alpha_t(i)$ as:

$$
\alpha_t(i) = P(o_1, o_2, \dots, o_t, s_t = i)
$$

This represents the probability of observing the first $t$ observations and being in state $i$ at time $t$.

The algorithm proceeds as follows:

1. **Initialization** (for $t = 1$)  

   $$
   \alpha_1(i) = \pi_i \, B_i(o_1)
   $$

2. **Recursion** (for $t = 2, 3, \dots, T$)  

   $$
   \alpha_t(j) = \left[ \sum_{i=1}^{N} \alpha_{t-1}(i) \, A_{ij} \right] B_j(o_t)
   $$

3. **Termination**  

   $$
   P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)
   $$

where $O = (o_1, o_2, \dots, o_T)$ is the sequence of observations,  
and $\lambda = (A, B, \pi)$ denotes the model parameters.
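
The recursion step can be justified in a few lines. The derivation below is a sketch that uses only the symbols defined above; it marginalizes over the previous state, then applies the Markov property and the assumption that $o_t$ depends only on $s_t$:

$$
\begin{aligned}
\alpha_t(j) &= P(o_1, \dots, o_t, s_t = j) \\
&= \sum_{i=1}^{N} P(o_1, \dots, o_{t-1}, s_{t-1} = i, s_t = j, o_t) \\
&= \sum_{i=1}^{N} P(o_1, \dots, o_{t-1}, s_{t-1} = i) \, P(s_t = j \mid s_{t-1} = i) \, P(o_t \mid s_t = j) \\
&= \left[ \sum_{i=1}^{N} \alpha_{t-1}(i) \, A_{ij} \right] B_j(o_t)
\end{aligned}
$$

Each time step therefore costs on the order of $N^2$ multiplications, so the full pass costs on the order of $N^2 T$.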


### Intuition

The forward variable $\alpha_t(i)$ captures the **total probability of all possible hidden-state paths** that end in state $i$ after observing the first $t$ symbols.  
At each step, the value for a target state $j$ is obtained by summing the forward probabilities of all states $i$ at time $t-1$, each weighted by the transition probability $A_{ij}$, and then multiplying the result by the emission probability $B_j(o_t)$.  

Conceptually, the Forward Algorithm behaves like a **flow of probability mass** that moves forward through time: each state at time $t$ receives contributions from all states at time $t-1$.  
This recursive accumulation makes it possible to compute the likelihood of the entire observation sequence without enumerating every possible state path.
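
To make the "without enumerating every possible state path" claim concrete, the sketch below does the enumeration explicitly for a small model. It assumes the two-state weather model whose values appear in the notebook cells of this section; the names `brute_force_likelihood`, `pi`, `A`, `B`, and `obs` are illustrative and not part of the original code. With $N$ states and $T$ observations there are $N^T$ paths to visit, which is why this approach is only workable for tiny examples.

```python
from itertools import product

import numpy as np

# Assumed parameters: the two-state weather model used in this notebook.
pi = np.array([0.7, 0.3])                          # initial distribution [Sunny, Rainy]
A = np.array([[0.8, 0.2], [0.4, 0.6]])             # A[i, j] = P(s_t = j | s_{t-1} = i)
B = np.array([[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]])   # B[j, k] = P(o_t = k | s_t = j)
obs = [0, 1, 2]                                    # Walk, Shop, Clean as column indices


def brute_force_likelihood(obs, pi, A, B):
    """Sum the joint probability P(O, path) over every hidden-state path."""
    n_states = len(pi)
    total = 0.0
    for path in product(range(n_states), repeat=len(obs)):
        # Start in path[0] and emit the first observation.
        p = pi[path[0]] * B[path[0], obs[0]]
        # Follow the remaining transitions and emissions along this path.
        for t in range(1, len(obs)):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total


likelihood = brute_force_likelihood(obs, pi, A, B)
print(likelihood)  # approximately 0.032664
```

Here only $2^3 = 8$ paths are summed, and the result should agree with the likelihood produced by the forward recursion on the same model.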


In [1]:
import numpy as np
#Define HMM parameters 

states = ["Sunny", "Rainy"]
observations = ["Walk", "Shop", "Clean"]

In [2]:
#Transition probabilities
transition_probs = np.array([
    [0.8, 0.2], #Sunny -> [Sunny, Rainy]
    [0.4, 0.6]  #Rainy -> [Sunny, Rainy]
])

In [3]:
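#Emission probabilities (rows: states, columns: observations)
emission_probs = np.array([
    [0.6, 0.3, 0.1], #Sunny -> [Walk, Shop, Clean]
    [0.1, 0.4, 0.5]  #Rainy -> [Walk, Shop, Clean]
])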

In [4]:
#Initial state distribution
initial_probs = np.array([0.7, 0.3]) # [Sunny, Rainy]

#Observed sequence
observed_sequence = ["Walk", "Shop", "Clean"]

# Map observations to indices 
obs_to_idx = {obs: idx for idx, obs in enumerate(observations)}

In [7]:
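#Forward algorithm implementation

def forward_algorithm(observed_seq, states, initial_probs, transition_probs, emission_probs):
    """Return the table of forward probabilities and P(O | lambda).

    Relies on the module-level obs_to_idx mapping defined in an earlier cell.
    """
    num_states = len(states)
    num_obs = len(observed_seq)

    #forward_probs[t, s] holds alpha_{t+1}(s), because rows are 0-indexed
    forward_probs = np.zeros((num_obs, num_states))

    #Initialization step: alpha_1(i) = pi_i * B_i(o_1)
    for s in range(num_states):
        forward_probs[0, s] = initial_probs[s] * emission_probs[s, obs_to_idx[observed_seq[0]]]

    #Recursion step: sum over previous states, then multiply by the emission probability
    for t in range(1, num_obs):
        for s in range(num_states):
            forward_probs[t, s] = sum(
                forward_probs[t-1, prev_s] * transition_probs[prev_s, s] for prev_s in range(num_states)
            ) * emission_probs[s, obs_to_idx[observed_seq[t]]]

    #Termination step: sum the forward probabilities at the final time step
    total_probability = sum(forward_probs[num_obs-1, s] for s in range(num_states))

    return forward_probs, total_probability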
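
The inner sum over previous states is a vector-matrix product, so the same recursion can be written more compactly. The following is a sketch of one possible variant, not a replacement for the function above: `forward_vectorized` is a hypothetical name, and it assumes the observations have already been converted to column indices.

```python
import numpy as np


def forward_vectorized(obs_idx, initial_probs, transition_probs, emission_probs):
    """Forward pass using one vector-matrix product per time step."""
    num_obs = len(obs_idx)
    num_states = len(initial_probs)
    alpha = np.zeros((num_obs, num_states))

    # Initialization: alpha_1(i) = pi_i * B_i(o_1)
    alpha[0] = initial_probs * emission_probs[:, obs_idx[0]]

    # Recursion: (alpha_{t-1} @ A)[j] = sum_i alpha_{t-1}(i) * A[i, j]
    for t in range(1, num_obs):
        alpha[t] = (alpha[t - 1] @ transition_probs) * emission_probs[:, obs_idx[t]]

    # Termination: sum over the states at the final time step
    return alpha, alpha[-1].sum()


# Same weather-model values as in the cells above.
initial_probs = np.array([0.7, 0.3])
transition_probs = np.array([[0.8, 0.2], [0.4, 0.6]])
emission_probs = np.array([[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]])

alpha, total = forward_vectorized([0, 1, 2], initial_probs, transition_probs, emission_probs)
print(alpha)
print(total)
```

The loop over time steps remains, since each row depends on the previous one; only the loops over states are replaced by NumPy operations.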

In [8]:
#Run the algo

forward_probs, total_prob = forward_algorithm(observed_sequence, states, initial_probs, transition_probs, emission_probs)

#Results
print(forward_probs)

print(f"total probability of the observed sequence : {total_prob}")

[[0.42     0.03    ]
 [0.1044   0.0408  ]
 [0.009984 0.02268 ]]
total probability of the observed sequence : 0.032664000000000006
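
The printed table can be checked by hand against the initialization, recursion, and termination formulas. Row $t$ of the output corresponds to $\alpha_t$, with columns ordered as [Sunny, Rainy]:

$$
\begin{aligned}
\alpha_1(\text{Sunny}) &= 0.7 \times 0.6 = 0.42 \\
\alpha_1(\text{Rainy}) &= 0.3 \times 0.1 = 0.03 \\
\alpha_2(\text{Sunny}) &= (0.42 \times 0.8 + 0.03 \times 0.4) \times 0.3 = 0.1044 \\
\alpha_2(\text{Rainy}) &= (0.42 \times 0.2 + 0.03 \times 0.6) \times 0.4 = 0.0408 \\
\alpha_3(\text{Sunny}) &= (0.1044 \times 0.8 + 0.0408 \times 0.4) \times 0.1 = 0.009984 \\
\alpha_3(\text{Rainy}) &= (0.1044 \times 0.2 + 0.0408 \times 0.6) \times 0.5 = 0.02268 \\
P(O \mid \lambda) &= 0.009984 + 0.02268 = 0.032664
\end{aligned}
$$

The trailing digits in the printed total (`0.032664000000000006`) come from floating-point rounding, not from the algorithm.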
