# Lab 9: Viterbi Algorithm for Speech Recognition

In [None]:
import numpy as np

## Introduction
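The **Viterbi algorithm** is a dynamic programming algorithm that finds the most likely sequence of hidden states (the Viterbi path) given a sequence of observed events and a hidden Markov model (HMM). In this lab the hidden states are the four phonemes /h/, /e/, /l/, /o/, and the observations are the feature vectors associated with them.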


## 1. Define States and Observations
We begin by defining the state space and the observation space for our model.
*   **States**: The set of possible hidden states (S1, S2, S3, S4), standing for the phonemes /h/, /e/, /l/, /o/.
*   **Observations**: The set of observable symbols (O1, O2, O3, O4), each a feature vector. The specific sequence to decode is defined in Section 5.

The goal of the Viterbi algorithm is to find the most likely sequence of hidden states that produced that sequence of observations.


In [None]:
states = ['S1','S2','S3','S4']              # Hidden states: /h/, /e/, /l/, /o/
observations = ['O1','O2','O3','O4']        # Feature vectors for /h/, /e/, /l/, /o/

## 2. Transition Probability Matrix (A)
The Transition Probability Matrix ($A$) defines the dynamics of the system.
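*   Each element $A[i][j]$ is the probability of transitioning from state $i$ to state $j$.
*   Each row must sum to 1, because it is a probability distribution over the next state.

In this model the first column is all zeros, so S1 is never re-entered, and the remaining entries mostly move forward through the states (the only backward transition is S4 to S3, with probability 0.1).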


In [None]:
A = np.array([
    [0.0, 0.7, 0.3, 0.0],   # S1 -> *
    [0.0, 0.2, 0.6, 0.2],   # S2 -> *
    [0.0, 0.0, 0.3, 0.7],   # S3 -> *
    [0.0, 0.0, 0.1, 0.9]    # S4 -> *
])
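
The two properties described above can be checked numerically. The following is a minimal sketch, not part of the original lab: it repeats the values of `A` so that it runs on its own, and the names `rows_ok` and `allowed` are introduced here purely for illustration.

```python
import numpy as np

# Same values as the lab's transition matrix, repeated so the snippet is self-contained.
states = ['S1', 'S2', 'S3', 'S4']
A = np.array([
    [0.0, 0.7, 0.3, 0.0],
    [0.0, 0.2, 0.6, 0.2],
    [0.0, 0.0, 0.3, 0.7],
    [0.0, 0.0, 0.1, 0.9],
])

# Each row should be a probability distribution over the next state.
rows_ok = np.allclose(A.sum(axis=1), 1.0)

# List every transition that has non-zero probability.
allowed = [(states[i], states[j]) for i, j in zip(*np.nonzero(A))]

print("Rows sum to 1:", rows_ok)
for src, dst in allowed:
    print(f"{src} -> {dst}")
```

Any transition missing from the printed list has probability zero, so no decoded path can use it.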

## 3. Emission Probability Matrix (B)
The Emission Probability Matrix ($B$) connects the hidden states to the observed world.
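*   Each element $B[i][k]$ is the probability of observing output $k$ given that the system is in state $i$, so each row again sums to 1.

This matrix models the "noise" or variability in how states produce observations: each state most often emits its own symbol (for example, S1 emits O1 with probability 0.6) but can emit any of the others.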


In [None]:
B = np.array([
    [0.6, 0.2, 0.1, 0.1],   # S1 emits O1 O2 O3 O4
    [0.1, 0.7, 0.1, 0.1],   # S2 emits O1 O2 O3 O4
    [0.1, 0.1, 0.6, 0.2],   # S3 emits O1 O2 O3 O4
    [0.2, 0.1, 0.2, 0.5]    # S4 emits O1 O2 O3 O4
])
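
One way to read the emission matrix is column by column: column $k$ holds the likelihood of observation $k$ under each state. The sketch below, which is an illustration rather than part of the lab, repeats `B` and reports the state that best explains each symbol when transitions are ignored. The name `best_state_per_obs` is an assumption made for this example.

```python
import numpy as np

# Same values as the lab's emission matrix, repeated so the snippet is self-contained.
states = ['S1', 'S2', 'S3', 'S4']
observations = ['O1', 'O2', 'O3', 'O4']
B = np.array([
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.2],
    [0.2, 0.1, 0.2, 0.5],
])

# For each observation (column), find the state (row) with the highest emission probability.
best_state_per_obs = B.argmax(axis=0)

for k, i in enumerate(best_state_per_obs):
    print(f"{observations[k]} is most likely emitted by {states[i]} (p = {B[i, k]:.1f})")
```

Viterbi decoding combines these per-observation likelihoods with the transition probabilities, so the decoded path can differ from this column-wise guess in general.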

## 4. Initial Probabilities (Pi)
The Initial Probability vector ($\pi$) defines the starting point of the process.
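*   Each element $\pi[i]$ is the probability that the system begins in state $i$.

It is used in the initialization step of the Viterbi algorithm, where it is multiplied by the emission probability of the first observation. Here all of the probability mass is on S1, so every decoded path starts at /h/.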


In [None]:
pi = np.array([1.0, 0.0, 0.0, 0.0])     # Always start at /h/
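
To see what this choice of $\pi$ does in the first step, the sketch below computes the initial scores on its own. It repeats the lab's `pi` and `B`; the name `delta0` is introduced here for illustration and corresponds to the first row of the `delta` table used later.

```python
import numpy as np

# Same values as in the lab, repeated so the snippet is self-contained.
pi = np.array([1.0, 0.0, 0.0, 0.0])
B = np.array([
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.2],
    [0.2, 0.1, 0.2, 0.5],
])
first_obs = 0  # index of O1

# Initialization: weight each state's prior by how well it explains the first observation.
delta0 = pi * B[:, first_obs]
print(delta0)
```

Only the S1 entry is non-zero (0.6), because every other state has zero initial probability.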

## 5. Map Observations to Indices
To index into the matrices directly, we convert the symbolic observation sequence (e.g., ['O1', 'O2']) into numerical indices.
*   With the observation vocabulary ['O1', 'O2', 'O3', 'O4'], 'O1' becomes 0, 'O2' becomes 1, and so on.

Each index selects a column of the Emission Matrix ($B$).


In [None]:
obs_seq = [0, 1, 2, 3]   # [O1, O2, O3, O4]
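
The indices above are written by hand. A small sketch of deriving them from the symbols instead is shown below; the names `obs_index` and `observed` are hypothetical and do not appear in the lab code.

```python
# Observation vocabulary, as defined in Section 1.
observations = ['O1', 'O2', 'O3', 'O4']

# Lookup table from symbol to column index in B.
obs_index = {symbol: k for k, symbol in enumerate(observations)}

# The symbolic sequence to decode (hypothetical variable name).
observed = ['O1', 'O2', 'O3', 'O4']
obs_seq = [obs_index[symbol] for symbol in observed]

print(obs_seq)  # [0, 1, 2, 3]
```

Building the sequence this way raises a `KeyError` for any symbol that is not in the vocabulary, which catches typos early.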

## 6. Viterbi Algorithm Implementation
The implementation keeps two tables: `delta[t, j]`, the probability of the best path that ends in state $j$ at time $t$, and `psi[t, j]`, the backpointer to the previous state on that path. It works in three main phases:
1.  **Initialization**: We calculate the probability of the first state based on the Initial Probabilities ($\pi$) and the Emission Probabilities ($B$) for the first observation.
2.  **Recursion**: For each subsequent time step $t$, we calculate the probability of being in each state $j$. We do this by considering all possible previous states $i$, multiplying the probability of being in $i$ at $t-1$ by the transition probability from $i$ to $j$ and the emission probability of the current observation. We take the maximum of these values. Crucially, we store a "backpointer" to the previous state that yielded this maximum probability.
3.  **Termination & Backtracking**: After processing all observations, we identify the final state with the highest probability. We then follow the backpointers in reverse order to reconstruct the optimal path of states from the end back to the beginning.
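
The three phases can also be written as equations. The notation below is an assumption made for this summary rather than something taken from the lab handout: time is 0-based as in the code, $o_t$ is the observation index at time $t$, $T$ is the sequence length, and $q^*_t$ is the decoded state at time $t$.

$$
\begin{aligned}
\delta_0(i) &= \pi_i \, B[i][o_0] \\
\delta_t(j) &= \max_i \big[\delta_{t-1}(i)\, A[i][j]\big] \, B[j][o_t], \qquad \psi_t(j) = \arg\max_i \big[\delta_{t-1}(i)\, A[i][j]\big], \qquad 1 \le t \le T-1 \\
P^* &= \max_i \delta_{T-1}(i), \qquad q^*_{T-1} = \arg\max_i \delta_{T-1}(i), \qquad q^*_t = \psi_{t+1}\big(q^*_{t+1}\big)
\end{aligned}
$$

As a worked instance with the matrices defined earlier, only S1 has a non-zero score at $t = 0$, so $\delta_1(\text{S2}) = \delta_0(\text{S1}) \cdot A[\text{S1}][\text{S2}] \cdot B[\text{S2}][\text{O2}] = 0.6 \times 0.7 \times 0.7 = 0.294$.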


In [None]:
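def viterbi(A, B, pi, obs_seq, states):
    """Return the most likely state path for obs_seq and its probability."""
    T = len(obs_seq)   # number of time steps
    N = len(states)    # number of hidden states

    delta = np.zeros((T, N))            # delta[t, j]: best path probability ending in state j at time t
    psi = np.zeros((T, N), dtype=int)   # psi[t, j]: previous state on that best path
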
    print("=== STEP 1: Initialization ===")
    for i in range(N):
        delta[0, i] = pi[i] * B[i, obs_seq[0]]
        print(f"delta[0][{states[i]}] = {delta[0][i]:.6f}")

    print("\n=== STEP 2: Recursion ===")
    for t in range(1, T):
        # Note: `observations` is the module-level list from Section 1, not a parameter of this function.
        print(f"\nTime t = {t} Observation = {observations[obs_seq[t]]}")
        for j in range(N):
            values = delta[t-1] * A[:, j]
            psi[t, j] = np.argmax(values)
            delta[t, j] = np.max(values) * B[j, obs_seq[t]]
            print(f"delta[{t}][{states[j]}] = {delta[t][j]:.8f} (from {states[psi[t,j]]})")

    print("\n=== STEP 3: Termination ===")
    P_star = np.max(delta[T-1])
    last_state = np.argmax(delta[T-1])
    print(f"Most likely sequence probability = {P_star:.10f}")
    print(f"Final state = {states[last_state]}")

    print("\n=== STEP 4: Backtracking ===")
    path = [0]*T
    path[T-1] = last_state

    for t in range(T-2, -1, -1):
        path[t] = psi[t+1][path[t+1]]

    best_path = [states[i] for i in path]
    return best_path, P_star


# Run Viterbi
best_states, best_prob = viterbi(A, B, pi, obs_seq, states)

# Convert states → phonemes
phoneme_map = {'S1':'/h/', 'S2':'/e/', 'S3':'/l/', 'S4':'/o/'}
best_phonemes = [phoneme_map[s] for s in best_states]

print("\n================================")
print("Most Likely State Sequence:", best_states)
print("Most Likely Phoneme Sequence:", best_phonemes)
print(f"Probability of sequence = {best_prob:.10f}")
print("================================")

=== STEP 1: Initialization ===
delta[0][S1] = 0.600000
delta[0][S2] = 0.000000
delta[0][S3] = 0.000000
delta[0][S4] = 0.000000

=== STEP 2: Recursion ===

Time t = 1 Observation = O2
delta[1][S1] = 0.00000000 (from S1)
delta[1][S2] = 0.29400000 (from S1)
delta[1][S3] = 0.01800000 (from S1)
delta[1][S4] = 0.00000000 (from S1)

Time t = 2 Observation = O3
delta[2][S1] = 0.00000000 (from S1)
delta[2][S2] = 0.00588000 (from S2)
delta[2][S3] = 0.10584000 (from S2)
delta[2][S4] = 0.01176000 (from S2)

Time t = 3 Observation = O4
delta[3][S1] = 0.00000000 (from S1)
delta[3][S2] = 0.00011760 (from S2)
delta[3][S3] = 0.00635040 (from S3)
delta[3][S4] = 0.03704400 (from S3)

=== STEP 3: Termination ===
Most likely sequence probability = 0.0370440000
Final state = S4

=== STEP 4: Backtracking ===

Most Likely State Sequence: ['S1', 'S2', 'S3', 'S4']
Most Likely Phoneme Sequence: ['/h/', '/e/', '/l/', '/o/']
Probability of sequence = 0.0370440000


## 7. Inference
The Viterbi Algorithm successfully identified the phoneme sequence (/h/, /e/, /l/, /o/) as the most probable sequence for the given observation order [O1, O2, O3, O4].
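
Because there are only $4^4 = 256$ candidate state paths, this result can be cross-checked by brute force. The sketch below is an independent check rather than part of the lab: it repeats the model parameters, scores every path, and keeps the best one. The helper `path_probability` is a name introduced here for illustration.

```python
import itertools
import numpy as np

# Model parameters, repeated from the lab so the snippet is self-contained.
states = ['S1', 'S2', 'S3', 'S4']
A = np.array([
    [0.0, 0.7, 0.3, 0.0],
    [0.0, 0.2, 0.6, 0.2],
    [0.0, 0.0, 0.3, 0.7],
    [0.0, 0.0, 0.1, 0.9],
])
B = np.array([
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.2],
    [0.2, 0.1, 0.2, 0.5],
])
pi = np.array([1.0, 0.0, 0.0, 0.0])
obs_seq = [0, 1, 2, 3]


def path_probability(path):
    """Joint probability of one state path together with obs_seq."""
    p = pi[path[0]] * B[path[0], obs_seq[0]]
    for t in range(1, len(path)):
        p *= A[path[t - 1], path[t]] * B[path[t], obs_seq[t]]
    return float(p)


# Enumerate all N**T paths and keep the one with the highest joint probability.
all_paths = itertools.product(range(len(states)), repeat=len(obs_seq))
best_path = max(all_paths, key=path_probability)
best_prob = path_probability(best_path)

print([states[i] for i in best_path], f"{best_prob:.6f}")
```

The exhaustive search agrees with the Viterbi output (S1, S2, S3, S4 with probability 0.037044). It is practical only for tiny models, since the number of paths grows as $N^T$, whereas Viterbi needs on the order of $N^2 T$ operations.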