# Hidden Markov Models (HMMs) and Viterbi Algorithm

## 📚 Learning Objectives

By completing this notebook, you will:
- Understand Hidden Markov Models (HMMs) structure and components
- Implement HMMs for sequence prediction
- Implement Viterbi algorithm for sequence decoding
- Apply HMMs to practical problems (speech recognition, POS tagging)

## 🔗 Prerequisites

- ✅ Understanding of probability and Markov chains
- ✅ Python 3.8+ installed

---

## Official Structure Reference

This notebook covers practical activities from **Course 02, Unit 3**:
- Working with Hidden Markov Models (HMMs) for sequence prediction
- Implementing Viterbi algorithm for sequence decoding
- Applying HMMs to practical problems (speech recognition, POS tagging)
- **Source:** `DETAILED_UNIT_DESCRIPTIONS.md` - Unit 3 Practical Content

---

## Introduction to Hidden Markov Models

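**Hidden Markov Models (HMMs)** are statistical models for sequences in which an unobserved state evolves over time and produces the outputs we observe. An HMM is defined by:
- **Hidden states**: The unobserved states (e.g., weather: Sunny, Rainy)
- **Observations**: The observed outputs (e.g., activities: Walk, Shop, Clean)
- **Transitions**: Probabilities of moving from one hidden state to the next
- **Emissions**: Probabilities of each observation given the current hidden state
- **Initial probabilities**: Probabilities of each hidden state at the first time step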
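
As a minimal sketch of how these components combine (together with an initial state distribution), the joint probability of one particular hidden path and an observation sequence is the initial probability times the transition and emission probabilities along the path. The numbers are illustrative assumptions, and `joint_probability` is a hypothetical helper written for this example, not a library function.

```python
# Illustrative parameters (assumed for this sketch)
initial = {'Sunny': 0.6, 'Rainy': 0.4}

# transition[current][next] = P(next | current)
transition = {
    'Sunny': {'Sunny': 0.7, 'Rainy': 0.3},
    'Rainy': {'Sunny': 0.4, 'Rainy': 0.6},
}

# emission[state][obs] = P(obs | state)
emission = {
    'Sunny': {'Walk': 0.6, 'Shop': 0.3, 'Clean': 0.1},
    'Rainy': {'Walk': 0.1, 'Shop': 0.4, 'Clean': 0.5},
}

def joint_probability(path, obs):
    """P(path, obs) for one specific hidden path (hypothetical helper)."""
    # First step: initial probability times the first emission
    prob = initial[path[0]] * emission[path[0]][obs[0]]
    # Later steps: one transition and one emission per time step
    for prev, curr, o in zip(path, path[1:], obs[1:]):
        prob *= transition[prev][curr] * emission[curr][o]
    return prob

p = joint_probability(['Sunny', 'Rainy', 'Rainy'], ['Walk', 'Shop', 'Clean'])
print(f"P(path, observations) = {p:.5f}")
```

The forward algorithm sums this quantity over all possible paths, while Viterbi finds the single path that maximizes it.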


In [None]:
import numpy as np

print("✅ Libraries imported!")
print("Ready to work with HMMs and Viterbi algorithm!")


## Part 1: Simple HMM Implementation

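Let's implement a simple HMM class with the forward algorithm, then instantiate it for a toy weather example in which the weather is hidden and only daily activities are observed.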


In [None]:
class SimpleHMM:
    """Simple Hidden Markov Model implementation"""
    
    def __init__(self, states, observations, transition_probs, emission_probs, initial_probs):
        """
        Parameters:
        - states: List of hidden states
        - observations: List of possible observations
        - transition_probs: Nested dict, transition_probs[current][next] = P(next | current)
        - emission_probs: Nested dict, emission_probs[state][obs] = P(obs | state)
        - initial_probs: Dict of initial state probabilities, initial_probs[state]
        """
        self.states = states
        self.observations = observations
        self.transition_probs = transition_probs
        self.emission_probs = emission_probs
        self.initial_probs = initial_probs
    
    def forward(self, obs_sequence):
        """Forward algorithm: Compute probability of observation sequence"""
        T = len(obs_sequence)
        N = len(self.states)
        
        # Initialize alpha (forward probabilities)
        alpha = np.zeros((T, N))
        
        # Initialization
        for i, state in enumerate(self.states):
            alpha[0, i] = self.initial_probs[state] * self.emission_probs[state][obs_sequence[0]]
        
        # Recursion: sum over previous states i of alpha[t-1, i] * P(state_j | state_i)
        for t in range(1, T):
            for j, state_j in enumerate(self.states):
                alpha[t, j] = sum(
                    alpha[t-1, i] * self.transition_probs[self.states[i]][state_j]
                    for i in range(N)
                ) * self.emission_probs[state_j][obs_sequence[t]]
        
        # Termination: P(obs_sequence) is the sum of the final forward probabilities
        return alpha, alpha[T-1, :].sum()

# Example: Weather HMM
states = ['Sunny', 'Rainy']
observations = ['Walk', 'Shop', 'Clean']

# Transition probabilities: P(next_state | current_state)
transition_probs = {
    'Sunny': {'Sunny': 0.7, 'Rainy': 0.3},
    'Rainy': {'Sunny': 0.4, 'Rainy': 0.6}
}

# Emission probabilities: P(observation | state)
emission_probs = {
    'Sunny': {'Walk': 0.6, 'Shop': 0.3, 'Clean': 0.1},
    'Rainy': {'Walk': 0.1, 'Shop': 0.4, 'Clean': 0.5}
}

# Initial probabilities
initial_probs = {'Sunny': 0.6, 'Rainy': 0.4}

# Create HMM
hmm = SimpleHMM(states, observations, transition_probs, emission_probs, initial_probs)

print("=" * 60)
print("Hidden Markov Model: Weather Prediction")
print("=" * 60)
print(f"States: {states}")
print(f"Observations: {observations}")
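
One way to sanity-check a forward implementation is to compare it with brute-force enumeration over every possible hidden path, which is feasible only for tiny models. The sketch below is a standalone, vectorized variant that assumes the transitions are stored as a matrix with `A[i, j] = P(next=j | current=i)`; `forward_likelihood` and `brute_force_likelihood` are hypothetical helper names, not part of the class above.

```python
import itertools
import numpy as np

# Weather parameters as arrays; state order is [Sunny, Rainy]
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],        # A[i, j] = P(next=j | current=i)
              [0.4, 0.6]])
B = np.array([[0.6, 0.3, 0.1],   # B[i, k] = P(obs=k | state=i)
              [0.1, 0.4, 0.5]])
obs = [0, 1, 2]                  # indices for Walk, Shop, Clean

def forward_likelihood(pi, A, B, obs):
    """P(obs) via the forward recursion, O(T * N^2)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        # (alpha @ A)[j] = sum_i alpha[i] * A[i, j]
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

def brute_force_likelihood(pi, A, B, obs):
    """P(obs) by summing the joint probability over all N^T paths."""
    total = 0.0
    for path in itertools.product(range(len(pi)), repeat=len(obs)):
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total

fast = forward_likelihood(pi, A, B, obs)
slow = brute_force_likelihood(pi, A, B, obs)
print(f"forward: {fast:.5f}, brute force: {slow:.5f}")
```

The two values should agree up to floating-point error. A mismatch usually points to a transition table indexed in the wrong direction.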


## Part 2: Viterbi Algorithm for Sequence Decoding

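The Viterbi algorithm uses dynamic programming to find the single most likely sequence of hidden states given a sequence of observations. It mirrors the forward algorithm, but replaces the sum over previous states with a max and stores a backpointer at each step so the best path can be recovered.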


In [None]:
def viterbi(hmm, obs_sequence):
    """
    Viterbi algorithm: Find most likely sequence of hidden states
    
    Returns:
    - best_path: Most likely state sequence
    - best_prob: Joint probability of the best path and the observations
    """
    T = len(obs_sequence)
    N = len(hmm.states)
    
    # Initialize viterbi and backpointer tables
    viterbi_table = np.zeros((T, N))
    backpointer = np.zeros((T, N), dtype=int)
    
    # Initialization
    for i, state in enumerate(hmm.states):
        viterbi_table[0, i] = hmm.initial_probs[state] * hmm.emission_probs[state][obs_sequence[0]]
        backpointer[0, i] = 0
    
    # Recursion
    for t in range(1, T):
        for j, state_j in enumerate(hmm.states):
            # Find best previous state i, using P(state_j | state_i)
            probs = [
                viterbi_table[t-1, i] * hmm.transition_probs[hmm.states[i]][state_j]
                for i in range(N)
            ]
            best_prev = np.argmax(probs)
            viterbi_table[t, j] = probs[best_prev] * hmm.emission_probs[state_j][obs_sequence[t]]
            backpointer[t, j] = best_prev
    
    # Termination: Find best final state
    best_final = np.argmax(viterbi_table[T-1, :])
    best_prob = viterbi_table[T-1, best_final]
    
    # Backtrack from the best final state to recover the best path
    best_path = [hmm.states[best_final]]
    current = best_final
    for t in range(T-1, 0, -1):
        current = backpointer[t, current]
        best_path.insert(0, hmm.states[current])
    
    return best_path, best_prob

# Example: Decode observation sequence
obs_seq = ['Walk', 'Shop', 'Clean']
print("=" * 60)
print(f"Observation Sequence: {obs_seq}")
print("=" * 60)

best_states, prob = viterbi(hmm, obs_seq)
print(f"Most likely state sequence: {best_states}")
print(f"Probability: {prob:.6f}")
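
Multiplying many probabilities can underflow to zero on long sequences, so a common variant runs Viterbi on log probabilities and adds instead of multiplying. The following is a minimal standalone sketch under that assumption, using array parameters with `A[i, j] = P(next=j | current=i)`; `viterbi_log` is a hypothetical name and is not the function defined above.

```python
import numpy as np

def viterbi_log(pi, A, B, obs):
    """Log-space Viterbi sketch. Returns (state index path, log joint probability)."""
    with np.errstate(divide='ignore'):   # log(0) = -inf is acceptable here
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    T, N = len(obs), len(pi)
    score = np.empty((T, N))
    back = np.zeros((T, N), dtype=int)
    score[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # cand[i, j]: best log-score ending in state i at t-1, then moving to j
        cand = score[t - 1][:, None] + log_A
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_B[:, obs[t]]
    # Backtrack from the best final state
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return path, float(score[-1].max())

# Weather parameters; state order is [Sunny, Rainy], obs order is [Walk, Shop, Clean]
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]])

path, log_prob = viterbi_log(pi, A, B, [0, 1, 2])
print(path, np.exp(log_prob))
```

Exponentiating the returned log score recovers the ordinary probability for short sequences. For long ones, it is usually better to keep working with the log value.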


## Part 3: Application - Part-of-Speech Tagging

Let's apply HMMs to POS tagging (simplified example).


In [None]:
# Example: POS Tagging HMM
pos_states = ['Noun', 'Verb', 'Det']
pos_words = ['the', 'cat', 'runs']

# Transition: pos_transitions[current][next] = P(next POS | current POS)
pos_transitions = {
    'Noun': {'Noun': 0.1, 'Verb': 0.3, 'Det': 0.6},
    'Verb': {'Noun': 0.5, 'Verb': 0.2, 'Det': 0.3},
    'Det': {'Noun': 0.7, 'Verb': 0.2, 'Det': 0.1}
}

# Emission: P(word | POS)
pos_emissions = {
    'Noun': {'the': 0.1, 'cat': 0.7, 'runs': 0.2},
    'Verb': {'the': 0.05, 'cat': 0.1, 'runs': 0.85},
    'Det': {'the': 0.8, 'cat': 0.15, 'runs': 0.05}
}

pos_initial = {'Noun': 0.3, 'Verb': 0.3, 'Det': 0.4}

pos_hmm = SimpleHMM(pos_states, pos_words, pos_transitions, pos_emissions, pos_initial)

# Tag sentence: "the cat runs"
sentence = ['the', 'cat', 'runs']
pos_sequence, pos_prob = viterbi(pos_hmm, sentence)

print("=" * 60)
print("POS Tagging Example:")
print("=" * 60)
print(f"Sentence: {sentence}")
print(f"Tagged: {list(zip(sentence, pos_sequence))}")
print(f"Probability: {pos_prob:.6f}")
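
Hand-written probability tables are easy to get wrong, so it can help to confirm that each conditional distribution sums to 1 before decoding. The sketch below copies the POS tables from this example; `invalid_rows` is a hypothetical helper introduced only for illustration.

```python
import math

def invalid_rows(table, tol=1e-9):
    """Return the keys whose inner distribution does not sum to 1 (hypothetical helper)."""
    return [key for key, dist in table.items()
            if not math.isclose(sum(dist.values()), 1.0, abs_tol=tol)]

# POS tables copied from the example
pos_transitions = {
    'Noun': {'Noun': 0.1, 'Verb': 0.3, 'Det': 0.6},
    'Verb': {'Noun': 0.5, 'Verb': 0.2, 'Det': 0.3},
    'Det': {'Noun': 0.7, 'Verb': 0.2, 'Det': 0.1},
}
pos_emissions = {
    'Noun': {'the': 0.1, 'cat': 0.7, 'runs': 0.2},
    'Verb': {'the': 0.05, 'cat': 0.1, 'runs': 0.85},
    'Det': {'the': 0.8, 'cat': 0.15, 'runs': 0.05},
}
pos_initial = {'Noun': 0.3, 'Verb': 0.3, 'Det': 0.4}

# An empty list means every row is a valid distribution
print("Bad transition rows:", invalid_rows(pos_transitions))
print("Bad emission rows:", invalid_rows(pos_emissions))
print("Bad initial distribution:", invalid_rows({'initial': pos_initial}))

# A deliberately broken table, to show what a failure looks like
broken = {'Noun': {'Noun': 0.5, 'Verb': 0.6}}
print("Bad rows in broken table:", invalid_rows(broken))
```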


## Summary

### Key Concepts:
1. **HMM Components**: States, observations, transitions, emissions, initial probabilities
2. **Forward Algorithm**: Compute probability of observation sequence
3. **Viterbi Algorithm**: Find most likely hidden state sequence
4. **Applications**: Speech recognition, POS tagging, sequence prediction

### Applications:
- Natural language processing (POS tagging, NER)
- Speech recognition
- Bioinformatics (gene prediction)
- Time series prediction

**Reference:** Course 02, Unit 3: "Working with Hidden Markov Models (HMMs)" and "Implementing Viterbi algorithm for sequence decoding"
