*Import Libraries*

# Lab 8: Hidden Markov Models (HMM)

## Introduction
This lab demonstrates the components of a Hidden Markov Model (HMM). An HMM is a statistical model where the system being modeled is assumed to be a Markov process with unobserved (hidden) states.


In [None]:
import numpy as np
import random

*Define HMM Parameters (Initial, Transition, Emission)*

*(Task a)*

## 1. Define HMM Parameters
In this section, we define the fundamental components of the HMM used throughout this lab, with speech as the running example:
*   **Hidden States**: These represent the underlying reality that we cannot directly observe. In speech, these could be phonemes (e.g., /s/, /p/, /a/, /t/).
*   **Observations**: These are the visible outputs generated by the hidden states. In speech, these would be the acoustic features.
*   **Initial Probability ($\pi$)**: The probability of the system starting in a specific state.
*   **Transition Probability ($A$)**: The probability of moving from one state to another. For example, the probability that phoneme /s/ is followed by /p/.
*   **Emission Probability ($B$)**: The probability of observing a specific output given that the system is in a particular state.
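
In symbols, one common way to write these three parameter sets is sketched below. The names $q_t$ (hidden state at step $t$), $o_t$ (observation at step $t$), $s_i$ (the $i$-th hidden state), and $v_k$ (the $k$-th observation symbol) are notation assumed here for illustration; the lab's code does not use them.

$$
\begin{aligned}
\pi_i &= P(q_1 = s_i), & \sum_i \pi_i &= 1 \\
a_{ij} &= P(q_{t+1} = s_j \mid q_t = s_i), & \sum_j a_{ij} &= 1 \\
b_i(k) &= P(o_t = v_k \mid q_t = s_i), & \sum_k b_i(k) &= 1
\end{aligned}
$$

The sum constraints say that $\pi$, every row of $A$, and every row of $B$ is a probability distribution.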


In [None]:
# Hidden states (phonemes)
states = ['/s/', '/p/', '/iː/', '/tʃ/']
num_states = len(states)

# Observations
observations = ['Energy', 'Pitch', 'Duration']
num_obs = len(observations)

# Initial Probabilities (the model always starts in /s/)
pi = {
    '/s/': 1.0,
    '/p/': 0.0,
    '/iː/': 0.0,
    '/tʃ/': 0.0
}

# Transition Probability Matrix
transition_prob = {
    '/s/' : {'/s/': 0.1, '/p/': 0.8, '/iː/': 0.1, '/tʃ/': 0.0},
    '/p/' : {'/s/': 0.0, '/p/': 0.1, '/iː/': 0.8, '/tʃ/': 0.1},
    '/iː/': {'/s/': 0.0, '/p/': 0.0, '/iː/': 0.2, '/tʃ/': 0.8},
    '/tʃ/': {'/s/': 0.2, '/p/': 0.0, '/iː/': 0.0, '/tʃ/': 0.8}
}

# Emission Probability Matrix
emission_prob = {
    '/s/' : {'Energy':0.7, 'Pitch':0.2, 'Duration':0.1},
    '/p/' : {'Energy':0.5, 'Pitch':0.3, 'Duration':0.2},
    '/iː/': {'Energy':0.3, 'Pitch':0.5, 'Duration':0.2},
    '/tʃ/': {'Energy':0.4, 'Pitch':0.4, 'Duration':0.2}
}
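
Since every row of a transition or emission table should sum to 1, a quick sanity check can catch typos in hand-entered probabilities. The following is a minimal sketch, not part of the original lab: the helper name `check_row_stochastic` and the table `toy_table` are hypothetical and used only for illustration.

```python
import math

def check_row_stochastic(table, tol=1e-9):
    """Return the row labels whose entries are not a valid distribution.

    A row is valid when no entry is negative and the entries sum to 1
    (within the tolerance `tol`).
    """
    bad_rows = []
    for row_label, row in table.items():
        has_negative = any(p < 0 for p in row.values())
        sums_to_one = math.isclose(sum(row.values()), 1.0, abs_tol=tol)
        if has_negative or not sums_to_one:
            bad_rows.append(row_label)
    return bad_rows

# Hypothetical two-state table, used only to illustrate the check.
toy_table = {
    'A': {'A': 0.3, 'B': 0.7},
    'B': {'A': 0.6, 'B': 0.5},  # sums to 1.1, so this row is flagged
}
print(check_row_stochastic(toy_table))  # ['B']
```

In the notebook, the same helper could be called on `transition_prob`, `emission_prob`, and `{'pi': pi}`; with the values defined above, each call should return an empty list.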

*Display the HMM Matrices*

*(Task b)*

## 2. Display HMM Matrices
We print the defined matrices (Initial, Transition, and Emission) to verify that the model structure is correctly set up. These matrices fully characterize the HMM and govern its behavior.


In [None]:
def display_hmm(pi, transition_prob, emission_prob):
    print("\n=== Initial Probabilities ===")
    for s in pi:
        print(f"{s}: {pi[s]}")

    print("\n=== Transition Probability Matrix ===")
    for s_from in transition_prob:
        print(f"{s_from}: {transition_prob[s_from]}")

    print("\n=== Emission Probability Matrix ===")
    for st in emission_prob:
        print(f"{st}: {emission_prob[st]}")

display_hmm(pi, transition_prob, emission_prob)


=== Initial Probabilities ===
/s/: 1.0
/p/: 0.0
/iː/: 0.0
/tʃ/: 0.0

=== Transition Probability Matrix ===
/s/: {'/s/': 0.1, '/p/': 0.8, '/iː/': 0.1, '/tʃ/': 0.0}
/p/: {'/s/': 0.0, '/p/': 0.1, '/iː/': 0.8, '/tʃ/': 0.1}
/iː/: {'/s/': 0.0, '/p/': 0.0, '/iː/': 0.2, '/tʃ/': 0.8}
/tʃ/: {'/s/': 0.2, '/p/': 0.0, '/iː/': 0.0, '/tʃ/': 0.8}

=== Emission Probability Matrix ===
/s/: {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1}
/p/: {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2}
/iː/: {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2}
/tʃ/: {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2}


*Generate a Phoneme Sequence + Observations*

*(Task c)* *We simulate phoneme transitions and sample observations from emission probabilities.*

## 3. Generate Sequence
This function simulates the HMM process to generate a sequence of states and observations.
1.  **Initialization**: The process begins by selecting a starting state from the Initial Probability distribution ($\pi$). With the values defined above, this is always /s/.
2.  **Emission**: Once in a state, the system emits an observation based on the Emission Probability distribution ($B$) for that state.
3.  **Transition**: The system then transitions to a new state based on the Transition Probability distribution ($A$) of the current state.
4.  **Repetition**: Steps 2 and 3 are repeated until the sequence reaches the desired length.

The implementation samples the full state path first and then draws one observation per state. This yields the same distribution as interleaving the two steps, because each emission depends only on the state that produced it. This simulation helps us understand how the model generates data.


In [None]:
def sample_from_distribution(dist_dict):
    """Randomly pick a key based on probability distribution."""
    items = list(dist_dict.items())
    keys = [k for k, _ in items]
    probs = [p for _, p in items]
    return random.choices(keys, weights=probs, k=1)[0]

def generate_sequence(length=4):
    sequence = []
    obs_sequence = []

    # Draw the starting state from pi (always /s/ with the values above)
    current_state = sample_from_distribution(pi)
    sequence.append(current_state)

    # Generate next states
    for _ in range(length - 1):
        next_state = sample_from_distribution(transition_prob[current_state])
        sequence.append(next_state)
        current_state = next_state

    # Generate observations for each state
    for st in sequence:
        obs = sample_from_distribution(emission_prob[st])
        obs_sequence.append(obs)

    return sequence, obs_sequence

phoneme_seq, obs_seq = generate_sequence()

print("\nGenerated Phoneme Sequence:", phoneme_seq)
print("Generated Observations:", obs_seq)


Generated Phoneme Sequence: ['/s/', '/p/', '/iː/', '/tʃ/']
Generated Observations: ['Pitch', 'Energy', 'Duration', 'Energy']
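
For comparison, the four steps listed in this section can also be written as a single emit-then-transition loop. The sketch below is an alternative under stated assumptions, not the lab's implementation: the names `sample_key` and `generate_interleaved`, the `seed` parameter, and the two-state toy model are all hypothetical. The toy model is deterministic so that the output is predictable.

```python
import random

def sample_key(dist, rng):
    """Draw one key from a {outcome: probability} dictionary."""
    keys = list(dist)
    weights = [dist[k] for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]

def generate_interleaved(pi, A, B, length, seed=None):
    """Generate (states, observations) with an emit-then-transition loop."""
    rng = random.Random(seed)            # local RNG so runs can be reproduced
    states, observations = [], []
    state = sample_key(pi, rng)          # step 1: initial state from pi
    for t in range(length):
        states.append(state)
        observations.append(sample_key(B[state], rng))  # step 2: emit
        if t < length - 1:
            state = sample_key(A[state], rng)           # step 3: transition
    return states, observations

# Hypothetical deterministic two-state model: X and Y alternate,
# X always emits 'a' and Y always emits 'b'.
toy_pi = {'X': 1.0, 'Y': 0.0}
toy_A = {'X': {'X': 0.0, 'Y': 1.0}, 'Y': {'X': 1.0, 'Y': 0.0}}
toy_B = {'X': {'a': 1.0, 'b': 0.0}, 'Y': {'a': 0.0, 'b': 1.0}}

print(generate_interleaved(toy_pi, toy_A, toy_B, length=4, seed=0))
# (['X', 'Y', 'X', 'Y'], ['a', 'b', 'a', 'b'])
```

Passing the parameters as arguments instead of reading module-level variables makes it easy to try the same loop on the lab's `pi`, `transition_prob`, and `emission_prob`, and fixing `seed` gives repeatable sequences.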


*Inference:*
The implemented Hidden Markov Model can reproduce the phoneme sequence of the
word "speech" (/s/, /p/, /iː/, /tʃ/). The model always starts in /s/, and each
step along this path has transition probability 0.8, so this exact four-state
sequence is the single most likely outcome, with probability 0.8^3 ≈ 0.51.
The remaining runs contain repeated or skipped phonemes. The emission
probabilities then generate one observation label per phoneme (Energy, Pitch,
or Duration); in this lab these labels are discrete symbols standing in for
acoustic features rather than measured values. This demonstrates how HMMs can
model both phoneme transitions and the observations they produce in speech
processing tasks.
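
How often the "speech" ordering appears can be checked directly by multiplying the initial probability of the first state by the transition probabilities along the path. The sketch below is illustrative only: `path_probability` is a hypothetical helper, and the two tables are copied from the parameter cell so that the block runs on its own.

```python
def path_probability(path, pi, A):
    """Probability of one specific hidden-state path (emissions ignored)."""
    prob = pi[path[0]]
    # Multiply in the transition probability of each consecutive pair.
    for prev_state, next_state in zip(path, path[1:]):
        prob *= A[prev_state][next_state]
    return prob

# Values copied from the lab's parameter definitions.
pi = {'/s/': 1.0, '/p/': 0.0, '/iː/': 0.0, '/tʃ/': 0.0}
transition_prob = {
    '/s/' : {'/s/': 0.1, '/p/': 0.8, '/iː/': 0.1, '/tʃ/': 0.0},
    '/p/' : {'/s/': 0.0, '/p/': 0.1, '/iː/': 0.8, '/tʃ/': 0.1},
    '/iː/': {'/s/': 0.0, '/p/': 0.0, '/iː/': 0.2, '/tʃ/': 0.8},
    '/tʃ/': {'/s/': 0.2, '/p/': 0.0, '/iː/': 0.0, '/tʃ/': 0.8},
}

speech = ['/s/', '/p/', '/iː/', '/tʃ/']
print(round(path_probability(speech, pi, transition_prob), 3))  # 0.512
```

Roughly half of the generated four-state sequences should therefore match the "speech" ordering exactly.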