# Lab Exercise 8 – Hidden Markov Model (HMM)

**Aim:** To implement a Hidden Markov Model (HMM) that simulates the phoneme transitions of the word 'speech', as used in speech processing.

**Tasks:**
1. Represent HMM parameters (initial, transition, and emission probabilities).
2. Display the matrices.
3. Generate a sequence of phonemes and observations.
4. Provide an inference.

In [1]:
import random
import pandas as pd

### Task (a): Represent HMM Parameters
Here we define the hidden states (phonemes), observations (acoustic properties), and the probability matrices.

In [2]:
# 1. Hidden States (Phonemes)
states = ['/s/', '/p/', '/ie:/', '/tS/']

# 2. Observations (Acoustic Properties)
observations = ['Energy', 'Pitch', 'Duration']

# 3. Initial Probabilities (Start Probability)
# Per the exercise statement, the initial probability of starting in phoneme '/s/' is 1.
start_probability = {
    '/s/': 1.0, 
    '/p/': 0.0, 
    '/ie:/': 0.0, 
    '/tS/': 0.0
}

# 4. Transition Probabilities
# Probabilities of moving from one phoneme to another
transition_probability = {
    '/s/':   {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/':   {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/':  {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8}
}

# 5. Emission Probabilities
# Probabilities of an observation given a phoneme
emission_probability = {
    '/s/':   {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1},
    '/p/':   {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2},
    '/ie:/': {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2},
    '/tS/':  {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2}
}
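
Each row of the transition and emission tables is a probability distribution, so it should sum to 1. The sketch below shows one way to check this. The helper name `check_rows_sum_to_one` and its tolerance are assumptions made for illustration, not part of the exercise, and the transition table is repeated only so the block runs on its own.

```python
import math

# Copy of the transition table defined above, repeated so this sketch is self-contained.
transition_probability = {
    '/s/':   {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/':   {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/':  {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8},
}

def check_rows_sum_to_one(table, tol=1e-9):
    """Return the row labels whose probabilities do not sum to 1 (hypothetical helper)."""
    return [
        row for row, probs in table.items()
        if not math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    ]

bad_rows = check_rows_sum_to_one(transition_probability)
print("Rows that do not sum to 1:", bad_rows)
```

The same helper can be applied to `emission_probability`, since it has the same nested-dictionary shape.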

### Task (b): Display Matrices
A helper function that prints the initial probabilities and displays the transition and emission matrices as pandas DataFrames, with one row per phoneme.

In [3]:
def display_parameters():
    print("-" * 50)
    print("HMM PARAMETERS DISPLAY")
    print("-" * 50)
    
    print("\n1. Initial Probabilities:")
    for state, prob in start_probability.items():
        print(f"   P(Start = {state}) = {prob}")
        
    print("\n2. Transition Probability Matrix:")
    # pd.DataFrame treats the outer dict keys as columns, so transpose (.T)
    # to get rows = current phoneme and columns = next phoneme
    trans_df = pd.DataFrame(transition_probability).T
    print(trans_df)
    
    print("\n3. Emission Probability Matrix:")
    emit_df = pd.DataFrame(emission_probability).T
    print(emit_df)
    print("-" * 50)

### Task (c): Sequence Generation
Simulating the process of generating a sequence of phonemes and their corresponding acoustic observations.

In [4]:
def generate_sequence(length=4):
    """Sample `length` hidden states and one observation per state from the HMM."""
    # 1. Choose initial state
    curr_state = random.choices(
        list(start_probability.keys()), 
        weights=list(start_probability.values())
    )[0]
    
    generated_phonemes = []
    generated_observations = []
    
    print(f"\nGenerating sequence of length {length}...")
    
    for i in range(length):
        generated_phonemes.append(curr_state)
        
        # 2. Generate Observation based on current state (Emission)
        obs_probs = emission_probability[curr_state]
        observation = random.choices(
            list(obs_probs.keys()),
            weights=list(obs_probs.values())
        )[0]
        generated_observations.append(observation)
        
        # 3. Transition to next state
        trans_probs = transition_probability[curr_state]
        next_state = random.choices(
            list(trans_probs.keys()),
            weights=list(trans_probs.values())
        )[0]
        
        curr_state = next_state
        
    return generated_phonemes, generated_observations
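
A single sampled sequence says little about how the model behaves on average. As a rough check, the sketch below samples many state paths and estimates how often the canonical path `/s/ -> /p/ -> /ie:/ -> /tS/` occurs. The helper `sample_states`, the trial count, and the seed are assumptions chosen for illustration, and emissions are left out to keep the example short. The estimate should land near $1.0 \times 0.8 \times 0.8 \times 0.8 = 0.512$, the product of the relevant start and transition probabilities.

```python
import random

# Copies of the start and transition tables, repeated so this sketch is self-contained.
start = {'/s/': 1.0, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.0}
transition = {
    '/s/':   {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/':   {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/':  {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8},
}

def sample_states(length, rng):
    """Sample a hidden-state path of `length` >= 1 (hypothetical helper, no emissions)."""
    state = rng.choices(list(start), weights=list(start.values()))[0]
    path = [state]
    for _ in range(length - 1):
        row = transition[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        path.append(state)
    return path

rng = random.Random(0)  # local generator, so the global random seed is left untouched
target = ['/s/', '/p/', '/ie:/', '/tS/']
trials = 20_000
hits = sum(sample_states(4, rng) == target for _ in range(trials))
frequency = hits / trials
print(f"Estimated P(canonical path) = {frequency:.3f}")
```

Passing a `random.Random` instance instead of calling the module-level functions keeps the experiment reproducible without affecting other cells that rely on `random.seed`.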

### Execution

In [5]:
display_parameters()

# Set seed for reproducibility
random.seed(42) 

# Use a new name so the global `observations` list from In [2] is not overwritten
phonemes, obs_sequence = generate_sequence(length=4)

print("\nRESULTS:")
print(f"Generated phoneme sequence: {phonemes}")
print(f"Corresponding observations: {obs_sequence}")

--------------------------------------------------
HMM PARAMETERS DISPLAY
--------------------------------------------------

1. Initial Probabilities:
   P(Start = /s/) = 1.0
   P(Start = /p/) = 0.0
   P(Start = /ie:/) = 0.0
   P(Start = /tS/) = 0.0

2. Transition Probability Matrix:
       /s/  /p/  /ie:/  /tS/
/s/    0.1  0.8    0.1   0.0
/p/    0.0  0.1    0.8   0.1
/ie:/  0.0  0.0    0.2   0.8
/tS/   0.2  0.0    0.0   0.8

3. Emission Probability Matrix:
       Energy  Pitch  Duration
/s/       0.7    0.2       0.1
/p/       0.5    0.3       0.2
/ie:/     0.3    0.5       0.2
/tS/      0.4    0.4       0.2
--------------------------------------------------

Generating sequence of length 4...

RESULTS:
Generated phoneme sequence: ['/s/', '/p/', '/ie:/', '/tS/']
Corresponding observations: ['Energy', 'Energy', 'Pitch', 'Energy']


### Task (d): Inference

The implemented Hidden Markov Model (HMM) simulates the production of the word "speech" as a stochastic process over four phonemes, each of which emits one acoustic observation per step.

1. **Sequential Modeling:** The **Transition Matrix** strongly biases the model to move forward through the phonemes. For example, once in state `/s/`, there is a $0.8$ ($80\%$) probability of moving to `/p/`. This preserves the temporal order required to form the word "speech".
2. **Duration Modeling:** The self-loop probabilities (e.g., $P(/ie:/ \rightarrow /ie:/) = 0.2$) allow the model to simulate the *duration* of a phoneme. A higher self-loop probability would mean the sound is held longer before transitioning.
3. **Acoustic Variation:** The **Emission Matrix** models the variability in what is observed for a given phoneme. For instance, `/s/` emits the observation `Energy` with probability $0.7$, while `/ie:/` most often emits `Pitch` ($0.5$).
4. **Result:** Combining these probabilities, the most likely four-step phoneme path is `/s/ -> /p/ -> /ie:/ -> /tS/`, with probability $1.0 \times 0.8 \times 0.8 \times 0.8 = 0.512$. The remaining probability mass goes to variations such as repeated or skipped phonemes, reflecting the variability of real speech.
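
The points above can be made quantitative with a short sketch. It computes the joint probability of the generated run (start probability, times transitions, times emissions) and the expected number of consecutive steps spent in a state, which for a self-loop probability $a_{ii} < 1$ is $1 / (1 - a_{ii})$ if the chain is allowed to run without a length limit. The function names `path_probability`, `joint_probability`, and `expected_dwell_steps` are assumptions introduced for illustration, and the tables are repeated so the block runs on its own.

```python
# Copies of the HMM tables, repeated so this sketch is self-contained.
start = {'/s/': 1.0, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.0}
transition = {
    '/s/':   {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/':   {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/':  {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8},
}
emission = {
    '/s/':   {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1},
    '/p/':   {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2},
    '/ie:/': {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2},
    '/tS/':  {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2},
}

def path_probability(states_seq):
    """P(state path): start probability times each transition probability."""
    prob = start[states_seq[0]]
    for prev, curr in zip(states_seq, states_seq[1:]):
        prob *= transition[prev][curr]
    return prob

def joint_probability(states_seq, obs_seq):
    """P(state path and observations): path probability times each emission probability."""
    prob = path_probability(states_seq)
    for state, obs in zip(states_seq, obs_seq):
        prob *= emission[state][obs]
    return prob

def expected_dwell_steps(state):
    """Mean consecutive steps in `state`, assuming its self-loop probability is below 1."""
    return 1.0 / (1.0 - transition[state][state])

path = ['/s/', '/p/', '/ie:/', '/tS/']
obs = ['Energy', 'Energy', 'Pitch', 'Energy']

p_path = path_probability(path)
p_joint = joint_probability(path, obs)
print(f"P(path)               = {p_path:.3f}")
print(f"P(path, observations) = {p_joint:.5f}")
for s in transition:
    print(f"Expected steps in {s}: {expected_dwell_steps(s):.2f}")
```

Under these tables, `/tS/` (self-loop $0.8$) is expected to last about four times as long as `/ie:/` (self-loop $0.2$), which is the duration effect described in point 2.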