## Lab Exercise 7: Hidden Markov Model Implementation

### a) Represent the HMM parameters (initial probabilities, transition probabilities, and emission probabilities) using suitable data structures in Python.
### b) Write a function to neatly display the transition and emission matrices along with the initial probabilities.

In [1]:
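import numpy as np
import pandas as pd

# Initial state distribution: the word always starts at /s/.
initial_probabilities = {'/s/': 1.0, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.0}

# Transition probabilities: outer key is the current phoneme, inner key is the next phoneme.
transition_probabilities = {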
    '/s/': {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/': {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/': {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8}
}

# Emission probabilities: outer key is the phoneme, inner key is the observed acoustic feature.
emission_probabilities = {
    '/s/': {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1},
    '/p/': {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2},
    '/ie:/': {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2},
    '/tS/': {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2}
}

def display_hmm(initial, transition, emission):
    """Print the initial, transition, and emission probabilities as tables."""
    print("Initial Probabilities:")
    print(pd.DataFrame.from_dict(initial, orient='index', columns=['Probability']))

    print("\nTransition Probabilities:")
    print(pd.DataFrame.from_dict(transition, orient='index'))

    print("\nEmission Probabilities:")
    print(pd.DataFrame.from_dict(emission, orient='index'))


display_hmm(initial_probabilities, transition_probabilities, emission_probabilities)

Initial Probabilities:
       Probability
/s/            1.0
/p/            0.0
/ie:/          0.0
/tS/           0.0

Transition Probabilities:
       /s/  /p/  /ie:/  /tS/
/s/    0.1  0.8    0.1   0.0
/p/    0.0  0.1    0.8   0.1
/ie:/  0.0  0.0    0.2   0.8
/tS/   0.2  0.0    0.0   0.8

Emission Probabilities:
       Energy  Pitch  Duration
/s/       0.7    0.2       0.1
/p/       0.5    0.3       0.2
/ie:/     0.3    0.5       0.2
/tS/      0.4    0.4       0.2
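Before sampling from these tables, it can help to confirm that each one is a valid probability distribution. The following is a minimal sketch, not part of the original lab code; the helper name `is_distribution` and the tolerance `tol` are assumptions introduced for illustration.

```python
import numpy as np

# Parameters copied from the cell above so this sketch runs on its own.
initial_probabilities = {'/s/': 1.0, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.0}
transition_probabilities = {
    '/s/': {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/': {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/': {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8},
}
emission_probabilities = {
    '/s/': {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1},
    '/p/': {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2},
    '/ie:/': {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2},
    '/tS/': {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2},
}


def is_distribution(probs, tol=1e-9):
    """Hypothetical helper: True if all values are non-negative and sum to 1."""
    values = np.array(list(probs.values()), dtype=float)
    return bool(np.all(values >= 0) and abs(values.sum() - 1.0) <= tol)


# Check the initial distribution and every row of both tables.
checks = {"initial": is_distribution(initial_probabilities)}
for state, row in transition_probabilities.items():
    checks[f"transition[{state}]"] = is_distribution(row)
for state, row in emission_probabilities.items():
    checks[f"emission[{state}]"] = is_distribution(row)

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'does not sum to 1'}")
```

A small tolerance is used because sums such as 0.7 + 0.2 + 0.1 are not exactly 1.0 in floating point.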


### c) Write a program to generate a single sequence of phonemes and corresponding acoustic observations for the word "speech" based on the defined probabilities.

In [2]:
# Draw the first phoneme from the initial distribution (always '/s/' here).
states, probabilities = zip(*initial_probabilities.items())
current_phoneme = str(np.random.choice(states, p=probabilities))
phoneme_sequence = [current_phoneme]

# Three transitions after the start give four phonemes, one per phoneme of "speech".
for _ in range(3):
    next_states, probabilities = zip(*transition_probabilities[current_phoneme].items())
    current_phoneme = str(np.random.choice(next_states, p=probabilities))
    phoneme_sequence.append(current_phoneme)

# Emit one acoustic observation for each phoneme in the sequence.
observation_sequence = []
for phoneme in phoneme_sequence:
    observations, probabilities = zip(*emission_probabilities[phoneme].items())
    observation_sequence.append(str(np.random.choice(observations, p=probabilities)))

print("\nGenerated Phoneme Sequence:", phoneme_sequence)
print("Generated Observation Sequence:", observation_sequence)


Generated Phoneme Sequence: ['/s/', '/p/', '/tS/', '/tS/']
Generated Observation Sequence: ['Pitch', 'Energy', 'Duration', 'Energy']
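To see how likely this particular run is under the model, the joint probability of the phoneme and observation sequences can be computed by multiplying the initial, transition, and emission probabilities along the path. This is a sketch that is not part of the original lab code; the function name `joint_probability` is an assumption introduced for illustration.

```python
# Parameters copied from the first cell so this sketch runs on its own.
initial_probabilities = {'/s/': 1.0, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.0}
transition_probabilities = {
    '/s/': {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/': {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/': {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8},
}
emission_probabilities = {
    '/s/': {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1},
    '/p/': {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2},
    '/ie:/': {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2},
    '/tS/': {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2},
}


def joint_probability(phonemes, observations):
    """Hypothetical helper: P(phonemes, observations) under the HMM above."""
    prob = initial_probabilities[phonemes[0]]
    # One transition factor per consecutive pair of phonemes.
    for prev, nxt in zip(phonemes, phonemes[1:]):
        prob *= transition_probabilities[prev][nxt]
    # One emission factor per (phoneme, observation) pair.
    for phoneme, obs in zip(phonemes, observations):
        prob *= emission_probabilities[phoneme][obs]
    return prob


# The run shown in the output above.
phonemes = ['/s/', '/p/', '/tS/', '/tS/']
observations = ['Pitch', 'Energy', 'Duration', 'Energy']
p = joint_probability(phonemes, observations)
print(f"{p:.6f}")  # 0.000512
```

The transition factors are 0.8 × 0.1 × 0.8 = 0.064 and the emission factors are 0.2 × 0.5 × 0.2 × 0.4 = 0.008, so the joint probability is 0.064 × 0.008 = 0.000512.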


### d) Write an inference for the above HMM implementation

##### -> Representation of HMM Parameters: The initial probabilities, transition probabilities, and emission probabilities are structured as dictionaries for easy mapping between phonemes and their respective probabilities.
##### -> Sequence Generation: Starting with the initial phoneme /s/, transitions to subsequent phonemes are guided by the transition probability matrix. At each step, the next phoneme is chosen probabilistically based on the transition probabilities of the current phoneme. The corresponding acoustic observations (e.g., energy, pitch, duration) are generated probabilistically based on the emission probabilities.
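##### -> Generated Sequences: In the run shown, the phoneme sequence is ['/s/', '/p/', '/tS/', '/tS/']. The model took the low-probability /p/ → /tS/ transition (0.1) and then the /tS/ self-loop (0.8), so /ie:/ was skipped. Because sampling is random and unseeded, each run can differ, and a generated sequence only approximates the structure of the word "speech." The observation sequence assigns one acoustic feature label to each phoneme.
##### -> Expected Output: The most likely phoneme sequence is ['/s/', '/p/', '/ie:/', '/tS/'], which follows the 0.8 transition at every step and therefore has probability 0.8 × 0.8 × 0.8 = 0.512. A corresponding observation sequence might be ['Energy', 'Pitch', 'Duration', 'Energy'], although any combination of the three feature labels can occur, reflecting the probabilistic nature of the emission process.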
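The points above can be checked empirically. The sketch below wraps the generation steps in a reusable function with a seeded random generator and estimates how often the path ['/s/', '/p/', '/ie:/', '/tS/'] is produced; by the transition table its exact probability is 0.8 × 0.8 × 0.8 = 0.512. The names `sample_from` and `generate`, the seed, and the number of runs are assumptions introduced for illustration, not part of the original lab code.

```python
import numpy as np

# Parameters copied from the first cell so this sketch runs on its own.
initial_probabilities = {'/s/': 1.0, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.0}
transition_probabilities = {
    '/s/': {'/s/': 0.1, '/p/': 0.8, '/ie:/': 0.1, '/tS/': 0.0},
    '/p/': {'/s/': 0.0, '/p/': 0.1, '/ie:/': 0.8, '/tS/': 0.1},
    '/ie:/': {'/s/': 0.0, '/p/': 0.0, '/ie:/': 0.2, '/tS/': 0.8},
    '/tS/': {'/s/': 0.2, '/p/': 0.0, '/ie:/': 0.0, '/tS/': 0.8},
}
emission_probabilities = {
    '/s/': {'Energy': 0.7, 'Pitch': 0.2, 'Duration': 0.1},
    '/p/': {'Energy': 0.5, 'Pitch': 0.3, 'Duration': 0.2},
    '/ie:/': {'Energy': 0.3, 'Pitch': 0.5, 'Duration': 0.2},
    '/tS/': {'Energy': 0.4, 'Pitch': 0.4, 'Duration': 0.2},
}


def sample_from(dist, rng):
    """Hypothetical helper: draw one key of `dist` according to its probabilities."""
    labels = list(dist.keys())
    probs = list(dist.values())
    return str(rng.choice(labels, p=probs))


def generate(length, rng):
    """Hypothetical helper: sample `length` phonemes and one observation per phoneme."""
    phonemes = [sample_from(initial_probabilities, rng)]
    for _ in range(length - 1):
        phonemes.append(sample_from(transition_probabilities[phonemes[-1]], rng))
    observations = [sample_from(emission_probabilities[p], rng) for p in phonemes]
    return phonemes, observations


rng = np.random.default_rng(0)  # fixed seed so repeated runs give the same samples
canonical = ['/s/', '/p/', '/ie:/', '/tS/']
n_runs = 2000

# Count how many sampled phoneme sequences match the canonical path.
hits = sum(generate(4, rng)[0] == canonical for _ in range(n_runs))
fraction = hits / n_runs
print(f"Fraction of runs matching {canonical}: {fraction:.3f} (exact value: 0.512)")
```

Passing the generator `rng` explicitly keeps the sampling reproducible, which the unseeded `np.random.choice` calls in the lab cell are not.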
#### Insights:
##### -> The implementation shows how an HMM models sequences with hidden states (phonemes) and observations (acoustic features), which is a fundamental concept in speech processing.
##### -> By tuning the transition and emission probabilities, the model can simulate different phoneme transitions and acoustic patterns, making it adaptable for various linguistic tasks.
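One way to make the hidden-state idea concrete is to ask how likely an observation sequence is when the phoneme path is unknown, which means summing over every possible path. The sketch below uses the forward algorithm, a standard HMM technique that is not used in the lab code above, and checks it against a brute-force sum over all paths. The array names `pi`, `A`, `B` and both function names are assumptions introduced for illustration; the values are the same tables as in the first cell, stored as NumPy arrays.

```python
import itertools
import numpy as np

states = ['/s/', '/p/', '/ie:/', '/tS/']
symbols = ['Energy', 'Pitch', 'Duration']

# Same parameters as the dictionaries in the first cell, as arrays.
pi = np.array([1.0, 0.0, 0.0, 0.0])          # initial probabilities
A = np.array([[0.1, 0.8, 0.1, 0.0],          # A[i, j] = P(next = j | current = i)
              [0.0, 0.1, 0.8, 0.1],
              [0.0, 0.0, 0.2, 0.8],
              [0.2, 0.0, 0.0, 0.8]])
B = np.array([[0.7, 0.2, 0.1],               # B[i, k] = P(symbol k | state i)
              [0.5, 0.3, 0.2],
              [0.3, 0.5, 0.2],
              [0.4, 0.4, 0.2]])


def forward_likelihood(observations):
    """P(observations), summed over all hidden paths, via the forward recursion."""
    obs_idx = [symbols.index(o) for o in observations]
    alpha = pi * B[:, obs_idx[0]]
    for o in obs_idx[1:]:
        # Propagate one step through the transitions, then weight by the emission.
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())


def brute_force_likelihood(observations):
    """Same quantity, computed by enumerating every hidden path explicitly."""
    obs_idx = [symbols.index(o) for o in observations]
    total = 0.0
    for path in itertools.product(range(len(states)), repeat=len(obs_idx)):
        p = pi[path[0]] * B[path[0], obs_idx[0]]
        for t in range(1, len(path)):
            p *= A[path[t - 1], path[t]] * B[path[t], obs_idx[t]]
        total += p
    return float(total)


observations = ['Pitch', 'Energy', 'Duration', 'Energy']
print("forward:    ", forward_likelihood(observations))
print("brute force:", brute_force_likelihood(observations))
```

Both functions compute the same number; the forward recursion does so in time linear in the sequence length, while the brute-force version enumerates 4^T paths for a sequence of length T.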

## END