## **Hidden Markov Models (HMM)**

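A Hidden Markov Model (HMM) is a statistical model for sequential data in which an unobserved (hidden) sequence of states generates the observations. HMMs formed the foundation of classical speech recognition systems, where the hidden states correspond to phonemes (or parts of phonemes) and the observations are acoustic feature vectors extracted from the audio.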
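To make the idea of hidden states generating observations concrete, here is a minimal sketch of the forward algorithm for a tiny discrete HMM. The function name `forward_likelihood` and all probabilities are made-up illustrations, not part of this notebook's pipeline (which uses Gaussian emissions via `hmmlearn`).

```python
def forward_likelihood(start_prob, trans_mat, emit_prob, observations):
    """Return P(observations) under a discrete HMM via the forward algorithm."""
    n_states = len(start_prob)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [start_prob[s] * emit_prob[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[prev] * trans_mat[prev][s] for prev in range(n_states))
            * emit_prob[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical 2-state model with 3 possible observation symbols (toy values)
start_prob = [0.6, 0.4]
trans_mat = [[0.7, 0.3],
             [0.4, 0.6]]
emit_prob = [[0.5, 0.4, 0.1],
             [0.1, 0.3, 0.6]]

likelihood = forward_likelihood(start_prob, trans_mat, emit_prob, [0, 1, 2])
print(f"P(observations) = {likelihood:.5f}")  # P(observations) = 0.03628
```

Training an HMM amounts to adjusting the start, transition, and emission parameters so that this likelihood is as high as possible for the training data.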

**Imports**

In [14]:
!pip install librosa hmmlearn sounddevice soundfile numpy scipy scikit-learn






In [15]:
import numpy as np
import librosa  # For audio loading and feature extraction
from hmmlearn import hmm
import sounddevice as sd  # For recording audio
import soundfile as sf  # For saving the recording to a WAV file




**Record the audio**

In [16]:

# 1. Record Audio
fs = 44100  # Sample rate
seconds = 5  # Duration of recording

print("Recording audio...")
myrecording = sd.rec(int(seconds * fs), samplerate=fs, channels=1)
sd.wait()  # Wait until recording is finished
print("Recording finished.")

# Optional: Save the recording (for debugging/inspection)
sf.write('recording.wav', myrecording, fs)



Recording audio...
Recording finished.


**Extract features**

In [17]:
# 2. Extract features (MFCCs)
# sd.rec returns an array of shape (n_samples, 1); librosa expects a 1-D signal, so flatten it first.
mfccs = librosa.feature.mfcc(y=myrecording.flatten(), sr=fs, n_mfcc=13)
mfccs = mfccs.T  # Transpose to (n_frames, n_features), the layout hmmlearn expects

# Check the shape: one row per time frame, one column per MFCC coefficient
print("MFCC shape:", mfccs.shape)



MFCC shape: (431, 13)
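The 431 rows follow from the recording length and librosa's framing. As a quick sanity check, assuming librosa's defaults of `hop_length=512` with centered frames (neither is set explicitly in the cell above), the frame count can be reproduced by hand. The helper name `expected_frames` is only illustrative.

```python
def expected_frames(n_samples, hop_length=512):
    """Frame count for centered framing: one frame every hop_length samples."""
    return 1 + n_samples // hop_length


fs = 44100   # sample rate used for the recording
seconds = 5  # recording duration

n_frames = expected_frames(seconds * fs)
frame_step_ms = 1000 * 512 / fs  # time between consecutive frames

print(n_frames)  # 431
print(f"{frame_step_ms:.1f} ms per frame step")  # 11.6 ms per frame step
```

So each row of `mfccs` describes roughly 11.6 ms of audio, and the 13 columns are the MFCC coefficients for that frame.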


**Train the HMM**

In [18]:
# 3. Initialize and train the HMM
# n_components is the number of hidden states; adjust it for your data.
n_states = 3
model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=100)

try:
    model.fit(mfccs)
    print("HMM trained successfully.")

    # 4. Decode the most likely hidden state for each frame of the recording
    # (predict uses the Viterbi algorithm by default)
    hidden_states = model.predict(mfccs)
    print("Predicted Hidden States:", hidden_states)

except ValueError as e:
    print(f"Error during HMM training: {e}")
    print("Check the shape of your MFCC features. It should be (n_samples, n_features).")
    print("Also, ensure that the number of components in your HMM is less than or equal to the number of samples.")


HMM trained successfully.
Predicted Hidden States: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
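A long per-frame state array like the one above is easier to read as a list of contiguous segments. The sketch below collapses runs of identical states and converts frame indices to seconds, again assuming librosa's default `hop_length=512`. The helper `state_segments` and the short example sequence are hypothetical; in the notebook you would pass `hidden_states` instead.

```python
from itertools import groupby


def state_segments(states, hop_length=512, sample_rate=44100):
    """Collapse a per-frame state sequence into (state, start_s, end_s) runs."""
    segments = []
    start = 0
    for state, run in groupby(states):
        length = len(list(run))
        segments.append((
            int(state),
            start * hop_length / sample_rate,
            (start + length) * hop_length / sample_rate,
        ))
        start += length
    return segments


# Hypothetical short sequence standing in for model.predict(mfccs)
example_states = [1, 1, 1, 2, 2, 0, 0, 0, 0, 1]
segments = state_segments(example_states)
for state, start_s, end_s in segments:
    print(f"state {state}: {start_s:.3f}s to {end_s:.3f}s")
```

Note that the state labels (0, 1, 2) are arbitrary indices assigned during unsupervised training; they do not map to phonemes or words unless you establish that mapping yourself.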
