## Hidden Markov Model
An HMM is a probabilistic model that describes the relationship between a sequence of observations and a sequence of hidden states. In NLP it is used in applications such as speech recognition, POS tagging, and NER. It is an extension of the Markov model in which the states themselves are not observed directly.

An HMM consists of two types of states:
- hidden states (latent states)
- observation states (output states)

Key Components
- **Transition Probabilities**: These represent the probabilities of moving from one hidden state to another. Together they form the transition matrix, where the element in row i and column j is the probability of moving from hidden state i to hidden state j.

- **Emission Probabilities**: These represent the probabilities of emitting an observable state from each hidden state. The emission probabilities form the emission matrix, where each element represents the probability of observing a specific output state given a hidden state.

- **Initial State Probabilities**: These represent the probabilities of starting the sequence from a specific hidden state. The initial state probabilities represent the probability distribution of the first hidden state.
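
Taken together, the three components define the joint probability of a hidden state sequence and an observation sequence. A sketch in standard notation follows; the symbols are introduced here for illustration and do not appear in the original text: $\pi_i = P(q_1 = i)$ is the initial state probability, $a_{ij} = P(q_t = j \mid q_{t-1} = i)$ is a transition probability, and $b_j(o) = P(o_t = o \mid q_t = j)$ is an emission probability.

$$
P(q_1, \dots, q_T,\; o_1, \dots, o_T) = \pi_{q_1}\, b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t)
$$

Each row of the transition matrix and of the emission matrix sums to 1, as do the initial state probabilities.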



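HMM parameters (transition probabilities, emission probabilities, and initial state probabilities) are usually estimated from training data with the **Baum-Welch** algorithm, an expectation-maximization procedure that does not need labeled hidden states. The **Viterbi algorithm** is primarily a decoding algorithm; it can also drive an approximate training scheme (Viterbi training) that re-estimates the parameters from the single most likely state sequence.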
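
Baum-Welch builds on the forward (and backward) pass, which computes how likely an observation sequence is under the current parameters. The block below is a minimal sketch of the forward pass only, not a full training loop. The function name `forward_likelihood` is hypothetical, and the numbers are the two-state, three-symbol values used in the example further down.

```python
def forward_likelihood(start, trans, emit, obs):
    """Return P(obs) under the model, summed over all hidden state paths."""
    n_states = len(start)
    # alpha[i] = P(o_1..o_t, q_t = i) for the current time step t
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][o]
            for j in range(n_states)
        ]
    return sum(alpha)


# Illustrative parameters: 2 hidden states, 3 observable symbols
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.5, 0.5]]
emit = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]
obs = [0, 1, 1, 2, 0]  # observation indices

likelihood = forward_likelihood(start, trans, emit, obs)
print(likelihood)
```

Baum-Welch repeatedly adjusts the parameters so that this likelihood does not decrease from one iteration to the next.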

## Example:  Hidden Markov Models
https://www.youtube.com/watch?v=fX5bYmnHqqE

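Problem statement - The teacher wears a red (R), green (G), or blue (B) shirt each day. The shirt color is the observation.

Assumption - The teacher is either happy (H) or sad (S), and today's mood depends only on the previous day's mood. The mood is the hidden state.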

**Transition Matrix**

|  | $$H_t$$ | $$S_t$$ |
|---|---|---|
| $$H_{t-1}$$ | 0.7 | 0.3 |
| $$S_{t-1}$$ | 0.5 | 0.5 |

If the teacher was happy the previous day, the probability that the teacher is happy today is 0.7.<br>
If the teacher was happy the previous day, the probability that the teacher is sad today is 0.3.

**Emission Matrix**

|  | R | G | B |
|---|---|---|---|
| H | 0.8 | 0.1 | 0.1 |
| S | 0.2 | 0.3 | 0.5 |

If the teacher is happy, the probability that the teacher wears a red shirt is 0.8.
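
The two matrices can be combined to score one specific sequence of moods together with the shirts seen. The sketch below multiplies the relevant entries for the moods H, H, S and the shirts R, R, B. The initial distribution (0.6 for H, 0.4 for S) is not given in the tables above; it is taken from the code later in this document. The function name `joint_probability` is hypothetical.

```python
# Initial distribution: an assumption borrowed from the code later in this document
start = {"H": 0.6, "S": 0.4}
# Transition and emission values copied from the tables above
trans = {"H": {"H": 0.7, "S": 0.3}, "S": {"H": 0.5, "S": 0.5}}
emit = {
    "H": {"R": 0.8, "G": 0.1, "B": 0.1},
    "S": {"R": 0.2, "G": 0.3, "B": 0.5},
}


def joint_probability(moods, shirts):
    """P(moods, shirts): start * emission, then transition * emission per day."""
    p = start[moods[0]] * emit[moods[0]][shirts[0]]
    for prev, cur, shirt in zip(moods, moods[1:], shirts[1:]):
        p *= trans[prev][cur] * emit[cur][shirt]
    return p


p = joint_probability(["H", "H", "S"], ["R", "R", "B"])
print(p)  # (0.6 * 0.8) * (0.7 * 0.8) * (0.3 * 0.5), about 0.04032
```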



## Example: Part-of-Speech Tagging with Hidden Markov Models
POS tagging is the process of assigning a part-of-speech label to each word in a sentence.

**Hidden States** - POS tags (the observations are the words of the sentence)

##### Transition Probabilities:
P(Noun | Noun) = 0.5<br>
P(Verb | Verb) = 0.4<br>
P(Preposition | Preposition) = 0.3<br>
....

##### Emission Probabilities
P("I" | PRON) = 0.2<br>
P("work" | Verb) = 0.5<br>
....
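
When a tagged corpus is available, probabilities like these can be estimated by counting and normalizing. The block below is a minimal sketch under that assumption. The three-sentence corpus and the tag names are hypothetical, so the resulting numbers will not match the values listed above, and no smoothing is applied.

```python
from collections import Counter, defaultdict

# Hypothetical tagged corpus: each sentence is a list of (word, tag) pairs
tagged_sentences = [
    [("I", "PRON"), ("work", "VERB")],
    [("I", "PRON"), ("like", "VERB"), ("work", "NOUN")],
    [("dogs", "NOUN"), ("work", "VERB")],
]

transition_counts = defaultdict(Counter)  # previous tag -> next tag -> count
emission_counts = defaultdict(Counter)    # tag -> word -> count

for sentence in tagged_sentences:
    for word, tag in sentence:
        emission_counts[tag][word] += 1
    for (_, prev_tag), (_, tag) in zip(sentence, sentence[1:]):
        transition_counts[prev_tag][tag] += 1


def normalize(counts):
    """Turn nested counts into conditional probabilities (each row sums to 1)."""
    return {
        key: {k: v / sum(row.values()) for k, v in row.items()}
        for key, row in counts.items()
    }


transition_probs = normalize(transition_counts)
emission_probs = normalize(emission_counts)

print(emission_probs["VERB"]["work"])    # P("work" | VERB) in the toy corpus
print(transition_probs["PRON"]["VERB"])  # P(VERB | PRON) in the toy corpus
```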




## Viterbi algorithm
The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states in a hidden Markov model (HMM), given a sequence of observations.
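
The dynamic programming recurrence can be sketched as follows. The notation is introduced here for illustration: $\pi_j$ is the initial probability of state $j$, $a_{ij}$ the transition probability from state $i$ to state $j$, $b_j(o_t)$ the probability of emitting observation $o_t$ from state $j$, $\delta_t(j)$ the probability of the best path that ends in state $j$ at time $t$, and $\psi_t(j)$ the backpointer to the best previous state.

$$
\begin{aligned}
\delta_1(j) &= \pi_j\, b_j(o_1) \\
\delta_t(j) &= \max_i \big[\delta_{t-1}(i)\, a_{ij}\big]\, b_j(o_t), \qquad \psi_t(j) = \arg\max_i\, \delta_{t-1}(i)\, a_{ij} \\
q_T^{*} &= \arg\max_j\, \delta_T(j), \qquad q_t^{*} = \psi_{t+1}(q_{t+1}^{*})
\end{aligned}
$$

The last line is the backtracking step: start from the best final state and follow the backpointers to recover the whole path.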

```python
import numpy as np
from hmmlearn import hmm

# Hidden states (teacher's mood) and observable symbols (shirt color)
state = ['H', 'S']
observation_state = ['R', 'G', 'B']
n_state = len(state)
n_observation = len(observation_state)

# Model parameters
state_probability = np.array([0.6, 0.4])  # initial state probabilities
transition_matrix = np.array([[0.7, 0.3], [0.5, 0.5]])
emission_matrix = np.array([[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]])

# Set the parameters directly instead of fitting them from data
model = hmm.CategoricalHMM(n_components=n_state)
model.startprob_ = state_probability
model.transmat_ = transition_matrix
model.emissionprob_ = emission_matrix

# Observations are indices into observation_state: R, G, G, B, R
observation_seq = np.array([0, 1, 1, 2, 0]).reshape(-1, 1)
# predict the most likely sequence of hidden states
model.predict(observation_seq)
```

Output:

```
array([0, 1, 1, 1, 0], dtype=int64)
```

The decoded indices map to the hidden state sequence H, S, S, S, H.
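
To see what the decoding step does, here is a from-scratch sketch of Viterbi in plain Python with the same parameters and observation sequence. It is an illustrative implementation, not hmmlearn's code, and the function name `viterbi` is hypothetical.

```python
def viterbi(start, trans, emit, obs):
    """Return the most likely hidden state path for the observation indices."""
    n_states = len(start)
    # delta[j] = probability of the best path ending in state j at the current step
    delta = [start[j] * emit[j][obs[0]] for j in range(n_states)]
    backpointers = []
    for o in obs[1:]:
        new_delta, pointers = [], []
        for j in range(n_states):
            # Best previous state for reaching state j
            best_i = max(range(n_states), key=lambda i: delta[i] * trans[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] * trans[best_i][j] * emit[j][o])
        delta = new_delta
        backpointers.append(pointers)
    # Backtrack from the most likely final state
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.5, 0.5]]
emit = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]

path = viterbi(start, trans, emit, [0, 1, 1, 2, 0])
print(path)  # [0, 1, 1, 1, 0], i.e. H, S, S, S, H
```

For long sequences the products of probabilities underflow, so practical implementations usually work with log probabilities and sums instead.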