# **<ins>Viterbi Algorithm</ins>**

The Viterbi algorithm is a <b>dynamic programming algorithm</b> for obtaining the maximum a posteriori estimate of the most likely sequence of hidden states (the Viterbi path) that results in a sequence of observed events, especially in the context of Markov information sources and hidden Markov models (HMMs).
<br>
The purpose of the Viterbi algorithm is to make an inference based on a trained model and some observed data. In the decoding problem, we need to find the <b>most probable</b> sequence of hidden states given the observations. To do so, at every time step t the algorithm keeps, for each state, the most probable path that ends in that state.
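
The dynamic programming step can be written as a short recursion. This is a sketch in notation that is not part of the original text and is assumed here only for illustration: $\pi_s$ is the start probability of state $s$, $a_{rs}$ the transition probability from state $r$ to state $s$, $b_s(o)$ the probability that state $s$ emits observation $o$, and $\delta_t(s)$ the probability of the best path that ends in state $s$ at time $t$, for observations $o_0, \dots, o_{T-1}$.

$$
\delta_0(s) = \pi_s \, b_s(o_0), \qquad
\delta_t(s) = \max_{r} \big[ \delta_{t-1}(r) \, a_{rs} \big] \, b_s(o_t), \quad t = 1, \dots, T-1
$$

$$
\psi_t(s) = \arg\max_{r} \big[ \delta_{t-1}(r) \, a_{rs} \big]
$$

The final state of the Viterbi path is $\arg\max_s \delta_{T-1}(s)$, and the earlier states are recovered by following the back-pointers $\psi_t$ from the last time step to the first.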

## Implementation
This example has two hidden states and three possible observations (emissions). An HMM is specified by five elements:
- states: the hidden states, which are not observed directly; their presence is inferred from the observation symbols that they emit.
- observations: the data we know and can observe.
- start probability: the initial probability distribution over the states at time t=0. In this case, the probability that a person is healthy on the first day is 0.6, while the probability of having a fever is 0.4. When the observation sequence starts, the initial hidden state, which emits the first observation, is drawn from this distribution.
- transition: the transition probability is the probability of moving from one hidden state to another between consecutive time steps.
- emission: the emission probability is the probability that a given hidden state produces a given observation.
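
To make the roles of these elements concrete, here is a minimal sketch of how an HMM with these parameters would generate data: the start distribution picks the first hidden state, each state emits one observation, and the transition distribution picks the next state. The probability values are the ones used in this example; the helper names `draw` and `sample_hmm` are hypothetical and introduced only for illustration.

```python
import random

# Model parameters copied from the example in this section.
start_probability = {'Healthy': 0.6, 'Fever': 0.4}
transition_probability = {
    'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
    'Fever':   {'Healthy': 0.4, 'Fever': 0.6},
}
emission_probability = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}


def draw(distribution, rng):
    """Draw one key from a {outcome: probability} dict (hypothetical helper)."""
    outcomes = list(distribution)
    weights = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample_hmm(length, rng):
    """Generate `length` hidden states and the observations they emit."""
    hidden, observed = [], []
    state = draw(start_probability, rng)  # initial hidden state at t=0
    for _ in range(length):
        hidden.append(state)
        # the current hidden state emits one observation symbol
        observed.append(draw(emission_probability[state], rng))
        # move to the next hidden state
        state = draw(transition_probability[state], rng)
    return hidden, observed


rng = random.Random(0)  # fixed seed only to make the run repeatable
hidden, observed = sample_hmm(5, rng)
print(hidden)
print(observed)
```

In a real decoding problem only `observed` would be available; the Viterbi algorithm estimates the most likely `hidden` sequence from it.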

```python
# Five elements of the HMM
states = ('Healthy', 'Fever')
observations = ('normal', 'cold', 'dizzy')
start_probability = {'Healthy': 0.6, 'Fever': 0.4}

transition_probability = {
    'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
    'Fever':   {'Healthy': 0.4, 'Fever': 0.6},
}

emission_probability = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}


def Viterbi_algo(obs, states, s_pro, t_pro, e_pro):
    # path[s] is the most probable state sequence that ends in state s
    path = {s: [s] for s in states}
    # curr_prob[s] is the probability of that sequence together with the observations so far
    curr_prob = {s: s_pro[s] * e_pro[s][obs[0]] for s in states}
    for i in range(1, len(obs)):
        last_pro = curr_prob
        curr_prob = {}
        new_path = {}
        for curr_state in states:
            # choose the predecessor that maximizes the probability of reaching curr_state
            max_pro, last_sta = max(
                (last_pro[last_state] * t_pro[last_state][curr_state] * e_pro[curr_state][obs[i]], last_state)
                for last_state in states)
            curr_prob[curr_state] = max_pro
            # extend the best path of the chosen predecessor, not the old path of curr_state
            new_path[curr_state] = path[last_sta] + [curr_state]
        path = new_path

    # find the final largest probability
    max_pro = -1
    max_path = None
    for s in states:
        if curr_prob[s] > max_pro:
            max_path = path[s]
            max_pro = curr_prob[s]
        print('%s: %s' % (curr_prob[s], path[s]))  # best path ending in each state and its probability
    return max_path


if __name__ == '__main__':
    obs = ['normal', 'cold', 'dizzy']
    print(Viterbi_algo(obs, states, start_probability, transition_probability, emission_probability))
```

```text
0.00588: ['Healthy', 'Healthy', 'Healthy']
0.01512: ['Healthy', 'Healthy', 'Fever']
['Healthy', 'Healthy', 'Fever']
```


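So the most probable state sequence is 'Healthy Healthy Fever', with a probability of 0.01512. This value is the joint probability of that state sequence and the observations, and it is the largest among all possible state sequences. In other words, the observations ['normal', 'cold', 'dizzy'] were most likely generated by the states ['Healthy', 'Healthy', 'Fever'].
<br>
Here is a graphical representation of the HMM used in this example: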

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/0/0c/An_example_of_HMM.png/450px-An_example_of_HMM.png" style="width:40%">
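
Because this example is tiny, the decoded path can be cross-checked by brute force: enumerate every possible state sequence, compute its joint probability with the observations, and keep the largest. The sketch below does that for the same parameters; `joint_probability` is a hypothetical helper introduced for this check and is not part of the implementation above.

```python
from itertools import product

# Model parameters and observations copied from the example.
states = ('Healthy', 'Fever')
start_probability = {'Healthy': 0.6, 'Fever': 0.4}
transition_probability = {
    'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
    'Fever':   {'Healthy': 0.4, 'Fever': 0.6},
}
emission_probability = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}
obs = ['normal', 'cold', 'dizzy']


def joint_probability(path, obs):
    """P(path, obs): start * emission, then transition * emission for each later step."""
    prob = start_probability[path[0]] * emission_probability[path[0]][obs[0]]
    for prev, curr, symbol in zip(path, path[1:], obs[1:]):
        prob *= transition_probability[prev][curr] * emission_probability[curr][symbol]
    return prob


# Enumerate all len(states) ** len(obs) = 8 candidate paths and keep the best one.
best_path = max(product(states, repeat=len(obs)),
                key=lambda p: joint_probability(p, obs))
best_prob = joint_probability(best_path, obs)
print(list(best_path), best_prob)
```

The best path found this way is Healthy, Healthy, Fever with a probability of approximately 0.01512, in agreement with the Viterbi result. Brute force grows exponentially with the length of the observation sequence, which is why the dynamic programming approach is used in practice.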

## References
- https://en.wikipedia.org/wiki/Viterbi_algorithm
- https://towardsdatascience.com/hidden-markov-model-hmm-simple-explanation-in-high-level-b8722fa1a0d5