# Viterbi Algorithm

Useful resources:

* https://en.wikipedia.org/wiki/Viterbi_algorithm
* http://blog.ivank.net/viterbi-algorithm-clarified.html
* http://www.cs.toronto.edu/~sengels/tutorials/viterbi.html

This notebook uses the example from the Wikipedia article listed above. The Viterbi algorithm performs the decoding step for Hidden Markov Models (HMMs): given a sequence of observations, it finds the most likely sequence of hidden states.

## Doctor example

Consider a village where all villagers are either **healthy** or have a **fever** and only the village doctor can determine whether each has a fever. The doctor diagnoses fever by asking patients how they feel. The villagers may only answer that they feel **normal, dizzy, or cold**.

The doctor believes that the health condition of his patients operates as a discrete Markov chain. There are two states, **"Healthy" and "Fever"**, but the doctor cannot observe them **directly**; they are **hidden** from him. On each day, there is a certain chance that the patient will tell the doctor they feel **"normal", "cold", or "dizzy"**, depending on their health condition.

The observations (normal, cold, dizzy) together with the hidden states (healthy, fever) form a hidden Markov model (HMM).
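
As a sketch of what the decoding step computes, the model can be written down in symbols. The notation here is an assumption and does not come from the notebook: $x_t$ is the hidden state on day $t$, $y_t$ is the observation on day $t$, and $\pi$, $a$, $b$ stand for the start, transition, and emission probabilities (`start_p`, `trans_p`, `emit_p` in the code). The joint probability of a hidden path and the observations is

$$
P(x_{1:T}, y_{1:T}) = \pi_{x_1}\, b_{x_1}(y_1) \prod_{t=2}^{T} a_{x_{t-1} x_t}\, b_{x_t}(y_t)
$$

The Viterbi algorithm finds the path $x_{1:T}$ that maximizes this quantity by filling a table $V_{t,k}$, the probability of the best path that ends in state $k$ on day $t$:

$$
V_{1,k} = \pi_k\, b_k(y_1), \qquad V_{t,k} = b_k(y_t)\, \max_{j}\, V_{t-1,j}\, a_{jk} \quad (t > 1)
$$

The most likely path is then recovered by starting from the state with the largest $V_{T,k}$ and following the maximizing $j$ back through the table.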

In [None]:
# *** Observations/Emissions ***
#
# What you can actually measure/see/observe
obs = ('normal', 'cold', 'dizzy')


# *** Hidden states ***
#
# States which you cannot measure/see/observe
# and cause the observations/emissions.
# Same concept as latent variables?
states = ('Healthy', 'Fever')

# *** Start probability ***
#
# start_probability represents the doctor's belief about 
# which state the HMM is in when the patient first visits 
# (all he knows is that the patient tends to be healthy). 
# The particular probability distribution used here is not 
# the equilibrium one, which is (given the transition 
# probabilities) approximately {'Healthy': 0.57, 'Fever': 0.43}
start_p = {'Healthy': 0.6, 'Fever': 0.4}

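# *** Transition probability ***
#
# trans_p[a][b] is the probability of moving from hidden
# state a on one day to hidden state b on the next day.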
trans_p = {
   'Healthy' : {'Healthy': 0.7, 'Fever': 0.3},
   'Fever' : {'Healthy': 0.4, 'Fever': 0.6}
   }

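# *** Emission probability ***
#
# emit_p[s][o] is the probability that a patient in hidden
# state s reports observation o on a given day.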
emit_p = {
   'Healthy' : {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
   'Fever' : {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}
   }
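
The comment on `start_p` mentions an equilibrium distribution of roughly 0.57 / 0.43. The sketch below checks that claim and also confirms that each probability table sums to 1. It restates the parameters so it runs on its own; the helper name `is_distribution` and the variable `equilibrium` are hypothetical, and the closed-form formula only applies to a two-state chain like this one.

```python
# Parameters restated from the cell above so the check is self-contained.
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {
    'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
    'Fever': {'Healthy': 0.4, 'Fever': 0.6},
}
emit_p = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}


def is_distribution(dist, tol=1e-9):
    """Hypothetical helper: True if the values are non-negative and sum to 1."""
    return all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) < tol


# Every row of every table should be a probability distribution.
tables_ok = (
    is_distribution(start_p)
    and all(is_distribution(row) for row in trans_p.values())
    and all(is_distribution(row) for row in emit_p.values())
)

# For a two-state chain the equilibrium has a closed form:
# the long-run share of a state is proportional to the rate of entering it.
p_hf = trans_p['Healthy']['Fever']  # Healthy -> Fever
p_fh = trans_p['Fever']['Healthy']  # Fever -> Healthy
equilibrium = {
    'Healthy': p_fh / (p_hf + p_fh),
    'Fever': p_hf / (p_hf + p_fh),
}

print(tables_ok)
print({state: round(p, 2) for state, p in equilibrium.items()})
```

Rounded to two decimals the equilibrium is 0.57 for Healthy and 0.43 for Fever, which agrees with the comment and differs from the `start_p` used in the example.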

In [None]:
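def viterbi(obs, states, start_p, trans_p, emit_p):
    """Print the DP table and the most likely hidden-state path for obs."""
    # V[t][st]["prob"] is the probability of the best path that ends in
    # state st at step t; V[t][st]["prev"] is the previous state on that path.
    V = [{}]
    # Base case (t == 0): start probability times emission probability
    for st in states:
        V[0][st] = {"prob": start_p[st] * emit_p[st][obs[0]], "prev": None}
    # Run Viterbi when t > 0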
    for t in range(1, len(obs)):
        V.append({})
        for st in states:
            max_tr_prob = max(V[t-1][prev_st]["prob"]*trans_p[prev_st][st] for prev_st in states)
            for prev_st in states:
                if V[t-1][prev_st]["prob"] * trans_p[prev_st][st] == max_tr_prob:
                    max_prob = max_tr_prob * emit_p[st][obs[t]]
                    V[t][st] = {"prob": max_prob, "prev": prev_st}
                    break
    # Show the table of best-path probabilities
    for line in dptable(V):
        print(line)
    opt = []
    # The highest probability
    max_prob = max(value["prob"] for value in V[-1].values())
    previous = None
    # Get most probable state and its backtrack
    for st, data in V[-1].items():
        if data["prob"] == max_prob:
            opt.append(st)
            previous = st
            break
    # Follow the backtrack till the first observation
    for t in range(len(V) - 2, -1, -1):
        opt.insert(0, V[t + 1][previous]["prev"])
        previous = V[t + 1][previous]["prev"]

    print('The steps of states are ' + ' '.join(opt) + ' with highest probability of %s' % max_prob)

def dptable(V):
    # Yield the lines of a table of best-path probabilities: one column per step, one row per state
    yield " ".join(("%12d" % i) for i in range(len(V)))
    for state in V[0]:
        yield "%.7s: " % state + " ".join("%.7s" % ("%f" % v[state]["prob"]) for v in V)
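
To try the algorithm on the doctor example, the sketch below runs it on the three observations and cross-checks the answer by brute force. It is a compact variant rather than the function above: `viterbi_path` and `joint_prob` are hypothetical names, the variant returns the path and its probability instead of printing, and the parameters are restated so the cell runs on its own.

```python
from itertools import product

# Parameters restated from the doctor example.
obs = ('normal', 'cold', 'dizzy')
states = ('Healthy', 'Fever')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {
    'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
    'Fever': {'Healthy': 0.4, 'Fever': 0.6},
}
emit_p = {
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever': {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}


def viterbi_path(obs, states, start_p, trans_p, emit_p):
    """Hypothetical variant of viterbi() that returns (path, probability)."""
    # Each table entry is (best probability, previous state on that path).
    V = [{st: (start_p[st] * emit_p[st][obs[0]], None) for st in states}]
    for t in range(1, len(obs)):
        row = {}
        for st in states:
            # Best predecessor for reaching st at step t.
            best_prev = max(states, key=lambda p: V[t - 1][p][0] * trans_p[p][st])
            prob = V[t - 1][best_prev][0] * trans_p[best_prev][st] * emit_p[st][obs[t]]
            row[st] = (prob, best_prev)
        V.append(row)
    # Start from the most probable final state and walk the back-pointers.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, V[t][path[0]][1])
    return path, V[-1][last][0]


def joint_prob(path, obs):
    """Joint probability of one hidden path together with the observations."""
    p = start_p[path[0]] * emit_p[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= trans_p[path[t - 1]][path[t]] * emit_p[path[t]][obs[t]]
    return p


path, prob = viterbi_path(obs, states, start_p, trans_p, emit_p)

# Brute force: score all 2**3 candidate paths and keep the best one.
brute_path = max(product(states, repeat=len(obs)), key=lambda p: joint_prob(p, obs))

print(path, prob)
print(list(brute_path) == path)
```

Both approaches select Healthy, Healthy, Fever with a probability of about 0.01512. Brute force enumerates every possible path, so it is only practical as a check on short sequences; the dynamic-programming table is what keeps Viterbi tractable as the sequence grows.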