# Season Prediction Using HMM


## Goal

The goal of this experiment is to understand how to leverage the solutions of the four HMM problems (evaluation, smoothing, most likely explanation and learning) for creating a predictor program.

## Scenario

We will develop a program for predicting seasons. For simplicity we assume that there are only two seasons: Rainy & Dry.

We assume that the seasons cannot be observed directly, i.e., they are hidden. However, a season can be predicted by observing some actions. For example, by observing the pattern of using umbrellas the hidden season can be predicted.

For simplicity, we assume that there are only two observable states: Umbrella and No Umbrella

Thus, by observing umbrellas (or no umbrellas) at an interval of a day, we intend to predict the hidden season.

Note that in rainy season people carry umbrella more often than in a dry season. Therefore, the pattern of carrying umbrella differs based on seasons.

## Challenge

The key challenge in solving this problem is that the HMM model (state transition probability matrix, emission probability matrix and the initial state probability matrix) is unknown. Because of the difference in the pattern of using umbrella in the two seasons, we need two HMMs for modeling the Rainy and Dry season.

Thus, to be able to predict the hidden season based on the observation about umbrella, we need to learn two HMMs for the Rainy and Dry season.

## Problem Description

We need to solve the following two problems:
1. Learn two HMMs for the Rainy and Dry seasons (this is a Learning problem, which is Problem 4 from the Lecture slides)
2. Evaluate a sequence of observations using the two HMMs (this is an Evaluation problem, which is Problem 1 from the Lecture slides)


## Solution Approach

The above two problems are solved as follows:
1. The learning problem is solved by Expectation-Maximization (EM) method. The EM method is implemented by the Baum-Welch algorithm.
2. The evaluation problem is solved by using the Forward algorithm.

We will use two APIs to solve these two problems.


## HmmLearn API
https://hmmlearn.readthedocs.io/en/latest/#

##### Learning:
We will use this API to learn the two HMMs from two long sequence of observations (about umbrellas) collected from Rainy and Dry seasons, respectively.

##### Evaluation:
Then, we will use this API to evaluate any given sequence of observations. Based on the evaluation scores, we will predict the more probable season.

We will use another API for evaluation.

## Hidden_markov API
https://pypi.org/project/hidden_markov/


##### Evaluation:
We will use this API to evaluate any given sequence of observations. Then, based on the evaluation score, we will predict the more probable season.




## Installation

##### HmmLearn
For installing the hmmlearn API use the following command:

   pip install hmmlearn

See the URL below for detail:
https://pypi.org/project/hmmlearn/


##### Hidden_markov
For installing hidden_markov API use the following command:

   pip install hidden_markov

See the URL below for detail:
https://pypi.org/project/hidden_markov/




## Training HMMs using the HmmLearn API

URL: https://hmmlearn.readthedocs.io/en/latest/#

We train two HMMs:
- Rainy Season HMM
- Dry Season HMM

In [1]:

# Train a HMM for the Rainy Season

import numpy as np
from hmmlearn import hmm

states = ["Rainy", "Dry"]
n_states = len(states)

observations = ['Umbrella','No Umbrella']
n_observations = len(observations)

# Observations are coded by 0 and 1
obs_map = {0:'Umbrella', 1:'No Umbrella'}

# Rainy Season: Observation sequence for Umbrellas for 30 days in a Rainy season
seqs = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
                  0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]])



# Learning model parameters: transmission & emission matrices
for n in range(1000):
    model_train_rainy = hmm.MultinomialHMM(n_components=n_states)
    model_train_rainy.fit(seqs)
    
    
print("Rainy Season HMM: Transition Probabilities: ")
print(model_train_rainy.transmat_)
print("\nRainy Season HMM: Emission Probabilities: ")
print(model_train_rainy.emissionprob_)



Rainy Season HMM: Transition Probabilities: 
[[ 0.7129008   0.2870992 ]
 [ 0.72261078  0.27738922]]

Rainy Season HMM: Emission Probabilities: 
[[ 0.96643964  0.03356036]
 [ 0.84900254  0.15099746]]


In [2]:
# Train a HMM for the Dry Season

import numpy as np
from hmmlearn import hmm

states = ["Rainy", "Dry"]
n_states = len(states)

observations = ['Umbrella','No Umbrella']
n_observations = len(observations)

# Observations are coded by 0 and 1
obs_map = {0:'Umbrella', 1:'No Umbrella'}

# Dry Season: Observation sequence for Umbrellas for 30 days in a Dry season
seqs = np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1,
                  1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1]])



# Learning model parameters: transmission & emission matrices
for n in range(1000):
    model_train_dry = hmm.MultinomialHMM(n_components=n_states)
    model_train_dry.fit(seqs)
    
    
print("Dry Season HMM: Transition Probabilities: ")
print(model_train_dry.transmat_)
print("\nDry Season HMM: Emission Probabilities: ")
print(model_train_dry.emissionprob_)

Dry Season HMM: Transition Probabilities: 
[[ 0.41916811  0.58083189]
 [ 0.2526468   0.7473532 ]]

Dry Season HMM: Emission Probabilities: 
[[ 0.25856467  0.74143533]
 [ 0.03419646  0.96580354]]


## Evaluate an Observation Sequence by the two HMMs 

We have learned two HMMs based on two seasons' data (observation about umbrellas).

As part of the learning, we generated two matrices for each HMM:
- State transition probability matrix
- Observation (emission) probability matrix

Next we will evaluate an observed sequence. To do this, we need another matrix: initial state transition matrix

We construct the initial state probability matrices for the two HMMs by assuming that initially there is a 50-50% chance of observing umbrella & no umbrella.


## Evaluation Using the HmmLearn API

Note that this API returns the normalized evaluation probability.

In [4]:

# Evaluate an observation sequence using the HmmLearn API

# Note that this API returns the normalized evaluation probability

import numpy as np
from hmmlearn import hmm


states = ["Rainy", "Dry"]
n_states = len(states)

observations = ['Umbrella', 'No Umbrella']
n_observations = len(observations)


obs_map = {0:'Umbrella', 1:'No Umbrella'}


# Use the following observation sequence for evaluation
obs = np.array([[0, 0, 1, 1]]).T


print("\n")
print("%s : %30s" % ("State Code", "Observed State"))

for i in range (obs.shape[0]):
    print("%10d : %30s" % (obs[i, 0], obs_map[obs[i, 0]]))


    
# First evaluate the sequence using the "Rainy" HMM
print("\n--------------------------------- Model: Rainy -----------------------------------")

model_rainy = hmm.MultinomialHMM(n_components=n_states, init_params="")


# Rainy
model_rainy.startprob_ = np.array([0.5, 0.5])

model_rainy.transprob_ = np.array([
  [model_train_rainy.transmat_[0, 0], model_train_rainy.transmat_[0, 1]],
  [model_train_rainy.transmat_[1, 0], model_train_rainy.transmat_[1, 1]]
])
model_rainy.emissionprob_ = np.array([
  [model_train_rainy.emissionprob_[0, 0], model_train_rainy.emissionprob_[0, 1]],
  [model_train_rainy.emissionprob_[1, 0], model_train_rainy.emissionprob_[1, 1]]
])



# Estimate model parameters.
model_rainy = model_rainy.fit(obs)


# Find most likely state (hidden) sequence corresponding to the observation sequence
# We use the following function "decode": the first argument should be the observation sequence (matrix)
logprob_rainy, predicted_hidden_states_rainy = model_rainy.decode(obs, algorithm="viterbi")


print("\nPredicted Hidden States (Rainy):", ", ".join(map(lambda x: states[x], predicted_hidden_states_rainy)))

print("\nProbability of Rainy:")
print(logprob_rainy)



# Then evaluate the sequence using the "Dry" HMM
print("\n--------------------------------- Model: Dry -----------------------------------")

model_dry = hmm.MultinomialHMM(n_components=n_states, init_params="")

# Dry
model_dry.startprob_ = np.array([0.5, 0.5])
model_dry.transprob_ = np.array([
  [model_train_dry.transmat_[0, 0], model_train_dry.transmat_[0, 1]],
  [model_train_dry.transmat_[1, 0], model_train_dry.transmat_[1, 1]]
])
model_dry.emissionprob_ = np.array([
  [model_train_dry.emissionprob_[0, 0], model_train_dry.emissionprob_[0, 1]],
  [model_train_dry.emissionprob_[1, 0], model_train_dry.emissionprob_[1, 1]]
])



# Estimate model parameters.
model_dry = model_dry.fit(obs)


# Find most likely state (hidden) sequence corresponding to the observation sequence
# We use the following function "decode": the first argument should be the observation sequence (matrix)
logprob_dry, predicted_hidden_states_dry = model_dry.decode(obs, algorithm="viterbi")


print("\nPredicted Hidden States (Dry):", ", ".join(map(lambda x: states[x], predicted_hidden_states_dry)))

print("\nProbability of Dry:")
print(logprob_dry)


# Predict a season by comparing the score of the two probabilities based on two HMMs
print("\n--------------------------------- Season Prediction -----------------------------------")


print("Predicted Season is:")

if(logprob_rainy > logprob_dry):
    print("Rainy Season!")
else:
    print("Dry Season!")
    




State Code :                 Observed State
         0 :                       Umbrella
         0 :                       Umbrella
         1 :                    No Umbrella
         1 :                    No Umbrella

--------------------------------- Model: Rainy -----------------------------------

Predicted Hidden States (Rainy): Rainy, Rainy, Dry, Dry

Probability of Rainy:
-1.559663074783069

--------------------------------- Model: Dry -----------------------------------

Predicted Hidden States (Dry): Dry, Dry, Rainy, Rainy

Probability of Dry:
-1.399539881893381

--------------------------------- Season Prediction -----------------------------------
Predicted Season is:
Dry Season!


## Evaluation Using the Hidden_markov API

Note that this API returns only the un-normalized evaluation probability.

In [5]:

# Evaluate an observation sequence using the Hidden_markov API

# Note that this API returns only the un-normalized evaluation probability

import numpy as np

from hidden_markov import hmm 


states = ["Rainy", "Dry"]
n_states = len(states)

observations = ['Umbrella', 'No Umbrella']
n_observations = len(observations)


obs_map = {0:'Umbrella', 1:'No Umbrella'}


# Use the following observation sequence for evaluation
obs = np.array([0, 0, 1, 1])

obs_list = []
for i in range(len(obs)):
    obs_list.append(obs_map[obs[i]])
    

print("\n")
print("%s : %30s" % ("State Code", "Observed State"))

for i in range (obs.shape[0]):
    print("%10d : %30s" % (obs[i], obs_map[obs[i]]))


print("\n--------------------------------- Model: Rainy -----------------------------------")



# Rainy
pi_rainy = np.array([0.5, 0.5])
pi_rainy = np.matrix(pi_rainy)


a_rainy = np.array([
  [model_train_rainy.transmat_[0, 0], model_train_rainy.transmat_[0, 1]],
  [model_train_rainy.transmat_[1, 0], model_train_rainy.transmat_[1, 1]]
])

a_rainy = np.matrix(a_rainy)

b_rainy = np.array([
  [model_train_rainy.emissionprob_[0, 0], model_train_rainy.emissionprob_[0, 1]],
  [model_train_rainy.emissionprob_[1, 0], model_train_rainy.emissionprob_[1, 1]]
])

b_rainy = np.matrix(b_rainy)


rainy_model = hmm(states,observations, pi_rainy, a_rainy, b_rainy)


probability_observation_sequence_rainy = rainy_model.forward_algo(obs_list)
print("\nProbability of Rainy:") 
print(probability_observation_sequence_rainy)




print("\n--------------------------------- Model: Dry -----------------------------------")


# Dry
pi_dry = np.array([0.5, 0.5])

pi_dry = np.matrix(pi_dry)

a_dry = np.array([
  [model_train_dry.transmat_[0, 0], model_train_dry.transmat_[0, 1]],
  [model_train_dry.transmat_[1, 0], model_train_dry.transmat_[1, 1]]
])
a_dry = np.matrix(a_dry)

b_dry = np.array([
  [model_train_dry.emissionprob_[0, 0], model_train_dry.emissionprob_[0, 1]],
  [model_train_dry.emissionprob_[1, 0], model_train_dry.emissionprob_[1, 1]]
])

b_dry = np.matrix(b_dry)


dry_model= hmm(states,observations, pi_dry, a_dry, b_dry)
                      
probability_observation_sequence_dry = dry_model.forward_algo(obs_list)
print("\nProbability of Dry:") 
print(probability_observation_sequence_dry)



print("\n--------------------------------- Season Prediction -----------------------------------")


print("Predicted Season is:")

if(probability_observation_sequence_rainy > probability_observation_sequence_dry):
    print("Rainy Season!")
else:
    print("Dry Season!")





State Code :                 Observed State
         0 :                       Umbrella
         0 :                       Umbrella
         1 :                    No Umbrella
         1 :                    No Umbrella

--------------------------------- Model: Rainy -----------------------------------

Probability of Rainy:
0.003865903636926622

--------------------------------- Model: Dry -----------------------------------

Probability of Dry:
0.0398608413079284

--------------------------------- Season Prediction -----------------------------------
Predicted Season is:
Dry Season!
