<a href="https://colab.research.google.com/github/ibsenvillarroel/hidden-markov-model/blob/main/hidden_markov_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data
For a Markov model, we are only interested in probability distributions that relate to states.

We can estimate these probabilities from large datasets, or we may already know their values.

States: Every Markov model has a finite set of states. These states could be something like "warm" and "cold", "high" and "low", or even "red", "green" and "blue". In a hidden Markov model these states are "hidden", which means we do not directly observe them.

Observations: Each state has a particular outcome or observation associated with it, based on a probability distribution. For example: on a hot day Tim has an 80% chance of being happy and a 20% chance of being sad.

Transitions: Each state has a probability defining the likelihood of transitioning to a different state. For example: a cold day has a 30% chance of being followed by a hot day and a 70% chance of being followed by another cold day.

To create a hidden Markov model we need:

1. States
2. Observation Distribution
3. Transition Distribution

For our purposes, we will assume this information is already available as we attempt to predict the weather on a given day.
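
As a small illustration of what "probability distribution" means here, the sketch below writes the two examples from the text (Tim's mood on a hot day, and the transitions out of a cold day) as plain Python dictionaries and checks that each one is a valid distribution. The helper name `is_distribution` and the dictionary names are hypothetical, chosen only for this example.

```python
import math

def is_distribution(probs, tol=1e-9):
    """Return True if all values are non-negative and they sum to 1."""
    values = list(probs.values())
    return all(p >= 0 for p in values) and math.isclose(sum(values), 1.0, abs_tol=tol)

# Observation distribution for the "hot" state (numbers from the Tim example).
observation_given_hot = {"happy": 0.8, "sad": 0.2}

# Transition distribution out of the "cold" state (numbers from the cold-day example).
transition_from_cold = {"hot": 0.3, "cold": 0.7}

print(is_distribution(observation_given_hot))
print(is_distribution(transition_from_cold))
```

Every state needs one such observation distribution and one such transition distribution; the text only gives one of each, so the others are left out here rather than guessed.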

# Imports and Setup

In [None]:
%tensorflow_version 2.x

In [None]:
!pip install tensorflow_probability==0.8.0rc0 --user --upgrade  # Due to a version mismatch between TensorFlow 2.x and tensorflow_probability, we install a pinned release of tensorflow_probability (0.8.0rc0).

Collecting tensorflow_probability==0.8.0rc0
  Downloading tensorflow_probability-0.8.0rc0-py2.py3-none-any.whl (2.5 MB)
Collecting cloudpickle==1.1.1
  Downloading cloudpickle-1.1.1-py2.py3-none-any.whl (17 kB)
Installing collected packages: cloudpickle, tensorflow-probability
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gym 0.17.3 requires cloudpickle<1.7.0,>=1.2.0, but you have cloudpickle 1.1.1 which is incompatible.
Successfully installed cloudpickle-1.1.1 tensorflow-probability-0.8.0rc0


# Weather Model
Taken directly from the TensorFlow documentation (https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/HiddenMarkovModel).

We will model a simple weather system and try to predict the temperature on each day given the following information.

1. Cold days are encoded by a 0 and hot days are encoded by a 1.
2. The first day in our sequence has an 80% chance of being cold.
3. A cold day has a 30% chance of being followed by a hot day.
4. A hot day has a 20% chance of being followed by a cold day.
5. On each day the temperature is normally distributed, with mean 0 and standard deviation 5 on a cold day, and mean 15 and standard deviation 10 on a hot day.
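
To make these rules concrete, here is a minimal plain-Python sketch that samples one possible week from the system described above. It uses only the standard `random` module rather than TensorFlow; the function name `sample_week` and the constant names are hypothetical, and the seed is arbitrary.

```python
import random

INITIAL = [0.8, 0.2]        # P(first day cold), P(first day hot)
TRANSITION = [[0.7, 0.3],   # from cold: stay cold, turn hot
              [0.2, 0.8]]   # from hot: turn cold, stay hot
MEANS = [0.0, 15.0]         # mean temperature on a cold / hot day
STDDEVS = [5.0, 10.0]       # standard deviation on a cold / hot day

def sample_week(num_days=7, seed=None):
    """Sample hidden states (0 = cold, 1 = hot) and observed temperatures."""
    rng = random.Random(seed)
    states, temps = [], []
    # Draw the first day's hidden state from the initial distribution.
    state = rng.choices([0, 1], weights=INITIAL)[0]
    for _ in range(num_days):
        states.append(state)
        # The observation depends only on the current hidden state.
        temps.append(rng.gauss(MEANS[state], STDDEVS[state]))
        # The next hidden state depends only on the current one.
        state = rng.choices([0, 1], weights=TRANSITION[state])[0]
    return states, temps

states, temps = sample_week(seed=0)
print(states)
print([round(t, 1) for t in temps])
```

In a real setting only the temperatures would be observed; the list of states is what stays "hidden".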

To model this in TensorFlow we will do the following.

In [None]:
import tensorflow_probability as tfp  
import tensorflow as tf

In [None]:
tfd = tfp.distributions  
initial_distribution = tfd.Categorical(probs=[0.8, 0.2])  # refer to point 2 above
transition_distribution = tfd.Categorical(probs=[[0.7, 0.3],
                                                 [0.2, 0.8]])  # refer to points 3 and 4 above
observation_distribution = tfd.Normal(loc=[0., 15.], scale=[5., 10.])  # refer to point 5 above; the trailing dots make the values floats

# the loc argument represents the mean and the scale is the standard deviation

We've now created distribution variables to model our system, and it's time to create the hidden Markov model.

In [None]:
model = tfd.HiddenMarkovModel(
    initial_distribution=initial_distribution,
    transition_distribution=transition_distribution,
    observation_distribution=observation_distribution,
    num_steps=7)  # here the number of steps is the number of days

The number of steps represents the number of days that we would like to predict information for. In this case we've chosen 7, an entire week.

To get the expected temperatures on each day we can do the following.

In [None]:
mean = model.mean()

# With eager execution, the default in TensorFlow 2.x, the value of this tensor
# can be read directly with .numpy(); no session is needed.
print(mean.numpy())

[2.9999998 5.9999995 7.4999995 8.25      8.625001  8.812501  8.90625  ]
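
These expected temperatures can be checked by hand. The sketch below, in plain Python and independent of TensorFlow, pushes the probability of being cold or hot forward one day at a time and takes the probability-weighted average of the two state means. The function name `expected_temperatures` is hypothetical; the numbers are the ones from the weather model above.

```python
def expected_temperatures(initial, transition, means, num_steps):
    """Expected observation on each step of a hidden Markov model."""
    probs = list(initial)  # probability of each hidden state on the current day
    result = []
    for _ in range(num_steps):
        # Expected temperature: weight each state's mean by its probability.
        result.append(sum(p * m for p, m in zip(probs, means)))
        # Advance one day: new_probs[j] = sum_i probs[i] * transition[i][j].
        probs = [sum(probs[i] * transition[i][j] for i in range(len(probs)))
                 for j in range(len(probs))]
    return result

week = expected_temperatures(
    initial=[0.8, 0.2],
    transition=[[0.7, 0.3], [0.2, 0.8]],
    means=[0.0, 15.0],
    num_steps=7)
print(week)
```

The result should agree with the output of `model.mean()` up to floating-point rounding. The values climb toward 9 because the long-run probability of a hot day is 0.3 / (0.3 + 0.2) = 0.6, and 0.6 × 15 = 9.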
