## MACHINE LEARNING ALGORITHMS PART 3.

### Hidden Markov Models (HM models)

I will be using information and examples based on the TensorFlow 2.0 course provided by freeCodeCamp and on the TensorFlow Probability documentation ([link](https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/HiddenMarkovModel)).

Import requirements:

In [1]:
import tensorflow_probability as tfp  # We are using a different module from tensorflow this time
import tensorflow as tf


HM models are very different from the models used in the previous notebooks. Previously I used algorithms that learn from a dataset, whereas HM models work with probability distributions that we supply. These models estimate the probabilities of future events based on past events.

Here we'll see how to create and use an HM model to predict the weather. In more detail, we want to predict the weather on any given day, given the probability of different events occurring.

### Types of Data 

*What types of data do we use when we work with HM models?* Let's see an example:

Let's say we have some specific information about an environment:
If it is sunny, we know (hypothetically) that there is an 80% chance that it is going to be sunny again the next day and a 20% chance that it is going to rain.
Maybe we also have some information about sunny days and cold days in general.
Also, let's say we have some information about the average temperature on these days.

Using this information we can create an HM model that will allow us to make predictions about the weather in the future, given the probabilities that we have.

A general definition of what HM models are is as follows:

*The Hidden Markov Model is a finite set of states, each of which is associated with a (generally multidimensional) probability distribution. Transitions among the states are governed by a set of probabilities called transition probabilities*. 

So, in an HM model we have **a set of states**. In the example above with the information about the weather on a given day, the states would be **hot day** and **cold day**. These states are *hidden* in the model: we never have direct access to them while we interact with the model.

In the model what we look at is something called **observations**. We have a particular outcome/observation at each state.
Let's see an example of an observation:
*On a hot day Tim has an 80% chance of being happy and a 20% chance of being sad*. Tim's mood is the observation, and these probabilities are its distribution for that state. At that state (hot day/sunny day) we can **observe** the probability of something happening during that state.

So, we don't care about the states in particular but we care about the outcome/observations that we get from that state. 
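To make the idea of an observation concrete, here is a minimal sketch in plain Python. The hot-day probabilities come from the example above; the cold-day row and the names `emission` and `observe` are assumptions made up for illustration, since the text gives no figures for cold days.

```python
import random

# Observation (emission) probabilities for each hidden state.
# The "hot" row is the example from the text; the "cold" row is an
# assumed placeholder, used for illustration only.
emission = {
    "hot": {"happy": 0.8, "sad": 0.2},
    "cold": {"happy": 0.4, "sad": 0.6},  # assumption, not from the source
}


def observe(state, rng):
    """Draw one observation (Tim's mood) for the given hidden state."""
    moods = list(emission[state])
    weights = [emission[state][mood] for mood in moods]
    return rng.choices(moods, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so the run is repeatable
hot_moods = [observe("hot", rng) for _ in range(10_000)]
happy_share = hot_moods.count("happy") / len(hot_moods)
print(f"share of happy observations on hot days: {happy_share:.2f}")
```

Over many hot days the share of "happy" observations should settle close to 0.8. An observer sees only the moods, never the `state` argument that produced them.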

#### Data that will be used here

In the previous models I used hundreds of datapoints/entries to train the models. Here we don't need any of that. What we need here are constant values for the transition distributions and the observation distributions.

**States**: In each Markov model we have a finite set of states. These states could be something like "warm" and "cold" or "high" and "low" or even "red", "green" and "blue". These states are "hidden" within the model, which means we do not directly observe them.

**Observations**: Each state has a particular outcome or observation associated with it based on a probability distribution. An example of this is the following: On a hot day Tim has an 80% chance of being happy and a 20% chance of being sad.

**Transitions**: Each state will have a probability defining the likelihood of transitioning to a different state. An example is the following: a cold day has a 30% chance of being followed by a hot day and a 70% chance of being followed by another cold day.

So, what that means is: from each state we can move to every other state, or to a defined subset of states, and each possible move has its own probability. The probabilities of all moves out of a given state (including staying in it) add up to 1.

So, to create a hidden Markov model we need:

* States
* Observation Distribution
* Transition Distribution
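As a rough sketch of how a transition distribution is used, the plain-Python snippet below stores one row of probabilities per current state and pushes a belief about today's state one day forward. The cold-day row is the example above; the hot-day row uses the 20% hot-to-cold figure from the weather model in this notebook. The helper names `check_rows` and `step` are hypothetical, not part of any library.

```python
import math

states = ["cold", "hot"]

# transition[current][next] = P(next state | current state)
transition = {
    "cold": {"cold": 0.7, "hot": 0.3},
    "hot": {"cold": 0.2, "hot": 0.8},
}


def check_rows(table):
    """Every row is a probability distribution, so it must sum to 1."""
    for state, row in table.items():
        total = sum(row.values())
        if not math.isclose(total, 1.0):
            raise ValueError(f"row {state!r} sums to {total}, expected 1.0")


def step(belief, table):
    """Turn a distribution over today's state into one over tomorrow's."""
    return {
        nxt: sum(belief[cur] * table[cur][nxt] for cur in belief)
        for nxt in states
    }


check_rows(transition)
today = {"cold": 1.0, "hot": 0.0}  # suppose we know today is cold
tomorrow = step(today, transition)
day_after = step(tomorrow, transition)
print(tomorrow)
print(day_after)
```

Starting from a certain cold day, `tomorrow` is simply the cold row of the table, and `day_after` mixes both rows according to how likely each state is tomorrow.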


### The Weather model

Taken directly from the TensorFlow documentation [link](https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/HiddenMarkovModel).

We will model a simple weather system and try to predict the temperature on each day given the following information:
* Cold days are encoded by a 0 and hot days are encoded by a 1
* The first day in our sequence has an 80% chance of being cold
* A cold day has a 30% chance of being followed by a hot day
* A hot day has a 20% chance of being followed by a cold day
* On each day the temperature is normally distributed, with mean 0 and standard deviation 5 on a cold day and mean 15 and standard deviation 10 on a hot day
* So on a hot day the average temperature is 15, and temperatures within one standard deviation of it range from 5 to 25



In [7]:
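# Let's model the information above. Index 0 = cold, index 1 = hot.
tfd = tfp.distributions
# NOTE: as written, probs=[0.2, 0.8] gives the first day a 20% chance of being cold;
# the bullet list's 80% chance of cold would be probs=[0.8, 0.2].
initial_distribution = tfd.Categorical(probs=[0.2, 0.8])  # point 2
# Row i holds P(next state | current state i). As written, the cold row is 50/50;
# the bullet list's 30% cold-to-hot chance would be [[0.7, 0.3], [0.2, 0.8]].
transition_distribution = tfd.Categorical(probs=[[0.5, 0.5], [0.2, 0.8]])  # points 3 and 4
observation_distribution = tfd.Normal(loc=[0., 15.], scale=[5., 10.])  # point 5

# loc is the mean and scale is the standard deviation; the outputs shown in this notebook use the values as written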

We've now created the distribution variables that describe our system, so it's time to create the hidden Markov model.
The arguments/parameters that the model needs are:
* initial distribution
* transition distribution
* observation distribution
* number of steps

#### What is the number of steps?

The number of steps is *how many days we want the model to predict* the expected weather/temperature for.

It is the number of times we are going to **step** through this model and run the cycle.
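One way to picture a "step" is a hand-rolled sampler that walks through the days one at a time. The sketch below uses only the Python standard library and the probabilities from the bullet list of the weather model; the function name `sample_week` is hypothetical, and this is an illustration of the idea rather than how TensorFlow Probability implements it.

```python
import random

# Index 0 = cold, index 1 = hot (values from the bullet list).
initial = [0.8, 0.2]                     # P(cold), P(hot) on the first day
transition = [[0.7, 0.3], [0.2, 0.8]]    # row = today's state, column = tomorrow's
means = [0.0, 15.0]                      # mean temperature per state
stddevs = [5.0, 10.0]                    # standard deviation per state


def sample_week(num_steps, rng):
    """Sample hidden states and temperatures for num_steps consecutive days."""
    states, temps = [], []
    state = rng.choices([0, 1], weights=initial, k=1)[0]
    for _ in range(num_steps):
        states.append(state)
        # Observation: temperature drawn from the current state's normal distribution.
        temps.append(rng.gauss(means[state], stddevs[state]))
        # Transition: pick tomorrow's state from the current state's row.
        state = rng.choices([0, 1], weights=transition[state], k=1)[0]
    return states, temps


rng = random.Random(42)  # seeded so the run is repeatable
states, temps = sample_week(7, rng)
print(states)
print([round(t, 1) for t in temps])
```

Each pass through the loop is one step: emit an observation for the current day, then move to the next day's state. With `num_steps=7` we get one week of temperatures.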

In [8]:
model = tfd.HiddenMarkovModel(
    initial_distribution=initial_distribution,
    transition_distribution=transition_distribution,
    observation_distribution=observation_distribution,
    num_steps=7)

In [9]:
print(model)

tfp.distributions.HiddenMarkovModel("HiddenMarkovModel", batch_shape=[], event_shape=[7], dtype=float32)


To get the **expected temperatures** on each day we can do the following:

In [10]:
mean = model.mean()  # tensor of shape [7]: the expected temperature for each day

# TensorFlow 2.x executes eagerly by default, so the tensor already holds its values
# and .numpy() returns them directly; wrapping the call in tf.compat.v1.Session() is not needed.
print(mean.numpy())

[12.       11.1      10.83     10.748999 10.724699 10.71741  10.715222]
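These numbers can be checked by hand. The sketch below, in plain Python, uses the probabilities exactly as they are coded in the cell that builds the distributions (`[0.2, 0.8]` and `[[0.5, 0.5], [0.2, 0.8]]`): it keeps a distribution over the hidden state, takes the probability-weighted average of the state means for each day, and then advances the distribution by one transition. The function name `expected_temperatures` is hypothetical.

```python
# Values as coded in the notebook cell; index 0 = cold, index 1 = hot.
initial = [0.2, 0.8]
transition = [[0.5, 0.5], [0.2, 0.8]]  # row = today's state, column = tomorrow's
state_means = [0.0, 15.0]              # mean temperature per state


def expected_temperatures(num_steps):
    """Expected temperature per day, computed without TensorFlow."""
    belief = list(initial)  # distribution over the hidden state on day 1
    result = []
    for _ in range(num_steps):
        # Expected temperature = sum over states of P(state) * mean(state).
        result.append(sum(p * m for p, m in zip(belief, state_means)))
        # Advance the state distribution by one day.
        belief = [
            sum(belief[i] * transition[i][j] for i in range(2))
            for j in range(2)
        ]
    return result


hand_means = expected_temperatures(7)
print([round(m, 4) for m in hand_means])
```

The first value is 0.2 * 0 + 0.8 * 15 = 12, and the later values drift toward the long-run average implied by the transition table, which matches the pattern in the output above.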
