<a href="https://colab.research.google.com/github/mrabhi05/Tensorflow/blob/main/Clustering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Clustering

Clustering is a Machine Learning technique that involves the grouping of data points. In theory, data points that are in the same group should have similar properties and/or features, while data points in different groups should have highly dissimilar properties and/or features.

### Basic Algorithm for K-Means



*   Step 1: Randomly pick K points to place K centroids.
*   Step 2: Assign all of the data points to the centroids by distance. The closest centroid to a point is the one it is assigned to.
*   Step 3: Average all of the points belonging to each centroid to find the middle of those clusters (center of mass). Move each centroid to that position.
*   Step 4: Reassign every point once again to the closest centroid.
*   Step 5: Repeat Steps 3-4 until no point changes which centroid it belongs to.
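
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: the function name `kmeans`, the `max_iters` and `seed` parameters, and the sample points are all assumptions made for the example, and Step 1 is interpreted here as picking K of the existing data points as starting centroids.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster points into k groups following Steps 1-5 above."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the starting centroids
    # (an assumption; any k random positions would also do).
    centroids = rng.sample(points, k)
    assignments = None

    for _ in range(max_iters):
        # Steps 2 and 4: assign every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Step 3: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))

    return centroids, assignments


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(centroids)
print(assignments)
```

Which cluster gets index 0 depends on the random starting centroids, so only the grouping itself is meaningful, not the label numbers.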



## Hidden Markov Models 

"The Hidden Markov Model is a finite set of states, each of which is associated with a (generally multidimensional) probability distribution. Transitions among the states are governed by a set of probabilities called Transition Probabilities"

A hidden Markov model uses probabilities to predict future events or states.

Here, we will create a hidden Markov model that can predict the weather.

### Data

**States:** In each Markov model we have a finite set of states. These states could be something like "warm" and "cold" or "high" and "low" or even "red", "green" and "blue". These states are "hidden" within the model, which means we do not directly observe them.

**Observation:** Each state has a particular outcome or observation associated with it based on a probability distribution. An example of this is the following: on a hot day Tim has an 80% chance of being happy and a 20% chance of being sad.

**Transition:** Each state has a probability defining the likelihood of transitioning to a different state. An example is the following: a cold day has a 30% chance of being followed by a hot day and a 70% chance of being followed by another cold day.

To create a hidden Markov model we need:

*   States
*   Observation Distribution
*   Transition Distribution

For our purpose we will assume we already have this information as we attempt to predict the weather on a given day.
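
To make the three ingredients concrete, here is a small plain-Python sketch that stores them as dictionaries and simulates a week. The cold-day transition row and the hot-day mood probabilities come from the examples above; the hot-day transition row, the cold-day mood probabilities, and the helper names `draw` and `simulate` are assumptions made purely for illustration.

```python
import random

# The three ingredients of a hidden Markov model as plain Python data.
states = ["cold", "hot"]

# Transition distribution: P(tomorrow's state | today's state).
transition = {
    "cold": {"cold": 0.7, "hot": 0.3},  # from the example above
    "hot": {"cold": 0.2, "hot": 0.8},   # assumed for illustration
}

# Observation distribution: P(observed mood | today's state).
observation = {
    "hot": {"happy": 0.8, "sad": 0.2},   # from the example above
    "cold": {"happy": 0.4, "sad": 0.6},  # assumed for illustration
}


def draw(distribution, rng):
    """Sample one key from an {outcome: probability} dictionary."""
    outcomes = list(distribution)
    weights = list(distribution.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def simulate(num_days, first_state, rng):
    """Return (hidden_states, observed_moods) for num_days days."""
    hidden, observed = [], []
    state = first_state
    for _ in range(num_days):
        hidden.append(state)
        observed.append(draw(observation[state], rng))
        state = draw(transition[state], rng)
    return hidden, observed


hidden, observed = simulate(num_days=7, first_state="cold", rng=random.Random(0))
print(hidden)    # the weather: hidden from an observer
print(observed)  # the mood: what the observer actually sees
```

An observer only ever sees the second list; the point of a hidden Markov model is to reason about the first list from the second.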




### Imports and Setup

In [1]:
# This line is only needed in a Colab notebook; omit it elsewhere.
%tensorflow_version 2.x

TensorFlow 2.x selected.


In [2]:
import tensorflow_probability as tfp  # TensorFlow Probability is a separate package from core TensorFlow
import tensorflow as tf

### Weather Model

We will model a simple weather system and try to predict the temperature on each day given the following information.

1. Cold days are encoded by a 0 and hot days are encoded by a 1.
2. The first day in our sequence has an 80% chance of being cold.
3. A cold day has a 30% chance of being followed by a hot day.
4. A hot day has a 20% chance of being followed by a cold day.
5. On each day the temperature is normally distributed, with mean 0 and standard deviation 5 on a cold day and mean 15 and standard deviation 10 on a hot day.

In [4]:
tfd = tfp.distributions

initial_distribution = tfd.Categorical(probs=[0.8, 0.2])  # Refer to Point 2 above
transition_distribution = tfd.Categorical(probs=[[0.7, 0.3], [0.2, 0.8]])  # Refer to Points 3 and 4 above
observation_distribution = tfd.Normal(loc=[0., 15.], scale=[5., 10.])  # Refer to Point 5 above

# The loc argument is the mean and scale is the standard deviation.



We've created distribution variables to model our system, and it's time to create the hidden Markov model.

In [6]:
model = tfd.HiddenMarkovModel(
    initial_distribution=initial_distribution,
    transition_distribution=transition_distribution,
    observation_distribution=observation_distribution,
    num_steps=7  
)

The number of steps represents the number of days that we would like to predict information for. In this case we've chosen 7, an entire week.

To get the expected temperatures on each day we can do the following.

In [7]:
mean = model.mean()

# TensorFlow 2.x executes eagerly, so no session is needed:
# calling .numpy() on the tensor returns its value directly.
print(mean.numpy())

[2.9999998 5.9999995 7.4999995 8.25      8.625001  8.812501  8.90625  ]
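
These numbers can be checked by hand. The expected temperature on a given day is the probability of each state on that day multiplied by that state's mean, and the state probabilities for the next day come from multiplying the current ones by the transition matrix. The sketch below does this in plain Python, independently of TensorFlow Probability; the function name `expected_temperatures` is an assumption made for the example.

```python
initial = [0.8, 0.2]              # P(cold), P(hot) on the first day
transition = [[0.7, 0.3],         # cold -> cold, cold -> hot
              [0.2, 0.8]]         # hot  -> cold, hot  -> hot
state_means = [0.0, 15.0]         # mean temperature on cold and hot days


def expected_temperatures(initial, transition, state_means, num_steps):
    """Propagate the state distribution and average the state means."""
    probs = list(initial)
    expected = []
    for _ in range(num_steps):
        expected.append(sum(p * m for p, m in zip(probs, state_means)))
        # Next day's distribution: probs (row vector) times transition matrix.
        probs = [
            sum(probs[i] * transition[i][j] for i in range(len(probs)))
            for j in range(len(probs))
        ]
    return expected


temps = expected_temperatures(initial, transition, state_means, num_steps=7)
print(temps)
```

Up to floating-point rounding, the result should agree with the output of `model.mean()` above. The first day, for example, is 0.8 × 0 + 0.2 × 15 = 3, and the values climb toward 9 because the chain settles at a 60% chance of a hot day.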
