# Markov Chain

In [2]:
import pandas as pd
import numpy as np

![](weather.png)

In [3]:
data = ['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']

### 1. Convert the data to a DataFrame
Store the observed sequence in a DataFrame with a single column, `weather`.

In [11]:
df = pd.DataFrame(data, columns = ["weather"])

### 2. Identify transitions
Shift the `weather` column by one row so that each row holds both:

* $Y_t$ – the current state (column `weather`)
* $Y_{t-1}$ – the state on the previous day (column `day_before`)

In [12]:
df['day_before'] = df['weather'].shift(1)
df

| | weather | day_before |
|---|---|---|
| 0 | cold | NaN |
| 1 | cold | cold |
| 2 | hot | cold |
| 3 | cold | hot |
| 4 | cold | cold |
| 5 | hot | cold |
| 6 | cold | hot |
| 7 | hot | cold |
| 8 | hot | hot |


### 3. Count transitions
Count how many times each possible transition occurs. The nine observations contain eight transitions, because the first day has no predecessor.

In [22]:
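# Helper column of ones, so that every row counts as one observation
df['one'] = 1
# Count the rows for each (day_before, weather) pair; the first row has no day_before and is dropped
df.groupby(['day_before', 'weather']).one.count()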

In [23]:
ct = pd.DataFrame(df.groupby(['day_before', 'weather']).one.count().unstack())
ct

| day_before (rows) / weather (columns) | cold | hot |
|---|---|---|
| cold | 2 | 3 |
| hot | 2 | 1 |


### 4. Calculate a transition matrix
The transition matrix $P$ has elements $p_{ij}$, where the row $i$ is the previous state and the column $j$ is the current state:

$$
p_{ij} = P(Y_t = y_j \mid Y_{t-1} = y_i)
$$

For example, $p_{0,1} = p_{\text{cold},\text{hot}}$ is the probability of a hot day given that the day before was cold. Each row of $P$ sums to 1.
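As a worked instance of this definition, and assuming each entry is estimated by dividing a count from step 3 by its row total, the first row (previous day cold, with 2 cold and 3 hot successors) gives:

$$
p_{\text{cold},\text{cold}} = \frac{2}{2 + 3} = 0.4, \qquad p_{\text{cold},\text{hot}} = \frac{3}{2 + 3} = 0.6
$$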

In [37]:
# Divide each row of the count table by its row total, so that every row sums to 1
P = ct.div(ct.sum(axis=1), axis=0)
P

| day_before (rows) / weather (columns) | cold | hot |
|---|---|---|
| cold | 0.4 | 0.6 |
| hot | 0.666667 | 0.333333 |
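As a cross-check, the same matrix can probably be obtained in one call with `pd.crosstab` and `normalize="index"`. This is an alternative sketch, not the approach used in the cells above; the name `P_check` is made up for illustration, and the data is rebuilt here so that the block runs on its own.

```python
import pandas as pd

data = ['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']
df = pd.DataFrame(data, columns=["weather"])
df["day_before"] = df["weather"].shift(1)

# Cross-tabulate previous state (rows) against current state (columns).
# normalize="index" divides each row by its row total.
# The first row has no day_before and is left out of the table.
P_check = pd.crosstab(df["day_before"], df["weather"], normalize="index")
print(P_check)
```
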


### 5. Calculate probabilities for the next day

In [47]:
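# State vector in the order [cold, hot]: today is hot with certainty
initial_state = np.array([0, 1])
# Row vector times P gives the distribution over tomorrow's states
np.dot(initial_state, P)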

array([0.66666667, 0.33333333])

### 6. Calculate probabilities two days ahead
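One possible approach is to apply the transition matrix twice. The sketch below assumes the matrix from step 4, written out by hand as a NumPy array in the order [cold, hot], and the same "today is hot" starting vector as in step 5. The name `two_days` is introduced here for illustration.

```python
import numpy as np

# Transition matrix from step 4 (rows: day before, columns: current day; order: cold, hot)
P = np.array([[0.4, 0.6],
              [2 / 3, 1 / 3]])

# Today is hot with certainty
initial_state = np.array([0, 1])

# One multiplication by P moves the distribution one day ahead, so two give two days ahead
two_days = initial_state @ P @ P
print(two_days)  # approximately [0.489, 0.511]
```
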

### 7. Calculate the probabilities many days ahead
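For many days, repeated multiplication can be replaced by a matrix power. This is a minimal sketch under the same assumptions as before: the matrix from step 4 is typed in by hand in the order [cold, hot], and the horizon `n_days = 50` is an arbitrary choice for illustration. For this particular matrix the result no longer depends noticeably on the starting state.

```python
import numpy as np

# Transition matrix from step 4 (rows: day before, columns: current day; order: cold, hot)
P = np.array([[0.4, 0.6],
              [2 / 3, 1 / 3]])

n_days = 50  # arbitrary horizon, chosen only for illustration

# P raised to the n-th power maps today's distribution to the one n days ahead
P_n = np.linalg.matrix_power(P, n_days)

from_hot = np.array([0, 1]) @ P_n   # start from a hot day
from_cold = np.array([1, 0]) @ P_n  # start from a cold day

print(from_hot)   # approximately [0.526, 0.474]
print(from_cold)  # approximately the same values
```

Both starting vectors end up at roughly the same distribution, which suggests that the weather far ahead depends on the transition matrix rather than on today's state.
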