---
# Markov Chain I: Weather Forecast

In [3]:
data = ['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']
data

['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']

## How can we calculate the transition matrix?

The transition matrix $P$ has elements $p_{ij}$, where row $i$ is the state at time $t-1$ and column $j$ is the state at time $t$, such that:

$$
p_{ij} = P(Y_t = y_j | Y_{t-1} = y_i)
$$

For example, $p_{0,1} = p_{cold, hot}$ is the probability of a hot day given that the day before was cold.
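As a minimal sketch of that definition, the estimate for $p_{cold, hot}$ can be computed by hand with plain Python before using pandas. The variable names (`pairs`, `pair_counts`, `from_counts`) are illustrative and not part of the original notebook.

```python
from collections import Counter

data = ['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']

# Pair each day with the following day: (state at t-1, state at t)
pairs = list(zip(data[:-1], data[1:]))
pair_counts = Counter(pairs)

# How often each state appears as the "from" state of a transition
from_counts = Counter(prev for prev, _ in pairs)

# p_ij = (number of transitions i -> j) / (number of transitions leaving i)
p_cold_hot = pair_counts[('cold', 'hot')] / from_counts['cold']
print(p_cold_hot)
```

The last observation has no successor, so it contributes no transition; this is the same row that ends up as `NaN` after `.shift(-1)`.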

#### General Properties of the Markov Chain

- the current state depends only on its direct predecessor (Markov property)
- finite state space: `states = ['cold', 'hot'] = [0, 1]`
- no hidden states: all states are known and observable
- discrete time: time is measured in discrete steps
- time-homogeneous: transition probabilities do not change with time

### Create a dataframe with the list of states in time

### 1. Define STATES

In [4]:
data

['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']

#### Time

In [6]:
list(range(len(data)))

[0, 1, 2, 3, 4, 5, 6, 7, 8]

In [7]:
import pandas as pd
import numpy as np

data_dict = {
    'time': list(range(len(data))), # a list from 0 to the length of data array in steps of 1
    'weather_t': data
}

states = pd.DataFrame(data_dict)
states

| | time | weather_t |
|---|---|---|
| 0 | 0 | cold |
| 1 | 1 | cold |
| 2 | 2 | hot |
| 3 | 3 | cold |
| 4 | 4 | cold |
| 5 | 5 | hot |
| 6 | 6 | cold |
| 7 | 7 | hot |
| 8 | 8 | hot |


In [8]:
states['weather_t'].shift(-1)

0    cold
1     hot
2    cold
3    cold
4     hot
5    cold
6     hot
7     hot
8     NaN
Name: weather_t, dtype: object

### Add a column with the weather tomorrow


### 2. Define TRANSITIONS

In [13]:
states['weather_t+1'] = states['weather_t'].shift(-1)

In [14]:
states

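| | time | weather_t | weather_t+1 |
|---|---|---|---|
| 0 | 0 | cold | cold |
| 1 | 1 | cold | hot |
| 2 | 2 | hot | cold |
| 3 | 3 | cold | cold |
| 4 | 4 | cold | hot |
| 5 | 5 | hot | cold |
| 6 | 6 | cold | hot |
| 7 | 7 | hot | hot |
| 8 | 8 | hot | NaN |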


### Calculate the frequencies between the weather today and tomorrow (crosstab)


### 3. Get TRANSITION PROBABILITIES

In [15]:
pd.crosstab(states['weather_t'], states['weather_t+1'])

| weather_t \ weather_t+1 | cold | hot |
|---|---|---|
| cold | 2 | 3 |
| hot | 2 | 1 |


In [42]:
P = pd.crosstab(states['weather_t'],states['weather_t+1'], normalize = 0) 
P

| weather_t \ weather_t+1 | cold | hot |
|---|---|---|
| cold | 0.4 | 0.6 |
| hot | 0.666667 | 0.333333 |


In [None]:
# normalize=0 (or 'index'): divide each cell by its row sum, so every row sums to 1
# normalize=1 (or 'columns'): divide each cell by its column sum, so every column sums to 1
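To make the row normalization concrete, here is a small sketch that divides the raw counts by their row sums and compares the result with `normalize=0`. The names `today`, `tomorrow`, `manual` and `builtin` are introduced here for illustration only.

```python
import pandas as pd

data = ['cold', 'cold', 'hot', 'cold', 'cold', 'hot', 'cold', 'hot', 'hot']

# Today's weather and the weather one step later, without the trailing NaN
today = pd.Series(data[:-1], name='weather_t')
tomorrow = pd.Series(data[1:], name='weather_t+1')

# Raw transition counts
counts = pd.crosstab(today, tomorrow)

# Divide every row by its own sum (axis=0 aligns the row sums with the index)
manual = counts.div(counts.sum(axis=1), axis=0)

# The same result using the built-in option
builtin = pd.crosstab(today, tomorrow, normalize=0)

print(manual)
```

Row normalization is the one needed for a transition matrix, because each row describes where the chain goes next from a given state.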

### 4. Predict next step with `choices`

### Predict: If today is hot, what's the weather tomorrow? 
(Choose randomly from the states with given transition probabilities) 

In [17]:
weather_states = ['cold', 'hot']

In [18]:
from random import choices

In [29]:
# Example of choices "Choose randomly from the states with given transition probabilities"
choices(['happy', 'sad'], weights = [0.9, 0.1])

['sad']

In [43]:
# If today is hot, take the second row ('hot') with its probabilities
P[P.index == 'hot']

| weather_t \ weather_t+1 | cold | hot |
|---|---|---|
| hot | 0.666667 | 0.333333 |


In [44]:
# If today is cold, take the first row ('cold') with its probabilities
P[P.index == 'cold']

| weather_t \ weather_t+1 | cold | hot |
|---|---|---|
| cold | 0.4 | 0.6 |


In [45]:
# .values extracts the raw numbers: DataFrame --> numpy array
P.values

array([[0.4       , 0.6       ],
       [0.66666667, 0.33333333]])

In [46]:
P[P.index == 'hot'].values

array([[0.66666667, 0.33333333]])

### "If today is hot, what will the weather be tomorrow"

In [53]:
choices(weather_states, weights = P[P.index == 'hot'].values[0])

['hot']
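Repeating the single-step draw gives a simulated forecast for several days. The following is a minimal sketch under the assumption that the estimated probabilities are stored in a plain dictionary; the function `simulate` and its `seed` parameter are hypothetical helpers, not part of the notebook.

```python
import random

weather_states = ['cold', 'hot']

# Transition probabilities estimated above; each list is ordered like weather_states
P_rows = {'cold': [0.4, 0.6], 'hot': [2 / 3, 1 / 3]}

def simulate(start, n_days, seed=None):
    """Draw a weather sequence of n_days steps, starting from `start`."""
    rng = random.Random(seed)  # local generator so results can be reproduced
    path = [start]
    for _ in range(n_days):
        current = path[-1]
        # The next state depends only on the current one (Markov property)
        nxt = rng.choices(weather_states, weights=P_rows[current])[0]
        path.append(nxt)
    return path

forecast = simulate('hot', n_days=7, seed=42)
print(forecast)
```

The order of the weights has to match the order of `weather_states`; the same caveat applies to the notebook cell above, where the columns of `P` happen to be in the same order as the list.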

---
# Markov Chain II: Customer churn in Netflix

- Customer can pay for different plans: Basic, Standard, Premium
- Customer can also become inactive
- Once a customer is `churned` it stays `churned` (absorbing state)

In [54]:
states = ['churned', 'inactive', 'basic', 'standard', 'premium']

# a two dimensional matrix with 5 rows and 5 columns
P = np.array([
    [1, 0, 0, 0, 0],  # each row has to sum to one
    [0.1, 0.8, 0.1, 0, 0],
    [0, 0.05, 0.8, 0.1, 0.05],
    [0, 0, 0.05, 0.9, 0.05],
    [0, 0, 0, 0.01, 0.99],
])

P = pd.DataFrame(P, index=states, columns=states)
P

| | churned | inactive | basic | standard | premium |
|---|---|---|---|---|---|
| churned | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| inactive | 0.1 | 0.8 | 0.1 | 0.0 | 0.0 |
| basic | 0.0 | 0.05 | 0.8 | 0.1 | 0.05 |
| standard | 0.0 | 0.0 | 0.05 | 0.9 | 0.05 |
| premium | 0.0 | 0.0 | 0.0 | 0.01 | 0.99 |


In [None]:
# each row must sum over its columns (hence axis=1) to approximately 1
assert np.allclose(P.sum(axis=1), 1)

## Customer flow

Let's assume we have 150 customers on `basic`, 100 customers on `standard` and 50 customers on `premium`.
How does the distribution change on average after one (two, three, ...) discrete time steps (here, time step = year)?

In [55]:
# how many customers do we have in each state
initial = np.array([0, 0, 150, 100, 50])
initial

array([  0,   0, 150, 100,  50])

In [56]:
# What's the probability of each state transitioning into another
P

| | churned | inactive | basic | standard | premium |
|---|---|---|---|---|---|
| churned | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| inactive | 0.1 | 0.8 | 0.1 | 0.0 | 0.0 |
| basic | 0.0 | 0.05 | 0.8 | 0.1 | 0.05 |
| standard | 0.0 | 0.0 | 0.05 | 0.9 | 0.05 |
| premium | 0.0 | 0.0 | 0.0 | 0.01 | 0.99 |


### State propagation over a population
If you start with a state distribution $S$ for a population at time $t$, you can calculate the distribution of the next step with:

$S_{t+1} = S_{t} \cdot P$

also written as a function of the initial state $S_0$:

$S_{t} = S_{0} \cdot P^{t}$

In [57]:
## what's the distribution next year (aka next time step)
initial.dot(P).round()

array([  0.,   8., 125., 106.,  62.])

In [58]:
## what's the distribution in 2 years 
initial.dot(P).dot(P).round()

array([  1.,  12., 106., 108.,  73.])

In [59]:
## what's the distribution in 3 years 
initial.dot(P).dot(P).dot(P).round()

array([  2.,  15.,  91., 109.,  83.])

#### What is the distribution after 10 years?

In [60]:
initial = np.array([0, 0, 150, 100, 50])
for i in range(10):
    initial = initial.dot(P)
    print(initial.round())

[  0.   8. 125. 106.  62.]
[  1.  12. 106. 108.  73.]
[  2.  15.  91. 109.  83.]
[  3.  17.  80. 108.  92.]
[  5.  17.  71. 106. 101.]
[  7.  17.  64. 103. 108.]
[  9.  17.  58. 101. 116.]
[ 10.  17.  53.  97. 122.]
[ 12.  16.  49.  94. 129.]
[ 14.  15.  46.  91. 135.]
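The repeated multiplication in the loop is equivalent to multiplying the initial distribution by the $t$-th power of the transition matrix. A short sketch using `np.linalg.matrix_power`; the helper name `distribution_after` is an assumption made for this example.

```python
import numpy as np

# Transition matrix from above (rows/columns: churned, inactive, basic, standard, premium)
P = np.array([
    [1, 0, 0, 0, 0],
    [0.1, 0.8, 0.1, 0, 0],
    [0, 0.05, 0.8, 0.1, 0.05],
    [0, 0, 0.05, 0.9, 0.05],
    [0, 0, 0, 0.01, 0.99],
])
initial = np.array([0, 0, 150, 100, 50])

def distribution_after(s0, P, t):
    """Expected distribution after t steps: s0 . P^t"""
    return s0 @ np.linalg.matrix_power(P, t)

after_10 = distribution_after(initial, P, 10)
print(after_10.round())
```

This avoids overwriting `initial` inside a loop, so the starting distribution does not have to be redefined before every experiment.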


#### What is the steady state distribution?
A steady state or equilibrium is reached if:

$S \cdot P = S$ 

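A regular Markov chain converges to it for $t \to \infty$. The churn chain is not regular, because `churned` is absorbing, so in the long run every customer ends up in `churned`.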

In [61]:
initial = np.array([0, 0, 150, 100, 50])
for i in range(10000):
    initial = initial.dot(P)
    #print(initial.round())

In [62]:
initial.round()

array([300.,   0.,   0.,   0.,   0.])
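For comparison, a regular chain such as the weather example has a non-trivial steady state. One way to find it without iterating (a sketch, not the only method) is to take the left eigenvector of $P$ for eigenvalue 1 and normalize it to sum to one. The matrix below uses the transition probabilities estimated in the first part.

```python
import numpy as np

# Weather transition matrix from part I (rows/columns: cold, hot)
P_weather = np.array([
    [0.4, 0.6],
    [2 / 3, 1 / 3],
])

# A steady state satisfies S . P = S, i.e. S is a left eigenvector of P
# with eigenvalue 1. Left eigenvectors of P are right eigenvectors of P.T.
eigenvalues, eigenvectors = np.linalg.eig(P_weather.T)

# Pick the eigenvector whose eigenvalue is closest to 1
idx = np.argmin(np.abs(eigenvalues - 1))
steady = np.real(eigenvectors[:, idx])

# Rescale so the entries form a probability distribution
steady = steady / steady.sum()
print(steady)
```

Solving $S \cdot P = S$ by hand gives $S = (10/19, 9/19)$, so in the long run slightly more than half of the days are cold under this model.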

---

### So, what do we need to simulate the client movement as a Markov Chain process?
- `states`: dairy, fruits, ...., checkout (absorbing state)
    - Possibility to model an entrance (<- not as a state in the data) with the information of the "first location"
- `transitions` <- it's ok if it takes you the rest of the day
    - You get the transitions with `.shift()` BUT IMPORTANT **you have to separate by client id**:
        - by grouping, or
        - sorting (by time and client id)
- `transition probabilities`
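The three ingredients above can be sketched on toy data. The column names `timestamp`, `customer_no` and `location`, as well as the rows themselves, are hypothetical and only stand in for the real supermarket data.

```python
import pandas as pd

# Hypothetical toy data: two customers whose visits are interleaved in time
visits = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2019-09-02 07:03', '2019-09-02 07:04', '2019-09-02 07:05',
        '2019-09-02 07:06', '2019-09-02 07:08',
    ]),
    'customer_no': [1, 2, 1, 2, 1],
    'location': ['dairy', 'fruit', 'fruit', 'checkout', 'checkout'],
})

# Sort so that each customer's visits are in chronological order
visits = visits.sort_values(['customer_no', 'timestamp'])

# Shift within each customer, so a transition never crosses customers
visits['next_location'] = visits.groupby('customer_no')['location'].shift(-1)

# The last visit of every customer has no successor and is dropped
transitions = visits.dropna(subset=['next_location'])

# Row-normalized counts give the transition probabilities
P_market = pd.crosstab(
    transitions['location'], transitions['next_location'], normalize=0
)
print(P_market)
```

Without the `groupby`, the `checkout` of one customer would be paired with the first location of the next customer, which creates transitions that never happened.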

#### Q: how to handle time? (in the future steps of the simulation)
- **Option 1**: 
    - Simulate minute by minute. For that resample your data to minutes with `.resample()`
    - This will give you transitions from a state to the same state (e.g. dairy -> dairy) with probability > 0.
- **Option 2**: 
    - Calculate the average time clients spend in one section and assign it to move through time
- **Option 3**: 
    - Simplify and ignore concrete time spans and just assign random values
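Option 1 could look roughly like the sketch below: each customer's visits are resampled to one row per minute and forward-filled, so that staying in a section shows up as a self-transition. The data and column names (`timestamp`, `customer_no`, `location`) are again hypothetical.

```python
import pandas as pd

# Hypothetical toy data: one row each time a customer enters a section
visits = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2019-09-02 07:03', '2019-09-02 07:04', '2019-09-02 07:05',
        '2019-09-02 07:06', '2019-09-02 07:08',
    ]),
    'customer_no': [1, 2, 1, 2, 1],
    'location': ['dairy', 'fruit', 'fruit', 'checkout', 'checkout'],
})

# Resample every customer separately to 1-minute steps; ffill() repeats the
# last known location for the minutes in which nothing was recorded
per_minute = pd.concat({
    customer: group.set_index('timestamp')['location'].resample('1min').ffill()
    for customer, group in visits.groupby('customer_no')
})

print(per_minute)
```

The result is indexed by customer and minute. Customer 1 now has two consecutive `dairy` rows, which later becomes a `dairy -> dairy` transition once the per-customer shift is applied.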

## Supermarket project goals

1. **Data exploration** <- it's ok to make this your goal and do it thoroughly, as something that could actually be a business report
2. **MC simulation** <- core of the week, next main goal
    - What do we need?
        - `states`: check. dairy, fruits, etc
        - get `transitions`: e.g. using shift with some kind of grouping by customer <- **!!!**
3. Build a fancier simulation using classes (OOP, object-oriented programming)
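As a starting point for goal 3, a customer can be modelled as an object that remembers its current state and draws the next one. This is only a sketch: the class name `Customer`, its attributes, and the transition probabilities are all made up for illustration and would be replaced by the values estimated from the data.

```python
import random

class Customer:
    """A single customer moving between supermarket sections."""

    def __init__(self, name, state, transition_probs, rng=None):
        self.name = name
        self.state = state
        # Mapping: current state -> {next state: probability}
        self.transition_probs = transition_probs
        self.rng = rng or random.Random()

    def next_state(self):
        """Move to the next state using a weighted random choice."""
        options = self.transition_probs[self.state]
        self.state = self.rng.choices(
            list(options), weights=list(options.values())
        )[0]
        return self.state

    @property
    def is_active(self):
        """A customer is active until reaching the absorbing checkout."""
        return self.state != 'checkout'

# Hypothetical probabilities, NOT estimated from real data
probs = {
    'dairy': {'dairy': 0.5, 'fruit': 0.3, 'checkout': 0.2},
    'fruit': {'dairy': 0.2, 'fruit': 0.5, 'checkout': 0.3},
    'checkout': {'checkout': 1.0},
}

customer = Customer('Jane', 'dairy', probs, rng=random.Random(0))
path = [customer.state]
# The length guard prevents an endless loop if checkout were unreachable
while customer.is_active and len(path) < 1000:
    path.append(customer.next_state())
print(path)
```

A supermarket class could then hold a list of such customers, call `next_state()` on each of them once per simulated minute, and remove those that are no longer active.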