#  MSCA 37011 - Deep Learning and Image Recognition

## Markov Chain and Markov Property

*The Markov property states that given the present, the future is conditionally independent of the past.*

source: https://mpatacchiola.github.io/blog/2016/12/09/dissecting-reinforcement-learning.html
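The property can also be written as a formula. This is a sketch in standard notation, where $X_t$ (the state of the process at step $t$) and $x_t$ (a particular value of that state) are symbols assumed here and not taken from the source:

$$
P(X_{t+1} = x_{t+1} \mid X_t = x_t, X_{t-1} = x_{t-1}, \dots, X_0 = x_0) = P(X_{t+1} = x_{t+1} \mid X_t = x_t)
$$

In words, once $X_t$ is known, the earlier states $X_{t-1}, \dots, X_0$ add no further information about $X_{t+1}$.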

Let’s suppose we have a chain with only two states $s_0$ and $s_1$, where $s_0$ is the initial state. When the process is in $s_0$, it stays there 90% of the time and moves to $s_1$ the remaining 10% of the time. When the process is in $s_1$, it stays there 50% of the time and moves back to $s_0$ the other 50% of the time.

Given this data we can create a transition matrix *T*, where row $i$ holds the probabilities of moving from state $s_i$ to each of the states:
    
$$
T=\left(\begin{array}{cc} 
0.90 & 0.10\\
0.50 & 0.50
\end{array}\right)
$$ 

The transition matrix is always square, and because each row is a probability distribution over the next state, every entry lies between 0 and 1 and each row sums to 1.
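These three conditions are easy to check programmatically. The following is a minimal sketch; the helper name `is_row_stochastic`, its tolerance parameter `tol`, and the counterexample matrix `bad` are hypothetical and introduced only for illustration:

```python
import numpy as np

def is_row_stochastic(matrix, tol=1e-12):
    """Return True if `matrix` is square, has entries in [0, 1] and rows summing to 1."""
    m = np.asarray(matrix, dtype=float)
    # A transition matrix must be two-dimensional and square
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False
    # Every entry is a probability
    entries_ok = np.all((m >= -tol) & (m <= 1.0 + tol))
    # Every row is a probability distribution over the next state
    rows_ok = np.allclose(m.sum(axis=1), 1.0, atol=tol)
    return bool(entries_ok and rows_ok)

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

# Hypothetical counterexample: the first row sums to 1.1
bad = np.array([[0.90, 0.20],
                [0.50, 0.50]])

print(is_row_stochastic(T))
print(is_row_stochastic(bad))
```

The first call should report that *T* is a valid transition matrix, while the second should reject `bad` because its first row does not sum to 1.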

We can compute the k-step transition probability as the k-th power of the transition matrix:

In [3]:
import numpy as np

#Declaring the Transition Matrix T
T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

#Obtaining T after 3 steps
T_3 = np.linalg.matrix_power(T, 3)
#Obtaining T after 50 steps
T_50 = np.linalg.matrix_power(T, 50)
#Obtaining T after 100 steps
T_100 = np.linalg.matrix_power(T, 100)

#Printing the matrices
print("T: " + str(T))
print("T_3: " + str(T_3))
print("T_50: " + str(T_50))
print("T_100: " + str(T_100))


T: [[0.9 0.1]
 [0.5 0.5]]
T_3: [[0.844 0.156]
 [0.78  0.22 ]]
T_50: [[0.83333333 0.16666667]
 [0.83333333 0.16666667]]
T_100: [[0.83333333 0.16666667]
 [0.83333333 0.16666667]]


Now we define the initial distribution, which represents the state of the system at k=0. Our system is composed of two states, so we can model the initial distribution as a vector with two elements: the first element is the probability of being in state $s_0$ and the second element is the probability of being in state $s_1$. Let’s suppose that we start from $s_0$; the vector $\mathbf{v}$ representing the initial distribution will then have this form:

$$\mathbf{v} = (1, 0)$$

We can calculate the probability of being in a specific state after k iterations by multiplying the initial distribution by the k-th power of the transition matrix: $\mathbf{v} \cdot T^{k}$. 

Let’s do it in NumPy:

In [4]:
#Declaring the initial distribution
v = np.array([[1.0, 0.0]])

#Printing the initial distribution and the distributions after 1, 3, 50 and 100 steps
print("v: " + str(v))
print("v_1: " + str(np.dot(v,T)))
print("v_3: " + str(np.dot(v,T_3)))
print("v_50: " + str(np.dot(v,T_50)))
print("v_100: " + str(np.dot(v,T_100)))


v: [[1. 0.]]
v_1: [[0.9 0.1]]
v_3: [[0.844 0.156]]
v_50: [[0.83333333 0.16666667]]
v_100: [[0.83333333 0.16666667]]
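The 0.844 entry of v_3 can be cross-checked by listing every three-step path that starts and ends in $s_0$ and adding up the path probabilities. The sketch below does this by brute force; the variable names (`start`, `end`, `steps`, `path_probs`) are chosen here for illustration:

```python
import itertools
import numpy as np

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

start, end, steps = 0, 0, 3   # from s_0 back to s_0 in three steps
path_probs = {}
total = 0.0

# Enumerate the intermediate states visited at k=1 and k=2
for middle in itertools.product(range(2), repeat=steps - 1):
    path = (start, *middle, end)
    prob = 1.0
    # Multiply the transition probabilities along the path
    for a, b in zip(path[:-1], path[1:]):
        prob *= T[a, b]
    path_probs[path] = prob
    total += prob
    print(path, round(prob, 3))

print("total:", round(total, 3))
```

There are four such paths, and their probabilities add up to the first entry of v_3.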


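The probability of being in $s_0$ at k=3 is the sum over the four three-step paths that lead from $s_0$ back to $s_0$: 0.9·0.9·0.9 = 0.729 for staying in $s_0$ throughout, 0.9·0.1·0.5 = 0.045 and 0.1·0.5·0.9 = 0.045 for the two paths that spend a single step in $s_1$, and 0.1·0.5·0.5 = 0.025 for the path that spends two steps in $s_1$. The total, 0.729 + 0.045 + 0.045 + 0.025 = 0.844, matches the first entry of v_3.

Now let’s suppose that at the beginning we have some uncertainty about the starting state of our process. Let’s define another starting vector as follows: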
$$\mathbf{v}=(0.5,0.5)$$

In [6]:
#Declaring the initial distribution
v = np.array([[0.5, 0.5]])

#Printing the initial distribution and the distributions after 1, 3, 50 and 100 steps
print("v: " + str(v))
print("v_1: " + str(np.dot(v,T)))
print("v_3: " + str(np.dot(v,T_3)))
print("v_50: " + str(np.dot(v,T_50)))
print("v_100: " + str(np.dot(v,T_100)))

v: [[0.5 0.5]]
v_1: [[0.7 0.3]]
v_3: [[0.812 0.188]]
v_50: [[0.83333333 0.16666667]]
v_100: [[0.83333333 0.16666667]]
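The distributions above were obtained from precomputed powers of *T*. An equivalent route is to apply *T* one step at a time, which makes the step-by-step evolution explicit. The following is a minimal sketch; the helper `propagate` is a hypothetical name introduced here:

```python
import numpy as np

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

def propagate(v, T, k):
    """Apply the update v <- v @ T a total of k times."""
    v = np.asarray(v, dtype=float)
    for _ in range(k):
        v = v @ T
    return v

# Starting surely in s_0, and starting with a 50/50 split
v_certain_3 = propagate([1.0, 0.0], T, 3)
v_uncertain_3 = propagate([0.5, 0.5], T, 3)

# Same result as multiplying by the third power of T
v_power_3 = np.array([1.0, 0.0]) @ np.linalg.matrix_power(T, 3)

print(v_certain_3)
print(v_uncertain_3)
print(np.allclose(v_certain_3, v_power_3))
```

Both approaches should agree with the v_3 values printed earlier. The matrix power is convenient when several starting distributions share the same k, while the loop avoids forming $T^k$ at all.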


**What is happening in the long run?** 

The results after 50 and 100 iterations are the same: v_50 is equal to v_100 no matter which starting distribution we have. The chain has **converged to equilibrium**, meaning that as time progresses it forgets its starting distribution.
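The equilibrium can also be computed directly instead of by taking large powers. An equilibrium distribution $\pi$ stays unchanged after one step, so it satisfies $\pi \cdot T = \pi$, which makes it a left eigenvector of *T* with eigenvalue 1. The sketch below uses this eigenvector approach, which is one common way to do it and not necessarily the one the source uses; the name `pi` is an assumption made here:

```python
import numpy as np

T = np.array([[0.90, 0.10],
              [0.50, 0.50]])

# Left eigenvectors of T are the (right) eigenvectors of its transpose
eigenvalues, eigenvectors = np.linalg.eig(T.T)

# Pick the eigenvector whose eigenvalue is closest to 1
idx = np.argmin(np.abs(eigenvalues - 1.0))
pi = np.real(eigenvectors[:, idx])

# Rescale so the entries sum to 1 and form a probability distribution
pi = pi / pi.sum()

print(pi)
print(np.allclose(pi @ T, pi))
```

For this chain the result should be (5/6, 1/6), the same values as the 0.83333333 and 0.16666667 seen in v_50 and v_100.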

But we have to be careful: convergence is not always guaranteed. The dynamics of a Markov chain can be very complex; in particular, it is possible to have transient and recurrent states.
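As a small illustration of a chain that does not converge, consider a hypothetical two-state chain that always switches state. This matrix `P` is not from the source, and the failure here is caused by periodicity, not by transient states; it is only meant to show that the distribution can keep changing forever:

```python
import numpy as np

# Hypothetical chain: from s_0 always go to s_1, and from s_1 always go to s_0
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

v = np.array([1.0, 0.0])  # start in s_0

v_50 = v @ np.linalg.matrix_power(P, 50)
v_51 = v @ np.linalg.matrix_power(P, 51)

print("v_50:", v_50)
print("v_51:", v_51)
```

After an even number of steps the process is back in $s_0$ and after an odd number it is in $s_1$, so the distribution alternates and never forgets where it started.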