# Learning and Decision Making

## Laboratory 2: Markov chains

At the end of the lab, you should submit all code/answers written in the tasks marked as "Activity n. XXX", together with the corresponding outputs and any replies to specific questions posed, to the e-mail <adi.tecnico@gmail.com>. Make sure that the subject is of the form [&lt;group n.&gt;] LAB &lt;lab n.&gt;.

### 1. Modeling

Consider once again the simplified Trivial game described in the Homework, for which you wrote a Markov chain model:

<img src="trivial.png" width="400px">

Recall that your chain should describe a single player, where: 

* The player rolls a single die in each play; 
* At each intersection, the player continues along any of the possible intersecting paths with equal probability. 

---

#### Activity 1.

Implement your Markov chain model in Python. In particular,

* Create a list with all the states;
* Define a `numpy` array with the corresponding transition probabilities.

The order for the states used in the transition probability matrix should match that in the list of states. 

**Note 1**: Don't forget to import `numpy`. If you need additional matrix operations (such as matrix powers or eigenvalues and eigenvectors), you may also import the library `numpy.linalg`.

**Note 2**: Make sure to print the result at the end.

---

In [1]:
import numpy as np

#Dictionary displaying the states' neighbors
trivial_board = {0: [1, 2, 3],
                 1: [0, 8],
                 2: [0, 6],
                 3: [0, 4],
                 4: [3, 5, 9],
                 5: [4, 6],
                 6: [2, 5, 7],
                 7: [6, 8],
                 8: [1, 7, 9],
                 9: [8, 4]
                }

#Die's possible values
die_values = [1, 2, 3, 4, 5, 6]

#Dictionary containing the transition probabilities for every die value
matrix_dies = {}

#List with all the states
states = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

matrix = np.zeros((10,10))

#init matrix when the die shows a 1
for pos in states:
    for next_pos in trivial_board[pos]:
        val = 1 / len(trivial_board[pos])
        matrix[pos][next_pos] = val
    
matrix_dies[1] = matrix
matrix_d1 = matrix.copy()

#transition probabilities for the other possible die values
for die in die_values[1:]:
    matrix = np.dot(matrix, matrix_d1)
    matrix_dies[die] = matrix
        
#Transition Probabilities Matrix
transProbs = np.zeros((10,10))
for matrix in matrix_dies.values():
    transProbs += matrix
transProbs = transProbs/6

np.set_printoptions(precision=3)

print(transProbs)
print(states)

[[0.185 0.123 0.123 0.123 0.105 0.043 0.105 0.043 0.105 0.043]
 [0.185 0.123 0.083 0.083 0.065 0.043 0.065 0.083 0.185 0.083]
 [0.185 0.083 0.123 0.083 0.065 0.083 0.185 0.083 0.065 0.043]
 [0.185 0.083 0.083 0.123 0.185 0.083 0.065 0.043 0.065 0.083]
 [0.105 0.043 0.043 0.123 0.185 0.123 0.105 0.043 0.105 0.123]
 [0.065 0.043 0.083 0.083 0.185 0.123 0.185 0.083 0.065 0.083]
 [0.105 0.043 0.123 0.043 0.105 0.123 0.185 0.123 0.105 0.043]
 [0.065 0.083 0.083 0.043 0.065 0.083 0.185 0.123 0.185 0.083]
 [0.105 0.123 0.043 0.043 0.105 0.043 0.105 0.123 0.185 0.123]
 [0.065 0.083 0.043 0.083 0.185 0.083 0.065 0.083 0.185 0.123]]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
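A quick sanity check for a matrix like the one printed above is that it is row-stochastic: no negative entries, and every row sums to 1. The sketch below rebuilds the same construction (the average of the first six powers of the one-square matrix) with `numpy.linalg.matrix_power` and then runs that check. It assumes the neighbor lists from the cell above; the helper names `one_step_matrix` and `die_roll_matrix` are illustrative and not part of the lab.

```python
import numpy as np

# Neighbor lists copied from the cell above (state index -> adjacent states)
trivial_board = {0: [1, 2, 3], 1: [0, 8], 2: [0, 6], 3: [0, 4], 4: [3, 5, 9],
                 5: [4, 6], 6: [2, 5, 7], 7: [6, 8], 8: [1, 7, 9], 9: [8, 4]}


def one_step_matrix(board):
    """Transition matrix for moving one square, choosing neighbors uniformly."""
    n = len(board)
    step = np.zeros((n, n))
    for pos, neighbors in board.items():
        step[pos, neighbors] = 1 / len(neighbors)
    return step


def die_roll_matrix(step, faces=6):
    """Average of step^1, ..., step^faces: one play with a fair die."""
    powers = [np.linalg.matrix_power(step, k) for k in range(1, faces + 1)]
    return sum(powers) / faces


P = die_roll_matrix(one_step_matrix(trivial_board))

# A valid transition matrix has no negative entries and rows that sum to 1
print(np.all(P >= 0), np.allclose(P.sum(axis=1), 1.0))
```

If either check prints `False`, the most likely culprit is a missing or duplicated neighbor in the board dictionary.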


---

#### Activity 2.

Compute the probability of the following trajectories:

* "Pink with pie piece" - "Green in outer rim" - "Blue with pie slice" - "Pink in outer rim"
* "Pink with pie piece" - "Pink with pie piece" - "Blue in outer rim"
* "Center" - "Pink with pie piece" - "Blue in outer rim"

**Note:** Make sure to print the result at the end.

---

In [2]:
CENTER = 0
BLUE_IN_INNER_RIM = 1
GREEN_IN_INNER_RIM = 2
PINK_IN_INNER_RIM = 3
GREEN_WITH_PIE_SLICE = 4
PINK_IN_OUTER_RIM = 5
BLUE_WITH_PIE_SLICE = 6
GREEN_IN_OUTER_RIM = 7
PINK_WITH_PIE_PIECE = 8
BLUE_IN_OUTER_RIM = 9

first_trajectory = transProbs[PINK_WITH_PIE_PIECE][GREEN_IN_OUTER_RIM] * transProbs[GREEN_IN_OUTER_RIM][BLUE_WITH_PIE_SLICE] * transProbs[BLUE_WITH_PIE_SLICE][PINK_IN_OUTER_RIM]

second_trajectory = transProbs[PINK_WITH_PIE_PIECE][PINK_WITH_PIE_PIECE] * transProbs[PINK_WITH_PIE_PIECE][BLUE_IN_OUTER_RIM] 

third_trajectory = transProbs[CENTER][PINK_WITH_PIE_PIECE] * transProbs[PINK_WITH_PIE_PIECE][BLUE_IN_OUTER_RIM]

print("first trajectory probability: ", first_trajectory)
print("second trajectory probability: ", second_trajectory)
print("third trajectory probability: ", third_trajectory)

first trajectory probability:  0.002822514634738381
second trajectory probability:  0.02286236854138089
third trajectory probability:  0.012955342173449166
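The three products above follow the same pattern: given the first state, the probability of a trajectory is the product of the one-step transition probabilities between consecutive states. A minimal sketch of that pattern as a reusable helper is shown below. The function name `trajectory_probability` and the two-state matrix `P_toy` are assumptions made for illustration only.

```python
import numpy as np


def trajectory_probability(P, trajectory):
    """Probability of visiting `trajectory` in order, given its first state."""
    prob = 1.0
    # Multiply the one-step probabilities of each consecutive pair of states
    for current, nxt in zip(trajectory[:-1], trajectory[1:]):
        prob *= P[current, nxt]
    return prob


# Toy 2-state chain, used only to illustrate the helper
P_toy = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
print(trajectory_probability(P_toy, [0, 0, 1, 1]))  # 0.9 * 0.1 * 0.5
```

With the notebook's variables, a call such as `trajectory_probability(transProbs, [CENTER, PINK_WITH_PIE_PIECE, BLUE_IN_OUTER_RIM])` would compute the third trajectory.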


### 2. Stability

---

#### Activity 3.

Justify whether the chain implemented in Activity #1 is:

* Irreducible
* Aperiodic
* Ergodic

---

The chain is irreducible. Every entry of the transition probability matrix is strictly positive, so every state y can be reached from every state x in a single step.

The chain is aperiodic. Every diagonal entry of the transition probability matrix is strictly positive, so a chain that starts in state x can return to x at t=1. A state that can return to itself in one step has period 1, so every state is aperiodic.

Since the chain has finitely many states and is both irreducible and aperiodic, it is also ergodic.
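The argument above can also be checked numerically. The sketch below tests whether a matrix is strictly positive, and, more generally, whether some power of it is strictly positive, which for a finite chain is equivalent to being irreducible and aperiodic. It relies on Wielandt's bound, under which checking the power (n - 1)^2 + 1 is sufficient. The function names and the toy matrices are assumptions for illustration.

```python
import numpy as np


def is_strictly_positive(P):
    """True when every one-step transition probability is greater than zero."""
    return bool(np.all(P > 0))


def is_primitive(P):
    """True when some power of P has only positive entries."""
    n = P.shape[0]
    reach = (P > 0).astype(int)
    power = reach.copy()
    # After the loop, `power` holds the zero/nonzero pattern of P^((n-1)^2 + 1)
    for _ in range((n - 1) ** 2):
        # Keep only the zero/nonzero pattern so that entries never grow large
        power = ((power @ reach) > 0).astype(int)
    return bool(np.all(power > 0))


# A chain that alternates deterministically between two states has period 2
P_periodic = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
# Adding a self-loop on state 1 makes the chain aperiodic
P_lazy = np.array([[0.0, 1.0],
                   [0.5, 0.5]])

print(is_primitive(P_periodic), is_primitive(P_lazy))
```

A strictly positive matrix, such as the one from Activity 1, passes both checks immediately; the second check matters for chains whose one-step matrix contains zeros.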

---

#### Activity 4

Compute the stationary distribution for the chain.

**Note:** The stationary distribution is a *left* eigenvector of the transition probability matrix associated with the eigenvalue 1. As such, you may find the numpy function `numpy.linalg.eig` useful. Also, recall that the stationary distribution is *a distribution*.

---

In [9]:
import numpy.linalg

# Left eigenvectors of P are the right eigenvectors of its transpose
eigenvalues, eigenvectors = numpy.linalg.eig(transProbs.T)

# Pick the column whose eigenvalue is closest to 1 (exact equality is fragile in floating point)
column = np.argmin(np.abs(eigenvalues - 1))

# Drop any numerically zero imaginary part
eigenvector1 = np.real(eigenvectors[:, column])

# Normalize so that the entries sum to 1, i.e. form a probability distribution
eigenvector1 = eigenvector1 / eigenvector1.sum()

print(eigenvector1)


[0.125 0.083 0.083 0.083 0.125 0.083 0.125 0.083 0.125 0.083]
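A result like the one above can be cross-checked without an eigendecomposition: a stationary distribution must satisfy pi P = pi and sum to 1. For an ergodic chain, repeatedly multiplying any initial distribution by P also converges to it. The sketch below illustrates both ideas on a toy two-state chain; `stationary_by_power_iteration`, its tolerance, and `P_toy` are assumptions for illustration, not part of the lab.

```python
import numpy as np


def stationary_by_power_iteration(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution by iterating pi <- pi P."""
    n = P.shape[0]
    pi = np.full(n, 1 / n)  # start from the uniform distribution
    for _ in range(max_iter):
        new_pi = pi @ P
        if np.max(np.abs(new_pi - pi)) < tol:
            return new_pi
        pi = new_pi
    return pi


P_toy = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
pi_toy = stationary_by_power_iteration(P_toy)

# A stationary distribution is unchanged by one more step and sums to 1
print(np.allclose(pi_toy @ P_toy, pi_toy), np.isclose(pi_toy.sum(), 1.0))
print(pi_toy)  # close to [5/6, 1/6] for this toy chain
```

Applied to the notebook's variables, `np.allclose(eigenvector1 @ transProbs, eigenvector1)` would be the corresponding one-line check.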


### 3. Simulation

You are now going to *simulate* the Markov chain that you defined in Activity #1.

---

#### Activity 5

Generate a 10,000-step trajectory of the chain defined in Activity #1.

---

In [6]:
long_trajectory = np.zeros(10000,dtype=np.int32)

long_trajectory[0] = np.random.choice(states)

for i in range(1,10000):
    # Draw the next state as a scalar; passing size=1 would return a length-1 array
    long_trajectory[i] = np.random.choice(10, p=transProbs[long_trajectory[i-1]])

print(long_trajectory)
    

[0 0 0 ... 8 8 1]
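Because the trajectory is random, each run of the cell above prints a different result. One way to make a simulation reproducible is to draw from a seeded `numpy.random.Generator`. The sketch below wraps the sampling loop in a function; `simulate_chain`, the seed value, and `P_toy` are assumptions chosen for illustration.

```python
import numpy as np


def simulate_chain(P, n_steps, rng, start=None):
    """Sample a trajectory of `n_steps` states from transition matrix `P`."""
    n_states = P.shape[0]
    trajectory = np.empty(n_steps, dtype=np.int64)
    # Pick the initial state uniformly at random unless one is given
    trajectory[0] = rng.integers(n_states) if start is None else start
    for t in range(1, n_steps):
        # The row of the current state is the distribution of the next state
        trajectory[t] = rng.choice(n_states, p=P[trajectory[t - 1]])
    return trajectory


P_toy = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
rng = np.random.default_rng(seed=0)
path = simulate_chain(P_toy, 1000, rng)
print(path[:10])
```

Re-creating the generator with the same seed yields the same trajectory, which makes it easier to compare results across runs.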


---

#### Activity 6

Draw a histogram of the trajectory generated in Activity #5. Make sure that the histogram has one bin for each state. Compare the relative frequencies with the result of Activity #4.

**Note**: Don't forget to load `matplotlib`.

---

In [None]:
# Insert your code here

_Provide your answer here (double click to edit)._
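One possible approach to this comparison, offered as a sketch and not as the expected answer: count the visits to each state with `numpy.bincount`, divide by the trajectory length, and plot the result with one unit-wide bin per state. The function names and the toy trajectory below are assumptions; `matplotlib` is imported inside the plotting function so that the frequency computation can be used on its own.

```python
import numpy as np


def relative_frequencies(trajectory, n_states):
    """Fraction of time steps that the trajectory spends in each state."""
    counts = np.bincount(trajectory, minlength=n_states)
    return counts / counts.sum()


def plot_state_histogram(trajectory, n_states, stationary=None):
    """Histogram with one bin per state, optionally overlaid with a distribution."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    # Bin edges at -0.5, 0.5, ... so that each integer state gets its own bin
    edges = np.arange(n_states + 1) - 0.5
    # With unit-wide bins, density=True makes bar heights relative frequencies
    plt.hist(trajectory, bins=edges, density=True, rwidth=0.8)
    if stationary is not None:
        plt.plot(range(n_states), stationary, "ro", label="stationary distribution")
        plt.legend()
    plt.xticks(range(n_states))
    plt.xlabel("State")
    plt.ylabel("Relative frequency")
    plt.show()


toy_trajectory = np.array([0, 1, 1, 2, 1, 0, 2, 1])
print(relative_frequencies(toy_trajectory, 3))
```

With the notebook's variables, `plot_state_histogram(long_trajectory, 10, eigenvector1)` would draw the histogram with the stationary distribution overlaid. For an ergodic chain, the relative frequencies of a long trajectory are expected to approach the stationary distribution.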