## Markov Processes

### Markov Chain - Continuing

These are chains with no terminal state: the process keeps moving between states indefinitely.

![Markov Chain](https://raw.githubusercontent.com/nsanghi/drl-2ed/main/chapter2/images/mcchain_continuing.png "Markov Chain")

In [1]:
import numpy as np

In [2]:
# Markov chain with no terminal state (continuing)

# import numpy library to do vector algebra
import numpy as np

# define transition matrix
P = np.array([[0.3, 0.7], [0.2, 0.8]])
print("Transition Matrix:\n", P)

# define an initial guess for the state probabilities
# Here we assume equal probabilities for all the states
S = np.array([0.5, 0.5])

# run through 10 iterations to approach the steady-state
# state probabilities
for i in range(10):
    S = np.dot(S, P)
    if i % 5 == 0:
        print("\nIter {0}. State Probability vector S = {1}".format(i, S))


print("\nFinal Vector S={0}".format(S))

Transition Matrix:
 [[0.3 0.7]
 [0.2 0.8]]

Iter 0. State Probability vector S = [0.25 0.75]

Iter 5. State Probability vector S = [0.2222225 0.7777775]

Final Vector S=[0.22222222 0.77777778]
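
The iteration above approaches the steady state by repeated multiplication. As a cross-check, the same vector can be obtained directly from the stationary condition `pi P = pi` together with the constraint that the probabilities sum to 1. The sketch below is one possible way to do this, not part of the original notebook; the names `A`, `b`, and `pi` are illustrative, and it assumes the chain has a unique stationary distribution, which is the case for this two-state matrix.

```python
import numpy as np

# Same transition matrix as in the cell above
P = np.array([[0.3, 0.7], [0.2, 0.8]])
n = P.shape[0]

# Stationary condition: pi P = pi  <=>  (P^T - I) pi^T = 0.
# Append the normalization constraint sum(pi) = 1 as an extra row.
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.append(np.zeros(n), 1.0)

# Least-squares solve of the overdetermined but consistent system
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print("Stationary distribution:", pi)
```

The result should agree with the final vector printed by the iterative loop, which is 2/9 and 7/9 to the displayed precision.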


### Markov Chain - Episodic

These are chains with a terminal state: once the process reaches it, it stays there.

![Markov Chain Episodic](https://raw.githubusercontent.com/nsanghi/drl-2ed/main/chapter2/images/mc_episodic.png "Markov Chain Episodic")

In [3]:
# MC Episodic

# import numpy library to do vector algebra
import numpy as np

# define transition matrix
P = np.array([
    [0.3, 0.5, 0.2, 0.0],
    [0.1, 0.9, 0.0, 0.0],
    [0.4, 0, 0, 0.6],
    [0, 0, 0, 1]
])

print("Transition Matrix:\n", P)

# define the starting state probabilities
# Here the chain starts in the first state with probability 1
S = np.array([1, 0, 0, 0])

# run through 1000 iterations to calculate the steady-state
# state probabilities, which should give prob ~ 1 for the terminal state
for i in range(1000):
    S = np.dot(S, P)
    # print("\nIter {0}. Probability vector S = {1}".format(i, S))


print("\nFinal Vector S={0}".format(S))

Transition Matrix:
 [[0.3 0.5 0.2 0. ]
 [0.1 0.9 0.  0. ]
 [0.4 0.  0.  0.6]
 [0.  0.  0.  1. ]]

Final Vector S=[4.73405766e-09 2.84857504e-08 9.63092427e-10 9.99999966e-01]
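
To see how the probability mass drains into the terminal state, the distribution after `n` steps can also be written as `S0 @ P^n`. The following is a minimal sketch, assuming the same transition matrix and starting state as above; `S0`, `terminal_mass`, and the chosen step counts are illustrative and not taken from the original notebook.

```python
import numpy as np

# Same episodic transition matrix as above; the last state is terminal
P = np.array([
    [0.3, 0.5, 0.2, 0.0],
    [0.1, 0.9, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 1.0],
])

# Start in the first state with probability 1
S0 = np.array([1.0, 0.0, 0.0, 0.0])

# Probability of being in the terminal state after n steps
terminal_mass = []
for n in (1, 10, 100, 1000):
    S_n = S0 @ np.linalg.matrix_power(P, n)
    terminal_mass.append(S_n[3])
    print(f"n={n:4d}  P(terminal) = {S_n[3]:.6f}")
```

Because the terminal state has a self-transition probability of 1, the mass in it can only grow with `n`, which is why the loop above ends with nearly all probability in the last entry.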


### Markov Reward Process - Continuing

A Markov reward process extends the Markov chain: each transition now carries a reward in addition to its transition probability.

![Markov Reward Process](https://raw.githubusercontent.com/nsanghi/drl-2ed/main/chapter2/images/mrp_continuing.png "Markov Reward Process")