# Lecture: Implementation of Markov reward processes

Suppose we have the Markov reward process (MRP) depicted in the lecture slides. The file `markov_reward_process.py` shows one way to implement it and to sample episodes from it. It extends the implementation in `markov_chain.py` with rewards and a discount factor.

In [None]:
!git clone https://github.com/Fjoelsak/RL.git
!cp RL/02_MDP/markov_reward_process.py ./

In [None]:
import markov_reward_process as mrp

states = ['FB', 'C1', 'C2', 'C3', 'Pass', 'Pub', 'Sleep']
rewards = [-1, -2, -2, -2, 10, 1, 0]
transProbs = {0: {0: 0.9, 1: 0.1},
            1: {0: 0.5, 2: 0.5},
            2: {3: 0.8, 6: 0.2},
            3: {4: 0.6, 5: 0.4},
            4: {6: 1.0},
            5: {1: 0.2, 2: 0.4, 3: 0.4},
            6: {6: 1.0}
}

Student_MRP = mrp.MarkovRewardProcess(states, rewards, transProbs, [6], 0.9)
print("Transition probability matrix:\n", Student_MRP.trans_prob_matrix)

for i in range(10):
    eps, _ = Student_MRP.sample(1)
    if len(eps) < 30:
        print("Possible episode: ", eps)

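
To make the sampling step concrete, here is a minimal standalone sketch of how an episode could be drawn from the transition dictionary above. It is not the code in `markov_reward_process.py`: the function `sample_episode`, its parameters, and the convention that the reward of a state is collected when leaving it are assumptions made for illustration.

```python
import random

# Same MRP data as in the cell above (state indices 0..6).
rewards = [-1, -2, -2, -2, 10, 1, 0]
trans_probs = {0: {0: 0.9, 1: 0.1},
               1: {0: 0.5, 2: 0.5},
               2: {3: 0.8, 6: 0.2},
               3: {4: 0.6, 5: 0.4},
               4: {6: 1.0},
               5: {1: 0.2, 2: 0.4, 3: 0.4},
               6: {6: 1.0}}


def sample_episode(trans_probs, rewards, start, terminal_states,
                   rng=None, max_steps=1000):
    """Hypothetical helper: walk the chain from `start` until a terminal state.

    Returns the visited state indices and the rewards collected on the way.
    Assumption: the reward of a state is received when the state is left.
    """
    rng = rng or random.Random()
    state = start
    visited, collected = [state], []
    for _ in range(max_steps):
        if state in terminal_states:
            break
        collected.append(rewards[state])
        successors = list(trans_probs[state].keys())
        weights = list(trans_probs[state].values())
        # Draw the next state according to the transition probabilities.
        state = rng.choices(successors, weights=weights, k=1)[0]
        visited.append(state)
    return visited, collected


# Start in state 1 ('C1'); state 6 ('Sleep') is terminal.
visited, collected = sample_episode(trans_probs, rewards, start=1,
                                    terminal_states=[6],
                                    rng=random.Random(0))
print("states: ", visited)
print("rewards:", collected)
```

The `max_steps` cap is only a safeguard against very long episodes, for example when the chain keeps looping in 'FB'.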
# Exercise: Calculating returns and analytical solution of the value function

## Task 1
Implement the method `calc_return()` in `markov_reward_process.py`. It takes the list of immediate rewards that the agent collects while sampling an episode in the MRP and computes the discounted return from it.

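The return calculations from the lecture slides are provided below for testing purposes. The expected values correspond to a discount factor of 0.5, whereas `Student_MRP` above was created with 0.9, so the discount factor has to be set accordingly to reproduce them.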

In [None]:
print(Student_MRP.calc_return([-2,-2,-2,10]))       # Result should be -2.25
print(Student_MRP.calc_return([-2,-1,-1,-2,-2]))    # Result should be -3.125
print(Student_MRP.calc_return([-2,-2,-2,1,-2,-2,10,0]))     # Result should be approx. -3.41
print(Student_MRP.calc_return([-2,-1,-1,-2,-2,-2,1,-2]))    # Result should be approx. -3.2
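
As a reference for what the discounted return of a reward list means, the following standalone sketch evaluates the sum of `gamma**k * rewards[k]` over all steps `k`. The function name `discounted_return` and its explicit `gamma` parameter are assumptions for illustration; the signature of `calc_return()` in the course repository may differ.

```python
def discounted_return(rewards, gamma):
    """Return sum_k gamma**k * rewards[k] for a list of immediate rewards.

    The list is processed from the back, so each step only needs
    one multiplication and one addition: G_k = r_k + gamma * G_{k+1}.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# The first two test cases from above, with the discount factor 0.5.
print(discounted_return([-2, -2, -2, 10], 0.5))      # -2.25
print(discounted_return([-2, -1, -1, -2, -2], 0.5))  # -3.125
```

Working backwards avoids computing powers of the discount factor explicitly; a forward loop that accumulates `gamma**k * r` gives the same result.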

## Task 2

Implement the `analytical_sol()` method in `markov_reward_process.py` using the approach described in the lecture slides: solve the linear system given by the Bellman equation for the given MRP. You can test your solution against the value functions shown in the lecture slides for different discount factors.

In [None]:
print(Student_MRP.analytical_sol())
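
For orientation, the Bellman equation of an MRP in matrix form is v = R + gamma * P v, which rearranges to the linear system (I - gamma * P) v = R. The sketch below solves that system for the student MRP in plain Python. It is a hedged illustration, not the repository's `analytical_sol()`: the helpers `solve_linear_system` and `mrp_state_values` are hypothetical names, and a library routine such as `numpy.linalg.solve` could replace the hand-written elimination.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [A | b] so the inputs stay untouched.
    M = [[float(a) for a in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mrp_state_values(trans_probs, rewards, gamma):
    """Solve (I - gamma * P) v = R for the state values v."""
    n = len(rewards)
    A = [[(1.0 if i == j else 0.0) - gamma * trans_probs[i].get(j, 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(A, rewards)


# Same MRP data as at the top of the notebook.
states = ['FB', 'C1', 'C2', 'C3', 'Pass', 'Pub', 'Sleep']
rewards = [-1, -2, -2, -2, 10, 1, 0]
trans_probs = {0: {0: 0.9, 1: 0.1},
               1: {0: 0.5, 2: 0.5},
               2: {3: 0.8, 6: 0.2},
               3: {4: 0.6, 5: 0.4},
               4: {6: 1.0},
               5: {1: 0.2, 2: 0.4, 3: 0.4},
               6: {6: 1.0}}

values = mrp_state_values(trans_probs, rewards, gamma=0.9)
for name, v in zip(states, values):
    print(f"{name:>5}: {v:6.2f}")
```

Note that with a discount factor of exactly 1 the matrix I - P is singular here, because the absorbing 'Sleep' state contributes an all-zero row, so this sketch only applies for discount factors below 1 unless the terminal state is handled separately.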