# Setup

**Run the following cell before doing anything else**, since every later cell relies on the `numpy` import. Alternatively, select **Run > Run All Cells** in the menu.

In [2]:
import numpy as np

# Computations in slides

## Slide 15 - Computation of Value Function

In this example, we need to solve the system:

\begin{align*}
V(1)&=0.3(2 + V(2)) + 0.6(2 + V(4)) + 0.1(-2 + V(5))\\
V(2)&=0.4(4 + V(1)) + 0.5(1 + V(3)) + 0.1(-3 + V(5))\\
V(3)&=0.8(2 + V(2)) + 0.1(1 + V(4)) + 0.1(-1 + V(5))\\
V(4)&=0.2(2 + V(1)) + 0.7(1 + V(3)) + 0.1(-1 + V(5))\\
V(5)&=0
\end{align*}

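Substituting $V(5)=0$ and moving the unknowns to the left-hand side, we rewrite the system in the form $AV=b$, where $V=(V(1),V(2),V(3),V(4))^\top$: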

\begin{align*}
V(1) - 0.3V(2)-0.6V(4)&=0.3\cdot2 + 0.6\cdot 2 + 0.1\cdot(-2)\\
-0.4V(1) + V(2) - 0.5V(3) &=0.4\cdot4 + 0.5\cdot1 + 0.1\cdot(-3)\\
-0.8V(2) + V(3) - 0.1V(4)&=0.8\cdot2 + 0.1\cdot1 + 0.1\cdot(-1)\\
-0.2V(1) - 0.7V(3) + V(4)&=0.2\cdot2 + 0.7\cdot1 + 0.1\cdot(-1)
\end{align*}


In [18]:
A = np.array([
    [ 1.0, -0.3,  0.0, -0.6],
    [-0.4,  1.0, -0.5,  0.0],
    [ 0.0, -0.8,  1.0, -0.1],
    [-0.2,  0.0, -0.7,  1.0],
], dtype=np.float64)

print(A)

b = np.array([
    0.3 * 2 + 0.6 * 2 + 0.1 * (-2),
    0.4 * 4 + 0.5 * 1 + 0.1 * (-3),
    0.8 * 2 + 0.1 * 1 + 0.1 * (-1),
    0.2 * 2 + 0.7 * 1 + 0.1 * (-1),
], dtype=np.float64)

print(b)

[[ 1.  -0.3  0.  -0.6]
 [-0.4  1.  -0.5  0. ]
 [ 0.  -0.8  1.  -0.1]
 [-0.2  0.  -0.7  1. ]]
[1.6 1.8 1.6 1. ]


In [10]:
V = np.linalg.solve(A, b)
print(V)

[15.51196172 15.94258373 15.87559809 15.215311  ]


In [13]:
for i, v in enumerate(V, start=1):
    print(f'V({i}) = {v:6.3f}')
print(f'V(5) = {0:6.3f}')

V(1) = 15.512
V(2) = 15.943
V(3) = 15.876
V(4) = 15.215
V(5) =  0.000
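
As a sanity check, the solution can be substituted back into the original equations. The sketch below rebuilds the system from a transition-probability matrix and a reward matrix read off the equations above. The names `P`, `R`, `A_check`, `b_check`, `V_check`, `V_full`, and `rhs` are introduced here for illustration only and do not come from the slides.

```python
import numpy as np

# Transition probabilities from states 1-4 (rows) to states 1-5 (columns),
# read off the system above; state 5 is terminal.
P = np.array([
    [0.0, 0.3, 0.0, 0.6, 0.1],
    [0.4, 0.0, 0.5, 0.0, 0.1],
    [0.0, 0.8, 0.0, 0.1, 0.1],
    [0.2, 0.0, 0.7, 0.0, 0.1],
], dtype=np.float64)

# Reward attached to each transition (0 where the probability is 0).
R = np.array([
    [0.0, 2.0, 0.0, 2.0, -2.0],
    [4.0, 0.0, 1.0, 0.0, -3.0],
    [0.0, 2.0, 0.0, 1.0, -1.0],
    [2.0, 0.0, 1.0, 0.0, -1.0],
], dtype=np.float64)

A_check = np.eye(4) - P[:, :4]   # I - P restricted to the non-terminal states
b_check = (P * R).sum(axis=1)    # expected one-step reward in each state
V_check = np.linalg.solve(A_check, b_check)

# Substitute back into V(i) = sum_j P[i, j] * (R[i, j] + V(j)), with V(5) = 0.
V_full = np.append(V_check, 0.0)
rhs = (P * (R + V_full)).sum(axis=1)
print(np.max(np.abs(rhs - V_check)))  # residual, expected to be at round-off level
```

Building `A` and `b` from `P` and `R` this way avoids typing the coefficients twice, which is where transcription errors tend to creep in.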


## Slide 44 - Solution by Jacobi iteration

We want to solve the system:

\begin{align*}
V(\mathtt{h})&=\frac{2}{3}r_{\mathtt{s}}+\frac{1}{3}r_{\mathtt{w}}
+\gamma\left[\left(\frac{2}{3}\alpha+\frac{1}{3}\right)V(\mathtt{h})+\frac{2}{3}(1-\alpha)V(\mathtt{l})\right]\\
V(\mathtt{l})&=-\frac{3}{5}(1-\beta)+\frac{1}{5}\beta r_{\mathtt{s}}+\frac{1}{5}r_{\mathtt{w}}\\
&+\gamma\left[\left(\frac{1}{5}(1-\beta)+\frac{3}{5}\right)V(\mathtt{h})+
\left(\frac{1}{5}\beta+\frac{1}{5}\right)V(\mathtt{l})\right]
\end{align*}

To use a Jacobi iteration, we first write this system in the form:
$$
V = b + \gamma QV
$$
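
Reading the coefficients off the two equations above, with $V=(V(\mathtt{h}),V(\mathtt{l}))^\top$, this form corresponds to:

$$
Q=\begin{pmatrix}
\frac{2}{3}\alpha+\frac{1}{3} & \frac{2}{3}(1-\alpha)\\
\frac{1}{5}(1-\beta)+\frac{3}{5} & \frac{1}{5}\beta+\frac{1}{5}
\end{pmatrix},
\qquad
b=\begin{pmatrix}
\frac{2}{3}r_{\mathtt{s}}+\frac{1}{3}r_{\mathtt{w}}\\
-\frac{3}{5}(1-\beta)+\frac{1}{5}\beta r_{\mathtt{s}}+\frac{1}{5}r_{\mathtt{w}}
\end{pmatrix}
$$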

Let's first define the parameters in the model:

In [16]:
alpha = 0.8
beta = 0.3
r_search = 15
r_wait = 10
gamma = 0.9

Now define $Q$ and $b$. We adopt the convention that $\mathtt{V[0]}=V(\mathtt{h})$ and $\mathtt{V[1]}=V(\mathtt{l})$.

In [22]:
Q = np.array([
    [2/3 * alpha + 1/3, 2/3 * (1 - alpha)],
    [1/5 * (1 - beta) + 3/5, 1/5 * beta + 1/5],
], dtype=np.float64)


print(f'Matrix Q:\n{Q}')

b = np.array([
    2/3 * r_search + 1/3 * r_wait,
    -3/5 * (1 - beta) + 1/5 * beta * r_search + 1/5 * r_wait
])

print(f'Vector b:\n{b}')

Matrix Q:
[[0.86666667 0.13333333]
 [0.74       0.26      ]]
Vector b:
[13.33333333  2.48      ]
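
Each row of $Q$ collects the coefficients of $V(\mathtt{h})$ and $V(\mathtt{l})$ in one equation, and these coefficients sum to 1 for any $\alpha$ and $\beta$ because the terms in $\alpha$ and $\beta$ cancel. A quick check along these lines is sketched below. The helper `build_Q` and the loop variables `alpha_test` and `beta_test` are names introduced here only for illustration.

```python
import numpy as np

def build_Q(alpha, beta):
    """Coefficient matrix Q of the system above (hypothetical helper)."""
    return np.array([
        [2/3 * alpha + 1/3, 2/3 * (1 - alpha)],
        [1/5 * (1 - beta) + 3/5, 1/5 * beta + 1/5],
    ], dtype=np.float64)

# Row sums for the notebook's parameters and for two other arbitrary choices.
for alpha_test, beta_test in [(0.8, 0.3), (0.1, 0.9), (0.5, 0.5)]:
    row_sums = build_Q(alpha_test, beta_test).sum(axis=1)
    print(alpha_test, beta_test, row_sums)
```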


In [24]:
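# Jacobi iteration: every component is updated from the previous iterate.
V = np.zeros(2, dtype=np.float64)
n_iterations = 180
n_report = 30  # print every n_report iterations
for n in range(1, n_iterations + 1):
    V = b + gamma * Q @ V  # parsed as (gamma * Q) @ V; the right-hand side uses only the old V
    if n % n_report == 0:
        print(f'n={n:3d}, V(high)={V[0]:10.8f}, V(low)={V[1]:10.8f}')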

n= 30, V(high)=113.68382504, V(low)=101.43401316
n= 60, V(high)=118.42373411, V(low)=106.17392222
n= 90, V(high)=118.62466434, V(low)=106.37485246
n=120, V(high)=118.63318201, V(low)=106.38337012
n=150, V(high)=118.63354308, V(low)=106.38373119
n=180, V(high)=118.63355839, V(low)=106.38374650
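
The change between consecutive reports shrinks by a factor of roughly $0.9^{30}\approx 0.04$, which is consistent with a contraction factor of $\gamma=0.9$ per iteration (the rows of $Q$ sum to 1, so $\|\gamma Q\|_\infty=\gamma$). Instead of fixing the number of iterations in advance, one could stop once successive iterates agree to within a tolerance. The following is a minimal sketch of that idea. The function `jacobi_solve` and its parameters `tol` and `max_iter` are illustrative names and defaults, not part of the slides.

```python
import numpy as np

# Same model parameters as in the notebook.
alpha, beta, gamma = 0.8, 0.3, 0.9
r_search, r_wait = 15, 10

Q = np.array([
    [2/3 * alpha + 1/3, 2/3 * (1 - alpha)],
    [1/5 * (1 - beta) + 3/5, 1/5 * beta + 1/5],
], dtype=np.float64)
b = np.array([
    2/3 * r_search + 1/3 * r_wait,
    -3/5 * (1 - beta) + 1/5 * beta * r_search + 1/5 * r_wait,
], dtype=np.float64)

def jacobi_solve(Q, b, gamma, tol=1e-8, max_iter=10_000):
    """Iterate V <- b + gamma * Q V until the max-norm change drops below tol."""
    V = np.zeros_like(b, dtype=np.float64)
    for n in range(1, max_iter + 1):
        V_new = b + gamma * (Q @ V)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, n
        V = V_new
    raise RuntimeError(f'no convergence within {max_iter} iterations')

V_tol, n_used = jacobi_solve(Q, b, gamma)
print(f'stopped after {n_used} iterations: '
      f'V(high)={V_tol[0]:10.8f}, V(low)={V_tol[1]:10.8f}')
```

Note that the stopping rule bounds the change between iterates, not the error itself. For a contraction with factor $\gamma$, the remaining error is at most $\gamma/(1-\gamma)$ times the last change, which is a factor of 9 here.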


Let's check against a direct solution of $(I-\gamma Q)V=b$ with `np.linalg.solve`, which is based on an LU decomposition:

In [26]:
A = np.eye(2, dtype=np.float64) - gamma * Q
V_lu = np.linalg.solve(A, b)
print(f' V(high)={V_lu[0]:10.8f}, V(low)={V_lu[1]:10.8f}')

 V(high)=118.63355907, V(low)=106.38374718


## Slide 46 - Solution by Gauss-Seidel iteration

Gauss-Seidel uses the same $Q$ and $b$ as the Jacobi iteration above, but overwrites the components of $V$ one at a time, so each update uses the most recent values available.

In [27]:
# Gauss-Seidel iteration: components are overwritten in place.
V = np.zeros(2, dtype=np.float64)
n_iterations = 180
n_report = 30  # print every n_report iterations
for n in range(1, n_iterations + 1):
    V[0] = b[0] + gamma * Q[0, :] @ V  # uses the old V[0] and V[1]
    V[1] = b[1] + gamma * Q[1, :] @ V  # already uses the new V[0]
    if n % n_report == 0:
        print(f'n={n:3d}, V(high)={V[0]:10.8f}, V(low)={V[1]:10.8f}')

n= 30, V(high)=115.21947040, V(low)=103.29702236
n= 60, V(high)=118.53517952, V(low)=106.29480087
n= 90, V(high)=118.63072419, V(low)=106.38118412
n=120, V(high)=118.63347738, V(low)=106.38367332
n=150, V(high)=118.63355671, V(low)=106.38374505
n=180, V(high)=118.63355900, V(low)=106.38374712
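
At every reported step, the Gauss-Seidel iterates are closer to the direct solution than the Jacobi iterates (for example, 115.22 versus 113.68 for V(high) at n=30). One way to quantify this is to count the sweeps each method needs to come within a given distance of the direct solution. The sketch below does so under stated assumptions: the helper names `jacobi_sweep`, `gauss_seidel_sweep`, and `sweeps_to_tolerance`, and the tolerance `tol=1e-6`, are choices made here for illustration.

```python
import numpy as np

# Same model parameters as in the notebook.
alpha, beta, gamma = 0.8, 0.3, 0.9
r_search, r_wait = 15, 10

Q = np.array([
    [2/3 * alpha + 1/3, 2/3 * (1 - alpha)],
    [1/5 * (1 - beta) + 3/5, 1/5 * beta + 1/5],
], dtype=np.float64)
b = np.array([
    2/3 * r_search + 1/3 * r_wait,
    -3/5 * (1 - beta) + 1/5 * beta * r_search + 1/5 * r_wait,
], dtype=np.float64)

# Reference solution of (I - gamma * Q) V = b.
V_exact = np.linalg.solve(np.eye(2) - gamma * Q, b)

def jacobi_sweep(V):
    """One Jacobi sweep: all components computed from the old V."""
    return b + gamma * (Q @ V)

def gauss_seidel_sweep(V):
    """One Gauss-Seidel sweep: components overwritten one at a time."""
    V = V.copy()
    for i in range(len(V)):
        V[i] = b[i] + gamma * (Q[i, :] @ V)
    return V

def sweeps_to_tolerance(sweep, tol=1e-6, max_iter=10_000):
    """Number of sweeps until the max-norm error against V_exact is below tol."""
    V = np.zeros(2, dtype=np.float64)
    for n in range(1, max_iter + 1):
        V = sweep(V)
        if np.max(np.abs(V - V_exact)) < tol:
            return n
    raise RuntimeError(f'no convergence within {max_iter} sweeps')

n_jacobi = sweeps_to_tolerance(jacobi_sweep)
n_gs = sweeps_to_tolerance(gauss_seidel_sweep)
print(f'Jacobi: {n_jacobi} sweeps, Gauss-Seidel: {n_gs} sweeps')
```

Measuring the error against `V_exact` is only possible here because the system is small enough to solve directly. In larger problems the change between successive iterates would typically serve as the stopping criterion instead.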
