# Linear Quadratic Control with Stochastic Disturbances


In this notebook, we use the finite-horizon linear quadratic control framework 
to control a system in the presence of disturbances. Not only do we show that the 
controller can cope with disturbances, but we also show that the theoretically 
predicted cost matches the average cost obtained from Monte Carlo simulations.

Let us start by recalling the finite-horizon linear quadratic control framework. 
It is specified by a finite-horizon quadratic expected cost function

$$\mathbb{E}\left[\sum_{k=0}^{h-1}\left(x_k^{T}Qx_k+u_k^TRu_k\right)+x_h^T Q_hx_h\right] $$

and a linear model

$$ x_{k+1}=Ax_k+Bu_k+w_k$$

where $k\in\{0,\dots,h-1\}$ and $w_k$ is the disturbance at time $k$. Each 
$w_k$ is a zero-mean random variable with finite covariance whose probability 
distribution is otherwise arbitrary. The only further assumption is that, for 
different time steps $k_1$ and $k_2$, $w_{k_1}$ and $w_{k_2}$ are independent 
random variables.

The optimal policy is the same as in the case where the disturbances are identically 
zero for every time step, i.e., 

$$u_k=K_kx_k$$

where

$$K_k = -(R+B^T P_{k+1}B)^{-1}B^TP_{k+1}A$$

and the $P_k$s are obtained by iterating, for $k\in \{h-1,h-2,\dots,0\}$,

$$P_k = A^TP_{k+1}A + Q - A^TP_{k+1}B(R+B^TP_{k+1}B)^{-1}B^TP_{k+1}A$$

with boundary condition $P_h = Q_h$. Interestingly, for zero-mean disturbances 
the expected cost under this policy can be computed in closed form:

$$J_{0}\left(x_{0}\right)=x_{0}^{T} P_{0} x_{0}+\sum_{k=0}^{h-1} \operatorname{trace}\left(P_{k+1}\,
\mathbb{E}\left[w_{k} w_{k}^{T}\right]\right)$$
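
The trace term can be recovered with a short calculation. As a sketch, assume the cost-to-go at time $k+1$ has the form $x^TP_{k+1}x+c_{k+1}$ for some constant $c_{k+1}$ (a symbol introduced here only for illustration), and that $w_k$ has zero mean and is independent of $x_k$. Writing $z=Ax_k+Bu_k$:

$$\begin{aligned}
\mathbb{E}\left[(z+w_k)^TP_{k+1}(z+w_k)\right]
&= z^TP_{k+1}z + 2z^TP_{k+1}\,\mathbb{E}[w_k] + \mathbb{E}\left[w_k^TP_{k+1}w_k\right]\\
&= z^TP_{k+1}z + \operatorname{trace}\left(P_{k+1}\,\mathbb{E}\left[w_kw_k^T\right]\right).
\end{aligned}$$

The last step uses $w^TPw=\operatorname{trace}(Pww^T)$. The disturbance therefore adds a constant that does not depend on $u_k$, which is why the gains $K_k$ are unchanged while the cost grows by one trace term per time step.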

The following example uses a double integrator model to compare the theoretical 
cost with the cost obtained from Monte Carlo simulations. For the double integrator, 
the state $x_k$ comprises position $y_k$ and velocity $v_k$, i.e., 
$x_k= [y_k \ \ v_k]^T$, and the control input $u_k$ is a force. Discretizing 
the equation $\ddot{y}=u$ with sampling period $\tau$ (zero-order hold discretization), 
we obtain

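$$A=e^{A_c\tau}=\exp\left(\begin{bmatrix} 0 & \tau \\ 0 & 0 \end{bmatrix}\right)=\begin{bmatrix} 1 & \tau \\ 0 & 1 \end{bmatrix},
\qquad B = \int_0^\tau e^{A_cs}\,ds\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{\tau^2}{2} \\ \tau \end{bmatrix},$$

where $A_c=\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ is the continuous-time state matrix.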
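
These closed-form matrices can be checked numerically. The sketch below is one possible check, not part of the original workflow: it builds the zero-order-hold matrices from the exponential of an augmented matrix using `scipy.linalg.expm`. The names `A_closed`, `B_closed`, `A_num`, and `B_num` are introduced here for illustration, and $\tau=0.2$ is assumed.

```python
import numpy as np
from scipy.linalg import expm

tau = 0.2  # assumed sampling period
Ac = np.array([[0.0, 1.0], [0.0, 0.0]])  # continuous-time double integrator
Bc = np.array([[0.0], [1.0]])

# Closed-form expressions for the discretized model
A_closed = np.array([[1.0, tau], [0.0, 1.0]])
B_closed = np.array([[tau**2 / 2], [tau]])

# Zero-order hold via an augmented matrix exponential:
# expm([[Ac, Bc], [0, 0]] * tau) = [[A, B], [0, 1]]
M = np.zeros((3, 3))
M[:2, :2] = Ac
M[:2, 2:] = Bc
Md = expm(M * tau)
A_num, B_num = Md[:2, :2], Md[:2, 2:]

print(np.allclose(A_num, A_closed), np.allclose(B_num, B_closed))
```

Because the augmented matrix is nilpotent for the double integrator, the exponential series terminates after the quadratic term, which is where the $\tau^2/2$ entry of $B$ comes from.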

A zero-mean disturbance, uniformly distributed on $[-0.5, 0.5]$, is added to the 
velocity at step $k=10$, so it first affects the state at time $k=11$. Both the 
probability distribution of the disturbances and the times at which they are 
applied to the system can be modified.
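
On the theoretical side, changing the distribution or the timing only changes which covariances enter the trace terms of the cost formula. The sketch below illustrates this under stated assumptions: the helper names `riccati` and `expected_cost`, the dictionary `covs` mapping a step $k$ to $\mathbb{E}[w_kw_k^T]$, and the covariance $0.01 I$ used in the second schedule are all hypothetical choices made for this example.

```python
import numpy as np

def riccati(A, B, Q, R, Qh, h):
    """Backward Riccati recursion; returns P[0..h] and gains K[0..h-1]."""
    P = [None] * (h + 1)
    K = [None] * h
    P[h] = Qh
    for k in range(h - 1, -1, -1):
        G = R + B.T @ P[k + 1] @ B
        K[k] = -np.linalg.solve(G, B.T @ P[k + 1] @ A)
        # Equivalent to the Riccati update, since K[k] = -G^{-1} B' P A
        P[k] = A.T @ P[k + 1] @ A + Q + A.T @ P[k + 1] @ B @ K[k]
    return P, K

def expected_cost(P, x0, covs):
    """x0' P[0] x0 + sum over k of trace(P[k+1] @ covs[k]), zero-mean w_k assumed."""
    cost = (x0.T @ P[0] @ x0).item()
    for k, W in covs.items():
        cost += np.trace(P[k + 1] @ W)
    return cost

tau, h = 0.2, 100
A = np.array([[1.0, tau], [0.0, 1.0]])
B = np.array([[tau**2 / 2], [tau]])
Q, R, Qh = np.eye(2), np.array([[1.0]]), np.eye(2)
x0 = np.array([[1.0], [0.0]])
P, K = riccati(A, B, Q, R, Qh, h)

# Schedule 1: uniform disturbance on the velocity at step k=10 only
single = {10: np.diag([0.0, 1 / 12])}
# Schedule 2 (hypothetical): covariance 0.01*I at every step
every = {k: 0.01 * np.eye(2) for k in range(h)}

print(expected_cost(P, x0, {}), expected_cost(P, x0, single), expected_cost(P, x0, every))
```

Only the covariance of each $w_k$ matters for the expected cost, so two zero-mean distributions with the same covariance give the same theoretical value.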


In [None]:
import numpy as np
from scipy import signal
import scipy.linalg
import matplotlib.pyplot as plt

In [None]:
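# System parameters and zero-order-hold discretization
Ac = np.array([[0, 1],[0, 0]])   # continuous-time double integrator
Bc = np.array([[0],[1]])
tau = 0.2                        # sampling period
# C and D do not influence A and B, so zero placeholders are passed
A, B, C = signal.cont2discrete((Ac, Bc, np.zeros((1, 2)), np.zeros((1, 1))), tau)[:3]
Q = np.array([[1, 0], [0, 1]])   # state weight
S = np.array([[0],[0]])          # cross-term weight (zero here)
R = 1                            # input weight
QT = np.array([[1, 0],[0, 1]])   # terminal weight Q_h
h = 100                          # horizon length
P = [np.zeros((2,2)) for _ in range(h+1)]
P[-1] = QT                       # boundary condition P_h = Q_h
K = [np.zeros((1,2)) for _ in range(h)]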


# Backward Riccati recursion (matrix products throughout; S is the zero cross term)
for k in range(h-1,-1,-1):
    P[k] = A.T@P[k+1]@A + Q - (S+A.T@P[k+1]@B)@np.linalg.pinv(R+B.T@P[k+1]@B)@(S.T+B.T@P[k+1]@A)
    K[k] = -np.linalg.pinv(R+B.T@P[k+1]@B)@(S.T+B.T@P[k+1]@A)


# Monte Carlo simulation: the disturbance enters at step k=10 and first affects x[:, 11]
x0 = np.array([[1],[0]])
t = tau*np.arange(0,h+1)
n = 2
n_mc = 5000                      # number of Monte Carlo runs
x = np.zeros((n,h+1))
x[:,[0]] = x0
u = np.zeros((1,h))
np.random.seed(1)
cost_ = np.zeros(n_mc)
a = np.zeros(n_mc)
for imc in range(n_mc):
    a[imc] = np.random.random_sample()-0.5   # uniform on [-0.5, 0.5), zero mean
    for k in range(h):
        u[:,[k]] = K[k]@x[:,[k]]
        x[:,[k+1]] = A@x[:,[k]]+B@u[:,[k]]
        if k == 10:
            x[1,k+1] += a[imc]               # disturbance acts on the velocity only
        # stage cost x_k' Q x_k + u_k' R u_k
        cost_[imc] += (x[:,[k]].T@Q@x[:,[k]] + u[:,[k]].T*R*u[:,[k]]).item()
    # terminal cost x_h' Q_h x_h
    cost_[imc] += (x[:,[h]].T@QT@x[:,[h]]).item()

Compare the experimental and theoretical costs.

Experimental cost (sample mean over the Monte Carlo runs)

In [None]:
np.mean(cost_)

Theoretical cost. The variance of a uniform random variable on [-0.5, 0.5] is 1/12, and the disturbance at step k=10 is weighted by P[11].
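
For reference, the value 1/12 follows from a one-line integral. With $a$ uniform on $[-0.5, 0.5]$ (density 1, zero mean) and the disturbance vector taken as $w=[0 \ \ a]^T$, as assumed in this example:

$$\mathbb{E}[a^2]=\int_{-1/2}^{1/2}a^2\,da=\left[\frac{a^3}{3}\right]_{-1/2}^{1/2}=\frac{1}{12},
\qquad \mathbb{E}\left[ww^T\right]=\begin{bmatrix}0&0\\0&\frac{1}{12}\end{bmatrix}.$$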

In [None]:
x0.T @ P[0] @ x0 + np.trace(P[11] @ np.array([[0, 0], [0, 1/12]]))
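
Since the experimental cost is a sample mean, its agreement with the theoretical value is best judged against the Monte Carlo standard error. The following self-contained sketch is one way to do this, not the notebook's own procedure: it simulates all runs at once by storing them as columns of a matrix. The names `rng`, `n_runs`, `X`, `costs`, `stderr`, and `theory`, as well as the choice of 20000 runs, are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
tau, h, n_runs = 0.2, 100, 20000
A = np.array([[1.0, tau], [0.0, 1.0]])
B = np.array([[tau**2 / 2], [tau]])
Q, R, Qh = np.eye(2), np.array([[1.0]]), np.eye(2)
x0 = np.array([[1.0], [0.0]])

# Backward Riccati recursion for P[0..h] and gains K[0..h-1]
P = [None] * (h + 1)
K = [None] * h
P[h] = Qh
for k in range(h - 1, -1, -1):
    G = R + B.T @ P[k + 1] @ B
    K[k] = -np.linalg.solve(G, B.T @ P[k + 1] @ A)
    P[k] = A.T @ P[k + 1] @ A + Q + A.T @ P[k + 1] @ B @ K[k]

# Each column of X is the current state of one independent run
X = np.tile(x0, (1, n_runs))
costs = np.zeros(n_runs)
for k in range(h):
    U = K[k] @ X                                   # inputs, shape (1, n_runs)
    costs += np.sum(X * (Q @ X), axis=0) + R[0, 0] * U[0] ** 2
    X = A @ X + B @ U
    if k == 10:                                    # disturbance on the velocity
        X[1] += rng.uniform(-0.5, 0.5, size=n_runs)
costs += np.sum(X * (Qh @ X), axis=0)              # terminal cost

mean = costs.mean()
stderr = costs.std(ddof=1) / np.sqrt(n_runs)       # standard error of the mean
theory = (x0.T @ P[0] @ x0).item() + P[11][1, 1] / 12
print(mean, stderr, theory)
```

If the simulation and the formula are consistent, the difference between `mean` and `theory` should be within a few multiples of `stderr`, and it should shrink roughly like $1/\sqrt{N}$ as the number of runs $N$ grows.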

In [None]:
td = np.arange(0,h+1)*tau

# position trajectory of the last Monte Carlo run
f = plt.figure()
ax = plt.gca()
ax.plot(td, x[0])
ax.grid(True)
ax.set_xlabel('t');
ax.set_ylabel('y(t)');

In [None]:
# velocity trajectory of the last Monte Carlo run
f = plt.figure()
ax = plt.gca()
ax.plot(td, x[1])
ax.grid(True)
ax.set_xlabel('t');
ax.set_ylabel('v(t)');

In [None]:
# control input of the last Monte Carlo run
f = plt.figure()
ax = plt.gca()
ax.plot(td[:-1],u[0])
ax.grid(True)
ax.set_xlabel('t');
ax.set_ylabel('u(t)');

In [None]:
f, ax = plt.subplots(1,3)
ax[0].plot(td, x[0])
ax[0].set_xlabel('t');
ax[0].set_ylabel('y(t)');
ax[0].grid(True)
ax[0].set_xlim(0, h*tau)
ax[0].set_ylim(-2, 1)
ax[1].plot(td, x[1])
ax[1].set_xlabel('t');
ax[1].set_ylabel('v(t)');
ax[1].grid(True)
ax[2].plot(td[:-1], u[0])
ax[2].set_xlabel('t');
ax[2].set_ylabel('u(t)');
ax[2].grid(True)