# Linear Quadratic Control with Input Constraints
In this notebook we use the linear quadratic regulator framework to control a double integrator system in the presence of input constraints.

Let us start by presenting the framework. As in linear quadratic control, we have a finite-horizon quadratic cost function

$$\sum_{k=0}^{h-1}\left(x_k^TQx_k+u_k^TRu_k\right)+x_h^TQ_hx_h$$

and a linear model

$$x_{k+1}=Ax_k+Bu_k$$

where $k\in\{0,\dots,h-1\}$. Suppose that the input is scalar, $u_k \in \mathbb{R}$. In addition, we now impose the input constraints

$$|u_k|\leq c, \quad k\in\{0,\dots,h-1\}$$

for a given positive constant $c$.

We want to find a control policy to minimize the cost function.

Let us first show that we can formulate this problem as a quadratic programming problem.

Define

$$\bar{x} = [x_1^T \ \ x_2^T \dots x_h^T ]^T$$

and

$$\bar{u} = [u_0 \ \ u_1 \dots u_{h-1} ]^T$$

Then, up to the term $x_0^TQx_0$, which does not depend on the inputs, we can write the cost function as

$$(\bar{A}x_0+\bar{B}\bar{u})^T \bar{Q}(\bar{A}x_0+\bar{B}\bar{u})+\bar{u}^T\bar{R}\bar{u}$$

and the dynamic model linear equations as

$$\bar{x}=\bar{A}x_0+\bar{B}\bar{u}$$

where

$$\bar{A}=\left[\matrix{ A \cr A^2 \cr \dots \cr A^h}\right] \bar{B}=\left[\matrix{ 
B & 0 & 0 & 0 &\dots & 0 \cr AB & B & 0 & 0 &\dots & 0 \cr A^2B & AB & B & 0 
&\dots & 0 \cr A^3B & A^2B & AB & B &\dots & 0 \cr \dots & \dots & \dots & \dots 
&\dots & \dots\cr A^{h-1}B & A^{h-2}B & A^{h-3}B & \dots &\dots & B}\right]$$ 

$$\bar{Q}=\left[\matrix{ Q & 0 & 0  &\dots & 0 \cr 0 & Q & 0 &\dots & 0 \cr\dots 
& \dots & \dots & \dots &\dots \cr 0 & 0  & 0 &\dots & Q_h}\right] \bar{R}=\left[\matrix{ 
R & 0 & 0  &\dots & 0 \cr 0 & R & 0 &\dots & 0 \cr\dots & \dots & \dots & \dots 
&\dots \cr0 & 0  & 0 &\dots & R}\right]$$
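
Expanding the stacked cost shows the quadratic-programming structure directly. The symbols $G$ and $g$ are introduced here only as shorthand (they play the role of `G` and `a` in the code further down), and $\bar{Q}$ is assumed symmetric:

$$(\bar{A}x_0+\bar{B}\bar{u})^T \bar{Q}(\bar{A}x_0+\bar{B}\bar{u})+\bar{u}^T\bar{R}\bar{u} = \bar{u}^TG\bar{u}+2g^T\bar{u}+x_0^T\bar{A}^T\bar{Q}\bar{A}x_0$$

$$G=\bar{R}+\bar{B}^T\bar{Q}\bar{B}, \quad g=\bar{B}^T\bar{Q}\bar{A}x_0$$

The last term does not depend on $\bar{u}$, so minimizing the cost is equivalent to minimizing $\frac{1}{2}\bar{u}^TG\bar{u}+g^T\bar{u}$.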

Moreover, the inequality constraints can be written as

$$\left[\matrix{ I \cr -I}\right]\bar{u}\leq c \left[\matrix{ 1 \cr 1 \cr 
\dots\cr 1}\right]$$
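
As a sanity check on the stacked formulation, the sketch below builds $\bar{A}$ and $\bar{B}$ and compares $\bar{A}x_0+\bar{B}\bar{u}$ with a step-by-step rollout of the recursion. The helper name `stack_prediction_matrices` and the numerical values are illustrative assumptions, not part of the notebook's code.

```python
import numpy as np

def stack_prediction_matrices(A, B, h):
    """Return (Abar, Bbar) such that [x_1; ...; x_h] = Abar @ x0 + Bbar @ ubar."""
    n, m = B.shape
    Abar = np.zeros((n * h, n))
    Bbar = np.zeros((n * h, m * h))
    for i in range(h):
        # block row i holds A^(i+1)
        Abar[i * n:(i + 1) * n, :] = np.linalg.matrix_power(A, i + 1)
        for j in range(i + 1):
            # block (i, j) holds A^(i-j) B; blocks above the diagonal stay zero
            Bbar[i * n:(i + 1) * n, j * m:(j + 1) * m] = np.linalg.matrix_power(A, i - j) @ B
    return Abar, Bbar

# Illustrative system and input sequence (assumed values)
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.02], [0.2]])
h = 5
x0 = np.array([1.0, 0.0])
ubar = np.array([0.25, -0.25, 0.1, 0.0, -0.1])

Abar, Bbar = stack_prediction_matrices(A, B, h)
xbar = Abar @ x0 + Bbar @ ubar

# Roll x_{k+1} = A x_k + B u_k forward and stack x_1, ..., x_h
x = x0.copy()
rollout = []
for k in range(h):
    x = A @ x + B[:, 0] * ubar[k]
    rollout.append(x)
rollout = np.concatenate(rollout)

# Stacked input constraints [I; -I] ubar <= c * 1
c = 0.25
C_ineq = np.vstack((np.eye(h), -np.eye(h)))
feasible = bool(np.all(C_ineq @ ubar <= c * np.ones(2 * h)))

print(np.allclose(xbar, rollout), feasible)
```

The two block rows of the constraint matrix encode $\bar{u}\leq c$ and $-\bar{u}\leq c$, which together are equivalent to $|u_k|\leq c$ for every $k$.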

Let us test these ideas for the double integrator example. For this example, the state $x_k$ comprises position $y_k$ and velocity $v_k$, i.e., $x_k= [y_k \ \ v_k]^T$, and the control input $u_k$ is force. After discretizing the equation $\ddot{y}=u$ at a sampling period $\tau$ (zero order hold discretization) we obtain

$$A=e^{A_c\tau}=\left[\matrix{1 & \tau \cr 0 & 1}\right], \quad B=\int_0^\tau e^{A_cs}ds\,B_c=\left[\matrix{\frac{\tau^2}{2} \cr \tau}\right], \quad \text{with } A_c=\left[\matrix{0 & 1 \cr 0 & 0}\right],\ B_c=\left[\matrix{0 \cr 1}\right].$$
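
The closed-form expressions can be checked numerically. The sketch below uses `scipy.signal.cont2discrete` with the zero-order-hold method and compares the result with the formulas above; the value of `tau` and the variable names are illustrative assumptions.

```python
import numpy as np
from scipy import signal

tau = 0.2  # sampling period (illustrative value)

# Continuous-time double integrator: y'' = u, state [y, v]
Ac = np.array([[0.0, 1.0], [0.0, 0.0]])
Bc = np.array([[0.0], [1.0]])
Cc = np.zeros((1, 2))   # output matrices are not needed here; zeros as placeholders
Dc = np.zeros((1, 1))

# Zero-order-hold discretization; the call returns (Ad, Bd, Cd, Dd, dt)
Ad, Bd, *_ = signal.cont2discrete((Ac, Bc, Cc, Dc), tau, method="zoh")

# Closed-form expressions from the text
A_closed = np.array([[1.0, tau], [0.0, 1.0]])
B_closed = np.array([[tau**2 / 2], [tau]])

print(np.allclose(Ad, A_closed), np.allclose(Bd, B_closed))
```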

To test these ideas, run the Python code below with parameters

`optpolicy=1, fdisturbances=0`

if there are no constraints, or with

`optpolicy=2, fdisturbances=0`

if there are constraints (the code also supports other options, discussed in the sequel).

In [None]:
import numpy as np
import scipy.optimize
from scipy import signal
import matplotlib.pyplot as plt
%matplotlib ipympl

In [None]:
def solve_qp_scipy(H,f,A,b):
    # Minimize 0.5*x'Hx + f'x subject to A x <= b with a general-purpose SciPy solver
    f = np.ravel(f)  # accept the linear term as a row or column vector
    def f_to_min(x):
        return 0.5 * x @ H @ x + f @ x
    
    # one-sided constraints: -inf <= A x <= b
    lbnd = np.full(b.shape[0], -np.inf)
    ubnd = np.ravel(b)
    constraints=scipy.optimize.LinearConstraint(A,lb=lbnd,ub=ubnd)

    result = scipy.optimize.minimize(f_to_min, x0=np.zeros(H.shape[0]) ,constraints=constraints)

    return result

In [None]:
def quadconstrainedcontrol(A,B,Q,R,L,h,x0,opt):
    # opt = 1: CEC (h is the number of remaining steps)
    # opt = 2: MPC (h is the prediction horizon)
    # L is the input bound, i.e. |u| <= L
    if opt not in (1, 2):
        raise ValueError("opt must be 1 (CEC) or 2 (MPC)")
    
    n = A.shape[0]
    m = B.shape[1]
    # stacked prediction matrices Abar, Bbar and weight matrices Qbar, Rbar
    Abar = np.zeros((n*h,n))
    Bbar = np.zeros((n*h,m*h))
    Qbar = np.zeros((n*h,n*h))
    Rbar = np.zeros((m*h,m*h))
    
    for i in range(h):
        Abar[i*n:(i+1)*n,:] = np.linalg.matrix_power(A,i+1)
        for j in range(i+1):
            Bbar[i*n:(i+1)*n,j*m:(j+1)*m] = np.linalg.matrix_power(A,(i-j)) @ B
            
        # Q is also used for the last block (terminal weight); in this notebook Q_h = Q
        Qbar[i*n:(i+1)*n,i*n:(i+1)*n]  = Q
        Rbar[i*m:(i+1)*m,i*m:(i+1)*m]  = R    
    
    # QP data: minimize 0.5*u'Gu + a u  subject to  [I; -I] u <= L
    G = Rbar+Bbar.T@Qbar@Bbar
    a = ((Bbar.T@Qbar@Abar)@x0).T
    C_ = np.vstack((np.eye(m*h),-np.eye(m*h)))
    b = np.vstack((L*np.ones((m*h,1)),L*np.ones((m*h,1))))
    # Both options solve the same inequality-constrained QP and differ only in the horizon h.
    # The terminal equality constraint hinted at in the original MPC branch (Aeq, Beq) is not
    # enforced, since solve_qp_scipy only handles inequality constraints.
    solution = solve_qp_scipy(G, a, C_, b)
    u = solution.x
            
    return u

In [None]:
# Model definition

optpolicy = 1
fdisturbances = 0
# double integrator model 
Ac = np.array([[0, 1],[0, 0]])
Bc = np.array([[0],[1]])
tau = 0.2
Q = np.array([[1, 0], [0, 1]])
S = np.array([[0],[0]])
R = 0.01
QT = np.array([[1, 0], [0 ,1]])
n = 2
m = 1
x0 = np.array([[1],[0]])
c = 0.25
sigmaw = 0.01  # disturbance level
H = 25         # prediction horizon for mpc
h = 50         # simulation horizon
A, B, C = signal.cont2discrete((Ac, Bc, np.zeros((1,n)), np.array([0])), tau)[:3]

# Preliminaries
P = [np.zeros(A.shape) for idx in range(h+1)]
K = [np.zeros(A.shape) for idx in range(h+1)]
if optpolicy==1:
    P[h] = QT
    for k in range(h)[::-1]: # Riccati equations
        P[k] = A.T@P[k+1]@A + Q - (S+A.T@P[k+1]@B)@np.linalg.pinv(R+B.T@P[k+1]@B)@(S.T+B.T@P[k+1]@A)
        K[k] = -np.linalg.pinv(R+B.T@P[k+1]@B)@(S.T+B.T@P[k+1]@A)
elif optpolicy==2:
    U2 = quadconstrainedcontrol(A,B,Q,R,c,h,x0,1)
elif optpolicy==4:
    U3 = np.vstack((np.atleast_2d(quadconstrainedcontrol(A,B,Q,R,c,H,x0,2)).T, np.zeros((h-H,1))))

In [None]:
# Simulate system
t = tau*np.arange(h+1)
x = np.zeros((n,h+1))
u = np.zeros((1,h))
x[:,[0]] = x0

np.random.seed(15)

if fdisturbances == 1:
    gainnodisturbances = 1
else:
    gainnodisturbances = 0
 
cost = 0

w = np.zeros((2,h))
for k in range(h):
    
    if optpolicy==1:   # LQR
        u[:,[k]] = K[k]@x[:,[k]]
    elif optpolicy==2: # CEC no disturbances
        u[:,[k]] = U2[k]
    elif optpolicy==3: # CEC 
        u_ = quadconstrainedcontrol(A,B,Q,R,c,h-k,x[:,[k]],1)  # h-k steps remain at time k
        u[:,[k]] = u_[0]
    elif optpolicy==4: # MPC no disturbances
        u[:,[k]] = U3[k]
    elif optpolicy==5: # MPC 
        if (k+1) <= (h-H-1):
            u_ = quadconstrainedcontrol(A,B,Q,R,c,H,x[:,[k]],2)
            u[:,[k]] = u_[0]
        else:
            u_ = quadconstrainedcontrol(A,B,Q,R,c,h-k,x[:,[k]],1)
            u[:,[k]] = u_[0]
            
    w[:,[k]] = B*sigmaw*np.random.randn()*gainnodisturbances
    x[:,[k+1]] = A@x[:,[k]] + B*u[:,[k]] + w[:,[k]]
    cost += float(x[:,k]@Q@x[:,k] + R*u[0,k]**2)  # stage cost x_k'Qx_k + u_k'Ru_k

cost += float(x[:,h]@QT@x[:,h])  # terminal cost x_h'Q_h x_h
v = x[0,:]

In [None]:
# Continuous-time simulation and plots
N = 1000
ts = tau/N
nl = h*N
uc = np.kron(u,np.ones((1,N)))
Ad, Bd = signal.cont2discrete((Ac, Bc, np.zeros((1,Ac.shape[0])),np.zeros((1,Bc.shape[1]))), ts)[:2]
xc = np.zeros((n,nl+1))
xc[:,0] = x[:,0]
for k in range(h):
    xc[:,[k*N]] = x[:,[k]]
    for l in range(N):
        xc[:,[k*N+l+1]] = Ad@xc[:,[k*N+l]]+Bd*u[:,[k]]
    
tc = ts*np.arange(nl+1)

In [None]:
fig, axes = plt.subplots(1, 3)
ax = axes[0]
ax.plot(tc,xc[0,:])
ax.set_xlabel('t')
ax.set_ylabel('y(t)')
ax.grid(True)
ax.set(xlim=(0, h*tau), ylim=(-0.2, 1))
ax = axes[1]
ax.plot(tc,xc[1,:])
ax.set_xlabel('t')
ax.set_ylabel('v(t)')
ax.grid(True)
ax = axes[2]
ax.plot(tc[:-1],uc[0])
ax.grid(True)
ax.set_xlabel('t')
ax.set_ylabel('u(t)');

If you run the code with

```
optpolicy=2, fdisturbances=1 
```
 
which means disturbances will be added, you will notice undesired behavior. This is because the input sequence is computed only once, at time zero, and is then applied without feedback. If we instead recompute it at every time step (assuming there will be no future disturbances, i.e., in a certainty-equivalent fashion), we obtain the desired behavior. To do this, we need to readjust the problem solved at every time step.

Define, at each time step $k$

$$\bar{x} = [x_{k+1}^T \ \ x_{k+2}^T \dots x_h^T ]^T$$

and

$$\bar{u} = [u_k \ \ u_{k+1} \dots u_{h-1} ]^T$$

Then, we can write the cost function from the current step until the terminal step

$$\sum_{\ell=k}^{h-1}\left(x_\ell^TQx_\ell+u_\ell^TRu_\ell\right)+x_h^TQ_hx_h$$

as (up to the term $x_k^TQx_k$, which does not depend on the inputs)

$$(\bar{A}x_k+\bar{B}\bar{u})^T \bar{Q}(\bar{A}x_k+\bar{B}\bar{u})+\bar{u}^T\bar{R}\bar{u}$$

and the dynamic model linear equations as

$$\bar{x}=\bar{A}x_k+\bar{B}\bar{u}$$

where now

$$\bar{A}=\left[\matrix{ A \cr A^2 \cr \dots \cr A^{h-k}}\right] \bar{B}=\left[\matrix{ 
B & 0 & 0 & 0 &\dots & 0 \cr AB & B & 0 & 0 &\dots & 0 \cr A^2B & AB & B & 0 
&\dots & 0 \cr A^3B & A^2B & AB & B &\dots & 0 \cr \dots & \dots & \dots & \dots 
&\dots & \dots\cr A^{h-1-k}B & A^{h-2-k}B & A^{h-3-k}B & \dots &\dots & B}\right]$$ 

$$\bar{Q}=\left[\matrix{ Q & 0 & 0  &\dots & 0 \cr 0 & Q & 0 &\dots & 0 \cr\dots 
& \dots & \dots & \dots &\dots \cr0 & 0  & 0 &\dots & Q_h}\right] \bar{R}=\left[\matrix{ 
R & 0 & 0  &\dots & 0 \cr 0 & R & 0 &\dots & 0 \cr\dots & \dots & \dots & \dots 
&\dots \cr0 & 0  & 0 &\dots & R}\right]$$

Moreover, the inequality constraints can still be written as

$$\left[\matrix{ I \cr -I}\right]\bar{u}\leq c \left[\matrix{ 1 \cr 1 \cr \dots\cr 1}\right]$$

If we now consider additive disturbances by running the code with

```
optpolicy=3, fdisturbances=1 
```
 
we obtain the desired behavior.
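
The difference between computing the inputs once and re-planning at every step can be sketched in a few self-contained lines. This is a minimal illustration under stated assumptions, not the notebook's implementation: the helper names `plan` and `simulate` are hypothetical, the box constraint is passed as `bounds` to `scipy.optimize.minimize` with the L-BFGS-B method instead of the `LinearConstraint` formulation used above, the terminal weight is taken equal to $Q$, and the horizon and disturbance level are chosen only to keep the run short.

```python
import numpy as np
from scipy.optimize import minimize

def plan(A, B, Q, R, c, x, horizon):
    """Inputs minimising the stacked cost over `horizon` steps, subject to |u| <= c."""
    n, m = B.shape
    Abar = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(horizon)])
    Bbar = np.zeros((n * horizon, m * horizon))
    for i in range(horizon):
        for j in range(i + 1):
            Bbar[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Qbar = np.kron(np.eye(horizon), Q)  # assumption: terminal weight Q_h equals Q
    Rbar = np.kron(np.eye(horizon), R)
    G = Rbar + Bbar.T @ Qbar @ Bbar
    g = Bbar.T @ Qbar @ Abar @ x
    res = minimize(lambda u: 0.5 * u @ G @ u + g @ u, np.zeros(m * horizon),
                   jac=lambda u: G @ u + g, method="L-BFGS-B",
                   bounds=[(-c, c)] * (m * horizon))
    return res.x

# Illustrative values (assumed)
tau, h, c, sigma = 0.2, 20, 0.25, 0.05
A = np.array([[1.0, tau], [0.0, 1.0]])
B = np.array([[tau**2 / 2], [tau]])
Q, R = np.eye(2), np.array([[0.01]])
x0 = np.array([1.0, 0.0])

def simulate(replan, seed=15):
    rng = np.random.default_rng(seed)
    x, cost, inputs = x0.copy(), 0.0, []
    u_open = plan(A, B, Q, R, c, x0, h)   # computed once at time zero
    for k in range(h):
        # re-planning uses the measured state and the h - k remaining steps
        u = plan(A, B, Q, R, c, x, h - k)[0] if replan else u_open[k]
        cost += x @ Q @ x + R[0, 0] * u**2
        w = sigma * rng.standard_normal()  # disturbance entering through B
        x = A @ x + B[:, 0] * (u + w)
        inputs.append(u)
    return cost + x @ Q @ x, np.array(inputs)

cost_open, u_open_loop = simulate(replan=False)
cost_replan, u_replan = simulate(replan=True)
print(f"computed once: {cost_open:.3f}, re-planned: {cost_replan:.3f}")
```

Both runs use the same random seed, so they face the same disturbance realization and the printed costs can be compared directly.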

Since the time horizon $h$ might be large, we might want to run the optimization over a shorter horizon $H$, in a model predictive control (MPC) fashion. In this case we reformulate the problem as follows.

Define, at each time step $k$

$$\bar{x} = [x_{k+1}^T \ \ x_{k+2}^T \dots x_{k+H}^T ]^T$$

and

$$\bar{u} = [u_k \ \ u_{k+1} \dots u_{k+H-1} ]^T$$

Then, we can write the cost function from the current step until $H$ steps ahead

$$\sum_{\ell=k}^{k+H-1}\left(x_\ell^TQx_\ell+u_\ell^TRu_\ell\right)+x_{k+H}^TQ_hx_{k+H}$$

as (up to the term $x_k^TQx_k$, which does not depend on the inputs)

$$(\bar{A}x_k+\bar{B}\bar{u})^T \bar{Q}(\bar{A}x_k+\bar{B}\bar{u})+\bar{u}^T\bar{R}\bar{u}$$

and the dynamic model linear equations as

$$\bar{x}=\bar{A}x_k+\bar{B}\bar{u}$$

where now

$$\bar{A}=\left[\matrix{ A \cr A^2 \cr \dots \cr A^{H}}\right] \bar{B}=\left[\matrix{ 
B & 0 & 0 & 0 &\dots & 0 \cr AB & B & 0 & 0 &\dots & 0 \cr A^2B & AB & B & 0 
&\dots & 0 \cr A^3B & A^2B & AB & B &\dots & 0 \cr \dots & \dots & \dots & \dots 
&\dots & \dots\cr A^{H-1}B & A^{H-2}B & A^{H-3}B & \dots &\dots & B}\right]$$ 

$$\bar{Q}=\left[\matrix{ Q & 0 & 0  &\dots & 0 \cr 0 & Q & 0 &\dots & 0 \cr\dots 
& \dots & \dots & \dots &\dots \cr0 & 0  & 0 &\dots & Q_h}\right] \bar{R}=\left[\matrix{ 
R & 0 & 0  &\dots & 0 \cr 0 & R & 0 &\dots & 0 \cr\dots & \dots & \dots & \dots 
&\dots \cr0 & 0  & 0 &\dots & R}\right]$$

Moreover, the inequality constraints can still be written as

$$\left[\matrix{ I \cr -I}\right]\bar{u}\leq c \left[\matrix{ 1 \cr 1 \cr \dots\cr 1}\right]$$

We can check the behavior of this policy without disturbances by picking

```
optpolicy=4, fdisturbances=0 
```
 
and with disturbances by picking

```
optpolicy=5, fdisturbances=1 
```
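
A receding-horizon loop with a fixed horizon $H$ can be sketched as follows. Because the system is time-invariant, the quadratic term of the horizon-$H$ problem does not depend on $k$ and the linear term is linear in $x_k$, so both can be precomputed once. The helper names `mpc_matrices` and `mpc_input` are hypothetical, the terminal weight is assumed equal to $Q$, the input is assumed scalar, and the bounded problem is solved with L-BFGS-B rather than with the notebook's `solve_qp_scipy`. Unlike option 5 above, this sketch keeps the full horizon $H$ all the way to the end of the simulation instead of switching to a shrinking horizon.

```python
import numpy as np
from scipy.optimize import minimize

def mpc_matrices(A, B, Q, R, H):
    """Return G, F so that the horizon-H cost is 0.5*u'Gu + (F x_k)'u + const."""
    n, m = B.shape
    Abar = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(H)])
    Bbar = np.zeros((n * H, m * H))
    for i in range(H):
        for j in range(i + 1):
            Bbar[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Qbar = np.kron(np.eye(H), Q)  # assumption: terminal weight Q_h equals Q
    Rbar = np.kron(np.eye(H), R)
    return Rbar + Bbar.T @ Qbar @ Bbar, Bbar.T @ Qbar @ Abar

def mpc_input(G, F, x, c):
    """First element of the bounded minimiser for the current state x (scalar input)."""
    g = F @ x
    res = minimize(lambda u: 0.5 * u @ G @ u + g @ u, np.zeros(G.shape[0]),
                   jac=lambda u: G @ u + g, method="L-BFGS-B",
                   bounds=[(-c, c)] * G.shape[0])
    return res.x[0]

# Values mirroring the notebook's parameters
tau, h, H, c, sigma = 0.2, 50, 25, 0.25, 0.01
A = np.array([[1.0, tau], [0.0, 1.0]])
B = np.array([[tau**2 / 2], [tau]])
Q, R = np.eye(2), np.array([[0.01]])

G, F = mpc_matrices(A, B, Q, R, H)  # computed once: they do not depend on k

rng = np.random.default_rng(15)
xs = np.zeros((h + 1, 2))
xs[0] = [1.0, 0.0]
us = np.zeros(h)
for k in range(h):
    us[k] = mpc_input(G, F, xs[k], c)            # apply only the first planned input
    w = sigma * rng.standard_normal()            # disturbance entering through B
    xs[k + 1] = A @ xs[k] + B[:, 0] * (us[k] + w)

print("final state:", xs[-1], "largest |u|:", np.abs(us).max())
```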