# Assignment 5
- **Assigned:** Tuesday, April 12
- **Due:** Tuesday, April 26 @ 5pm ET

In [2]:
import copy
import warnings
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import animation
import numpy as np

import scipy.linalg
import scipy.signal
from scipy.integrate import odeint

import ipywidgets as widgets 
from IPython.display import display, HTML, Math

from sympy import Matrix, latex, init_printing
init_printing()

# try:
#   import control
# except ModuleNotFoundError:
#   print("Could not find control - installing")
#   !pip install control # takes 20 sec
#   !pip install slycot # takes 2-3 min
# import control

float_formatter = "{:.6f}".format
np.set_printoptions(formatter={'float': '{: 0.6f}'.format})
np.set_printoptions(precision=6)
plt.rcParams["font.serif"] = "cmr12"
plt.rcParams["figure.dpi"] = 150

## Problem 1

Using dynamic programming, find the minimum-time path from point A to point B and its total travel time, moving only to the right. The travel time between points is shown on each segment. Compare the DP result (path and cost) with the "forward greedy" path, i.e., the one obtained by choosing the cheapest available segment at each node as you move from A to B, again moving only to the right to avoid repeated nodes.

<div align="center">
  <img src="http://drive.google.com/uc?export=view&id=1OPwMWH6co55i_EBtmdZj3Fpfy5nEadgO" alt="problem description" width="50%" />
</div>
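As a rough illustration of the two strategies being compared, the sketch below runs backward dynamic programming and a forward greedy search on a small right-moving graph. The node names and segment times are hypothetical placeholders, not the values from the figure, and the helper names `dp_shortest` and `greedy_path_cost` are made up for this example.

```python
# Hypothetical right-moving graph: edges[node] = {next_node: travel_time}.
# These times are placeholders, NOT the values shown in the assignment figure.
edges = {
    "A": {"C": 1, "D": 4},
    "C": {"E": 10, "F": 6},
    "D": {"E": 1, "F": 2},
    "E": {"B": 1},
    "F": {"B": 3},
    "B": {},
}


def dp_shortest(edges, start, goal):
    """Dynamic programming: compute the cost-to-go of every node, then read off the path."""
    cost_to_go = {goal: 0.0}
    best_next = {}

    def solve(node):
        if node not in cost_to_go:
            # Bellman recursion: segment time plus the optimal cost-to-go of the successor.
            cost, nxt = min((t + solve(n), n) for n, t in edges[node].items())
            cost_to_go[node], best_next[node] = cost, nxt
        return cost_to_go[node]

    total = solve(start)
    path = [start]
    while path[-1] != goal:
        path.append(best_next[path[-1]])
    return path, total


def greedy_path_cost(edges, start, goal):
    """Forward greedy: always take the cheapest outgoing segment."""
    path, total = [start], 0.0
    while path[-1] != goal:
        nxt, t = min(edges[path[-1]].items(), key=lambda kv: kv[1])
        path.append(nxt)
        total += t
    return path, total


dp_path, dp_cost = dp_shortest(edges, "A", "B")
greedy_path, greedy_cost = greedy_path_cost(edges, "A", "B")
print("DP:    ", dp_path, dp_cost)
print("Greedy:", greedy_path, greedy_cost)
```

On this placeholder graph the greedy path commits to a cheap first segment and pays for it later, which is the kind of gap the comparison is meant to expose.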

$
\renewcommand{\jmatc}[4]{\left[ \ba{cc} #1 & #2 \\[.5em] #3 & #4 \ea
\right]}
\newcommand{\jmatr}[4]{\left[ \ba{rr} #1 & #2 \\[.5em] #3 & #4 \ea
\right]}
\newcommand{\jmthr}[9]{\left[ \ba{rrr} #1 & #2 & #3 \\
#4 & #5 & #6 \\ #7 & #8 & #9\ea \right]}
\newcommand{\jvthr}[3]{\left[ \ba{r} #1 \\ #2 \\ #3\ea \right]}
\newcommand{\jhthr}[3]{\left[ \ba{rrr} #1 & #2 & #3\ea \right]}
\newcommand{\jvechc}[2]{\left[ \ba{cc} #1 & #2 \ea \right]}
\newcommand{\jcolc}[2]{\left[ \ba{c} #1 \\ #2 \ea \right]}
\newcommand{\jvecvc}[2]{\left[ \ba{c} #1 \\ #2 \ea \right]}
\newcommand{\bmat}{\begin{bmatrix}}
\newcommand{\emat}{\end{bmatrix}}
\newcommand{\expect}[1]{\expec\left[ #1 \right]}
\newcommand{\mb}{\mathbf}
\newcommand{\argmax}{\operatornamewithlimits{argmax}}
\newcommand{\argmin}{\operatornamewithlimits{argmin}}
\newcommand{\bea}{\begin{eqnarray}}
\newcommand{\eea}{\end{eqnarray}}
\newcommand{\beas}{\begin{eqnarray*}}
\newcommand{\eeas}{\end{eqnarray*}}
\newcommand{\ba}{\begin{array}}
\newcommand{\ea}{\end{array}}
\newcommand{\njbv}{\mathbf{v}}
\newcommand{\njbw}{\mathbf{w}}
\newcommand{\njbx}{\mathbf{x}}
\newcommand{\njby}{\mathbf{y}}
$

## Problem 2

We discussed in class that LQR is a great **regulator** in that
it quickly returns the system states to $0$ while balancing the amount
of control used. However, we are also interested in tracking a
reference command, so that $y(t)=r(t)$ as $t\to \infty$.

1. Design a steady state LQR controller for the system using
$R_{xx}=I_2$, $R_{uu}=0.01$
$$
\beas \dot x
= \jmatc{1}{1}{1}{2} x + \jvecvc{1}{0} u \qquad y=\jvechc{1}{0} x
\eeas
$$
A naive way to implement a reference tracker is to modify the LQR
controller from $u=-Kx$ to $u=r-Kx$:
$$
\dot x = (A-BK) x + Br  ~,~~ y=Cx
$$
Verify that this leads to particularly poor tracking of a step input! (e.g., using the [final value theorem](https://en.wikipedia.org/wiki/Final_value_theorem))

2. An alternative strategy is to use $u=Nr-Kx$, where $N$ is a
constant. What is a good way to choose $N$ to ensure zero steady state
error for this closed-loop system? What are the consequences of this
change in the step response of the closed-loop system?

3. A completely different approach to ensuring zero steady-state
error is to use what is often called an LQ-servo. The approach is to
add a new state to the system that integrates the tracking error,
$\dot x_i = r - y = r - C x$, giving:
$$
\beas
\left[\ba{c} \dot x  \\ \dot x_i \ea \right] &=& \jmatc{A}{0}{-C}{0} \left[\ba{c} x  \\ x_i \ea \right]  + \jvecvc{B}{0} u + \jvecvc{0}{1} r\\
y&=&\jvechc{C}{0} \left[\ba{c} x  \\ x_i \ea \right]
\eeas
$$
The LQR problem statement can now be modified (ignore $r$ in the design
of $u$) to place a high weighting on $x_i$ to penalize the tracking
error. Use this technique to design a new controller (keep $R_{xx}$ and
$R_{uu}$ the same as part (1) and tune the weight on $x_i$ to achieve a
performance that is similar to part (2)). Compare the transient
responses for the approaches in (2) and (3) - do you see any advantages
to one approach over the other?
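The two tracking strategies in parts (2) and (3) can be sketched numerically as follows. This is a minimal sketch, not a complete solution: the helper names `lqr_gain` and `dc_gain`, the name `N_ff` for the feedforward constant, and the integrator weight `q_i = 100` are assumptions introduced here, and `q_i` in particular would still need tuning. One common choice, used below, is to pick $N$ so that the closed-loop DC gain from $r$ to $y$ equals one.

```python
import numpy as np
import scipy.linalg

A = np.array([[1.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
Rxx = np.eye(2)
Ruu = np.array([[0.01]])


def lqr_gain(A, B, Q, R):
    """Steady-state LQR gain K = R^{-1} B^T P from the continuous-time ARE."""
    P = scipy.linalg.solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)


def dc_gain(Acl, Bcl, Ccl):
    """DC gain -C Acl^{-1} B of a stable system (final value of a unit step response)."""
    return (-Ccl @ np.linalg.solve(Acl, Bcl)).item()


# Part (2): u = N r - K x, with N chosen so the DC gain from r to y is one.
K = lqr_gain(A, B, Rxx, Ruu)
Acl = A - B @ K
g_naive = dc_gain(Acl, B, C)          # steady-state output for u = r - K x
N_ff = 1.0 / g_naive                  # assumed choice of the constant N
g_scaled = dc_gain(Acl, B * N_ff, C)  # DC gain after scaling the reference

# Part (3): LQ-servo, augmenting the state with x_i where dx_i/dt = r - C x.
A_aug = np.block([[A, np.zeros((2, 1))], [-C, np.zeros((1, 1))]])
B_aug = np.vstack([B, [[0.0]]])
B_ref = np.array([[0.0], [0.0], [1.0]])   # how r enters the augmented system
C_aug = np.hstack([C, [[0.0]]])
q_i = 100.0                               # hypothetical weight on x_i, to be tuned
Q_aug = scipy.linalg.block_diag(Rxx, q_i)
K_aug = lqr_gain(A_aug, B_aug, Q_aug, Ruu)
Acl_aug = A_aug - B_aug @ K_aug
g_servo = dc_gain(Acl_aug, B_ref, C_aug)

print("naive DC gain:", g_naive)
print("N:", N_ff, " scaled DC gain:", g_scaled)
print("LQ-servo DC gain:", g_servo)
```

The feedforward constant depends on the exact plant model, whereas the integrator state drives the steady-state error to zero whenever the augmented closed loop is stable. That difference is worth keeping in mind when comparing the transient responses.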

In [6]:
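# Part 1: solve the steady-state CARE
Rxx = np.eye(2)
Ruu = np.array([[0.01]])
A = np.array([[1, 1], [1, 2]])
B = np.array([[1], [0]])
C = np.array([[1, 0]])
P_ss = scipy.linalg.solve_continuous_are(A, B, Rxx, Ruu)
print(P_ss)

# Optimal LQR gain K = Ruu^{-1} B^T P
K = np.linalg.solve(Ruu, B.T @ P_ss)
print(K)

# Long-term response to a unit step reference with u = r - K x
r = 1.0
x = np.array([[1.0], [1.0]])   # initial state
dt = 0.01                      # discretization time step
n_steps = 1000                 # number of iterations (10 s of simulated time)

Acl = A - B @ K                                   # closed-loop dynamics matrix
Ad = scipy.linalg.expm(Acl * dt)                  # exact discretization of Acl
Bd = np.linalg.solve(Acl, (Ad - np.eye(2)) @ B)   # zero-order-hold input matrix

for k in range(n_steps):
    x = Ad @ x + Bd * r                           # propagate one time step

y_final = (C @ x).item()                          # simulated output at the final time
# Final value theorem: y_ss = -C Acl^{-1} B r, to be compared with r
y_ss = (-C @ np.linalg.solve(Acl, B) * r).item()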

[[0.153193 0.520217]
 [0.520217 6.255535]]
[[15.319337 52.021703]]



## Problem 3

LQG controllers belong to the special class of
model-based compensators, so called because they use the $A$,
$B_u$ and $C_y$ matrices explicitly in the compensator. While
LQG controllers might appear glamorous, they are actually
quite ordinary for SISO systems. For example, consider control
design for the second-order system with the transfer function
$$
G(s) = \frac{-s+3}{(-s+4)(s+3)}
$$
Convert the system $G(s)$ to a state-space representation and then
design an LQG controller for this system. In the process, address
the following issues:

1. What are reasonable choices for the weighting matrices? In
particular, what are the restrictions on the choices, and how are
they met by your selection?
2. Calculate the corresponding compensator $A_c$, $B_c$, and $C_c$
matrices. Where are the compensator poles and zeros?
3. Can you give a classical interpretation of how your state-space
controller stabilizes this system? Does this approach agree with
what you would have expected?
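One way to set up this design numerically is sketched below. It converts $G(s)$ to state space with `scipy.signal.tf2ss`, checks controllability and observability, and assembles a model-based compensator. The weights `Q`, `R`, `W`, `V` are hypothetical placeholders chosen only so the code runs, and the sign convention $u = -C_c x_c$ is an assumption of this sketch. Choosing and justifying the weights is the point of part (1).

```python
import numpy as np
import scipy.linalg
import scipy.signal

# G(s) = (-s + 3) / ((-s + 4)(s + 3))
num = [-1.0, 3.0]
den = np.polymul([-1.0, 4.0], [1.0, 3.0])
A, B, C, D = scipy.signal.tf2ss(num, den)

# Rank tests on the controllability and observability matrices.
n = A.shape[0]
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
obsv = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
print("open-loop poles:", np.linalg.eigvals(A))
print("rank ctrb, obsv:", np.linalg.matrix_rank(ctrb), np.linalg.matrix_rank(obsv))

# Hypothetical weights (placeholders, not a recommended selection).
Q, R = C.T @ C, np.array([[1.0]])   # regulator weights on state and control
W, V = B @ B.T, np.array([[1.0]])   # process and sensor noise intensities

# Regulator gain K and estimator gain L from the two Riccati equations.
P = scipy.linalg.solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
S = scipy.linalg.solve_continuous_are(A.T, C.T, W, V)
L = S @ C.T @ np.linalg.inv(V)

# Compensator: dx_c/dt = A_c x_c + B_c y, u = -C_c x_c (assumed sign convention).
A_c = A - B @ K - L @ C
B_c = L
C_c = K

# Closed loop of the plant and compensator, with state [x; x_c].
A_cl = np.block([[A, -B @ C_c], [B_c @ C, A_c]])
print("compensator poles:", np.linalg.eigvals(A_c))
print("closed-loop poles:", np.linalg.eigvals(A_cl))
```

The compensator zeros can be obtained from the same matrices, for example by converting $(A_c, B_c, C_c)$ back to a transfer function with `scipy.signal.ss2tf`.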

## Problem 4

Consider the system with dynamics
$$
\begin{align}
\dot{x} &=
\begin{bmatrix}0&1\\-6&-5\end{bmatrix}x + \begin{bmatrix}0\\1\end{bmatrix}u \\
z &= \begin{bmatrix}1&1\end{bmatrix}x
\end{align}
$$
and cost function
$$
J=\frac{1}{2}\int_0^\infty\left(z^2(t) + \rho u^2(t)\right)dt
$$

1. Find the control law $u(t)=-Fx(t)$ that minimizes the cost for $\rho=0.01$. Also report the solution to the Riccati equation and the closed-loop poles.

2. Plot the closed-loop response $z(t)$ and control input $u(t)$ for the initial conditions $x_1(0)=1$, $x_2(0)=0$.

3. Repeat parts (1) and (2) for the system
$$
\begin{align}
\dot{x} &=
\begin{bmatrix}0&1\\-6&-5\end{bmatrix}x + \begin{bmatrix}0\\1\end{bmatrix}u \\
z &= \begin{bmatrix}1&-1\end{bmatrix}x
\end{align}
$$
with the same cost function.

4. Plot/compare the state response in each closed-loop case. Also compare the performance variables $z_1(t)$ and $z_2(t)$. Which system has the better closed-loop performance? Explain why the gain $F$ and the closed-loop poles are the same, but the performance can be so different. **Hint:** check the observability.
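A possible starting point for the numerical side of this problem is sketched below: an observability rank test for each output matrix, and an LQR solve with the state weight $C^T C$ implied by the cost. The helper names `obsv_rank` and `lqr` are made up for this sketch. The plots and the interpretation are left out.

```python
import numpy as np
import scipy.linalg

A = np.array([[0.0, 1.0], [-6.0, -5.0]])
B = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 1.0]])    # z = x1 + x2
C2 = np.array([[1.0, -1.0]])   # z = x1 - x2
rho = 0.01


def obsv_rank(A, C):
    """Rank of the observability matrix [C; CA; ...; CA^(n-1)]."""
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
    return np.linalg.matrix_rank(O)


def lqr(A, B, C, rho):
    """LQR for the cost integral of z^2 + rho u^2, i.e. Q = C^T C and R = rho."""
    R = np.array([[rho]])
    P = scipy.linalg.solve_continuous_are(A, B, C.T @ C, R)
    F = np.linalg.solve(R, B.T @ P)
    poles = np.linalg.eigvals(A - B @ F)
    return F, P, poles


F1, P1, poles1 = lqr(A, B, C1, rho)
F2, P2, poles2 = lqr(A, B, C2, rho)
rank1, rank2 = obsv_rank(A, C1), obsv_rank(A, C2)

print("observability ranks:", rank1, rank2)
print("F1 =", F1, " poles:", poles1)
print("F2 =", F2, " poles:", poles2)
print("P1 =\n", P1, "\nP2 =\n", P2)
```

The factor of $1/2$ in $J$ scales the cost but not the minimizing gain, so it is omitted from the Riccati solve.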

## Problem 5

Given the cost
$$
J=\int_0^2 (u-t)^2 dt
$$
and dynamics $\dot x = u$, with $u(t) \leq 1$, $x(0)=1/8$, and
$x(2)=1$, the goal is to find the optimal control law.

1. Given the inequality constraint on $u$, we know that it is
possible that 2 types of subarcs might exist - one in which the $u$
constraint is active, and one in which it is not. Use the knowledge
of the properties of the optimal solution discussed in class to show
that, regardless of the number or order of these subarcs, the
costate $p(t)$ must be a constant for all $0 \leq t \leq 2$.

2. Show that an attempt to solve this problem using a solution in which
the control constraint is ignored gives an answer that actually
violates the constraints.

3. Similarly show that a solution in which the control
constraint is always active violates the boundary conditions.

4. The results in (1)--(3) suggest
that a likely solution is one that has an inactive subarc followed
by an active one. Use the known properties of the Hamiltonian across
corners to show that the corner time $t_s$ is related to the costate
by
$$
p(t)=-2(1-t_s)
$$
and thus conclude that, since $p(t)$ is constant, there is
only one corner time $t_s$.

5. Finally, show that $t_s=3/2$. What is $x(t_s)$?
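The analytical result can be sanity-checked numerically. The sketch below assumes the structure suggested in part (4): an unconstrained subarc with $u = t - p/2$ (from $\partial H/\partial u = 0$ with $H = (u-t)^2 + pu$) followed by the constrained subarc $u = 1$, with $p = -2(1 - t_s)$. It then searches for the corner time that satisfies $x(2) = 1$. The function name `terminal_state` and the bracketing interval are choices made for this example.

```python
from scipy.integrate import quad
from scipy.optimize import brentq


def terminal_state(t_s, x0=1.0 / 8.0, t_f=2.0):
    """x(t_f) for the candidate control with corner time t_s (assumed two-subarc structure)."""
    p = -2.0 * (1.0 - t_s)                  # constant costate implied by the corner condition

    def u(t):
        # Unconstrained control t - p/2, saturated at the bound u <= 1.
        return min(t - p / 2.0, 1.0)

    integral, _ = quad(u, 0.0, t_f, points=[t_s])   # tell quad about the kink at t_s
    return x0 + integral


# Find the corner time for which the boundary condition x(2) = 1 holds.
t_s = brentq(lambda ts: terminal_state(ts) - 1.0, 0.5, 1.9)
print("corner time t_s =", t_s)
```

This only confirms the value of $t_s$ under the assumed structure. It does not replace the argument that the structure is the right one.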