<table style="width:100%"><tr>
<td> 
    
<b>Technische Universität Dortmund</b>    
Department of Bio- and Chemical Engineering\
Laboratory of Process Automation Systems\
Prof. Dr. Sergio Lucia </td>
<td>  <img src="./figures/tudo_logo.png" style="width: 60%;" align="right"/> </td>
</tr>
</table>

# Advanced Process Control - Tutorial 05
WS 2022 / 23 

***



# <span class="graffiti-highlight graffiti-id_p7jzyhv-id_fbdyloi"><i></i>Part 1: Introduction to simulating and visualizing state space systems</span>

In this tutorial we will implement our own LQR controller for a simple linear system. This is our first step towards optimal control, because the LQR is the first controller we encounter that is computed from a cost function.


First, import the required Python packages:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as lin

## <span class="graffiti-highlight graffiti-id_qmpqoz0-id_hxqn1kh"><i></i>System representation of oscillating masses</span>

In the following, we investigate a simple linear system consisting of two linear oscillating masses which are connected through springs and can be controlled with a single input.
<img src="./figures/oscillating_masses.png" style="width: 100%;" align="center"/>

We define that linear system in discrete-time state-space form such that:

$$ x_{k+1} = Ax_{k}+Bu_{k}$$

where $x\in \mathbb{R}^n$ are the states of the system
and $u \in \mathbb{R}^m$ are the inputs.
The subscripts $k$ and $k+1$ denote the discrete time instances of these variables.
In our case, the states are the displacements of the masses from their resting positions as well as their velocities.


The state space system is defined in terms of the **system matrix** $A\in \mathbb{R}^{n\times n}$ and the **input matrix** $B\in \mathbb{R}^{n\times m}$. We skip the derivation and parameters of the system and just give you the numerical values of the resulting matrices.

In [124]:
# System matrix:
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
               [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
               [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
               [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
nx = A.shape[1]

# The input matrix is defined as
B = np.array([[0.01413191],
            [0.06277108],
            [0.22062828],
            [0.36695456]])

nu = B.shape[1]


To familiarize yourself with the system, we will first simulate it without any controller. The simulation time should be $T=5\,{s}$ with a discrete time step of $\Delta t=0.05\, s$, which gives us $101$ discrete time instances (including $t=0$):

In [125]:
T=5
delta_t=0.05

## <span class="graffiti-highlight graffiti-id_63qn5yk-id_ch8gk4a"><i></i>Task 01: Simulate the linear system</span>
 Simulate the response of the system to a step input:
 1. Create a vector ``t`` to represent the time using the time step and the overall simulation time. Write the maximum number of steps in a variable ``max_step``.
 2. Generate an input vector ``u_step`` over time that represents a step input, which means all values should be one except the value at time step zero.
 3. Initialise a matrix ``x_data`` with zeros, in which you will store the results of the states. As we want to look at the step response for $x_0=(0\, 0\, 0\, 0)^\top$, $x_0$ is already set after this step.
 4. Use ``max_step`` to write a for-loop which simulates the system. In each step:
     1. Take the last state from ``x_data`` as the current state $x_k$ and the current input $u_k$ from ``u_step`` and apply the system equation.
     2. Store the new state in ``x_data``.
 
 **Hints:** The functions [np.zeros](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html) and [np.ones](https://numpy.org/doc/stable/reference/generated/numpy.ones.html) might be of some help here.

In [126]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_fy5cy8e-id_x3poeus"><i></i><button>Show Solution</button></span>

<span class="graffiti-highlight graffiti-id_fy05lge-id_0cpc4rq"><i></i>How does this work?</span>
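One possible way to carry out the steps of Task 01 is sketched below. It is not the official solution. It assumes that ``max_step`` counts the 100 transitions between the 101 time instances and that the trajectories are stored row-wise (one row per time instance); both are choices made here for illustration.

```python
import numpy as np

# System and input matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
B = np.array([[0.01413191], [0.06277108], [0.22062828], [0.36695456]])
nx, nu = B.shape

T = 5
delta_t = 0.05

# 1. Number of simulation steps and time vector (assumption: t includes t=0 and t=T)
max_step = int(round(T / delta_t))
t = np.linspace(0, T, max_step + 1)

# 2. Step input: zero at time step zero, one afterwards
u_step = np.ones((max_step + 1, nu))
u_step[0] = 0

# 3. State trajectory, one row per time instance; row 0 is x_0 = 0
x_data = np.zeros((max_step + 1, nx))

# 4. Apply x_{k+1} = A x_k + B u_k step by step
for k in range(max_step):
    x_data[k + 1] = A @ x_data[k] + B @ u_step[k]
```

Storing the trajectory row-wise keeps the indexing inside the loop simple and lets each state be selected later as a column, e.g. ``x_data[:, 0]``.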

## <span class="graffiti-highlight graffiti-id_uv5ep9w-id_9sxl6tg"><i></i>Task 02: Visualization</span>
Plot the resulting states using the matplotlib library. Write a function ``visualize_results(t,x,u)`` which is able to plot all states and inputs as subplots.

You can write your own code or follow the suggestions below:
1. Create a figure with subplots using [plt.subplots](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.subplots.html). The number of subplots needed can be calculated from the dimensions of the states and inputs. The function ``plt.subplots`` returns a ``figure`` and an ``axes`` object.
2. Use [axes.set_xlabel](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.set_xlabel.html) and [axes.set_ylabel](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.set_ylabel.html) to set appropriate x and y labels.
3. Use [axes.plot](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.plot.html) to plot the respective curves.

In [128]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_dd5ntpc-id_epttwg1"><i></i><button>Show Solution</button></span>

<span class="graffiti-highlight graffiti-id_kxtog8v-id_fkrel5a"><i></i>How does this work?</span>
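A minimal sketch of such a plotting function is given below. It assumes that ``x`` and ``u`` hold one row per time instance, i.e. shapes ``(len(t), nx)`` and ``(len(t), nu)``; the axis labels and the figure size are arbitrary choices. The call at the end uses dummy data only to show the expected shapes.

```python
import numpy as np
import matplotlib.pyplot as plt

def visualize_results(t, x, u):
    """Plot every state and every input over time in its own subplot."""
    # Assumption: one row per time instance, one column per state / input
    x = np.asarray(x).reshape(len(t), -1)
    u = np.asarray(u).reshape(len(t), -1)
    n_x, n_u = x.shape[1], u.shape[1]

    fig, axes = plt.subplots(n_x + n_u, 1, sharex=True,
                             figsize=(8, 1.5 * (n_x + n_u)))
    axes = np.atleast_1d(axes)  # also works if there is only one subplot

    for i in range(n_x):
        axes[i].plot(t, x[:, i])
        axes[i].set_ylabel(f"$x_{i + 1}$")
    for j in range(n_u):
        # Inputs are piecewise constant between two time instances
        axes[n_x + j].step(t, u[:, j], where="post")
        axes[n_x + j].set_ylabel(f"$u_{j + 1}$")
    axes[-1].set_xlabel("time [s]")
    return fig, axes

# Usage with dummy data: four states, one input, eleven time instances
t_demo = np.linspace(0, 1, 11)
fig, axes = visualize_results(t_demo, np.zeros((11, 4)), np.ones((11, 1)))
```

Returning ``fig`` and ``axes`` makes it possible to adjust labels or limits after the call.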

If you like, you can look into other responses, such as an impulse response, or increase/decrease the step size and the simulation time.

## <span class="graffiti-highlight graffiti-id_8jmkvlo-id_x3af2z9"><i></i>Task 03: Investigate the stability of the proposed system</span>

We now want to investigate the stability property of the presented system.
For this purpose, we ask you to:

1. Determine the Eigenvalues of the system matrix. <br>
**On a sidenote**: You can trigger the help / doc string of any function simply by writing:
```
np.linalg.eig?
```
or by hitting ``shift + tab`` to open a pop-up window describing the function under your cursor.

2. What is the stability region of discrete-time state-space systems?
3. Compare the eigenvalues with the stability criterion.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_i085sew-id_8cyouzl"><i></i><button>Show Solution</button></span>
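The eigenvalue check can be sketched as follows. The name ``lam`` is chosen so that it matches the plotting helper used in this tutorial; ``lam_abs`` is a name introduced here for illustration.

```python
import numpy as np

# System matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])

# np.linalg.eig returns the eigenvalues and the eigenvectors
lam, _ = np.linalg.eig(A)

# A discrete-time linear system is asymptotically stable
# if all eigenvalues have a magnitude strictly smaller than one
lam_abs = np.abs(lam)
print("eigenvalues:", lam)
print("magnitudes: ", lam_abs)
```

Comparing the magnitudes with one is the numerical counterpart of checking whether the eigenvalues lie inside the unit circle.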

For the last task you can use the given plot function below to compare your results. Just make sure you have the eigenvalues stored in ``lam``.

In [None]:
def plot_eigenvalues(lam):
    fig, ax = plt.subplots(figsize=(6,6))
    ax.set_xlabel('Real component')
    ax.set_ylabel('Imag. component')
    ax.add_artist(plt.Circle((0, 0), 1,edgecolor='k', fill=False))
    ax.plot(np.real(lam),np.imag(lam),'o', markersize=10)
    ax.axhline(0,color='k')
    ax.axvline(0,color='k')
    ax.set_ylim(-1,1)
    ax.set_xlim(-1,1)
plot_eigenvalues(lam)    

<span class="graffiti-highlight graffiti-id_sd38wsw-id_o4y0kao"><i></i>The stability region of a discrete-time system is **within the unit circle**</span>. As the eigenvalues lie directly on this circle, we can conclude that the system is marginally stable, i.e. stable but **not** asymptotically stable. If you go back to the plot of the step response, you see exactly this result: the oscillations do not decay.

# <span class="graffiti-highlight graffiti-id_gyavstk-id_p1g9rj1"><i></i>Part 2: Derivation of the LQR controller from Dynamic Programming</span>

After getting familiar with the linear system, we now want to control it using an LQR controller. An LQR controller delivers optimal inputs to the linear system based on a cost function. Bounds or other constraints cannot be included. To get a real grasp of how to implement this, we will solve it manually without CasADi or other tools. We want to obtain an optimal controller gain $K$ based on a quadratic cost function, which yields a feedback controller acting on all states ($u_k=-K\cdot x_k$).

<img src="./figures/closed_loop.svg" style="width:40%;" align="center"/>

In fact, the controller in this figure is a classic closed loop whose structure should already be familiar. Notice that the setpoint to which we want to steer our system is the origin (all states are zero), so we do not need a feedforward input. The system dynamics depend only on the initial state and the choice of the controller gain. Before applying such a controller, we will take a look at Dynamic Programming and derive the LQR controller from there.

## <span class="graffiti-highlight graffiti-id_t869hzg-id_okqvpql"><i></i>Task 04: Setup the cost function</span>
An LQR controller can be derived from Dynamic Programming for optimal control problems with a quadratic cost function

$$J=\frac{1}{2}\sum_{k=0}^{N-1}\left(x_k^T\cdot Q\cdot x_k+u_k^T\cdot R\cdot u_k\right)+ \frac{1}{2}x_N^T\cdot S\cdot x_N$$

$Q$ defines the state cost, $S$ defines the cost of the final state at the end of the considered horizon $N$ (how far into the future the costs are accumulated), and $R$ penalizes the input. Assume that you want to penalize the input in the same way as all states, with a normalized weight of one.
- Generate the corresponding matrices, penalizing the final state in the same way as all other states. Use the [``np.eye``](https://numpy.org/devdocs/reference/generated/numpy.eye.html) function and the system dimensions of Part 1 for this:

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_ogus2rn-id_usavn46"><i></i><button>Show Solution</button></span>
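With unit weights on every state and on the input, the three matrices reduce to identity matrices. A short sketch, assuming the dimensions $n=4$ and $m=1$ of the system from Part 1:

```python
import numpy as np

nx, nu = 4, 1      # dimensions of the oscillating-mass system from Part 1

Q = np.eye(nx)     # stage cost on the states
R = np.eye(nu)     # stage cost on the input
S = np.eye(nx)     # cost on the final state x_N
```

Keeping ``R`` as a $1\times 1$ matrix instead of a scalar means that the matrix formulas for the gain can be used without special cases.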

## <span class="graffiti-highlight graffiti-id_8cobc8k-id_gitejog"><i></i>Dynamic Programming</span>
As stated in the lecture, the LQR approach can be derived from the concept of Dynamic Programming. Dynamic Programming solves optimal control problems by starting from the last step of the problem and recursively solving for the optimal decision in the preceding steps. This section should give you an idea of where the controller gain of the LQR actually comes from.

For this, we will first look at optimal control over a finite horizon. As shown in the lecture, the optimal control input can be calculated as a state feedback in each step:

$$u_{k}=-K_k\cdot x_{k}$$

Without derivation this controller gain can be computed as:

$$K_k=(R+B^T P_k B)^{-1}B^TP_kA \, .$$

$P_k$ is the matrix that defines the quadratic cost-to-go in each step.
It can be computed recursively, backwards in time, as

$$P_{k-1}=Q+A^T P_k A-A^T P_k B\left(R+B^T P_k B\right)^{-1} B^T P_k A.$$

These expressions follow directly from the optimal control problem at hand. For further information on the derivation of these formulas, revisit the fifth lecture.


## <span class="graffiti-highlight graffiti-id_by8caop-id_ah40vbw"><i></i>Task 05: Calculate the optimal controller gain with Dynamic Programming</span>
1. Start with a finite horizon of ``N=100``. Initialise three-dimensional arrays of zeros for the cost matrices $P$ and the controller gains $K$: one dimension for the time and two for the matrix dimensions in each time step. In this task, you can use the first dimension as the time index, as this eases indexing.
2. Assign the cost of the last step ($S$) to the cost matrix at the last step. From there, calculate the last controller gain matrix with the formula above.
3. Now write a loop over time. Remember to set your indices correctly, as you proceed recursively backwards from the last step. In each step:
    1. Calculate the new cost matrix $P_{k-1}$.
    2. Calculate the new controller gain $K_{k-1}$.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_s57fw5x-id_of5hg59"><i></i><button>Show Solution</button></span>

<span class="graffiti-highlight graffiti-id_ypt9acs-id_u4ks4us"><i></i>How does this work?</span>
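The backward recursion of Task 05 can be sketched as follows. The helper ``gain`` is a name introduced here for illustration, and ``np.linalg.solve`` is used in place of an explicit matrix inverse; the weights are the identity matrices from Task 04.

```python
import numpy as np

# System and input matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
B = np.array([[0.01413191], [0.06277108], [0.22062828], [0.36695456]])
nx, nu = B.shape
Q, R, S = np.eye(nx), np.eye(nu), np.eye(nx)

def gain(P_k):
    """K_k = (R + B^T P_k B)^{-1} B^T P_k A (hypothetical helper)."""
    return np.linalg.solve(R + B.T @ P_k @ B, B.T @ P_k @ A)

N = 100
P = np.zeros((N, nx, nx))   # first index is the time step
K = np.zeros((N, nu, nx))

# Last step: final cost and the corresponding gain
P[N - 1] = S
K[N - 1] = gain(P[N - 1])

# Walk backwards in time: k = N-1, ..., 1
for k in range(N - 1, 0, -1):
    # A^T P_k B (R + B^T P_k B)^{-1} B^T P_k A equals A^T P_k B K_k
    P[k - 1] = Q + A.T @ P[k] @ A - A.T @ P[k] @ B @ gain(P[k])
    K[k - 1] = gain(P[k - 1])
```

Because the loop runs from ``N - 1`` down to ``1``, the entry at index ``k - 1`` is always computed from the already known entry at index ``k``.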

## Task 06: Visualize the controller values
Now comes the important part for understanding why the LQR controller is a fixed controller gain matrix for a horizon $N\rightarrow \infty$. For this:

1. Visualize the elements of $K$ over time. **Hint:** We have the dimension ``K.shape = (100, 1, 4)`` which cannot be plotted directly. With [np.squeeze](https://numpy.org/doc/stable/reference/generated/numpy.squeeze.html) you can remove the unused dimension, i.e. ``K.squeeze().shape = (100, 4)``.
2. Investigate the plot. What does this result show you and what is to be expected if you make $N$ larger?
3. Retrieve the controller gain matrix at `k=0` and assign its value to ``K_end``.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_ctir2mm-id_qqsk7v0"><i></i><button>Show Solution</button></span>
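A possible sketch for plotting the gains is shown below. To keep the block self-contained, it first repeats the backward recursion of Task 05 in compact form; the labels and the legend are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# System and input matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
B = np.array([[0.01413191], [0.06277108], [0.22062828], [0.36695456]])
nx, nu = B.shape
Q, R, S = np.eye(nx), np.eye(nu), np.eye(nx)

# Backward recursion as in Task 05
N = 100
K = np.zeros((N, nu, nx))
P_k = S
for k in range(N - 1, -1, -1):
    K[k] = np.linalg.solve(R + B.T @ P_k @ B, B.T @ P_k @ A)
    P_k = Q + A.T @ P_k @ A - A.T @ P_k @ B @ K[k]

# Remove the singleton input dimension: (100, 1, 4) -> (100, 4)
fig, ax = plt.subplots()
lines = ax.plot(np.arange(N), K.squeeze())
ax.set_xlabel("time step $k$")
ax.set_ylabel("elements of $K_k$")
ax.legend(lines, [f"$K_{{k,{i + 1}}}$" for i in range(nx)])

# Gain at the first time step, i.e. furthest away from the end of the horizon
K_end = K[0]
```

Each of the four curves corresponds to one element of the gain, plotted against the time index.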

<span class="graffiti-highlight graffiti-id_09o2cxq-id_ffdp08w"><i></i>If you look at the plot, you might notice that the values of $K$ saturate</span> as you move backwards in time, starting from the ***last step***. Increasing the horizon (just increase $N$) also diminishes the importance of the last gain values. Therefore, you should see that the cost matrix becomes constant for $N \rightarrow \infty$, which means that the controller gain is also constant, as it is directly derived from it. This is useful because it allows us to design an optimal controller gain that does not vary over time.

# <span class="graffiti-highlight graffiti-id_6x9ssfo-id_eki37ao"><i></i>Part 3: Implement an LQR controller </span>
For the horizon $N \rightarrow \infty$, the recursion above converges to a constant cost matrix $P_{k-1}=P_{k}=P$. The controller can then be calculated by solving the discrete-time algebraic Riccati equation:

$$P=A^T P A-\left(A^T P B\right)\left(R+B^T P B\right)^{-1}\left(B^T P A\right)+Q,$$

which can be solved for $P$.
The optimal controller gain is then computed as:

$$K= (R+B^T P B)^{-1} B^T P A.$$

You might notice that this controller gain no longer varies over time. In this case, we can compute the controller gain directly from the cost function and the system matrices. We no longer need the final penalty $S$, as it would penalize the state after an infinite amount of time. We have finally derived a controller which looks like the one in the figure:

<img src="./figures/closed_loop.svg" style="width:40%;" align="center"/>

## <span class="graffiti-highlight graffiti-id_sdhnogh-id_uw9cxib"><i></i>Task 07: Calculate the controller gain matrix</span>
1. Start by solving the Riccati equation using [lin.solve_discrete_are](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.solve_discrete_are.html). As output you get the matrix $P$.
2. Use that matrix to calculate the controller gain $K$.
3. Compare this controller gain with the saturated value from the Dynamic Programming approach stored in ``K_end``.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_otgxspu-id_hy6jnak"><i></i><button>Show Solution</button></span>
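The following sketch solves the Riccati equation with SciPy and compares the result with the gain from the backward recursion. The names ``P_inf`` and ``K_lqr`` are introduced here for illustration, and ``K_end`` is recomputed in compact form so that the block runs on its own.

```python
import numpy as np
import scipy.linalg as lin

# System and input matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
B = np.array([[0.01413191], [0.06277108], [0.22062828], [0.36695456]])
nx, nu = B.shape
Q, R = np.eye(nx), np.eye(nu)

# 1. Stationary solution P of the discrete-time algebraic Riccati equation
P_inf = lin.solve_discrete_are(A, B, Q, R)

# 2. Constant gain K = (R + B^T P B)^{-1} B^T P A
K_lqr = np.linalg.solve(R + B.T @ P_inf @ B, B.T @ P_inf @ A)

# 3. Gain at k=0 of the finite-horizon recursion (N=100, S=I) for comparison
P_k = np.eye(nx)
for _ in range(99):
    P_k = Q + A.T @ P_k @ A - A.T @ P_k @ B @ np.linalg.solve(
        R + B.T @ P_k @ B, B.T @ P_k @ A)
K_end = np.linalg.solve(R + B.T @ P_k @ B, B.T @ P_k @ A)

print("K (Riccati):", K_lqr)
print("K (DP, k=0):", K_end)
print("largest deviation:", np.max(np.abs(K_lqr - K_end)))
```

Printing the largest deviation gives a single number to judge how close the two gains are.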

<span class="graffiti-highlight graffiti-id_lh0817f-id_daw6iut"><i></i>As expected, the values are the same.</span>

## <span class="graffiti-highlight graffiti-id_64w2m91-id_hkactrz"><i></i>Task 08: Investigate the stability of the closed loop system</span>
The calculated feedback control gain can be used to look into the system behaviour of the closed loop system. For this:
1. Determine the matrix representation of the closed-loop system.
2. Calculate the eigenvalues for the new closed-loop system matrix.
3. Investigate the stability of the closed-loop system.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_vjckg1q-id_9cma4ez"><i></i><button>Show Solution</button></span>
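Inserting $u_k=-K\cdot x_k$ into the system equation gives $x_{k+1}=(A-BK)\,x_k$. A sketch of the resulting eigenvalue check follows; the name ``A_cl`` is an assumption made here, and the gain is recomputed so that the block is self-contained.

```python
import numpy as np
import scipy.linalg as lin

# System and input matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
B = np.array([[0.01413191], [0.06277108], [0.22062828], [0.36695456]])
nx, nu = B.shape
Q, R = np.eye(nx), np.eye(nu)

# LQR gain from the Riccati equation (as in Task 07)
P = lin.solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# 1. Closed-loop system matrix for u_k = -K x_k
A_cl = A - B @ K

# 2. Eigenvalues of the closed-loop system
lam = np.linalg.eigvals(A_cl)

# 3. Asymptotically stable if all magnitudes are strictly smaller than one
print("magnitudes:", np.abs(lam))
print("asymptotically stable:", bool(np.all(np.abs(lam) < 1)))
```

The eigenvalues are stored in ``lam`` so that they can be passed directly to ``plot_eigenvalues``.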

Again you can visualize the eigenvalues using the eigenvalue plot function.

In [None]:
plot_eigenvalues(lam)

As expected, this system is now <span class="graffiti-highlight graffiti-id_2fsi1jq-id_l05v0a8"><i></i>**asymptotically stable**</span>, with all eigenvalues strictly inside the unit circle.

## <span class="graffiti-highlight graffiti-id_n4hpmqh-id_t3kh9wl"><i></i>Task 09: Simulate the closed-loop system with LQR controller</span> 

 Similar to Task 01, you should simulate the system, now with an LQR state feedback controller. Do the following steps:
1. Create a vector ``t`` to represent the time using the time step and the overall simulation time. Write the maximum number of steps in a variable ``max_step``.
2. Generate an input vector ``u_data`` over time in which we will store the inputs to the oscillating mass system given by the controller. 
3. Initialise a matrix ``x_data`` with zeros in which you will store the results of the states. Now initialise the system at the start vector $x_0=(1\, 1\, 1\, 1)^\top$.
4. Use ``max_step`` to write a for-loop which simulates the system. In each step:
     1. Take the last state from ``x_data`` and:
         1. Compute the current input to the system with the feedback-equation.
         2. Compute the next state with the current state and current input considering the system matrices.
     2. Store the new state in ``x_data`` and the inputs generated by the feedback loop in ``u_data``.
 
 **Hints:** The functions [np.zeros](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html) and [np.ones](https://numpy.org/doc/stable/reference/generated/numpy.ones.html) might again be of some help here.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_nctkeah-id_la4swc1"><i></i><button>Show Solution</button></span>
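The closed-loop simulation can be sketched as below. As in Task 01, it assumes that ``max_step`` counts the transitions and that the trajectories are stored row-wise. Computing one extra input for the final state, so that ``u_data`` has as many rows as ``t``, is a choice made here for plotting.

```python
import numpy as np
import scipy.linalg as lin

# System and input matrix as given in Part 1
A = np.array([[ 0.76272095,  0.45961393,  0.11486161,  0.0198116 ],
              [-0.89941626,  0.76272095,  0.41999073,  0.11486161],
              [ 0.11486161,  0.0198116 ,  0.76272095,  0.45961393],
              [ 0.41999073,  0.11486161, -0.89941626,  0.76272095]])
B = np.array([[0.01413191], [0.06277108], [0.22062828], [0.36695456]])
nx, nu = B.shape
Q, R = np.eye(nx), np.eye(nu)

# LQR gain from the Riccati equation (as in Task 07)
P = lin.solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# 1. Time vector and number of steps
T, delta_t = 5, 0.05
max_step = int(round(T / delta_t))
t = np.linspace(0, T, max_step + 1)

# 2. + 3. Storage for inputs and states, with x_0 = (1 1 1 1)^T
u_data = np.zeros((max_step + 1, nu))
x_data = np.zeros((max_step + 1, nx))
x_data[0] = 1

# 4. Closed-loop simulation
for k in range(max_step):
    u_data[k] = -K @ x_data[k]                     # feedback law
    x_data[k + 1] = A @ x_data[k] + B @ u_data[k]  # system equation

# Input belonging to the final state (only needed for equal array lengths)
u_data[max_step] = -K @ x_data[max_step]
```

With these shapes, ``t``, ``x_data`` and ``u_data`` can be handed to a plotting function that expects one row per time instance.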

## Task 10: Visualize Results
1. Visualize your results by using the previously defined function ``visualize_results()`` with the correct input.
2. Analyze the results.

In [None]:
# Write your code here!

<span class="graffiti-highlight graffiti-id_bpmsfdu-id_wuimu60"><i></i><button>Show Solution</button></span>

As expected, the response is **stable**.<span class="graffiti-highlight graffiti-id_g8czbhz-id_ui84yz5"><i></i> Furthermore, it results from the optimal controller gain for the chosen cost function, which in this case penalizes the input as well as the movements of the masses</span>. You can repeat the steps of the LQR design with different cost functions to get a feel for the effect of the different parameters of a cost function.

## <span class="graffiti-highlight graffiti-id_adoq1oq-id_ckcvpcz"><i></i>Next steps</span>
You have implemented your first controller based on an optimal cost function. Next week we will look into the MPC controller, which includes constraints and determines the input to the system at each time step based on a prediction. As take-home messages of this exercise, the lecture and the quiz, you should by now be able to:
- Understand why optimal control is useful
- Understand the Dynamic Programming approach and its link to the LQR controller
- Implement an LQR controller and know its <span class="graffiti-highlight graffiti-id_adoq1oq-id_8vvbn2o"><i></i>limitations</span>
- Use Python packages for optimal control and for visualizing results
- "Hang on, this controller structure still looks like the old ones!" What about that predictive controller the professor keeps talking about? Stay tuned...