# Optimal Control (Qing Wang from BUAA)

# Part 1. Calculus of variations

1. What is a functional?

The first concept to settle is the functional. Let $t$ denote time and $x(t)$ a function of time. If a quantity $J$ takes the whole function $x(t)$ as its argument and returns a number, then $J=J[x(t)]$ is called a functional. In short, a functional is a function of a function.
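
As a rough numerical illustration, the sketch below evaluates one particular functional, $J[x] = \int_0^1 x(t)^2 dt$, for two different input functions. The choice of integrand, the interval $[0, 1]$ and the helper name `functional_J` are assumptions made for this example only.

```python
import math

def functional_J(x, t0=0.0, tf=1.0, n=1000):
    """Approximate J[x] = integral of x(t)^2 over [t0, tf] (trapezoidal rule)."""
    h = (tf - t0) / n
    total = 0.5 * (x(t0) ** 2 + x(tf) ** 2)
    for k in range(1, n):
        total += x(t0 + k * h) ** 2
    return total * h

# The argument of J is a whole function; the result is a single number.
J_line = functional_J(lambda t: t)            # exact integral is 1/3
J_sine = functional_J(lambda t: math.sin(t))  # exact integral is 1/2 - sin(2)/4
```

Feeding in a different curve changes the returned number, which is exactly the sense in which $J$ depends on the function $x(t)$ rather than on a single value of $t$.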


2. What is variation?

In this course, we need to pay attention to two kinds of variation. 
> Variation of a function

Define two functions $X_1(t)$ and $X_2(t)$,
then the variation $\delta X$ is given as 

\begin{equation}
    \delta X = X_1(t) - X_2(t)
\end{equation}
> Variation of a functional

When the function $X(t)$ is perturbed by a variation $\delta X$,
the corresponding functional changes by
\begin{equation}
\begin{aligned}
    \Delta J & = J[X + \delta X] - J[X] \\
    & = \delta J[X, \delta X] + \epsilon \|\delta X\|
\end{aligned}  
\end{equation}
where $\delta J[X, \delta X]$ is a linear functional of $\delta X$. 
If $\epsilon \rightarrow 0$ as $\|\delta X\| \rightarrow 0$,
then $\delta J[X, \delta X]$ is called the variation of $J[X]$; it is the principal linear part of $\Delta J$. 

Of course,
if $X = X^*$ is an extremum of the functional $J[X(t)]$,
then a necessary condition is $\delta J[X^*, \delta X] = 0$ for every admissible $\delta X$.
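
The decomposition of $\Delta J$ can be checked numerically. The sketch below assumes the example functional $J[x] = \int_0^1 x^2 dt$, the base curve $x(t) = t$ and the perturbation $\delta x = a \sin(\pi t)$, none of which come from the course. For this functional the linear part is $\delta J = \int_0^1 2 x \, \delta x \, dt$, and the remainder divided by $\|\delta x\| = a$ (sup norm) should shrink with $a$.

```python
import math

def integrate(g, t0=0.0, tf=1.0, n=2000):
    """Trapezoidal rule for the integral of g over [t0, tf]."""
    h = (tf - t0) / n
    total = 0.5 * (g(t0) + g(tf))
    for k in range(1, n):
        total += g(t0 + k * h)
    return total * h

def J(x):
    """Example functional J[x] = integral of x(t)^2 over [0, 1]."""
    return integrate(lambda t: x(t) ** 2)

def first_variation(x, dx):
    """Linear part of Delta J for this J: integral of 2*x*dx."""
    return integrate(lambda t: 2.0 * x(t) * dx(t))

def epsilon(a):
    """Remainder (Delta J - delta J) divided by the sup norm of dx, which is a."""
    x = lambda t: t
    dx = lambda t: a * math.sin(math.pi * t)
    delta_J_total = J(lambda t: x(t) + dx(t)) - J(x)
    return (delta_J_total - first_variation(x, dx)) / a

eps_values = [epsilon(a) for a in (1e-1, 1e-2, 1e-3)]
```

In this example the remainder equals $\int_0^1 \delta x^2 dt = a^2 / 2$, so `epsilon(a)` is close to $a / 2$ and tends to zero with the size of the perturbation.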

# Part 2. Variation theory on functional extremum (When there are no extra conditions)

Suppose we want to find an optimal curve $x(t) = x^*(t)$ at which the following cost functional attains an extremum:
\begin{equation}
    J = \int_{t_0}^{t_f} F[x(t), \dot{x}(t), t] dt
\end{equation}

Consider small variations $\delta x$ and $\delta \dot{x}$ around the optimal curve $x^*(t)$ and its derivative $\dot{x}^*(t)$,
so that
\begin{align*}
   & x(t) = x^{*}(t) + \delta x\\
   & \dot{x}(t) = \dot{x}^{*}(t) + \delta \dot{x}\\
\end{align*}

Expanding $F$ to first order and integrating the $\delta \dot{x}$ term by parts,
the variation of the functional is given as
\begin{equation}
    \delta J = \int_{t_0}^{t_f} \bigg[\frac{\partial F}{\partial x} - \frac{d}{dt}\bigg(\frac{\partial F}{\partial \dot{x}} \bigg) \bigg] \delta x \, dt + \frac{\partial F}{\partial \dot{x}} \delta x \bigg|_{t_{0}}^{t_f}
\end{equation}
where $t_0$ is the initial time and $t_f$ is the terminal time. 
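
The steps behind this result can be sketched as follows, assuming $F$ is continuously differentiable and $\delta \dot{x} = \frac{d}{dt} \delta x$. A first-order expansion of $F$ around $x^*$ gives the first line, and integration by parts of the $\delta \dot{x}$ term gives the second:
\begin{align*}
    \delta J & = \int_{t_0}^{t_f} \bigg[ \frac{\partial F}{\partial x} \delta x + \frac{\partial F}{\partial \dot{x}} \delta \dot{x} \bigg] dt \\
    \int_{t_0}^{t_f} \frac{\partial F}{\partial \dot{x}} \delta \dot{x} \, dt & = \frac{\partial F}{\partial \dot{x}} \delta x \bigg|_{t_0}^{t_f} - \int_{t_0}^{t_f} \frac{d}{dt}\bigg(\frac{\partial F}{\partial \dot{x}}\bigg) \delta x \, dt
\end{align*}
Substituting the second line into the first collects the $\delta x$ terms under one integral and leaves a boundary term evaluated at $t_0$ and $t_f$.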

For $x^*(t)$ to be an extremum,
we need $\delta J = 0$ for every admissible $\delta x$.
Hence,
the Euler-Lagrange equation must always hold:
\begin{align*}
    \frac{\partial F}{\partial x} - \frac{d}{dt}\bigg(\frac{\partial F}{\partial \dot{x}} \bigg) = 0
\end{align*}
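
A quick numerical sanity check of this condition, under assumptions chosen for illustration: take the arc-length integrand $F = \sqrt{1 + \dot{x}^2}$ on $[0, 1]$ with $x(0) = 0$ and $x(1) = 1$. Since $F$ does not depend on $x$, the condition reduces to $\dot{x}$ being constant, so the candidate is the straight line $x^*(t) = t$. The sketch compares its cost with perturbed curves $x(t) = t + a \sin(\pi t)$ that keep the same endpoints; the function names are hypothetical.

```python
import math

def arc_length(x_dot, n=2000):
    """J = integral over [0, 1] of sqrt(1 + x_dot(t)^2), trapezoidal rule."""
    h = 1.0 / n
    total = 0.5 * (math.sqrt(1.0 + x_dot(0.0) ** 2) + math.sqrt(1.0 + x_dot(1.0) ** 2))
    for k in range(1, n):
        total += math.sqrt(1.0 + x_dot(k * h) ** 2)
    return total * h

def perturbed_length(a):
    """Cost of x(t) = t + a*sin(pi*t); its derivative is 1 + a*pi*cos(pi*t)."""
    return arc_length(lambda t: 1.0 + a * math.pi * math.cos(math.pi * t))

J_star = perturbed_length(0.0)  # straight line, the Euler-Lagrange candidate
J_perturbed = {a: perturbed_length(a) for a in (-0.2, -0.05, 0.05, 0.2)}
```

Every perturbed curve in `J_perturbed` comes out longer than `J_star`, which is consistent with the straight line being the minimizer. A check over a handful of perturbations is only evidence, not a proof.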

The boundary term of $\delta J$
leads to two different cases:

> The terminal is fixed

We have $x(t_0) = x_0$ and $x(t_f) = x_f$, so $\delta x(t_0) = \delta x(t_f) = 0$.
Accordingly,
we have 
\begin{align*}
    \frac{\partial F}{\partial \dot{x}} \delta x \bigg|_{t_{0}}^{t_f} = \bigg( \frac{\partial F}{\partial \dot{x}}\bigg)_{t_f} \delta x(t_f) - \bigg( \frac{\partial F}{\partial \dot{x}}\bigg)_{t_0} \delta x(t_0) = 0
\end{align*}

Hence, no further requirements need to be made. 

> The terminal is free

In this case, $\delta x(t_0)$ and $\delta x(t_f)$ are arbitrary and generally nonzero.
Hence, the boundary term vanishes only if 
\begin{align*}
    \bigg( \frac{\partial F}{\partial \dot{x}}\bigg)_{t_f}  = \bigg( \frac{\partial F}{\partial \dot{x}}\bigg)_{t_0} = 0
\end{align*}
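
As a small worked example (chosen here for illustration, not taken from the course), take $F = \frac{1}{2}(\dot{x}^2 + x^2)$ and assume that $x(t_0) = x_0$ stays fixed while only $x(t_f)$ is free, so that only the condition at $t_f$ applies. The Euler-Lagrange equation, the free-end condition and the resulting curve are then
\begin{align*}
    & x - \ddot{x} = 0 \\
    & \bigg( \frac{\partial F}{\partial \dot{x}}\bigg)_{t_f} = \dot{x}(t_f) = 0 \\
    & x^*(t) = x_0 \frac{\cosh(t_f - t)}{\cosh(t_f - t_0)}
\end{align*}
The free-end condition replaces the missing boundary value $x(t_f) = x_f$, so the second-order equation still has exactly two conditions to fix its two constants.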

# Part 3. Variation theory on functional extremum (When there are extra conditions)

However, the above results are far from sufficient. 
In control engineering,
a system is never free of constraints, because its state must obey some dynamic equation.
In addition, the control input may be restricted or affected by phenomena such as input saturation and external disturbances. 

Hence,
we start with the simplest case, in which the system is subject only to the following dynamic equation:
\begin{align*}
    \dot{x} = f[x(t), u(t), t]
\end{align*}
where $x(t) \in \mathbb{R}^n$ and $u(t) \in \mathbb{R}^m$. 

Note that the optimal $x(t)$ has to be obtained through the minimum principle or dynamic programming. 

The system performance is quantified by the following cost functional:
\begin{align*}
    J = \phi[x(t_f), t_f] + \int_{t_0}^{t_f} F[x(t), u(t), t] dt
\end{align*}

There are three main cases:

> $t_f$ is fixed, while $x(t_f)$ is free

First, we introduce a scalar function (the Hamiltonian)
\begin{align*}
    H(x, u, \lambda, t) = F(x, u, t) + \lambda^{\rm T} f(x, u, t)
\end{align*}
where $\lambda \in \mathbb{R}^n$.

Then the cost functional is augmented to
\begin{align*}
    J_a = \phi[x(t_f), t_f] + \int_{t_0}^{t_f} [H(x, u, \lambda, t) - \lambda^{\rm T} \dot{x} ]dt
\end{align*}

According to the rules of integration by parts,
we have 
\begin{align*}
    J_a = \phi[x(t_f), t_f] - \lambda(t_f)^{\rm T} x(t_f) + \lambda(t_0)^{\rm T} x(t_0) + \int_{t_0}^{t_f} [H(x, u, \lambda, t) + \dot{\lambda}^{\rm T} x ]dt
\end{align*}

To achieve $\delta J_a = 0$ with $\delta x$, $\delta u$ and $\delta x(t_f)$ having arbitrary values,
we need to have
\begin{align*}
    & \dot{\lambda} = -\frac{\partial H}{\partial x} \\
    & \dot{x} = \frac{\partial H}{\partial \lambda} {\rm \ (System \ dynamics)} \\
    & \frac{\partial H}{\partial u} = 0 \\
    & \lambda(t_f) = \frac{\partial \phi}{\partial x(t_f)}
\end{align*}
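
To see these conditions at work, the sketch below uses an assumed scalar problem that is not from the course: $\dot{x} = u$, $F = \frac{1}{2} u^2$ and $\phi = \frac{1}{2} s \, x(t_f)^2$ with $t_f = T$ fixed. Here $H = \frac{1}{2} u^2 + \lambda u$ does not depend on $x$, so $\lambda$ is constant; $\partial H / \partial u = 0$ gives $u = -\lambda$, and the terminal condition gives $\lambda = s \, x(T)$. Solving these together yields $u^* = -s x_0 / (1 + sT)$. The function names and the parameter values are hypothetical.

```python
def cost_constant_control(u, x0=1.0, T=2.0, s=3.0):
    """J = 0.5*s*x(T)^2 + 0.5*integral of u^2 for x_dot = u with constant u."""
    x_T = x0 + u * T
    return 0.5 * s * x_T ** 2 + 0.5 * u ** 2 * T

def stationary_control(x0=1.0, T=2.0, s=3.0):
    """Constant control implied by u = -lam and lam = s*x(T)."""
    x_T = x0 / (1.0 + s * T)   # from x(T) = x0 - lam*T with lam = s*x(T)
    lam = s * x_T
    return -lam

u_star = stationary_control()
J_star = cost_constant_control(u_star)
# Costs of nearby constant controls, for comparison with J_star.
nearby = [cost_constant_control(u_star + d) for d in (-0.1, -0.01, 0.01, 0.1)]
```

Because the necessary conditions force a constant control in this example, comparing against other constant controls is a reasonable sanity check; all entries of `nearby` exceed `J_star`.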


> $t_f$ is free, while $x(t_f)$ is restricted

Suppose the terminal state is restricted by the constraint $G[x(t_f), t_f] = 0_q$, i.e. $q$ scalar equations.
Then we have a new performance functional as follows:
\begin{align*}
    J_b = \theta[x(t_f), t_f] + \int_{t_0}^{t_f} [H(x, u, \lambda, t) - \lambda^{\rm T} \dot{x} ]dt
\end{align*}
where $\theta[x(t_f), t_f] = \phi[x(t_f), t_f] + v^{\rm T} G[x(t_f), t_f]$ and $v \in \mathbb{R}^q$ is a vector of Lagrange multipliers.

To ensure $\delta J_b = 0$,
we need
\begin{align*}
    & \dot{\lambda} = -\frac{\partial H}{\partial x} \\
    & \dot{x} = \frac{\partial H}{\partial \lambda} {\rm \ (System \ dynamics)} \\
    & \frac{\partial H}{\partial u} = 0 \\
    & \lambda(t_f) = \frac{\partial \theta}{\partial x(t_f)} = \frac{\partial \phi}{\partial x(t_f)} + \frac{\partial G^{\rm T}}{\partial x(t_f)} v \\
    & H(t_f) = - \frac{\partial \theta}{\partial t_f} = - \frac{\partial \phi}{\partial t_f}- \frac{\partial G^{\rm T}}{\partial t_f} v
\end{align*}
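
The extra condition on $H(t_f)$ is what fixes the free terminal time. A minimal sketch, under assumptions chosen for illustration only: $\dot{x} = u$, $x(0) = 0$, $G = x(t_f) - 1 = 0$, $\phi = 0$ and $F = 1 + \frac{1}{2} u^2$. Then $H = 1 + \frac{1}{2} u^2 + \lambda u$ with constant $\lambda$, $u = -\lambda$, and since $\theta$ has no explicit $t_f$ dependence the condition reads $H(t_f) = 0$, giving $u^* = \sqrt{2}$ and $t_f^* = 1 / \sqrt{2}$. The code compares this with a brute-force scan over constant controls; all names are hypothetical.

```python
import math

def cost(u):
    """J for constant u > 0: the state reaches x = 1 at t_f = 1/u."""
    t_f = 1.0 / u
    return t_f * (1.0 + 0.5 * u ** 2)

def hamiltonian_at_tf(u):
    """H = 1 + 0.5*u^2 + lam*u with lam = -u from dH/du = 0."""
    lam = -u
    return 1.0 + 0.5 * u ** 2 + lam * u

u_star = math.sqrt(2.0)      # root of H(t_f) = 1 - 0.5*u^2 = 0 with u > 0
t_f_star = 1.0 / u_star

# Brute-force scan over constant controls in [0.5, 2.5] with step 0.001.
grid = [0.5 + 0.001 * k for k in range(2001)]
u_best = min(grid, key=cost)
```

The grid minimizer `u_best` lands next to `u_star`, so the condition $H(t_f) = 0$ picks out the same trade-off between elapsed time and control effort that direct minimization finds.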


> $t_f$ is fixed, while $x(t_f)$ is restricted

Similar to the above situation, 
we have 
\begin{align*}
    J_b = \theta[x(t_f), t_f] + \int_{t_0}^{t_f} [H(x, u, \lambda, t) - \lambda^{\rm T} \dot{x} ]dt
\end{align*}

The difference is that $t_f$ is fixed, so $\delta t_f = 0$ and the condition on $H(t_f)$ drops out.
Hence,
the conditions simplify to 
\begin{align*}
    & \dot{\lambda} = -\frac{\partial H}{\partial x} \\
    & \dot{x} = \frac{\partial H}{\partial \lambda} {\rm \ (System \ dynamics)} \\
    & \frac{\partial H}{\partial u} = 0 \\
    & \lambda(t_f) = \frac{\partial \theta}{\partial x(t_f)} = \frac{\partial \phi}{\partial x(t_f)} + \frac{\partial G^{\rm T}}{\partial x(t_f)} v
\end{align*}
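
As a small worked example (chosen here for illustration, not taken from the course), consider the scalar system $\dot{x} = u$ with $x(0) = 0$, fixed $t_f = T$, terminal constraint $G = x(T) - 1 = 0$, $\phi = 0$ and $F = \frac{1}{2} u^2$. The Hamiltonian $H = \frac{1}{2} u^2 + \lambda u$ does not depend on $x$, so $\lambda$ is constant, and the conditions give
\begin{align*}
    & \frac{\partial H}{\partial u} = u + \lambda = 0 \ \Rightarrow \ u = -\lambda \\
    & \lambda(T) = \frac{\partial G}{\partial x(T)} v = v \\
    & x(T) = -vT = 1 \ \Rightarrow \ v = -\frac{1}{T}, \quad u^*(t) = \frac{1}{T}
\end{align*}
The multiplier $v$ is not chosen freely: it is determined by requiring that the terminal constraint is met.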


# Part 4. Principle of the minimum value

## Motivation for developing the principle of the minimum value

The variational approach assumes that the control input $u$ is unrestricted and that $\frac{\partial H}{\partial u}$ always exists. However, these assumptions are hard to satisfy in practice. For example, real actuators are usually subject to input saturation, meaning that the admissible control input belongs to a closed set.

## Principles for continuous time systems

To handle the case where $\delta u$ cannot be chosen arbitrarily,
the principle of the minimum value was proposed in 1956. 
The conditions to satisfy are
\begin{align*}
    & \dot{\lambda} = -\frac{\partial H}{\partial x} \\
    & \dot{x} = \frac{\partial H}{\partial \lambda} {\rm \ (System \ dynamics)} \\
    & \lambda(t_f) = \frac{\partial \theta}{\partial x(t_f)} = \frac{\partial \phi}{\partial x(t_f)} + \frac{\partial G^{\rm T}}{\partial x(t_f)} v \\
    & H(t_f) = - \frac{\partial \theta}{\partial t_f} = - \frac{\partial \phi}{\partial t_f}- \frac{\partial G^{\rm T}}{\partial t_f} v \\
    & \min_{u \in \Omega} H(x^*, \lambda^*, u, t) = H(x^*, \lambda^*, u^*, t)
\end{align*}
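
A minimal sketch of the last condition, under assumptions that are not from the course: a double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with time-optimal integrand $F = 1$ and admissible set $\Omega = [-1, 1]$. Here $H = 1 + \lambda_1 x_2 + \lambda_2 u$ is linear in $u$, so $\partial H / \partial u = 0$ gives no usable information and the minimum sits on the boundary of $\Omega$. The helper names are hypothetical, and the costate values are arbitrary test numbers rather than a solved trajectory.

```python
def hamiltonian(x, lam, u):
    """H = 1 + lam1*x2 + lam2*u for x1_dot = x2, x2_dot = u and F = 1."""
    return 1.0 + lam[0] * x[1] + lam[1] * u

def minimizing_control(x, lam, u_min=-1.0, u_max=1.0, n=201):
    """Brute-force argmin of H over a grid on the closed set [u_min, u_max]."""
    candidates = [u_min + (u_max - u_min) * k / (n - 1) for k in range(n)]
    return min(candidates, key=lambda u: hamiltonian(x, lam, u))

# Pointwise minimization at one time instant for two sample costates.
u_a = minimizing_control(x=(0.0, 1.0), lam=(0.3, 2.0))    # lam2 > 0
u_b = minimizing_control(x=(0.0, 1.0), lam=(0.3, -0.5))   # lam2 < 0
```

For this Hamiltonian the grid search agrees with the closed form $u^* = -{\rm sign}(\lambda_2)$, i.e. the control switches between the two limits of $\Omega$ depending on the sign of $\lambda_2$.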