# Optimal Control

## Introduction

We have a system with a state $z_l \in \mathbf{R}^q$ that varies over time steps $l = 1,\ldots,L$, and actions or inputs $u_l \in \mathbf{R}^p$ that we can invoke in each step to affect the state. For example, $z_l$ might be the position and velocity of a rocket and $u_l$ the output of the rocket's thrusters. We model the evolution of the state as a linear dynamical system, i.e.,

$$z_{l+1} = F_lz_l + G_lu_l + h_l, \quad l = 1,\ldots,L-1,$$

where $F_l \in \mathbf{R}^{q \times q}$ and $G_l \in \mathbf{R}^{q \times p}$ are known dynamics matrices and $h_l \in \mathbf{R}^q$ is a known offset vector.
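As a rough illustration of the recursion above, the following sketch rolls the state forward from an initial state under a given input sequence. The function name `rollout` and the point-mass matrices (position and velocity with a time step of 0.1) are hypothetical choices for this example, not part of the original problem data. Note that only $u_1,\ldots,u_{L-1}$ enter the dynamics.

```python
import numpy as np

def rollout(F_list, G_list, h_list, z_init, u_list):
    """Return [z_1, ..., z_L] generated by z_{l+1} = F_l z_l + G_l u_l + h_l."""
    z = [np.asarray(z_init, dtype=float)]
    for F, G, h, u in zip(F_list, G_list, h_list, u_list):
        z.append(F @ z[-1] + G @ u + h)
    return z

# Hypothetical example: a point mass with state (position, velocity).
L = 4
F = np.array([[1.0, 0.1],
              [0.0, 1.0]])
G = np.array([[0.0],
              [0.1]])
h = np.zeros(2)

# Constant unit thrust applied at steps 1, ..., L-1.
u_list = [np.array([1.0])] * (L - 1)
z_traj = rollout([F] * (L - 1), [G] * (L - 1), [h] * (L - 1), np.zeros(2), u_list)
```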

Given an initial state $z_{\text{init}}$, our goal is to find an optimal set of actions that steers the system to a terminal state $z_{\text{term}}$. We do this by solving the finite-horizon optimal control problem

$$\begin{array}{ll}
\text{minimize} & \sum_{l=1}^L \phi_l(z_l, u_l) \\
\text{subject to} & z_{l+1} = F_lz_l + G_lu_l + h_l, 
\quad l = 1,\ldots,L-1, \\
& z_1 = z_{\text{init}}, \quad z_L = z_{\text{term}}
\end{array}$$

with variables $z_l \in \mathbf{R}^q$ and $u_l \in \mathbf{R}^p$ and cost functions $\phi_l: \mathbf{R}^q \times \mathbf{R}^p \rightarrow \mathbf{R} \cup \{\infty\}$. We will focus on a time-invariant linear quadratic version of this problem where $F_l = F, G_l = G, h_l = 0$, and

$$\phi_l(z_l,u_l) = \|z_l\|_2^2 + \|u_l\|_2^2 + I_{\{u\,:\,\|u\|_{\infty} \leq 1\}}(u_l), \quad l = 1,\ldots,L.$$

Here the set indicator is defined as

$$I_{\{u\,:\,\|u\|_{\infty} \leq 1\}}(u_l) 
= \begin{cases} 0 & \|u_l\|_{\infty} \leq 1 \\ 
\infty & \text{otherwise.} \end{cases}$$
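The stage cost can be evaluated directly from this definition. The sketch below is a minimal illustration, with `stage_cost` a hypothetical helper name: it returns infinity whenever the input violates the bound $\|u_l\|_{\infty} \leq 1$ and the quadratic cost otherwise.

```python
import numpy as np

def stage_cost(z_l, u_l):
    """phi_l(z_l, u_l) = ||z_l||_2^2 + ||u_l||_2^2 + I(||u_l||_inf <= 1)."""
    z_l = np.asarray(z_l, dtype=float)
    u_l = np.asarray(u_l, dtype=float)
    # The set indicator contributes +infinity when any |u_l[i]| exceeds 1.
    if np.max(np.abs(u_l)) > 1.0:
        return np.inf
    return float(z_l @ z_l + u_l @ u_l)

feasible_cost = stage_cost([3.0, 4.0], [0.5])    # input within the bound
infeasible_cost = stage_cost([0.0, 0.0], [1.5])  # input violates the bound
```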

## Reformulate Problem

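Let $z = (z_1,\ldots,z_L) \in \mathbf{R}^{Lq}$ and $u = (u_1,\ldots,u_L) \in \mathbf{R}^{Lp}$. Because the squared 2-norms add across time steps, and $\|u\|_{\infty} \leq 1$ holds exactly when $\|u_l\|_{\infty} \leq 1$ for every $l$, the objective function is

$$\sum_{l=1}^L \phi_l(z_l,u_l) = \|z\|_2^2 + \|u\|_2^2 + I_{\{u\,:\,\|u\|_{\infty} \leq 1\}}(u).$$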
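A quick numerical check of this identity, assuming randomly drawn states and feasible inputs (the dimensions and the helper names `objective_per_step` and `objective_stacked` are illustrative only), might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
L, q, p = 5, 3, 2  # hypothetical horizon and dimensions

z_steps = [rng.standard_normal(q) for _ in range(L)]
u_steps = [rng.uniform(-1.0, 1.0, size=p) for _ in range(L)]

def objective_per_step(z_steps, u_steps):
    """Sum of phi_l(z_l, u_l) over all time steps."""
    total = 0.0
    for z_l, u_l in zip(z_steps, u_steps):
        if np.max(np.abs(u_l)) > 1.0:
            return np.inf
        total += z_l @ z_l + u_l @ u_l
    return total

def objective_stacked(z, u):
    """||z||_2^2 + ||u||_2^2 + I(||u||_inf <= 1) on the stacked vectors."""
    if np.max(np.abs(u)) > 1.0:
        return np.inf
    return z @ z + u @ u

z = np.concatenate(z_steps)
u = np.concatenate(u_steps)
per_step_total = objective_per_step(z_steps, u_steps)
stacked_total = objective_stacked(z, u)
```

Both totals agree up to floating-point rounding, and both become infinite as soon as a single entry of any $u_l$ leaves $[-1, 1]$.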

If we define

$$\tilde F = \left[\begin{array}{ccccc}
I &    0 & \ldots &        0 &   0 \\
-F_1 &    I & \ldots &        0 &   0 \\
0 & -F_2 & \ldots &        0 &   0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 &    0 & \ldots & -F_{L-1} & I \\
0 &    0 & \ldots &        0 & I
\end{array}\right], \quad 
\tilde G = \left[\begin{array}{ccccc}
0 &    0 & \ldots &        0 & 0 \\
-G_1 &    0 & \ldots &        0 & 0 \\
0 & -G_2 & \ldots &        0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 &    0 & \ldots & -G_{L-1} & 0 \\
0 &    0 & \ldots &        0 & 0
\end{array}\right], \quad
\tilde h = \left[\begin{array}{c}
z_{\text{init}} \\ h_1 \\ \vdots \\ h_{L-1} \\ z_{\text{term}}
\end{array}\right],
$$

then the constraints can be written compactly as $\tilde Fz + \tilde Gu = \tilde h$. Thus, the time-invariant linear quadratic control problem fits the standard form of minimizing $f_1(x_1) + f_2(x_2)$ subject to $A_1x_1 + A_2x_2 = b$, with

$$f_1(x_1) = \|x_1\|_2^2, \quad f_2(x_2) = \|x_2\|_2^2 + I_{\{u\,:\,\|u\|_{\infty} \leq 1\}}(x_2),$$
$$A_1 = \tilde F, \quad A_2 = \tilde G, \quad b = \tilde h,$$

where $x_1 \in \mathbf{R}^{Lq}$ and $x_2 \in \mathbf{R}^{Lp}$. (Notice that we could also split the objective across time steps, so that each $f_i$ represents the state/action cost at a particular time step $l$.)
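The block matrices $\tilde F$, $\tilde G$ and the vector $\tilde h$ can be assembled with dense NumPy arrays as in the sketch below. The function name `build_constraints`, the dimensions, and the random test data are assumptions made for illustration. For long horizons a sparse representation would likely be preferable. The check at the end rolls out a trajectory from the dynamics and confirms that it satisfies $\tilde Fz + \tilde Gu = \tilde h$.

```python
import numpy as np

def build_constraints(F_list, G_list, h_list, z_init, z_term):
    """Assemble F_tilde, G_tilde, h_tilde so that F_tilde z + G_tilde u = h_tilde."""
    L = len(F_list) + 1
    q, p = G_list[0].shape
    F_tilde = np.zeros(((L + 1) * q, L * q))
    G_tilde = np.zeros(((L + 1) * q, L * p))
    h_tilde = np.zeros((L + 1) * q)

    # First block row: z_1 = z_init.
    F_tilde[:q, :q] = np.eye(q)
    h_tilde[:q] = z_init

    # Middle block rows: z_{l+1} - F_l z_l - G_l u_l = h_l.
    for l in range(L - 1):
        rows = slice((l + 1) * q, (l + 2) * q)
        F_tilde[rows, l * q:(l + 1) * q] = -F_list[l]
        F_tilde[rows, (l + 1) * q:(l + 2) * q] = np.eye(q)
        G_tilde[rows, l * p:(l + 1) * p] = -G_list[l]
        h_tilde[rows] = h_list[l]

    # Last block row: z_L = z_term.
    F_tilde[L * q:, (L - 1) * q:] = np.eye(q)
    h_tilde[L * q:] = z_term
    return F_tilde, G_tilde, h_tilde

# Hypothetical time-invariant instance with h_l = 0.
rng = np.random.default_rng(0)
L, q, p = 4, 3, 2
F = rng.standard_normal((q, q))
G = rng.standard_normal((q, p))
z_init = rng.standard_normal(q)
u_steps = [rng.uniform(-1.0, 1.0, size=p) for _ in range(L)]

# Roll out the dynamics and take the final state as the terminal target.
z_steps = [z_init]
for l in range(L - 1):
    z_steps.append(F @ z_steps[-1] + G @ u_steps[l])
z_term = z_steps[-1]

F_tilde, G_tilde, h_tilde = build_constraints(
    [F] * (L - 1), [G] * (L - 1), [np.zeros(q)] * (L - 1), z_init, z_term
)
residual = F_tilde @ np.concatenate(z_steps) + G_tilde @ np.concatenate(u_steps) - h_tilde
```

The last block column of $\tilde G$ is zero because $u_L$ does not enter the dynamics, so it affects only the objective.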

## Generate Data

TODO.