# Table of Contents

* [Introduction](#1)
  - [Dynamical Systems](#2)
    + [Basic terminology of dynamical systems](#3) 
    + [Challenges that may arise when modelling a dynamical system](#4)
    + [Types of dynamical systems](#5)
  - [Control Theory](#5)
    + [Basic terminology of control theory](#6)
    + [Challenges that may arise when controlling a system](#7)
    + [Methods of control](#8)
    + [Markov Decision Processes](#6)
  - [Optimization Problems](#7)
    + [Optimal Control](#8)
* [Solving Optimal Control Problems](#9)
  - [Analytical/ Planning Methods](#10)
    + [Variational Calculus](#11)
    + [Minimum Principle](#12)
    + [Hamilton-Jacobi-Bellman Equation](#13)
  - [Learning Methods](#14)
    + [Dynamic Programming](#15)
    + [Model-Based Reinforcement Learning](#16)
    + [Model-Free Reinforcement Learning](#17)
* [Implementation of Solutions](#18)
  - [Case Study 1: Trajectory Optimization](#19)
    + [System Design](#20)
    + [Control Implementation](#21)
  - [Case Study 2: Adaptive Control](#22)
    + [System Design](#23)
    + [Control Implementation](#24)
  - [Case Study 3: Stochastic Control](#25)
    + [System Design](#26)
    + [Control Implementation](#27)
* [Code Design](#28)

# Introduction <a id ="1"></a>

This introduction gives an overview of the core concepts behind optimal control theory and outlines its connection to other fields of knowledge. It is dedicated to understanding Optimal Control (OC) problems.

To build a better understanding of OC problems, I will begin with an overview of the core concepts behind optimal control theory, including Dynamical Systems, Control Theory, Markov Decision Processes, and Optimization Problems.

The topics in this section are selected to provide the reader with a broad understanding of some of the concepts that are related to OC. I will start with a few topics about optimization in general and then move on to other methods for solving trajectory optimization problems.

## Dynamical Systems <a id ="2"></a>
Dynamical systems are mathematical models that describe how a system's state changes over time. They are used to study the long-term behavior of complex systems, such as the weather, population growth, and the motion of celestial bodies. Dynamical systems can exhibit a wide range of behaviors, including stability, chaos, and oscillation, depending on the system's parameters and initial conditions.

In this topic on Dynamical Systems I will cover:
1) Basic Terminology of Dynamical Systems
2) Challenges that may arise when modelling a dynamical system
3) Types of dynamical systems.

A dynamic system is said to be:

- **Stable** for some class of initial states if its solution trajectories do not grow without bound,
- **Unstable** (or divergent) if the trajectories grow without bound,
- **Convergent** if the solution trajectories approach a single point, and
- **Autonomous** if its dynamics are invariant in time.

A **stable point** is a state $x$ such that for some neighborhood
of $x$, the ODE is convergent toward $x$. A necessary condition for a
point to be stable is that it is an *equilibrium point*.


> **Equilibrium point**. For a continuous time dynamical system, a state $x$ such that $\dot{x} = f(x) = 0$.  For a discrete time dynamical system, a state that satisfies $x = f(x)$.

All stable points are equilibria, but the converse is not true: a point can be an equilibrium without being stable.
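
To make the distinction between equilibrium and stable points concrete, here is a minimal sketch. The system $\dot{x} = x - x^3$, the helper `simulate`, and the step size are my own illustrative assumptions, not part of the definitions above. The system has equilibria at $x = -1, 0, 1$, and a simple Euler integration suggests which of them attract nearby states.

```python
def f(x):
    """Illustrative dynamics x_dot = x - x**3 (an assumed example system)."""
    return x - x ** 3

def simulate(x0, dt=0.01, steps=2000):
    """Integrate x_dot = f(x) with forward Euler and return the final state."""
    x = x0
    for _ in range(steps):
        x += dt * f(x)
    return x

# All three points satisfy f(x) = 0, so each one is an equilibrium.
equilibria = [-1.0, 0.0, 1.0]
residuals = [f(x) for x in equilibria]

# Starting exactly at 0 the state never moves, but a small perturbation
# grows and the state leaves the neighborhood of 0.
x_from_zero = simulate(0.0)
x_from_near_zero = simulate(0.01)

# Starting near 1 the state is pulled back toward 1.
x_from_near_one = simulate(1.1)
```

In this sketch, `x_from_near_zero` ends up close to 1 rather than 0, which is the behavior of an equilibrium that is not stable, while `x_from_near_one` returns toward 1.
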
- Challenges that may arise when modelling a dynamical system:
    
    - **Unknown dynamics**: The system's dynamics are not well understood.

    - **High dimensional systems**:
    
    - **Chaotic/Nonlinear systems**:
    
    - **Hidden variables/ Partial observability**: Partial observability means that only certain aspects of the state can possibly be measured by the available sensors. For example, a mobile robot with a GPS sensor can only measure position, whereas it may need to model velocity as part of its state. State estimation techniques, such as Kalman filtering and particle filtering, can be used to extrapolate the unobserved components of state to provide reasonable state estimates. With those estimates, there will be some remaining localization error that the controller will still need to handle.
    
    - **Noise/ Disturbances**
    
    - **Scale differences/ changes**: Generally speaking, errors can be characterized as being either noisy or systematic. A noisy error is one that obeys no obvious pattern each time it is measured. A systematic error is one that does obey a pattern. These deviations can also be categorized as **motion uncertainty** and **state uncertainty**. Disturbances are a form of motion uncertainty that cause the state to be moved in unexpected ways at future points in time. For example, wind gusts are very hard to predict in advance, and can move a drone from a desired path. Actuation error occurs when a desired control is not executed faithfully. These errors can be treated as motion uncertainty. Measurement error is a type of state uncertainty where, due to sensor noise, the state is observed incorrectly. Understanding measurement error is critical for closed-loop controllers, which base their behavior on the measured state. Modeling error means that the modeled dynamics function differs from the actual dynamics of the system. This is sometimes considered a third class of uncertainty, but could also be treated as state uncertainty.
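
The partial observability item above mentions Kalman filtering as a way to estimate unobserved state components. As a rough sketch of that idea, the code below runs a small Kalman filter on a one-dimensional robot whose position is measured with noise and whose velocity is never measured. The constant-velocity model, the noise levels, and all names (`kalman_step`, `q`, `r`) are assumptions I chose for illustration.

```python
import random

def kalman_step(p, v, P, z, dt, q, r):
    """One predict/update cycle for state (p, v) with a position-only measurement z.

    P is the symmetric 2x2 covariance stored as (a, b, c) = (var_p, cov_pv, var_v).
    """
    a, b, c = P
    # Predict with the assumed constant-velocity model: p <- p + dt * v, v <- v.
    p_pred = p + dt * v
    a_pred = a + 2.0 * dt * b + dt * dt * c + q
    b_pred = b + dt * c
    c_pred = c + q
    # Update: only position is measured, so the measurement matrix is H = [1, 0].
    innovation = z - p_pred
    s = a_pred + r                      # innovation variance
    k_p, k_v = a_pred / s, b_pred / s   # Kalman gain for position and velocity
    p_new = p_pred + k_p * innovation
    v_new = v + k_v * innovation
    P_new = ((1.0 - k_p) * a_pred, (1.0 - k_p) * b_pred, c_pred - k_v * b_pred)
    return p_new, v_new, P_new

rng = random.Random(0)                  # fixed seed so the run is repeatable
dt, true_velocity, noise_std = 0.1, 2.0, 0.5
p_est, v_est, P = 0.0, 0.0, (100.0, 0.0, 100.0)   # vague initial belief

for t in range(1, 201):
    true_position = true_velocity * t * dt
    z = true_position + rng.gauss(0.0, noise_std)  # noisy GPS-like reading
    p_est, v_est, P = kalman_step(p_est, v_est, P, z, dt, q=1e-6, r=noise_std ** 2)
```

Although velocity is never observed directly, `v_est` should end up close to `true_velocity`, with some remaining error that a controller would still need to tolerate.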

Motion uncertainty can be modeled as a disturbance to the dynamics

$\dot{x} = f(x,u) + \epsilon_d$

where $\epsilon_d(t) \in E_d$ is some error. Here $E_d$ is a set of possible disturbances, or a probability
distribution over disturbances. Motion uncertainty will cause an
open-loop system to "drift" from its intended trajectory over time. A
properly designed closed-loop controller can regulate the disturbances
by choosing controls that drive the system back to the intended trajectory.
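
The drift described above can be sketched numerically. In this example I assume the simplest possible dynamics $f(x,u) = u$, an intended trajectory of $x = 0$, and a constant disturbance $\epsilon_d = 0.5$ (in general $\epsilon_d$ could be random). The proportional gain of 5 and the function names are likewise illustrative choices.

```python
def simulate(controller, disturbance=0.5, dt=0.01, steps=1000):
    """Euler-integrate x_dot = u + eps_d from x = 0 and return the final state."""
    x = 0.0
    for _ in range(steps):
        u = controller(x)
        x += dt * (u + disturbance)
    return x

def open_loop(x):
    """Ignores the state: the nominal control for staying at x = 0 is u = 0."""
    return 0.0

def closed_loop(x):
    """Proportional feedback that pushes the state back toward x = 0."""
    return -5.0 * x

x_open = simulate(open_loop)      # keeps drifting away from the target
x_closed = simulate(closed_loop)  # stays near the target
```

Under these assumptions the open-loop state drifts steadily away from 0, while the feedback controller keeps it close. Pure proportional feedback leaves a small steady offset here, which is one reason more elaborate controllers exist.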

In many cases it is convenient to talk about discrete-time systems, in which time is no longer a continuous variable but a discrete quantity $t = 0, 1, 2, \ldots$, and the dynamics are specified in the form

$x_{t+1}=f(x_{t},u_{t})$

Here, the control is allowed to change only at discrete points in time, and the state is only observed at discrete points in time. This more accurately characterizes digital control systems, which operate on a given clock frequency. However, in many situations the control frequency is so high that the continuous-time model $\dot{x} = f(x,u)$ is appropriate.
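
As a small sketch of the discrete-time form, the code below rolls out $x_{t+1} = f(x_t, u_t)$ for a sequence of controls. It also shows one common way to obtain such a map from a continuous-time model, namely a forward Euler step. The helper names `rollout` and `euler_discretize` and the example dynamics $\dot{x} = -x + u$ are assumptions for illustration.

```python
def rollout(f, x0, controls):
    """Apply x_{t+1} = f(x_t, u_t) for each control and return all visited states."""
    states = [x0]
    for u in controls:
        states.append(f(states[-1], u))
    return states

def euler_discretize(f_cont, dt):
    """Approximate x_dot = f_cont(x, u) by the map x_{t+1} = x_t + dt * f_cont(x_t, u_t)."""
    return lambda x, u: x + dt * f_cont(x, u)

def f_cont(x, u):
    """Assumed continuous-time example dynamics x_dot = -x + u."""
    return -x + u

f_disc = euler_discretize(f_cont, dt=0.1)

# Five steps with zero control: the state decays toward 0.
trajectory = rollout(f_disc, x0=1.0, controls=[0.0] * 5)
```

The control sequence has one entry per time step, which matches the idea that the control can only change at discrete points in time.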

We often encounter systems of the form

$\ddot{x}=f(x,\dot{x},u)$

which relate state and controls to accelerations of the state, $\ddot{x}=\frac{d^2x}{dt^2}$. This does not seem to satisfy our definition of a dynamic system, since we have not yet seen a double time derivative. However, we can employ a stacking trick to define a first-order system of twice the dimension. Let us define the stacked state vector

  \begin{align}
    y &= \begin{pmatrix}
           x \\
           \dot{x}
         \end{pmatrix}
  \end{align}

Then, we can rewrite the second-order system in first-order form as

$\dot{y}=g(y, u)$

where $g(y, u)=\begin{pmatrix} \dot{x} \\ f(x, \dot{x}, u) \end{pmatrix}$ simply "unstacks" the state and velocity from $y$. Now all of the machinery of first-order systems can be applied to the second-order system. This can also be done for dynamic systems of order 3 and higher, wherein all derivatives are stacked into a single vector.
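
The stacking trick can be sketched in a few lines. Here I assume a damped spring, $\ddot{x} = -4x - 0.5\dot{x} + u$, as the second-order system. The coefficients, the tuple representation of $y$, and the Euler integrator are illustrative choices only.

```python
def f(x, x_dot, u):
    """Assumed second-order dynamics: acceleration of a damped spring with input u."""
    return -4.0 * x - 0.5 * x_dot + u

def g(y, u):
    """First-order form: unstack y = (x, x_dot) and return y_dot = (x_dot, x_ddot)."""
    x, x_dot = y
    return (x_dot, f(x, x_dot, u))

def euler_step(y, u, dt):
    """Advance the stacked state by one forward Euler step."""
    dy = g(y, u)
    return (y[0] + dt * dy[0], y[1] + dt * dy[1])

# Release the spring from x = 1 at rest, with no control applied.
y = (1.0, 0.0)
for _ in range(10000):
    y = euler_step(y, u=0.0, dt=0.001)
```

Once the system is written as `g`, any first-order integrator can be reused unchanged. Only the state tuple grows when higher derivatives are stacked.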



Feedback/closed-loop control is a control system that adjusts its output based on measured differences between the actual output and the desired output, with the goal of reducing those differences over time.

# Optimal Control <a id ="12"></a>

Optimal Control refers to the process of finding the control signals that will drive a dynamic system to a desired state while minimizing or maximizing a given performance criterion, such as time or energy consumption.