# Learning an MPC controller with a NN
The goal of this example is to learn an MPC controller using an artificial neural network (ANN).


## Introduction
Learning an MPC controller from simulations and using the learned model in real applications can have several advantages <cite data-footcite="Karg2020">Karg and Lucia (2020)</cite>.
For example, embedded applications often require the control action to be computed in a very short time, with little computational power and low energy consumption (e.g., battery-powered devices). In such cases it is useful to learn the controller offline: the learned controller approximates the solution of the optimization problem as a function of the measured states. It can then be evaluated quickly and efficiently online on the embedded system, without solving an optimization problem.
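
To illustrate why online evaluation is cheap, the following sketch evaluates a small feedforward network in plain Python. It is not HILO-MPC code: the helper names (`make_layer`, `mlp_forward`) are hypothetical, the weights are random rather than trained, and the layer sizes (2 inputs, three hidden layers with 10 ReLU units, 1 output) are an assumption chosen to resemble the network defined later in this example.

```python
import random


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def dense(weights, bias, v):
    """Affine map W v + b, with W stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]


def make_layer(n_out, n_in, rng):
    """Random (untrained) weights and zero bias, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def mlp_forward(layers, x):
    """ReLU on every hidden layer, linear output layer."""
    h = list(x)
    for weights, bias in layers[:-1]:
        h = relu(dense(weights, bias, h))
    weights, bias = layers[-1]
    return dense(weights, bias, h)


rng = random.Random(0)
sizes = [2, 10, 10, 10, 1]  # assumed architecture: states -> hidden layers -> input u
layers = [make_layer(n_out, n_in, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# One controller evaluation: a handful of small matrix-vector products
u = mlp_forward(layers, [12.5, 0.0])

# Number of multiply-adds per evaluation: 2*10 + 10*10 + 10*10 + 10*1 = 230
n_mults = sum(n_in * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))
```

The cost of one evaluation is fixed and known in advance, whereas solving the MPC problem online requires an iterative solver at every sampling instant.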

The system that is considered in this example is a linear mass-spring-damper system, whose dynamics are given by

<img src="../images/mass_spring_damper.png" alt="drawing" width="800"/>

\begin{align}
& \dot{x}_1 = x_2 \\
& \dot{x}_2 = \frac{1}{m} (u - k x_1 - d x_2)
\end{align}

Therein, the state $x_1$ is the displacement of the mass $m$, the state $x_2$ is the velocity of the mass, the input $u$ is the applied force, $k$ is the spring constant and $d$ is the damping factor.

We assume that the states can be measured perfectly, i.e., we will not define a separate output equation.

One sampling period takes $15~\text{ms}$. 
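
As a rough illustration of what one sampling period means for this system, the sketch below advances the dynamics by $15~\text{ms}$ with a fixed-step fourth-order Runge-Kutta scheme. The integrator choice is an assumption made for illustration and is not necessarily what HILO-MPC uses internally. The values $m = 1$, $k = 2$ and $d = 0.8$ are read off from the model equation `'u - 2 * x_0 - 0.8 * x_1'` defined in the next section, and the function names `smd_rhs` and `rk4_step` are hypothetical.

```python
def smd_rhs(x, u, m=1.0, k=2.0, d=0.8):
    """Right-hand side of the mass-spring-damper dynamics."""
    x1, x2 = x
    return [x2, (u - k * x1 - d * x2) / m]


def rk4_step(x, u, dt=0.015):
    """One RK4 step with the input u held constant over the sampling period."""
    def shifted(a, b, s):
        return [ai + s * bi for ai, bi in zip(a, b)]

    k1 = smd_rhs(x, u)
    k2 = smd_rhs(shifted(x, k1, dt / 2), u)
    k3 = smd_rhs(shifted(x, k2, dt / 2), u)
    k4 = smd_rhs(shifted(x, k3, dt), u)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + e)
        for xi, a, b, c, e in zip(x, k1, k2, k3, k4)
    ]


# Unforced step from a displacement of 12.5: the spring pulls the mass back
x_next = rk4_step([12.5, 0.0], u=0.0)

# A constant force u = k * x1 holds the mass at rest at x1 = 1
x_hold = rk4_step([1.0, 0.0], u=2.0)
```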

In [1]:
# Add HILO-MPC to path. NOT NECESSARY if it was installed via pip.
import sys
sys.path.append('../../')

import numpy as np
import pandas as pd

from hilo_mpc import Model, NMPC, SimpleControlLoop, ANN, Layer

# Ignore deprecation warnings coming from tensorflow
import warnings
warnings.filterwarnings("ignore") 

## Model


In [2]:
system = Model(plot_backend='bokeh', name='Linear SMD')

# Set states and inputs
system.set_dynamical_states('x', 2)
system.set_inputs('u')

# Add dynamics equations to model
system.set_dynamical_equations(['x_1', 'u - 2 * x_0 - 0.8 * x_1'])

# Sampling time
Ts = 0.015  # Ts = 15 ms

# Set-up
system.setup(dt=Ts)

# Initialize system
x_0 = [12.5, 0]
system.set_initial_conditions(x_0)

# Bound region within which inputs can be located
u_lb = -25  # Lower bound
u_ub = 25  # Upper bound

## Generate data 
We start by defining the MPC controller that we want to learn.

In [3]:
# Make controller
nmpc = NMPC(system)

# Set horizon
nmpc.horizon = 15

# Set cost function
nmpc.quad_stage_cost.add_states(names=['x_0', 'x_1'], weights=[100, 100], ref=[1, 0])
nmpc.set_box_constraints(u_ub=u_ub, u_lb=u_lb)
nmpc.quad_terminal_cost.add_states(names=['x_0', 'x_1'], weights=np.array([[8358.1, 1161.7], [1161.7, 2022.9]]),
                                   ref=[1, 0])

# Set-up controller
nmpc.setup(options={'print_level': 0})
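
The stage and terminal costs configured above penalize the deviation of the states from the reference `[1, 0]`. Assuming the usual quadratic form $(x - x_\text{ref})^\top W (x - x_\text{ref})$, with the list of stage weights interpreted as the diagonal of $W$ (an assumption about how the weights are applied, not a statement about HILO-MPC internals), the two terms can be evaluated by hand as in the following sketch. The helper `quad_cost` is hypothetical.

```python
def quad_cost(x, ref, W):
    """Quadratic form (x - ref)^T W (x - ref) for a square weight matrix W."""
    e = [xi - ri for xi, ri in zip(x, ref)]
    n = len(e)
    return sum(e[i] * W[i][j] * e[j] for i in range(n) for j in range(n))


Q = [[100.0, 0.0], [0.0, 100.0]]                  # stage weights as a diagonal matrix
P = [[8358.1, 1161.7], [1161.7, 2022.9]]          # terminal weight matrix from the cell above
ref = [1.0, 0.0]

# Cost of the initial state x = [12.5, 0]: the deviation is [11.5, 0]
stage = quad_cost([12.5, 0.0], ref, Q)            # 11.5**2 * 100 = 13225.0
terminal = quad_cost([12.5, 0.0], ref, P)         # 11.5**2 * 8358.1

# Both terms vanish at the reference
at_ref = quad_cost(ref, ref, P)
```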

### Simulation

In [4]:
# Vector of simulation time points
Tf = 10  # Final time
t = np.arange(0, Tf, Ts)
n_steps = int(Tf / Ts)
scl = SimpleControlLoop(system, nmpc)
scl.run(n_steps)


******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************



In [5]:
scl.plot(output_notebook=True)


## Training
Now we define an NN and train it with the generated data set.

In [6]:
features = system.dynamical_state_names
labels = system.input_names

ann = ANN(features, labels, learning_rate=1e-3)
ann.add_layers(Layer.dense(10, activation='ReLU'))
ann.add_layers(Layer.dense(10, activation='ReLU'))
ann.add_layers(Layer.dense(10, activation='ReLU'))
ann.setup(save_tensorboard=True, tensorboard_log_dir='../Results/learn_mpc/runs')

# Create dictionary that is compatible with the ANN
sol_dict = system.solution.to_dict('x', 'u')

learn_dict = {}
learn_dict.update({name: sol_dict[name].squeeze()[:-1] for name in features})
learn_dict.update({name: sol_dict[name].squeeze() for name in labels})

df = pd.DataFrame(data=learn_dict)

# Add dataset
ann.add_data_set(df)

# Train NN
ann.train(1, 2000, test_split=.2, patience=100, verbose=1)

Evaluate on test data
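
The slicing `[:-1]` applied to the features in the cell above presumably accounts for the closed-loop run storing one more state sample than input samples (the final state has no associated input). This is an assumption about the layout of the solution object. The sketch below shows the same alignment on made-up numbers, using a hypothetical helper `pair_samples`.

```python
def pair_samples(states, inputs):
    """Pair each state x_k with the input u_k that the controller applied at x_k."""
    if len(states) != len(inputs) + 1:
        raise ValueError("expected one more state sample than input samples")
    # Drop the final state: no input was computed for it
    return list(zip(states[:-1], inputs))


# Made-up trajectory: x_0 .. x_3 and u_0 .. u_2
states = [(12.5, 0.0), (12.4, -0.4), (12.3, -0.7), (12.1, -1.0)]
inputs = [-25.0, -25.0, -24.0]

samples = pair_samples(states, inputs)
```

Each entry of `samples` is one training example: the state is the feature vector and the input is the label.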


## Use the NN in closed loop
We now use the ANN controller in place of the MPC. Note that we start from an initial condition different from the one used to generate the training data.

In [7]:
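system.reset_solution()

# Start from a new initial condition and use it for the first prediction
x_0 = [10, 0]
system.set_initial_conditions(x0=x_0)
for i in range(n_steps):
    # Evaluate the learned controller at the current state
    u = ann.predict(x_0)
    system.simulate(u=u)
    x_0 = system.solution['xf']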

system.solution.plot(output_notebook=True)

NotImplementedError: Wrong number or type of arguments for overloaded function 'Function_call'.
  Possible prototypes are:
    call(self,dict:DM,bool,bool)
    call(self,[DM],bool,bool)
    call(self,[SX],bool,bool)
    call(self,dict:SX,bool,bool)
    call(self,dict:MX,bool,bool)
    call(self,[MX],bool,bool)
  You have: '(Function,(NoneType))'


## Final notes
HILO-MPC is developed by Johannes Pohlodek and Bruno Morabito under the supervision of Prof. Rolf Findeisen
at the Control and cyber-physical systems laboratory, TU Darmstadt (https://www.ccps.tu-darmstadt.de/ccp) and at the
Laboratory for Systems Theory and Control, Otto von Guericke University (http://ifatwww.et.uni-magdeburg.de/syst/).

This is a quick example of an ANN controller; hence, we did not focus on generating a data set that covers a wide
range of states, nor did we postprocess or optimize the data points.