# Efficient data generation and handling with do-mpc

We start by importing basic modules and **do-mpc**.

In [1]:
import numpy as np
import sys
from casadi import *
import os

# Add do_mpc to path. This is not necessary if it was installed via pip
sys.path.append('../../../')

# Import do_mpc package:
import do_mpc

import matplotlib.pyplot as plt

import pandas as pd

## Toy example


**Step 1:** Create the ``sampling_plan`` with the ``SamplingPlanner``.

The planner is initialized and we set some (optional) parameters.

In [2]:
sp = do_mpc.sampling.SamplingPlanner()
sp.set_param(overwrite = True)
sp.set_param(save_format='pickle')

We then introduce new variables to the ``SamplingPlanner`` which should be sampled. These variables are later varied for each sample.

For each variable we need to provide a function which generates a sample value for the corresponding variable.

In [3]:
sp.set_sampling_var('alpha', np.random.randn)
sp.set_sampling_var('beta', lambda: np.random.randint(0,5))

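In this example we have two variables ``alpha`` and ``beta``. Since ``np.random.randn`` draws from the standard normal distribution, we have:

$$
\alpha \sim \mathcal{N}(0,1)
$$

and, since ``np.random.randint(0,5)`` draws integers uniformly with an exclusive upper bound:

$$
\beta \sim \mathcal{U}(\{0,1,2,3,4\})
$$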

The sampling plan is created by calling:

```python
SamplingPlanner.gen_sampling_plan(sampling_plan_name, n_samples)
```

In [6]:
plan = sp.gen_sampling_plan('test_sampling_plan', n_samples=10)

Note that this call both returns the ``sampling_plan`` and saves it to disk. This makes it possible to create a sampling plan on one device and execute parts of it on other devices. The filename is the same as the ``sampling_plan_name``.
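To make the idea of a sampling plan more concrete, the following is a minimal sketch using only the standard library. It is not the do-mpc implementation: the helper ``make_plan``, the list-of-dicts layout and the zero-padded ``id`` are assumptions inferred from the outputs shown in this notebook, and ``random.gauss`` / ``random.randrange`` are stand-ins for the NumPy generators used above.

```python
import random

def make_plan(sampling_vars, n_samples):
    """Hypothetical helper: call every generator once per sample case."""
    width = len(str(n_samples))  # assumed padding, gives '00'..'09' for 10 samples
    plan = []
    for i in range(n_samples):
        case = {name: gen() for name, gen in sampling_vars.items()}
        case['id'] = str(i).zfill(width)
        plan.append(case)
    return plan

random.seed(0)
sampling_vars = {
    'alpha': lambda: random.gauss(0.0, 1.0),  # stand-in for np.random.randn
    'beta': lambda: random.randrange(0, 5),   # stand-in for np.random.randint(0, 5)
}
plan_sketch = make_plan(sampling_vars, n_samples=10)
print(plan_sketch[0])
```

Each entry of such a plan fully describes one sample case, which is why parts of a plan can be evaluated independently of each other.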

**Step 2:** Create the ``Sampler`` object by providing the ``sampling_plan``:

In [8]:
sampler = do_mpc.sampling.Sampler(plan)

The most important setting of the sampler is the ``sample_function``. This function takes the previously defined sampling variables as arguments, using the same names as keywords:

In [9]:
def sample_function(alpha, beta):
    return alpha*beta

sampler.set_sample_function(sample_function)

Now we can actually create all the samples:

In [10]:
sampler.sample_data()

Progress: |██████████████████████████████████████████████████| 100.0% Complete


The sampler will now create the sampling results as a new file for each result and store them in a subfolder with the same name as the ``sampling_plan``:

In [12]:
ls = os.listdir('./test_sampling_plan/')
ls.sort()
ls

['test_sampling_plan_00.pkl',
 'test_sampling_plan_01.pkl',
 'test_sampling_plan_02.pkl',
 'test_sampling_plan_03.pkl',
 'test_sampling_plan_04.pkl',
 'test_sampling_plan_05.pkl',
 'test_sampling_plan_06.pkl',
 'test_sampling_plan_07.pkl',
 'test_sampling_plan_08.pkl',
 'test_sampling_plan_09.pkl']
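The behaviour described above (call the sample function with matching keywords, then store one file per result) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the code of ``Sampler``: ``run_plan`` and ``toy_plan`` are hypothetical names, and the file naming simply mirrors the listing above.

```python
import os
import pickle
import tempfile

def run_plan(plan, sample_function, plan_name, out_dir):
    """Hypothetical helper: evaluate every case and pickle each result separately."""
    os.makedirs(out_dir, exist_ok=True)
    for case in plan:
        # Pass the sampling variables by keyword; 'id' is bookkeeping only.
        kwargs = {k: v for k, v in case.items() if k != 'id'}
        result = sample_function(**kwargs)
        path = os.path.join(out_dir, f"{plan_name}_{case['id']}.pkl")
        with open(path, 'wb') as f:
            pickle.dump(result, f)

toy_plan = [
    {'alpha': 0.5, 'beta': 2, 'id': '00'},
    {'alpha': -1.0, 'beta': 3, 'id': '01'},
]

def sample_function(alpha, beta):
    return alpha * beta

out_dir = os.path.join(tempfile.mkdtemp(), 'toy_plan')
run_plan(toy_plan, sample_function, 'toy_plan', out_dir)
files = sorted(os.listdir(out_dir))
print(files)
```

Storing one file per sample means that an interrupted run loses at most the case that was being evaluated.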

**Step 3:** Process data in the data handler class. The first step is to instantiate the class with the ``sampling_plan``:

In [13]:
dh = do_mpc.sampling.DataHandler(plan)

We then define how post-processing should be done. For this toy example we do some "dummy" post-processing and request to compute two results:

In [14]:
dh.set_post_processing('res_1', lambda x: x)
dh.set_post_processing('res_2', lambda x: x**2)

The interface of ``DataHandler.set_post_processing`` requires a name that we will see again later and a function that processes the output of the previously defined ``sample_function``.
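As a rough mental model, each post-processing function is applied to the stored output of the ``sample_function`` and the result is collected under the given name. The sketch below illustrates this with plain Python; ``collect``, ``toy_plan`` and ``toy_results`` are hypothetical names and the dict-of-lists layout is an assumption, so this should not be read as the ``DataHandler`` implementation.

```python
def collect(plan, results, post_processing):
    """Hypothetical helper: merge sampling variables and processed results."""
    # One list per sampling variable (and the id) ...
    table = {key: [case[key] for case in plan] for key in plan[0]}
    # ... plus one list per named post-processing function.
    for name, func in post_processing.items():
        table[name] = [func(res) for res in results]
    return table

toy_plan = [
    {'alpha': -0.5, 'beta': 4, 'id': '00'},
    {'alpha': 1.5, 'beta': 2, 'id': '01'},
]
# Output of sample_function(alpha, beta) = alpha*beta for each case
toy_results = [case['alpha'] * case['beta'] for case in toy_plan]

table = collect(toy_plan, toy_results, {
    'res_1': lambda x: x,
    'res_2': lambda x: x**2,
})
print(table)
```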

We can now obtain **processed data** from the ``DataHandler`` in two ways:

**1. Indexing**:

In [15]:
dh[:3]

{'alpha': [-0.12247445889036378, -1.3251862172292217, 0.36828329367273743],
 'beta': [4, 3, 0],
 'id': ['00', '01', '02'],
 'res_1': [-0.4898978355614551, -3.975558651687665, 0.0],
 'res_2': [0.2399998892877985, 15.805066593008645, 0.0]}

**2. Filtering**:

For more complex selections we use the ``DataHandler.filter`` method. This method requires a function whose argument(s) have the same names as the variables introduced in the ``SamplingPlanner``.

In [16]:
dh.filter(lambda alpha: alpha<0)

{'alpha': [-0.12247445889036378,
  -1.3251862172292217,
  -0.06990603420085585,
  -1.252424500755703,
  -1.4235389183405416,
  -0.362726217072821,
  -0.01543271242471949],
 'beta': [4, 3, 1, 3, 4, 3, 0],
 'id': ['00', '01', '03', '04', '07', '08', '09'],
 'res_1': [-0.4898978355614551,
  -3.975558651687665,
  -0.06990603420085585,
  -3.7572735022671093,
  -5.694155673362166,
  -1.088178651218463,
  -0.0],
 'res_2': [0.2399998892877985,
  15.805066593008645,
  0.004886853617691228,
  14.11710417083855,
  32.423408832482544,
  1.1841327769676333,
  0.0]}
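Matching a filter function to the sampling variables by argument name can be sketched with ``inspect.signature`` from the standard library. This is only one possible way to implement such name-based matching, shown under the assumption that each sample case is a dict; ``filter_cases`` and ``toy_plan`` are hypothetical names.

```python
import inspect

def filter_cases(plan, condition):
    """Hypothetical helper: keep the cases for which condition(...) is true."""
    # The argument names of the condition select which variables are passed in.
    arg_names = list(inspect.signature(condition).parameters)
    return [
        case for case in plan
        if condition(**{name: case[name] for name in arg_names})
    ]

toy_plan = [
    {'alpha': -0.5, 'beta': 4, 'id': '00'},
    {'alpha': 1.5, 'beta': 2, 'id': '01'},
    {'alpha': -2.0, 'beta': 0, 'id': '02'},
]

negative_alpha = filter_cases(toy_plan, lambda alpha: alpha < 0)
combined = filter_cases(toy_plan, lambda alpha, beta: alpha < 0 and beta > 0)
print([case['id'] for case in negative_alpha])
print([case['id'] for case in combined])
```

Because the match is done by name, a condition only needs to declare the variables it actually uses.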

Note that in many cases the result from the ``DataHandler`` can be readily converted to a pandas ``DataFrame``:

In [17]:
pd.DataFrame(dh.filter(lambda alpha: alpha<0))

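|   | alpha | beta | id | res_1 | res_2 |
|---|-------|------|----|-------|-------|
| 0 | -0.122474 | 4 | 00 | -0.489898 | 0.24 |
| 1 | -1.325186 | 3 | 01 | -3.975559 | 15.805067 |
| 2 | -0.069906 | 1 | 03 | -0.069906 | 0.004887 |
| 3 | -1.252425 | 3 | 04 | -3.757274 | 14.117104 |
| 4 | -1.423539 | 4 | 07 | -5.694156 | 32.423409 |
| 5 | -0.362726 | 3 | 08 | -1.088179 | 1.184133 |
| 6 | -0.015433 | 0 | 09 | -0.0 | 0.0 |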


## Sampling closed-loop trajectories

A more realistic use case in the scope of **do-mpc** is to sample closed-loop trajectories of a dynamical system with an MPC controller.

The approach is almost identical to our toy example above. The main difference lies in the ``sample_function`` that is passed to the ``Sampler`` and the ``post_processing`` in the ``DataHandler``.

The use case considered here is the simple oscillating masses example, which is also part of the do-mpc example library.

In [15]:
sys.path.append('../../../examples/oscillating_masses_discrete/')
from template_model import template_model
from template_mpc import template_mpc
from template_simulator import template_simulator

**Step 1:** Create the ``sampling_plan`` with the ``SamplingPlanner``.

We want to generate various closed-loop trajectories of the system starting from random initial states, hence we design the ``SamplingPlanner`` as follows:

In [16]:
# Initialize sampling planner
sp = do_mpc.sampling.SamplingPlanner()
sp.set_param(overwrite=True)

# Sample random feasible initial states
def gen_initial_states():
    
    x0 = np.random.uniform(-3*np.ones((4,1)),3*np.ones((4,1)))
    
    return x0

# Add sampling variable including the corresponding evaluation function
sp.set_sampling_var('X0', gen_initial_states)

This implementation is sufficient to generate the sampling plan:

In [17]:
plan = sp.gen_sampling_plan('oscillating_masses', n_samples=10)

Since we want to run the system in closed loop in our sample function, we need to load the corresponding configuration:

In [18]:
model = template_model()
mpc = template_mpc(model)
estimator = do_mpc.estimator.StateFeedback(model)
simulator = template_simulator(model)

We can now define the sampling function:

In [19]:
def run_closed_loop(X0):
    mpc.reset_history()
    simulator.reset_history()
    estimator.reset_history()

    # set initial values and guess
    x0 = X0
    mpc.x0 = x0
    simulator.x0 = x0
    estimator.x0 = x0

    mpc.set_initial_guess()

    # run the closed loop for 10 steps
    for k in range(10):
        u0 = mpc.make_step(x0)
        y_next = simulator.make_step(u0)
        x0 = estimator.make_step(y_next)

    # we return the complete data structure that we have obtained during the closed-loop run
    return mpc.data
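The structure of this sample function (start from a fresh history, set the initial state, then alternate controller and simulator steps) can be illustrated without do-mpc. The sketch below is a toy stand-in under stated assumptions: a scalar system with a proportional controller replaces the MPC, full state feedback is assumed so no estimator is needed, and ``run_toy_closed_loop`` with all of its parameters is a hypothetical name.

```python
def run_toy_closed_loop(x0, n_steps=10, a=1.0, b=0.5, gain=1.0):
    """Simulate x[k+1] = a*x[k] + b*u[k] under the feedback law u[k] = -gain*x[k]."""
    # A fresh history per call plays the role of the reset_history() calls above.
    history = {'x': [], 'u': []}
    x = x0
    for _ in range(n_steps):
        u = -gain * x      # controller step (stand-in for mpc.make_step)
        history['x'].append(x)
        history['u'].append(u)
        x = a * x + b * u  # simulator step (stand-in for simulator.make_step)
    return history

trajectory = run_toy_closed_loop(x0=2.0)
print(trajectory['x'])
```

Returning the complete history, rather than a single number, leaves the choice of what to extract to the post-processing step.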

Now we have all the ingredients to make our sampler:

In [20]:
# Initialize sampler with generated plan
sampler = do_mpc.sampling.Sampler(plan)

# Set the sampling function
sampler.set_sample_function(run_closed_loop)

# Generate the data
sampler.sample_data()

Progress: |██████████████████████████████████████████████████| 100.0% Complete


**Step 3:** Process data in the data handler class. The first step is to instantiate the class with the ``sampling_plan``:

In [21]:
# Initialize DataHandler
dh = do_mpc.sampling.DataHandler(plan)

In this case, we are interested in the states and the inputs of all trajectories. We define the following post-processing functions:

In [22]:
dh.set_post_processing('input', lambda data: data['_u', 'u'])
dh.set_post_processing('state', lambda data: data['_x', 'x'])

Since we want to have all trajectories, we do not need to filter our data:

In [23]:
final_data = dh[:]

In [24]:
final_data

{'X0': [array([[-0.47174792],
         [-1.05760168],
         [-0.76417566],
         [ 1.01076908]]),
  array([[-0.8305777 ],
         [ 1.55993182],
         [-1.70828974],
         [ 1.24334325]]),
  array([[ 2.54954606],
         [-1.54331618],
         [ 0.22384772],
         [-1.59216515]]),
  array([[ 0.61251651],
         [ 1.73073021],
         [-1.4029336 ],
         [ 1.78302413]]),
  array([[-2.11412962],
         [-0.68188384],
         [ 1.68079124],
         [-0.53383205]]),
  array([[-2.52078374],
         [-1.00300361],
         [ 1.39946715],
         [-0.28328409]]),
  array([[ 1.88359815],
         [-2.56905323],
         [-2.30150597],
         [-0.22519468]]),
  array([[-1.10451676],
         [-1.33526398],
         [ 1.81986969],
         [-0.05775321]]),
  array([[0.6452349 ],
         [1.99605008],
         [1.19699149],
         [1.99219379]]),
  array([[2.68045447],
         [1.46766259],
         [1.10306808],
         [2.07619334]])],
 'id': ['00', '01', '