# AutoMPC Demo

Welcome!  This notebook demonstrates the core features of AutoMPC.  We will use the Cart-Pole swing-up task as an example, and we will tune an MPC pipeline consisting of a multi-layer perceptron (MLP) system ID model and an iLQR optimizer.

## Set-Up

To begin, we need an input dataset and specifications for the system and task.  The `benchmarks` module provides these for a few example systems.

In [1]:
import os
os.chdir("..")
import autompc as ampc
import numpy as np
from autompc.benchmarks import CartpoleSwingupBenchmark

benchmark = CartpoleSwingupBenchmark()


# Get system and task specification
system = benchmark.system
task   = benchmark.task

# Generate benchmark dataset
trajs = benchmark.gen_trajs(seed=100, n_trajs=10, traj_len=200)

Cannot import SnoptWrapper
Cannot import KnitroSolver, make sure its Python interface is installed


Next, we need to declare our MPC pipeline.  The following code initializes a pipeline with an MLP system ID model, a quadratic objective function, and an iLQR optimizer.

In [2]:
from autompc.sysid import MLPFactory
from autompc.control import IterativeLQRFactory
from autompc.costs import QuadCostFactory

model_factory = MLPFactory(system)
ctrlr_factory = IterativeLQRFactory(system)
cost_factory  = QuadCostFactory(system)

pipeline = ampc.Pipeline(system, model_factory, ctrlr_factory, cost_factory)
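
For intuition, a quadratic objective of this kind can be sketched in plain Python as below. This is a minimal illustration, not AutoMPC's `QuadCostFactory` implementation: the function `quadratic_cost`, the diagonal weights `q_diag`, `r_diag`, and `f_diag`, the zero goal state, and the toy trajectory are all assumptions made for the example.

```python
def quadratic_cost(states, controls, goal, q_diag, r_diag, f_diag):
    """Diagonal-weighted quadratic stage costs plus a terminal cost.

    Hypothetical helper for illustration; not part of AutoMPC.
    """
    def weighted_sq(vec, weights):
        return sum(w * v * v for w, v in zip(weights, vec))

    total = 0.0
    # Stage cost: every state except the last, paired with its control.
    for x, u in zip(states[:-1], controls):
        err = [xi - gi for xi, gi in zip(x, goal)]
        total += weighted_sq(err, q_diag) + weighted_sq(u, r_diag)
    # Terminal cost on the final state only.
    final_err = [xi - gi for xi, gi in zip(states[-1], goal)]
    total += weighted_sq(final_err, f_diag)
    return total


# Toy three-state trajectory with two controls (values are made up).
states = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
controls = [[2.0], [1.0]]
cost = quadratic_cost(states, controls, goal=[0.0] * 4,
                      q_diag=[1.0] * 4, r_diag=[1.0], f_diag=[1.0] * 4)
print(cost)
```

Under this reading, tuning the cost amounts to choosing the entries of the three weight vectors.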

We can view the joint configuration space of the pipeline:

In [3]:
pipeline.get_configuration_space()

Configuration space object:
  Hyperparameters:
    _cost:dx_F, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:dx_Q, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:omega_F, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:omega_Q, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:theta_F, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:theta_Q, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:u_R, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:x_F, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _cost:x_Q, Type: UniformFloat, Range: [0.001, 10000.0], Default: 1.0, on log-scale
    _ctrlr:horizon, Type: UniformInteger, Range: [5, 25], Default: 20
    _model:hidden_size_1, Type: UniformInteger, Range: [16, 256], Default: 128
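
Several of the hyperparameters above are marked "on log-scale". As a rough sketch of what that means, the snippet below samples uniformly in log-space over the listed range `[0.001, 10000.0]`. The helper names `sample_log_uniform` and `log_midpoint` are hypothetical and use only the standard library; the actual sampling is handled by the tuner's configuration-space machinery.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def log_midpoint(low, high):
    """Geometric mean: the midpoint of the range when viewed on a log scale."""
    return math.sqrt(low * high)


rng = random.Random(0)
samples = [sample_log_uniform(1e-3, 1e4, rng) for _ in range(5)]
mid = log_midpoint(1e-3, 1e4)
print(samples)
print(mid)
```

Sampling this way gives each order of magnitude in the range equal probability, which suits weights that span seven decades.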
   

## Tuning

Once we have initialized the pipeline and its factories, it is straightforward to set up a tuner to search the configuration space.  Here we use an MLP as the surrogate model, i.e., the learned dynamics on which candidate controllers are simulated during tuning.  This is an example of full pipeline tuning, which searches the configuration space of all pipeline components simultaneously.
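
The tuner in the next cell is given `surrogate_split=0.5`. One plausible reading, stated here as an assumption rather than as AutoMPC's documented behavior, is that this fraction of the trajectories is reserved for training the surrogate while the remainder trains the system ID model. The helper `split_trajectories` below is hypothetical and only illustrates that idea.

```python
import random


def split_trajectories(trajs, surrogate_split, rng):
    """Shuffle the trajectories and split them into two disjoint sets.

    Hypothetical helper: the first set would train the surrogate,
    the second would train the system ID model.
    """
    indices = list(range(len(trajs)))
    rng.shuffle(indices)
    n_surrogate = int(surrogate_split * len(trajs))
    surrogate = [trajs[i] for i in indices[:n_surrogate]]
    sysid = [trajs[i] for i in indices[n_surrogate:]]
    return surrogate, sysid


# Placeholder labels stand in for real trajectory objects.
toy_trajs = [f"traj_{i}" for i in range(10)]
surrogate_trajs, sysid_trajs = split_trajectories(toy_trajs, 0.5, random.Random(100))
print(len(surrogate_trajs), len(sysid_trajs))
```

Keeping the two sets disjoint means the controller is not scored on a model fit to the same data as its own internal model.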

In [4]:
from autompc.tuning import PipelineTuner

tuner = PipelineTuner(surrogate_factory=MLPFactory(system), surrogate_split=0.5)

controller, tune_result = tuner.run(pipeline, task, trajs, n_iters=100,
                                    rng=np.random.default_rng(100),
                                    truedyn=benchmark.dynamics)



use_cuda=True
MLP Using Cuda
100%|██████████| 50/50 [00:01<00:00, 28.10it/s]

INFO:smac.utils.io.cmd_reader.CMDReader:Output to smac3-output_2021-05-02_12:21:47_203564
INFO:smac.facade.smac_hpo_facade.SMAC4HPO:Optimizing a deterministic scenario for quality without a tuner timeout - will make SMAC deterministic and only evaluate one configuration per iteration!
INFO:smac.initial_design.sobol_design.SobolDesign:Running initial design for 25 configurations
INFO:smac.facade.smac_hpo_facade.SMAC4HPO:<class 'smac.facade.smac_hpo_facade.SMAC4HPO'>
INFO:smac.optimizer.smbo.SMBO:Running initial design
INFO:smac.intensification.intensification.Intensifier:First run, no incumbent provided; challenger is assumed to be the incumbent



use_cuda=True
MLP Using Cuda
100%|██████████| 50/50 [00:04<00:00, 11.33it/s]
Simulating Surrogate Trajectory: 
 99%|█████████▉| 198/200 [00:08<00:00, 22.26it/s]
Simulating True Dynamics Trajectory
 99%|█████████▉| 198/200 [00:07<00:00, 25.85it/s]

INFO:smac.intensification.intensification.Intensifier:First run, no incumbent provided; challenger is assumed to be the incumbent
INFO:smac.intensification.intensification.Intensifier:Updated estimated cost of incumbent on 1 runs: 149.0000



use_cuda=True
MLP Using Cuda
100%|██████████| 50/50 [00:03<00:00, 14.34it/s]
Simulating Surrogate Trajectory: 
  4%|▍         | 9/200 [00:00<00:04, 39.53it/s]

 99%|█████████▉| 198/200 [00:04<00:00, 41.42it/s]
Simulating True Dynamics Trajectory
 99%|█████████▉| 198/200 [00:04<00:00, 41.11it/s]

INFO:smac.intensification.intensification.Intensifier:Wallclock time limit for intensification reached (used: 34.165927 sec, available: 0.000010 sec)



use_cuda=True
MLP Using Cuda
100%|██████████| 50/50 [00:05<00:00,  8.98it/s]
Simulating Surrogate Trajectory: 
 28%|██▊       | 56/200 [00:21<00:55,  2.60it/s]

INFO:smac.stats.stats.Stats:##########################################################
INFO:smac.stats.stats.Stats:Statistics:
INFO:smac.stats.stats.Stats:#Incumbent changed: 0
INFO:smac.stats.stats.Stats:#Submitted target algorithm runs: 2 / 100.0
INFO:smac.stats.stats.Stats:#Finished target algorithm runs: 2 / 100.0
INFO:smac.stats.stats.Stats:#Configurations: 2
INFO:smac.stats.stats.Stats:Used wallclock time: 61.35 / inf sec 
INFO:smac.stats.stats.Stats:Used target algorithm runtime: 34.16 / inf sec
INFO:smac.stats.stats.Stats:##########################################################
INFO:smac.facade.smac_hpo_facade.SMAC4HPO:Final Incumbent: Configuration:
  _cost:dx_F, Value: 3.1622776601683813
  _cost:dx_Q, Value: 3.1622776601683813
  _cost:omega_F, Value: 3.1622776601683813
  _cost:omega_Q, Value: 3.1622776601683813
  _cost:theta_F, Value: 3.1622776601683813
  _cost:theta_Q, Value: 3.1622776601683813
  _cost:u_R, Value: 3.1622776601683813
  _cost:x_F, Value: 3.1622776601683813
 




KeyboardInterrupt: 

We can then plot the tuning curve from the tuning result.

In [None]:
from autompc.graphs import TuningCurveGraph
import matplotlib.pyplot as plt

graph = TuningCurveGraph()

fig = plt.figure()
ax = fig.gca()
graph(ax, tune_result)
ax.set_title("Cart-Pole Tuning Curve")
plt.show()
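
A tuning curve is commonly read as the best cost found so far at each iteration. The sketch below computes such a running minimum with the standard library. The `costs` list is made up for illustration and is not output from this notebook, and the exact quantity drawn by `TuningCurveGraph` may differ.

```python
from itertools import accumulate

# Hypothetical per-iteration costs, invented purely for illustration.
costs = [200.0, 180.0, 190.0, 150.0, 170.0, 120.0]

# Running minimum: the incumbent (best-so-far) cost after each iteration.
best_so_far = list(accumulate(costs, min))
print(best_so_far)
```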

We can then use the resulting controller to simulate a trajectory, for example:

In [None]:
traj = ampc.simulate(controller, init_obs=np.array([1.0, 0.0, 0.0, 0.0]),
                     max_steps=200, dynamics=benchmark.dynamics)
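
Conceptually, a closed-loop simulation alternates between querying the controller and stepping the dynamics. The following is a minimal sketch on a toy scalar system. The `rollout` function, `toy_dynamics`, and `toy_controller` are hypothetical and do not reflect the internals of `ampc.simulate`.

```python
def rollout(controller, dynamics, init_obs, max_steps):
    """Run the controller in closed loop and record states and controls."""
    states = [init_obs]
    controls = []
    x = init_obs
    for _ in range(max_steps):
        u = controller(x)    # choose a control from the current state
        x = dynamics(x, u)   # advance the system by one step
        controls.append(u)
        states.append(x)
    return states, controls


def toy_dynamics(x, u):
    """Hypothetical scalar integrator: x_next = x + 0.1 * u."""
    return x + 0.1 * u


def toy_controller(x):
    """Hypothetical proportional feedback driving x toward zero."""
    return -2.0 * x


states, controls = rollout(toy_controller, toy_dynamics, init_obs=1.0, max_steps=50)
print(len(states), states[-1])
```

With these toy choices each step multiplies the state by 0.8, so the state decays toward zero.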

## Decoupled Tuning

The above example is full pipeline tuning, which searches the configuration space of all components simultaneously.  Alternatively, we can take a decoupled tuning approach, where the model is first tuned based on prediction accuracy, and then the objective function and optimizer are tuned.

First, we must tune the model.

In [None]:
from autompc.tuning import ModelTuner
from autompc.evaluation import HoldoutModelEvaluator

model_evaluator = HoldoutModelEvaluator(holdout_prop=0.25, metric="rmse", trajs=trajs, system=system,
                                        rng=np.random.default_rng(100))
model_tuner = ModelTuner(system, model_evaluator)
model_tuner.add_model_factory(model_factory)

model, model_tune_result = model_tuner.run(rng=np.random.default_rng(100), n_iters=2)
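
The idea behind a holdout evaluation with an RMSE metric can be sketched as follows: set aside a fraction of the data, predict on it, and report the root-mean-square error. The helpers `holdout_split` and `rmse`, the scalar toy data, and the deliberately biased `toy_model` are all hypothetical. `HoldoutModelEvaluator` operates on trajectories and its internals may differ.

```python
import math
import random


def holdout_split(items, holdout_prop, rng):
    """Shuffle the items and reserve a fraction of them for evaluation."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_holdout = int(round(holdout_prop * len(shuffled)))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def rmse(predictions, targets):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def toy_model(x):
    """Hypothetical model that omits the +1 offset present in the data."""
    return 2.0 * x


# Toy data following y = 2x + 1 (made up for illustration).
data = [(float(x), 2.0 * x + 1.0) for x in range(8)]
train, holdout = holdout_split(data, holdout_prop=0.25, rng=random.Random(100))

score = rmse([toy_model(x) for x, _ in holdout], [y for _, y in holdout])
print(len(train), len(holdout), score)
```

Because the toy model is off by exactly one everywhere, the holdout RMSE is 1.0 whichever points end up in the holdout set.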

Now that we have tuned the model, we can create a pipeline with the pre-tuned model, and then run pipeline tuning.

In [None]:
pipeline_fixed_model = ampc.Pipeline(system, model, ctrlr_factory, cost_factory)
controller2, tune_result2 = tuner.run(pipeline_fixed_model, task, trajs, n_iters=100, rng=np.random.default_rng(100), 
                                   truedyn=benchmark.dynamics)