# Tuning

This notebook demonstrates how to tune the hyperparameters of system ID models and of controllers.

## Set-Up

As always, we begin by obtaining our system, task, and dataset from the benchmarks package.

In [None]:
import autompc as ampc
import numpy as np
from autompc.benchmarks import CartpoleSwingupBenchmark

benchmark = CartpoleSwingupBenchmark()

# Get system and task specification
system = benchmark.system
task   = benchmark.task

# Generate benchmark dataset
trajs = benchmark.gen_trajs(seed=100, n_trajs=500, traj_len=200)

## Model Tuning

First, we will demonstrate how to automatically tune the hyperparameters of a system ID model.

To start, we define the model evaluator, which handles the training and evaluation of individual model configurations.  Here, we use the `HoldoutModelEvaluator`, which randomly splits the dataset into a training set and a holdout set for evaluation (25% of the data is held out in this example).  We will tune using the RMSE metric over a 20-step prediction horizon.  We also have to provide the evaluator with the trajectory dataset.

In [None]:
from autompc.evaluation import HoldoutModelEvaluator

model_evaluator = HoldoutModelEvaluator(holdout_prop=0.25, metric="rmse", horizon=20, trajs=trajs, 
                                        system=system, rng=np.random.default_rng(100))
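To make the idea of RMSE over a multi-step prediction horizon concrete, here is a minimal NumPy-only sketch. It is not AutoMPC's implementation: the helper `horizon_rmse`, the toy linear system, and the `(states, controls)` trajectory layout are all assumptions made for illustration.

```python
import numpy as np

def horizon_rmse(predict_step, trajs, horizon):
    """RMSE between `horizon`-step open-loop predictions and recorded states.

    `trajs` is a list of (states, controls) array pairs, where
    states[t + 1] results from applying controls[t] at states[t].
    (Hypothetical helper, not part of AutoMPC.)
    """
    sq_errs = []
    for states, controls in trajs:
        for t in range(len(states) - horizon):
            x = states[t]
            # Roll the model forward using only the recorded controls.
            for k in range(horizon):
                x = predict_step(x, controls[t + k])
            sq_errs.append(np.sum((x - states[t + horizon]) ** 2))
    return float(np.sqrt(np.mean(sq_errs)))

def true_step(x, u):
    # Made-up scalar linear system used to generate the toy data.
    return 0.9 * x + 0.1 * u

def biased_step(x, u):
    # A deliberately wrong model of the same system.
    return 0.8 * x + 0.1 * u

toy_rng = np.random.default_rng(0)
toy_trajs = []
for _ in range(8):
    controls = toy_rng.normal(size=(30, 1))
    states = np.zeros((30, 1))
    states[0] = toy_rng.normal(size=1)
    for t in range(29):
        states[t + 1] = true_step(states[t], controls[t])
    toy_trajs.append((states, controls))

exact_rmse = horizon_rmse(true_step, toy_trajs, horizon=20)
biased_rmse = horizon_rmse(biased_step, toy_trajs, horizon=20)
print(exact_rmse, biased_rmse)
```

A model that matches the data-generating system scores essentially zero, while the biased model accumulates error over the 20 steps. This compounding is why a multi-step horizon is a stricter test than one-step prediction error.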

We also need to construct a model factory to tune. Model tuning also supports automatic model selection, so here we will create two model factories to select between.

In [None]:
from autompc.sysid import MLPFactory, SINDyFactory

model_factory_1 = MLPFactory(system)
model_factory_2 = SINDyFactory(system)

Now that we have created our evaluator and our factories, we instantiate the `ModelTuner` and add both factories.

In [None]:
from autompc.tuning import ModelTuner

model_tuner = ModelTuner(system, model_evaluator)
model_tuner.add_model_factory(model_factory_1)
model_tuner.add_model_factory(model_factory_2)

We now run tuning for 100 iterations.

In [None]:
model, model_tune_result = model_tuner.run(rng=np.random.default_rng(100), n_iters=100)

We can then plot the accuracy of the best model found so far over the course of the tuning process.

In [None]:
from autompc.graphs import TuningCurveGraph
import matplotlib.pyplot as plt

graph = TuningCurveGraph()

fig = plt.figure()      
ax = fig.gca()
graph(ax, model_tune_result)
ax.set_title("Model Tuning Curve")
plt.show()
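A "best found so far" curve is a running minimum over the per-iteration scores when lower is better, as with RMSE. The sketch below illustrates that idea with made-up scores; the helper `best_so_far` is hypothetical, and this is not a description of how `TuningCurveGraph` is implemented internally.

```python
import numpy as np

def best_so_far(scores):
    """Running minimum of per-iteration scores (lower is better)."""
    return np.minimum.accumulate(np.asarray(scores, dtype=float))

# Made-up per-iteration scores, for illustration only.
scores = [0.9, 0.7, 0.8, 0.4, 0.6]
curve = best_so_far(scores)
print(curve)
```

The curve never increases: an iteration that scores worse than the incumbent leaves the curve flat.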

## Pipeline Tuning

Next, we will demonstrate how to do hyperparameter tuning for the entire MPC pipeline.  First, we create our factories and pipeline object. Here, we use an MLP system ID model, Iterative LQR control optimizer, and a quadratic cost.  AutoMPC does not currently support automatic selection of pipeline components, but we hope to add this feature soon.

In [None]:
from autompc.control import IterativeLQRFactory
from autompc.costs import QuadCostFactory


ctrlr_factory = IterativeLQRFactory(system)
cost_factory  = QuadCostFactory(system)
model_factory = MLPFactory(system)

pipeline = ampc.Pipeline(system, model_factory, cost_factory, ctrlr_factory)

Next, we instantiate the `PipelineTuner`.  AutoMPC performs tuning without access to the system dynamics, so the tuner has to train a surrogate dynamics model to use as a simulator for controller evaluation.  In this example, we use an MLP surrogate model, so we pass in `MLPFactory`.  The `surrogate_split` argument specifies what proportion of the data to use for surrogate training, with the remainder used for system ID training.  Here we use a 50/50 split.

Finally, AutoMPC supports several methods of selecting the surrogate model, controlled by the `surrogate_mode` argument.  Here we use `defaultcfg`, which means the surrogate is trained using the default MLP configuration.  Other modes include `fixedcfg`, where the user specifies the surrogate configuration; `autotune`, where the tuner first tunes the surrogate factory before running the pipeline tuning; and `autoselect`, where the tuner both automatically selects the model type from the available system ID algorithms and tunes the model hyperparameters.

For more details on using these modes, see the [documentation](https://autompc.readthedocs.io/en/latest/source/tuning.html#pipelinetuner) for the `PipelineTuner` class.

In [None]:
from autompc.tuning import PipelineTuner

tuner = PipelineTuner(surrogate_factory=MLPFactory(system), surrogate_mode="defaultcfg", surrogate_split=0.5)

Now that we have created the tuner, we run it for 100 iterations.  We pass in the ground truth dynamics to keep track of the performance, but in a real application, we don't expect to have access to this, so this information is not used for tuning.

In [None]:
controller, tune_result = tuner.run(pipeline, task, trajs, n_iters=100, rng=np.random.default_rng(100), 
                                   truedyn=benchmark.dynamics)
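The distinction between scoring on the surrogate and merely tracking performance on the true dynamics can be illustrated with a toy problem. The sketch below is an assumption-laden stand-in, not AutoMPC code: it uses a made-up scalar linear system, a proportional feedback gain as the only "hyperparameter", and a plain grid search in place of the tuner's search strategy.

```python
import numpy as np

def rollout_cost(a, b, gain, x0=1.0, n_steps=50):
    """Quadratic cost of the feedback law u = -gain * x on x' = a*x + b*u."""
    x, cost = x0, 0.0
    for _ in range(n_steps):
        u = -gain * x
        cost += x**2 + u**2
        x = a * x + b * u
    return cost

# Made-up toy dynamics: the surrogate is close to, but not equal to, the truth.
true_a, surrogate_a, b = 1.10, 1.05, 1.0
gains = np.linspace(0.2, 1.5, 14)

surrogate_costs = np.array([rollout_cost(surrogate_a, b, k) for k in gains])
true_costs = np.array([rollout_cost(true_a, b, k) for k in gains])

# Selection uses only the surrogate scores; the true costs are
# recorded purely for monitoring.
best = int(np.argmin(surrogate_costs))
print(f"selected gain: {gains[best]:.2f}")
print(f"surrogate cost: {surrogate_costs[best]:.3f}, true cost: {true_costs[best]:.3f}")
```

Because the choice is made on the surrogate, the selected gain need not be the one that minimizes the true cost. Any gap between the two values reflects surrogate model error.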

After running the tuning process, we can graph the performance of the best controller found so far, both with respect to the surrogate dynamics and the true dynamics.

In [None]:
from autompc.graphs import TuningCurveGraph
import matplotlib.pyplot as plt

graph = TuningCurveGraph()

fig = plt.figure()      
ax = fig.gca()
graph(ax, tune_result)
ax.set_title("Cart-Pole Tuning Curve")
plt.show()

## Decoupled Tuning

The above example is full pipeline tuning, which searches the configuration space of all components simultaneously.  Alternatively, we can take a decoupled tuning approach, where the model is first tuned based on prediction accuracy, and then the objective function and optimizer are tuned with the model held fixed.  Since full pipeline tuning requires us to both train a system ID model and simulate the controller at every iteration, decoupled tuning may be faster.

First, we'll need to handle the data split between system ID training set and surrogate training set manually.

In [None]:
rng = np.random.default_rng(100)
rng.shuffle(trajs)
surr_trajs = trajs[:250]
sysid_trajs = trajs[250:]
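Note that `rng.shuffle` reorders the list in place. If the original ordering of the dataset matters elsewhere, a non-mutating variant can be written with a permutation of indices. The helper `split_trajs` below is hypothetical, and the integer list merely stands in for the trajectory objects.

```python
import numpy as np

def split_trajs(trajs, surrogate_prop, rng):
    """Return (surrogate_set, sysid_set) without modifying `trajs`."""
    idx = rng.permutation(len(trajs))
    n_surr = int(round(surrogate_prop * len(trajs)))
    surr = [trajs[i] for i in idx[:n_surr]]
    sysid = [trajs[i] for i in idx[n_surr:]]
    return surr, sysid

# Stand-in for a list of 500 trajectories.
items = list(range(500))
surr, sysid = split_trajs(items, 0.5, np.random.default_rng(100))
print(len(surr), len(sysid))
```

The two subsets are disjoint and together cover the whole dataset, so no trajectory is used for both surrogate training and system ID training.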

Now, we tune our system ID model for 75 iterations using the same model tuning method as above.

In [None]:
model_evaluator = HoldoutModelEvaluator(holdout_prop=0.25, metric="rmse", horizon=20, trajs=sysid_trajs, 
                                        system=system, rng=np.random.default_rng(100))

model_tuner = ModelTuner(system, model_evaluator)
model_tuner.add_model_factory(model_factory)

model, model_tune_result = model_tuner.run(rng=np.random.default_rng(100), n_iters=75)

Now, we construct a new pipeline which fixes the model to be the result of the tuning process we just ran.

In [None]:
pipeline_fixed_model = ampc.Pipeline(system, model, ctrlr_factory, cost_factory)

And finally, we run our pipeline tuning in a similar manner to above.  We now tell the tuner to use 100% of the data it receives for training the surrogate, since we already handled the data split and did the system ID training.

In [None]:
tuner2 = PipelineTuner(surrogate_factory=MLPFactory(system), surrogate_split=1.0)


# Use tuner2 (surrogate_split=1.0) here, not the earlier 50/50 tuner
controller2, tune_result2 = tuner2.run(pipeline_fixed_model, task, surr_trajs, n_iters=75,
                                       rng=np.random.default_rng(100), truedyn=benchmark.dynamics)

Finally, we can plot the result of our decoupled pipeline tuning.

In [None]:
from autompc.graphs import TuningCurveGraph
import matplotlib.pyplot as plt

graph = TuningCurveGraph()

fig = plt.figure()      
ax = fig.gca()
graph(ax, tune_result2)
ax.set_title("Cart-Pole Decoupled Tuning Curve")
plt.show()