Differentiable predictive control (DPC) policy optimization examples.

pnnl/deps_arXiv2020

Differentiable Predictive Control

Examples of the differentiable predictive control (DPC) policy optimization algorithm presented in the paper "Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees": https://arxiv.org/abs/2004.11184

DPC combines the principles of model predictive control, reinforcement learning, and differentiable programming to offer a systematic way for offline unsupervised model-based policy optimization with goal-parametrized domain-aware intrinsic rewards.

Method and Examples

methodology.
Conceptual methodology. Simulation of the differentiable closed-loop system dynamics in the forward pass is followed by a backward pass that computes direct policy gradients for policy optimization.
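
The idea of differentiating a closed-loop rollout with respect to the policy parameters can be illustrated on a toy problem. The sketch below is not code from this repository: the scalar system, the linear policy u = -k*x, the cost weights, and the function names are all assumptions chosen for illustration. To stay dependency-free it propagates hand-derived sensitivities forward in time instead of running reverse-mode automatic differentiation as PyTorch would; both give the same gradient of the rollout cost.

```python
# Toy sketch: roll out a closed loop and differentiate the rollout cost
# with respect to the policy parameter k. All numbers are illustrative
# assumptions, not values taken from the repository.

def rollout_cost_and_grad(k, x0=1.0, a=1.0, b=1.0, r=0.1, horizon=5):
    """Return (cost, dcost/dk) for x_{t+1} = a*x_t + b*u_t with u_t = -k*x_t."""
    x, dx = x0, 0.0          # state and its sensitivity dx/dk
    cost, grad = 0.0, 0.0
    for _ in range(horizon):
        u = -k * x
        du = -x - k * dx     # chain rule through the policy
        cost += x * x + r * u * u
        grad += 2.0 * x * dx + 2.0 * r * u * du
        # propagate the state and its sensitivity through the model
        x, dx = a * x + b * u, a * dx + b * du
    return cost, grad


def train(k=0.0, lr=0.01, steps=500):
    """Plain gradient descent on the policy gain."""
    for _ in range(steps):
        _, g = rollout_cost_and_grad(k)
        k -= lr * g
    return k
```

Calling train() returns a gain that makes the toy closed loop contract, i.e. |a - b*k| < 1. With a neural policy and an automatic differentiation library the hand-written sensitivity lines disappear, but the structure (simulate, accumulate loss, step on the gradient) stays the same.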

methodology_2.
Structural equivalence of DPC architecture with MPC constraints.
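
One common way to carry MPC-style bounds into a differentiable training loss is to replace hard constraints with penalties on their violation. The sketch below assumes scalar states and inputs, box bounds, and squared ReLU penalties; the function names and weights are hypothetical and are not the Neuromancer API.

```python
# Sketch of a penalty-based loss: a regulation term plus squared penalties
# on bound violations. Names and weights are illustrative assumptions.

def relu(z):
    """Positive part of z; zero whenever the constraint is satisfied."""
    return z if z > 0.0 else 0.0


def dpc_loss(states, inputs, x_min, x_max, u_min, u_max, q_x=1.0, q_con=10.0):
    """Sum a quadratic state cost and penalties for leaving the box bounds."""
    loss = 0.0
    for x in states:
        loss += q_x * x * x
        loss += q_con * (relu(x - x_max) ** 2 + relu(x_min - x) ** 2)
    for u in inputs:
        loss += q_con * (relu(u - u_max) ** 2 + relu(u_min - u) ** 2)
    return loss
```

A trajectory that stays inside the bounds pays only the regulation cost, while any violation adds a term that grows with its magnitude, so gradients push the policy back toward the feasible set.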

cl_trajectories.
Example 1: Closed-loop trajectories of learned stabilizing neural control policy using DPC policy optimization.

cl_trajectories_2.
Example 1: Evolution of the closed-loop trajectories and DPC neural policy during training.

dpc_policy.
Example 1: Landscapes of the neural policy learned via the DPC policy optimization algorithm (right) and the explicit MPC policy computed using a parametric programming solver (left).

example_2.
Example 2: Reference tracking of nonlinear ODE system controlled by DPC neural policy.

example_3.
Example 3: Closed-loop reference tracking control trajectories for the quadcopter model controlled by DPC neural policy.

example_4.
Example 4: Obstacle avoidance with nonlinear constraints via learned DPC neural policy compared to online IPOPT solution.

example_5.
Example 5: Adaptive DPC of unknown linear system subject to disturbances.

example_6_dpc.
Example 6: Closed-loop control trajectories for the PVTOL aircraft model controlled by DPC neural policy.

example_6_ampc.
Example 6: Closed-loop control trajectories for the PVTOL aircraft model controlled by approximate MPC neural policy.

Dependencies

Examples 1, 2, 3, and 4 are implemented using our Neuromancer library for learning-based constrained optimization in PyTorch: neuromancer.

See environment.yml to reproduce the Conda environment for running example 5.

Files for Running the Examples

Control Example 1

  • double_integrator_DPC.py - DPC double integrator example using the Neuromancer package
  • double_integrator_eMPC.m - explicit MPC benchmark using MPT3 toolbox

Control Example 2

  • ref_tracking_ODE.py - Reference tracking for a nonlinear ODE
  • ref_tracking_ODE.ipynb - Jupyter notebook version
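
As a rough picture of what closed-loop reference tracking of a nonlinear ODE involves, the sketch below integrates a made-up plant with a fourth-order Runge-Kutta step under a hand-written feedback law. The plant dx/dt = -x^3 + u, the policy, and every name here are assumptions for illustration; the repository scripts use their own model and a trained neural policy.

```python
# Sketch of closed-loop reference tracking for a nonlinear ODE.
# The plant and the hand-written policy are hypothetical stand-ins.

def rk4_step(f, x, u, dt):
    """One RK4 step of dx/dt = f(x, u) with u held constant over the step."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def plant(x, u):
    """Assumed nonlinear dynamics."""
    return -x ** 3 + u


def policy(x, ref, kp=2.0):
    """Feedforward term that holds x = ref, plus proportional feedback."""
    return ref ** 3 + kp * (ref - x)


def simulate(x0, ref, dt=0.05, steps=200):
    """Return the final state and the accumulated squared tracking error."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = policy(x, ref)
        cost += (ref - x) ** 2
        x = rk4_step(plant, x, u, dt)
    return x, cost
```

In a DPC setting the accumulated tracking error returned by simulate would play the role of the training loss, and policy would be a parametrized function whose weights are updated by gradient descent.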

Control Example 3

  • quad_3D_linearDPC.py - Reference tracking for a quadcopter model via DPC using the Neuromancer package
  • CVXPY_linearMPC_quadcopter.py - Reference tracking for a quadcopter model via online MPC using CVXPY and the OSQP solver

Control Example 4

  • 2D_obstacle_avoidance_DPC.py - Parametric obstacle avoidance with nonlinear constraints via DPC using the Neuromancer package
  • 2D_obstacle_avoidance_csadi.py - Online obstacle avoidance using CasADi and IPOPT solver
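
A nonlinear obstacle constraint can enter a differentiable loss in the same penalty form as box bounds. The sketch below assumes a circular obstacle whose center and radius are parameters of the problem; the obstacle shape, the names, and the weight are illustrative and may differ from what the scripts above use.

```python
# Sketch of a penalty for the nonlinear constraint
# (px - cx)^2 + (py - cy)^2 >= radius^2. The circular shape is an assumption.

def obstacle_violation(p, center, radius):
    """Amount by which point p violates the keep-out constraint (0 if outside)."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    return max(0.0, radius ** 2 - (dx * dx + dy * dy))


def path_penalty(path, center, radius, weight=100.0):
    """Weighted sum of squared violations along a 2D path."""
    return weight * sum(obstacle_violation(p, center, radius) ** 2 for p in path)
```

Because center and radius are ordinary arguments, the same penalty can be evaluated for many sampled obstacle configurations during training, which is what makes the learned policy parametric in the obstacle.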

Control Example 5

  • DeepMPC_sysID_ctrl_sec_2_4.py - policy optimization with ground truth model
  • DeepMPC_sysID_ctrl_sec_2_5.py - adaptive policy optimization via online simultaneous system ID and policy updates
  • DeepMPC_sysID_ctrl_sec_3_7 - computational aspects and scalability analysis
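
The interplay of online system identification and policy updates can be sketched on a scalar linear system. Everything below is an assumption made for illustration: the true parameters, the sinusoidal probing input, the normalized least-mean-squares (NLMS) model update, and the one-step policy objective. To keep the sketch easy to check, the policy gain is updated against the current model estimate but is not applied to the plant.

```python
import math

# Sketch: simultaneous model estimation and policy update for
# x_{t+1} = a*x_t + b*u_t with unknown a, b. All values are illustrative.

def adapt(a_true=0.9, b_true=0.5, r=0.1, steps=1000, mu=0.5, lr=0.2):
    """Return (a_hat, b_hat, k) after interleaved model and policy steps."""
    a_hat, b_hat, k = 0.0, 0.0, 0.0
    x = 0.0
    for t in range(steps):
        u = math.sin(0.7 * t)                  # probing input (assumed)
        x_next = a_true * x + b_true * u       # plant, unknown to the learner
        # System ID: NLMS step on the one-step prediction error.
        err = x_next - (a_hat * x + b_hat * u)
        norm = 1e-8 + x * x + u * u
        a_hat += mu * err * x / norm
        b_hat += mu * err * u / norm
        # Policy step: gradient of (a_hat - b_hat*k)^2 + r*k^2 w.r.t. k,
        # i.e. the predicted closed-loop gain plus an input penalty.
        grad_k = -2.0 * b_hat * (a_hat - b_hat * k) + 2.0 * r * k
        k -= lr * grad_k
        x = x_next
    return a_hat, b_hat, k
```

As the model estimate converges, the gain settles at the minimizer of the assumed objective, k = a*b / (b^2 + r), without ever being given the true parameters directly.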

Control Example 6

  • vtol_aircraft_DPC_stabilize.py - Unsupervised DPC policy optimization for VTOL aircraft model using the Neuromancer package
  • vtol_aircraft_aMPC.py - Approximate MPC supervised by online MPC solver
  • pvtol_aircraft_iMPC.m - Online MPC solved in MATLAB using the YALMIP toolbox and the quadprog solver
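
The difference between approximate MPC and DPC lies in the training signal: approximate MPC fits the policy to actions produced by an MPC solver, whereas DPC needs no such labels and differentiates the control objective through the model. The sketch below shows only the supervised side for a linear policy u = k*x; the expert data and function names are hypothetical.

```python
# Sketch of the supervised (approximate MPC) objective: imitate expert actions.
# A scalar linear policy u = k*x is assumed for simplicity.

def imitation_loss(k, states, expert_actions):
    """Mean squared error between policy actions and expert actions."""
    n = len(states)
    return sum((k * x - u) ** 2 for x, u in zip(states, expert_actions)) / n


def fit_gain(states, expert_actions):
    """Closed-form least-squares gain minimizing the imitation loss."""
    num = sum(x * u for x, u in zip(states, expert_actions))
    den = sum(x * x for x in states)
    return num / den
```

For example, with states [-1.0, 0.5, 2.0] and expert actions generated by u = -1.2*x, fit_gain recovers a gain of about -1.2. Producing those expert actions requires solving an MPC problem for every training sample, a cost the unsupervised DPC formulation avoids.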

Cite as

@misc{drgona2022learning,
      title={Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees}, 
      author={Jan Drgona and Aaron Tuor and Draguna Vrabie},
      year={2020},
      eprint={2004.11184},
      archivePrefix={arXiv},
      primaryClass={eess.SY}
}
