Parallel DDP for GPU/CPU


v0.1: WAFR 2018 Release holds the code for "A Performance Analysis of Differential Dynamic Programming on a GPU."

v0.2: ICRA 2019 Release extends the previous work by integrating LCM for hardware experiments and cleaning up the code base / interface. An extended abstract describing the hardware experiments can be found here.

Structure of this Repository

  • config.h defines all of the default settings (parallel level, plant, etc.) for an experiment and imports all of the various helper functions and files from the following folders as needed
  • /examples/* holds the scripts that run the WAFR examples and LCM examples (see the comment at the top of each file for the compilation instructions)
  • /plants/* holds custom rigid body dynamics and/or analytical dynamics and cost functions for currently supported plants
  • /DDPHelpers/* holds most of the functions for DDP as inlined templated CUDA header files
  • /utils/* holds a variety of support code for matrix multiplication, discrete time integrators, thread/CUDA support, etc.
  • /test/* holds a variety of testing scripts for various function calls and derivatives (see the comment at the top of each file for the compilation instructions)
  • /lcmtypes/* holds LCM types for multi-computer / hardware communication
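As an illustration of the discrete time integrators mentioned for /utils/*, here is a minimal CPU-only sketch of explicit Euler and explicit midpoint steps. It is not the repository's code: the names State, pendulum, eulerStep, and midpointStep, and the toy pendulum dynamics, are assumptions made for this example only.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical 2-state system (angle, angular velocity); not a plant from this repository.
using State = std::array<double, 2>;

// Continuous-time dynamics xdot = f(x, u) for a unit pendulum with torque input u.
inline State pendulum(const State& x, double u) {
    return {x[1], -9.81 * std::sin(x[0]) + u};
}

// Explicit Euler: x_{k+1} = x_k + dt * f(x_k, u_k).
template <typename Dynamics>
State eulerStep(Dynamics f, const State& x, double u, double dt) {
    const State xdot = f(x, u);
    State next{};
    for (std::size_t i = 0; i < x.size(); ++i) {
        next[i] = x[i] + dt * xdot[i];
    }
    return next;
}

// Explicit midpoint: evaluate f at a half Euler step, then take the
// full step from x using the slope found at that midpoint.
template <typename Dynamics>
State midpointStep(Dynamics f, const State& x, double u, double dt) {
    const State mid = eulerStep(f, x, u, 0.5 * dt);
    const State xdot = f(mid, u);
    State next{};
    for (std::size_t i = 0; i < x.size(); ++i) {
        next[i] = x[i] + dt * xdot[i];
    }
    return next;
}
```

For example, midpointStep(pendulum, State{0.0, 1.0}, 0.0, 0.1) advances the toy pendulum by one 0.1 s step. The midpoint rule costs two dynamics evaluations per step instead of one and has to hold the intermediate state as well.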


  • CUDA must be installed because the code is compiled with the NVCC compiler. Currently, this code has been tested with CUDA 9 and X.
  • For multi-computer / hardware MPC code there is an additional communication dependency: LCM.

Installation Tips for CUDA

To use with the Drake Kuka Simulator

  1. Install this fork of drake:
  2. Add export DRAKE_PATH_ROOT=<path_to_drake> to your .bashrc. The scripts in the utils folder should then launch the drake visualizer and simulator.

Known Bugs / Ongoing Development / Roadmap

  • On roadmap to develop a CPU/GPU hybrid (only the gradients on the GPU) and a fully serial CPU version without any instruction level parallelism
  • GPU RBDYN for Kuka only works in Euler mode -- need to introduce loops and reduce shared memory for Midpoint and RK3 (or use a brand new GPU which has double the shared memory) -- potential to also optimize the gradient calc to require less shared memory
  • CPU MPC suffers from resource contention when trajRunner and Goal are on the same computer -- need to improve and provide separate compile paths -- also CPU MPC Parallel Line Search has a subtle bug (it is identical to serial in iLQR but diverges in MPC -- need to debug)
  • Constraint handling / penalties need further development -- would like to add full AL constraints and/or projection methods
  • Final cost shift is in development and non-functional (tied to frequency and not last goal change / shift count)
  • SLQ implementation is currently broken (and EE version needs a cost kernel)
  • EEVel rpy derivatives are currently broken (may explore forced finite diff)
  • BFGS iters may improve / stabilize the EEPos/Vel cost and should be explored
  • Square root implementation of DDP should add numerical stability and should be explored
  • Want to develop URDF > transforms and inertias tool for Arm
  • Would be nice to add a runtime and not compile time switch for Hardware vs. Sim mode and for level of parallelism
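One possible shape for the runtime switch in the last item is sketched below. It is only an illustration under stated assumptions: the types RunMode, ParallelLevel, and RuntimeConfig, the function parseArgs, the flag names, and the default values are all hypothetical and do not exist in this repository, which currently selects these options at compile time.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical option types; the repository sets these at compile time today.
enum class RunMode { Sim, Hardware };
enum class ParallelLevel { Serial, CpuThreads, Gpu };

// Defaults are placeholders for this sketch, not the values in config.h.
struct RuntimeConfig {
    RunMode mode = RunMode::Sim;
    ParallelLevel parallel = ParallelLevel::Gpu;
};

// Parse made-up flags of the form --mode=sim|hardware and
// --parallel=serial|cpu|gpu; any other argument is rejected.
inline RuntimeConfig parseArgs(int argc, const char* const argv[]) {
    RuntimeConfig cfg;
    for (int i = 1; i < argc; ++i) {
        const std::string arg = argv[i];
        if (arg == "--mode=sim") {
            cfg.mode = RunMode::Sim;
        } else if (arg == "--mode=hardware") {
            cfg.mode = RunMode::Hardware;
        } else if (arg == "--parallel=serial") {
            cfg.parallel = ParallelLevel::Serial;
        } else if (arg == "--parallel=cpu") {
            cfg.parallel = ParallelLevel::CpuThreads;
        } else if (arg == "--parallel=gpu") {
            cfg.parallel = ParallelLevel::Gpu;
        } else {
            throw std::invalid_argument("unknown option: " + arg);
        }
    }
    return cfg;
}
```

A main function could call parseArgs(argc, argv) once at startup and branch on the returned RuntimeConfig. The trade-off is that both the hardware and simulation code paths must then be compiled into the same binary.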


Code supporting the WAFR paper "A Performance Analysis of Differential Dynamic Programming on a GPU," and the ICRA workshop follow-on work deploying the algorithm onto robot hardware.






