This repository contains the implementation of the Continuous Time Continuous Option (CTCO) policy gradient algorithm. Associated paper: Dynamic Decision Frequency with Continuous Options.
In this algorithm, each extended action is paired with an execution duration to construct an option with an open-loop policy, which improves exploration in continuous control tasks. The trajectory of low-level actions is parameterized by some
CTCO is evaluated against classic RL (SAC), action-repetition RL (FIGAR-SAC), and hierarchical RL (DAC) methods on simulated continuous control tasks at different interaction frequencies.
We have also evaluated CTCO on a real-world visual reacher task with a Franka robotic arm.
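
To make the idea of an option with an open-loop policy and a duration more concrete, here is a minimal sketch. It is not the code in this repository: the `Option` class, the `action_at` and `rollout` helpers, and the linear interpolation between waypoints are all assumptions chosen for illustration, and the interpolation is a stand-in rather than the parameterization used by CTCO. The sketch only shows how one option can be unrolled into low-level actions at different interaction frequencies.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Option:
    """Hypothetical container: trajectory parameters plus an execution duration."""
    waypoints: Sequence[float]  # assumed parameterization: evenly spaced action waypoints
    duration: float             # seconds the option runs before the next decision


def action_at(option: Option, t: float) -> float:
    """Open-loop low-level action at time t, by linear interpolation (an assumption)."""
    w = option.waypoints
    if len(w) == 1:
        return float(w[0])
    # Map t in [0, duration] to a fractional index into the waypoint list.
    pos = min(max(t / option.duration, 0.0), 1.0) * (len(w) - 1)
    i = min(int(pos), len(w) - 2)
    frac = pos - i
    return (1.0 - frac) * w[i] + frac * w[i + 1]


def rollout(option: Option, dt: float) -> List[float]:
    """Low-level actions emitted while the option runs with interaction period dt."""
    # Small epsilon guards against floating-point error in duration / dt.
    n_steps = max(1, math.ceil(option.duration / dt - 1e-9))
    return [action_at(option, k * dt) for k in range(n_steps)]


if __name__ == "__main__":
    opt = Option(waypoints=[0.0, 1.0], duration=1.0)
    print(rollout(opt, dt=0.5))   # coarse interaction frequency
    print(rollout(opt, dt=0.25))  # finer interaction frequency, same option
```

Because the option is defined in continuous time, halving `dt` only changes how densely the same trajectory is sampled. No observation is consulted while the option runs, which is what makes the policy open-loop.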