amir-karimi96/continuous-time-continuous-option-policy-gradient


continuous-time-continuous-option-policy-gradient

This repository contains the implementation of the Continuous Time Continuous Option (CTCO) policy gradient algorithm. Associated paper: Dynamic Decision Frequency with Continuous Options.

In this algorithm, extended actions are paired with an execution duration to construct options with open-loop policies, which improves exploration in continuous control tasks. The trajectory of low-level actions is parameterized by some $\omega$ and executed for a continuous duration $d$, independent of the task's action-cycle time $\Delta t$.
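The idea of an open-loop option that is independent of $\Delta t$ can be illustrated with a minimal sketch. This is not the repository's code: the function names `option_action` and `rollout_option`, the polynomial basis over normalized time, and the shape of `omega` are all assumptions made for illustration. The sketch only shows that the same $(\omega, d)$ pair defines one continuous trajectory, which is then sampled at whatever action-cycle time the environment uses.

```python
import math
import numpy as np


def option_action(omega, t, d):
    """Low-level action at elapsed time t of an option lasting d seconds.

    Assumption for illustration: omega has shape (K, action_dim) and holds
    coefficients of a polynomial basis in normalized time tau = t / d.
    """
    tau = min(max(t / d, 0.0), 1.0)
    basis = np.array([tau ** k for k in range(omega.shape[0])])
    return basis @ omega


def rollout_option(omega, d, dt):
    """Sample the open-loop trajectory at the environment's action-cycle time dt."""
    # Small tolerance guards against floating-point noise in d / dt.
    n_steps = max(1, math.ceil(d / dt - 1e-9))
    return np.stack([option_action(omega, i * dt, d) for i in range(n_steps)])


# Hypothetical option: K = 2 basis terms, 2-dimensional action.
omega = np.array([[0.0, 1.0], [1.0, -1.0]])
coarse = rollout_option(omega, d=0.5, dt=0.1)   # low interaction frequency
fine = rollout_option(omega, d=0.5, dt=0.01)    # high interaction frequency
```

In this sketch, `coarse` and `fine` contain different numbers of low-level actions, but both are samples of the same underlying trajectory, so the actions at matching physical times agree.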

Results

CTCO is evaluated against classic RL (SAC), action-repetition RL (FIGAR-SAC), and hierarchical RL (DAC) methods on simulated continuous control tasks at different interaction frequencies.

[plot]

We have also evaluated CTCO on the real-world visual reacher task with a Franka robotic arm.

A video of the algorithm in action can be found here.
