continuous-time-continuous-option-policy-gradient

This repository contains the implementation of the Continuous Time Continuous Option (CTCO) policy gradient algorithm. Associated paper: Dynamic Decision Frequency with Continuous Options.

In this algorithm, extended actions are paired with an execution duration to construct options with open-loop policies, with the aim of improving exploration in continuous control tasks. The trajectory of low-level actions is parameterized by some $\omega$ and executed for a continuous duration $d$, independent of the task action-cycle time $\Delta t$.
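To make the idea of an open-loop option concrete, here is a minimal sketch of how a pair ($\omega$, $d$) could be unrolled into low-level actions at a given $\Delta t$. It is not taken from this repository: the function name `option_actions` and the choice of linearly interpolated control points as the parameterization of $\omega$ are assumptions made purely for illustration, and the actual CTCO code may parameterize the trajectory differently.

```python
import math
import numpy as np


def option_actions(omega, d, dt):
    """Unroll an open-loop option into low-level actions (illustrative only).

    omega: array of shape (K, action_dim); hypothetical control points that
           define the action trajectory over the option's normalized time.
    d:     option duration in seconds (continuous).
    dt:    task action-cycle time in seconds.
    """
    omega = np.asarray(omega, dtype=float)
    assert omega.ndim == 2, "omega must have shape (K, action_dim)"
    # Number of environment steps the option spans; the small epsilon guards
    # against floating-point noise when d is an exact multiple of dt.
    n_steps = max(1, math.ceil(d / dt - 1e-9))
    # Normalized time in [0, 1) at which each low-level action is queried.
    phases = np.arange(n_steps) * dt / d
    knots = np.linspace(0.0, 1.0, len(omega))
    # Assumed parameterization: piecewise-linear interpolation per action dim.
    return np.stack(
        [np.interp(phases, knots, omega[:, j]) for j in range(omega.shape[1])],
        axis=1,
    )


# The same option (omega, d) executed at two interaction frequencies.
omega = [[0.0], [1.0]]
fast = option_actions(omega, d=0.5, dt=0.1)   # 5 low-level steps
slow = option_actions(omega, d=0.5, dt=0.25)  # 2 low-level steps
print(fast.shape, slow.shape)
```

In this sketch the option itself does not change with $\Delta t$; only the number of points at which its trajectory is sampled does, which is the sense in which $d$ is independent of the action-cycle time.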

Results

CTCO is evaluated against classic RL (SAC), action-repetition RL (FiGAR-SAC), and hierarchical RL (DAC) methods on simulated continuous control tasks at different interaction frequencies.

We have also evaluated CTCO on a real-world visual reacher task with a Franka robotic arm.

Video of the algorithm in action can be found here

amir-karimi96/continuous-time-continuous-option-policy-gradient