ORL-TAMP

Optimistic Reinforcement Learning Task and Motion Planning (ORL-TAMP) is a framework that integrates RL policies into TAMP pipelines. The core idea is to encapsulate an RL policy in a so-called skill, which comprises the RL policy, a state discriminator, and a sub-goal generator. Besides steering actions, these components are used to verify symbolic predicates and ground geometric values.
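
Conceptually, a skill bundles three callables that the TAMP pipeline can query. The sketch below is illustrative only; the class and method names (Skill, is_applicable, propose_subgoal, act) are hypothetical and not taken from this repository.

    # Illustrative sketch of the skill abstraction (hypothetical names).
    class Skill:
        def __init__(self, policy, state_discriminator, subgoal_generator):
            self.policy = policy                            # trained RL policy
            self.state_discriminator = state_discriminator  # checks symbolic predicates
            self.subgoal_generator = subgoal_generator      # grounds geometric values

        def is_applicable(self, observation) -> bool:
            # The state discriminator verifies whether the skill's symbolic
            # precondition holds in the current geometric state.
            return bool(self.state_discriminator(observation))

        def propose_subgoal(self, observation):
            # The sub-goal generator grounds geometric values (e.g. a target
            # pose) that the planner can use when expanding the plan.
            return self.subgoal_generator(observation)

        def act(self, observation):
            # The RL policy steers the low-level action at execution time.
            return self.policy(observation)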

Video

An introduction to the method and the experiments:

Watch the video

Installation

The current version is tested on Ubuntu 20.04.

  1. Dependencies:

    We are currently working to remove the dependency on MoveIt because it is inflexible and ROS-specific.

  2. Build PDDL FastDownward solver:

    orl_tamp$ ./downward/build.py
    
  3. Compile IK solver:

    orl_tamp$ cd utils/pybullet_tools/ikfast/franka_panda/
    franka_panda$ python setup.py
    
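After compilation, you can optionally check that the generated IK extension imports. This is a sketch only: the module name ikfast_panda_arm is an assumption following the pybullet_tools ikfast naming convention; use whatever module name setup.py actually produces.

    # Run from utils/pybullet_tools/ikfast/franka_panda/ after setup.py finishes.
    # The module name below is assumed; check the compiled file's name.
    import ikfast_panda_arm
    print("IK module loaded:", ikfast_panda_arm.__name__)
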

Run

  1. Download the RL policy models: Retrieve and EdgePush, and save the policies in the /orl_tamp/policies folder.

  2. Run MoveIt (following the tutorial)

  3. Run demos:

    • Retrieve: orl_tamp$ ./run_demo.sh retrieve
    • EdgePush: orl_tamp$ ./run_demo.sh edgepush
    • Rearrange: orl_tamp$ ./run_demo.sh rearrange

Train

This section gives the general steps for training your own skills.

  1. Modify the PDDL domain file and stream file, and add the PDDL definitions of the skills.
  2. Use Stable-Baselines3 to standardize policy training (see the sketch after this list).
  3. Generate dataset in the domain scenario.
  4. Train the state discriminator.
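
As an illustration of step 2, the following is a minimal Stable-Baselines3 training sketch. The environment id OrlTampSkillEnv-v0, the SAC algorithm choice, the timestep budget, and the save path are placeholders, not the settings used in this repository.

    # Minimal Stable-Baselines3 training sketch (placeholder environment,
    # algorithm, and hyperparameters; adapt to your own skill environment).
    import gymnasium as gym            # use `import gym` with older SB3 releases
    from stable_baselines3 import SAC

    # Hypothetical gym-style environment wrapping the skill's scenario,
    # e.g. retrieving or edge-pushing an object in simulation.
    env = gym.make("OrlTampSkillEnv-v0")

    model = SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=200_000)
    model.save("orl_tamp/policies/my_skill_policy")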
