ORL-TAMP

Optimistic Reinforcement Learning Task and Motion Planning (ORL-TAMP) is a framework that integrates RL policies into TAMP pipelines. The core idea is to encapsulate an RL policy in a so-called skill. A skill comprises an RL policy, a state discriminator, and a sub-goal generator. Besides steering low-level actions, these components are used to verify symbolic predicates and ground geometric values.
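For illustration only, a minimal Python sketch of how such a skill could be organized; the class, attribute, and method names below are hypothetical and do not come from this repository.

    # Hypothetical sketch of a skill: an RL policy plus a state discriminator
    # and a sub-goal generator. Names are illustrative only.
    from dataclasses import dataclass
    from typing import Any, Callable

    @dataclass
    class Skill:
        policy: Callable[[Any], Any]             # RL policy: observation -> action
        discriminator: Callable[[Any], bool]     # checks whether the skill's predicate holds in a state
        subgoal_generator: Callable[[Any], Any]  # proposes geometric values (e.g. poses) for the planner

        def applicable(self, state: Any) -> bool:
            # Used by the planner to verify the symbolic predicate.
            return self.discriminator(state)

        def propose_subgoal(self, state: Any) -> Any:
            # Grounds geometric values consumed by the TAMP pipeline.
            return self.subgoal_generator(state)

        def act(self, observation: Any) -> Any:
            # Steers the low-level action during execution.
            return self.policy(observation)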

Video

The video below introduces the method and shows the experiments:

Watch the video

Installation

The current version has been tested on Ubuntu 20.04.

  1. Dependencies:

    We are currently working to remove the dependency on MoveIt due to its inflexibility and ROS specificity.

  2. Build PDDL FastDownward solver:

    orl_tamp$ ./downward/build.py
    
  3. Compile IK solver:

    orl_tamp$ cd utils/pybullet_tools/ikfast/franka_panda/
    franka_panda$ python setup.py
    

Run

  1. Download the RL policy models (Retrieve and EdgePush) and save them in the /orl_tamp/policies folder.

  2. Run MoveIt (following the tutorial)

  3. Run demos:

    • Retrieve: orl_tamp$ ./run_demo.sh retrieve
    • EdgePush: orl_tamp$ ./run_demo.sh edgepush
    • Rearrange: orl_tamp$ ./run_demo.sh rearrange

Train

This section gives the general steps for training your own skills.

  1. Modify the PDDL domain and stream files to add the PDDL definitions of the skills.
  2. Use Stable-Baselines3 to standardize policy training (a minimal sketch follows this list).
  3. Generate a dataset in the domain scenario.
  4. Train the state discriminator (see the sketch below).
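
As a rough sketch of step 2, the following shows a standard Stable-Baselines3 training loop. The environment, algorithm choice (SAC), timestep budget, and save path are placeholders, assuming you wrap your skill scenario as a Gymnasium environment.

    import gymnasium as gym
    from stable_baselines3 import SAC

    # "Pendulum-v1" is only a stand-in; replace it with the Gymnasium
    # environment that wraps your skill's scenario.
    env = gym.make("Pendulum-v1")
    model = SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=100_000)      # training budget depends on the task
    model.save("policies/my_skill_policy")    # hypothetical path; match where the demos load policies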
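
For step 4, a minimal sketch of training a state discriminator as a binary classifier over state features; the file names, features, and classifier are assumptions for illustration, not the repository's actual training code.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Hypothetical dataset files from step 3: state features and binary labels
    # (1 if the skill's predicate holds in that state, 0 otherwise).
    X = np.load("dataset/states.npy")
    y = np.load("dataset/labels.npy")

    clf = MLPClassifier(hidden_layer_sizes=(128, 128), max_iter=500)
    clf.fit(X, y)
    print("training accuracy:", clf.score(X, y))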