ORL-TAMP

Optimistic Reinforcement Learning Task and Motion Planning (ORL-TAMP) is a framework that integrates RL policies into TAMP pipelines. The core idea is to encapsulate an RL policy in a so-called skill, which comprises the RL policy, a state discriminator, and a sub-goal generator. Besides steering actions, these components are used to verify symbolic predicates and ground geometric values.
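
Conceptually, a skill bundles three callables that the TAMP pipeline can query. The sketch below is illustrative only; the class and method names (Skill, is_applicable, propose_subgoal, act) are hypothetical and not taken from this repository.

    # Illustrative sketch of the skill abstraction (hypothetical names).
    class Skill:
        def __init__(self, policy, state_discriminator, subgoal_generator):
            self.policy = policy                            # trained RL policy
            self.state_discriminator = state_discriminator  # checks symbolic predicates
            self.subgoal_generator = subgoal_generator      # grounds geometric values

        def is_applicable(self, observation) -> bool:
            # The state discriminator verifies whether the skill's symbolic
            # precondition holds in the current geometric state.
            return bool(self.state_discriminator(observation))

        def propose_subgoal(self, observation):
            # The sub-goal generator grounds geometric values (e.g. a target
            # pose) that the planner can use when expanding the plan.
            return self.subgoal_generator(observation)

        def act(self, observation):
            # The RL policy steers the low-level action at execution time.
            return self.policy(observation)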

Video

An introduction to the method and the experiments:

Watch the video

Installation

The current version is tested on Ubuntu 20.04.

  1. Dependencies:

    We are currently working to remove the dependency on MoveIt because it is inflexible and ROS-specific.

  2. Build PDDL FastDownward solver:

    orl_tamp$ ./downward/build.py
    
  3. Compile IK solver:

    orl_tamp$ cd utils/pybullet_tools/ikfast/franka_panda/
    franka_panda$ python setup.py
    
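After compilation, you can optionally check that the generated IK extension imports. This is a sketch only: the module name ikfast_panda_arm is an assumption following the pybullet_tools ikfast naming convention; use whatever module name setup.py actually produces.

    # Run from utils/pybullet_tools/ikfast/franka_panda/ after setup.py finishes.
    # The module name below is assumed; check the compiled file's name.
    import ikfast_panda_arm
    print("IK module loaded:", ikfast_panda_arm.__name__)
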

Run

  1. Download the RL policy models: Retrieve and EdgePush, and save the policies in the /orl_tamp/policies folder.

  2. Run MoveIt (following the tutorial)

  3. Run demos:

    • Retrieve: orl_tamp$ ./run_demo.sh retrieve
    • EdgePush: orl_tamp$ ./run_demo.sh edgepush
    • Rearrange: orl_tamp$ ./run_demo.sh rearrange

Train

This section gives the general steps for training your own skills.

  1. Modify the PDDL domain file and stream file, and add the PDDL definitions of the skills.
  2. Use Stable-Baselines3 to standardize policy training (see the sketch after this list).
  3. Generate dataset in the domain scenario.
  4. Train the state discriminator.
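
As an illustration of step 2, the following is a minimal Stable-Baselines3 training sketch. The environment id OrlTampSkillEnv-v0, the SAC algorithm choice, the timestep budget, and the save path are placeholders, not the settings used in this repository.

    # Minimal Stable-Baselines3 training sketch (placeholder environment,
    # algorithm, and hyperparameters; adapt to your own skill environment).
    import gymnasium as gym            # use `import gym` with older SB3 releases
    from stable_baselines3 import SAC

    # Hypothetical gym-style environment wrapping the skill's scenario,
    # e.g. retrieving or edge-pushing an object in simulation.
    env = gym.make("OrlTampSkillEnv-v0")

    model = SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=200_000)
    model.save("orl_tamp/policies/my_skill_policy")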
