HIU-SAC

PyTorch implementation of the Hierarchical Intentional-Unintentional Soft Actor-Critic (HIU-SAC) algorithm.

This repository contains the source code used for the experiments in the paper "Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies".

The algorithm has been tested on continuous control tasks in RoboLearn environments.

Some videos can be found at https://sites.google.com/view/hrl-concurrent-discovery

The code has been tested with PyTorch 1.0.1 and Python 3.5 (or later).

Pre-Installation

It is recommended to first create either a virtualenv or a conda environment.

# Option A: conda
# Create the conda environment
conda create -n <condaenv_name> python=3.5
# Activate the conda environment
conda activate <condaenv_name>

# Option B: virtualenv
# Create the virtual environment
virtualenv -p python3.5 <virtualenv_name>
# Activate the virtual environment
source <virtualenv_name>/bin/activate

Installation

  1. Clone this repository
git clone https://github.com/domingoesteban/hiu_sac
  2. Install the requirements of this repository
cd hiu_sac
pip install -r requirements.txt
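
Since the code has only been tested with PyTorch 1.0.1 and Python 3.5 (or later), it can help to check the active environment once the requirements are installed. The following is a minimal sketch and not part of this repository: the helper names `version_tuple` and `describe_environment` are hypothetical, and the comparison rule (same or newer Python, same PyTorch release) is an assumption about what "tested" should mean here.

```python
import sys

# Versions named in this README. How strictly to compare against them
# is an assumption of this sketch, not a requirement of the repository.
TESTED_PYTHON = (3, 5)
TESTED_TORCH = "1.0.1"


def version_tuple(version):
    """Convert '1.0.1' or '1.13.1+cu117' into a tuple of ints.

    Parsing stops at the first component that is not purely numeric.
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def describe_environment():
    """Return one human-readable line for Python and one for PyTorch."""
    lines = []
    python_ok = sys.version_info[:2] >= TESTED_PYTHON
    lines.append("Python {}.{}: {}".format(
        sys.version_info[0],
        sys.version_info[1],
        "same as or newer than tested" if python_ok else "older than tested",
    ))
    try:
        import torch
    except ImportError:
        lines.append("PyTorch: not installed")
    else:
        same = version_tuple(torch.__version__) == version_tuple(TESTED_TORCH)
        lines.append("PyTorch {}: {}".format(
            torch.__version__,
            "matches tested version" if same else "differs from tested version",
        ))
    return lines


if __name__ == "__main__":
    for line in describe_environment():
        print(line)
```

A differing PyTorch version does not necessarily mean the code will fail; it only means that combination is outside what this README reports as tested.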

Use

  • Run HIU-SAC in one of the environments. Options: navigation2d, reacher, pusher, centauro
# python train.py -e <env_name>
python train.py -e navigation2d
  • Visualize the learned policy (Specify the log directory that is printed during the learning process)
python eval.py <path_to_log_directory>
  • Plot the learning curves in the composable and compound tasks (Specify the log directory that is printed during the learning process)
python eval.py <path_to_log_directory> -p
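
To train on several environments in sequence, the commands above can be wrapped in a small script. This is a sketch under stated assumptions rather than something shipped with the repository: `build_train_command` and `train_all` are hypothetical helpers, and the only things taken from this README are the `train.py -e <env_name>` invocation and the four environment names. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Environment names listed in this README.
ENVIRONMENTS = ("navigation2d", "reacher", "pusher", "centauro")


def build_train_command(env_name, python=sys.executable):
    """Return the argv list for `python train.py -e <env_name>`."""
    if env_name not in ENVIRONMENTS:
        raise ValueError(
            "Unknown environment {!r}; expected one of {}".format(
                env_name, ", ".join(ENVIRONMENTS)
            )
        )
    return [python, "train.py", "-e", env_name]


def train_all(env_names=ENVIRONMENTS, dry_run=True):
    """Print one training command per environment and, unless dry_run, run it."""
    commands = [build_train_command(name) for name in env_names]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # Must be launched from the repository root, where train.py lives.
            subprocess.check_call(command)
    return commands


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    train_all(dry_run=True)
```

Passing `dry_run=False` runs the trainings one after another; each run prints its own log directory, which is the path that `eval.py` expects.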

Citation

If this repository was useful for your research, please consider citing it:

@article{esteban2019hiusac,
  title={Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies},
  author={Domingo Esteban and Leonel Rozo and Darwin G. Caldwell},
  journal={arXiv preprint arXiv:1905.09668},
  year={2019}
}