
CASSI with Solo

This repository provides the Cooperative Adversarial Self-supervised Skill Imitation (CASSI) algorithm that enables Solo to extract diverse skills through adversarial imitation from unlabeled, mixed motions using NVIDIA Isaac Gym.


Paper: Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions
Project website: https://sites.google.com/view/icra2023-cassi/home

Maintainer: Chenhao Li
Affiliation: Autonomous Learning Group, Max Planck Institute for Intelligent Systems, and Robotic Systems Lab, ETH Zurich
Contact: chenhli@ethz.ch

Installation

  1. Create a new Python virtual environment with Python 3.8

  2. Install PyTorch 1.10 with CUDA 11.3

     pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
    
  3. Install Isaac Gym

    • Download and install Isaac Gym Preview 4

      cd isaacgym/python
      pip install -e .
      
    • Try running an example

      cd examples
      python 1080_balls_of_solitude.py
      
    • For troubleshooting, check the documentation in isaacgym/docs/index.html

  4. Install solo_gym

     git clone https://github.com/martius-lab/cassi.git
     cd cassi
     pip install -e .
    

Configuration

  • The Solo environment is defined by an env file solo8.py and a config file solo8_config.py under solo_gym/envs/solo8/. The config file sets both the environment parameters in class Solo8FlatCfg and the training parameters in class Solo8FlatCfgPPO.
  • The provided code exemplifies the training of Solo 8 with unlabeled mixed motions. Demonstrations induced by 6 locomotion gaits are randomly mixed and augmented with perturbations into 6,000 trajectories of 120 frames each, stored in resources/robots/solo8/datasets/motion_data.pt. The state dimension indices are specified in reference_state_idx_dict.json; a quick inspection sketch follows this list. To train with other demonstrations, replace motion_data.pt and adapt the reward functions defined in solo_gym/envs/solo8/solo8.py accordingly.
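
As a quick sanity check, the dataset and the state index mapping can be inspected directly with PyTorch. The snippet below is a minimal sketch: it assumes motion_data.pt loads as a tensor with an inspectable shape and that reference_state_idx_dict.json sits next to it in the datasets folder; both assumptions may differ from the actual layout.

  import json
  import torch

  # Paths relative to the repository root (the JSON location is an assumption).
  data_path = "resources/robots/solo8/datasets/motion_data.pt"
  idx_path = "resources/robots/solo8/datasets/reference_state_idx_dict.json"

  # Load the mixed demonstration data and print whatever shape information is available.
  motion_data = torch.load(data_path)
  print(type(motion_data), getattr(motion_data, "shape", None))

  # The JSON file maps state names to dimension indices within each frame.
  with open(idx_path) as f:
      state_idx = json.load(f)
  print(state_idx)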

Usage

Train

python scripts/train.py --task solo8
  • The trained policy is saved in logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt, where <experiment_name> and <run_name> are defined in the train config (a checkpoint-inspection sketch follows this list).
  • To disable rendering, append --headless.
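
To verify that checkpoints are being written, a saved model file can be loaded and inspected offline. This is a minimal sketch that assumes the checkpoint is a plain dictionary of state dicts (as in legged_gym-style training code); the path and key names are placeholders, not the repository's guaranteed format.

  import torch

  # Hypothetical path; substitute an actual logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt
  ckpt_path = "logs/solo8/Jan01_00-00-00_example_run/model_1000.pt"

  # Assumption: the file holds a dict (e.g. model weights and the training iteration).
  ckpt = torch.load(ckpt_path, map_location="cpu")
  print(ckpt.keys() if isinstance(ckpt, dict) else type(ckpt))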

Play a trained policy

python scripts/play.py
  • By default, the loaded policy is the last model of the last run in the experiment folder.
  • Other runs or model iterations can be selected by setting load_run and checkpoint in the train config (see the sketch after this list).
  • Use u and j to command the forward velocity, and h and k to switch between the extracted skills.
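
For reference, selecting a specific run and checkpoint for playback typically looks like the following in the train config. This is a sketch only: the attribute names assume legged_gym-style runner settings, and the actual fields in Solo8FlatCfgPPO may differ.

  # Sketch of the runner settings in solo_gym/envs/solo8/solo8_config.py
  # (attribute names assumed from the legged_gym convention, not verified).
  class Solo8FlatCfgPPO:
      class runner:
          experiment_name = "solo8"   # logs/<experiment_name>/...
          run_name = ""               # suffix of the run folder name
          load_run = -1               # -1: latest run, or a specific run folder name
          checkpoint = -1             # -1: latest model, or a specific iteration number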

Citation

@inproceedings{li2023versatile,
  title={Versatile skill control via self-supervised adversarial imitation of unlabeled mixed motions},
  author={Li, Chenhao and Blaes, Sebastian and Kolev, Pavel and Vlastelica, Marin and Frey, Jonas and Martius, Georg},
  booktitle={2023 IEEE international conference on robotics and automation (ICRA)},
  pages={2944--2950},
  year={2023},
  organization={IEEE}
}

References

The code is built upon the open-source Isaac Gym Environments for Legged Robots and the accompanying PPO implementation. We refer to the original repositories for more details.
