Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning

This repository contains the code for our paper Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning (ICRA-21).

Introduction

This work focuses on learning useful and robust deep world models using multiple, possibly unreliable, sensors. We find that current methods do not sufficiently encourage a shared representation between modalities; this can cause poor performance on downstream tasks and over-reliance on specific sensors. As a solution, we contribute a new multi-modal deep latent state-space model, trained using a mutual information lower bound. The key innovation is a specially designed density ratio estimator that encourages consistency between the latent codes of each modality. We tasked our method with learning policies (in a self-supervised manner) on multi-modal Natural MuJoCo benchmarks and a challenging Table Wiping task. Experiments show our method significantly outperforms state-of-the-art deep reinforcement learning methods, particularly in the presence of missing observations.

Fig 1. MuMMI training uses a density ratio estimator that acts to minimize the squared distances between the mean of each modality expert and a transformed fused latent code. This encourages the experts to project to points in a shared latent space.
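
To make the idea in the caption concrete, here is a minimal pure-Python sketch of a contrastive (InfoNCE-style) loss whose score is the negative squared distance between an expert mean and a linearly transformed fused code. This is an illustration only, not the code in models.py or mummi.py: the function names, the linear transform, and the InfoNCE form are all assumptions, and the estimator used in the paper may differ in its details.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def transform(z, weight):
    """Apply a linear map (given as a list of rows) to the vector z."""
    return [sum(w * x for w, x in zip(row, z)) for row in weight]


def contrastive_loss(expert_means, fused_codes, weight):
    """Hypothetical InfoNCE-style loss with score = -squared distance.

    expert_means[i] and fused_codes[i] come from the same time step
    (the positive pair); every other fused code in the batch serves as
    a negative example.
    """
    n = len(expert_means)
    total = 0.0
    for i in range(n):
        scores = [
            -sq_dist(expert_means[i], transform(fused_codes[j], weight))
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over the batch.
        top = max(scores)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        total += -(scores[i] - log_norm)
    return total / n


identity = [[1.0, 0.0], [0.0, 1.0]]
fused = [[0.0, 0.0], [3.0, 0.0]]
aligned = [[0.0, 0.0], [3.0, 0.0]]      # expert means match the fused codes
misaligned = [[3.0, 0.0], [0.0, 0.0]]   # expert means swapped across steps
```

With these toy values, the loss for `aligned` is lower than for `misaligned`, since minimizing the loss pulls each expert mean toward the transformed fused code of its own time step.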

Environment Setup

The code is tested on Ubuntu 16.04, Python 3.7 and CUDA 10.2. Install the required Python packages by running:

pip3 install --user tensorflow-gpu==2.1.0
pip3 install --user tensorflow_probability
pip3 install --user git+https://github.com/deepmind/dm_control.git
pip3 install --user pandas
pip3 install --user matplotlib

Please install MuJoCo and its Python bindings (mujoco-py) by following the instructions at https://github.com/openai/mujoco-py.

Download other required files from Google Drive: link. Put natural_train.pkl and natural_valid.pkl under the main folder (these two files are used to generate complex observations).
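
Before starting a run, it can help to confirm that both files are where the scripts expect them. The following is a small sketch, assuming the "main folder" is the directory you launch the scripts from; the helper name `missing_data_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the setup instructions above.
REQUIRED_FILES = ("natural_train.pkl", "natural_valid.pkl")


def missing_data_files(root="."):
    """Return the names of the required data files not found under root."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files(".")
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All required data files found.")
```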

Usage

To train MuMMI or one of the baselines on a MuJoCo task, run:

python [method script] --logdir [log path] --task [task]
e.g. python dreamer.py --logdir ./logdir/dmc_walker_walk/dreamer --task dmc_walker_walk
e.g. python mummi.py --logdir ./logdir/dmc_walker_walk/mummi --task dmc_walker_walk
e.g. python cvrl.py --logdir ./logdir/dmc_walker_walk/cvrl --task dmc_walker_walk
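
To compare all three methods on the same task, the commands above can be generated in a loop. The sketch below is a convenience wrapper, not part of the repository: `build_command` and `run_all` are hypothetical helpers, and the log directory layout simply mirrors the examples. It only uses the `--logdir` and `--task` flags shown above and defaults to a dry run that prints the commands.

```python
import subprocess

# Script names taken from the examples above (dreamer.py, mummi.py, cvrl.py).
METHODS = ("dreamer", "mummi", "cvrl")


def build_command(method, task, log_root="./logdir"):
    """Build the command line for one method on one task."""
    return [
        "python", f"{method}.py",
        "--logdir", f"{log_root}/{task}/{method}",
        "--task", task,
    ]


def run_all(task, methods=METHODS, dry_run=True):
    """Print (dry run) or execute the training command for each method."""
    commands = [build_command(method, task) for method in methods]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Runs sequentially; raises CalledProcessError if a run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all("dmc_walker_walk")
```

Pass `dry_run=False` to launch the runs one after another from the repository root.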

To change hyperparameters, please modify config.py.

BibTeX

To cite this work, please use:

@inproceedings{Chen2021MuMMI,
title={Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning},
author={Kaiqi Chen and Yong Lee and Harold Soh},
year={2021},
booktitle={IEEE International Conference on Robotics and Automation (ICRA)}}

Acknowledgement

This repository builds on code from the following repository: Yusufma03/CVRL [Ma et al., 2020].

References

[Ma et al., 2020] Xiao Ma, Siwei Chen, David Hsu, Wee Sun Lee: Contrastive Variational Model-Based Reinforcement Learning for Complex Observations. In CoRL, 2020.
