Learning Self-Supervised Behaviors with Graph Attention-based Architectures

This repository contains the code associated to the Learning Self-Supervised Behaviors with Graph Attention-based Architectures paper published at the Workshop on Agent Learning in Open-Endedness (ALOE) at ICLR 2022.

Abstract Although humans live in an open-ended world with endless challenges, they do not have to learn from scratch whenever they encounter a new task. Rather, they have access to a handful of previously learned skills, which they rapidly adapt to new situations. In artificial intelligence, autotelic agents—which are intrinsically motivated to represent and set their own goals—exhibit promising skill transfer capabilities. However, their learning capabilities are highly constrained by their policy and goal space representations. In this paper, we propose to investigate the impact of these representations. We study different implementations of autotelic agents using four types of Graph Neural Networks policy representations and two types of goal spaces, either geometric or predicate-based. We show that combining object-centered architectures that are expressive enough with semantic relational goals enables an efficient transfer between skills and promotes behavioral diversity. We also release our graph-based implementations to encourage further research in this direction.

We model both the critic and the policy using four different GNN-based architectures: full graph networks, interaction networks, relation networks and deep sets. The code for each architecture can be found in the rl_modules/ folder.

Requirements

gym
mujoco
pytorch
pandas
matplotlib
numpy

To reproduce the results, you need a machine with 24 cpus.

Training Graph-based Autotelic Agents

The following lines launch the training of our autotelic agents. When using semantic goal spaces, agents explore their behavioral space, discover new goals and attempt to learn them. When using continuous goal space, agents first select a class of goals based on the number of desired stacks. This selection is based on an LP-based Curriculum Learning which is shown to stabilize the learning procedure

mpirun -np 24 python train.py --algo semantic

mpirun -np 24 python train.py --algo continuous

To specify the GNN-based architecture to be used, refer to one of the following lines:

mpirun -np 24 python train.py --architecture full_gn

mpirun -np 24 python train.py --architecture interaction_network

mpirun -np 24 python train.py --architecture relation_network

mpirun -np 24 python train.py --architecture deep_sets

Note: The folder configs/ contains the hyper-parameters used for both types of goal spaces.

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
configs		configs
env		env
her_modules		her_modules
mpi_utils		mpi_utils
rl_modules		rl_modules
README.md		README.md
arguments.py		arguments.py
demo.py		demo.py
goal_sampler.py		goal_sampler.py
logger.py		logger.py
plots.py		plots.py
rollout.py		rollout.py
run_jobs.py		run_jobs.py
train.py		train.py
updates.py		updates.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Learning Self-Supervised Behaviors with Graph Attention-based Architectures

About

Releases

Packages

Languages

akakzia/rlgraph

Folders and files

Latest commit

History

Repository files navigation

Learning Self-Supervised Behaviors with Graph Attention-based Architectures

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages