
Reliably Re-Acting to Partner's Actions with the Social Intrinsic Motivation of Transfer Empowerment

This repo contains the code accompanying the paper Reliably Re-Acting to Partner's Actions with the Social Intrinsic Motivation of Transfer Empowerment. It builds on the MADDPG algorithm and uses the simulator from particle-env. One of the scenarios extends the single-agent OpenAI Gym Car environment to multiple agents.

We consider multi-agent reinforcement learning (MARL) for cooperative communication and coordination tasks. MARL agents can be brittle because they overfit their training partners' policies. This overfitting can produce agents that expect other agents to act in a certain way rather than react to their actual behavior. Our objective is to bias the learning process towards strategies that remain reactive to other agents' behavior. Our method, transfer empowerment, measures the potential influence between agents' actions. Results from three simulated cooperation scenarios support our hypothesis that transfer empowerment improves MARL performance. We discuss how transfer empowerment could be a useful principle to guide multi-agent coordination by ensuring reactiveness to one's partner.
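
To make the idea concrete, the sketch below estimates the mutual information between one agent's action and the other agent's next state in a toy discrete world; transfer empowerment is the capacity of that action-to-state channel, i.e. the same quantity maximised over the acting agent's action distribution. This is an illustration only, not the repo's implementation (the training flags below suggest the repo uses variational estimators instead of counting).

```python
# Illustrative sketch only (not the repo's code): estimate I(a_A ; s'_B),
# the mutual information between agent A's action and agent B's next state,
# in a toy discrete world. Transfer empowerment is the capacity of this
# channel, i.e. the same quantity maximised over A's action distribution.
import numpy as np
from collections import Counter

def toy_step(a_A, a_B, rng):
    """B's next cell depends on both its own action and A's 'push'."""
    push = {0: -1, 1: 0, 2: 1}[a_A]
    return a_B + push + int(rng.random() < 0.1)   # small transition noise

def estimate_mi(n_samples=50_000, seed=0):
    rng = np.random.default_rng(seed)
    joint = Counter()
    for _ in range(n_samples):
        a_A, a_B = int(rng.integers(3)), int(rng.integers(2))
        joint[(a_A, toy_step(a_A, a_B, rng))] += 1
    # Marginals over A's action and B's next state.
    p_a, p_s = Counter(), Counter()
    for (a, s), c in joint.items():
        p_a[a] += c
        p_s[s] += c
    n = float(n_samples)
    return sum(c / n * np.log((c / n) / ((p_a[a] / n) * (p_s[s] / n)))
               for (a, s), c in joint.items())    # nats

if __name__ == "__main__":
    # A positive value means A's actions influence B's next state.
    print(f"I(a_A ; s'_B) is approximately {estimate_mi():.3f} nats")
```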

Requirements

pip install -e .

How to Run

All training code is contained within main.py. To view the available options, run:

python main.py --help

If you want to check the training loss on TensorBoard, activate the virtual environment and run:

tensorboard --logdir models/model_name

If you want to train the methods from the paper on the 'simple_order' scenario:

python main.py simple_order si --social_influence
python main.py simple_order te --variational_transfer_all_action_pi_empowerment
python main.py simple_order je --variational_joint_empowerment

If you want to train the methods from the paper on the 'cars' scenario:

Simulation Videos

Cooperative Communication

The moving agent needs to go to a landmark with a particular color. However, it is blind, and another agent sends messages that help it navigate. Since there are more landmarks than communication channels, the speaking agent cannot simply output a symbol corresponding to a particular color. If the listening agent is not receptive to the messages, the speaker outputs random signals, which in turn forces the listener to ignore them. With empowerment, agents remain reactive to one another.

[Videos: DDPG vs. MADDPG vs. EMADDPG]
python main.py simple_speaker_listener3 maddpg+ve --recurrent --variational_transfer_empowerment

Cooperative Coordination

In this simple task, agents need to cover all landmarks. The MADDPG agents are trained by self-play, which causes them to agree upon a fixed rule, for example: agent 1 goes to the red landmark, agent 2 to the green and agent 3 to the blue. At test time, agent 1 is paired with agents 2 and 3 from a different run, so the former rule does not necessarily result in the most efficient landmark selection. In contrast, EMADDPG uses empowerment, which results in each agent picking the landmark closest to it.

[Videos: MADDPG vs. EMADDPG]
python main.py maddpg+ve --recurrent --variational_joint_empowerment
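
The pairing described above amounts to a cross-play evaluation: teams are assembled from agents trained in different runs. A minimal sketch of that protocol is below; `load_policies` and `make_env` are hypothetical placeholders, not the repo's actual loading code.

```python
# Minimal cross-play sketch (hypothetical helpers, not the repo's API):
# pair agent 0 from one training run with agents 1 and 2 from another run
# and measure how well the mixed team still covers the landmarks.
import numpy as np

def evaluate_team(env, policies, episodes=10, max_steps=25):
    """Average team reward for a fixed list of per-agent policies."""
    returns = []
    for _ in range(episodes):
        obs = env.reset()
        total = 0.0
        for _ in range(max_steps):
            actions = [policy(ob) for policy, ob in zip(policies, obs)]
            obs, rewards, dones, _ = env.step(actions)
            total += float(np.mean(rewards))
            if all(dones):
                break
        returns.append(total)
    return float(np.mean(returns))

# Hypothetical usage: mix checkpoints from two independent runs.
# run_a = load_policies("models/run_a")   # placeholder loader
# run_b = load_policies("models/run_b")
# env = make_env("cooperative_coordination")  # placeholder scenario name
# print(evaluate_team(env, [run_a[0], run_b[1], run_b[2]]))
```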

Cooperative Driving

Cars need to stay on the road and avoid collisions. Each agent only obtains a small top-view image and its own state, such as orientation and velocity.
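
A rough sketch of such a per-car observation is given below; the shapes and field names are assumptions for illustration, not the repo's actual observation spec.

```python
# Assumed observation layout for one car (shapes and names are guesses):
# a small top-view image patch plus the car's own orientation and velocity.
import numpy as np

def make_observation(top_view, heading, velocity):
    """top_view: (H, W, 3) uint8 crop around the car; heading in radians."""
    image = top_view.astype(np.float32) / 255.0        # normalised pixels
    ego = np.array([np.cos(heading), np.sin(heading),  # orientation as a unit vector
                    velocity[0], velocity[1]], dtype=np.float32)
    return {"image": image, "state": ego}
```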

Visual inputs of the red and green agent:

[Images: red agent's view, green agent's view]

[Videos, DDPG vs. MADDPG: overtaking, obstacle avoidance, junctions]

Results

Cooperative Coordination

Agent     Average dist.   Collisions %
MADDPG    1.767           20.9
EMADDPG   0.180           2.01

Average distance to a landmark (lower is better) and the percentage of collisions between agents (lower is better).
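
For reference, these two metrics could be computed from logged positions roughly as follows; the array shapes and the collision radius are assumptions, not taken from the repo.

```python
# Sketch of the two metrics above (assumed shapes and collision radius):
# mean distance from each landmark to its nearest agent, and the percentage
# of timesteps in which any two agents come closer than a collision radius.
import numpy as np

def coordination_metrics(agent_pos, landmark_pos, collision_radius=0.15):
    """agent_pos: (T, N, 2) positions over time; landmark_pos: (L, 2)."""
    # Distance from every landmark to its closest agent, averaged over time.
    d = np.linalg.norm(agent_pos[:, None, :, :] - landmark_pos[None, :, None, :], axis=-1)
    avg_dist = float(d.min(axis=2).mean())
    # Any pair of agents closer than the radius counts as a collision step.
    i, j = np.triu_indices(agent_pos.shape[1], k=1)
    pair_d = np.linalg.norm(agent_pos[:, i] - agent_pos[:, j], axis=-1)
    collision_pct = 100.0 * float((pair_d < collision_radius).any(axis=1).mean())
    return avg_dist, collision_pct
```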

Cooperative Communication

Agent     Target reach %   Average dist.   Obstacle hits %
MADDPG    84.0             2.233           53.5
EMADDPG   98.8             0.012           1.90

The target counts as reached if the agent ends up within 0.1 of the target landmark (higher is better).
