Merge pull request #419 from Rohan138/master
jkterry1 committed Jul 16, 2021
2 parents 07e96c6 + f35548d commit 383c152
Showing 23 changed files with 140 additions and 84 deletions.
10 changes: 9 additions & 1 deletion docs/mpe.md
@@ -11,7 +11,7 @@ pip install pettingzoo[mpe]

Multi Particle Environments (MPE) are a set of communication-oriented environments where particle agents can (sometimes) move, communicate, see each other, push each other around, and interact with fixed landmarks.

These environments are from [OpenAI's MPE](https://github.com/openai/multiagent-particle-envs) codebase, with several minor fixes, mostly related to making the action space discrete, making the rewards consistent and cleaning up the observation space of certain environments.
These environments are from [OpenAI's MPE](https://github.com/openai/multiagent-particle-envs) codebase, with several minor fixes, mostly related to making the action space discrete by default, making the rewards consistent and cleaning up the observation space of certain environments.

### Types of Environments

@@ -43,8 +43,16 @@ If an agent cannot see or observe the communication of a second agent, then the

### Action Space

Note: [OpenAI's MPE](https://github.com/openai/multiagent-particle-envs) uses continuous action spaces by default.

Discrete action space (Default):

The action space is a discrete action space representing the combinations of movements and communications an agent can perform. Agents that can move can choose between the 4 cardinal directions or do nothing. Agents that can communicate choose between 2 and 10 environment-dependent communication options, which broadcast a message to all agents that can hear it.
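
A minimal random-action loop over the default discrete spaces might look like the sketch below (it assumes `simple_reference_v2` and the AEC API of this release, where `last()` returns `(observation, reward, done, info)`):

```
from pettingzoo.mpe import simple_reference_v2

# Discrete movement/communication actions are the default.
env = simple_reference_v2.env(max_cycles=25)
env.reset()
for agent in env.agent_iter():
    obs, reward, done, info = env.last()
    # Sample a random index from this agent's Discrete space,
    # or pass None once the agent is done.
    action = None if done else env.action_spaces[agent].sample()
    env.step(action)
env.close()
```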

Continuous action space (Set by continuous_actions=True):

The action space is a continuous action space representing the movements and communication an agent can perform. Agents that can move can input a velocity between 0.0 and 1.0 for each of the four cardinal directions; opposing velocities (e.g. left and right) are summed together. Agents that can communicate output a continuous value over each communication channel they have access to.
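
A minimal sketch of driving a continuous agent is shown below; `simple_reference_v2` and the channel order `[no_action, move_left, move_right, move_down, move_up, ...]` are assumptions here, not guarantees:

```
import numpy as np
from pettingzoo.mpe import simple_reference_v2

env = simple_reference_v2.env(max_cycles=25, continuous_actions=True)
env.reset()
for agent in env.agent_iter():
    obs, reward, done, info = env.last()
    if done:
        action = None
    else:
        space = env.action_spaces[agent]  # Box(0.0, 1.0, ...)
        action = np.zeros(space.shape, dtype=np.float32)
        # Index 2 is assumed to be the "move_right" channel; since opposing
        # channels are summed, leaving "move_left" at 0.0 gives a net push right.
        action[2] = 1.0
    env.step(action)
env.close()
```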

### Rendering

Rendering displays the scene in a window that automatically grows if agents wander beyond its border. Communication is rendered at the bottom of the scene. The `render()` method also returns the pixel map of the rendered area.
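
A rough sketch of capturing that pixel map (assuming `simple_spread_v2` and the Gym-style `mode="rgb_array"` argument):

```
from pettingzoo.mpe import simple_spread_v2

env = simple_spread_v2.env(max_cycles=25)
env.reset()
for agent in env.agent_iter():
    obs, reward, done, info = env.last()
    env.step(None if done else env.action_spaces[agent].sample())
    # Draw the scene and keep the returned RGB frame for logging or video.
    frame = env.render(mode="rgb_array")
env.close()
```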
8 changes: 5 additions & 3 deletions docs/mpe/simple.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple"
agents: "1"
manual-control: "No"
action-shape: "(5)"
action-values: "Discrete(5)"
action-values: "Discrete(5)/Box(0.0, 1.0, (5,))"
observation-shape: "(4)"
observation-values: "(-inf,inf)"
import: "from pettingzoo.mpe import simple_v2"
@@ -22,10 +22,12 @@ Observation space: `[self_vel, landmark_rel_position]`
### Arguments

```
simple_v2.env(max_cycles=25)
simple_v2.env(max_cycles=25, continuous_actions=False)
```



`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous
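
A small sketch comparing the two modes (the agent name `agent_0` is assumed from the usual MPE naming):

```
from pettingzoo.mpe import simple_v2

disc_env = simple_v2.env(max_cycles=25, continuous_actions=False)
cont_env = simple_v2.env(max_cycles=25, continuous_actions=True)
disc_env.reset()
cont_env.reset()
print(disc_env.action_spaces["agent_0"])  # expected: Discrete(5)
print(cont_env.action_spaces["agent_0"])  # expected: Box(0.0, 1.0, (5,), float32)
```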

8 changes: 5 additions & 3 deletions docs/mpe/simple_adversary.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Adversary"
agents: "3"
manual-control: "No"
action-shape: "(5)"
action-values: "Discrete(5)"
action-values: "Discrete(5)/Box(0.0, 1.0, (5))"
observation-shape: "(8),(10)"
observation-values: "(-inf,inf)"
import: "from pettingzoo.mpe import simple_adversary_v2"
@@ -28,11 +28,13 @@ Adversary action space: `[no_action, move_left, move_right, move_down, move_up]`
### Arguments

```
simple_adversary_v2.env(N=2, max_cycles=25)
simple_adversary_v2.env(N=2, max_cycles=25, continuous_actions=False)
```



`N`: number of good agents and landmarks

`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous
8 changes: 5 additions & 3 deletions docs/mpe/simple_crypto.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Crypto"
agents: "2"
manual-control: "No"
action-shape: "(4)"
action-values: "Discrete(4)"
action-values: "Discrete(4)/Box(0.0, 1.0, (4))"
observation-shape: "(4),(8)"
observation-values: "(-inf,inf)"
import: "from pettingzoo.mpe import simple_crypto_v2"
@@ -35,9 +35,11 @@ For Bob and Eve, their communication is checked to be the 1 bit of information t
### Arguments

```
simple_crypto_v2.env(max_cycles=25)
simple_crypto_v2.env(max_cycles=25, continuous_actions=False)
```



`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous
6 changes: 3 additions & 3 deletions docs/mpe/simple_push.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Push"
agents: "2"
manual-control: "No"
action-shape: "(5)"
action-values: "Discrete(5)"
action-values: "Discrete(5)/Box(0.0, 1.0, (5,))"
observation-shape: "(8),(19)"
observation-values: "(-inf,inf)"
import: "from pettingzoo.mpe import simple_push_v2"
@@ -28,7 +28,7 @@ Adversary action space: `[no_action, move_left, move_right, move_down, move_up]`
### Arguments

```
simple_push_v2.env(max_cycles=25)
simple_push_v2.env(max_cycles=25, continuous_actions=False)
```


13 changes: 9 additions & 4 deletions docs/mpe/simple_reference.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Reference"
agents: "2"
manual-control: "No"
action-shape: "(50)"
action-values: "Discrete(50)"
action-values: "Discrete(50)/Box(0.0, 1.0, (15))"
observation-shape: "(21)"
observation-values: "(-inf,inf)"
average-total-reward: "-57.1"
@@ -22,19 +22,24 @@ Locally, the agents are rewarded by their distance to their target landmark. Glo

Agent observation space: `[self_vel, all_landmark_rel_positions, landmark_ids, goal_id, communication]`

Agent action space: `[say_0, say_1, say_2, say_3, say_4, say_5, say_6, say_7, say_8, say_9] X [no_action, move_left, move_right, move_down, move_up]`
Agent discrete action space: `[say_0, say_1, say_2, say_3, say_4, say_5, say_6, say_7, say_8, say_9] X [no_action, move_left, move_right, move_down, move_up]`

Where X is the Cartesian product (giving a total action space of 50).

Agent continuous action space: `[no_action, move_left, move_right, move_down, move_up, say_0, say_1, say_2, say_3, say_4, say_5, say_6, say_7, say_8, say_9]`
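
A short sketch of the bookkeeping behind these sizes (the exact index-to-pair mapping used internally is not assumed here):

```
from itertools import product

says = [f"say_{i}" for i in range(10)]
moves = ["no_action", "move_left", "move_right", "move_down", "move_up"]

# Discrete mode: the Cartesian product gives 10 * 5 = 50 joint actions.
joint_actions = list(product(says, moves))
assert len(joint_actions) == 50

# Continuous mode instead exposes one channel per primitive: 5 + 10 = 15,
# matching Box(0.0, 1.0, (15)) above.
```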

### Arguments


```
simple_reference_v2.env(local_ratio=0.5, max_cycles=25)
simple_reference_v2.env(local_ratio=0.5, max_cycles=25, continuous_actions=False)
```



`local_ratio`: Weight applied to local reward and global reward. Global reward weight will always be 1 - local reward weight.

`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous

8 changes: 5 additions & 3 deletions docs/mpe/simple_speaker_listener.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Speaker Listener"
agents: "2"
manual-control: "No"
action-shape: "(3),(5)"
action-values: "Discrete(3),(5)"
action-values: "Discrete(3),(5)/Box(0.0, 1.0, (3)), Box(0.0, 1.0, (5))"
observation-shape: "(3),(11)"
observation-values: "(-inf,inf)"
average-total-reward: "-80.9"
@@ -29,9 +29,11 @@ Listener action space: `[no_action, move_left, move_right, move_down, move_up]`
### Arguments

```
simple_speaker_listener_v2.env(max_cycles=25)
simple_speaker_listener_v2.env(max_cycles=25, continuous_actions=False)
```



`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous
8 changes: 5 additions & 3 deletions docs/mpe/simple_spread.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Spread"
agents: "3"
manual-control: "No"
action-shape: "(5)"
action-values: "Discrete(5)"
action-values: "Discrete(5)/Box(0.0, 1.0, (5))"
observation-shape: "(18)"
observation-values: "(-inf,inf)"
average-total-reward: "-115.6"
@@ -27,7 +27,7 @@ Agent action space: `[no_action, move_left, move_right, move_down, move_up]`
### Arguments

```
simple_spread_v2.env(N=3, local_ratio=0.5, max_cycles=25)
simple_spread_v2.env(N=3, local_ratio=0.5, max_cycles=25, continuous_actions=False)
```


@@ -37,3 +37,5 @@ simple_spread_v2.env(N=3, local_ratio=0.5, max_cycles=25)
`local_ratio`: Weight applied to local reward and global reward. Global reward weight will always be 1 - local reward weight.

`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous
8 changes: 5 additions & 3 deletions docs/mpe/simple_tag.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple Tag"
agents: "4"
manual-control: "No"
action-shape: "(5)"
action-values: "Discrete(5)"
action-values: "Discrete(5)/Box(0.0, 1.0, (50))"
observation-shape: "(14),(16)"
observation-values: "(-inf,inf)"
import: "from pettingzoo.mpe import simple_tag_v2"
@@ -34,7 +34,7 @@ Agent and adversary action space: `[no_action, move_left, move_right, move_down,
### Arguments

```
simple_tag_v2.env(num_good=1, num_adversaries=3, num_obstacles=2 , max_cycles=25)
simple_tag_v2.env(num_good=1, num_adversaries=3, num_obstacles=2, max_cycles=25, continuous_actions=False)
```


@@ -47,3 +47,5 @@ simple_tag_v2.env(num_good=1, num_adversaries=3, num_obstacles=2 , max_cycles=25

`max_cycles`: number of frames (a step for each agent) until game terminates

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous

11 changes: 7 additions & 4 deletions docs/mpe/simple_world_comm.md
@@ -1,10 +1,10 @@
---
actions: "Discrete"
actions: "Discrete/Continuous"
title: "Simple World Comm"
agents: "6"
manual-control: "No"
action-shape: "(5),(20)"
action-values: "Discrete(5),(20)"
action-values: "Discrete(5),(20)/Box(0.0, 1.0, (5)), Box(0.0, 1.0, (9))"
observation-shape: "(28),(34)"
observation-values: "(-inf,inf)"
import: "from pettingzoo.mpe import simple_world_comm_v2"
@@ -31,16 +31,17 @@ Good agent action space: `[no_action, move_left, move_right, move_down, move_up]

Normal adversary action space: `[no_action, move_left, move_right, move_down, move_up]`

Adversary leader observation space: `[say_0, say_1, say_2, say_3] X [no_action, move_left, move_right, move_down, move_up]`
Adversary leader discrete action space: `[say_0, say_1, say_2, say_3] X [no_action, move_left, move_right, move_down, move_up]`

Where X is the Cartesian product (giving a total action space of 20).

Adversary leader continuous action space: `[no_action, move_left, move_right, move_down, move_up, say_0, say_1, say_2, say_3]`
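
A short sketch of inspecting these per-agent spaces (agent naming and sizes assumed from the table above):

```
from pettingzoo.mpe import simple_world_comm_v2

env = simple_world_comm_v2.env(continuous_actions=False)
env.reset()
for agent in env.agents:
    # The adversary leader should report Discrete(20); the purely moving
    # agents and adversaries should report Discrete(5).
    print(agent, env.action_spaces[agent])
```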

### Arguments

```
simple_world_comm.env(num_good=2, num_adversaries=4, num_obstacles=1,
num_food=2, max_cycles=25, num_forests=2)
num_food=2, max_cycles=25, num_forests=2, continuous_actions=False)
```


@@ -57,3 +58,5 @@ simple_world_comm.env(num_good=2, num_adversaries=4, num_obstacles=1,

`num_forests`: number of forests that can hide agents inside from being seen

`continuous_actions`: Whether agent action spaces are discrete (default) or continuous

2 changes: 1 addition & 1 deletion pettingzoo/mpe/_mpe_utils/rendering.py
@@ -283,7 +283,7 @@ def set_text(self, text):
self.label = pyglet.text.Label(text,
font_name=font,
color=(0, 0, 0, 255),
font_size=25,
font_size=20,
x=0, y=self.idx * 40 + 20,
anchor_x="left", anchor_y="bottom")

