# DeepMind MuJoCo Multi-Agent Soccer Environment

This submodule contains the components and environment described in the ICLR 2019 paper [Emergent Coordination through Competition](http://arxiv.org/abs/1902.07151).

![soccer](soccer.png)

## Installation and requirements

See `dm_control` for instructions.

## Quickstart

```python
import numpy as np
from dm_control.locomotion import soccer as dm_soccer

# Load the 2-vs-2 soccer environment with episodes of 10 seconds each.
env = dm_soccer.load(team_size=2, time_limit=10.)

# Retrieve the action specs for all four players.
action_specs = env.action_spec()

# Step through the environment for one episode with random actions.
time_step = env.reset()
while not time_step.last():
  actions = []
  for action_spec in action_specs:
    action = np.random.uniform(
        action_spec.minimum, action_spec.maximum, size=action_spec.shape)
    actions.append(action)
  time_step = env.step(actions)

  for i in range(len(action_specs)):
    print(
        "Player {}: reward = {}, discount = {}, observations = {}.".format(
            i, time_step.reward[i], time_step.discount,
            time_step.observation[i]))
```

## Rewards

The environment provides a reward of +1 to each player when their team scores a goal, -1 when their team concedes a goal, and 0 on every timestep in which neither team scores.

In addition to the sparse reward returned by the environment, each player's observations contain various environment statistics that may be used to derive custom per-player shaping rewards (as was done in http://arxiv.org/abs/1902.07151, where the environment reward was ignored).
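For example, one common shaping term rewards a player for moving toward the ball. The sketch below computes such a term from plain numpy arrays; the observation key names (`ball_ego_position`, `velocity`) are hypothetical placeholders, not the environment's actual observation keys — a real implementation should inspect `env.observation_spec()` to find the corresponding statistics.

```python
import numpy as np


def velocity_to_ball_reward(obs):
  """Hypothetical shaping term: the player's speed toward the ball.

  `obs` is assumed to be a dict of numpy arrays with placeholder keys
  `ball_ego_position` (vector from the player to the ball) and
  `velocity` (the player's linear velocity). The real soccer
  observation keys differ; this only illustrates the idea.
  """
  to_ball = obs["ball_ego_position"]
  direction = to_ball / (np.linalg.norm(to_ball) + 1e-8)
  # Positive when moving toward the ball, negative when moving away.
  return float(np.dot(obs["velocity"], direction))


# Toy example: ball 3 m ahead along x, player moving at 2 m/s toward it,
# so the shaping term is approximately +2.0.
obs = {"ball_ego_position": np.array([3.0, 0.0, 0.0]),
       "velocity": np.array([2.0, 0.0, 0.0])}
shaping = velocity_to_ball_reward(obs)
```

Such a term would typically be added, with a small weight, to the sparse environment reward rather than replacing it.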

## Episode terminations

Episodes terminate immediately with a discount factor of 0 when either side scores a goal. There is also a per-episode `time_limit` (45 seconds by default); if neither team scores within this time, the episode terminates with a discount factor of 1.
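This distinction matters when computing returns: a terminal discount of 0 (a goal) truncates bootstrapping, while a terminal discount of 1 (time limit) lets a learner bootstrap from a value estimate of the final state. A minimal sketch of an n-step return under this convention, with illustrative numbers only:

```python
def discounted_return(rewards, discounts, bootstrap_value, gamma=0.99):
  """Backward accumulation of an n-step return.

  `discounts` holds the environment discounts (0.0 at a goal,
  1.0 at a time-limit termination), which gate both the agent's
  own gamma and the bootstrap value.
  """
  g = bootstrap_value
  for r, d in zip(reversed(rewards), reversed(discounts)):
    g = r + gamma * d * g
  return g


# Episode ending in a scored goal: the final discount of 0.0 drops the
# bootstrap value entirely, so only the +1 goal reward propagates back.
goal_return = discounted_return(
    rewards=[0.0, 0.0, 1.0], discounts=[1.0, 1.0, 0.0], bootstrap_value=5.0)

# Episode ending at the time limit: the final discount of 1.0 keeps the
# bootstrap value in the return.
timeout_return = discounted_return(
    rewards=[0.0, 0.0, 0.0], discounts=[1.0, 1.0, 1.0], bootstrap_value=5.0)
```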

## Environment Viewer

To visualize an example 2-vs-2 soccer environment in the `dm_control` interactive viewer, execute `dm_control/locomotion/soccer/explore.py`.
