
CEIA RL Soccer-Twos

A pre-compiled Soccer-Twos environment with multi-agent Gym-compatible wrappers and a human-friendly visualizer. Built on top of Unity ML-Agents and used as the final assignment for the Reinforcement Learning Minicourse at CEIA / Deep Learning Brazil.


Pre-compiled versions of this environment are available for Linux, Windows, and macOS (x86-64). The source code for this environment is available here.

Requirements

See requirements.txt.

Usage

For training

Import this package and instantiate the environment:

import soccer_twos

env = soccer_twos.make()

The make method accepts several options:

| Option | Description |
| --- | --- |
| render | Whether to render the environment. Defaults to False. |
| watch | Whether to run an audience-friendly version of the provided Soccer-Twos environment. Forces render to True, time_scale to 1, and quality_level to 5. Has no effect when env_path is set. Defaults to False. |
| variation | A soccer env variation in EnvType. Defaults to EnvType.multiagent_player. |
| time_scale | The time scale to use for the environment. Keep this below 100x for better simulation accuracy. Defaults to 20x real time. |
| quality_level | The quality level to use when rendering the environment. Ranges from 0 (lowest) to 5 (highest). Defaults to 0. |
| base_port | The base port used to communicate with the environment. Defaults to 50039. |
| worker_id | A base port shift used to avoid communication conflicts. Defaults to 0. |
| env_path | The path to the environment executable. Overrides watch. Defaults to the provided Soccer-Twos environment. |
| flatten_branched | If True, turns branched discrete action spaces into a Discrete space rather than MultiDiscrete. Defaults to False. |
| opponent_policy | The policy to use for the opponent when variation == team_vs_policy. Defaults to a random agent. |
| single_player | Whether to let the agent control a single player while the other stays still. Only works when variation == team_vs_policy. Defaults to False. |
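
For example, the watch mode and the training-oriented options can be combined like this (a sketch; the option values are illustrative, and EnvType is assumed to be importable from the package top level):

import soccer_twos
from soccer_twos import EnvType

# Audience-friendly match: forces render=True, time_scale=1, quality_level=5
watch_env = soccer_twos.make(watch=True)

# Train a single player against a random opponent, with a flat Discrete action space
train_env = soccer_twos.make(
    variation=EnvType.team_vs_policy,
    single_player=True,
    flatten_branched=True,
)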

The created env exposes a basic Gym interface. Namely, the methods reset(), step(action: Dict[int, np.ndarray]), and close() are available. The render() method currently has no effect; use soccer_twos.make(render=True) instead. The step() method returns extra information about the players and the ball in its last tuple element. This information may be used to build custom reward functions if needed.
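
Since the structure of that extra info is easiest to learn by inspection, here is a minimal sketch that prints it after a single random step (the exact keys are not documented here, so inspect the output before relying on any of them):

import soccer_twos

env = soccer_twos.make()
env.reset()
# One random action per agent; info carries the player and ball data
obs, reward, done, info = env.step(
    {i: env.action_space.sample() for i in range(4)}
)
print(info)  # inspect what is available for custom reward shaping
env.close()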

We expose an RLlib-compatible multiagent interface. This means, for example, that action should be a dict whose keys are the integers {0, 1, 2, 3}, one per agent, and whose values are single actions shaped like env.action_space.shape. Observations and rewards follow the same structure. Dones are only set for the key __all__, which means "all agents". Agents 0 and 1 correspond to the blue team; agents 2 and 3 correspond to the orange team.

Here's a full example:

import soccer_twos

env = soccer_twos.make(render=True)
print("Observation Space: ", env.observation_space.shape)
print("Action Space: ", env.action_space.shape)

team0_reward = 0
team1_reward = 0
env.reset()
while True:
    # One random action per agent: 0-1 are the blue team, 2-3 the orange team
    obs, reward, done, info = env.step(
        {
            0: env.action_space.sample(),
            1: env.action_space.sample(),
            2: env.action_space.sample(),
            3: env.action_space.sample(),
        }
    )

    team0_reward += reward[0] + reward[1]
    team1_reward += reward[2] + reward[3]
    if done["__all__"]:  # the episode ends for all agents at once
        print("Total Reward: ", team0_reward, " x ", team1_reward)
        team0_reward = 0
        team1_reward = 0
        env.reset()

More information about the environment, including reward functions and observation spaces, can be found here.

Watching / evaluating

You may implement your own rollout script using soccer_twos.make(watch=True), or use our CLI tool. To roll out via the CLI, create an implementation (subclass) of soccer_twos.AgentInterface and run python -m soccer_twos.watch -m agent_module. This runs a human-friendly version of the environment in which your agent plays against itself. You may instead run python -m soccer_twos.watch -m1 agent_module -m2 opponent_module to play against a different opponent, as in the sketch below.
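
For reference, a minimal agent module might look roughly like this. This is a sketch, not the package's documented API: the act() method and the env constructor argument are assumptions about soccer_twos.AgentInterface, so check the interface definition before relying on them.

# agent_module.py
from soccer_twos import AgentInterface

class RandomAgent(AgentInterface):
    # The constructor argument and the act() signature are assumptions;
    # see soccer_twos.AgentInterface for the actual abstract methods.
    def __init__(self, env):
        self.env = env

    def act(self, observation):
        # Replace the random sample with your trained policy's output
        return self.env.action_space.sample()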
