# Multi-agent training using Malmo
This example builds on the single-agent training example and shows how to train multiple agents with RLlib. Some concepts are explained in more detail in the single-agent training example, so we recommend starting there.

This guide shows how to start up multi-agent Malmo environments and train RL agents with RLlib.

Before we look at the code, let's go through how the multi-agent setup works in Malmo.

To run a multi-agent mission, Malmo requires N running instances, one for each agent. One instance acts as the server and the other instances join as clients.

The figure below shows the relationship between the Malmo instances.
![Multi-agent setup](imgs/malmo_multiagent_wrapper.png "Multi-agent wrapper")
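
The server/client split can be sketched in plain Python. This is only an illustration: `InstancePlan` and `plan_instances` are hypothetical helpers, not part of malmoenv, and the sketch assumes that role 0 is the instance acting as the server and that instances use consecutive ports.

```python
from dataclasses import dataclass


@dataclass
class InstancePlan:
    """Hypothetical description of one Malmo instance in a mission."""
    role: int        # index of the agent within the mission
    port: int        # port the instance listens on
    is_server: bool  # True for the instance hosting the mission


def plan_instances(num_agents, first_port):
    """Plan one instance per agent; role 0 is assumed to be the server."""
    return [
        InstancePlan(role=role, port=first_port + role, is_server=(role == 0))
        for role in range(num_agents)
    ]


for instance in plan_instances(num_agents=2, first_port=9000):
    kind = "server" if instance.is_server else "client"
    print(f"role {instance.role} on port {instance.port}: {kind}")
```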


In [None]:
# imports
from pathlib import Path
import os

# malmoenv imports
import malmoenv
from malmoenv.utils.launcher import launch_minecraft
from malmoenv.utils.wrappers import DownsampleObs
from malmoenv.turnbasedmultiagentenv import AgentConfig, TurnBasedRllibMultiAgentEnv

import ray
from ray.tune import register_env

The next step is to define some constants. These are mostly the same as in the single-agent example.
The main difference is that ```NUM_WORKERS``` represents the number of Malmo environments, not the number of agents. Take care when assigning resources, as this example uses 2 CPU cores per env.
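
As a rough aid for sizing, the resource rule above can be written as a small calculation. The helpers `cpus_needed` and `max_workers` are hypothetical, and reserving one core for the RLlib driver is an assumption of this sketch rather than a documented requirement.

```python
import os


def cpus_needed(num_workers, cpus_per_env=2):
    """CPU cores used by the environments, at 2 cores per env as in this example."""
    return num_workers * cpus_per_env


def max_workers(available_cpus, cpus_per_env=2, reserved=1):
    """Largest worker count that fits, keeping `reserved` cores free (assumed for the driver)."""
    return max(0, (available_cpus - reserved) // cpus_per_env)


available = os.cpu_count() or 1
print(f"{available} cores available, room for up to {max_workers(available)} env(s)")
```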

In [None]:
MULTI_AGENT_ENV = "malmo_multi_agent"
MISSION_XML = os.path.realpath('../../MalmoEnv/missions/mobchase_single_agent.xml')
COMMAND_PORT = 8999 # first port's number
xml = Path(MISSION_XML).read_text()

CHECKPOINT_FREQ = 100      # in terms of number of algorithm iterations
LOG_DIR = "results/"       # creates a new directory and puts results there

NUM_WORKERS = 1            # number of environments to run - each env gets multiple agents
NUM_GPUS = 0
TOTAL_STEPS = int(1e6)
launch_script = "./launchClient_quiet.sh"

Next we create a function that defines how RLlib generates the environment. The environment is the Python client that connects to the Malmo instances, so make sure that its port numbers match the ports used later to create the Minecraft instances.
In RLlib each worker has an index, accessible as ```config.worker_index```. Using this variable we can easily set the correct ports for each env.
The ```env_factory``` function is a good place to add wrappers; see the ```DownsampleObs``` wrapper added in this example.
To use RLlib we have created 2 functions:
- ```env_factory```: Connects to a Malmo instance, as we did with the ```create_env``` function in the single-agent example.
- ```create_multi_agent_env```: Assigns the correct roles to the agents and wraps the environments using the ```TurnBasedRllibMultiAgentEnv```.

Finally we have to register the env generator function to make it visible to RLlib.

In [None]:
def env_factory(agent_id, xml, role, host_address, host_port, command_address, command_port):
    # Connect a Python client to one Malmo instance.
    env = malmoenv.make()
    env.init(xml, host_port,
             server=host_address,      # instance hosting the mission
             server2=command_address,  # instance that receives this agent's commands
             port2=command_port,
             role=role,
             exp_uid="multiagent",
             reshape=True
             )
    # Wrappers go here; this one downsamples observations to 84x84.
    env = DownsampleObs(env, shape=(84, 84))

    return env

def create_multi_agent_env(config):
    # Each RLlib worker drives one env with two agents, so it needs two consecutive ports.
    port = COMMAND_PORT + (config.worker_index * 2)
    agent_config = [
        AgentConfig(id="agent1", address=port),
        AgentConfig(id="agent2", address=port + 1),
    ]
    env = TurnBasedRllibMultiAgentEnv(xml, agent_config,
                                      env_factory=env_factory)
    return env

register_env(MULTI_AGENT_ENV, create_multi_agent_env)
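
The port arithmetic in ```create_multi_agent_env``` can be checked in isolation before any Minecraft instance is started. The sketch below repeats the same formula in a hypothetical helper, ```agent_ports```. It assumes that RLlib numbers its remote workers from 1, with index 0 reserved for the local worker.

```python
def agent_ports(worker_index, first_port=8999, agents_per_env=2):
    """Ports used by one worker, following COMMAND_PORT + worker_index * 2."""
    base = first_port + worker_index * agents_per_env
    return [base + i for i in range(agents_per_env)]


# Ports each remote worker would connect to, assuming worker indices start at 1.
for worker_index in (1, 2):
    print(worker_index, agent_ports(worker_index))
```

Comparing this output with the ports passed to ```launch_minecraft``` is a quick way to confirm that every agent has an instance to connect to.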

The next step is to start up the Minecraft instances. Note that this step might take a few minutes.
In the background each Malmo instance gets copied to the ```/tmp/malmo_<hash>/malmo``` directory, where it is executed (each Minecraft instance requires its own directory).
After copying, the instances are started using the provided ```launch_script```. This is where we can define, for example, whether to run without rendering a window.
By default it uses the ```launchClient_quiet.sh``` script, which renders each window. Another provided script is ```launchClient_headless.sh```, which uses xvfb to export the display.

In [None]:

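# One instance per agent: 2 per worker, on the ports used in create_multi_agent_env
# (assumes RLlib worker indices start at 1, so the first port is COMMAND_PORT + 2).
GAME_INSTANCE_PORTS = [COMMAND_PORT + 2 + i for i in range(NUM_WORKERS * 2)]
instances = launch_minecraft(GAME_INSTANCE_PORTS, launch_script=launch_script)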
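
Because startup can take a few minutes, it may help to confirm that the instances are accepting connections before training begins. The sketch below uses only the standard library. ```port_is_open``` and ```wait_for_ports``` are hypothetical helpers, and the sketch assumes the instances listen on localhost and tolerate a connection that is opened and immediately closed. If ```launch_minecraft``` already blocks until the instances are ready, this check is redundant.

```python
import socket
import time


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(ports, host="127.0.0.1", total_timeout=300.0, poll_interval=5.0):
    """Poll until every port accepts connections; return the ports still closed."""
    deadline = time.monotonic() + total_timeout
    pending = set(ports)
    while pending:
        pending = {p for p in pending if not port_is_open(p, host)}
        if not pending or time.monotonic() >= deadline:
            break
        time.sleep(poll_interval)
    return sorted(pending)
```

A possible use is `missing = wait_for_ports(GAME_INSTANCE_PORTS)`, followed by stopping early if `missing` is not empty.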
