# Random Agent in Malmo
This guide shows how to set up a single-player Malmo mission. The example can serve as a starting point for using Malmo in your RL experiments.

## Malmo launcher
In earlier versions of ```malmoenv```, each Minecraft instance had to be started manually from the command line. The launcher now handles these processes automatically.
Each launcher instance copies Malmo into the ```/tmp/malmo_<hash>/``` directory and starts it with a launch script on a given port. The figure below shows this process with the first port set to 9000 and using the ```~/launch_headless.sh``` script. Note that the launcher searches for the launch script in the ```Minecraft/``` subdirectory.

![Malmo Launcher](../imgs/malmo_launcher.png)
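
The copy-and-launch process described above can be sketched in plain Python. This is only an illustration, not the launcher's actual implementation: the helper name `plan_instances`, the directory layout under the temporary directory, and the `-port` flag are all assumptions. Nothing is copied or started; the sketch only builds one working directory and one command per port.

```python
import shutil
import tempfile
from pathlib import Path


def plan_instances(ports, launch_script, tmp_root=None):
    """Return one (workdir, command) pair per port. Nothing is launched."""
    plans = []
    for port in ports:
        # mkdtemp appends a random suffix, standing in for the <hash> part.
        workdir = Path(tempfile.mkdtemp(prefix="malmo_", dir=tmp_root))
        # Assumed layout: <workdir>/malmo/Minecraft/<launch script>
        script = workdir / "malmo" / "Minecraft" / Path(launch_script).name
        # "-port" is an assumed flag; check the launch script you actually use.
        command = [str(script), "-port", str(port)]
        plans.append((workdir, command))
    return plans


plans = plan_instances([9000, 9001], "./launchClient_headless.sh")
for workdir, command in plans:
    print(workdir, "->", " ".join(command))
    # Remove the empty demo directory again.
    shutil.rmtree(workdir)
```

Each port gets its own directory because, as explained later in this guide, every Minecraft instance requires a separate copy of Malmo.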

In [None]:
# imports
from pathlib import Path
import os

# malmoenv imports
import malmoenv
from malmoenv.utils.launcher import launch_minecraft

The next step is to define some constants.

The ```MISSION_XML``` constant is the path to the file that defines the current mission. The ```malmoenv``` module communicates with the Java version of Minecraft through sockets, so the port numbers used by the environment and by the Minecraft instances must match. This example is set up to work with either one or multiple workers.

By default we provide two launch scripts:
- ```./launchClient_quiet.sh``` - runs Malmo as normal and redirects the output and error streams to the ```out.txt``` file in the copied Malmo directory under ```/tmp```.
- ```./launchClient_headless.sh``` - runs Malmo without rendering a window. Malmo's output is redirected the same way as with ```launchClient_quiet.sh```. This script requires ```xvfb``` to be installed and is useful for running Malmo on headless servers.
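
As a rough sketch, the choice between the two scripts could be automated. The helper `choose_launch_script` is hypothetical, and the checks rest on two assumptions: that a non-empty `DISPLAY` environment variable means a screen is available (typical for Linux with X11), and that `xvfb` provides the `xvfb-run` wrapper on the `PATH`.

```python
import os
import shutil


def choose_launch_script(has_display, has_xvfb):
    """Pick one of the two launch scripts named in this guide."""
    if has_display:
        return "./launchClient_quiet.sh"
    if has_xvfb:
        return "./launchClient_headless.sh"
    raise RuntimeError("No display found and xvfb does not seem to be installed")


# Assumption: DISPLAY is set when an X server is reachable.
has_display = bool(os.environ.get("DISPLAY"))
# Assumption: the xvfb package provides the xvfb-run wrapper.
has_xvfb = shutil.which("xvfb-run") is not None

try:
    suggested_script = choose_launch_script(has_display, has_xvfb)
except RuntimeError as err:
    suggested_script = None
    print(err)
print("Suggested launch script:", suggested_script)
```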

In [None]:
ENV_NAME = "malmo"
MISSION_XML = os.path.realpath('../../MalmoEnv/missions/mobchase_single_agent.xml')
COMMAND_PORT = 8999
xml = Path(MISSION_XML).read_text()

CHECKPOINT_FREQ = 100      # in terms of number of algorithm iterations
LOG_DIR = "results/"       # creates a new directory and puts results there

NUM_WORKERS = 1
NUM_GPUS = 0
EPISODES = 10
launch_script = "./launchClient_quiet.sh"

Next we create a dictionary called ```config``` to store the parameters required for creating Malmo environments, such as the mission XML and the ```COMMAND_PORT```. This example uses only a single environment.
By default, the environment returns a flattened representation of the observed frame; passing ```reshape=True``` to ```env.init``` keeps it as an image with [width, height, channels] dimensions.

In [None]:
config = {
    "xml": xml,
    "port": COMMAND_PORT,
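}


def create_env(config):
    env = malmoenv.make()
    # reshape=True keeps observations as images instead of flat arrays
    env.init(config["xml"], config["port"], reshape=True)
    # declare an unbounded reward range for the mission
    env.reward_range = (-float('inf'), float('inf'))
    return env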

env = create_env(config)
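
To make the difference between the flattened and the image representation concrete, here is a small pure-Python sketch. It assumes a row-major layout with the channel as the last axis, which is what NumPy's `reshape` uses by default; the helper `unflatten` is hypothetical and is not part of ```malmoenv```.

```python
def unflatten(flat, shape):
    """Regroup a flat sequence into nested lists of shape (d0, d1, channels)."""
    d0, d1, channels = shape
    if len(flat) != d0 * d1 * channels:
        raise ValueError("flat length does not match the requested shape")
    return [
        [
            # Row-major order: the channel index varies fastest.
            list(flat[(i * d1 + j) * channels:(i * d1 + j + 1) * channels])
            for j in range(d1)
        ]
        for i in range(d0)
    ]


# A tiny 2 x 3 frame with 3 channels, filled with 0..17 for illustration.
flat = list(range(2 * 3 * 3))
image = unflatten(flat, (2, 3, 3))
print(image[0][0])  # [0, 1, 2]
print(image[1][2])  # [15, 16, 17]
```

In practice the observations are arrays, so no manual regrouping is needed; the sketch only shows which flat entries end up in which pixel.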

The next step is to start up the Minecraft instances. Note that this step might take a few minutes.
In the background, each Malmo instance is copied to the ```/tmp/malmo_<hash>/malmo``` directory, where it is executed (each Minecraft instance requires its own directory).
After copying, the instances are started using the provided ```launch_script```; this is where we can choose, for example, to run Malmo without rendering a window.

In [None]:
GAME_INSTANCE_PORTS = [COMMAND_PORT + i for i in range(NUM_WORKERS)]
instances = launch_minecraft(GAME_INSTANCE_PORTS, launch_script=launch_script)
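
Because startup can take minutes, it can help to check whether a port already accepts connections before resetting the environment. The sketch below uses only the standard library; the helper `wait_for_port` is hypothetical. It is not known here whether `launch_minecraft` already blocks until the instances are ready, or whether a bare probe connection is harmless to Malmo, so treat both as assumptions. An open port also does not guarantee that Minecraft has finished loading.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=300.0, poll_interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again immediately by the with-block.
            with socket.create_connection((host, port), timeout=poll_interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
```

A possible use would be `all(wait_for_port(p) for p in GAME_INSTANCE_PORTS)` right after the launch call.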

The final step is to run the random agent in Malmo. With the default launch script you should see Malmo in a new window on your screen. Resetting the env might take a few seconds, depending on the complexity of the mission. In this example we accumulate the rewards and the game steps for each episode and print them to the console.
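
The episode logic can also be tried without a running Minecraft instance. The sketch below factors it into a hypothetical `run_episode` helper and runs it against `StubEnv`, a made-up stand-in that only mimics the `reset`, `step` and `action_space.sample` calls used in this guide; it is not a Malmo environment.

```python
import random


class StubEnv:
    """Hypothetical stand-in for an env; every episode ends after max_steps."""

    class _ActionSpace:
        def __init__(self, n):
            self.n = n

        def sample(self):
            # Uniformly random discrete action, like a random agent would pick.
            return random.randrange(self.n)

    def __init__(self, max_steps=5, num_actions=4):
        self.action_space = self._ActionSpace(num_actions)
        self.max_steps = max_steps
        self._t = 0

    def reset(self):
        self._t = 0
        return None  # no real observation in this stub

    def step(self, action):
        self._t += 1
        done = self._t >= self.max_steps
        return None, 1.0, done, {}  # obs, reward, done, info


def run_episode(env):
    """Play one episode with random actions; return (steps, total_reward)."""
    env.reset()
    steps, total_reward, done = 0, 0.0, False
    while not done:
        action = env.action_space.sample()
        _, reward, done, _ = env.step(action)
        steps += 1
        total_reward += reward
    return steps, total_reward


steps, total_reward = run_episode(StubEnv(max_steps=5))
print(f"Episode finished in {steps} steps with reward: {total_reward}")
```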

At the end we close the environment and kill the Java instances in the background.

In [None]:
for i in range(EPISODES):
    obs = env.reset()
    steps = 0
    total_rewards = 0
    done = False
    while not done:
        action = env.action_space.sample()
        obs, reward, done, info = env.step(action)
        steps += 1
        total_rewards += reward

        if done:
            print(f"Episode finished in {steps} steps with reward: {total_rewards}")

# close envs
env.close()
for instance in instances:
    instance.communicate()
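
If the background processes need to be stopped explicitly, a sketch along the following lines may help. It assumes that the objects returned by the launcher behave like `subprocess.Popen` instances; the helper `shutdown` is hypothetical. If a launch script is a shell wrapper, stopping the wrapper may not stop the Java process it started, so verify this on your system.

```python
import subprocess
import sys


def shutdown(instances, grace_period=10.0):
    """Ask each process to terminate, then force-kill any that linger."""
    for proc in instances:
        if proc.poll() is None:  # still running
            proc.terminate()
    for proc in instances:
        try:
            proc.wait(timeout=grace_period)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


# Demo with a harmless dummy process instead of Minecraft.
dummy = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
shutdown([dummy])
print("dummy stopped:", dummy.returncode is not None)
```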