# Tutorial 1: Halite Environment

In this first tutorial we will see how to use the game engine that we implemented.

If you have not looked at the game basics yet, now is the time!

https://2018.halite.io/learn-programming-challenge/game-overview

First of all, we import the basic libraries that we need:

In [1]:
import sys
sys.path.insert(0, "../Environment/")
import halite_env as Env

In [2]:
import numpy as np
import matplotlib.pyplot as plt

The halite environment is implemented as a class (that inherits from gym.core.Env) called HaliteEnv. 

As the output of the help function shows, the only arguments required to instantiate the environment are the number of players and the map size. The map is a square, so the size is just the length of one side.

In [3]:
help(Env.HaliteEnv)

Help on class HaliteEnv in module halite_env:

class HaliteEnv(gym.core.Env)
 |  HaliteEnv(num_players, map_size, episode_lenght=400, regen_map_on_reset=False, map_type=None)
 |  
 |  Stores the Halite III OpenAI gym environment.
 |  
 |  [Original, to change]
 |  This environment does not use Halite III's actual game engine
 |  (which analyzes input from terminal and is slow for RL) but instead is
 |  a replica in Python.
 |  
 |  Attributes:
 |  -----------
 |  self.map : np.ndarray
 |      Map of game as a 3D array. Stores different information on each "layer"
 |      of the array.
 |  Layer 0: The Halite currently on the sea floor
 |  Layer 1: The Halite currently on ships/factory/dropoff
 |  Layer 2: Whether a Factory or Dropoff exists at the layer (Factory is 1, Dropoff is -1)
 |  Layer 3: Whether a Ship exists at the layer
 |  Layer 4: Ownership
 |  Layer 5: Inspiration (not given as part of observation by default)
 |  
 |  self.mapSize : int
 |      Size of map (for x and y)
 |

In [6]:
# parameters for a single player in a 7x7 map
NUM_PLAYERS = 1
MAP_SIZE = 7

# instantiate class
env = Env.HaliteEnv(num_players=NUM_PLAYERS, map_size=MAP_SIZE)

## Environment layers

Each environment instance has an attribute called `map`, which holds all the information about the map and the entities found on it (Ships, Shipyards and Dropoffs).

The map is a multi-layer (3-dimensional) array: the first two coordinates identify a cell of the map, and the third selects one of the layers below. Note that this order, which matches the arrays printed in the rest of this tutorial, differs from the one listed in the docstring above:

Layer 0: The Halite currently on the sea floor <br>
Layer 1: Whether a Ship exists at the cell <br>
Layer 2: The Halite currently on ships/shipyard/dropoff <br>
Layer 3: Whether a Shipyard or a Dropoff exists at the cell (Shipyard is 1, Dropoff is -1) <br>
Layer 4: Ownership of Ships, Shipyards and Dropoffs (always 1 for your own entities; only used in multiplayer)<br>

All halite values are integers that range between 0 and 1000.
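
To avoid magic numbers when indexing the map, the layer order listed above can be given names. This is a minimal sketch that uses a zero-filled stand-in array instead of a real `env.map`; the constant names are hypothetical and are not defined by `halite_env`.

```python
import numpy as np

# Hypothetical names for the layer indices listed above (not part of halite_env)
LAYER_MAP_HALITE = 0   # halite on the sea floor
LAYER_SHIP = 1         # 1 where a ship is present
LAYER_CARGO = 2        # halite carried by ships/shipyard/dropoff
LAYER_STRUCTURE = 3    # shipyard = 1, dropoff = -1
LAYER_OWNER = 4        # ownership (multiplayer only)

# Stand-in for env.map: a 7x7 map with 5 layers
demo_map = np.zeros((7, 7, 5), dtype=int)
demo_map[3, 3, LAYER_STRUCTURE] = 1  # shipyard in the central cell

# Coordinates of every cell that holds a shipyard
shipyard_cells = np.argwhere(demo_map[:, :, LAYER_STRUCTURE] == 1)
print(shipyard_cells)
```

With a real environment, `demo_map` would simply be replaced by `env.map`.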

In [7]:
# since the MAP_SIZE is 7, we have a 7x7x5 numpy array
env.map.shape

(7, 7, 5)

Now we look at all the layers at turn zero. Since we have not made any move yet, no ship is present and the whole thing is a bit boring, but we will come back to that.

In [12]:
# halite in the map, min = 0, max = 1000
env.map[:,:,0]

# NOTE: a cell that contains a shipyard (the central cell in the single-player case) always holds zero halite

array([[274, 826, 324, 362, 872, 274, 487],
       [ 99, 970, 889, 241, 811, 581, 283],
       [189, 828, 765, 191,  61,  17, 380],
       [964, 876, 312,   0, 871, 597, 202],
       [700, 544, 290, 576, 636, 487, 383],
       [793, 188, 176, 796,  48, 880,  77],
       [714, 334, 597, 698, 816, 228, 569]])

In [13]:
# ships' positions (1 where a ship is present, 0 otherwise)
env.map[:,:,1] 

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

In [14]:
# shows the halite carried by each ship, in the cell where the ship is located
# initially there is no ship, hence no halite carried either
env.map[:,:,2] 

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

In [15]:
# shipyard position (1 in the center)
env.map[:,:,3] 

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

The last layer (ownership) is not implemented in the single-player case, because it would be useless.

In [17]:
# finally we can print our current halite stored
# we are represented as player #0 in a list of players (of length 1 in this case)
print("Initial halite: ", env.player_halite[0])

Initial halite:  [5000.]


## Submitting actions

Since Halite is a turn-based game, at every turn we need to submit to the game engine the actions for all the entities that we control, namely the shipyard and any ships we own. This submission is called `step` and is implemented as a method of the environment class.

The shipyard has a binary choice: it either builds a ship (for 1000 halite) or it doesn't. This choice must be passed as an argument of the method with the keyword `makeship`.

For the ships, we must instead provide a matrix with the same shape as the map. It contains the action chosen for this turn in each cell where a ship is located, and -1 in all other cells.

The action space of a ship consists of the integers from 0 to 4:
* **0**: stay still;
* **1**: move South;
* **2**: move North;
* **3**: move East;
* **4**: move West.
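
The effect of each action code on a ship's (row, column) position can be summarized as an offset table. This is a sketch based on the moves shown later in this tutorial (South increases the row index, East increases the column index). The names `ACTION_OFFSETS` and `next_position` are hypothetical, and wrapping around the map edges is an assumption taken from the original Halite rules, not something verified against `halite_env`.

```python
# Hypothetical lookup table: action code -> (row offset, column offset)
ACTION_OFFSETS = {
    0: (0, 0),    # stay still
    1: (1, 0),    # South: one row down
    2: (-1, 0),   # North: one row up
    3: (0, 1),    # East: one column right
    4: (0, -1),   # West: one column left
}

def next_position(row, col, action, map_size):
    """Return the cell reached from (row, col); edges wrap around (assumption)."""
    d_row, d_col = ACTION_OFFSETS[action]
    return (row + d_row) % map_size, (col + d_col) % map_size

print(next_position(3, 3, 1, 7))  # (4, 3): one cell south of the center
```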

In [19]:
# actions are represented as a matrix whose entries are -1 if no ship is in that position, 
#'a_i' if ship i is present in that position 
action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1) # no ship, no action

# the environment already has in memory the last state, thus we don't need to resubmit it
# the only things that we submit are the action matrix and the shipyard action (1 or True to spawn a ship, 0 otherwise)
shipyard_action = True # initially always choose to create a ship

# Turn #1
# NOTE: state is equivalent to env.map
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

print("Halite of the player:", halite[0]) # 1000 less than previous turn, since a ship was built
print("Is the episode finished? ", finish_flag)

# state_0 -> map_halite, state_1 -> ship_position, state_2 -> cargo_halite, state_3 -> shipyard_position
map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
shipy_pos_matrix = state[:,:,3]

Halite of the player: [4000.]
Is the episode finished?  False


Now we can see that a ship has appeared in the layer 1 of the state:

In [20]:
ship_pos_matrix

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

Now we can make some moves to test all the actions and their effects.

In [22]:
# turn #2
shipyard_action = False

action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)

ship_mask = ship_pos_matrix.astype("bool") # works just fine for one ship

move = 1 # move south

action_matrix[ship_mask] = move

In [23]:
action_matrix

array([[-1, -1, -1, -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1],
       [-1, -1, -1,  1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1]])
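
The mask-based assignment above gives the same action to every ship, which is fine with a single ship. With several ships you would typically want one action per ship. A possible sketch follows, where `build_action_matrix` and its `actions` dictionary are hypothetical helpers and not part of `halite_env`.

```python
import numpy as np

def build_action_matrix(ship_pos_matrix, actions, default=0):
    """Return a matrix with one action per ship cell and -1 elsewhere.

    actions maps (row, col) -> action code; ships without an entry
    receive `default` (stay still).
    """
    result = np.full(ship_pos_matrix.shape, -1)
    for row, col in np.argwhere(ship_pos_matrix == 1):
        result[row, col] = actions.get((int(row), int(col)), default)
    return result

# Stand-in ship layer with two ships
ships = np.zeros((7, 7), dtype=int)
ships[3, 3] = 1
ships[4, 4] = 1

demo_actions = build_action_matrix(ships, {(3, 3): 1, (4, 4): 3})
print(demo_actions[3, 3], demo_actions[4, 4])  # 1 3
```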

In [24]:
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
ship_cargo_matrix = state[:,:,2]
shipy_pos_matrix = state[:,:,3]

print("Halite of the player:", halite[0]) # same halite amount
print("Is the episode finished? ", finish_flag)

Halite of the player: [4000.]
Is the episode finished?  False
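
Every turn in this tutorial repeats the same pattern: build an empty action matrix, write the move into the ship cells, and call `step`. A small wrapper can hold that pattern. This is only a convenience sketch; `step_all_ships` is a hypothetical name, and it assumes nothing about the environment beyond the `step(action_matrix, makeship=...)` call used above.

```python
import numpy as np

def step_all_ships(env, ship_pos_matrix, move, makeship=False):
    """Give the same move to every ship and advance the environment one turn."""
    action_matrix = np.full(ship_pos_matrix.shape, -1)   # -1 means no ship, no action
    action_matrix[ship_pos_matrix.astype(bool)] = move   # same move for every ship
    return env.step(action_matrix, makeship=makeship)
```

With it, turn #2 would reduce to `state, halite, finish_flag, _ = step_all_ships(env, ship_pos_matrix, move=1)`.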


In [26]:
ship_pos_matrix # the ship has moved one cell south

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

In [27]:
ship_cargo_matrix # no halite carried by the ship 

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

Before moving, let's have a look at the halite in the neighborhood:

In [28]:
map_halite

array([[274, 826, 324, 362, 872, 274, 487],
       [ 99, 970, 889, 241, 811, 581, 283],
       [189, 828, 765, 191,  61,  17, 380],
       [964, 876, 312,   0, 871, 597, 202],
       [700, 544, 290, 576, 636, 487, 383],
       [793, 188, 176, 796,  48, 880,  77],
       [714, 334, 597, 698, 816, 228, 569]])

The halite level of each cell is initialized randomly, and since we didn't fix any seed for the generator, each of us will see a different result. In my case there is a rich cell one position east of the shipyard, and I can get there in 2 moves (e.g. east and then north); in this way we will also see all the possible actions.

In [29]:
# Turn #3
shipyard_action = False

action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)

ship_mask = ship_pos_matrix.astype("bool") 

move = 3 # move east

action_matrix[ship_mask] = move

In [30]:
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
ship_cargo_matrix = state[:,:,2]
shipy_pos_matrix = state[:,:,3]

print("Halite of the player:", halite[0]) # same halite amount
print("Is the episode finished? ", finish_flag)

Halite of the player: [4000.]
Is the episode finished?  False


In [31]:
ship_pos_matrix # the ship has moved one cell east

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

In [32]:
ship_cargo_matrix # still no halite carried by the ship 

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

In [33]:
# Turn #4
shipyard_action = False

action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)

ship_mask = ship_pos_matrix.astype("bool") 

move = 2 # move north

action_matrix[ship_mask] = move

In [34]:
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
ship_cargo_matrix = state[:,:,2]
shipy_pos_matrix = state[:,:,3]

print("Halite of the player:", halite[0]) # same halite amount
print("Is the episode finished? ", finish_flag)

Halite of the player: [4000.]
Is the episode finished?  False


In [35]:
ship_pos_matrix # now we are in position to gather some halite

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

In [36]:
# Turn #5
shipyard_action = False

action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)

ship_mask = ship_pos_matrix.astype("bool") 

move = 0 # stay still

action_matrix[ship_mask] = move

In [37]:
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
ship_cargo_matrix = state[:,:,2]
shipy_pos_matrix = state[:,:,3]

print("Halite of the player:", halite[0]) # same halite amount
print("Is the episode finished? ", finish_flag)

Halite of the player: [4000.]
Is the episode finished?  False


In [38]:
ship_cargo_matrix # now the ship has harvested 25% of the cell's halite (in my case 218)

array([[  0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0, 218,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0,   0,   0]])
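
The figure can be checked by hand: the cell held 871 halite, and a quarter of that is 217.75. The cargo shows 218, which is consistent with rounding up (rounding to the nearest integer would give the same value here). The rounding rule in this sketch is inferred from that single output, not taken from the `halite_env` source.

```python
import math

cell_halite = 871          # halite in the cell before harvesting (from the map above)
harvest_fraction = 0.25    # share collected in one turn of staying still

harvested = math.ceil(cell_halite * harvest_fraction)  # rounding up is an assumption
print(harvested)  # 218
```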

Now we want to take the ship back to the shipyard in order to store the carried halite.

In [39]:
# Turn #6
shipyard_action = False

action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)

ship_mask = ship_pos_matrix.astype("bool") 

move = 4 # move west

action_matrix[ship_mask] = move

In [40]:
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
ship_cargo_matrix = state[:,:,2]
shipy_pos_matrix = state[:,:,3]

print("Halite of the player:", halite[0]) # increased by the 218 halite deposited at the shipyard
print("Is the episode finished? ", finish_flag)

Halite of the player: [4218.]
Is the episode finished?  False


As you can see the halite carried is immediately stored in the shipyard (there is no need to stay still for one turn in the shipyard to do that).

## Ship sinking

Two or more ships are not allowed on the same cell: when that happens, ALL the ships involved sink. To see this, we can for example order the shipyard to produce one more ship while the first one stays still on the shipyard cell.

In [41]:
# Turn #7
shipyard_action = True

action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)

ship_mask = ship_pos_matrix.astype("bool") 

move = 0 # stay still

action_matrix[ship_mask] = move

In [42]:
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)

map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
ship_cargo_matrix = state[:,:,2]
shipy_pos_matrix = state[:,:,3]

print("Halite of the player:", halite[0]) # 1000 less, since a new ship was built
print("Is the episode finished? ", finish_flag)

Halite of the player: [3218.]
Is the episode finished?  False


In [43]:
ship_pos_matrix # no ships left: both have sunk

array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])
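
The sinking rule can be illustrated outside the engine by counting how many ships end up on each cell after a turn. This is a rough sketch of the idea, not the engine's actual implementation; `resolve_collisions` is a hypothetical name.

```python
import numpy as np

def resolve_collisions(destinations, map_size):
    """Return a ship layer where cells reached by two or more ships are empty.

    destinations is a list of (row, col) cells, one per ship, after moving.
    """
    counts = np.zeros((map_size, map_size), dtype=int)
    for row, col in destinations:
        counts[row, col] += 1
    # A cell keeps its ship only if exactly one ship arrived there
    return (counts == 1).astype(int)

# The old ship stays on the shipyard cell (3, 3) while a new one spawns there
survivors = resolve_collisions([(3, 3), (3, 3)], map_size=7)
print(survivors.sum())  # 0
```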

And that is pretty much all you need to know about the environment and the game engine!