# LuxAI Season 1 (2021): Random Agent

The purpose of this notebook is to get acquainted with the problem, map the input and output spaces, and reduce their complexity.

I massage the observations and convert them into vectors for future use with RL agents. For now, the agent just picks random actions to check the interface.

-- [Jaime](http://jaime.rs)
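
To make the "observation to vector" idea concrete, here is a minimal sketch that turns one raw unit update string into the 8-column row layout used later in this notebook. The field order is taken from the notes in the `get_observation()` docstring below; `parse_unit_update` is a hypothetical helper written only for illustration, since the agent itself reads the already-parsed `Game` object from the starter kit rather than parsing strings.

```python
import numpy as np

def parse_unit_update(line: str) -> list:
    """Hypothetical helper: turn one "u ..." update into [x, y, wood, coal, uranium]."""
    tokens = line.split()
    assert tokens[0] == "u", f"not a unit update: {line!r}"
    # Field order as documented in get_observation():
    # u, unit type, team, id, pos.x, pos.y, cooldown, wood, coal, uranium
    _, _unit_type, _team, _unit_id, x, y, _cooldown, wood, coal, uranium = tokens
    return [int(x), int(y), int(wood), int(coal), int(uranium)]

row = parse_unit_update("u 0 1 u_28 10 10 0 40 0 0")
# The three trailing zeros pad the citytile-only columns (city id, fuel, light upkeep)
vector = np.array(row + [0, 0, 0], dtype=np.int16)
print(vector)
```

Units, citytiles and resources all share this 8-column shape, which is why each kind of row carries zero padding for the columns it does not use.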

In [None]:
!pip install kaggle-environments -U
from kaggle_environments import make
# create the environment. You can also specify configurations for seed and loglevel as shown below. If not specified, a random seed is chosen. 
# loglevel default is 0. 
# 1 is for errors, 2 is for match warnings such as units colliding, invalid commands (recommended)
# 3 for info level, and 4 for everything (not recommended)
# set annotations True so annotation commands are drawn on visualizer
# set debug to True so print statements get shown
# env = make("lux_ai_2021", configuration={"seed": 562124210, "loglevel": 2, "annotations": True}, debug=True)
# run a match between two simple agents, which are the agents we will walk you through on how to build!
# steps = env.run(["simple_agent", "simple_agent"])
# if you are viewing this outside of the interactive jupyter notebook / kaggle notebooks mode, this may look cutoff
# render the game, feel free to change width and height to your liking. We recommend keeping them as large as possible for better quality.
# you may also want to close the output of this render cell or else the notebook might get laggy
# env.render(mode="ipython", width=1200, height=800)

In [None]:
%%writefile agent.py
# for kaggle-environments
from lux.game import Game
from lux.game_map import Cell, RESOURCE_TYPES, Position
from lux.constants import Constants
from lux.game_constants import GAME_CONSTANTS
from lux import annotate
import math
import sys

import numpy as np
import pprint

In [None]:
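# Some helper functions (not used by agent, just for EDA)
# NOTE: %%writefile only saves the previous cell to agent.py without running it,
# so the names these helpers rely on are imported here as well.
import math
from lux.constants import Constants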

def obj_dir(obj):
    """
    >>> obj_dir(game_state.players[0])
    """
    print(type(obj))
    print()
    for a in dir(obj):
        if not a.startswith('__'):
            print(a, getattr(obj, a))
        

def find_by_type(game_state, type_name='resource'):
    """
    Return every cell on the map whose `type_name` attribute is truthy.

    type_name (str): 'resource', 'citytile' or 'road'
    """
    tiles = []
    width, height = game_state.map_width, game_state.map_height
    for y in range(height):
        for x in range(width):
            cell = game_state.map.get_cell(x, y)
            if getattr(cell, type_name):
                tiles.append(cell)
    return tiles

# the next snippet finds the closest resources that we can mine given position on a map
def find_closest_resources(pos, player, resource_tiles):
    closest_dist = math.inf
    closest_resource_tile = None
    for resource_tile in resource_tiles:
        # we skip over resources that we can't mine due to not having researched them
        if resource_tile.resource.type == Constants.RESOURCE_TYPES.COAL and not player.researched_coal(): continue
        if resource_tile.resource.type == Constants.RESOURCE_TYPES.URANIUM and not player.researched_uranium(): continue
        dist = resource_tile.pos.distance_to(pos)
        if dist < closest_dist:
            closest_dist = dist
            closest_resource_tile = resource_tile
    return closest_resource_tile

In [None]:
%%writefile -a agent.py
# Helper functions used by agent

def get_researched_mask(research_points):
    """Return a mask of available resources"""
    mask = [Constants.RESOURCE_TYPES.WOOD]
    if research_points >= GAME_CONSTANTS['PARAMETERS']['RESEARCH_REQUIREMENTS']['URANIUM']:
        mask += [Constants.RESOURCE_TYPES.COAL, Constants.RESOURCE_TYPES.URANIUM]
    elif research_points >= GAME_CONSTANTS['PARAMETERS']['RESEARCH_REQUIREMENTS']['COAL']:
        mask += [Constants.RESOURCE_TYPES.COAL]
    return mask

def closest_node(node, nodes):
    """Credit @glmcdona 
    https://www.kaggle.com/glmcdona/lux-ai-deep-reinforcement-learning-ppo-example
    https://codereview.stackexchange.com/questions/28207/finding-the-closest-point-to-a-list-of-points
    """
    dist_2 = np.sum((nodes - node)**2, axis=1)
    return np.argmin(dist_2)

def min_distance(node, nodes):
    """Squared Euclidean distance from node to the nearest row of nodes."""
    assert node.shape == (2,), f"expected shape (2,), got {node.shape}"
    assert nodes.shape[1] == 2
    dist_2 = np.sum((nodes - node)**2, axis=1)
    return np.min(dist_2)

def rank_resources(resources: np.ndarray, 
                   units: np.ndarray, 
                   top_k: int = None, 
                   d_thd: int = None) -> tuple:
    """Given a list of units, rank the list of resources by proximity to any unit.
    
    e.g. 
    
    >>> units = np.array([[  4,  27, 100,   0,   0,   0,   0,   0],])

    >>> resources = np.array([[  4,  27, 238,   0,   0,   0,   0,   0],
                              [  4,  11, 209,   0,   0,   0,   0,   0],
                              [  4,  28, 238,   0,   0,   0,   0,   0],
                              [  5,  11, 126,   0,   0,   0,   0,   0],
                              [  3,  28,  32,   0,   0,   0,   0,   0],])

    >>> d, r = rank_resources(resources, units)
    
    Distances are squared Euclidean distances (see min_distance).
    If top_k is provided, return only the top K resources by distance.
    If d_thd is provided, return only the resources within d_thd (squared) distance.
    If both are provided, top_k takes precedence.
    """
    distances = [min_distance(resource, units[:, :2]) for resource in resources[:, :2]] # TODO: this will break when I add team/unit type to input vectors
    to_be_sorted = np.c_[distances, resources]
    done_sorted = to_be_sorted[np.argsort(to_be_sorted[:, 0])]
    if top_k:
        return done_sorted[:top_k, 0], done_sorted[:top_k, 1:]
    elif d_thd is not None:
        within = done_sorted[:, 0] <= d_thd  # compute the threshold mask once
        return done_sorted[within, 0], done_sorted[within, 1:]
    return done_sorted[:, 0], done_sorted[:, 1:]  # distances, resources
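
As a sanity check on the ranking, the sketch below recomputes the docstring example with NumPy broadcasting instead of a Python loop. It is an alternative formulation and not the code the agent runs. The point to notice is that the distances are squared, so any threshold passed as `d_thd` has to be squared too.

```python
import numpy as np

units = np.array([[4, 27, 100, 0, 0, 0, 0, 0]])
resources = np.array([[4, 27, 238, 0, 0, 0, 0, 0],
                      [4, 11, 209, 0, 0, 0, 0, 0],
                      [4, 28, 238, 0, 0, 0, 0, 0],
                      [5, 11, 126, 0, 0, 0, 0, 0],
                      [3, 28,  32, 0, 0, 0, 0, 0]])

# Pairwise (x, y) differences between every resource and every unit
diff = resources[:, None, :2] - units[None, :, :2]   # shape (n_resources, n_units, 2)
# Squared distance to the nearest unit, one value per resource
dist_2 = (diff ** 2).sum(axis=2).min(axis=1)
order = np.argsort(dist_2, kind="stable")

print(dist_2[order])            # [  0   1   2 256 257]
print(resources[order][:, :2])  # positions, nearest first
```

The resource sharing a cell with the unit comes first (distance 0), followed by its two neighbours, and the cluster around y = 11 comes last.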

In [None]:
%%writefile -a agent.py

# from dataclasses import dataclass
from typing import Tuple

class Agent:
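    # np.random.randint(low, high) excludes high, so each range ends one past the largest action code
    UNIT_ACTION_RANGE: Tuple[int, int] = (-1, 5)      # -1 (build_city) .. 4 (move w)
    CITYTILE_ACTION_RANGE: Tuple[int, int] = (-1, 2)  # -1 (research) .. 1 (build_worker)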
    
    def __init__(self, team=0):
        self.team = team
    
    def get_observation(self, game_state: Game):
        """
        Returns:
            - units
            - citytiles
            - resources
            - research
          
        NOTES:
        Observations/updates:
            "rp 1 25"                       research_points player amount
            "r wood 0 5 500"                resource type pos.x pos.y amount
            "r coal 6 0 379"
            "r uranium 6 6 323"             
            "u 0 1 u_28 10 10 0 40 0 0"     unit type team id pos.x pos.y cooldown cargo.wood cargo.coal cargo.uranium
            "c 0 c_1 210 23"                city team id fuel light_upkeep
            "ct 0 c_1 3 9 6"                citytile team city.id pos cooldown
            "ccd 0 4 6"                     cellcooldown pos.x pos.y cooldown  (i.e. road or citytile, citytiles have ccd=6)
        """        
        # CITY: 'cityid', 'citytiles', 'fuel', 'get_light_upkeep', 'light_upkeep', 'team'
        # CITYTILE: 'build_cart', 'build_worker', 'can_act', 'cityid', 'cooldown', 'pos', 'research', 'team'
        
        # Collate units and citytiles that can act this turn, in the appropriate vector format
        v_units = []
        v_citytiles = []
        v_research = []
        
        # TODO: ignoring opponent for now
        # for team, player in enumerate(game_state.players[:1]): 
        player = game_state.players[self.team]
        self._active_citytiles = [(city, citytile) for city in player.cities.values() for citytile in city.citytiles if citytile.can_act()]
        self._active_units = [unit for unit in player.units if unit.can_act()]
            
        # Research
        v_research.append(player.research_points)

        # Units
        for unit in self._active_units:
            v_units.append([
                              # team, 
                              # unit.type,  # TODO: will need this once I incorporate carts
                              unit.pos.x, unit.pos.y,
                              unit.cargo.wood, unit.cargo.coal, unit.cargo.uranium,
                              0, 0, 0,  # citytiles padding
            ])

        # Citytiles
        for city, citytile in self._active_citytiles:
            v_citytiles.append([
                                  # team, 
                                  # 2,  # i.e. "made up" unit type
                                  citytile.pos.x, citytile.pos.y,
                                  0, 0, 0,  # units padding
                                  int(city.cityid.split('_')[-1]), city.fuel, city.light_upkeep
            ])
        
        
        # Doing this here so that it is outside the "for player in players" loop, 
        # to collate both players (once I incorporate opponent)
        v_research = np.array(v_research, dtype=np.int16)
        v_units = np.array(v_units, dtype=np.int16)
        v_citytiles = np.array(v_citytiles, dtype=np.int16)

        # Resources relative to this player (based on what's researched, what's nearest to units)
        researched_resources = get_researched_mask(player.research_points)
        v_r = []
        for r in game_state.map.map:
            for cell in r:
                crs = [0, 0, 0]
                if cell.resource and (cell.resource.type in researched_resources):
                    crs[researched_resources.index(cell.resource.type)] = cell.resource.amount
                    v_r.append([
                                  # -1,  # team (impartial) 
                                  # 3,  # "made up" type
                                  cell.pos.x, cell.pos.y,
                                  *crs,
                                  0, 0, 0,  # citytile padding
                                  # cell.road
                                 ])
        v_resources = np.array(v_r, dtype=np.int16)

        # Rank resources only when there are both active units and visible resources;
        # rank_resources() cannot index into an empty array.
        # TODO: (although citytiles could still make more units later on)
        if v_units.size and v_resources.size:
            distances, v_resources = rank_resources(v_resources, v_units, top_k=10)
            # TODO: not using distances further for now (only for ranking resources)
        
        return v_research, v_units, v_citytiles, v_resources
    
#     def _get_action_vectors(self):
#         pass
    
#     def _convert_to_actions(self, action_vectors):
#         pass
    
    def get_actions(self, game_state: Game) -> list:    
        """
        Actions:
             "m u_18 n"                 move u_18 north
             "?"                        pillage road
             "t u_14 u_24 coal 2000"    transfer 
             "bcity u_17"               build citytile by u_17

             "bw 1 7"                   build worker in the citytile at (1, 7)
             "bc 3 9"                   build cart in the citytile at (3, 9)
             "r 4 13"                   research in the citytile at (4, 13) 
        """
        player = game_state.players[self.team]
        v_research, v_units, v_citytiles, v_resources = self.get_observation(game_state)
        
        # Get encoded actions vector
        action_vector = []
        for unit in self._active_units:
            # TODO: random actions for now
            action_vector.append(np.random.randint(*self.UNIT_ACTION_RANGE))
            # TODO: if not unit.can_build(game_map) and chosen action is 'bcity': penalise
            # TODO: if moving onto an occupied position: penalise (once this is part of observation)
        for _, citytile in self._active_citytiles:
            action_vector.append(np.random.randint(*self.CITYTILE_ACTION_RANGE))
            
        # NOTE: the values (e.g. "move south") separated by spaces are method, argument
        action_map = {'unit': {-1: 'build_city',  # TODO: this is for readability, it could just be a list with "build_city" last
                                0: 'DO_NOTHING',
                                1: 'move n',
                                2: 'move s',
                                3: 'move e',
                                4: 'move w',  # TODO: add other actions
                              },  
                     'citytile': {-1: 'research', 
                                   0: 'DO_NOTHING',
                                   1: 'build_worker',
                                   #2: 'build_cart'
                                 }
                     }
        
        actions = []
        # TODO: vectorise this operation for speed
        # Get actual actions from encoded actions
        for i, unit in enumerate(self._active_units):
            if action_vector[i]:  # i.e. if the action chosen is not "DO_NOTHING"
                s = action_map['unit'][action_vector[i]].split()
                if len(s) > 1:
                    action = getattr(unit, s[0])(*s[1:])
                else:
                    action = getattr(unit, s[0])()
                actions.append(action)
                
        for j, cc in enumerate(self._active_citytiles, start=len(self._active_units)):
            citytile = cc[1]  # city = cc[0]
            if action_vector[j]:  # i.e. if the action chosen is not "DO_NOTHING"
                s = action_map['citytile'][action_vector[j]].split()
                if len(s) > 1:
                    action = getattr(citytile, s[0])(*s[1:])
                else:
                    action = getattr(citytile, s[0])()
                actions.append(action)

        return actions
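
The decoding step in `get_actions()` can be tried in isolation. The sketch below uses `StubUnit`, a hypothetical stand-in for the kit's unit class that only formats the command strings listed in the docstring above, so it runs without the `lux` package. It also shows why the sampling range matters: `np.random.randint(low, high)` never returns `high`, so a range has to end one past the largest action code for that code to be sampled at all.

```python
import numpy as np

class StubUnit:
    """Hypothetical stand-in for the kit's unit: it only formats command strings."""
    def __init__(self, unit_id):
        self.id = unit_id
    def move(self, direction):
        return f"m {self.id} {direction}"
    def build_city(self):
        return f"bcity {self.id}"

UNIT_ACTIONS = {-1: "build_city", 0: "DO_NOTHING",
                1: "move n", 2: "move s", 3: "move e", 4: "move w"}

def decode(unit, code):
    """Map an integer action code to a command string, or None for DO_NOTHING."""
    if code == 0:
        return None
    method, *args = UNIT_ACTIONS[int(code)].split()
    return getattr(unit, method)(*args)

unit = StubUnit("u_18")
print(decode(unit, 1))    # m u_18 n
print(decode(unit, -1))   # bcity u_18

# The upper bound is exclusive: randint(-1, 4) can only produce -1..3
short = np.random.randint(-1, 4, size=1000)
full = np.random.randint(-1, 5, size=1000)
print(short.max() <= 3, full.max() <= 4)
```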

In [None]:
%%writefile -a agent.py

game_state = None
myagent = None
resources_, units_, citytiles_ = None, None, None  # for debugging later
def agent(observation, configuration):
    global game_state
    global myagent
    # for debugging later
#     global units_
#     global citytiles_
#     global resources_

    ### Do not edit ###
    if observation["step"] == 0:
        game_state = Game()
        game_state._initialize(observation["updates"])
        game_state._update(observation["updates"][2:])
        game_state.id = observation.player
    else:
        game_state._update(observation["updates"])

    ###################################################################
    
    if observation['step'] == 0:
        myagent = Agent(team=observation.player)
    
    actions = myagent.get_actions(game_state)
    
    ###################################################################
    # DEBUGGING
#     if not game_state.turn % 60:
#         print(np.c_[distances[:10], all_cells[:10]])
    
#     if game_state.turn == 6:
#         print(np.c_[distances[:10], all_cells[:10]])
#         print('Research points:', player.research_points)
#         print()
#         print(f'Units {all_units.shape}:')
#         print(all_units)
#         print()
#         print(f'City tiles {all_citytiles.shape}:')
#         print(all_citytiles)
#         print()
#         print(f'Resources {all_cells.shape}:')
#         print(all_cells)
#         print()
#         print(f'Distances {distances.shape}:')
#         print(distances)
    ###################################################################
    
    return actions

In [None]:
env = make("lux_ai_2021", configuration={
    #"seed": 562124210, 
    "loglevel": 2, 
    "annotations": True}, debug=True)
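# The cells above only write agent.py to disk, so pass the file path rather than a function object
steps = env.run(["agent.py", "simple_agent"])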
env.render(mode="ipython", width=1200, height=800)

In [None]:
# env = make("lux_ai_2021", configuration={
#     #"seed": 562124210, 
#     "loglevel": 2, 
#     "annotations": True}, debug=True)

# # Training agent in first position (player 1) against the default random agent.
# trainer = env.train([None, "random"])

# obs = trainer.reset()
# for _ in range(100):
#     env.render()
#     action = 0 # Action for the agent being trained.
#     obs, reward, done, info = trainer.step(action)
#     if done:
#         obs = trainer.reset()

In [None]:
# pad units and citytiles for input
# units_ = np.pad(units_, ((0,10), (0,0)), 'constant', constant_values=-1)
# citytiles_ = np.pad(citytiles_, ((0,10), (0,0)), 'constant', constant_values=-1)
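
The commented-out lines above always append ten rows, so the result still varies with the number of units. A fixed-size input is probably what an RL model will need, so here is a minimal sketch that pads (or truncates) to a fixed number of rows. `pad_rows`, the 8-column width and the `-1` fill value are assumptions for illustration, chosen to match the row layout and padding value used in this notebook.

```python
import numpy as np

def pad_rows(arr, n_rows, n_cols=8, fill=-1):
    """Hypothetical helper: return an (n_rows, n_cols) int16 array, padded with `fill`."""
    arr = np.asarray(arr, dtype=np.int16)
    if arr.size == 0:
        # np.array([]) has shape (0,), so give it the expected second dimension
        arr = np.empty((0, n_cols), dtype=np.int16)
    arr = arr[:n_rows]                       # drop rows beyond the fixed size
    missing = n_rows - arr.shape[0]
    return np.pad(arr, ((0, missing), (0, 0)), mode="constant", constant_values=fill)

one_unit = [[4, 27, 100, 0, 0, 0, 0, 0]]
padded = pad_rows(one_unit, n_rows=3)
print(padded.shape)              # (3, 8)
print(pad_rows([], n_rows=2))    # all -1 when no unit can act
```

Truncating silently discards units beyond `n_rows`, so the fixed size should be chosen with the largest expected unit count in mind.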

We have something that survives! We are now ready to submit to the leaderboard. The code below packages everything built so far into a single archive that you can submit to the competition.

## Create a submission
Now we need to create a .tar.gz file with main.py (and agent.py) at the top level. We can then upload this!

In [None]:
!tar -czf submission.tar.gz *
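
The `*` glob above archives everything in the working directory, which can include `replay.json` or an older `submission.tar.gz`. If you prefer to list the files explicitly, a minimal sketch with Python's standard `tarfile` module follows. `build_submission` is a hypothetical helper, and the demo uses placeholder files in a temporary directory; for a real submission the list would need to name whatever your agent imports.

```python
import tarfile
import tempfile
from pathlib import Path

def build_submission(src_dir, names, out_name="submission.tar.gz"):
    """Hypothetical helper: archive only `names`, keeping each at the top level."""
    src = Path(src_dir)
    out_path = src / out_name
    with tarfile.open(out_path, "w:gz") as tar:
        for name in names:
            # arcname stores the entry without the leading directory path
            tar.add(src / name, arcname=name)
    return out_path

# Demo with placeholder files; replay.json is deliberately left out of the archive
with tempfile.TemporaryDirectory() as tmp:
    for name in ("main.py", "agent.py", "replay.json"):
        (Path(tmp) / name).write_text("# placeholder\n")
    archive = build_submission(tmp, ["main.py", "agent.py"])
    with tarfile.open(archive) as tar:
        members = sorted(tar.getnames())

print(members)  # ['agent.py', 'main.py']
```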

## Submit
Now open the /kaggle/working folder, find submission.tar.gz, and download it. Then navigate to the "My Submissions" tab at https://www.kaggle.com/c/lux-ai-2021/ and upload your submission. It should play a validation match against itself, and once that succeeds it will be automatically matched against other players' submissions. Newer submissions are prioritized for games over older ones. Your team is limited in the number of successful submissions per day, so we highly recommend testing your bot locally before submitting.

## CLI Tool

There's a separate CLI tool that can also be used to run matches. Jupyter Notebook users are advised to stick with this notebook; all other users, including Python users, should follow the instructions at https://github.com/Lux-AI-Challenge/Lux-Design-2021

Another benefit of the CLI tool is that it generates much smaller, "stateless" replays and lets you run a mini leaderboard across multiple bots, ranked by various ranking algorithms.

## Additional things to check out

Make sure you check out the Bot API at https://github.com/Lux-AI-Challenge/Lux-Design-2021/tree/master/kits

This documents what you can do with the starter kit files and explains how to use the annotation debug commands, which let you annotate directly on a replay (draw lines, circle things, etc.).

You can also run the cell below to save an episode to a JSON replay file. These replays are the same as those shown on the leaderboard, and you can upload them to the online replay viewer at https://2021vis.lux-ai.org/


For a local (faster) version of the replay viewer, follow installation instructions here https://github.com/Lux-AI-Challenge/Lux-Viewer-2021

In [None]:
import json
replay = env.toJSON()
with open("replay.json", "w") as f:
    json.dump(replay, f)

## Suggestions / Strategies

There are many ways the agent in this notebook could be improved. Here are some:

- Using the build city action to build new cities and thus build new units
- Having cities perform research each turn to unlock new resources
- Writing collision-free code that lets units move smoothly around and through each other when navigating to targets
- Mining resources near your opponent's citytiles so they have less easy access to resources
- Using carts to deliver resources from far away clusters of wood, coal, uranium to a city in need
- Sending worker units over to the opponent's roads and pillaging them to slow down their agent
- Optimizing over how much to mine out of forests before letting them regrow so you can build more cities and get sustainable fuel