
   
<img src="halite_graphic.png" alt="HALITE" width="200" height="200">

# Halite Modeling Framework
*NP+RS Turley*
*7/31/2020*

Kaggle's Halite competition invites competitors to submit Python code to compete in a 4-person space mining battle. For our entry, we want to create a framework that is flexible enough to describe the range of submitted strategies. By observing our opponents' actions, we can then approximate their strategy to predict their next moves. We could also use this framework to define the optimal strategies of our own practice bots... or maybe even our competitive entries.

### Outline  

  **A. Modeling Framework Overview**  
  **B. Ship Tactics and Paths**  
  **C. State Variables**  
  **D. Expectations**  
  **E. Formal Strategy**  



## A. Modeling Framework Overview


Our modeling framework suggests how the team will select actions each turn given their information. It is helpful to see the actions as the result of a framework where the team's **Strategy** assigns **Tactics** to each ship, with a **Path** function determining which actions best accomplish the tactical objective. 

For each ship, there are normally many ways to accomplish the tactical objective, so we need to determine the best path for the ship to take.

**Ship Tactic &#10142; Path(Location, State, Expectations) &#10142; Next Ship Action**
* The **Path** function for a given tactic weighs the actions available from the current location (given the state and the expectations of other players) and suggests the best next action
  * The **Location** is the ship's current position on the board
  * The **Next Ship Action** will be one of the 6 actions defined in the game [N, S, E, W, Mine, Convert]

**Strategy(State,Expectations) &#10142; Tactics**  

* The **Strategy** maps the **state** variables describing the game and the **expectations** of other players' actions to a tactic for each ship (and an action for each shipyard)
  * The **State** is a set of key variables defined from past and present game information that we feel are especially relevant
  * The **Expectations** are the probabilities assigned to future actions taken by other players
  * The **Tactics** define the short-term objective


Note that shipyards can only receive one action from the strategy (Spawn), so there is no need for tactics and paths.
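
The two mappings above can be read as a small pipeline. The following is only a minimal sketch of that reading, not a finished design: the `choose_actions` helper, the `ShipLocations` key, and the toy strategy and path functions are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Callable, Dict, Tuple

Location = Tuple[int, int]


class Action(Enum):
    """The six ship actions listed above."""
    NORTH = "N"
    SOUTH = "S"
    EAST = "E"
    WEST = "W"
    MINE = "Mine"
    CONVERT = "Convert"


# A path function maps (location, state, expectations) to the next ship action.
PathFn = Callable[[Location, dict, dict], Action]


def choose_actions(strategy, paths: Dict[str, PathFn], state: dict,
                   expectations: dict) -> Dict[str, Action]:
    """Strategy -> tactic per ship -> Path -> next action per ship."""
    tactics = strategy(state, expectations)  # {ship_id: tactic name}
    return {
        ship_id: paths[tactic](state["ShipLocations"][ship_id], state, expectations)
        for ship_id, tactic in tactics.items()
    }


# Toy example: every ship is told to mine, and the "mine" path just stays put.
def toy_strategy(state, expectations):
    return {ship_id: "Mine Halite" for ship_id in state["ShipLocations"]}


def mine_in_place(location, state, expectations):
    return Action.MINE


state = {"ShipLocations": {"ship-0": (3, 4), "ship-1": (10, 10)}}
actions = choose_actions(toy_strategy, {"Mine Halite": mine_in_place}, state, {})
print(actions)
```

Keeping the strategy and the path functions as separate callables means an opponent model and our own bot can share the same path code while differing only in the strategy.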


## B. Ship Tactics and Paths

**Basic Tactics**  

We will define 6 basic ship-level tactics:

|           | Tactic      | Goal                                   |
| --------- | ----------- | -------------------------------------- |
| &#10102;  | Mine Halite | Travel to and mine a specific cell on the board   |
| &#10103;  | To Shipyard | Travel to closest friendly shipyard    |
| &#10104;  | Evade       | Avoid nearby enemy ships               |
| &#10105;  | Attack Ship | Intercept and attack enemy ship        |
| &#10106;  | Attack Yard | Travel to and attack enemy shipyard    |
| &#10107;  | Convert     | Travel to cell and convert to shipyard |


**Complex Tactics**

A more complex strategy could instruct advanced tactics that involve multiple ships, but they are not currently in scope. However, we will want to consider them in designing our own strategies. Examples of these include sending clusters of ships in an attack, luring enemy ships to a "honey pot" while other ships lie in wait, blockades of enemy shipyards, etc.




**Paths**
  
Once each ship has received its assigned tactical objective, it needs to select specific actions from the six defined in Halite: move *North*, move *South*, move *East*, move *West*, stay and *Mine*, or *Convert* to shipyard. The ship's **path** considers state variables and expectations to suggest the best sequence of subsequent ship actions to accomplish the objective. 

Since movement is only allowed in the four cardinal directions (N/S/E/W) there will typically be multiple equidistant paths to the same destination. Consider the image below showing three paths with equal Manhattan distances.

![manhattan_dist](manhattan_dist.png)

The path function suggests whether the red, yellow or blue (or another!) path is the best one to take to a given destination.
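
To see how many equidistant routes there can be, here is a small sketch that computes the Manhattan distance and counts the shortest paths between two cells. It assumes a 21x21 board whose edges wrap around; `BOARD_SIZE` and the function names are illustrative, not part of any existing code.

```python
from math import comb

BOARD_SIZE = 21  # assumption: square board whose edges wrap around


def wrapped_offset(a: int, b: int, size: int = BOARD_SIZE) -> int:
    """Fewest steps between coordinates a and b along one wrapping axis."""
    d = abs(a - b) % size
    return min(d, size - d)


def manhattan_distance(start, end, size: int = BOARD_SIZE) -> int:
    """Fewest N/S/E/W moves needed to get from start to end."""
    return (wrapped_offset(start[0], end[0], size)
            + wrapped_offset(start[1], end[1], size))


def count_shortest_paths(start, end, size: int = BOARD_SIZE) -> int:
    """Number of distinct move orderings that achieve the Manhattan distance."""
    dx = wrapped_offset(start[0], end[0], size)
    dy = wrapped_offset(start[1], end[1], size)
    return comb(dx + dy, dx)


print(manhattan_distance((0, 0), (3, 2)))    # 5
print(count_shortest_paths((0, 0), (3, 2)))  # 10
print(manhattan_distance((0, 0), (20, 0)))   # 1 (wraps around the edge)
```

Even a destination five moves away already has ten equally short routes, which is why the path function needs criteria beyond distance to pick among them.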

The path selection will consider:
* shortest route (fewest actions)
* avoid crashes with friendly ships
* avoid interception from "lean" enemy ships (carrying less halite)
* encourage interception of "fat" enemy ships (carrying more halite)
* stay close to friendly lean ships
* include randomness to make movement less predictable

The path function weighs each factor in a manner defined by the strategy. It is important to note that the path considers a sequence of steps--but the output is a set of probabilities for the next action. In a deterministic case, the output could simply be an action, but we want to leave open the ability to purposefully add randomness or allow uncertainty to propagate in the model.

The ship's actions in subsequent turns may be as the path imagined, but a change in tactic or updated expectations could lead to a path adjustment.

*Path score formula*

* Input: StartPoint, EndPoint
* Parameters:
* Output: probabilities for next action
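
The parameters of the path score are still open, so the following is only one possible shape for it, offered as a sketch. It assumes each candidate action has been scored on a few factors, combines them with strategy-defined weights, and turns the scores into probabilities with a softmax whose temperature controls the amount of randomness. The factor names, weights, and the `action_probabilities` function are all hypothetical.

```python
import math


def action_probabilities(features, weights, temperature=1.0):
    """Turn per-action factor values into next-action probabilities.

    features: {action: {factor_name: value}}
    weights:  {factor_name: weight chosen by the strategy}
    temperature: larger values flatten the distribution (more randomness).
    """
    scores = {
        action: sum(weights.get(name, 0.0) * value for name, value in factors.items())
        for action, factors in features.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {a: math.exp((s - top) / temperature) for a, s in scores.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}


# Illustrative numbers only: N and E both make progress, but E risks a crash.
features = {
    "N": {"progress": 1.0, "friendly_crash_risk": 0.0},
    "E": {"progress": 1.0, "friendly_crash_risk": 1.0},
    "Mine": {"progress": 0.0, "friendly_crash_risk": 0.0},
}
weights = {"progress": 2.0, "friendly_crash_risk": -3.0}
probs = action_probabilities(features, weights, temperature=0.5)
print(probs)
```

As the temperature approaches zero the output approaches the deterministic case of a single best action, which matches the note above that a deterministic path is a special case of the probabilistic one.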

## C. State Variables

The state variables should include all of the information that allows the strategy to decide the optimal tactics for each ship/shipyard. They are also important inputs of the path function. Their definition is a key contribution to the model. 

Of course, the state variables could simply be a raw list of all game info, with the full player board and all past moves. However, the work of defining the relevant variables from the board is itself a modeling contribution. For better or worse, it channels (and potentially limits) the information a strategy employs. With intelligent state variables, we can more easily build strategies manually and interpret strategies built by machines. 

We propose the list below, which will form the Python dictionary `State`:


| Category              |    | State Variable                        | Python Attribute Name
| --------------------- | -- | ------------------------------------- | -----------------
| Game Status           | 1  | Turns left in game                    | `State['StepsLeft']`
|                       | 2  | Total board halite                    | `State['TotalHalite']`
| For each team         | 3  | Halite stored                         | `State['Team#']['TeamHalite']`
|                       | 4  | Halite in cargo                       | `State['Team#']['TeamCargo']`
|                       | 5  | Number of ships                       | `State['Team#']['TeamShips']`
|                       | 6  | Number of shipyards                   | `State['Team#']['TeamShipyards']`
|                       | 7  | Cells controlled                      | `State['Team#']['TeamCellControl']`
|                       | 8  | Halite in cells controlled            | `State['Team#']['TeamHaliteControl']`
|                       | 9  | Shipyard kills                        | `State['Team#']['TeamShipyardKills']`
|                       | 10 | Ship kills                            | `State['Team#']['TeamShipKills']`
| For each ship         | 11 | Last two moves                        | `State['Team#']['Ship#']['PastActions']`
|                       | 12 | Last tactic probabilities             | `State['Team#']['Ship#']['Tactic']`
|                       | 13 | Distance to shipyard of each team     | `State['Team#']['Ship#']['YardDistTeam#']`
|                       | 14 | Distance to key destinations?         |
|                       | 15 | 2-move grid of dangerous ships        | `State['Team#']['Ship#']['EvadeGrid']`
|                       | 16 | Path of fat target ships              | `State['Team#']['Ship#']['InterceptGrid']`
| For each shipyard     | 17 | Enemies at distance one               | `State['Team#']['Shipyard#']['ShipyardEnemyD1']`
|                       | 18 | Enemies at distance two               | `State['Team#']['Shipyard#']['ShipyardEnemyD2']`
|                       | 19 | Friends at distance zero              | `State['Team#']['Shipyard#']['ShipyardFriendD0']`
|                       | 20 | Friends at distance one               | `State['Team#']['Shipyard#']['ShipyardFriendD1']`
|                       | 21 | Friends at distance two               | `State['Team#']['Shipyard#']['ShipyardFriendD2']`
| For the Board         | 22 | Halite on cell                        | `State['Board']['HaliteD0']`
|                       | 23 | Halite in one-move radius             | `State['Board']['HaliteD1']`
|                       | 24 | Halite in two-move radius             | `State['Board']['HaliteD2']`
|                       | 25 | Ship from team[#] at distance=0       | `State['Board']['ShipD0']['Team#']`
|                       | 26 | Ship from team[#] at distance=1       | `State['Board']['ShipD1']['Team#']`
|                       | 27 | Ship from team[#] at distance=2       | `State['Board']['ShipD2']['Team#']`
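
As an illustration of how the board entries might be filled in, the sketch below builds `HaliteD0`, `HaliteD1` and `HaliteD2` from a halite grid. It assumes "halite in an n-move radius" means the sum of halite over all cells within Manhattan distance n on a wrapping board; that reading, the `StepsLeft` value, and the helper names `halite_within` and `build_board_state` are assumptions, not settled definitions.

```python
def halite_within(halite, radius):
    """Sum of halite within `radius` moves of each cell (board wraps around).

    Assumes a square grid and radius < size / 2 so that no cell is counted twice.
    """
    size = len(halite)
    out = [[0.0] * size for _ in range(size)]
    for x in range(size):
        for y in range(size):
            for dx in range(-radius, radius + 1):
                remaining = radius - abs(dx)
                for dy in range(-remaining, remaining + 1):
                    out[x][y] += halite[(x + dx) % size][(y + dy) % size]
    return out


def build_board_state(halite):
    """Board-level entries of the State dictionary."""
    return {
        "HaliteD0": [row[:] for row in halite],
        "HaliteD1": halite_within(halite, 1),
        "HaliteD2": halite_within(halite, 2),
    }


# Tiny 5x5 example board with two halite deposits.
halite = [[0.0] * 5 for _ in range(5)]
halite[2][2] = 10.0
halite[0][0] = 4.0

State = {
    "StepsLeft": 399,  # example value only
    "TotalHalite": sum(sum(row) for row in halite),
    "Board": build_board_state(halite),
}
print(State["Board"]["HaliteD1"][2][3])  # the deposit at (2, 2) is one move away
```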

The concept of "key locations" reduces the number of cells of interest that the strategy and tactics will consider. They are defined to include cells with a minimum halite value (e.g. halite > 4) that either are the current location of a ship or meet any of the following six criteria:
1. eight highest halite cells on the board
2. four highest halite cells on board with no ships in 3-move radius
3. highest halite cell in one move for each ship
4. the highest halite cell within three moves of each ship
5. the 2nd highest halite cell within 3 moves of each ship (not in the same direction as 1st)
6. the 3rd highest halite cell within 3 moves of each ship (not in the same directions as 1st or 2nd)

Once a cell meets one of these six criteria, it will not be removed from the list of key locations until it has failed to meet the criteria for two consecutive turns.
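
The retention rule can be sketched with a per-cell counter of consecutive misses. To stay short, the example below implements only the first criterion (the highest-halite cells) plus ship locations, and it applies the two-turn rule to every key location; `update_key_locations` and its arguments are hypothetical names.

```python
def update_key_locations(halite, ship_cells, misses, min_halite=4, top_n=8):
    """Return {cell: consecutive turns the cell has failed the criteria}.

    halite:     2D list of halite values
    ship_cells: iterable of (x, y) cells currently occupied by a ship
    misses:     the dictionary returned for the previous turn
    """
    rich = [(x, y) for x in range(len(halite)) for y in range(len(halite[0]))
            if halite[x][y] > min_halite]
    top = sorted(rich, key=lambda c: halite[c[0]][c[1]], reverse=True)[:top_n]
    qualifying = set(top) | {c for c in ship_cells if halite[c[0]][c[1]] > min_halite}

    updated = {cell: 0 for cell in qualifying}
    for cell, count in misses.items():
        # Keep a cell through its first missed turn; drop it on the second.
        if cell not in qualifying and count + 1 < 2:
            updated[cell] = count + 1
    return updated


turn1 = [[100, 0, 0], [0, 50, 0], [0, 0, 0]]
turn2 = [[10, 0, 0], [0, 50, 0], [0, 0, 0]]

keys = update_key_locations(turn1, [], {}, top_n=1)
print(keys)  # (0, 0) is the single richest cell
keys = update_key_locations(turn2, [], keys, top_n=1)
print(keys)  # (1, 1) takes over, (0, 0) is retained for one more turn
keys = update_key_locations(turn2, [], keys, top_n=1)
print(keys)  # (0, 0) has now missed twice in a row and is dropped
```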

## D. Expectations


There are four potential types of expectations we can consider:
* momentum - assume that enemy ships are going to continue in the direction they moved last turn
* default(s) - use the actions that would be submitted by the most popular/successful publicly available bot (currently Optimus Mining)
* random - assume that the enemy is equally likely to make any action
* strategic learning - use this same framework and learn the enemy's parameters from their observed actions

*Strategic learning requires a recursive relationship (i.e. Team A's strategy depends on what they believe Team B believes about what they believe...). We may omit this or at least limit it to just one level of recursion.*

**Random**

Random movement is a ridiculous expectation, but since we are modeling the expectations of other teams, we need to include the possibility that they have simplistic views of enemy ship movement. Also, if the full expectation is a weighted mix of these approaches, adding weight to the expectation of random movement proxies for general uncertainty.
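
A weighted mix of expectation types can be sketched as a convex combination of per-action probability distributions. The weights and the `mix_expectations` helper below are illustrative assumptions; only the idea of mixing comes from the text above.

```python
ACTIONS = ("N", "S", "E", "W", "Mine", "Convert")


def mix_expectations(distributions, weights):
    """Blend several {action: probability} views into one distribution.

    distributions: {expectation_name: {action: probability}}
    weights:       {expectation_name: non-negative weight}
    """
    total = sum(weights[name] for name in distributions)
    mixed = {action: 0.0 for action in ACTIONS}
    for name, dist in distributions.items():
        share = weights[name] / total  # normalize so the shares sum to 1
        for action, p in dist.items():
            mixed[action] += share * p
    return mixed


random_view = {a: 1.0 / len(ACTIONS) for a in ACTIONS}  # every action equally likely
momentum_view = {"E": 1.0}                              # ship keeps heading east

mixed = mix_expectations(
    {"momentum": momentum_view, "random": random_view},
    {"momentum": 0.7, "random": 0.3},
)
print(mixed)
```

Raising the weight on the random view spreads probability across all six actions, which is the sense in which it stands in for general uncertainty.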

**Momentum**

The momentum expectation simply assumes that the ship will continue in the direction implied by its last two moves or, if it has just been mining, head back toward its base.

| Last Two Moves     | Predictions       |                                  |
| ------------------ | ----------------- | -------------------------------- |
| &#8594; , &#8594;  | &#8594; , &#8594; | Continue path                    |
| &#8593; , &#8594;  | &#8594; , &#8593; | Continue path (toward diagonal)  |
| &#8595; , &#8594;  | &#8594; , &#8595; | Continue path (toward diagonal)  |
| &#8592; , &#8594;  | &#8594; , &#8594; | Continue reverse direction       |
| &#10226;, &#8594;  | &#8594; , &#8594; | Continue path after mining       |
| &#8594;, &#10226;  | &#8689; , &#8689; | Go toward their base             |
| &#10226;, &#10226; | &#8689; , &#8689; | Go toward their base             |

With uncertainty, we can treat the diagonal predictions as having 75/25 odds for the first move and 50/50 odds for the second. For the "go toward their base" expectation, the probability can be spread across all directions, with most of the weight on the two directions that lead toward the closest base.
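
The table and the uncertainty rule can be sketched as a function that returns probabilities for the next move only. It reads each table row as (older move, newer move) and gives the 75% share to the more recent direction. The 80/20 split for "go toward their base" is an assumed value, since the text only says that most of the weight goes toward the base, and `momentum_next_move` is a hypothetical name.

```python
DIRECTIONS = ("N", "S", "E", "W")
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}


def momentum_next_move(older, newer, toward_base=(), base_weight=0.8):
    """Probabilities for a ship's next move given its last two actions.

    older, newer: 'N', 'S', 'E', 'W' or 'Mine' (newer is the most recent).
    toward_base:  the one or two directions that lead to the closest base.
    base_weight:  assumed share of probability sent toward the base.
    """
    if newer == "Mine":
        home = [d for d in DIRECTIONS if d in toward_base]
        if not home:  # base direction unknown: fall back to uniform
            return {d: 0.25 for d in DIRECTIONS}
        others = [d for d in DIRECTIONS if d not in home]
        probs = {d: base_weight / len(home) for d in home}
        probs.update({d: (1.0 - base_weight) / len(others) for d in others})
        return probs
    if older == "Mine" or older == newer or older == OPPOSITE[newer]:
        return {newer: 1.0}  # continue the most recent direction
    return {newer: 0.75, older: 0.25}  # diagonal travel


print(momentum_next_move("N", "E"))
print(momentum_next_move("W", "E"))
print(momentum_next_move("E", "Mine", toward_base=("N", "W")))
```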


**Default**

A large number of bots in the competition are based on the most successful public notebook(s). Since these are public, we could simply run their action decisions as the expectation, and the weight on the default is essentially a view that we know who we are up against!

**Strategic learning**

It may not be worth the computational cost, but at the highest levels of play we might want to form expectations of opponents based on their own strategic model.



## E. Formal Strategy

*The following is version 1.0 and will certainly change over time!*

### Strategic Parameters/Functions

* Strategy Parameters
  * Shipyard spawning
  * 
  
* Strategy Functions
  * Ship value function

**Strategy Parameters**

| Category              |    | Parameter                             | Python Name                             | Default  | Suggested
| --------------------- | -- | ------------------------------------- | --------------------------------------- | -------- | --------
| Shipyard spawning     | 1  | Target number of ships                | `PARAM.ShipyardSpawnTarget`             |  26      | 100
|                       | 2  | Adjustment if step>200                | `PARAM.ShipyardSpawnTarget200adj`       |  -2      |  0
|                       | 3  | Adjustment if step>300                | `PARAM.ShipyardSpawnTarget300adj`       |  -2      |  0
|                       | 4  | Adjustment if step>350                | `PARAM.ShipyardSpawnTarget350adj`       |  -2      |  0
|                       | 5  | Adjustment if step>380                | `PARAM.ShipyardSpawnTarget380adj`       |  -16     |  0
|                       | 6  | Scale calculated ship value           | `PARAM.ShipyardSpawnHarvestRate`        |  0.0     |  1.0





**Ship Value Function**

A key feature in any strategy concerns the value of a ship. In order to make decisions regarding spawning, attack, sacrifice, etc., we need some concept of the value of a ship as measured in halite. Roughly speaking, the value of the ship is equal to the amount of halite it can mine over its expected life. 

We could derive a formula, or calibrate one to actual game data, as a function of the factors that affect ship value:

* steps left in game
* total halite on the board
* halite in reachable distance on board
* location of the ship
* ability to defend loss of shipyard
* expected life before being destroyed by enemy
* ability to destroy enemy shipyard (+500)
* ability to protect home base (+500)

Plan: take high-level games (score about 1100?) and build a dataset of the halite mined/stolen and delivered by every ship in each game, together with the predictor variables above. Fitting that dataset gives us the model.

**Shipyard**

* Assume the agent will create ships if it has enough halite and the current number of ships is less than `target_ships`
* The target number of ships is a decreasing function of the step number (equivalently, an increasing function of the number of steps left) and an increasing function of the total halite
* The agent will also create a new ship to defend a shipyard from being destroyed
* The agent will not spawn a new ship if an existing ship is going to the shipyard (note that an existing ship should not go to the shipyard if the priority is ship construction!)

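The basic target for the number of ships is a value that steps down as the game approaches its end:

$$
\begin{aligned}
\text{TargetShips} ={} & \texttt{PARAM.ShipyardSpawnTarget} \\
& + \texttt{PARAM.ShipyardSpawnTarget200adj} \times \mathbb{1}[\texttt{StepsLeft} < 200] \\
& + \texttt{PARAM.ShipyardSpawnTarget300adj} \times \mathbb{1}[\texttt{StepsLeft} < 100] \\
& + \texttt{PARAM.ShipyardSpawnTarget350adj} \times \mathbb{1}[\texttt{StepsLeft} < 50] \\
& + \texttt{PARAM.ShipyardSpawnTarget380adj} \times \mathbb{1}[\texttt{StepsLeft} < 20]
\end{aligned}
$$

where $\mathbb{1}[\cdot]$ equals 1 when the condition holds and 0 otherwise, and `StepsLeft` is `State['StepsLeft']`.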

(*Can we do a simpler, continuous version without so many parameters that is approximately the same?*)
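
For reference, the stepwise target can be sketched directly from the parameter table and its default values. The sketch assumes a 400-step game, so that "step > 380" corresponds to fewer than 20 steps left; the `SpawnParams` class and `target_ships` function are illustrative names rather than existing code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawnParams:
    """Defaults copied from the strategy parameter table."""
    ShipyardSpawnTarget: int = 26
    ShipyardSpawnTarget200adj: int = -2
    ShipyardSpawnTarget300adj: int = -2
    ShipyardSpawnTarget350adj: int = -2
    ShipyardSpawnTarget380adj: int = -16


def target_ships(steps_left: int, params: SpawnParams = SpawnParams()) -> int:
    """Target fleet size; each adjustment applies once its step threshold has passed."""
    target = params.ShipyardSpawnTarget
    if steps_left < 200:   # step > 200
        target += params.ShipyardSpawnTarget200adj
    if steps_left < 100:   # step > 300
        target += params.ShipyardSpawnTarget300adj
    if steps_left < 50:    # step > 350
        target += params.ShipyardSpawnTarget350adj
    if steps_left < 20:    # step > 380 (assumes a 400-step game)
        target += params.ShipyardSpawnTarget380adj
    return target


for steps_left in (300, 150, 80, 30, 10):
    print(steps_left, target_ships(steps_left))
```

With the default values the target steps down from 26 to 24, 22, 20 and finally 4 over the course of the game.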

The second limit is to produce a ship if its expected value is greater than 500. 


Defensive and strategic considerations:

* If the probability of an enemy base attack is greater than 25% and the enemy has >0 halite, spawn a ship
* If an adjacent enemy ship has no halite and the probability of a base attack next turn is greater than 50%, spawn a new ship

Don't spawn in the last few turns!
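
The spawning rules so far can be combined into a single decision function. This is only a sketch of one way to order them: it assumes a spawn cost of 500 halite, treats "the last few turns" as five, reads "enemy has >0 halite" as the threatening ship's cargo, and lets a defensive spawn override the other checks. All of these choices, and the `should_spawn` name and arguments, are assumptions.

```python
def should_spawn(team_halite, n_ships, target_ships, ship_value, steps_left,
                 p_base_attack=0.0, enemy_cargo=0, adjacent_enemy_empty=False,
                 ship_incoming=False, spawn_cost=500, value_threshold=500,
                 final_turns=5):
    """Decide whether a shipyard should spawn a ship this turn."""
    # Never spawn at the very end of the game or without enough halite.
    if steps_left <= final_turns or team_halite < spawn_cost:
        return False

    # Defensive rules: spawn to protect the shipyard.
    threatened = (
        (p_base_attack > 0.25 and enemy_cargo > 0)
        or (adjacent_enemy_empty and p_base_attack > 0.50)
    )
    if threatened:
        return True

    # Do not block a friendly ship that is on its way to this shipyard.
    if ship_incoming:
        return False

    # Economic rules: below the target fleet size and the ship is worth its cost.
    return n_ships < target_ships and ship_value > value_threshold


print(should_spawn(team_halite=1000, n_ships=10, target_ships=26,
                   ship_value=800, steps_left=200))
```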

Shipyard Actions
Spawn a ship in the shipyard if