# Tutorial 3: State Representation and Encoding

In tutorial 2 we saw how to reduce the state complexity of the game by restricting the agent's depth of field, quantizing the halite levels and providing some high-level meta-information.

The goal of this third tutorial is to explain how to concretely build this representation. This will be achieved in two steps:
1. Extract from the game engine the features that represent our state;
2. Encode these features in a single scalar, which will be used as the index to access the rows of the matrix of the Q-values Q(s,a).

To keep things simple, we will always refer to the case of a $7 \times 7$ map, but the approach holds for any reasonable size.
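Since the encoded state is meant to select a row of the Q-table, the following minimal sketch shows that lookup on a toy table. The sizes, the state index and the stored value are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy sizes, chosen only for illustration
n_states, n_actions = 12, 5

# One row per encoded state, one column per action
Q = np.zeros((n_states, n_actions))

s_enc = 7                                # an encoded state: an integer in [0, n_states - 1]
Q[s_enc, 2] = 1.5                        # store a value for Q(s=7, a=2)
best_action = int(np.argmax(Q[s_enc]))   # greedy action for that state
print(best_action)
```

This is why the encoding must map every observable state to a distinct integer: two states sharing the same index would share the same row of Q-values.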

## Feature extraction

The features we want to extract are:
1. ship position;
2. halite carried by the ship;
3. halite in the cell occupied by the ship;
4. halite in each of the adjacent cells w.r.t. the ship position;
5. direction richest in halite.

To see if the implementation is working, we will initialize the environment and create a single ship.

In [3]:
import sys
sys.path.insert(0, "../Environment/")
import halite_env as Env

In [4]:
import numpy as np
import matplotlib.pyplot as plt

In [6]:
# parameters for a single player in a 7x7 map
NUM_PLAYERS = 1
MAP_SIZE = 7

# instantiate class
env = Env.HaliteEnv(num_players=NUM_PLAYERS, map_size=MAP_SIZE)

In [7]:
action_matrix = np.full((MAP_SIZE,MAP_SIZE), -1)
shipyard_action = True
state, halite, finish_flag, _ = env.step(action_matrix, makeship = shipyard_action)
map_halite = state[:,:,0]
ship_pos_matrix = state[:,:,1]
shipy_pos_matrix = state[:,:,3]

### Ship position

The second layer of the state (index 1) is a matrix whose indices are the x and y coordinates of the map. In the case of a single ship, exactly one element is different from zero, and we want to retrieve its indices.

In [82]:
# matrix to scalar encoding

def one_to_index(V,L):
    """
    Parameters
    ----------
    V: LxL matrix with one entry = 1 and the others = 0
    L: linear dimension of the square matrix
    
    Assigns increasing integers from 0 to L**2 - 1 to the cells of an LxL matrix, row by row.
    
    Returns
    -------
    numpy array of shape (1,) containing the integer that corresponds to the non-zero element of V.
    """
    
    return np.arange(L**2).reshape((L, L))[V.astype(bool)]

In [29]:
# just to show how the encoding of the position works
np.arange(7**2).reshape((7, 7))

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39, 40, 41],
       [42, 43, 44, 45, 46, 47, 48]])

In [28]:
#position_encoded of the ship
pos_enc = one_to_index(ship_pos_matrix, MAP_SIZE)
print("Encoded position of the ship: ", pos_enc)

Encoded position of the ship:  [24]


What if we instead want to access a 7x7 matrix using this encoded position? First we need to obtain the x and y indices, and then use them to access the matrix.

In [83]:
# 2D decoding

def decode(v_enc, L):
    """
    Parameters
    ----------
    v_enc: scalar between 0 and L**2 - 1, is the encoding of a position (x,y)
    L    : linear dimension of the square matrix
    
    Assigns increasing integers from 0 to L**2 - 1 to the cells of an LxL matrix, row by row.
    
    Returns
    -------
    numpy array containing the row and the column corresponding to the matrix element of value v_enc.
    """
    V = np.arange(0,L**2).reshape((L,L))
    v_dec = np.array([np.where(v_enc == V)[0][0],np.where(v_enc == V)[1][0]])
    return v_dec

In [33]:
decode(pos_enc, MAP_SIZE) # remember that indices start from 0

array([3, 3])

And what if we need to encode an (x,y) position? This is easier than before: we first create the matrix V with the entries enumerated (we shall call V the "encoding matrix"), then return the value V[x,y].

In [77]:
# 2D encoding

def encode(v_dec, L):
    """
    Parameters
    ----------
    v_dec: list or numpy array of two integers between 0 and L - 1
    L    : linear dimension of the square matrix
    
    Assigns increasing integers from 0 to L**2 - 1 to the cells of an LxL matrix, row by row.
    
    Returns
    -------
    integer corresponding to the element (v_dec[0],v_dec[1]) of the encoding matrix.
    """
    V = np.arange(0,L**2).reshape((L,L))
    v_enc = V[v_dec[0],v_dec[1]] 
    return v_enc

In [41]:
# stacking all this together

#position_encoded of the ship
pos_enc = one_to_index(ship_pos_matrix, MAP_SIZE)
print("Encoded position of the ship: ", pos_enc)
#position_decoded of the ship
pos_dec = decode(pos_enc, MAP_SIZE)
print("Decoded position of the ship: ", pos_dec)

# encoded position of the shipyard
shipy_enc = one_to_index(shipy_pos_matrix, MAP_SIZE)
print("Encoded position of the shipyard: ", shipy_enc)
# decoded position of the shipyard
shipy_dec = decode(shipy_enc, MAP_SIZE)
print("Decoded position of the shipyard: ", shipy_dec)

Encoded position of the ship:  [24]
Decoded position of the ship:  [3 3]
Encoded position of the shipyard:  [24]
Decoded position of the shipyard:  [3 3]
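As a side note, the same row-by-row encoding can be obtained without building the encoding matrix. The sketch below uses NumPy's `np.unravel_index` and `np.ravel_multi_index`, plus the equivalent integer arithmetic. It is an alternative to the `decode` and `encode` functions above, not the approach used in the rest of this tutorial.

```python
import numpy as np

L = 7  # linear size of the map

# Decoding: flat index -> (row, column), in row-major (C) order
row, col = np.unravel_index(24, (L, L))

# Encoding: (row, column) -> flat index
flat = np.ravel_multi_index((3, 3), (L, L))

# The same arithmetic written out by hand: 24 = 3 * 7 + 3
row_manual, col_manual = divmod(24, L)

print(row, col, flat, row_manual, col_manual)
```

Both versions rely on the identity `index = row * L + column`, which is exactly what the enumerated matrix shown earlier represents.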


### Halite information

We can use the decoded position of the ship to access layers 0 (map_halite) and 2 (halite carried) and retrieve the information about the halite. Then we quantize those values into three levels, which correspond to what the learning agent will actually observe.

We shall call the combination of all this information the halite vector:
$halite\_vector = (C, O, S, N, E, W)$, where C stands for the halite carried by the ship, O for the halite in the cell occupied by the ship, and S, N, E, W for the halite in the adjacent cells to the south, north, east and west.

In [95]:
def halite_quantization(halite_vec, q_number = 3):
    """
    Creates q_number thresholds [t0,t1,t2] equispaced in the log space.
    Maps each entry of halite_vec to the corresponding level:
    if h <= t0 -> level = 0
    if t0 < h <= t1 -> level = 1
    else level = 2

    Parameters
    ----------
    halite_vec : numpy array whose elements are numbers between 0 and 1000
    q_number : number of quantization levels

    Returns
    -------
    level : quantized halite_vec according to the q_number thresholds
    """
    # h can either be a scalar or a matrix 
    thresholds = np.logspace(1,3,q_number) # [10, 100, 1000] = [10^1, 10^2, 10^3]
    h_shape = halite_vec.shape
    h_temp = halite_vec.flatten()
    mask = (h_temp[:,np.newaxis] <= thresholds).astype(int)
    level = np.argmax(mask, axis = 1)
    return level.reshape(h_shape)

In [61]:
halite_example = np.array([0,1,9,10,11,99,100,101,1000])
halite_quantization(halite_example)

array([0, 0, 0, 0, 1, 1, 1, 2, 2])
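One caveat of the function above: for an input larger than 1000 every comparison in the mask is false, so `np.argmax` returns 0 and the value lands in the lowest level. The following sketch is one possible alternative based on `np.digitize`; the name `quantize_halite` is hypothetical and the function is not used elsewhere in this tutorial. It reproduces the output above and sends anything beyond the second-to-last threshold to the top level.

```python
import numpy as np

def quantize_halite(h, q_number=3):
    """Hypothetical alternative to halite_quantization, based on np.digitize."""
    thresholds = np.logspace(1, 3, q_number)   # [10, 100, 1000] for q_number = 3
    # Drop the last threshold so that every value above the second-to-last one
    # falls in the top level, even if it exceeds 1000.
    # right=True gives intervals of the form t[i-1] < h <= t[i].
    return np.digitize(h, thresholds[:-1], right=True)

halite_example = np.array([0, 1, 9, 10, 11, 99, 100, 101, 1000])
print(quantize_halite(halite_example))   # same levels as halite_quantization
print(quantize_halite(5000))             # top level instead of 0
```

Within the documented input range (0 to 1000) the two functions agree, so the choice only matters if the halite values can exceed the last threshold.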

We now present the function that takes as input the state from the game engine and returns a vector of 6 elements, each between 0 and 2, representing the quantized halite vector.

In [94]:
def get_halite_vec_dec(state, q_number = 3, map_size = 7):
    """
    Parameters
    ----------
    state: [map_size,map_size,>=3] numpy array, which layers are:
            Layer 0: map halite, 
            Layer 1: ship position, 
            Layer 2: halite carried by the ships (a.k.a. cargo)
    q_number : number of quantization levels
    map_size : linear size of the squared map
    
    Returns
    -------
    quantized halite vector [𝐶,𝑂,𝑆,𝑁,𝐸,𝑊], numpy array of shape (6,)
    (where C stands for the halite carried by the ship and O for the cell occupied by the ship)
    """
    def halite_quantization(halite_vec, q_number = 3):
        """
        Creates q_number thresholds [t0,t1,t2] equispaced in the log space.
        Maps each entry of halite_vec to the corresponding level:
        if h <= t0 -> level = 0
        if t0 < h <= t1 -> level = 1
        else level = 2
        
        Parameters
        ----------
        halite_vec : numpy array whose elements are numbers between 0 and 1000
        q_number : number of quantization levels

        Returns
        -------
        level : quantized halite_vec according to the q_number thresholds
        """
        # h can either be a scalar or a matrix 
        thresholds = np.logspace(1,3,q_number) # [10, 100, 1000] = [10^1, 10^2, 10^3]
        h_shape = halite_vec.shape
        h_temp = halite_vec.flatten()
        mask = (h_temp[:,np.newaxis] <= thresholds).astype(int)
        level = np.argmax(mask, axis = 1)
        return level.reshape(h_shape)

    pos_enc = one_to_index(state[:,:,1], map_size)
    pos_dec = decode(pos_enc, map_size) # decode position to access matrix by two indices
    
    ship_cargo = state[pos_dec[0],pos_dec[1],2]
    cargo_quant = halite_quantization(ship_cargo, q_number).reshape(1)[0] # quantize halite
    
    map_halite = state[:,:,0]
    halite_quant = halite_quantization(map_halite, q_number) # quantize halite
    
    halite_vector = []
    halite_vector.append(cargo_quant)
    halite_vector.append(halite_quant[pos_dec[0], pos_dec[1]])
    halite_vector.append(halite_quant[(pos_dec[0]+1)%map_size, pos_dec[1]])
    halite_vector.append(halite_quant[(pos_dec[0]-1)%map_size, pos_dec[1]])
    halite_vector.append(halite_quant[pos_dec[0], (pos_dec[1]+1)%map_size])
    halite_vector.append(halite_quant[pos_dec[0], (pos_dec[1]-1)%map_size])

    return np.array(halite_vector)

In [98]:
# Example of output
hal_vec_dec = get_halite_vec_dec(state)
hal_vec_dec

array([0, 0, 2, 2, 2, 2])

### Direction richest in halite

We have to implement what is represented in this image:
<img src="Support_material/high-level-features.png">

The only free parameter is how far to look for halite, that is, the linear dimension of the squares. In our case it is fixed to 3, but with some effort any other odd number could work.

In [76]:
# we will use this function
help(np.roll)

Help on function roll in module numpy:

roll(a, shift, axis=None)
    Roll array elements along a given axis.
    
    Elements that roll beyond the last position are re-introduced at
    the first.
    
    Parameters
    ----------
    a : array_like
        Input array.
    shift : int or tuple of ints
        The number of places by which elements are shifted.  If a tuple,
        then `axis` must be a tuple of the same size, and each of the
        given axes is shifted by the corresponding number.  If an int
        while `axis` is a tuple of ints, then the same value is used for
        all given axes.
    axis : int or tuple of ints, optional
        Axis or axes along which elements are shifted.  By default, the
        array is flattened before shifting, after which the original
        shape is restored.
    
    Returns
    -------
    res : ndarray
        Output array, with the same shape as `a`.
    
    See Also
    --------
    rollaxis : Roll the specified axis backwa

In [68]:
# now suppose that the ship is in [2,2], whereas the shipyard is at the center of the map, i.e. [3,3]
example = np.roll(map_halite, shift = (1,1) , axis =  (0,1)) #in this way we simulate the ship to be in (2,2)
print("This is what we should get: \n", example)

This is what we should get: 
 [[258 185 480 169 860 945 189]
 [445 325 479 544 560  37 993]
 [380  86  74 514 397 324 635]
 [638 617 705 490 780 618 626]
 [551 385 567 166   0 223 447]
 [ 25 809 953 414 576 775  76]
 [988 785 101 311 153 506 880]]


In [69]:
pos_dec = [2,2]
shift = (shipy_dec[0]-pos_dec[0],shipy_dec[1]-pos_dec[1])
centered_h = np.roll(map_halite, shift = shift, axis = (0,1))
print("Result: \n",centered_h)

Result: 
 [[258 185 480 169 860 945 189]
 [445 325 479 544 560  37 993]
 [380  86  74 514 397 324 635]
 [638 617 705 490 780 618 626]
 [551 385 567 166   0 223 447]
 [ 25 809 953 414 576 775  76]
 [988 785 101 311 153 506 880]]


In [93]:
def roll_and_crop_v0(M, shift, axis, border = 1, center = (3,3)):
    """
    Shift matrix and then crops it around the center keeping a border.
    
    Parameters
    ----------
    M : square matrix (numpy array)
        Matrix to be rolled and cropped
    shift : int or tuple of ints
        The number of places by which elements are shifted.  If a tuple,
        then `axis` must be a tuple of the same size, and each of the
        given axes is shifted by the corresponding number.  If an int
        while `axis` is a tuple of ints, then the same value is used for
        all given axes.
    axis : int or tuple of ints
        Axis or axes along which elements are shifted.
    border : int
        Number of cells kept on each side of the central cell (after the shift).
        The resulting area has shape (2*border+1, 2*border+1).
    center : tuple of two ints
        Row and column of the cell around which the matrix is cropped.
        
    Returns
    -------
    M_crop : numpy array of shape (2*border+1, 2*border+1)
    """
    M_temp = np.roll(M, shift = shift, axis = axis)
    M_crop = M_temp[center[0]-border:center[0]+border+1, center[1]-border:center[1]+border+1]
    return M_crop


In [91]:
# try to return just the 3x3 area around the ship
around_ship = roll_and_crop_v0(centered_h, shift = 0, axis = 0, border=1)
print("3x3 neighborhood of the ship: \n", around_ship, '\n')

# whereas if we wanted to get a 5x5 area, we just increase the border
around_ship2 = roll_and_crop_v0(centered_h, shift = 0, axis = 0, border=2)
print("5x5 neighborhood of the ship: \n", around_ship2, '\n')

3x3 neighborhood of the ship: 
 [[ 74 514 397]
 [705 490 780]
 [567 166   0]] 

5x5 neighborhood of the ship: 
 [[325 479 544 560  37]
 [ 86  74 514 397 324]
 [617 705 490 780 618]
 [385 567 166   0 223]
 [809 953 414 576 775]] 
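The roll-then-crop approach needs the ship to be moved onto a cell far enough from the border. As a possible alternative, the sketch below extracts the same kind of window directly around any cell by wrapping the indices with a modulo. The helper name `wrapped_window` and the toy matrix are assumptions made for illustration only.

```python
import numpy as np

def wrapped_window(M, center, border=1):
    """Return the square of side 2*border+1 around `center`, wrapping at the map edges."""
    offsets = np.arange(-border, border + 1)
    rows = (center[0] + offsets) % M.shape[0]   # row indices, wrapped around
    cols = (center[1] + offsets) % M.shape[1]   # column indices, wrapped around
    return M[np.ix_(rows, cols)]

M = np.arange(49).reshape(7, 7)       # toy map: each cell holds its encoded position
print(wrapped_window(M, (3, 3)))      # interior cell: plain 3x3 neighborhood
print(wrapped_window(M, (0, 0)))      # corner cell: neighbors come from the opposite edges
```

For an interior cell this matches what `roll_and_crop_v0` returns with `shift = 0`; for a border cell it reproduces the wrap-around that `np.roll` provides.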



In [67]:
# we actually need to shift by two in all cardinal directions w.r.t. the map centered around the ship
mean_cardinal_h = []
perm = [(a,sh) for a in [0,1] for sh in [-2,2]]
for a,sh in perm:
    print("Map shifted in direction: (%d,%d)\n"%(sh,a), roll_and_crop_v0(centered_h, shift = sh, axis = a))
    mean_h = np.mean(roll_and_crop_v0(centered_h, shift = sh, axis = a), axis = (0,1))
    print("Mean halite in direction: (%d,%d)"%(sh,a), mean_h, '\n')
    mean_cardinal_h.append(mean_h)

mean_cardinal_h = np.array(mean_cardinal_h)
halite_direction = np.argmax(mean_cardinal_h) # index of the richest direction
print("Action suggested to reach the nearest and richest halite deposit: ", halite_direction)

Map shifted in direction: (-2,0)
 [[567 166   0]
 [953 414 576]
 [101 311 153]]
Mean halite in direction: (-2,0) 360.1111111111111 

Map shifted in direction: (2,0)
 [[480 169 860]
 [479 544 560]
 [ 74 514 397]]
Mean halite in direction: (2,0) 453.0 

Map shifted in direction: (-2,1)
 [[397 324 635]
 [780 618 626]
 [  0 223 447]]
Mean halite in direction: (-2,1) 450.0 

Map shifted in direction: (2,1)
 [[380  86  74]
 [638 617 705]
 [551 385 56

Putting everything together, we obtain a high-level function that outputs an integer between 0 and 3.

In [96]:
def get_halite_direction(state, map_size = 7):
    """
    Returns the direction richest in halite given the ship position.
    Works only for a single ship.
    
    Parameters
    ----------
    state: [map_size,map_size,>=3] numpy array
        Layer 0: map halite
        Layer 1: ship position 
        Layer 2: halite carried by the ships (a.k.a. cargo)
    map_size : linear size of the squared map
    
    Returns
    -------
    h_dir : int
        Dictionary to interpret the output:
        {0:'S', 1:'N', 2:'E', 3:'W'}
        
    """
    def roll_and_crop(M, shift, axis, border = 1, center = (3,3)):
        """
        Shift matrix and then crops it around the center keeping a border.

        Parameters
        ----------
        M : square matrix (numpy array)
            Matrix to be rolled and cropped
        shift : int or tuple of ints
            The number of places by which elements are shifted.  If a tuple,
            then `axis` must be a tuple of the same size, and each of the
            given axes is shifted by the corresponding number.  If an int
            while `axis` is a tuple of ints, then the same value is used for
            all given axes.
        axis : int or tuple of ints
            Axis or axes along which elements are shifted.
        border : int
            Number of cells kept on each side of the central cell (after the shift).
            The resulting area has shape (2*border+1, 2*border+1).
        center : tuple of two ints
            Row and column of the cell around which the matrix is cropped.

        Returns
        -------
        M_crop : numpy array of shape (2*border+1, 2*border+1)
        """
        M_temp = np.roll(M, shift = shift, axis = axis)
        M_crop = M_temp[center[0]-border:center[0]+border+1, center[1]-border:center[1]+border+1]
        return M_crop

    map_halite = state[:,:,0] # matrix with halite of each cell of the map
    
    pos_enc = one_to_index(state[:,:,1], map_size) # ship position
    pos_dec = decode(pos_enc, map_size) # decode position to access matrix by two indices
    
    # Center the map on the ship by rolling the ship onto the central cell of the map.
    # In the 7x7 example this cell coincides with the shipyard cell (3,3); using it
    # avoids relying on the global variable shipy_pos_matrix.
    c = map_size // 2
    shift = (c - pos_dec[0], c - pos_dec[1])
    centered_h = np.roll(map_halite, shift = shift, axis = (0,1)) # centers map_halite on the ship
    
    mean_cardinal_h = []
    # this could be generalized to wider areas, like 5x5, but 3x3 is enough for a 7x7 map
    perm = [(a,sh) for a in [0,1] for sh in [-2,2]] # (axis, shift) pairs giving the 4 cardinal directions
    for a,sh in perm:
        mean_h = np.mean(roll_and_crop(centered_h, shift = sh, axis = a, center = (c,c)), axis = (0,1))
        mean_cardinal_h.append(mean_h)

    mean_cardinal_h = np.array(mean_cardinal_h)
    halite_direction = np.argmax(mean_cardinal_h) # take the direction of the richest 3x3 zone
    
    return halite_direction
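To check the logic independently of the game engine, the sketch below recomputes the four directional means on the centered map printed earlier in this section (the matrix is copied from that output). The variable names `reach`, `labels` and `best` are introduced here only for illustration.

```python
import numpy as np

# Halite map already centered on the ship, copied from the output shown above
centered_h = np.array([
    [258, 185, 480, 169, 860, 945, 189],
    [445, 325, 479, 544, 560,  37, 993],
    [380,  86,  74, 514, 397, 324, 635],
    [638, 617, 705, 490, 780, 618, 626],
    [551, 385, 567, 166,   0, 223, 447],
    [ 25, 809, 953, 414, 576, 775,  76],
    [988, 785, 101, 311, 153, 506, 880],
])

c, border, reach = 3, 1, 2   # central cell, half-width of the window, shift size
means = []
for axis in (0, 1):
    for shift in (-reach, reach):
        # Bring the zone lying `reach` cells away onto the center, then crop 3x3
        rolled = np.roll(centered_h, shift=shift, axis=axis)
        window = rolled[c - border:c + border + 1, c - border:c + border + 1]
        means.append(window.mean())

labels = ['S', 'N', 'E', 'W']          # same order as the dictionary in the docstring
best = labels[int(np.argmax(means))]
print(dict(zip(labels, means)), best)
```

With this map the northern zone has the highest mean (453.0), which corresponds to index 1 in the convention `{0:'S', 1:'N', 2:'E', 3:'W'}`.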

## Encoding state

We now have the following features:

- $pos\_enc \in [0,48]$ (ship position encoded);
- $halite\_vector = (C, O, S, N, E, W)$ (where C stands for the halite carried by the ship and O for the halite in the cell occupied by the ship);
- $halite\_direction \in [0,3]$ (direction to take to go towards the nearest and richest halite deposit).

The idea is to first encode the halite vector into a scalar $halvec\_enc$, then to use the triplet $(pos\_enc, halvec\_enc, haldir)$ as the indices of a 3D tensor and encode it as a single integer. This final integer $s\_enc$ indexes the rows of the Q(s,a) table and takes values between 0 and 142,883, since there are $49 \cdot 3^6 \cdot 4 = 142{,}884$ possible states.

In [99]:
# Encoding halite vector
hal_vec_dec = get_halite_vec_dec(state)
hal_vec_dec

array([0, 0, 2, 2, 2, 2])

In [100]:
# valid for encoding and decoding an array of length L whose entries can all assume only 
# the same integer values from 0 to m-1
def encode_vector(v_dec, L = 6, m = 3):
    """
    Encodes a vector of L integers ranging from 0 to m-1.
    
    Parameters
    ----------
    v_dec: list or numpy array of L integers between 0 and m-1
    L    : length of the vector
    m    : number of values that each entry can assume
    
    Assigns increasing integers from 0 to m**L - 1 to the entries of an L-dimensional tensor with m entries per axis, "row by row".
    
    Returns
    -------
    integer corresponding to the element (v_dec[0],v_dec[1],...,v_dec[L-1]) of the encoding tensor.
    """
    T = np.arange(m**L).reshape(tuple([m for i in range(L)]))
    return T[tuple(v_dec)]

def decode_vector(v_enc, L = 6, m = 3):
    """
    Decodes an encoding for a vector of L integers ranging from 0 to m-1.
    
    Parameters
    ----------
    v_enc: scalar between 0 and m**L - 1, is the encoding of a vector (x1,x2,...,xL)
    L    : length of the vector
    m    : number of values that each entry can assume
    
    Assigns increasing integers from 0 to m**L - 1 to the entries of an L-dimensional tensor with m entries per axis, "row by row".
    
    Returns
    -------
    numpy array containing the indices corresponding to the tensor element of value v_enc.
    """
    T = np.arange(m**L).reshape(tuple([m for i in range(L)]))
    return np.array([np.where(v_enc == T)[i][0] for i in range(L)])

In [102]:
# This is a simple check to see if the encoding and decoding procedure works
print("Original vector: ", hal_vec_dec)
v_enc = encode_vector(hal_vec_dec)
print("Encoded vector: ", v_enc)
v_dec = decode_vector(v_enc)
print("Decoded vector: ", v_dec)

Original vector:  [0 0 2 2 2 2]
Encoded vector:  80
Decoded vector:  [0 0 2 2 2 2]
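The encoding above is equivalent to reading the vector as a number written in base m, with the first entry as the most significant digit. The sketch below makes that arithmetic explicit without allocating the full tensor; the function names `encode_base_m` and `decode_base_m` are hypothetical and not used elsewhere in the tutorial.

```python
def encode_base_m(digits, m=3):
    """Read `digits` as a base-m number, most significant digit first."""
    code = 0
    for d in digits:
        code = code * m + int(d)
    return code

def decode_base_m(code, length=6, m=3):
    """Inverse of encode_base_m: recover `length` base-m digits."""
    digits = []
    for _ in range(length):
        code, d = divmod(code, m)   # peel off the least significant digit
        digits.append(d)
    return digits[::-1]             # restore most-significant-first order

v = [0, 0, 2, 2, 2, 2]
code = encode_base_m(v)             # 2*27 + 2*9 + 2*3 + 2
print(code, decode_base_m(code))
```

For the example vector this gives 80, the same value returned by `encode_vector`.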


We now face a slightly different case: we have to encode a vector with 3 integer entries, each with a different (but known) range. 

The solution is a simple hybrid of those seen before.

In [103]:
# 3D encoding and decoding for arbitrary lengths of the three axis

def encode3D(v_dec, L1, L2, L3):
    """
    Encodes a vector of 3 integers of ranges respectively L1, L2 and L3,
    e.g. the first entry must be an integer between 0 and L1-1.
     
    Parameters
    ----------
    v_dec: list or numpy array of three integers, the i-th one between 0 and Li-1
    L1   : range of the first dimension
    L2   : range of the second dimension
    L3   : range of the third dimension
    
    Assigns increasing integers from 0 to L1*L2*L3 - 1 to an L1xL2xL3 3D-matrix "row by row".
    
    Returns
    -------
    integer corresponding to the element (v_dec[0],v_dec[1],v_dec[2]) of the encoding 3D-matrix.
    """
    V = np.arange(0,L1*L2*L3).reshape((L1,L2,L3))
    v_enc = V[tuple(v_dec)] 
    return v_enc

def decode3D(v_enc, L1, L2, L3):
    """
    Decodes an encoding for a vector of 3 integers of ranges respectively L1, L2 and L3.
    
    Parameters
    ----------
    v_enc: scalar between 0 and L1*L2*L3 - 1, is the encoding of a triplet (x1,x2,x3)
    L1   : range of the first dimension
    L2   : range of the second dimension
    L3   : range of the third dimension
    
    Assigns increasing integers from 0 to L1*L2*L3 - 1 to an L1xL2xL3 3D-matrix "row by row".
    
    Returns
    -------
    numpy array containing the three indices corresponding to the 3D-matrix element of value v_enc.
    """
    V = np.arange(0,L1*L2*L3).reshape((L1,L2,L3))
    v_dec = np.array([np.where(v_enc == V)[0][0],np.where(v_enc == V)[1][0], np.where(v_enc == V)[2][0]])
    return v_dec

In [105]:
print("Ship position encoded: ",pos_enc[0])
hal_vec_enc = encode_vector(hal_vec_dec)
print("Halite vector encoded: ",hal_vec_enc)
print("Halite direction: ", halite_direction)
s_dec = np.array([pos_enc[0], hal_vec_enc, halite_direction] )

Ship position encoded:  24
Halite vector encoded:  80
Halite direction:  1


In [106]:
# How to use encode3D

H_LEV = 3 # halite levels
N_CELLS = MAP_SIZE**2 # number of cells in a square map
N_STATES = N_CELLS*H_LEV**6*4
N_ACTIONS = 5 # no dropoffs, 1 action for staying still, 4 for moving in the cardinal directions
print("Total number of states to be experienced: ", N_STATES)

print("Original decoded state: ", s_dec)
s_enc = encode3D(s_dec, L1 = N_CELLS, L2 = H_LEV**6, L3 = N_ACTIONS-1)
print("Encoded state: ", s_enc)
s_dec_2 = decode3D(s_enc, L1 = N_CELLS, L2 = H_LEV**6, L3 = N_ACTIONS-1)
print("New decoded state: ", s_dec_2)

Total number of states to be experienced:  142884
Original decoded state:  [24 80  1]
Encoded state:  70305
New decoded state:  [24 80  1]
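The encoded value can be verified by hand, since the row-by-row enumeration of an L1 x L2 x L3 tensor is a mixed-radix number. The sketch below checks the example above both with the explicit formula and with NumPy's `np.ravel_multi_index`; the variable names `dims`, `s_manual` and `s_back` are introduced here for illustration.

```python
import numpy as np

dims = (49, 3**6, 4)     # (N_CELLS, H_LEV**6, number of directions)
s_dec = (24, 80, 1)      # (pos_enc, hal_vec_enc, halite_direction) from the example

# Mixed-radix formula: ((x1 * L2) + x2) * L3 + x3
s_manual = (s_dec[0] * dims[1] + s_dec[1]) * dims[2] + s_dec[2]

# Same result through NumPy, and the inverse mapping
s_enc = np.ravel_multi_index(s_dec, dims)
s_back = np.unravel_index(s_enc, dims)

print(s_manual, int(s_enc), tuple(int(x) for x in s_back))
```

Both computations give 70305, matching `encode3D`, and avoid allocating an array with 142884 entries at every call.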


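For the sake of concision, we condense all these functions into a single one, at least for encoding the state that the game engine outputs at each step. Note that this function computes the halite direction from the actual ship position, whereas the walkthrough above simulated a ship in [2,2], so the resulting direction (and hence the encoded state) can differ from the values printed earlier.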

In [107]:
def encode_state(state, map_size = 7, h_lev = 3, n_actions = 5, debug = False):
    """
    Encode a state of the game in a unique scalar.
    
    Parameters
    ----------
     state   : [map_size,map_size,>=3] numpy array
        Layer 0: map halite
        Layer 1: ship position 
        Layer 2: halite carried by the ships (a.k.a. cargo)
    map_size : int, linear size of the squared map
    h_lev    : int, number of quantization levels of halite
    n_actions: int, number of actions that the agent can perform 
    debug    : bool, verbose mode to debug
    
    Returns
    -------
    s_enc : int, unique encoding of the partial observation of the game state
    """
    pos_enc = one_to_index(state[:,:,1], map_size)[0] # ship position
    if debug:
        print("Ship position encoded in [0,%d]: "%(map_size**2-1), pos_enc)
    
    halvec_dec = get_halite_vec_dec(state, q_number = h_lev, map_size = map_size) 
    halvec_enc = encode_vector(halvec_dec, m = h_lev) # halite vector
    if debug:
        print("Halite vector encoded in [0,%d]: "%(h_lev**6 -1), halvec_enc)
    
    haldir = get_halite_direction(state, map_size = map_size) # halite direction
    if debug:
        print("Halite direction in [0,3]: ", haldir)
    
    s_dec = np.array([pos_enc, halvec_enc, haldir])
    if debug:
        print("Decoded state: ", s_dec)
    s_enc = encode3D(s_dec, L1 = map_size**2, L2 = h_lev**6, L3 = n_actions-1)
    if debug:
        print("State encoded in [0, %d]: "%(map_size**2*h_lev**6*(n_actions-1) - 1), s_enc, '\n')
    
    return s_enc

In [108]:
encode_state(state, map_size = 7, h_lev = H_LEV, n_actions = N_ACTIONS, debug = True)

Ship position encoded in [0,48]:  24
Halite vector encoded in [0,728]:  80
Halite direction in [0,3]:  3
Decoded state:  [24 80  3]
State encoded in [0, 142883]:  70307 



70307

*Congratulations!* You've made it through the toughest tutorial of the series.

In the next one we will see the implementation of Q-learning with tabular methods for the Halite challenge.