<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#First-train" data-toc-modified-id="First-train-1">First train</a></span><ul class="toc-item"><li><span><a href="#Goal" data-toc-modified-id="Goal-1.1">Goal</a></span></li><li><span><a href="#Imports" data-toc-modified-id="Imports-1.2">Imports</a></span></li><li><span><a href="#Data" data-toc-modified-id="Data-1.3">Data</a></span><ul class="toc-item"><li><span><a href="#Code" data-toc-modified-id="Code-1.3.1">Code</a></span></li><li><span><a href="#Save-to-npz" data-toc-modified-id="Save-to-npz-1.3.2">Save to npz</a></span></li><li><span><a href="#Load-matches-for-training" data-toc-modified-id="Load-matches-for-training-1.3.3">Load matches for training</a></span></li></ul></li><li><span><a href="#Model" data-toc-modified-id="Model-1.4">Model</a></span></li><li><span><a href="#Train" data-toc-modified-id="Train-1.5">Train</a></span></li><li><span><a href="#Play" data-toc-modified-id="Play-1.6">Play</a></span></li></ul></li></ul></div>

# First train

## Goal

The goal is to run a first training on the whole dataset. Later I will move this to a script.

## Imports

In [1]:
# Use this to reload changes in python scripts
%load_ext autoreload
%autoreload 2

In [2]:
import os
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

In [3]:
import json
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import tensorflow.keras as keras
from kaggle_environments import make
import pandas as pd
from tqdm.notebook import tqdm

from luxai.utils import render_game_in_html, set_random_seed
from luxai.cunet import cunet_model, cunet_luxai_model, config
from luxai.input_features import make_input, expand_board_size_adding_zeros, crop_board_to_original_size
from luxai.output_features import (
    create_actions_mask, create_output_features,
    UNIT_ACTIONS_MAP, CITY_ACTIONS_MAP)
from luxai.actions import (
    create_actions_for_cities_from_model_predictions,
    create_actions_for_units_from_model_predictions)

Loading environment football failed: No module named 'gfootball'


In [4]:
plt.plot()
plt.close('all')
plt.rcParams["figure.figsize"] = (30, 5)  
mpl.rcParams['lines.linewidth'] = 3
mpl.rcParams['font.size'] = 16

## Data

### Code

In [5]:
def load_match_from_json(filepath, player):
    with open(filepath, 'r') as f:
        match = json.load(f)
    
    board, features, unit_output, city_output = [], [], [], []
    for step in range(len(match) - 1):
        observation = match[step][0]['observation']
        if player:
            observation.update(match[step][player]['observation'])
        actions = match[step+1][player]['action'] # notice the step + 1: the actions are stored in the following step
        if actions is None: # this can happen on timeout
            continue

        ret = make_input(observation)
        active_units_to_position, active_cities_to_position, units_to_position = ret[2:]
        if active_units_to_position or active_cities_to_position:
            board.append(ret[0])
            features.append(ret[1])
            unit_actions_mask = create_actions_mask(active_units_to_position, observation)
            city_actions_mask = create_actions_mask(active_cities_to_position, observation)
            unit_actions, city_actions = create_output_features(actions, units_to_position, observation)
            unit_output.append(np.concatenate([unit_actions, unit_actions_mask], axis=-1))
            city_output.append(np.concatenate([city_actions, city_actions_mask], axis=-1))

    board = np.array(board, dtype=np.float32)
    features = np.array(features, dtype=np.float32)
    unit_output = np.array(unit_output, dtype=np.float32)
    city_output = np.array(city_output, dtype=np.float32)
    #print('%i/%i' % (len(board), len(match) - 1)) #this shows how many states did not have available actions
    return dict(board=board, features=features, unit_output=unit_output, city_output=city_output)


def save_match_to_npz(filepath, match):
    os.makedirs(os.path.dirname(filepath), exist_ok=True)
    np.savez_compressed(filepath, **match)
    
    
def load_match_from_npz(filepath):
    return dict(**np.load(filepath))

In [6]:
def load_best_n_matches(n_matches):
    matches = []
    for episode_id, player in tqdm(zip(df.EpisodeId[:n_matches], df.Index[:n_matches]), total=n_matches, desc='Loading matches'):
        npz_filepath = os.path.join(matches_cache_npz_dir, '%i_%i.npz' % (episode_id, player))

        if os.path.exists(npz_filepath):
            match = load_match_from_npz(npz_filepath)
        else:
            json_filepath = os.path.join(matches_json_dir, '%i.json' % episode_id)
            match = load_match_from_json(json_filepath, player)
            save_match_to_npz(npz_filepath, match)

        matches.append(match)
    return matches

In [7]:
def combine_data_for_training(matches):
    inputs = [
        np.concatenate([expand_board_size_adding_zeros(match['board']) for match in matches]),
        np.concatenate([match['features'] for match in matches]),
    ]
    print('Inputs shapes', [x.shape for x in inputs])
    outputs = [
        np.concatenate([expand_board_size_adding_zeros(match['unit_output']) for match in matches]),
        np.concatenate([expand_board_size_adding_zeros(match['city_output']) for match in matches]),
    ]
    print('Outputs shapes', [x.shape for x in outputs])
    return inputs, outputs

In [8]:
def load_train_and_test_data(n_matches, test_fraction):
    matches = load_best_n_matches(n_matches=n_matches)
    
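    # One out of every test_fraction matches goes to the test set (test_fraction=20 holds out 5% of the matches)
    test_matches = [match for idx, match in enumerate(matches) if not idx%test_fraction]
    train_matches = [match for idx, match in enumerate(matches) if idx%test_fraction]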
    
    print('Train matches: %i' % len(train_matches))
    train_data = combine_data_for_training(train_matches)
    print('Test matches: %i' % len(test_matches))
    test_data = combine_data_for_training(test_matches)
    
    return train_data, test_data

### Save to npz

In [9]:
matches_json_dir = '/home/gbarbadillo/luxai_ssd/matches_20211014/matches_json'
matches_cache_npz_dir = '/home/gbarbadillo/luxai_ssd/matches_20211014/matches_npz'

In [10]:
df = pd.read_csv('/mnt/hdd0/Kaggle/luxai/agent_selection.csv')
df.sort_values('FinalScore', ascending=False, inplace=True)
df.reset_index(drop=True, inplace=True)
df.head()

| | Id | EpisodeId | Index | Reward | State | SubmissionId | InitialConfidence | InitialScore | UpdatedConfidence | UpdatedScore | FinalScore |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 69945543 | 27424471 | 1 | 90009.0 | 2 | 23032370 | 36.889864 | 1560.38928 | 36.5225 | 1566.517103 | 1818.288755 |
| 1 | 69923394 | 27413397 | 0 | 650053.0 | 2 | 23032370 | 38.435234 | 1536.093066 | 38.058785 | 1541.342982 | 1818.288755 |
| 2 | 69849883 | 27376641 | 1 | 410038.0 | 2 | 23032370 | 87.761971 | 1111.109255 | 83.368953 | 1142.533822 | 1818.288755 |
| 3 | 69847811 | 27375605 | 1 | 150010.0 | 2 | 23032370 | 144.366002 | 871.859256 | 135.55263 | 896.074161 | 1818.288755 |
| 4 | 69847037 | 27375218 | 1 | 130011.0 | 2 | 23032370 | 185.0 | 702.140727 | 170.0 | 788.476153 | 1818.288755 |


Loading the data for all the matches from the json files takes around an hour, so we are going to save the features to npz files, which reduces the loading time to around 10 minutes.
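The caching idea can be checked in isolation with a small round trip. This is only a sketch: the array shapes, the file name `example.npz` and the helper `roundtrip_npz` are made up for illustration, while `np.savez_compressed` and `np.load` are the same NumPy calls used by `save_match_to_npz` and `load_match_from_npz`.

```python
import os
import tempfile

import numpy as np


def roundtrip_npz(match, directory):
    """Save a dict of arrays to a compressed npz file and load it back."""
    filepath = os.path.join(directory, 'example.npz')  # hypothetical file name
    np.savez_compressed(filepath, **match)
    # Using np.load as a context manager closes the file once the arrays are read
    with np.load(filepath) as data:
        return {key: data[key] for key in data.files}


# Toy match with made-up shapes, only to exercise the round trip
match = dict(
    board=np.zeros((5, 12, 12, 24), dtype=np.float32),
    features=np.ones((5, 1, 13), dtype=np.float32),
)
with tempfile.TemporaryDirectory() as tmp_dir:
    restored = roundtrip_npz(match, tmp_dir)
```

The restored dictionary should have the same keys, shapes and dtypes as the original, so the cached files can replace the json parsing step.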

However, we cannot load the whole dataset into memory due to its size. I have computed that, if we normalize the board size to 32x32, each match takes 56 MB of RAM.
Thus loading 1000 matches would take 56 GB of RAM.
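The estimate can be reproduced with a few lines of arithmetic. The sketch below assumes float32 arrays with the shapes printed when loading the training data; the helper names and the number of stored steps per match passed to `estimate_match_size_mb` are assumptions for illustration.

```python
import numpy as np

# Per-sample shapes, taken from the "Inputs shapes" and "Outputs shapes" printouts
BOARD_SHAPE = (32, 32, 24)
FEATURES_SHAPE = (1, 13)
UNIT_OUTPUT_SHAPE = (32, 32, 11)
CITY_OUTPUT_SHAPE = (32, 32, 4)
BYTES_PER_FLOAT32 = np.dtype(np.float32).itemsize


def array_bytes(n_samples, shape):
    """Bytes needed to store n_samples float32 arrays of the given shape."""
    return n_samples * int(np.prod(shape)) * BYTES_PER_FLOAT32


def bytes_per_sample():
    """Bytes needed for one game state: both inputs plus both outputs."""
    shapes = [BOARD_SHAPE, FEATURES_SHAPE, UNIT_OUTPUT_SHAPE, CITY_OUTPUT_SHAPE]
    return sum(array_bytes(1, shape) for shape in shapes)


def estimate_match_size_mb(n_steps):
    """Approximate size in MB of a match with n_steps stored game states."""
    return n_steps * bytes_per_sample() / 1e6
```

As a sanity check, `array_bytes(124808, BOARD_SHAPE)` gives 12269125632 bytes, the same number TensorFlow reports in its allocation warning when training on the 380 training matches.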

In [8]:
matches = []
for episode_id, player in tqdm(zip(df.EpisodeId.values, df.Index.values), total=len(df)):
    npz_filepath = os.path.join(matches_cache_npz_dir, '%i_%i.npz' % (episode_id, player))
    if os.path.exists(npz_filepath):
        continue
    else:
        json_filepath = os.path.join(matches_json_dir, '%i.json' % episode_id)
        match = load_match_from_json(json_filepath, player)
        save_match_to_npz(npz_filepath, match)

  0%|          | 0/12791 [00:00<?, ?it/s]

### Load matches for training

In [11]:
train_data, test_data = load_train_and_test_data(n_matches=400, test_fraction=20)

Loading matches:   0%|          | 0/400 [00:00<?, ?it/s]

Train matches: 380
Inputs shapes [(124808, 32, 32, 24), (124808, 1, 13)]
Outputs shapes [(124808, 32, 32, 11), (124808, 32, 32, 4)]
Test matches: 20
Inputs shapes [(6244, 32, 32, 24), (6244, 1, 13)]
Outputs shapes [(6244, 32, 32, 11), (6244, 32, 32, 4)]


## Model

In [12]:
# Unet parameters
config.INPUT_SHAPE = [32, 32, 24] #[512, 128, 1]
config.FILTERS_LAYER_1 = 32 # 16
config.N_LAYERS = 3 # 6
config.ACT_LAST = 'sigmoid' # sigmoid
# Condition parameters
config.Z_DIM = 13 # 4
config.CONTROL_TYPE = 'dense' # dense
config.FILM_TYPE = 'simple' # simple
config.N_NEURONS = [16] # [16, 64, 256]
config.N_CONDITIONS = config.N_LAYERS # 6 this should be the same as the number of layers
# Other
config.LR = 1e-3 # 1e-3


model = cunet_luxai_model(config)

2021-10-20 14:08:12.571906: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-20 14:08:13.212813: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-10-20 14:08:13.212857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22308 MB memory:  -> device: 0, name: GeForce RTX 3090, pci bus id: 0000:17:00.0, compute capability: 8.6

## Train

In [13]:
model.fit(x=train_data[0], y=train_data[1], validation_data=test_data, epochs=5)

2021-10-20 14:08:13.386796: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 12269125632 exceeds 10% of free system memory.
2021-10-20 14:08:20.497328: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 5623349248 exceeds 10% of free system memory.
2021-10-20 14:08:25.022528: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 12269125632 exceeds 10% of free system memory.
2021-10-20 14:08:30.018972: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 5623349248 exceeds 10% of free system memory.
2021-10-20 14:08:33.142412: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)


Epoch 1/5


2021-10-20 14:08:35.009905: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2021-10-20 14:08:35.584278: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2021-10-20 14:08:35.585397: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2021-10-20 14:08:35.585436: W tensorflow/stream_executor/gpu/asm_compiler.cc:77] Couldn't get ptxas version string: Internal: Couldn't invoke ptxas --version
2021-10-20 14:08:35.586572: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2021-10-20 14:08:35.586666: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Failed to launch ptxas
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
2021-10-20 14:08:36.115753: I tensorflow/stream_executor/cuda/c

Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f45ac1638b0>

In [20]:
model.fit(x=train_data[0], y=train_data[1], validation_data=test_data, epochs=100)

2021-10-20 12:40:27.467665: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 12269125632 exceeds 10% of free system memory.


Epochs 1/100 through 78/100 (progress output omitted)

KeyboardInterrupt: 

Training metrics improve while validation metrics worsen, which means the model is overfitting. I could add data augmentation.
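One possible augmentation is mirroring the board, sketched below with NumPy. It relies on several assumptions that are not confirmed by this notebook: that axis 2 of the arrays is the horizontal axis, that the unit output has one channel for moving east and one for moving west, and that they sit at the hypothetical indices `EAST_CHANNEL` and `WEST_CHANNEL`. The real channel layout is defined in `luxai.output_features`, and any other direction-dependent channel would need the same treatment.

```python
import numpy as np

# Hypothetical channel indices, the real ones are defined in luxai.output_features
EAST_CHANNEL, WEST_CHANNEL = 1, 3


def horizontal_flip(board, unit_output, city_output):
    """Mirror a batch of samples along axis 2 (assumed to be the horizontal axis)."""
    flipped_board = board[:, :, ::-1].copy()
    flipped_unit_output = unit_output[:, :, ::-1].copy()
    flipped_city_output = city_output[:, :, ::-1].copy()
    # A unit that moved east on the original board moves west on the mirrored one,
    # so the two action channels have to be swapped
    flipped_unit_output[..., [EAST_CHANNEL, WEST_CHANNEL]] = (
        flipped_unit_output[..., [WEST_CHANNEL, EAST_CHANNEL]])
    return flipped_board, flipped_unit_output, flipped_city_output
```

The same idea extends to vertical flips and 90 degree rotations, each with its own permutation of the direction channels.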

In [14]:
model.save('model.h5', include_optimizer=False)

## Play

In [11]:
os.environ['CUDA_VISIBLE_DEVICES'] = ''

In [12]:
model = keras.models.load_model('model.h5', compile=False)

2021-10-20 17:36:29.314356: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-10-20 17:36:29.314392: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: africanus
2021-10-20 17:36:29.314398: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: africanus
2021-10-20 17:36:29.314528: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 460.91.3
2021-10-20 17:36:29.314551: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 460.91.3
2021-10-20 17:36:29.314557: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 460.91.3
2021-10-20 17:36:29.314839: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions i

In [13]:
def agent_with_model(observation, configuration):
    ret = make_input(observation)
    board, features = ret[:2]
    preds = model.predict([
        expand_board_size_adding_zeros(np.expand_dims(board, axis=0)),
        np.expand_dims(features, axis=0)])
    preds = [crop_board_to_original_size(pred, observation) for pred in preds]
    active_units_to_position, active_cities_to_position, units_to_position = ret[2:]
    actions = create_actions_for_units_from_model_predictions(
        preds[0][0], active_units_to_position, units_to_position)
    actions += create_actions_for_cities_from_model_predictions(preds[1][0], active_cities_to_position)
    return actions

In [14]:
env = make("lux_ai_2021", debug=True, configuration={'width': 12, 'height': 12, 'seed': 1})
set_random_seed(7)
game_info = env.run([agent_with_model, '/mnt/hdd0/MEGA/AI/22 Kaggle/luxai/agents/working_title/agent.py'])
render_game_in_html(env)

2021-10-20 17:36:30.011090: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)


It's weird, but the agent likes to stay at home, which does not make much sense. Maybe I have to give a bigger weight to the 1s than to the 0s, because the dataset is unbalanced, or use focal loss.
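To make the two options concrete, here is a minimal NumPy sketch of the binary focal loss formula. It is not the loss currently used by `cunet_luxai_model`, it ignores the action masks, and the function name and default values of `gamma` and `alpha` are assumptions. With `alpha > 0.5` the 1s get a bigger weight than the 0s, and with `gamma > 0` the examples that are already well classified contribute less to the loss.

```python
import numpy as np


def binary_focal_loss(y_true, y_pred, gamma=2.0, alpha=0.75, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1 - eps)
    # p_t is the probability that the model assigns to the true class
    p_t = np.where(y_true == 1, y_pred, 1 - y_pred)
    # alpha weights the positive class (1s), 1 - alpha weights the negative class (0s)
    alpha_t = np.where(y_true == 1, alpha, 1 - alpha)
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))
```

Setting `gamma=0` reduces the expression to a class-weighted binary cross-entropy, which corresponds to the first option of simply giving a bigger weight to the 1s.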

I need a way to debug the agent, but first I need to be able to test that the agent works correctly.