### VERSIONS
* **V1**: One Net (m1) for all `delta` (2 epochs) with one-step neighbors
* **V2**: One Net per (m2) `delta` (2 epochs) with one-step neighbors
* **V3**: One Net per (m2) `delta` (10 epochs) with one-step neighbors
* **V4**: One Net per (m2) `delta` (10 epochs) with **two**-step neighbors
* **V5**: One Conv Net per (c1) `delta` (15 epochs) with **two**-step neighbors

## Quick FE explanation

We are given an $n \times n$ grid in which the next state of each cell is determined by the present state of its neighbors. Conversely, the previous state of a cell can be inferred, at least approximately, from the present state of its neighbors.

That is, for a given cell $(i, j)$, we can try to deduce its past state from its one-step neighbors, defined by:
$$ \mathcal{V}^{(1)}(i, j) = \left \lbrace (i', j') \mid i-1 \leq i' \leq i+1 \text{ and } j-1 \leq j' \leq j+1 \right \rbrace  $$

Similarly, the state of a cell $\delta$ steps back can be estimated from its $\delta$-step neighbors, defined as follows:

$$ \mathcal{V}^{(\delta)}(i, j) = \left \lbrace (i', j') \mid i-\delta \leq i' \leq i+\delta \text{ and } j-\delta \leq j' \leq j+\delta \right \rbrace  $$
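
As a small illustration of the definition above, the sketch below lists the cells of a $\delta$-step neighborhood. The function name `neighborhood` and the choice to drop cells that fall outside the grid (rather than wrapping around or padding) are assumptions made for this example only.

```python
def neighborhood(i, j, delta, n):
    """Cells of the delta-step neighborhood of (i, j) that lie inside an n x n grid."""
    return [
        (ii, jj)
        for ii in range(i - delta, i + delta + 1)
        for jj in range(j - delta, j + delta + 1)
        if 0 <= ii < n and 0 <= jj < n  # assumption: out-of-grid cells are dropped
    ]

# An interior cell has (2 * delta + 1) ** 2 neighbors, itself included.
print(len(neighborhood(12, 12, 1, 25)))  # 9
print(len(neighborhood(12, 12, 2, 25)))  # 25
# A corner cell loses the neighbors that fall outside the grid.
print(len(neighborhood(0, 0, 1, 25)))    # 4
```

Note that the neighborhood contains the cell $(i, j)$ itself, so the number of features per cell grows as $(2\delta + 1)^2$.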

## How to apply?

In practice:
* for grids with `delta==1`, we can use the one-step neighbors' states as features
* for grids with `delta==2`, we can use the two-step neighbors' states as features
* ...
* for grids with `delta==5`, we can use the 5-step neighbors' states as features

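**NB**: For a quick probe of the leaderboard (LB), this notebook uses the same fixed neighborhood for every `delta` instead of a `delta`-dependent one. The current version (V5) uses the two-step neighborhood (`NF = 2`).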
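
The idea in the list above can be sketched on a single grid with NumPy. The helper name `neighbor_features`, the use of `np.pad`, and the padding value `-1` for cells outside the grid are assumptions chosen for this example. The notebook's own `get_features` builds a similar tensor from DataFrame columns instead.

```python
import numpy as np

def neighbor_features(grid, delta, pad_value=-1):
    """Stack the delta-step neighborhood of every cell along a last axis.

    grid: 2D array of shape (n_rows, n_cols) holding 0/1 cell states.
    Returns an array of shape (n_rows, n_cols, (2 * delta + 1) ** 2).
    """
    n_rows, n_cols = grid.shape
    # Assumption: cells outside the grid are filled with pad_value
    padded = np.pad(grid, delta, mode="constant", constant_values=pad_value)
    feats = []
    for di in range(-delta, delta + 1):
        for dj in range(-delta, delta + 1):
            # Shifted window: entry [i, j] is grid[i + di, j + dj], or pad_value outside
            feats.append(padded[delta + di : delta + di + n_rows,
                                delta + dj : delta + dj + n_cols])
    return np.stack(feats, axis=-1)

grid = np.arange(9).reshape(3, 3) % 2   # toy 3 x 3 grid of 0/1 states
x = neighbor_features(grid, delta=1)
print(x.shape)      # (3, 3, 9)
print(x[0, 0, 0])   # -1: the upper-left neighbor of the corner cell is padding
```

The middle channel (index `4` for `delta=1`) is the grid itself, which is what a "predict no change" baseline would use.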


In [None]:
import pandas as pd
from tqdm import tqdm
import gc
import numpy as np
import matplotlib.pyplot as plt

#TF
import tensorflow as tf
import tensorflow.keras.layers as L
import tensorflow.keras.models as M

## Useful functions

In [None]:
def get_features(delta=1):
    """
    Build the neighbor-state features for all grids with the given delta.

    Relies on the globals tr, te, NEIGHBORS and INIT_COLS defined in later cells.
    Returns x_tr, x_te of shape (n_samples, 625, len(NEIGHBORS)) and
    y_tr of shape (n_samples, 625).
    """
    x_tr = []
    x_te = []

    for right, up in NEIGHBORS:
        # Flat index of the shifted cell; indices outside [0, 625) use the "pad" column.
        # Note: the shift is applied to the flat index, so a horizontal offset at a
        # row edge picks a cell from the adjacent row instead of padding.
        cols = [f"stop_{i + right + 25*up}" if 0 <= i + right + 25*up < 625
                else "pad" for i in range(625)]
        x_tr.append(tr.loc[tr.delta==delta, cols].values[:, :, np.newaxis])
        x_te.append(te.loc[te.delta==delta, cols].values[:, :, np.newaxis])
    # Stack the shifted grids along a last "neighbor" axis
    x_tr = np.concatenate(x_tr, axis=2)
    x_te = np.concatenate(x_te, axis=2)
    y_tr = tr.loc[tr.delta==delta, INIT_COLS].values
    gc.collect()
    return x_tr, x_te, y_tr
#======================================
def make_model(nh=9):
    """
    Simple MLP architecture
    """
    
    nn = 100
    inp = L.Input(name="grid", shape=(625,nh))
    d1 = L.Dense(nn, name='d1', activation="relu")(inp)
    d2 = L.Dense(nn, name='d2', activation="relu")(d1)
    preds = L.Dense(1, name='preds', activation="sigmoid")(d2)
    
    model = M.Model(inp, preds, name="ANN")
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model
#=======================
def make_conv(nh=1, h=2, nf=128, ks=5):
    """
    Simple CNN: h (conv + batch norm) blocks followed by a sigmoid conv output.

    Note: `nf` is accepted but not used; every hidden conv layer has `nh` filters.
    """
    inp = L.Input(name="input", shape=(25, 25, nh))

    x = inp
    for idx in range(h):
        x = L.Conv2D(nh, ks, padding="same", activation="relu", name=f"conv{idx+1}")(x)
        x = L.BatchNormalization(name=f"norm{idx+1}")(x)

    # One sigmoid output per cell, reshaped to the (625, 1) layout used by make_model
    x = L.Conv2D(1, ks, padding="same", activation="sigmoid", name="last_conv")(x)
    preds = L.Reshape(name="preds", target_shape=(625, 1))(x)

    model = M.Model(inp, preds, name="CNN")
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model
#====================================================#
def line_search(y_true, y_prob, start, end, breaks):
    """
    Line search for the probability threshold that maximizes accuracy
    """
    space = np.linspace(start, end, breaks)
    best_threshold = 0.
    best_score = 0.
    
    for threshold in space:
        acc = np.mean( y_true == (y_prob > threshold) )
        if acc > best_score:
            best_score = acc
            best_threshold = threshold
    print(f"threshold: {best_threshold}, score: {best_score}")
    return best_threshold, best_score
#===========================================#

## Learning Phase (One net per delta)

In [None]:
%%time
tr = pd.read_csv("../input/conways-reverse-game-of-life-2020/train.csv")
te = pd.read_csv("../input/conways-reverse-game-of-life-2020/test.csv")
tr["pad"] = -1
te["pad"] = -1
INIT_COLS = [f"start_{i}" for i in range(625)]
STOP_COLS = [f"stop_{i}" for i in range(625)]

In [None]:
NF = 2
NEIGHBORS = [(right, up) for up in range(-NF, NF+1) for right in range(-NF, NF+1)]
nh = len(NEIGHBORS)

In [None]:
grid = te[STOP_COLS].values
delta_te = te['delta'].values
middle = (nh - 1) // 2

In [None]:
%%time
CONV = True
for idx in range(5):
    print("===================================\n")
    print(f"DELTA = {idx + 1}")
    x_tr, x_te, y_tr = get_features(delta=idx+1)
    if CONV:
        x_tr = x_tr.reshape((-1, 25, 25,nh ))
        x_te = x_te.reshape((-1, 25, 25,nh ))
        net = make_conv(nh=nh, h=2, nf=128, ks=5)
    else:
        net = make_model(nh)
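    net.fit(x_tr, y_tr, batch_size=100, epochs=15)
    if not CONV:
        # Baseline: predict that the start state equals the stop state of the same cell
        print("naive accuracy",(y_tr==x_tr[:,:,middle]).mean())
    # Tune the decision threshold on the training predictions, then apply it to the test set
    prob_tr = net.predict(x_tr, batch_size=50, verbose=1)[:,:,0]
    bt, bs = line_search(y_tr, prob_tr, 0.3, 0.6, 7)
    preds = net.predict(x_te, batch_size=50, verbose=1)[:,:,0]
    grid[delta_te==idx+1] = (preds>bt).astype(int)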
    del x_tr, x_te, y_tr
    gc.collect()
#==========================

## Submission

In [None]:
sub = te[['id']].copy()
tp = pd.DataFrame(grid, columns=INIT_COLS)
sub = sub.join(tp)
del tp
gc.collect()

In [None]:
sub.head()

In [None]:
sub.to_csv("submission.csv", index=False)