# Game of life EDA
In the 2D grid environment of the game of life, each cell's value is influenced by its neighboring cells. This was originally meant to model natural interactions between neighboring animals, people, or plants.
The fascinating aspect of the game of life is that the world's rules are relatively simple, yet they produce complex and elaborate patterns.
In this notebook, we will explore the data, plot grids, and look at the delta distribution.
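
To make the neighbor rule concrete, here is a minimal sketch of one generation under the standard Conway rules: a dead cell with exactly 3 live neighbors becomes alive, and a live cell with 2 or 3 live neighbors stays alive. The function name `life_step` and the wrap-around treatment of the board edges are assumptions made for this illustration; they are not taken from the rest of this notebook.

```python
import numpy as np

def life_step(board):
    """Advance a 0/1 board by one generation of Conway's Game of Life.

    Assumption: the board wraps around at its edges (toroidal grid).
    """
    board = np.asarray(board, dtype=int)
    # Count live neighbors by summing the 8 shifted copies of the board.
    neighbors = sum(
        np.roll(np.roll(board, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3 neighbors.
    alive_next = (neighbors == 3) | ((board == 1) & (neighbors == 2))
    return alive_next.astype(int)

# A horizontal "blinker" turns into a vertical one after one step.
blinker = np.zeros((5, 5), dtype=int)
blinker[2, 1:4] = 1
next_board = life_step(blinker)
```

Applying `life_step` twice to the blinker returns the original board, which is a quick sanity check on the rule.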

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
from tqdm.notebook import tqdm

In [None]:
train_csv = pd.read_csv("../input/conways-reverse-game-of-life-2020/train.csv")

In [None]:
train_csv.head()

## Grid display
We will first plot a few grids to see what they look like.

In [None]:
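col = train_csv.columns
# split the columns into start-board cells and stop-board (end-board) cells
start_col_filtered = [c for c in col if c.startswith("start_")]
end_col_filtered = [c for c in col if c.startswith("stop_")]
# one row per game, one column per cell (flattened grid)
start_grids = train_csv[start_col_filtered].to_numpy()
stop_grids = train_csv[end_col_filtered].to_numpy()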

In [None]:
def get_grid(list_of_cells, shape=(25,25), plot=True):
    # reshape a flat list of cells into a 2D grid, optionally plotting it
    mat = list_of_cells.reshape(shape)
    if plot:
        sns.heatmap(mat)
    return mat
    
def plot_grids(idx):
    start_cells, end_cells = start_grids[idx], stop_grids[idx]
    start_mat = get_grid(start_cells, plot=False)
    end_mat = get_grid(end_cells, plot=False)
    
    fig, axes = plt.subplots(1,2, figsize=(25,10))
    sns.heatmap(start_mat, ax=axes[0])
    axes[0].set_title("Starting board")
    sns.heatmap(end_mat, ax=axes[1])
    axes[1].set_title("Ending board")
    
idx = 5
plot_grids(idx)

idx = 2
plot_grids(idx)

In these two plots, we notice some similarities between the starting and ending boards. We will therefore plot a change-detection heatmap to get a better view of how those interactions play out.
Each pixel $p$ at coordinates $(x,y)$ can be written as a vector of size 2, $(p_{start}, p_{end})$, where $p_{start}, p_{end} \in \{0,1\}$. This gives 4 possible observations:
 - $p = (0,0)$ -> the cell was **empty in both the start and end boards**. We assign it the color black and the value 0;
 - $p = (1,0)$ -> the cell was **taken in the start board** but **empty in the end board**. We assign it the color red and the value 1;
 - $p = (0,1)$ -> the cell was **empty in the start board** but **taken in the end board**. We assign it the color green and the value 2;
 - $p = (1,1)$ -> the cell was **taken in both the start and end boards**. We assign it the color yellow and the value 3.
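
As a small illustration of this encoding, the sketch below applies `start + 2 * end` to a toy 2x2 pair of boards that contains each of the four cases once. The helper name `encode_change` and the toy arrays are hypothetical and only serve this example.

```python
import numpy as np

def encode_change(start, end):
    """Encode each cell as 0, 1, 2 or 3 from its (start, end) pair."""
    return np.asarray(start) + 2 * np.asarray(end)

# Toy boards covering the four cases (0,0), (1,0), (0,1), (1,1).
toy_start = np.array([[0, 1],
                      [0, 1]])
toy_end = np.array([[0, 0],
                    [1, 1]])

codes = encode_change(toy_start, toy_end)
print(codes)
# [[0 1]
#  [2 3]]
```

Because the start board contributes 1 and the end board contributes 2, every (start, end) pair maps to a distinct value.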


In [None]:
def get_diff(idx, plot=True):
    start_cells, end_cells = start_grids[idx], stop_grids[idx]
    start_grid = get_grid(start_cells, plot=False)
    end_grid = get_grid(end_cells, plot=False)
    
    img = np.zeros((25,25,3))
    img[:,:,0] = start_grid  # red channel: cell taken in the start grid
    img[:,:,1] = end_grid  # green channel: cell taken in the end grid
    # red only: taken at start, empty at end; green only: empty at start, taken at end
    # black: empty in both grids; yellow (red + green): taken in both grids
    # encode each cell as 0, 1, 2 or 3 following the list above
    diff_mat = start_grid + 2*end_grid
    if plot:
        plt.imshow(img)
    
    return diff_mat
    
diff_mat = get_diff(3)

In [None]:
values = []
for idx in tqdm(range(len(start_grids))):
    diff_mat = get_diff(idx, plot=False)
    values.extend(list(diff_mat.flatten()))

In [None]:
plt.hist(values, bins=np.arange(5) - 0.5)  # one bin per diff value (0, 1, 2, 3)

The histogram above shows the distribution of these 4 values. Empty pixels aside, the rarest scenario is a pixel that is taken in both the start and end boards.
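
The same counts can also be computed without a Python loop over grids. The sketch below is one possible way to do it, assuming `start_grids` and `stop_grids` are 0/1 arrays of identical shape as built earlier; the helper name `change_counts` and the toy arrays are hypothetical.

```python
import numpy as np

def change_counts(start_grids, stop_grids):
    """Count how many cells fall into each of the 4 change categories.

    Returns an array [n_0, n_1, n_2, n_3] indexed by the diff value.
    """
    codes = np.asarray(start_grids) + 2 * np.asarray(stop_grids)
    # bincount needs non-negative integers; minlength keeps all 4 slots.
    return np.bincount(codes.ravel().astype(int), minlength=4)

# Toy data: two "games" of four cells each.
toy_start = np.array([[0, 1, 0, 1],
                      [0, 0, 0, 0]])
toy_stop = np.array([[0, 0, 1, 1],
                     [0, 0, 0, 1]])

counts = change_counts(toy_start, toy_stop)
print(counts)  # [4 1 2 1]
```

Working on the whole array at once avoids building a long Python list of values, which may matter when the number of grids is large.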