<h1> Symm4ml Computational Essay Final Draft</h1>
<h3> Author: Simeon Radev</h3>
<h3> Date: 12 May 2023</h3>

<h2>Introduction</h2>

<p> Recent work has used small neural networks to model cellular automaton (CA)-like behavior: the network learns a set of "rules", represented as its parameters, that generate some target pattern. 

Such models can already generate a target pattern from a single seed pixel. In their vanilla form, however, they cannot generate the same target in a different orientation without being explicitly retrained to do so. 

The goal of this project is therefore to train a CA-like model that learns the rules for generating a target and can apply some simple transformations to it without being retrained on those transformations. The motivation is that, without such equivariance, the orientation of the generated pattern is a property of the grid space rather than of the configuration of cells/pixels/nodes inside that space, so the model is not yet fully self-organizing. With an equivariant version of the network, the orientation would instead become a property of the configuration of cells/pixels/nodes. </p>

<h2>Related Work</h2>
<p>The most direct influence for this project comes from <a href="https://distill.pub/2020/growing-ca/">this Distill paper</a>. Its authors pioneered the idea of using a single small neural network, applied repeatedly over a series of discrete time steps, to grow the final target pattern. The model parameters are then updated via backpropagation through time. 

A <a href="https://proceedings.neurips.cc/paper/2021/hash/af87f7cdcda223c41c3f3ef05a3aaeea-Abstract.html">2021 NeurIPS paper</a> then extends this idea to graph structures. There, a graph is distorted and the distorted graph is used as the seed input to the model. Then, via repeated applications of a single, graph-based, CA-like model, the distorted graph is made to converge to some target shape. While this model performs well at its task, it is still not equivariant, so any major transformation of the target would require the model to be retrained. </p>
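
<p>As a rough illustration of the scheme described in the Distill article (one small network applied repeatedly, trained with backpropagation through time), here is a minimal PyTorch sketch. It is neither the article's implementation nor the recreation used in this essay: the class <code>TinyCA</code>, the helper <code>rollout</code>, the channel count, the grid size, and the random placeholder target are all assumptions made for illustration, and features such as stochastic updates and alive masking are omitted.</p>

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyCA(nn.Module):
    """Hypothetical minimal CA-like rule: 3x3 perception followed by a per-cell update."""

    def __init__(self, channels=8, hidden=32):
        super().__init__()
        # Each cell perceives its 3x3 neighbourhood
        self.perceive = nn.Conv2d(channels, hidden, kernel_size=3, padding=1)
        # A 1x1 convolution acts as the same small dense layer at every cell
        self.update = nn.Conv2d(hidden, channels, kernel_size=1)
        # Assumption: start from a "do nothing" rule by zeroing the last layer
        nn.init.zeros_(self.update.weight)
        nn.init.zeros_(self.update.bias)

    def forward(self, state):
        # One discrete time step, written as a residual update of the cell states
        return state + self.update(F.relu(self.perceive(state)))


def rollout(model, state, steps):
    """Apply the same network repeatedly for a fixed number of time steps."""
    for _ in range(steps):
        state = model(state)
    return state


torch.manual_seed(0)
model = TinyCA()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Seed grid: all zeros except one centre cell whose non-RGB channels are set to 1
seed = torch.zeros(1, 8, 16, 16)
seed[:, 3:, 8, 8] = 1.0
# Placeholder RGBA target (random values, for illustration only)
target = torch.rand(1, 4, 16, 16)

final_state = rollout(model, seed, steps=8)
loss = F.mse_loss(final_state[:, :4], target)  # compare the first four (RGBA) channels

optimizer.zero_grad()
loss.backward()  # gradients flow back through all 8 applications of the network
optimizer.step()
```

<p>Because the same parameters are reused at every step, the gradient of the loss accumulates contributions from every application of the network in the rollout.</p>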

<p>Below we include a recreation of the experiments from the Distill blog, rewritten in PyTorch from the original TensorFlow implementation:</p>

<p style="color:red">TODO: Discuss more specifically their implementation and include graphic of model. Also, include briefly their follow-up with the isotropic version and how the methodology is different.</p>

<h2>Background</h2>
<p>The goal of this project is to make an equivariant version of either a 
two-dimensional or a three-dimensional cellular automaton-like network, using 
ideas from the two main papers cited above. In the three-dimensional case 
this could be done with the e3nn library and the standard concepts of 
equivariance in three-dimensional space covered in the class. 

If, however, a two-dimensional route is pursued, the equivariant model could potentially be implemented without such an additional library. I am still unclear on how this would be done and need to spend more time thinking it through and discussing it with others. </p>
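
<p>For intuition on the two-dimensional case, recall that a map <code>f</code> is equivariant to a group of transformations if transforming the input and then applying <code>f</code> gives the same result as applying <code>f</code> and then transforming the output. The NumPy sketch below checks this numerically for one possible construction, not necessarily the one this project will adopt: a 3x3 kernel averaged over its four 90-degree rotations, acting on a single-channel grid with periodic boundaries. The choice of group (C4), the periodic boundaries, and the helper names <code>conv3x3_periodic</code>, <code>symmetrize_c4</code>, and <code>equivariance_error</code> are assumptions made for illustration.</p>

```python
import numpy as np


def conv3x3_periodic(x, kernel):
    """Cross-correlate a 2D grid with a 3x3 kernel, using periodic boundaries."""
    out = np.zeros_like(x, dtype=float)
    for a in range(3):
        for b in range(3):
            # Rolling by (1 - a, 1 - b) places x[i + a - 1, j + b - 1] at position (i, j)
            out += kernel[a, b] * np.roll(x, shift=(1 - a, 1 - b), axis=(0, 1))
    return out


def symmetrize_c4(kernel):
    """Average a kernel over its four 90-degree rotations (group averaging over C4)."""
    return sum(np.rot90(kernel, k) for k in range(4)) / 4.0


rng = np.random.default_rng(0)
grid = rng.normal(size=(8, 8))        # square, single-channel test grid
raw_kernel = rng.normal(size=(3, 3))  # unconstrained kernel
sym_kernel = symmetrize_c4(raw_kernel)


def equivariance_error(kernel):
    """Largest difference between 'rotate, then apply' and 'apply, then rotate'."""
    rotate_then_apply = conv3x3_periodic(np.rot90(grid), kernel)
    apply_then_rotate = np.rot90(conv3x3_periodic(grid, kernel))
    return np.abs(rotate_then_apply - apply_then_rotate).max()


print("unconstrained kernel:", equivariance_error(raw_kernel))
print("symmetrized kernel:  ", equivariance_error(sym_kernel))
```

<p>The symmetrized kernel should give an error at the level of floating-point round-off, whereas an unconstrained random kernel generally does not commute with the rotation. This check only covers single-channel (scalar) features and 90-degree rotations; richer feature types or finer rotation groups would need more machinery.</p>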

<p style="color:red">TODO: Include more math and discuss the group that will be used. This is the place for theoretical overview.</p>

<h2>Methods</h2> <p>The models to be built are similar to the ones described in the papers above, with equivariance additionally implemented.</p>

<p>Ideally, I wanted to do this in JAX, so below is a non-working re-implementation of the PyTorch pipeline above. For reasons I have not yet identified, I cannot run it on GPU with PyTorch imported, so there is an accompanying notebook dedicated to the JAX version alone (which should at least run). The code from that notebook is nevertheless reproduced here for ease of access.</p>

<p style="color:red">
    TODO: This is the bulk. If time permits, include some individual cells of the experimentation from the playground 
    (showcasing how the library works on small examples). 
    <br>
    Then, include the code implementations below (just for model training and class definition). Experiments will be later.
</p>

In [None]:
import os
os.environ['FFMPEG_BINARY'] = 'ffmpeg'

# PyTorch dependencies
import torch
import torch.nn as nn
import torch.nn.functional as F

# Equivariant library
import e2cnn
from e2cnn import gspaces

# Other dependencies 
import numpy as np
import matplotlib.pyplot as plt
import tqdm
import math
import PIL.Image, PIL.ImageDraw
import time

# Notebook dependencies
from IPython.display import clear_output, Image

# Import helper functions from the personal utils file. The module is reloaded
# first so that edits to utils.py are picked up without restarting the kernel
# (remove the reload before submission).
import importlib
import utils
importlib.reload(utils)
from utils import load_emoji, imshow, get_living_mask, to_rgba, visualize_batch,\
    plot_loss, create_filename, save_ca_model, load_ca_model, save_loss_log, load_loss_log, simulate_model


# Get access to GPU
device_id = 0
device = torch.device(f'cuda:{device_id}' if torch.cuda.is_available() else 'cpu')
print('device is {}'.format(device))