# Models for ICLR 2019 paper

### A rotation-equivariant convolutional neural network model of primary visual cortex
*Alexander S. Ecker, Fabian H. Sinz, Emmanouil Froudarakis, Paul G. Fahey, Santiago A. Cadena, Edgar Y. Walker, Erick Cobos, Jacob Reimer, Andreas S. Tolias, Matthias Bethge*

https://openreview.net/forum?id=H1fU8iAqKX

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
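import os
import sys
import numpy as np
import tensorflow as tf
# Make the package importable: add the directory two levels above the
# current working directory to sys.path (if it is not already there).
p = !pwd
p = os.path.dirname(os.path.dirname(p[0]))
if p not in sys.path:
    sys.path.append(p)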

In [None]:
# TODO: remove dependency on database to load data
from cnn_sys_ident.mesonet.data import MultiDataset
# Select the dataset by its hash and load it from the database.
data_key = dict(data_hash='cfcd208495d565ef66e7dff9f98764da')
data = (MultiDataset() & data_key).load_data()

In [None]:
from cnn_sys_ident.architectures.models import BaseModel, CorePlusReadoutModel
from cnn_sys_ident.architectures.training import Trainer

## Control: Feature space generalizes to unseen neurons

To show that our network learns features common to V1 neurons, we excluded half of the neurons when fitting the convolutional core. We then fixed the rotation-equivariant core and trained only the readout (spatial mask and feature weights) for the held-out half of the neurons.
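
As a rough illustration of what "spatial mask and feature weights" means, the following NumPy sketch factorizes each neuron's readout into a spatial mask over the core's feature maps and a vector of feature weights. The function name, shapes, and random inputs are assumptions made for this example; the repository's readout class may differ (for instance in biases, output nonlinearity, or regularization).

```python
import numpy as np


def factorized_readout(core_output, masks, feature_weights):
    """Hypothetical factorized readout (illustration only).

    core_output:     (batch, height, width, features) feature maps from the core
    masks:           (neurons, height, width) one spatial mask per neuron
    feature_weights: (neurons, features) one weight vector per neuron
    Returns an array of shape (batch, neurons).
    """
    # Pool the feature maps over space with each neuron's mask.
    pooled = np.einsum('bhwf,nhw->bnf', core_output, masks)
    # Combine the pooled features with that neuron's feature weights.
    return np.einsum('bnf,nf->bn', pooled, feature_weights)


# Toy example with made-up sizes: 5 images, 8x8 feature maps, 16 features, 3 neurons.
rng = np.random.default_rng(0)
core_output = rng.normal(size=(5, 8, 8, 16))
masks = rng.normal(size=(3, 8, 8))
feature_weights = rng.normal(size=(3, 16))
responses = factorized_readout(core_output, masks, feature_weights)
```

Under this factorization, each neuron needs `height * width + features` readout parameters instead of `height * width * features`, which is why the core has to supply features that many neurons can share.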

In terms of implementation, we insert a `stop_gradient` between the convolutional core and the readout for half of the neurons, so the prediction errors of those neurons update only their own readout weights and never the core. This is done in the class for the readout (`SpatialXFeatureJointL1TransferReadout`).
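
The effect of the stop-gradient can be sketched without TensorFlow. The toy example below uses a linear core and a linear readout with hand-written gradients; it is not the repository's implementation, and the function name, shapes, and the choice of every second neuron as a transfer neuron are assumptions for illustration only. Errors of the transfer neurons are zeroed before they are propagated back to the core, but they still drive those neurons' readout weights.

```python
import numpy as np


def transfer_gradients(x, y, w_core, w_readout, is_transfer):
    """Gradients of 0.5 * sum of squared errors for a linear core and readout.

    is_transfer is a boolean vector marking neurons whose error must not
    reach the core, mimicking a stop-gradient between core and readout.
    """
    features = x @ w_core                  # (samples, features)
    err = features @ w_readout - y         # (samples, neurons)
    # Every neuron, transfer or not, trains its own readout weights.
    grad_readout = features.T @ err
    # Block the transfer neurons' errors before backpropagating to the core.
    err_to_core = np.where(is_transfer, 0.0, err)
    grad_core = x.T @ (err_to_core @ w_readout.T)
    return grad_core, grad_readout


# Toy data with made-up sizes: 20 samples, 6 inputs, 4 features, 10 neurons.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 6))
y = rng.normal(size=(20, 10))
w_core = rng.normal(size=(6, 4))
w_readout = rng.normal(size=(4, 10))
is_transfer = np.arange(10) % 2 == 1       # assumption: every second neuron

grad_core, grad_readout = transfer_gradients(x, y, w_core, w_readout, is_transfer)

# Changing the responses of the transfer neurons leaves the core gradient as is.
y_shifted = y.copy()
y_shifted[:, is_transfer] += 1.0
grad_core_shifted, grad_readout_shifted = transfer_gradients(
    x, y_shifted, w_core, w_readout, is_transfer)
```

In this sketch the core gradient is identical for `y` and `y_shifted`, while the readout gradients of the transfer neurons change, which is the behavior the control experiment relies on.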

In [None]:
from cnn_sys_ident.architectures.cores import StackedRotEquiHermiteConv2dCore
from cnn_sys_ident.architectures.readouts import SpatialXFeatureJointL1TransferReadout

In [None]:
base = BaseModel(
    data,
    log_dir='iclr2019-checkpoints-repro',
    log_hash='b8f78ead705cb02d09c01f9701067ba2'
)
core = StackedRotEquiHermiteConv2dCore(
    base,
    base.inputs,
    num_rotations=8,
    upsampling=2,
    shared_biases=False,
    filter_size=[13, 5, 5],
    num_filters=[16, 16, 16],
    stride=[1, 1, 1],
    rate=[1, 1, 1],
    padding=['SAME', 'SAME', 'SAME'],
    activation_fn=['soft', 'soft', 'none'],
    rel_smooth_weight=[1, 0.5, 0.5],
    rel_sparse_weight=[0, 1, 1],
    conv_smooth_weight=0.0112711,
    conv_sparse_weight=0.0492937,
)
readout = SpatialXFeatureJointL1TransferReadout(
    base,
    core.output,
    k_transfer=2,
    positive_feature_weights=False,
    readout_sparsity=0.020616,
    init_masks='rand',
)
model = CorePlusReadoutModel(base, core, readout)
trainer = Trainer(base, model)

In [None]:
iter_num, val_loss, test_corr = trainer.fit(
    val_steps=50, learning_rate=0.002, batch_size=256, patience=5)

In [None]:
trainer.compute_test_corr()