## Example Notebook 

### Requirements

- nnfabrik, neuralpredictors, and nnvision from the sinzlab repositories have to be installed or cloned.

- We use the PyTorch Docker image `sinzlab/pytorch:v3.8-torch1.4.0-cuda10.1-dj0.12.4`.
    - The Docker image can be found here: https://github.com/sinzlab/pytorch-docker or https://hub.docker.com/r/sinzlab/pytorch/dockerfile
    - The complete list of packages to be installed can be found there.
    - torch >= 1.4 is required.

- All individual pickle files and the image pickle file have to be present. They can be found on the GPU server under /var/lib/nova/sinz-shared/data


# Imports

Make sure to install all required packages. The dependencies are listed in the Dockerfile linked above. If necessary, install missing packages within the environment.


In [None]:
import datajoint as dj
import os

dj.config['enable_python_native_blobs'] = True
dj.config['nnfabrik.schema_name'] = "mburg_nnvision_monkey_demo"

dj.config["display.limit"] = 50
# set external store based on env vars
if "stores" not in dj.config:
    dj.config["stores"] = {}
dj.config["stores"]["minio"] = {  # store in s3
    "protocol": "s3",
    "endpoint": os.environ.get("MINIO_ENDPOINT", "DUMMY_ENDPOINT"),
    "bucket": "nnfabrik",
    "location": "dj-store",
    "access_key": os.environ.get("MINIO_ACCESS_KEY", "FAKEKEY"),
    "secret_key": os.environ.get("MINIO_SECRET_KEY", "FAKEKEY"),
}

from nnfabrik.main import my_nnfabrik
my_nnf = my_nnfabrik("mburg_nnvision_monkey_demo", use_common_fabrikant=False, use_common_seed=False)

from nnfabrik.utility.dj_helpers import CustomSchema
schema = CustomSchema(dj.config.get("nnfabrik.schema_name", "mburg_nnvision_monkey_demo"))

from nnfabrik.templates.trained_model import TrainedModelBase
from nnfabrik.main import *

from os import listdir
from os.path import isfile, join

import matplotlib.pyplot as plt
import numpy as np

from nnfabrik import builder

# NNfabrik intro: Using the builder to build the dataloader objects, models, trainer

In [None]:
# here's where the data is on the server:
os.listdir('/data')

In [None]:
#### loading monkey data

basepath = '/data/monkey/toliaslab/CSRF19_V1'
neuronal_data_path = os.path.join(basepath, 'neuronal_data/')
neuronal_data_files = [neuronal_data_path+f for f in listdir(neuronal_data_path) if isfile(join(neuronal_data_path, f))]
image_file = os.path.join(basepath, 'images/CSRF19_V1_images.pickle')
image_cache_path = os.path.join(basepath, 'images/individual')


In [None]:
# Specifying the dataset function: it is defined in nnvision/datasets and has to be present in the __init__.py there.
dataset_fn = 'nnvision.datasets.monkey_static_loader'
dataset_config = dict(dataset='CSRF19_V1',
                      neuronal_data_files=neuronal_data_files,
                      image_cache_path=image_cache_path,
                      crop=102,
                      subsample=1,
                      seed=1000,
                      time_bins_sum=6,
                      batch_size=128)


In [None]:
dataloaders = builder.get_data(dataset_fn, dataset_config)

## Dataloaders

NNfabrik expects dataloaders to be nested dictionaries with the actual PyTorch DataLoader objects at the second tier. The first tier has the keys "train", "validation", and "test". The second tier maps each session key to its DataLoader. Each dataset thus comprises one or more sessions, with the session ID as the dictionary key of the corresponding dataloader.
Let's have a look:

In [None]:
dataloaders
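To make the two-tier layout concrete, here is a minimal sketch that uses plain Python lists as stand-ins for the PyTorch DataLoader objects. The session IDs and batch counts are hypothetical and only illustrate how the nested dictionary can be traversed.

```python
# Stand-in for the nested structure: plain lists replace torch DataLoader objects.
# Session IDs ("session_A", "session_B") and batch counts are hypothetical.
dataloaders_sketch = {
    "train": {
        "session_A": [("images", "responses")] * 3,
        "session_B": [("images", "responses")] * 2,
    },
    "validation": {
        "session_A": [("images", "responses")],
        "session_B": [("images", "responses")],
    },
    "test": {
        "session_A": [("images", "responses")],
        "session_B": [("images", "responses")],
    },
}


def summarize(loaders):
    """Return {tier: {session_key: number_of_batches}} for a nested loader dict."""
    return {
        tier: {session: len(loader) for session, loader in sessions.items()}
        for tier, sessions in loaders.items()
    }


summary = summarize(dataloaders_sketch)
for tier, sessions in summary.items():
    print(tier, sessions)
```

The same double loop (first over tiers, then over session keys) should work on the real `dataloaders` object, since `len()` of a DataLoader gives its number of batches.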

In [None]:
# Here's an example image (the first one in the training set). The images are cropped to be 20x20, so training is fast for demo purposes.
some_image = dataloaders["train"][list(dataloaders["train"].keys())[0]].dataset[:].inputs[0, 0].cpu().numpy()
plt.imshow(some_image, cmap='gray')

In [None]:
# get first data_key
first_session_ID = list((dataloaders["train"].keys()))[0]


In [None]:
a_dataloader = dataloaders["train"][first_session_ID]
a_dataloader

In [None]:
inputs, targets = next(iter(a_dataloader))
print("input batch shape (batch x channels x height x width):", inputs.shape)
print("target batch shape (batch x neurons of that session):", targets.shape)
print("total training batches:", len(a_dataloader))

In [None]:
# input image dimensions
input_shape = dataloaders["train"][first_session_ID].dataset[:].inputs.shape
print(input_shape)
# total images = 16064
# dims: N x C x H x W (PyTorch convention)

In [None]:
# output dimensions: neuronal firing rates
output_shape = dataloaders["train"][first_session_ID].dataset[:].targets.shape

print(output_shape)
# Output: 14 neurons, with N spikes over 60 ms

## Model Building

Models are built using the neuralpredictors repo from sinzlab. They consist of a convolutional core (with a user-specified number of layers) and a readout (spatial transformer readout, described in Sinz et al., 2018, NeurIPS).

### Building a model

In [None]:

model_fn = 'nnvision.models.se_core_full_gauss_readout'
model_config = {'pad_input': False,
                'stack': -1,
                'depth_separable': True,
                'input_kern': 20,
                'gamma_input': 11.2,
                'gamma_readout': 0.33,
                'hidden_dilation': 1,
                'hidden_kern': 5,
                'n_se_blocks': 0,
                'hidden_channels': 32}
model = builder.get_model(model_fn, model_config, dataloaders=dataloaders, seed=1000)
print(model)

Above is the model description. Each session has its own readout. The readout learns an x/y position between -1 and 1, relative to image space, and reads out from that point in feature space. That means that the effective receptive field size of a unit in the last hidden layer will also be the receptive field size of the neuron. The x/y coordinates are stored in the `grid` attribute of each readout (shown further below). The next cell shows the shape of the first-layer convolutional weights:


In [None]:
model.core.features.layer0.conv.weight.data.shape

#### Show example readout positions after initialization

In [None]:
# x/y position (after random initialization, because the model isn't trained yet) for each neuron in that session
model.readout[first_session_ID].grid
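The grid holds positions in normalized coordinates between -1 and 1. As a rough sketch of how such a coordinate relates to a location in a feature map, the helper below follows the coordinate convention of `torch.nn.functional.grid_sample`. Whether the readout samples with `align_corners=True` or `False` is not stated in this notebook, so both variants are shown. The function name `grid_to_pixel` and the feature map width of 10 are hypothetical.

```python
def grid_to_pixel(coord, size, align_corners=True):
    """Map a normalized coordinate in [-1, 1] to a fractional pixel index.

    Follows the convention of torch.nn.functional.grid_sample:
    - align_corners=True:  -1 and 1 refer to the centers of the corner pixels
    - align_corners=False: -1 and 1 refer to the outer edges of the corner pixels
    """
    if align_corners:
        return (coord + 1) / 2 * (size - 1)
    return ((coord + 1) * size - 1) / 2


feature_map_width = 10  # hypothetical size, for illustration only

left = grid_to_pixel(-1.0, feature_map_width)
center = grid_to_pixel(0.0, feature_map_width)
right = grid_to_pixel(1.0, feature_map_width)
print(left, center, right)

left_edge = grid_to_pixel(-1.0, feature_map_width, align_corners=False)
print(left_edge)
```

A fractional index such as 4.5 means the readout interpolates between neighboring feature map locations rather than picking a single pixel.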

## Building a Trainer

The trainer takes care of the whole training process. Once the trainer is built, it is a function with the configuration already initialized.

In [None]:
trainer_fn = 'nnvision.training.nnvision_trainer'
trainer_config = dict(max_iter=1, 
                      lr_decay_steps=4, 
                      tolerance=0.0005, 
                      patience=5,
                      verbose=False, 
                      lr_init=0.003,
                      avg_loss=False,
                      device='cuda')

trainer = builder.get_trainer(trainer_fn, trainer_config)
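The idea of "a function with the configuration already initialized" can be sketched with `functools.partial`. This is an illustration of the concept under stated assumptions, not a copy of the builder's implementation: `toy_trainer` and its arguments are hypothetical, and only the return signature (score, output, model state) mirrors how the trainer is called in this notebook.

```python
from functools import partial


def toy_trainer(model, dataloaders, seed, max_iter=10, lr_init=0.01):
    """Hypothetical trainer that returns (score, output, model_state)."""
    # A real trainer would run the optimization loop here.
    output = {"max_iter": max_iter, "lr_init": lr_init, "seed": seed}
    score = 0.0
    model_state = {"weights": model}
    return score, output, model_state


# Bind the configuration once ...
toy_config = dict(max_iter=1, lr_init=0.003)
trainer_sketch = partial(toy_trainer, **toy_config)

# ... and later call the result with only model, dataloaders, and seed.
toy_score, toy_output, toy_state = trainer_sketch(model="toy-model", dataloaders={}, seed=1000)
print(toy_output)
```

Binding the configuration up front means the training call needs only the objects that change between runs: the model, the dataloaders, and the seed.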

## Train a model

In [None]:
score, output, model_state = trainer(model=model, dataloaders=dataloaders, seed=1000)

# Part Two: NNfabrik and DataJoint

Instead of using the builder to get the data, model, and trainer, we can use DataJoint to manage that process for us.
There are Model, Dataset, and Trainer tables, and each combination of entries in those tables should in principle lead to a fully trained model.
For completeness, there is also a Seed table that stores the random seed, and a Fabrikant table that stores the name and contact details of the creator (= Fabrikant).
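To illustrate what "each combination" means, here is a small sketch in plain Python. The hash values are hypothetical placeholders, and the field name `dataset_hash` is assumed by analogy with `model_hash` and `trainer_hash`. DataJoint handles this bookkeeping itself, so the code only mirrors the idea.

```python
from itertools import product

# Hypothetical table contents: one dataset, two models, one trainer, one seed.
dataset_hashes = ["d1"]
model_hashes = ["m1", "m2"]
trainer_hashes = ["t1"]
seeds = [1000]

# Every combination of entries is one candidate for a trained model.
pending_keys = [
    dict(dataset_hash=d, model_hash=m, trainer_hash=t, seed=s)
    for d, m, t, s in product(dataset_hashes, model_hashes, trainer_hashes, seeds)
]

for key in pending_keys:
    print(key)
```

With two model entries and one entry in each of the other tables, two trained models would be expected.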


In [None]:
my_nnf.Fabrikant()

In [None]:
# change this entry to reflect your DataJoint username
my_nnf.Fabrikant().insert1(dict(fabrikant_name='mburg',
                                email="first.last@uni-goettingen.de",
                                affiliation='eckerlab',
                                dj_username="mburg"))

In [None]:
my_nnf.Fabrikant()

In [None]:
my_nnf.Seed().insert([{'seed':1000}])
my_nnf.Seed()

### Add entries for dataset, model, and trainer, with their corresponding configurations

#### Dataset

In [None]:
# adds the dataset function and dataset config that we defined above to the Dataset table
my_nnf.Dataset().add_entry(dataset_fn, dataset_config, dataset_comment='CSRF_V1', dataset_fabrikant='mburg')
my_nnf.Dataset()

#### Model

In [None]:
model_fn = 'nnvision.models.se_core_full_gauss_readout'
model_config = {'pad_input': False,
                'stack': -1,
                'depth_separable': True,
                'input_kern': 20,
                'gamma_input': 11.2,
                'gamma_readout': 0.33,
                'hidden_dilation': 1,
                'hidden_kern': 5,
                'n_se_blocks': 0,
                'hidden_channels': 32,
                'gauss_type': 'isotropic'}
my_nnf.Model().add_entry(model_fn, model_config, model_comment='isotropic', model_fabrikant='mburg')
my_nnf.Model()

#### Trainer

In [None]:
# Caution: this deletes all existing entries from the Trainer table
my_nnf.Trainer.delete()

In [None]:
trainer_fn = 'nnvision.training.nnvision_trainer'
trainer_config = dict(max_iter=2, 
                      lr_decay_steps=4, 
                      tolerance=0.0005, 
                      patience=5,
                      verbose=False, 
                      lr_init=0.0045,
                      avg_loss=False,
                      device='cuda')

my_nnf.Trainer().add_entry(trainer_fn, trainer_config, trainer_comment="max_iter: 2", trainer_fabrikant='mburg')
my_nnf.Trainer()

In [None]:
# restrict the Trainer table by hash (the hash depends on the trainer function and config)
my_nnf.Trainer & "trainer_hash='e225a557fe8039df3717ddbe4686caa9'"

#### The TrainedModel is a template, which can be found in nnfabrik.templates.trained_model

The TrainedModel table takes care of model training and stores the model state in a part table. For further analyses of the trained model, one can either overwrite the TrainedModel definition by inheriting from the base template class, or attach other tables to TrainedModel.

In [None]:
# creating the simplest TrainedModel class
@my_nnf.schema
class TrainedModel(TrainedModelBase):
    nnfabrik = my_nnf

In [None]:
TrainedModel()

As primary keys, it has the hashes of all the configurations, and it stores the score and the output (both of which are defined by the respective trainer).
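How a configuration can be turned into a hash that serves as a key is sketched below. This is an assumption-laden illustration rather than nnfabrik's own hashing routine: the helper `config_hash` is hypothetical, and serializing to JSON with sorted keys before applying MD5 is just one simple way to get a stable 32-character hex digest.

```python
import hashlib
import json


def config_hash(config):
    """Hypothetical helper: MD5 hex digest of the JSON-serialized config with sorted keys."""
    serialized = json.dumps(config, sort_keys=True)
    return hashlib.md5(serialized.encode("utf-8")).hexdigest()


hash_a = config_hash({"max_iter": 2, "lr_init": 0.0045})
hash_b = config_hash({"lr_init": 0.0045, "max_iter": 2})  # same content, different key order
hash_c = config_hash({"max_iter": 3, "lr_init": 0.0045})  # different content

print(hash_a == hash_b)  # identical configs give identical hashes
print(hash_a == hash_c)  # a changed value gives a different hash
```

Because the hash depends only on the content of the configuration, adding the same configuration twice maps to the same key, while any changed parameter produces a new entry.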

## Let's populate

## Plot graph of all tables in schema

This is mainly intended as a workaround for a bug in DataJoint versions 0.12.9 and 0.13.0: https://github.com/datajoint/datajoint-python/issues/902

In [None]:
dj.ERD(schema)

In [None]:
TrainedModel().populate(display_progress=True)

In [None]:
# fetch1 requires the table to contain exactly one entry
model_hash = TrainedModel().fetch1("model_hash")

In [None]:
# If you want to build the model again, you can use the .load_model() function of the TrainedModel table.
# To use load_model, the table needs to be restricted to a single entry,
# for example by restricting with a key:
some_key = dict(model_hash=model_hash)
TrainedModel & some_key

# How to Load a Model

In [None]:
dataloaders, model = (TrainedModel & some_key).load_model()

In [None]:
# This is the trained model, with the state dict already loaded. Let's set it to eval mode and start using it.
model.eval()

# Parameter Extension

In [None]:
# There's also the parameter extension, so that you can restrict with the config objects as well.

In [None]:
from nnfabrik.utility.dj_helpers import create_param_expansion, make_definition
# Model comes from the wildcard import of nnfabrik.main above
ModelExpanded = create_param_expansion('nnvision.models.se_core_full_gauss_readout',
                                       Model,
                                       fn_field='model_fn',
                                       config_field='model_config')
ModelParams = schema(ModelExpanded)

In [None]:
ModelParams()

In [None]:
ModelParams.populate()

In [None]:
ModelParams()

In [None]:
# for example:
TrainedModel * ModelParams & "hidden_kern=5"

In [None]:
# Now you can just use that for building the model:
dataloaders, model = (TrainedModel & (ModelParams & "hidden_kern=5")).load_model()
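The idea behind the parameter extension can be sketched in plain Python: the config dictionary stored with each model is expanded into one column per parameter, so that entries can be filtered by parameter value. The rows, hash values, and helper name `expand_params` below are hypothetical; the real tables are created and populated by the nnfabrik helpers used above.

```python
# Hypothetical stand-in for the Model table: each row stores a hash and a config dict.
model_table = [
    {"model_hash": "aaa", "model_config": {"hidden_kern": 5, "hidden_channels": 32}},
    {"model_hash": "bbb", "model_config": {"hidden_kern": 3, "hidden_channels": 32}},
]


def expand_params(rows):
    """Flatten each row so that every config parameter becomes its own column."""
    return [{"model_hash": row["model_hash"], **row["model_config"]} for row in rows]


expanded = expand_params(model_table)

# Analogous to restricting with "hidden_kern=5"
matches = [row["model_hash"] for row in expanded if row["hidden_kern"] == 5]
print(matches)
```

Once the parameters are columns, selecting models by any config value becomes a simple restriction instead of a search through stored config blobs.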