# Demo containing all the steps from loading data to submission using the `hub` package

### This notebook shows how to
- use the cloud-based DataLoaders
- train the baseline models with these dataloaders
- make a submission

**Importantly**
 - In this approach, the data is served entirely over the cloud, so models can be trained without downloading the data first

In [2]:
import numpy as np

import torch
from torchvision import transforms
from sklearn.model_selection import train_test_split

import warnings
warnings.filterwarnings('ignore')

# hub is the package that hosts our data and gives cloud-based access to it, so nothing has to be downloaded
import hub

### 1. Load the train and test datasets and create the corresponding dataloader

- The competition data will cover 7 mice. For now, only toy data is provided over the cloud

In [3]:
dataset_id = 'mouse1'

# get the data from Activeloop
train_dataset_train = hub.load(f"hub://mohammadbashiri/npc-{dataset_id}-train")
train_dataset_val = hub.load(f"hub://mohammadbashiri/npc-{dataset_id}-train")
test_dataset = hub.load(f"hub://mohammadbashiri/npc-{dataset_id}-test")

# split the trainset into train and validation (i.e. modify the index of the corresponding dataset)
n_training_samples = len(train_dataset_train)
train_indices, val_indices = train_test_split(np.arange(n_training_samples), train_size=0.7)
train_samples_mask = np.isin(np.arange(n_training_samples), train_indices)
train_dataset_train.index.values[0].value = tuple(np.where(train_samples_mask)[0].tolist())
train_dataset_val.index.values[0].value = tuple(np.where(~train_samples_mask)[0].tolist())

# create the dataloaders
train_dataloader = train_dataset_train.pytorch(batch_size=16, shuffle=True, transform={'inputs': transforms.ToTensor(), 'targets': None, 'image_ids': None, 'trial_indices': None})
val_dataloader = train_dataset_val.pytorch(batch_size=16, shuffle=False, transform={'inputs': transforms.ToTensor(), 'targets': None, 'image_ids': None, 'trial_indices': None})
test_dataloader = test_dataset.pytorch(batch_size=16, shuffle=False, transform={'inputs': transforms.ToTensor(), 'image_ids': None, 'trial_indices': None})

# Combine the dataloaders into a single object (dict)
dataloaders = {"train": {dataset_id: train_dataloader},
               "validation": {dataset_id: val_dataloader},
               "test": {dataset_id: test_dataloader}}

Opening dataset in read-only mode as you don't have write permissions.
hub://mohammadbashiri/npc-mouse1-train loaded successfully.
This dataset can be visualized at https://app.activeloop.ai/mohammadbashiri/npc-mouse1-train.
Opening dataset in read-only mode as you don't have write permissions.
hub://mohammadbashiri/npc-mouse1-train loaded successfully.
This dataset can be visualized at https://app.activeloop.ai/mohammadbashiri/npc-mouse1-train.
Opening dataset in read-only mode as you don't have write permissions.
hub://mohammadbashiri/npc-mouse1-test loaded successfully.
This dataset can be visualized at https://app.activeloop.ai/mohammadbashiri/npc-mouse1-test.
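
The split in the cell above calls `train_test_split` without a `random_state`, so the train/validation partition changes between runs unless NumPy's global seed is set. The following is a minimal sketch of a reproducible 70/30 index split using only the standard library. `split_indices` and its parameters are hypothetical names for illustration and are not part of `hub` or `sensorium`.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Return sorted, disjoint (train, validation) index tuples (hypothetical helper)."""
    rng = random.Random(seed)  # seeded generator, so the split is repeatable
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * n_samples))
    train = tuple(sorted(indices[:n_train]))
    val = tuple(sorted(indices[n_train:]))
    return train, val


# Toy example with 10 samples: 7 go to training, 3 to validation.
train_idx, val_idx = split_indices(10, train_fraction=0.7, seed=1)
```

With the existing code, passing `random_state=1` to `train_test_split` should give the same kind of reproducibility without any other change.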


In [4]:
# dataloaders have the same behavior as in demo notebooks 1-3
dataloaders

{'train': {'mouse1': <torch.utils.data.dataloader.DataLoader at 0x7f33b2c5d610>},
 'validation': {'mouse1': <torch.utils.data.dataloader.DataLoader at 0x7f33b361e040>},
 'test': {'mouse1': <torch.utils.data.dataloader.DataLoader at 0x7f33b3367bb0>}}
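
The dictionary is nested by tier first and dataset ID second. As a small sketch of walking that structure, the snippet below counts batches per tier and dataset. Plain lists stand in for the `DataLoader` objects so it runs without `torch`, the batch counts are toy values, and `summarize_loaders` is a hypothetical helper. It assumes each loader supports `len()`.

```python
def summarize_loaders(loaders):
    """Map each (tier, dataset_id) pair to its number of batches (hypothetical helper)."""
    return {
        (tier, dataset_id): len(loader)
        for tier, per_dataset in loaders.items()
        for dataset_id, loader in per_dataset.items()
    }


# Plain lists stand in for DataLoader objects; the lengths are toy values.
toy_loaders = {
    "train": {"mouse1": [0] * 9},
    "validation": {"mouse1": [0] * 4},
    "test": {"mouse1": [0] * 7},
}
summary = summarize_loaders(toy_loaders)
```

The same nesting would extend to several mice by adding more dataset IDs under each tier.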

---

### 2. Initialize model

In [4]:
from sensorium.models import stacked_core_full_gauss_readout

model_config = {'pad_input': False,
                'stack': -1,
                'layers': 4,
                'input_kern': 9,
                'gamma_input': 6.3831,
                'gamma_readout': 0.0076,
                'hidden_dilation': 1,
                'hidden_kern': 7,
                'hidden_channels': 64,
                'depth_separable': True,
                'init_sigma': 0.1,
                'init_mu_range': 0.3,
                'gauss_type': 'full',
                }

model = stacked_core_full_gauss_readout(dataloaders, seed=1, **model_config)

### 3. Train the model

In [5]:
from sensorium.training import standard_trainer

In [6]:
# train for only 2 epochs (max_iter=2) to keep the demo short

trainer_config = {'max_iter': 2,
                  'verbose': False,
                  'lr_decay_steps': 4,
                  'avg_loss': False,
                  'lr_init': 0.009}

score, output, state_dict = standard_trainer(model, dataloaders, seed=1, **trainer_config)

Epoch 1: 100%|██████████| 9/9 [00:05<00:00,  1.65it/s]


[001|00/05] ---> 0.014943685638824901


Epoch 2: 100%|██████████| 9/9 [00:05<00:00,  1.62it/s]


[002|00/05] ---> 0.017162172343622446
Restoring best model! 0.017162 ---> 0.017162


---

### 4. Prepare the submission file

In [7]:
from sensorium.utility.submission import generate_submission_file

In [8]:
generate_submission_file(trained_model=model,
                         test_dataloader=dataloaders["test"][dataset_id],
                         path='./submission_files/hub_demo_submission_file.csv')

File saved.


In [9]:
import pandas as pd
pd.read_csv('./submission_files/hub_demo_submission_file.csv')

Unnamed: 0,trial_indices,image_ids,prediction,neuron_ids
0,0,0,"[0.1881781816482544, 0.24576568603515625, 0.20...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
1,1,0,"[0.2929092049598694, 0.30753374099731445, 0.26...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
2,2,0,"[0.2957261800765991, 0.34191060066223145, 0.22...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
3,3,0,"[0.29291629791259766, 0.3131585121154785, 0.27...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
4,4,0,"[0.3207061290740967, 0.3631242513656616, 0.226...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
...,...,...,...,...
95,95,9,"[0.17506706714630127, 0.20271086692810059, 0.2...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
96,96,9,"[0.29671674966812134, 0.3007974624633789, 0.21...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
97,97,9,"[0.25589656829833984, 0.2131522297859192, 0.13...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
98,98,9,"[0.2952677607536316, 0.3151891827583313, 0.175...","[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,..."
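
The preview above suggests that the `prediction` and `neuron_ids` columns come back from the CSV as strings rather than lists. Assuming that is the case, here is a minimal sketch of turning them back into Python lists with the standard library. The two-row CSV is a made-up stand-in for the real file, and its values are for illustration only.

```python
import ast
import csv
import io

# Toy two-row stand-in for the submission file (values are made up for illustration).
toy_csv = io.StringIO(
    'trial_indices,image_ids,prediction,neuron_ids\n'
    '0,0,"[0.19, 0.25, 0.20]","[0, 1, 2]"\n'
    '1,0,"[0.29, 0.31, 0.26]","[0, 1, 2]"\n'
)

rows = []
for record in csv.DictReader(toy_csv):
    rows.append({
        "trial_index": int(record["trial_indices"]),
        "image_id": int(record["image_ids"]),
        # literal_eval safely turns the string "[0.19, ...]" back into a Python list
        "prediction": ast.literal_eval(record["prediction"]),
        "neuron_ids": ast.literal_eval(record["neuron_ids"]),
    })

# Each row should carry exactly one prediction per neuron ID.
assert all(len(r["prediction"]) == len(r["neuron_ids"]) for r in rows)
```

A check like the final assertion can be a quick sanity test on a submission file before uploading it.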


### Ok wait, how do I submit an entry to the competition again?

Simply register a new user on http://sensorium2022.net/ with a valid email address and upload the .csv file. That's it 👍

---