# How we evaluate the competition entries

### This notebook shows how the competition scores on our website are calculated

Here, we illustrate in detail how the submission file is used to compute the competition scores. We do this with one of the datasets that is not part of the competition (i.e. one of the "pre-training" datasets). All "pre-training" datasets also have a test set (the `live_test` set), and the responses of all neurons to these test set images are included. You can therefore test your model by training on one or all of the "pre-training" recordings and checking how it performs on the test set.

We use one of these datasets to show how we extract the test set responses into a `ground_truth` file, and how we use the submitted file and the ground truth file to calculate the scores.

In detail, this notebook includes these steps:
- we first load a pretrained model
- we select a pre-training dataset, which is not part of the competition, and treat its "test" set as if it were part of the competition track
- we then walk through the complete process: how the ground truth responses are extracted, and how the scores are calculated from the ground truth and the submitted responses

### Imports

In [5]:
import collections.abc
# hyper needs the following four aliases to be set manually.
collections.Iterable = collections.abc.Iterable
collections.Mapping = collections.abc.Mapping
collections.MutableSet = collections.abc.MutableSet
collections.MutableMapping = collections.abc.MutableMapping

In [6]:
import torch
import numpy as np
import pandas as pd
import os

import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

from nnfabrik.builder import get_data, get_model, get_trainer

In [7]:
%cd ../../

c:\Users\hp\sensorium


## Get dataloader and model

In [8]:
#get dataloader
basepath = "notebooks/data/"
filenames = [os.path.join(basepath, file) for file in os.listdir(basepath) if ".zip" in file ]

dataset_fn = 'sensorium.datasets.static_loaders'
dataset_config = {'paths': filenames,
                 'normalize': True,
                 'include_behavior': False,
                 'include_eye_position': False,
                 'batch_size': 128,
                 'scale':.25,
                 }

dataloaders = get_data(dataset_fn, dataset_config)

# get model
model_fn = 'sensorium.models.stacked_core_full_gauss_readout'
model_config = {'pad_input': False,
  'layers': 4,
  'input_kern': 9,
  'gamma_input': 6.3831,
  'gamma_readout': 0.0076,
  'hidden_kern': 7,
  'hidden_channels': 64,
  'depth_separable': True,
  'grid_mean_predictor': {'type': 'cortex',
   'input_dimensions': 2,
   'hidden_layers': 1,
   'hidden_features': 30,
   'final_tanh': True},
  'init_sigma': 0.1,
  'init_mu_range': 0.3,
  'gauss_type': 'full',
  'shifter': False,
  'stack': -1,
}

model = get_model(model_fn=model_fn,
                  model_config=model_config,
                  dataloaders=dataloaders,
                  seed=42,)

# load model weights
model.load_state_dict(torch.load("notebooks/model_tutorial/model_checkpoints/pretrained/generalization_model.pth"));
model.eval();

---

# How we calculate the competition scores behind the scenes

- in the actual competition, we withhold the ground truth neuronal responses to the test set images
- here we show
  - how we extract the ground truth responses from the demo dataset (where the test set responses are present)
  - how the metrics of the competition are calculated

The following steps are necessary:
1. pick a dataset
2. generate a file that contains the ground truth responses to the test set
3. generate a submission file that contains the predictions
4. calculate the performance metrics based on these two files

# !! Important !!

Our ground truth file stores **standardized responses**, meaning the responses of each neuron are normalized by that neuron's own STD. Our dataloader automatically normalizes the images and responses, and we encourage you to use our DataLoader and our submission API.
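
To make "normalized by its own STD" concrete, here is a minimal NumPy sketch. It is an illustration only and not the dataloader's actual code: the function name `standardize_responses`, the `eps` guard, and the toy data are assumptions introduced for this example.

```python
import numpy as np

def standardize_responses(responses, eps=1e-8):
    """Divide each neuron's responses by that neuron's standard deviation.

    responses: array of shape (n_trials, n_neurons).
    eps is a hypothetical guard against division by zero for silent neurons.
    The mean is deliberately not subtracted; only the scale changes.
    """
    std = responses.std(axis=0, keepdims=True)
    return responses / np.maximum(std, eps)

# Toy data: three "neurons" with very different response scales.
rng = np.random.default_rng(0)
raw = rng.normal(size=(1000, 3)) * np.array([1.0, 5.0, 0.2]) + 3.0
standardized = standardize_responses(raw)

print(raw.std(axis=0))           # very different scales per neuron
print(standardized.std(axis=0))  # all neurons on a comparable scale
```

If you predict responses with your own pipeline instead of the provided DataLoader, your predictions need to be on this same standardized scale before they are compared with the ground truth.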

In [3]:
# import the API from the competition repo
from sensorium.utility import submission

### 1) Example Dataset: '21067-10-18' from the pre-training recordings

In [4]:
filename = ['../data/static21067-10-18-GrayImageNet-94c6ff995dac583098847cfecd43e7b6.zip',]

dataset_name = "21067-10-18"

### 2) Generate Ground Truth File

In [16]:
# we load the dataset which contains the held-out "test" responses, and save them in the .csv format
# for the demo dataset that we provide here, these "test" responses are present

submission.generate_ground_truth_file(filename=filename,
                                      path='./ground_truth_files/',
                                      tier="test")

##### Inspect the Ground Truth File

In [11]:
# read the ground truth file from the directory it was written to in step 2
data = pd.read_csv('./ground_truth_files/ground_truth_file_test.csv')
len(data.loc[0].at["responses"])

### 3) Generate Submission file

In [7]:
# generate the submission file
submission.generate_submission_file(trained_model=model, 
                                    dataloaders=dataloaders,
                                    data_key=dataset_name,
                                    path='./submission_files/',
                                    device="cuda",
                                    tier="test")

Submission file saved for tier: live_test. Saved in: ./submission_files/submission_file_live_test.csv


##### Inspect content of submission file

In [8]:
import pandas as pd
pd.read_csv('./submission_files/submission_file_live_test.csv')

Unnamed: 0,trial_indices,image_ids,prediction,neuron_ids
0,126,2214,"[0.24674898386001587, 0.23912948369979858, 0.4...","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
1,297,2214,"[0.24674898386001587, 0.23912948369979858, 0.4...","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
2,597,2214,"[0.24674898386001587, 0.23912948369979858, 0.4...","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
3,852,2214,"[0.24674898386001587, 0.23912948369979858, 0.4...","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
4,908,2214,"[0.24674898386001587, 0.23912948369979858, 0.4...","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
...,...,...,...,...
993,2752,3487,"[0.040638625621795654, 0.11597681045532227, 0....","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
994,3039,3487,"[0.040638625621795654, 0.11597681045532227, 0....","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
995,4312,3487,"[0.040638625621795654, 0.11597681045532227, 0....","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
996,4380,3487,"[0.040638625621795654, 0.11597681045532227, 0....","[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1..."
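
The `prediction` and `neuron_ids` cells in the table above are lists stored as text, so they need to be parsed before any numeric work. The sketch below shows one way to do this with `pandas` converters and `ast.literal_eval`. The column names are taken from the output above; the in-memory toy file and its values are made up for illustration, and this is not necessarily how the `sensorium` package reads the file.

```python
import ast
import io

import numpy as np
import pandas as pd

# Hypothetical miniature submission file with the same columns as above.
toy_csv = io.StringIO(
    "trial_indices,image_ids,prediction,neuron_ids\n"
    '126,2214,"[0.25, 0.24, 0.40]","[1, 2, 4]"\n'
    '297,2214,"[0.25, 0.24, 0.40]","[1, 2, 4]"\n'
    '2752,3487,"[0.04, 0.12, 0.30]","[1, 2, 4]"\n'
)

# Parse the list-valued columns from their string form into Python lists.
df = pd.read_csv(
    toy_csv,
    converters={"prediction": ast.literal_eval, "neuron_ids": ast.literal_eval},
)

# Stack the per-trial lists into a (n_trials, n_neurons) array.
predictions = np.array(df["prediction"].tolist())
image_ids = df["image_ids"].to_numpy()

print(predictions.shape)
print(image_ids)
```

Each row is one trial. Rows that share an `image_id` are repeated presentations of the same image, which is why their predictions are identical in the output above.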


### 4) Evaluation

This is what happens in the backend of our competition website.

In [9]:
from sensorium import evaluate

In [10]:
# specify submission and ground truth file
ground_truth_file = './ground_truth_files/ground_truth_file_test.csv'
submission_file = './submission_files/submission_file_live_test.csv'

In [13]:
out = evaluate(submission_file, ground_truth_file)

In [14]:
print("Results for the SOTA model:")
for metric, value in out.items():
    print(f"{metric}: {np.round(value, 3)}")

Results for the SOTA model:
Single Trial Correlation: 0.286
Correlation to Average: 0.542
FEVE: 0.452
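
For intuition about what the three numbers mean, here is a hedged NumPy sketch of one common way to define them. It is not the code behind `evaluate`: the function names, the averaging over neurons, and the toy data are assumptions made for this example, and the official implementation may differ in details such as how neurons are filtered or averaged.

```python
import numpy as np

def pearson_per_neuron(a, b, eps=1e-12):
    """Pearson correlation along axis 0, one value per column (neuron)."""
    a = a - a.mean(axis=0, keepdims=True)
    b = b - b.mean(axis=0, keepdims=True)
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0)) + eps
    return num / den

def group_mean(values, image_ids):
    """Average all trials that share an image id: (n_images, n_neurons)."""
    ids = np.unique(image_ids)
    return np.stack([values[image_ids == i].mean(axis=0) for i in ids])

def single_trial_correlation(responses, predictions):
    """Correlate predictions with every single trial, then average neurons."""
    return pearson_per_neuron(responses, predictions).mean()

def correlation_to_average(responses, predictions, image_ids):
    """Correlate predictions with responses averaged over image repeats."""
    return pearson_per_neuron(
        group_mean(responses, image_ids), group_mean(predictions, image_ids)
    ).mean()

def feve(responses, predictions, image_ids):
    """Fraction of explainable variance explained, averaged over neurons.

    Noise variance is estimated from the variability across repeats of the
    same image and removed from both the error and the total variance.
    """
    ids = np.unique(image_ids)
    noise_var = np.mean(
        [responses[image_ids == i].var(axis=0, ddof=1) for i in ids], axis=0
    )
    total_var = responses.var(axis=0, ddof=1)
    mse = ((responses - predictions) ** 2).mean(axis=0)
    return (1 - (mse - noise_var) / (total_var - noise_var)).mean()

# Toy data: 20 images x 10 repeats x 5 neurons, with trial-to-trial noise.
rng = np.random.default_rng(0)
image_ids = np.repeat(np.arange(20), 10)
signal = rng.normal(size=(20, 5))
responses = signal[image_ids] + 0.5 * rng.normal(size=(200, 5))
predictions = signal[image_ids]  # a model that knows the noise-free signal

stc = single_trial_correlation(responses, predictions)
cta = correlation_to_average(responses, predictions, image_ids)
fe = feve(responses, predictions, image_ids)
print(stc, cta, fe)
```

The toy example shows why the single trial correlation is the lowest of the three scores: even a model that predicts the noise-free signal cannot predict trial-to-trial noise. Averaging over repeats, or correcting for the noise variance as FEVE does, removes that ceiling.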


#### These scores are calculated in the backend of our website
- we have two test sets, so these scores will be computed for both our **live** test set and our **final** test set
- the **live** scores will be published on the live leaderboard
- the **final** scores will stay hidden until we release them to the public on Oct 22, after checking them carefully
- the **final** scores will then determine the winner of the competition