# Tutorial 3: How simulations define your predictions
The inverse problem is ill-posed and therefore has no unique solution. To solve it, we need to constrain the space of possible solutions. Inverse solutions such as minimum-norm estimates impose an explicit minimum-energy constraint, whereas the constraints in esinet are implicit and are shaped mostly by the simulations used for training.

This tutorial explores the relation between simulation parameters and predictions.
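To make the explicit minimum-energy constraint concrete, here is a minimal NumPy sketch of a Tikhonov-regularized minimum-norm estimate. It is not part of esinet: the leadfield is a random stand-in rather than a real forward model, the helper `minimum_norm` and the regularization value `lam` are hypothetical, and the matrix sizes (61 channels, 1284 sources) are simply borrowed from the outputs shown later in this tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes borrowed from this tutorial's outputs; the leadfield itself is a
# random stand-in (assumption), not a real forward model.
n_channels, n_sources = 61, 1284
leadfield = rng.standard_normal((n_channels, n_sources))

# Ground truth: a handful of active sources and the noiseless EEG they produce
x_true = np.zeros(n_sources)
x_true[rng.choice(n_sources, size=5, replace=False)] = 1.0
y = leadfield @ x_true


def minimum_norm(leadfield, y, lam=1e-6):
    """Hypothetical helper: x = L.T @ inv(L @ L.T + reg * I) @ y."""
    gram = leadfield @ leadfield.T
    # Scale the regularization relative to the average eigenvalue of the Gram matrix
    reg = lam * np.trace(gram) / gram.shape[0]
    return leadfield.T @ np.linalg.solve(gram + reg * np.eye(gram.shape[0]), y)


x_mne = minimum_norm(leadfield, y)

# Both x_true and x_mne explain the data, but x_mne does so with less energy
rel_residual = np.linalg.norm(leadfield @ x_mne - y) / np.linalg.norm(y)
print("relative residual:", rel_residual)
print("energy of estimate vs. truth:", np.linalg.norm(x_mne), np.linalg.norm(x_true))
```

The estimate reproduces the EEG almost perfectly while looking nothing like the sparse ground truth. The constraint, not the data, decides which of the many valid solutions is returned.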

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

# import mne
# import numpy as np
# from copy import deepcopy
# import matplotlib.pyplot as plt
import sys; sys.path.insert(0, '../')
from esinet import util
from esinet import Simulation
from esinet import Net
from esinet.forward import create_forward_model, get_info
plot_params = dict(surface='white', hemi='both', verbose=0)

## Create Forward model
First, we create a template forward model, which comes with the esinet package.

In [2]:
info = get_info(sfreq=100)
fwd = create_forward_model(info=info)

[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    1.5s remaining:    1.5s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    1.5s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    1.5s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    0.1s remaining:    0.1s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.1s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.1s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    0.2s remaining:    0.2s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.2s finished


## Simulate
Next, we simulate two types of data:
1. Data containing small sources, 15-25 mm in diameter.
2. Data containing large sources, 35-45 mm in diameter.

Note that for publication-ready inverse solutions you should increase the number of training samples to 100,000.

In [4]:
n_samples = 10000
settings_small = dict(number_of_sources=(1, 10), extents=(15, 25))
settings_large = dict(number_of_sources=(1, 10), extents=(35, 45))

sim_small = Simulation(fwd, info, settings=settings_small).simulate(n_samples=n_samples)
sim_large = Simulation(fwd, info, settings=settings_large).simulate(n_samples=n_samples)


Simulating data based on sparse patches.


100%|██████████| 10000/10000 [01:01<00:00, 162.58it/s]
100%|██████████| 10000/10000 [00:01<00:00, 6025.52it/s]


source data shape:  (1284, 100) (1284, 100)


100%|██████████| 10000/10000 [02:34<00:00, 64.71it/s]


Simulating data based on sparse patches.


100%|██████████| 10000/10000 [01:35<00:00, 104.87it/s]
100%|██████████| 10000/10000 [00:05<00:00, 1837.24it/s]


source data shape:  (1284, 100) (1284, 100)


100%|██████████| 10000/10000 [03:53<00:00, 42.88it/s]


## Let's visualize the two types of simulations
The two brain plots should look quite different: one contains large, extended sources, while the other contains small, focal sources.

In [5]:
idx = 0
brain = sim_small.source_data[idx].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Small sources', 'title',
               font_size=14)

brain = sim_large.source_data[idx].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Large sources', 'title',
               font_size=14)

## Train individual neural networks

In [6]:
model_type = 'LSTM'  # 'ConvDip' is another option
net_small = Net(fwd, verbose=True, model_type=model_type).fit(sim_small, epochs=10)
net_large = Net(fwd, verbose=True, model_type=model_type).fit(sim_large, epochs=10)

preprocess data
werks3
Model: "Contextualizer"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (InputLayer)             [(None, None, 61)]   0           []                               
                                                                                                  
 FC1 (TimeDistributed)          (None, None, 200)    12400       ['Input[0][0]']                  
                                                                                                  
 dropout (Dropout)              (None, None, 200)    0           ['FC1[0][0]']                    
                                                                                                  
 LSTM1 (Bidirectional)          (None, None, 64)     44928       ['dropout[0][0]']                
                                                              

We have now simulated two different types of source and EEG data and built two neural networks, each trained on one of these simulations. Let's see how they perform within their own simulation type.

In [None]:
# Simulate some new, unseen test data    
n_test_samples = 2
sim_test_small = Simulation(fwd, info, settings=settings_small).simulate(n_samples=n_test_samples)
sim_test_large = Simulation(fwd, info, settings=settings_large).simulate(n_samples=n_test_samples)


brain = sim_test_small.source_data[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of small data', 'title',
               font_size=14)


brain = net_small.predict(sim_test_small)[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Small-Net on small data', 'title',
               font_size=14)



brain = sim_test_large.source_data[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of large data', 'title',
               font_size=14)


brain = net_large.predict(sim_test_large)[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Large-Net on large data', 'title',
               font_size=14)

Now we will use the large-net to predict the small simulation and vice versa.

In [None]:
brain = sim_test_small.source_data[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of small data', 'title',
               font_size=14)


brain = net_large.predict(sim_test_small)[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Large-Net on small data', 'title',
               font_size=14)



brain = sim_test_large.source_data[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of large data', 'title',
               font_size=14)


brain = net_small.predict(sim_test_large)[0].plot(**plot_params)
brain.add_text(0.1, 0.9, 'Small-Net on large data', 'title',
               font_size=14)

We now find that the Net trained on large sources tends to find large sources, even when confronted with data in which small sources were active.

Conversely, the Net trained on small sources finds sparse sources, even when confronted with data containing large-source activity.

This demonstrates that the simulation settings function like priors. It also emphasizes how important it is to state your priors and to motivate your choice.
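The visual comparison of the brain plots can be backed up with a number. The following is a rough sketch, not an esinet function: the helper `active_fraction` and its threshold of 50% of the maximum are hypothetical choices, and the two arrays are toy data standing in for a focal and an extended prediction.

```python
import numpy as np


def active_fraction(source_data, rel_threshold=0.5):
    """Hypothetical helper: fraction of vertices whose peak absolute amplitude
    reaches at least rel_threshold times the global maximum."""
    data = np.abs(np.asarray(source_data, dtype=float))
    # Collapse the time axis (if present) to one peak value per vertex
    peak = data.max(axis=-1) if data.ndim > 1 else data
    max_val = peak.max()
    if max_val == 0:
        return 0.0
    return float(np.mean(peak >= rel_threshold * max_val))


# Toy arrays (vertices x time points) standing in for two predictions
n_vertices, n_times = 1284, 100
focal = np.zeros((n_vertices, n_times))
focal[:10] = 1.0  # 10 active vertices
extended = np.zeros((n_vertices, n_times))
extended[:200] = 1.0  # 200 active vertices

print(active_fraction(focal), active_fraction(extended))
```

With the objects from this tutorial, one could pass the data array of a prediction instead of the toy arrays, for example `net_large.predict(sim_test_small)[0].data`, assuming the prediction exposes a `.data` array in the way MNE source estimates do. A Net trained on large sources would then be expected to yield a larger fraction than one trained on small sources for the same input.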

In many cases we cannot justify a specific choice and want to build as few assumptions into our models as possible. In that case, we suggest using wide ranges in your settings to maximize the diversity of your training data.

An example of such a diverse setting is given in the next cell:

In [None]:
settings = {
    'number_of_sources': (1, 20),  # The range of simultaneously active sources.
    'extents': (1, 50),  # The range of source diameters in mm 
    'amplitudes': (1, 100),  # Defines the range of amplitudes (in arbitrary units)
    'shapes': 'both',  # Simulate both gaussian-shaped and flat sources
    'beta': (0, 3),  # Defines the distribution of the noise in terms of 1/f**beta
}
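The `beta` entry above describes noise with a 1/f**beta power spectrum. As an illustration of what that means, here is a minimal NumPy sketch that shapes white noise in the frequency domain. The function `colored_noise` is a hypothetical helper written for this example; esinet's internal noise generator may be implemented differently.

```python
import numpy as np


def colored_noise(n_times, beta, rng):
    """Hypothetical helper: noise whose power spectrum falls off as 1/f**beta."""
    freqs = np.fft.rfftfreq(n_times)
    spectrum = np.fft.rfft(rng.standard_normal(n_times))
    scale = np.zeros_like(freqs)  # the DC component (index 0) stays zero
    # Amplitude ~ f**(-beta/2) gives power ~ f**(-beta)
    scale[1:] = freqs[1:] ** (-beta / 2.0)
    noise = np.fft.irfft(spectrum * scale, n=n_times)
    return noise / noise.std()  # normalize to unit standard deviation


rng = np.random.default_rng(0)
white = colored_noise(1000, beta=0, rng=rng)  # flat spectrum
brown = colored_noise(1000, beta=2, rng=rng)  # dominated by slow fluctuations
```

A `beta` of 0 yields white noise, while larger values concentrate more of the power at low frequencies. Drawing `beta` from a range such as `(0, 3)` therefore exposes the network to many different noise colors during training.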