# Tutorial 3: How simulations define your predictions
The inverse problem is ill-posed and therefore has no unique solution. To solve it, we need to constrain the space of possible solutions. While inverse solutions such as minimum-norm estimates impose an explicit minimum-energy constraint, the constraints in esinet are implicit and are shaped mostly by the simulations used for training.

This tutorial explores the relation between simulation parameters and predictions.
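
To make the explicit minimum-energy constraint concrete, here is a minimal NumPy sketch of a Tikhonov-regularized minimum-norm estimate on a toy problem. The random `leadfield`, the dimensions, and the regularization value `lam` are illustrative assumptions and are unrelated to the esinet forward model used in this tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed for illustration): far fewer sensors than dipoles,
# which is what makes the inverse problem ill-posed.
n_sensors, n_dipoles = 8, 50
leadfield = rng.standard_normal((n_sensors, n_dipoles))

# Ground truth: a single active dipole.
j_true = np.zeros(n_dipoles)
j_true[10] = 1.0
y = leadfield @ j_true  # noiseless sensor measurement

# Minimum-norm estimate: among all source vectors that explain y,
# choose the one with the smallest L2 norm ("minimum energy").
# The small Tikhonov term lam keeps the inversion numerically stable.
lam = 1e-6
gram = leadfield @ leadfield.T
j_mne = leadfield.T @ np.linalg.solve(gram + lam * np.eye(n_sensors), y)

# Both vectors reproduce the measurement, but they are different solutions.
print("residual of j_true:", np.linalg.norm(leadfield @ j_true - y))
print("residual of j_mne: ", np.linalg.norm(leadfield @ j_mne - y))
print("norm of j_true:", np.linalg.norm(j_true))
print("norm of j_mne: ", np.linalg.norm(j_mne))
```

Both `j_true` and `j_mne` explain the same sensor data, which illustrates the non-uniqueness. The minimum-norm constraint selects the solution with the least energy, whether or not that matches the true source configuration.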

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

import mne
import numpy as np
from copy import deepcopy
import matplotlib.pyplot as plt
import sys; sys.path.insert(0, '../')
from esinet import util
from esinet import Simulation
from esinet import Net
from esinet.forward import create_forward_model, get_info
plot_params = dict(surface='white', hemi='both', verbose=0)


## Create Forward model
First, we create a template forward model, which ships with the esinet package.

In [13]:
info = get_info()
info['sfreq'] = 100
fwd = create_forward_model(info=info)



## Simulate
Next, we simulate two types of data:
1. Data containing small sources, 15-25 mm in diameter.
2. Data containing large sources, 35-45 mm in diameter.

In [9]:
n_samples = 10000
settings_small = dict(n_sources=(1, 10), extents=(15, 25))
settings_large = dict(n_sources=(1, 10), extents=(35, 45))

sim_small = Simulation(fwd, info, settings=settings_small).simulate(n_samples=n_samples)
sim_large = Simulation(fwd, info, settings=settings_large).simulate(n_samples=n_samples)



## Let's visualize the two types of simulations
The two brain plots should look quite different: one contains large, extended sources, while the other contains small, focal sources.

In [4]:
brain = sim_small.source_data.plot(**plot_params)
brain.add_text(0.1, 0.9, 'Small sources', 'title',
               font_size=14)

brain = sim_large.source_data.plot(**plot_params)
brain.add_text(0.1, 0.9, 'Large sources', 'title',
               font_size=14)

## Train individual neural networks

In [11]:
net_small = Net(fwd, verbose=True).fit(sim_small)
net_large = Net(fwd, verbose=True).fit(sim_large)

(61, 1000)
Model: "net_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 61, 4)             16080     
_________________________________________________________________
flatten_1 (Flatten)          (None, 244)               0         
_________________________________________________________________
dense_9 (Dense)              (None, 1284000)           314580000 
Total params: 314,596,080
Trainable params: 314,596,080
Non-trainable params: 0
_________________________________________________________________
(100, 1000, 61) (100, 1000, 1284)
Epoch 1/100


ValueError: in user code:

    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\keras\engine\training.py:855 train_function  *
        return step_function(self, iterator)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\keras\engine\training.py:845 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1285 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2833 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3608 _call_for_each_replica
        return fn(*args, **kwargs)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\keras\engine\training.py:838 run_step  **
        outputs = model.train_step(data)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\keras\engine\training.py:795 train_step
        y_pred = self(x, training=True)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:1013 __call__
        input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
    C:\Users\lukas\virtualenvs\esienv\lib\site-packages\tensorflow\python\keras\engine\input_spec.py:267 assert_input_compatibility
        raise ValueError('Input ' + str(input_index) +

    ValueError: Input 0 is incompatible with layer net_5: expected shape=(None, None, 1000), found shape=(None, 1000, 61)


We have now simulated two different types of source and EEG data and built two neural networks, each trained on one of these simulations. Let's see how they perform within their own simulation type.

In [8]:
# Simulate some new, unseen test data    
n_test_samples = 1
sim_test_small = Simulation(fwd, info, settings=settings_small).simulate(n_samples=n_test_samples)
sim_test_large = Simulation(fwd, info, settings=settings_large).simulate(n_samples=n_test_samples)


brain = sim_test_small.source_data.plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of small data', 'title',
               font_size=14)


brain = net_small.predict(sim_test_small).plot(**plot_params)
brain.add_text(0.1, 0.9, 'Small-Net on small data', 'title',
               font_size=14)



brain = sim_test_large.source_data.plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of large data', 'title',
               font_size=14)


brain = net_large.predict(sim_test_large).plot(**plot_params)
brain.add_text(0.1, 0.9, 'Large-Net on large data', 'title',
               font_size=14)

(1, 61, 1)


TypeError: super(type, obj): obj must be an instance or subtype of type

Now we will use the large-net to predict the small simulation and vice versa.

In [7]:
brain = sim_test_small.source_data.plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of small data', 'title',
               font_size=14)


brain = net_large.predict(sim_test_small).plot(**plot_params)
brain.add_text(0.1, 0.9, 'Large-Net on small data', 'title',
               font_size=14)



brain = sim_test_large.source_data.plot(**plot_params)
brain.add_text(0.1, 0.9, 'Ground Truth of large data', 'title',
               font_size=14)


brain = net_small.predict(sim_test_large).plot(**plot_params)
brain.add_text(0.1, 0.9, 'Small-Net on large data', 'title',
               font_size=14)

(1, 61, 1)
(1, 61, 1)


We now find that the Net trained on large simulations tends to find large sources, even when confronted with data in which small sources were active.

Conversely, the Net which was trained on simulations that contain small sources finds sparse sources when confronted with data containing large-source activity.
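
The visual impression described above can also be expressed as a number. The following sketch measures how spatially extended a source estimate is by counting the dipoles whose peak amplitude exceeds a fraction of the global maximum. The helper `active_fraction` and its `rel_threshold` parameter are hypothetical and not part of esinet. The function operates on a plain array of shape `(n_dipoles,)` or `(n_dipoles, n_times)`, such as the `.data` attribute of an `mne.SourceEstimate`, and the toy arrays below stand in for real predictions.

```python
import numpy as np

def active_fraction(source_data, rel_threshold=0.5):
    """Fraction of dipoles whose peak absolute amplitude reaches
    rel_threshold times the global maximum (hypothetical helper)."""
    data = np.abs(np.asarray(source_data, dtype=float))
    if data.ndim == 1:
        data = data[:, np.newaxis]  # treat a vector as a single time point
    peak = data.max(axis=1)         # peak amplitude per dipole
    if peak.max() == 0:
        return 0.0                  # no activity at all
    return float(np.mean(peak >= rel_threshold * peak.max()))

# Toy stand-ins for predictions on a source space with 1000 dipoles.
focal = np.zeros(1000)
focal[:10] = 1.0        # 10 active dipoles: a small source
extended = np.zeros(1000)
extended[:200] = 1.0    # 200 active dipoles: a large source

print("focal:   ", active_fraction(focal))
print("extended:", active_fraction(extended))
```

Applied to the ground truth and to the predictions of both networks, a measure like this would show whether a network systematically over- or underestimates source extent.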

This demonstrates that our simulation settings function like priors. It also emphasizes how important it is to state your priors and to motivate your choice.
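
The prior-like role of the settings can be illustrated without training a network. The sketch below draws source extents from three ranges and checks how much of the training data would fall into the small and large regimes used in this tutorial. Uniform sampling and the helper `sample_extents` are assumptions made for illustration, and esinet may draw its parameters differently.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_extents(extent_range, n_samples, rng):
    """Draw source diameters (mm) uniformly from a (low, high) range.
    Uniform sampling is an assumption for this sketch."""
    low, high = extent_range
    return rng.uniform(low, high, size=n_samples)

ranges = {"small": (15, 25), "large": (35, 45), "wide": (2, 50)}
draws = {name: sample_extents(r, 10_000, rng) for name, r in ranges.items()}

for name, d in draws.items():
    share_small = np.mean((d >= 15) & (d <= 25))
    share_large = np.mean((d >= 35) & (d <= 45))
    print(f"{name:>5}: {share_small:.2f} in 15-25 mm, "
          f"{share_large:.2f} in 35-45 mm")
```

A network trained with the narrow "small" range never sees a large source, so large sources have zero prior probability from its point of view. A wide range covers both regimes at the cost of fewer examples per regime.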

In many cases we cannot justify a specific choice and want to build as few assumptions into our models as possible. In this case, we suggest using wide ranges in your settings to maximize the diversity of your training data.

An example of a diverse setting is given in the next cell:

In [None]:
settings = {
    'number_of_sources': (1, 20),  # The range of simultaneously active sources.
    'extents': (2, 50),  # The range of source diameters in mm 
    'amplitudes': (1, 100),  # Defines the range of amplitudes (in arbitrary units)
    'shapes': 'both',  # Gives you both gaussian-shaped and flat sources
    'beta': (0, 1.5),  # Defines the distribution of the noise in terms of 1/f**beta
}