# MNEflow basic example

# 1. Importing data

### 1.1. From MNE epochs

If you use MNE-Python, all you need to do is provide your epochs (or a list of epochs files) to mneflow.produce_tfrecords.


In [1]:
import numpy as np

import mne
mne.set_log_level(verbose='CRITICAL')
from mne.datasets import multimodal

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf
tf.get_logger().setLevel('ERROR')
tf.autograph.set_verbosity(0)

import mneflow
print(mneflow.__version__)


0.5.1


In [2]:
fname_raw = os.path.join(multimodal.data_path(), 'multimodal_raw.fif')
raw = mne.io.read_raw_fif(fname_raw)

cond = raw.acqparser.get_condition(raw, None)
# get the list of condition names
condition_names = [k for c in cond for k,v in c['event_id'].items()]
epochs_list = [mne.Epochs(raw, **c) for c in cond]
epochs = mne.concatenate_epochs(epochs_list)
epochs = epochs.pick_types(meg='grad')
print(epochs)

<Epochs |  940 events (all good), -0.0998976 - 0.499488 sec, baseline -0.0998976 – 0 sec, ~531.4 MB, data loaded,
 'Visual Upper right': 117
 'Visual Lower right': 129
 'Visual Lower left': 115
 'Visual Upper left': 133
 'Somato right': 107
 'Somato left': 118
 'Auditory right': 104
 'Auditory left': 117>


### Convert epochs to TFRecord format

In [3]:
path = 'C:\\data\\'
data_id = 'mne_sample_multimodal'

# Specify import options. See the mneflow.MetaData docstring for a description.

import_opt = dict(path=path,  # renamed from 'savepath'
                  data_id=data_id,  # renamed from 'out_name'
                  input_type='trials',
                  target_type='int',
                  n_folds=5,
                  test_set='holdout',
                  fs=600,
                  overwrite=True,
                  picks={'meg': 'grad'},
                  scale=True,  # apply baseline scaling
                  crop_baseline=True,  # remove baseline interval after scaling
                  decimate=None,
                  scale_interval=(0, 60),  # indices in time axis corresponding to baseline interval
                  )

# write TFRecord files and metadata file
meta = mneflow.produce_tfrecords(epochs, **import_opt)

processing epochs
Input shapes: X (n, ch, t) :  (940, 204, 361) y (n, [signal_channels], y_shape) :  (940, 1) 
 input_type :  trials target_type :  int segment_y :  False
Preprocessing:
Scaling to interval 0.0 - 60.0
n: 940
Splitting into: 6 folds x 156
Preprocessed: (940, 1, 301, 204) (940, 1) folds: 6 x 156
Preprocessed targets:  (940, 1)
Prepocessed sample shape: (1, 301, 204)
Target shape actual/metadata:  (8,) (8,)
Saving TFRecord# 0
Updating: meta.data
Updating: meta.preprocessing
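
The shapes in this log can be reproduced on a toy array. Each epoch has 361 time samples, and scale_interval=(0, 60) marks the first 60 of them as baseline, so cropping leaves 301 samples. The sketch below assumes that baseline scaling means standardizing each trial and channel by the mean and standard deviation of its baseline samples; the exact statistics used by mneflow may differ, and `scale_and_crop` is a hypothetical helper, not part of the library.

```python
import numpy as np


def scale_and_crop(X, scale_interval=(0, 60), crop_baseline=True):
    """Standardize by baseline statistics, then optionally drop the baseline.

    Hypothetical helper for illustration only.
    X has shape (n_trials, n_channels, n_times).
    """
    start, stop = scale_interval
    baseline = X[..., start:stop]
    # Assumption: one mean and one std per trial and channel.
    mean = baseline.mean(axis=-1, keepdims=True)
    std = baseline.std(axis=-1, keepdims=True)
    X = (X - mean) / std
    if crop_baseline:
        X = X[..., stop:]
    return X


rng = np.random.default_rng(0)
X_toy = rng.normal(size=(4, 3, 361))   # 4 trials, 3 channels, 361 time samples
X_scaled = scale_and_crop(X_toy)       # baseline removed: 361 - 60 = 301 samples

# Rearrange (n, ch, t) into the (n, 1, t, ch) layout reported in the log.
X_out = np.expand_dims(np.transpose(X_scaled, (0, 2, 1)), axis=1)
print(X_out.shape)
```

With 940 trials and 204 gradiometers in place of the toy dimensions, the same steps give the (940, 1, 301, 204) shape shown in the log.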


## Other import options
### 1.2. Saved mne.Epochs (*-epo.fif) files
Alternatively, if your epochs are saved to disk, provide a str (or list of str) with the path(s) to your -epo.fif files.

For example, this will work:

```python
epochs.save('test_saved_epochs.fif')
meta = mneflow.produce_tfrecords('test_saved_epochs.fif', **import_opt)
```
### 1.3. Arrays in *.mat or *.npz format
If the first argument is a str, mneflow.produce_tfrecords can also accept files in *.mat or *.npz format.

For example:

```python
data_path = '.../data_path/'
filenames = [data_path + 'sub' + str(i) + '-grad.npz' for i in range(1, 4)]
meta = mneflow.produce_tfrecords(filenames, **import_opt)
```
In this case, specify input_type='array' and also provide the array_keys keyword argument.

For example:

```python
array_keys={'X':'my_data_samples','y':'my_labels'}
```
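
For reference, a file compatible with these keys could be written with NumPy as follows. This is a minimal sketch: the array shapes, the random data, and the file name are made up for illustration, and only the key names come from the array_keys example.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 204, 361))   # assumed layout: (n_trials, n_channels, n_times)
y = rng.integers(0, 8, size=10)       # one integer class label per trial

# Hypothetical file name; the stored names must match the values in array_keys.
fname = os.path.join(tempfile.mkdtemp(), 'sub1-grad.npz')
np.savez(fname, my_data_samples=X, my_labels=y)

array_keys = {'X': 'my_data_samples', 'y': 'my_labels'}
with np.load(fname) as archive:
    X_loaded = archive[array_keys['X']]
    y_loaded = archive[array_keys['y']]

print(X_loaded.shape, y_loaded.shape)
```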

### 1.4. Tuple of (data, labels)
Finally, if you have a more complex preprocessing pipeline, you can feed your data and labels as a tuple of arrays:

```python
X = epochs.get_data()
y = epochs.events[:, 2]
meta = mneflow.produce_tfrecords((X, y), **import_opt)
```
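
Note that epochs.events[:, 2] holds raw event codes rather than consecutive class indices. If you want to inspect how such codes correspond to classes 0..n-1, a mapping can be sketched with NumPy as below. The event codes here are made up, and whether mneflow performs an equivalent mapping internally is not covered by this example.

```python
import numpy as np

# Made-up event codes standing in for epochs.events[:, 2].
event_codes = np.array([1, 4, 16, 1, 32, 4, 16, 32])

# np.unique returns the sorted unique codes and, for each trial,
# the index of its code in that sorted list.
codes, y = np.unique(event_codes, return_inverse=True)

print(codes)   # unique event codes in ascending order
print(y)       # class index of each trial
```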


# 2. Initialize the dataset object using the generated metadata file

The dataset object includes several methods that allow experimenting with the dataset without repeating the preprocessing or overwriting the TFRecord files each time.

For example, you can train the model on any subset of classes or channels, or reduce the sampling rate by decimating along the time axis.

See mneflow.Dataset docstring for more details.

In [4]:
dataset = mneflow.Dataset(meta, train_batch=100, class_subset=[0, 1, 2, 3, 4, 5, 6])

Using class_subset with 7 classes:
Subset ratio 0.88, Multiplier 1.14
Using class_subset with 7 classes:
Subset ratio 1.00, Multiplier 1.00
Updating: meta.data
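
To make the idea of class subsets and decimation concrete, here is a minimal NumPy sketch of the same operations applied to in-memory arrays. It only illustrates the concept and is not how mneflow.Dataset is implemented; the toy labels and the decimation factor are assumptions, and plain slicing is used without any anti-aliasing filter.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 1, 301, 204))   # (n_trials, 1, n_times, n_channels)
y = np.repeat(np.arange(8), 2)           # toy labels: 8 classes, 2 trials each

class_subset = [0, 1, 2, 3, 4, 5, 6]     # keep 7 of the 8 classes
keep = np.isin(y, class_subset)
X_sub, y_sub = X[keep], y[keep]

decimate = 2                             # keep every 2nd time sample
X_dec = X_sub[:, :, ::decimate, :]

print(X_sub.shape, X_dec.shape)
print(keep.mean())                       # fraction of trials retained
```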


# 3. Choose from already implemented models

The MNEflow pipeline consists of two major parts:
1. dataset
2. computational graph

Each part has its own set of hyper-parameters and methods that can be tuned. See help for mneflow.Dataset
and mneflow.models.BaseModel for more details.
In this example we will use the LF-CNN network.


In [5]:
# specify model parameters
lfcnn_params = dict(n_latent=32,  # number of latent factors
                    filter_length=17,  # convolutional filter length in time samples
                    nonlin=tf.nn.relu,
                    padding='SAME',
                    pooling=5,  # pooling factor
                    stride=5,  # stride parameter for pooling layer
                    pool_type='max',
                    dropout=.5,
                    l1_scope=["weights"],
                    l1=3e-3)

meta.update(model_specs=lfcnn_params)

# initialize the model
model = mneflow.models.LFCNN(meta)
model.build()

Updating: meta.model_specs
Using class_subset with 7 classes:
Subset ratio 1.00, Multiplier 1.00
Using class_subset with 7 classes:
Subset ratio 1.00, Multiplier 1.00
Updating: meta.data
Setting reg for dmx, to l1
Built: dmx input: (None, 1, 301, 204)
Setting reg for tconv, to l1
Built: tconv input: (None, 1, 301, 32)
Setting reg for fc, to l1
Built: fc input: (None, 1, 61, 32)
Input shape: (1, 301, 204)
y_pred: (None, 7)
Initialization complete!
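
The layer shapes printed here are consistent with the pooling settings. With 'SAME' padding, a pooling layer with stride 5 reduces 301 time samples to ceil(301 / 5) = 61, which matches the input shape reported for the fc layer. The short check below reproduces this arithmetic; the flattened feature count assumes the fully connected layer sees all 61 x 32 values as one vector.

```python
import math

n_times = 301      # time samples after cropping the baseline
n_latent = 32      # number of latent factors
stride = 5         # stride of the pooling layer

# With 'SAME' padding the output length is ceil(input_length / stride).
n_pooled = math.ceil(n_times / stride)

# Assumption: the fully connected layer operates on the flattened feature map.
n_features = n_pooled * n_latent

print(n_pooled, n_features)
```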


## Training and (cross-)validation modes

When training a model it is often useful to keep track of both cross-validation and test-set performance.

The default training mode is 'single_fold'.
Other training modes include n-fold cross-validation (mode='cv') and leave-one-subject-out cross-validation (mode='loso').

In 'cv' (cross-validation) mode with n_folds=5 and a designated 'holdout' test set, test-set performance is evaluated for each training fold and then averaged.
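
The split sizes can be sketched in plain Python. The preprocessing log reports 6 folds of 156 trials for n_folds=5 with a holdout test set, which suggests one extra part is set aside for testing. The sketch below assumes contiguous index blocks and treats the last part as the holdout; the actual assignment in mneflow may differ, for example by shuffling trials first.

```python
n_trials = 940
n_folds = 5                       # validation folds
n_parts = n_folds + 1             # assumption: one extra part is the holdout test set
fold_size = n_trials // n_parts   # 156, as in the preprocessing log

# Assumption: contiguous index blocks; leftover trials are ignored here.
parts = [list(range(i * fold_size, (i + 1) * fold_size)) for i in range(n_parts)]
test_idx, folds = parts[-1], parts[:-1]

splits = []
for k in range(n_folds):
    val_idx = folds[k]
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    splits.append((train_idx, val_idx, test_idx))

for train_idx, val_idx, held_out_idx in splits:
    print(len(train_idx), len(val_idx), len(held_out_idx))
```

Every split is evaluated on the same held-out indices, so the five test scores can be averaged as described.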

In 'loso' mode each input tfrecord file is treated as a fold. Thus, if the data from each of n subjects is saved in a separate .tfrecord file, on each fold the model will be trained on n - 1 subjects and then tested on the held-out subject. In this setting each 'validation' fold comprises combined data from all (n - 1) subjects, but not the held-out subject.
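
The leave-one-subject-out scheme can be sketched as a loop over files. The file names below are hypothetical, and the snippet only shows how the folds are formed, not how mneflow reads or trains on them.

```python
# Hypothetical file names, one TFRecord file per subject.
subject_files = ['sub1.tfrecord', 'sub2.tfrecord', 'sub3.tfrecord', 'sub4.tfrecord']

loso_folds = []
for held_out in subject_files:
    # Train on every subject except the one held out for testing.
    train_files = [f for f in subject_files if f != held_out]
    loso_folds.append((train_files, held_out))

for train_files, held_out in loso_folds:
    print('train on', train_files, '-> test on', held_out)
```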

See mneflow.models.BaseModel.train docstring for more details.


In [None]:
# train the model for up to 20 epochs and stop the training if validation loss does not decrease during 3 consecutive epochs (early_stopping=3)

model.train(n_epochs=20, eval_step=50, early_stopping=3, mode='single_fold')

Updating: meta.train_params
Class weights:  None
Using class_subset with 7 classes:
Subset ratio 1.00, Multiplier 1.00
Epoch 1/20
50/50 - 4s - loss: 2.2755 - cat_ACC: 0.2279 - val_loss: 2.0011 - val_cat_ACC: 0.3333 - 4s/epoch - 84ms/step
Epoch 2/20
50/50 - 4s - loss: 1.8123 - cat_ACC: 0.4301 - val_loss: 1.6628 - val_cat_ACC: 0.5586 - 4s/epoch - 70ms/step
Epoch 3/20
50/50 - 5s - loss: 1.3205 - cat_ACC: 0.6578 - val_loss: 1.3672 - val_cat_ACC: 0.6847 - 5s/epoch - 91ms/step
Epoch 4/20
50/50 - 4s - loss: 0.9519 - cat_ACC: 0.7952 - val_loss: 1.1167 - val_cat_ACC: 0.7658 - 4s/epoch - 88ms/step
Epoch 5/20
50/50 - 5s - loss: 0.7026 - cat_ACC: 0.8820 - val_loss: 0.9899 - val_cat_ACC: 0.7748 - 5s/epoch - 91ms/step
Epoch 6/20
50/50 - 5s - loss: 0.5518 - cat_ACC: 0.9290 - val_loss: 0.9078 - val_cat_ACC: 0.8018 - 5s/epoch - 91ms/step
Epoch 7/20
50/50 - 4s - loss: 0.4707 - cat_ACC: 0.9541 - val_loss: 0.8699 - val_cat_ACC: 0.8288 - 4s/epoch - 87ms/step
Epoch 8/20
50/50 - 4s - loss: 0.4076 - cat_ACC: 
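
The effect of the early_stopping argument can be illustrated with a simple patience counter. This is a sketch that assumes the common convention of stopping once the validation loss has failed to improve for `patience` consecutive evaluations; the exact rule in mneflow may differ, and `stopping_epoch` is a hypothetical helper.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Validation losses from epochs 1-7 of the log: still improving, so no stop.
logged = [2.0011, 1.6628, 1.3672, 1.1167, 0.9899, 0.9078, 0.8699]
print(stopping_epoch(logged, patience=3))

# Made-up losses that stop improving after epoch 3.
plateau = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.69]
print(stopping_epoch(plateau, patience=3))
```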

In [None]:
test_loss, test_acc = model.evaluate(meta.data['test_paths'])
print("Test set: Loss = {:.4f} Accuracy = {:.4f}".format(test_loss, test_acc))