Example offline analysis of [mindaffectBCI](https://github.com/mindaffect) savefile
--
This notebook shows how to do a quick post-hoc analysis of a test run using the mindaffectBCI Makers-Kit.

In [None]:
import numpy as np
from analyse_datasets import debug_test_dataset, analyse_dataset, analyse_datasets
from offline.load_mindaffectBCI  import load_mindaffectBCI
import matplotlib.pyplot as plt
%matplotlib inline
%load_ext autoreload
%autoreload 2
plt.rcParams['figure.figsize'] = [12, 8] # bigger default figures

Specify the save file you wish to analyse. If you want to analyse a file from the mindaffect Makers Kit, you can download the save file from the `data` directory on the decoder configuration page.

You can find the decoder configuration website at http://10.0.0.5 if the decoder is in direct connection mode, or at the decoder's IP address if not. If you don't know the decoder's IP, you can run this script to find it:

`python3 -m mindaffectBCI.examples.utilities.discover_decoders`

In [None]:
# the  file we want to analyse
savefile = '../../resources/example_data/mindaffectBCI.txt'

Load the data
-------------------

Load the data, doing some initial pre-processing.
  * stopband = list of band-stop filters.
    `((0,1),(25,-1))` means stop below 1 Hz and above 25 Hz, i.e. band-pass between 1 and 25 Hz
  * ofs = output sample rate; here the data is resampled to (approximately) 60 Hz
  
Note: The channel dimension of the EEG data has size (number of channels + 1), as it includes an additional virtual 'time-stamp' channel. Don't include this channel in any further analysis.
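
To make the `stopband` convention concrete, here is a minimal sketch of how a list of stop bands translates into the frequencies that are kept. The helper `stopbands_to_passbands` and the Nyquist frequency of 125 Hz are hypothetical, chosen only for illustration; they are not part of the mindaffectBCI API. Treating a negative upper edge as "up to the top of the spectrum" is an assumption based on the description above.

```python
def stopbands_to_passbands(stopbands, nyquist):
    """Return the frequency intervals (Hz) that remain after removing each stop band."""
    # Assumption: a negative upper edge means "up to the Nyquist frequency"
    bands = sorted((lo, nyquist if hi < 0 else hi) for lo, hi in stopbands)
    passbands, edge = [], 0.0
    for lo, hi in bands:
        if lo > edge:                 # gap between the previous stop band and this one
            passbands.append((edge, lo))
        edge = max(edge, hi)
    if edge < nyquist:                # anything left above the last stop band
        passbands.append((edge, nyquist))
    return passbands

# hypothetical Nyquist frequency of 125 Hz (i.e. a 250 Hz raw sample rate)
print(stopbands_to_passbands(((0, 1), (25, -1)), nyquist=125.0))
```

With these inputs the only interval left is 1 to 25 Hz, which matches the band-pass described above.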

In [None]:
X, Y, coords = load_mindaffectBCI(savefile, stopband=((0,1),(25,-1)), ofs=60)
# output is: X=eeg, Y=stimulus, coords=meta-info about dimensions of X and Y
print("EEG: X({}){} @{}Hz".format([c['name'] for c in coords],X.shape,coords[1]['fs']))                            
print("STIMULUS: Y({}){}".format([c['name'] for c in coords[:1]]+['output'],Y.shape))

Analyse the data
--
The following code runs the standard initial analysis and data-set visualization in one go, with some standard analysis parameters:
 * tau_ms : the length of the modelled stimulus response (in milliseconds)
 * evtlabs : the type of brain features to transform the stimulus information into prior to fitting the model, in this case
     * 're' -> rising edge
     * 'fe' -> falling edge
   see `stim2event` for more information on possible transformations
 * rank : the rank of the CCA model to fit
 * model : the type of model to fit. 'cca' corresponds to the Canonical Correlation Analysis model.
           other options include:
           * 'ridge' = ridge-regression,
           * 'fwd' = Forward Modelling,
           * 'bwd' = Backward Modelling,
           * 'lr' = logistic-regression,
           * 'svc' = support vector machine
   see `model_fitting.py` for how to add more models
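
As an illustration of what the 're' and 'fe' event types mean, the sketch below marks rising and falling edges in a toy on/off stimulus sequence. The function `rising_falling_edges` is a hypothetical helper, not the toolbox's `stim2event` implementation, and the convention that the stimulus is "off" before the first sample is an assumption.

```python
import numpy as np

def rising_falling_edges(stim):
    """Mark samples where a stimulus sequence steps up ('re') or down ('fe')."""
    stim = np.asarray(stim)
    # Assumption: the stimulus is off (0) before the first sample
    prev = np.concatenate(([0], stim[:-1]))
    re = (stim > prev).astype(int)   # 1 where the level increased
    fe = (stim < prev).astype(int)   # 1 where the level decreased
    return re, fe

stim = np.array([0, 1, 1, 0, 0, 1, 0])
re, fe = rising_falling_edges(stim)
print("stim:", stim)
print("re:  ", re)
print("fe:  ", fe)
```

Each event type becomes its own indicator sequence, so the model can fit a separate response for onsets and offsets.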

This generates 7 visualizations:

 1.  *X+Y* : raw plot of the first trial's EEG and STIMULUS information

 2.  *Summary Statistics*: Summary statistics for the data, with:
      * row 1: Cxx : the covariance of the EEG channel features
      * row 2: Cxy : the cross-covariance of the EEG with each of the stimulus features, e.g. 're', 'fe'
      * row 3: Cyy : the auto- and cross-covariance of the stimulus features with the other (time-delayed) stimulus features

 3. *ERP* : plot of the average response for each stimulus feature over EEG channels

 4. *Decoding Curve* : The decoder accumulates information during a trial to improve its predictions. This plot shows the error of this accumulated prediction as a function of the number of samples since trial start

 5. *Model*: plot of the fitted model, as a feature weighting over time-points and EEG channels

 6. *Fe* : Applying the model generates a *predicted* stimulus-feature "score" for each EEG time point, where a higher score means the model thinks that stimulus-feature is more likely at this time point. This plot shows these scores for a sub-set of the trials.

 7. *Fy* : Combining Fe (the predicted stimulus) with Y (the true stimulus) and summing from the trial start gives a score for each possible output, where again a higher score indicates the model thinks this output is more likely to be the *true* target.
 This plot shows these accumulated scores for each output, *with the true output* drawn with a thicker black line. Thus, for a good model you should see that the black line is the highest for most/all trials.
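
The accumulation described for *Fy* can be sketched on synthetic data. This is only an illustration of the idea, not the toolbox's scoring code: the data is random, the "model output" `Fe` is faked as the true output's stimulus plus noise, and all sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the sketch is repeatable
n_samples, n_outputs, true_idx = 400, 4, 2

# Y: synthetic on/off stimulus sequence for each output, shape (samples, outputs)
Y = rng.integers(0, 2, size=(n_samples, n_outputs)).astype(float)

# Fe: stand-in for the model's per-sample score (true output's stimulus plus noise)
Fe = Y[:, true_idx] + 0.5 * rng.standard_normal(n_samples)

# Fy: running sum from trial start of (predicted stimulus * each output's stimulus)
Fy = np.cumsum(Fe[:, None] * Y, axis=0)

best = int(np.argmax(Fy[-1]))
print("highest-scoring output at trial end:", best)
```

Plotting the columns of `Fy` against time gives one line per output; the column for the true output should pull away from the others as more samples are accumulated.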


In [None]:
debug_test_dataset(X[...,:-1], Y, coords, tau_ms=100, evtlabs=('re','fe'), rank=1, model='cca')