# Score pooled data

### Content

+ [1. Notebook description](#1.-Notebook-Description)
+ [2. Specify models](#2.-specify-models)
+ [3. Run crossvalidations](#3.-run-crossvalidations)
+ [4. Export](#4.-export)

---

## 1. Notebook Description

Using the hyperparameters found by the parameter-search notebook, we instantiate one classifier per model and run a 10-fold cross-validation for each pair of target classes. Each model is scored on the pooled dataset that corresponds to its transformation scheme.

---

**Imports:**

In [None]:
from digits.data import select
from digits.transform.shaper import CSPFlatten, CSPWrap

from mne.decoding import CSP
from mne import create_info, EvokedArray
from mne.viz import plot_topomap, plot_layout, plot_montage
from mne.channels import make_eeg_layout

from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline


from itertools import combinations
from os import path

import numpy as np
import pandas as pd

---

## 2. Specify models

A model has a name and consists of a transformation scheme and a classifier instance.

In [None]:
models = {
    'svc': ('short_lda_1.yaml', SVC(kernel='linear', C=1.274275e-06, cache_size=1024)),
    'lda': ('short_lda_1.yaml', LDA(shrinkage=0.0444444444444, solver='lsqr')),
    'ldacsp': ('short_lda_4.yaml', Pipeline([
            ('wrap', CSPWrap()),
            ('csp', CSP(n_components=6, reg='ledoit_wolf', transform_into='csp_space')),
            ('flat', CSPFlatten()),
            ('lda', LDA(solver='lsqr', shrinkage='auto'))
        ]))
}

In [None]:
basepath = '../../data/thomas/artcorr/imported'
scores = {}

## 3. Run crossvalidations

In [None]:
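for clfname, (config_file, clf) in models.items():

    # Load the pooled samples and targets stored for this transformation scheme.
    with pd.HDFStore(path.join(basepath, config_file + '.h5'), mode='r') as store:
        samples = store['samples']
        targets = store['targets']

    # One binary problem per unordered pair of classes (45 pairs for 10 classes).
    for d1, d2 in combinations(np.arange(10), 2):
        print("running {} [{},{}]".format(clfname, d1, d2))

        tmp_samples, tmp_targets = select.fromtargetlist(samples, targets, [d1, d2])

        # 10-fold cross-validation; yields one score per fold.
        scores[(clfname, d1, d2)] = cross_val_score(clf, tmp_samples,
                                                    tmp_targets['label'], cv=10,
                                                    verbose=1, n_jobs=-1)

        # Number of (model, d1, d2) combinations finished so far.
        print(len(scores))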
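
The loop above can be tried out without the pooled HDF5 files. The following is a minimal sketch on synthetic data, not the notebook's actual pipeline: `select_pair` is a hypothetical stand-in for `select.fromtargetlist` (whose implementation is not shown here), and the names `demo_samples`, `demo_targets`, and `demo_scores` are made up for illustration.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the pooled data: 300 samples, 8 features, labels 0-2.
rng = np.random.RandomState(0)
demo_targets = pd.DataFrame({'label': np.repeat([0, 1, 2], 100)})
demo_samples = pd.DataFrame(rng.randn(300, 8))
# Shift every feature by the class label so the classes are separable.
demo_samples = demo_samples.add(demo_targets['label'], axis=0)


def select_pair(samples, targets, labels):
    """Hypothetical helper: keep only the rows whose label is in `labels`."""
    mask = targets['label'].isin(labels)
    return samples[mask], targets[mask]


demo_scores = {}
for d1, d2 in combinations(range(3), 2):
    X, y = select_pair(demo_samples, demo_targets, [d1, d2])
    # cv=10 returns an array with one accuracy value per fold.
    demo_scores[('lda', d1, d2)] = cross_val_score(
        LDA(solver='lsqr', shrinkage='auto'), X, y['label'], cv=10)

# With 10 classes there are 45 unordered pairs, so 3 models give 135 entries.
n_pairs = len(list(combinations(range(10), 2)))
```

Keying the result by `(model, d1, d2)` keeps every fold score addressable, which is what lets the export step turn the dictionary into a table directly.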

## 4. Export

In [None]:
outfile = 'results_pooled_final.h5'
store = pd.HDFStore(outfile)
df_scores = pd.DataFrame(scores)
df_scores.columns.names = ['type', 'd1','d2']
df_scores.index.names = ['crossvalidation']
store['scores'] = df_scores
store.close()
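
As an illustration of what the exported table looks like, the sketch below builds a small frame from a dictionary with tuple keys, using invented scores rather than real results. pandas turns the tuples into a three-level column index, which can then be aggregated per level. The names `toy_scores` and `toy_df` are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Invented fold scores, two folds per (type, d1, d2) combination.
toy_scores = {
    ('lda', 0, 1): np.array([0.5, 0.75]),
    ('lda', 0, 2): np.array([1.0, 0.5]),
    ('svc', 0, 1): np.array([0.25, 0.75]),
}

# Tuple keys become a MultiIndex on the columns; rows are the folds.
toy_df = pd.DataFrame(toy_scores)
toy_df.columns.names = ['type', 'd1', 'd2']
toy_df.index.names = ['crossvalidation']

# Mean over folds for each pair, then averaged over pairs per model type.
mean_per_pair = toy_df.mean()
mean_per_type = mean_per_pair.groupby(level='type').mean()
```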

#### Long format for plotting

In [None]:
d1s, d2s, subjects, groups, score_values, tests = [], [], [], [], [], []
# Each column of df_scores is keyed by (type, d1, d2) and holds one score per fold.
for (group, d1, d2), values in df_scores.items():
    for score in values:
        groups.append(group)
        d1s.append(d1)
        d2s.append(d2)
        subjects.append('pooled')
        tests.append('cv')
        score_values.append(score)

dflong = pd.DataFrame({'group': groups, 'd1': d1s, 'd2': d2s,
                       'subject': subjects, 'test': tests, 'score': score_values})
dflong.to_csv('results_pooled_final.csv')
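
The same wide-to-long reshaping can likely be expressed with `DataFrame.melt` instead of an explicit loop. This is a sketch on invented scores, assuming the column levels are named `group`, `d1`, and `d2`; `toy_wide` and `toy_long` are illustrative names, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Invented wide-format scores: one column per (group, d1, d2), one row per fold.
toy_wide = pd.DataFrame({
    ('lda', 0, 1): np.array([0.5, 0.75]),
    ('svc', 0, 1): np.array([0.25, 0.75]),
})
toy_wide.columns.names = ['group', 'd1', 'd2']

# melt unpacks the named column levels into ordinary columns, one row per score.
toy_long = toy_wide.melt(value_name='score')
toy_long['subject'] = 'pooled'
toy_long['test'] = 'cv'
```

The loop version makes each appended field explicit, while `melt` is shorter and avoids keeping six parallel lists in sync.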

---
