### Importing Packages 

In [1]:
import numpy as np
import pandas as pd
import scipy.io as sio
import matplotlib.pyplot as plt
import mne

%matplotlib inline

### Goal:

For each data set specific goals are given in the respective description. Technically speaking, each data set consists of **single-trials of spontaneous EEG activity, one part labeled (training data) and another part unlabeled (test data), and a performance measure.**

The goal is to infer labels for the test set from the training data so that the performance measure is maximized against the true test labels, which are unknown to the participant.


# Data set ‹self-paced 1s›

Data set provided by Fraunhofer-FIRST, Intelligent Data Analysis Group (Klaus-Robert Müller), and Freie Universität Berlin, Department of Neurology, Neurophysics Group (Gabriel Curio)
Correspondence to Benjamin Blankertz <benjamin.blankertz@tu-berlin.de>

This dataset was recorded from a normal subject during a no-feedback session. The subject sat in a normal chair with relaxed arms resting on the table and fingers in the standard typing position at the computer keyboard. The task was to press the corresponding keys with the index and little fingers in a self-chosen order and timing ('self-paced key typing'). The experiment consisted of 3 sessions of 6 minutes each. All sessions were conducted on the same day with a few minutes' break in between. Typing was done at an average speed of 1 key per second.

![download.jpg](attachment:download.jpg)

**Format of the data**

- 416 epochs of 500 ms length are given, each ending 130 ms before a keypress.
- 316 epochs are labeled (0 for upcoming left hand movements and 1 for upcoming right hand movements); the remaining 100 epochs are unlabeled for competition purposes.
- Data are provided at the original 1000 Hz sampling rate and in a version downsampled to 100 Hz (recommended).

**Files are provided in Matlab format (*.mat) containing variables:**
- clab: electrode labels
- x_train: training trials (time x channels x trials)
- y_train: corresponding labels (0: left, 1: right)
- x_test: test trials (time x channels x trials)

**Zipped ASCII format (*.txt.zip).**
- Each of those files contains a 2-D matrix where each row (line) contains the data of one trial, beginning with all samples of the first channel. Channels are in the following order: (F3, F1, Fz, F2, F4, FC5, FC3, FC1, FCz, FC2, FC4, FC6, C5, C3, C1, Cz, C2, C4, C6, CP5, CP3, CP1, CPz, CP2, CP4, CP6, O1, O2).
- In the files containing training data the first entry in each row indicates the class (0: left, 1: right). 
- In the 1000 Hz version trials consist of 500 samples per channel and in the 100 Hz version they consist of 50 samples.
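
The row layout described above can be sketched as follows. This is a minimal illustration on a synthetic row, not on the real files: `fake_trial`, `row`, and `split_training_row` are names made up for this example. It assumes the 100 Hz version, where 500 ms at 100 Hz gives 50 samples per channel, so a training row holds 1 label plus 28 × 50 = 1400 values.

```python
import numpy as np

N_CHANNELS = 28   # length of the channel list given above
N_SAMPLES = 50    # 100 Hz version: 500 ms * 100 Hz = 50 samples

# Synthetic stand-in for one line of a training file (not real data):
# the class label first, then all samples of channel 1, then channel 2, ...
rng = np.random.default_rng(0)
fake_trial = rng.normal(size=(N_CHANNELS, N_SAMPLES))
row = np.concatenate(([1.0], fake_trial.ravel()))


def split_training_row(row, n_channels=N_CHANNELS, n_samples=N_SAMPLES):
    """Split one training row into (label, channels x samples array)."""
    label = int(row[0])
    # Samples are stored channel by channel, so a row-major reshape
    # puts one channel on each row of the result.
    trial = row[1:].reshape(n_channels, n_samples)
    return label, trial


label, trial = split_training_row(row)
print(row.size, label, trial.shape)
```

Rows of the test files carry no leading label, so for those the whole row would be reshaped directly.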

**Requirements and Evaluation** 

Please provide your estimated class labels (0 or 1) for every trial of the test data and give a description of the algorithm used. The performance measure is the classification accuracy (correctly classified trials divided by the total number of test trials).
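
As a small sketch of the performance measure, the function below computes the fraction of correctly classified trials. The label arrays are invented toy values, and `classification_accuracy` is a helper name chosen for this example.

```python
import numpy as np


def classification_accuracy(y_true, y_pred):
    """Correctly classified trials divided by the total number of trials."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("label arrays must have the same shape")
    return float(np.mean(y_true == y_pred))


# Toy labels for illustration only (0: left, 1: right)
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])

acc = classification_accuracy(y_true, y_pred)
print(acc)  # 4 of 5 trials match -> 0.8
```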

**Technical data**

The recording was made using a NeuroScan amplifier and an Ag/AgCl electrode cap from ECI. 28 EEG channels were measured at positions of the international 10/20-system (F, FC, C, and CP rows and O1, O2). Signals were recorded at 1000 Hz with a band-pass filter between 0.05 and 200 Hz.

**References**
Benjamin Blankertz, Gabriel Curio and Klaus-Robert Müller, Classifying Single Trial EEG: Towards Brain Computer Interfacing, In: T. G. Dietterich, S. Becker and Z. Ghahramani (eds.), Advances in Neural Inf. Proc. Systems 14 (NIPS 01), 2002.

### Loading in Data

In [2]:
# Make sure this Jupyter notebook is in the same folder as the data; if it isn't, pass the full file path instead

raw1 = sio.loadmat('sp1s_aa.mat')
raw2 = sio.loadmat('sp1s_aa_1000Hz.mat')

**`sio.loadmat` returns the data as a dictionary, so we need to sort it into arrays.** There is probably a faster, more efficient way, but this is solid for now.
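
One possible alternative is to look the variables up by name, since the dictionary is keyed by the MATLAB variable names listed in the format description. The sketch below uses a zero-filled stand-in dictionary `mat` with the same keys and shapes rather than the real file, so it only illustrates the access pattern.

```python
import numpy as np

# Stand-in for the dictionary returned by sio.loadmat (synthetic, zero-filled;
# the real one also starts with these three metadata entries).
mat = {
    "__header__": b"stand-in header",
    "__version__": "1.0",
    "__globals__": [],
    "clab": np.empty((1, 28), dtype=object),
    "x_train": np.zeros((50, 28, 316)),
    "y_train": np.zeros((1, 316)),
    "x_test": np.zeros((50, 28, 100)),
}

# Keep only the MATLAB variables; metadata keys start with "__".
variables = {k: v for k, v in mat.items() if not k.startswith("__")}

x_train = variables["x_train"]
y_train = variables["y_train"].ravel()  # (1, n_trials) row -> flat label vector
x_test = variables["x_test"]

for name, value in variables.items():
    print(name, value.shape)
```

Name-based lookup does not depend on the order in which the variables were saved, which positional indexing does.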

In [25]:
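# raw1 holds the data downsampled to 100 Hz
# Entries 0-2 of the loadmat dictionary are metadata; the variables start at index 3
data = list(raw1.items())
an_array = np.array(data, dtype = object)

# Each entry is a (name, value) pair, so [1] selects the value
temp3 = an_array[3]
clab_100 = temp3[1]  # electrode labels

temp4 = an_array[4]
x_train_100 = temp4[1]  # training trials (time x channels x trials)

temp5 = an_array[5]
y_train_list = temp5[1]
y_train_100 = y_train_list[0]  # labels are stored as a single row; take that row

temp6 = an_array[6]
x_test_100 = temp6[1]  # test trials (time x channels x trials)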


#A2, feature extraction, change 3-d to 2-d, CSP, 
#pick channels that are valuable, frequency band, 
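
The "3-d to 2-d" step noted above could look like the following sketch, which turns a (time x channels x trials) array into one feature row per trial. It runs on a random array of the same shape as the 100 Hz training data, and the flattening order (channel by channel) is a choice made for this example, not something the data set prescribes.

```python
import numpy as np

# Synthetic stand-in with the shape of the 100 Hz training data
rng = np.random.default_rng(0)
x_train = rng.normal(size=(50, 28, 316))  # (time, channels, trials)

n_time, n_channels, n_trials = x_train.shape

# Move trials to the first axis, then flatten each trial's
# (channels, time) block into a single feature vector.
X = np.transpose(x_train, (2, 1, 0)).reshape(n_trials, n_channels * n_time)

print(X.shape)  # (316, 1400)
```

With this ordering, the first 50 columns of a row hold the time course of the first channel for that trial.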

Learn about the shape and length of the data

In [22]:
# Use a separate loop variable so the `data` list from the loading cell is not overwritten
for arr in [clab_100, x_train_100, y_train_100, x_test_100]:
    print(len(arr))
    print(arr.shape)

1
(1, 28)
50
(50, 28, 316)
316
(316,)
50
(50, 28, 100)
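
The first axis of 50 matches the description: 500 ms at 100 Hz is 50 samples. As a rough sketch, a time axis relative to the keypress can be built from the stated epoch length and the 130 ms gap. The exact sample alignment is not given in the description, so placing the first sample at the start of the epoch is an assumption.

```python
import numpy as np

SFREQ = 100.0            # Hz, downsampled version
EPOCH_MS = 500           # epoch length from the data description
END_BEFORE_KEY_MS = 130  # epochs end 130 ms before the keypress

n_samples = int(EPOCH_MS * SFREQ / 1000)  # 500 ms * 100 Hz = 50 samples
step_ms = 1000.0 / SFREQ                   # 10 ms between samples

# Assumption: the first sample sits at the epoch start, i.e. 630 ms before the keypress
start_ms = -(EPOCH_MS + END_BEFORE_KEY_MS)
times_ms = start_ms + step_ms * np.arange(n_samples)

print(n_samples, times_ms[0], times_ms[-1])
```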


"fMRI has also been coupled with scalp electroencephalography (EEG) to estimate current sources associated with finger movements, which were found not only in the central sulcus, but also in frontal **medial and parietal regions** " 

In [24]:
# Experimenting
data_items = raw1.items()
data_list = list(data_items)
df = pd.DataFrame(data_list)

In [20]:
# raw2 holds the data at the original 1000 Hz sampling rate
data_1000 = list(raw2.items())
an_array = np.array(data_1000, dtype = object)

temp3 = an_array[3]
clab_1000 = temp3[1]

temp4 = an_array[4]
x_train_1000 = temp4[1]

temp5 = an_array[5]
y_train_list = temp5[1]
y_train_1000 = y_train_list[0]

temp6 = an_array[6]
x_test_1000 = temp6[1]

print(len(x_train_1000))

500


In [56]:
clab_channels = clab_100[0]

In [57]:
channels = []

for j in clab_channels:
    
    for i in j:
        channels.append(i)
        
print(channels)

['F3', 'F1', 'Fz', 'F2', 'F4', 'FC5', 'FC3', 'FC1', 'FCz', 'FC2', 'FC4', 'FC6', 'C5', 'C3', 'C1', 'Cz', 'C2', 'C4', 'C6', 'CP5', 'CP3', 'CP1', 'CPz', 'CP2', 'CP4', 'CP6', 'O1', 'O2']
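
With the channel names as a flat list, picking a subset of channels by name becomes an index lookup. The sketch below uses a random array in place of the training data, and the choice of `C3`, `Cz`, and `C4` is only an illustrative assumption, not a recommendation from the data set description.

```python
import numpy as np

channels = ['F3', 'F1', 'Fz', 'F2', 'F4', 'FC5', 'FC3', 'FC1', 'FCz', 'FC2',
            'FC4', 'FC6', 'C5', 'C3', 'C1', 'Cz', 'C2', 'C4', 'C6', 'CP5',
            'CP3', 'CP1', 'CPz', 'CP2', 'CP4', 'CP6', 'O1', 'O2']

# Synthetic stand-in with the shape of the 100 Hz training data
rng = np.random.default_rng(0)
x_train = rng.normal(size=(50, len(channels), 316))  # (time, channels, trials)

# Hypothetical selection of central channels, for illustration only
picks = ['C3', 'Cz', 'C4']
idx = [channels.index(name) for name in picks]

# Index the channel axis (axis 1) with the list of positions
x_picked = x_train[:, idx, :]

print(idx, x_picked.shape)
```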
