# EEG Project Data Exploration

EEG reflects the coordinated activity of millions of neurons near a non-invasive scalp electrode. Because these are scalp potentials, they necessarily have relatively poor spatiotemporal resolution compared to other neural recording techniques. EEG is believed to record dipoles whose signals are transmitted through the scalp.

For each subject, signals are recorded from 22 EEG electrodes while the subject imagines performing one of four actions. This is therefore a classification task with four outcome classes, in which the EEG is used to determine which action the subject was imagining.

The data has been preprocessed so that it can be loaded directly with NumPy. In addition, trials containing NaNs have been removed.
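
If you want to confirm the NaN removal yourself once the arrays are loaded, a check along the following lines may help. This is only a minimal sketch: `nan_trial_indices` is a hypothetical helper (not part of the project code), and `X_demo` is a small synthetic array that merely mimics the (trials, electrodes, time bins) layout.

```python
import numpy as np

def nan_trial_indices(X):
    """Return indices of trials (first axis) that contain at least one NaN."""
    X = np.asarray(X, dtype=float)
    # Collapse the electrode and time axes; a trial is flagged if any sample is NaN.
    flagged = np.isnan(X).reshape(X.shape[0], -1).any(axis=1)
    return np.flatnonzero(flagged)

# Synthetic stand-in (trials, electrodes, time bins); NOT the project data.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((5, 22, 1000))
print(nan_trial_indices(X_demo))   # no trial contains a NaN yet

X_demo[3, 7, 123] = np.nan         # corrupt a single sample in trial 3
print(nan_trial_indices(X_demo))   # trial 3 is now flagged
```

On the real data, calling the helper on `X_train_valid` and `X_test` should return empty arrays if the preprocessing described above was applied.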

In [7]:
# imports
import numpy as np

In [8]:
X_test = np.load("project_data/project/X_test.npy")
y_test = np.load("project_data/project/y_test.npy")
person_test = np.load("project_data/project/person_test.npy")

X_train_valid = np.load("project_data/project/X_train_valid.npy")
y_train_valid = np.load("project_data/project/y_train_valid.npy")
person_train_valid = np.load("project_data/project/person_train_valid.npy")

### Shape of Data

In [9]:
print('Training/Valid data shape: {}'.format(X_train_valid.shape))
print('Test data shape: {}'.format(X_test.shape))
print('Training/Valid target shape: {}'.format(y_train_valid.shape))
print('Test target shape: {}'.format(y_test.shape))
print('Person train/valid shape: {}'.format(person_train_valid.shape))
print('Person test shape: {}'.format(person_test.shape))

Training/Valid data shape: (2115, 22, 1000)
Test data shape: (443, 22, 1000)
Training/Valid target shape: (2115,)
Test target shape: (443,)
Person train/valid shape: (2115, 1)
Person test shape: (443, 1)


This indicates that the training/validation set contains 2115 trials and the test set contains 443; each trial has corresponding EEG data from 22 electrodes over 1000 time bins. Please refer to the dataset documentation to learn more about the data. For example, Table 2 lists what the class labels (769, 770, 771, 772) correspond to. The person files identify the subject performing the task, with IDs ranging from 0 to 8 (inclusive), and may be useful should you want to see how well you can classify on individual subjects. In the original data and documentation, you may see that there are 25 channels. We removed 3 of those channels (leaving 22) because they recorded eye movements, not brain activity.
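
Two small preprocessing steps follow naturally from this description: shifting the class labels to 0-3 (most classifiers expect zero-based class indices) and selecting the trials of a single subject. The sketch below shows one possible way to do both. `remap_labels` and `select_subject` are hypothetical helper names, and the `*_demo` arrays are synthetic stand-ins with the same layout as the real arrays, not the project data.

```python
import numpy as np

def remap_labels(y, first_label=769):
    """Shift labels 769..772 to zero-based class indices 0..3."""
    return np.asarray(y).astype(int) - first_label

def select_subject(X, y, person, subject_id):
    """Return the trials and labels belonging to one subject."""
    # The person arrays have shape (n_trials, 1), so flatten before masking.
    mask = np.asarray(person).reshape(-1) == subject_id
    return X[mask], y[mask]

# Synthetic stand-ins: 12 trials, 22 electrodes, 1000 time bins, 3 subjects.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((12, 22, 1000))
y_demo = np.tile([769, 770, 771, 772], 3)
person_demo = np.repeat(np.arange(3), 4).reshape(-1, 1)

y_idx = remap_labels(y_demo)
X_s1, y_s1 = select_subject(X_demo, y_idx, person_demo, subject_id=1)

print(y_idx[:4])    # [0 1 2 3]
print(X_s1.shape)   # (4, 22, 1000)
```

With the real arrays, the same calls would take `X_train_valid`, `y_train_valid`, and `person_train_valid` as arguments. Which action each index represents still has to be read from Table 2 of the dataset documentation.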