# Interfacing with project data
We've written wrapper functions so that you don't need to do any complex file I/O with the project datasets (though you're welcome to do so if you're interested). The loading functions for the three datasets are provided below.

For each of these functions, you supply the path to the directory where you downloaded the datasets and the subject ID, plus any optional arguments specific to the project.

For the fMRI dataset, the function returns the data and the labels (and chunks).

For the EEG datasets, the functions return three variables: the data, the labels (and chunks), and the channel names.

## Clinical EEG Dataset

The clinical EEG dataset is taken from https://physionet.org/pn6/chbmit/, with references in the link.

The original recordings are 24-48 hours of continuous monitoring of patients with intractable seizures. We've clipped out the seizure events, along with the 10 minutes before and after each event. Each seizure event (before, during, and after) is denoted by a chunk.
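As a rough illustration of how the chunk structure can be used, the sketch below builds a small synthetic stand-in for the loader's outputs and splits the data matrix by chunk. The column names `labels` and `chunks` match the loader defined later in this notebook; the label values (0 and 1 here), the chunk IDs, and the helper name `split_by_chunk` are assumptions made for illustration and may not match the real files.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the loader's outputs (all values are made up).
eegdata = np.arange(24, dtype=float).reshape(8, 3)  # 8 samples x 3 channels
eegevents = pd.DataFrame({
    'labels': [0, 1, 1, 0, 0, 1, 0, 0],  # hypothetical coding: 1 = seizure, 0 = non-seizure
    'chunks': [0, 0, 0, 0, 1, 1, 1, 1],  # hypothetical chunk IDs, one per seizure event
})

def split_by_chunk(data, events):
    """Return {chunk_id: (data_rows, labels)} for each chunk (hypothetical helper)."""
    labels = events['labels'].to_numpy()
    out = {}
    # GroupBy.indices maps each chunk ID to the integer row positions in that chunk.
    for chunk_id, idx in events.groupby('chunks').indices.items():
        out[chunk_id] = (data[idx], labels[idx])
    return out

chunks = split_by_chunk(eegdata, eegevents)
for chunk_id, (rows, labels) in chunks.items():
    print(chunk_id, rows.shape, labels)
```

Keeping whole chunks together like this is useful when splitting data for cross-validation, since samples from the same seizure event are likely to be correlated.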

In [6]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import os

In [7]:
def load_clinical_eeg_data(datapath, sub):
    # input arguments:
    # datapath (string): path to the root directory
    # sub (string): subject ID (e.g. chb01, chb02, etc)
    
    # output:
    # eegdata (numpy array): samples x channels data matrix
    # eegevents (pandas dataframe): labels and chunks
    # channel_names (list): names of the channels
    import pandas as pd
    alldata = pd.read_csv(os.path.join(datapath, 'train', sub + '.csv'))
    eegevents = alldata[['labels', 'chunks']]
    # drop the saved row index and the event columns so only channel columns remain
    alldata = alldata.drop(['Unnamed: 0', 'labels', 'chunks'], axis=1)
    channel_names = list(alldata.columns)
    # .values replaces as_matrix(), which was removed in pandas 1.0
    return alldata.values, eegevents, channel_names

In [10]:
matrix, eegchunks, chan_name = load_clinical_eeg_data('', 'chb01')


from sklearn.decomposition import PCA

pca = PCA()
print(matrix.shape)
matrix = pca.fit_transform(matrix)
print(matrix.shape)

(462848, 23)
(462848, 23)
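The shape is unchanged because `PCA()` called without `n_components` keeps all 23 components, so the data is rotated but not reduced. A minimal sketch of one way to pick a smaller number of components from the cumulative explained variance is shown below. It uses random synthetic data in place of the EEG matrix, and the 95% threshold is an assumption chosen for illustration, not a recommendation for this dataset.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Synthetic data: 500 samples x 23 channels driven by 3 latent sources plus small noise.
sources = rng.randn(500, 3)
mixing = rng.randn(3, 23)
X = sources.dot(mixing) + 0.01 * rng.randn(500, 23)

# Fit with all components, then inspect how much variance each one explains.
pca_full = PCA().fit(X)
cumvar = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative explained variance reaches 95%.
n_keep = int(np.searchsorted(cumvar, 0.95) + 1)

X_reduced = PCA(n_components=n_keep).fit_transform(X)
print(X.shape, X_reduced.shape, n_keep)
```

On the real recordings the number of components needed will depend on the subject and on any preprocessing, so the threshold is worth tuning rather than fixing in advance.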
