To use it, you need to define two paths:
- project_path (path to the folder named project_eval in git), set **in cell 3**
- data_path (path to the data on your computer), set **in cell 11**

# Loading EEG eval

In [1]:
from importlib.util import spec_from_file_location, module_from_spec

In [2]:
import os

In [3]:
project_path = r'C:\Users\Antoine CHEHIRE\IdeaProjects\IFT6269_Project'

In [4]:
eeg_eval_path = os.path.join(project_path, 'Evaluation.py')
predictor_path = os.path.join(project_path, 'Predictors.py')

In [5]:
spec = spec_from_file_location('EEG eval', eeg_eval_path)
eeg_eval = module_from_spec(spec)
spec.loader.exec_module(eeg_eval)

spec = spec_from_file_location('Predictors', predictor_path)
preds = module_from_spec(spec)
spec.loader.exec_module(preds)

# Loading the filter

In [6]:
class RawData:
    name = "Raw data"
    
    def generate_features(self, time_series):
        """
        generate features from a time_series
        :param np.ndarray time_series: nb_of_observations x nb_of_sensors matrix
        :return np.ndarray feature_matrix: matrix of same shape
        """
        # We do nothing as we want raw data
        return time_series

In [7]:
algorithm = RawData()
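
`RawData` is the simplest possible filter. Assuming the evaluator only relies on the `name` attribute and the `generate_features` method (which is all that `RawData` exposes), another filter could be sketched as follows. `StandardizedData` is a hypothetical example, not a class from the project; it rescales each sensor to zero mean and unit variance and keeps the input shape.

```python
import numpy as np


class StandardizedData:
    # Hypothetical filter, shown only to illustrate the interface
    name = "Standardized data"

    def generate_features(self, time_series):
        """
        :param np.ndarray time_series: nb_of_observations x nb_of_sensors matrix
        :return np.ndarray feature_matrix: matrix of same shape
        """
        time_series = np.asarray(time_series, dtype=float)
        mean = time_series.mean(axis=0)  # one mean per sensor
        std = time_series.std(axis=0)    # one standard deviation per sensor
        std[std == 0] = 1.0              # avoid dividing by zero on flat sensors
        return (time_series - mean) / std
```

It would be used in place of `RawData` with `algorithm = StandardizedData()`. Note that the statistics are computed on whatever matrix is passed in, which may or may not be what you want for the test set.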

# Loading predictor

In [8]:
pred = preds.LogReg()

There is no need to optimize the hyper-parameters yet: the search takes too long and does not improve the results much. So let's impose the sklearn default.

In [9]:
pred.hyper_parameters_grid = {'C': [1]}
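
To see why a single-value grid disables the search, here is a small sketch of how a grid expands into candidate settings. It assumes the grid is read in the usual scikit-learn way (each key is a parameter, each list holds its candidate values); `expand_grid` is a hypothetical helper and not how `Predictors.py` necessarily does it.

```python
from itertools import product


def expand_grid(grid):
    """Return every parameter combination described by a grid dictionary."""
    keys = sorted(grid)
    return [dict(zip(keys, values))
            for values in product(*(grid[k] for k in keys))]


# One candidate only: nothing to search, the model is fitted with C=1
print(expand_grid({'C': [1]}))

# A larger, made-up grid: 3 x 2 = 6 candidates to fit and compare
print(len(expand_grid({'C': [0.1, 1, 10], 'penalty': ['l1', 'l2']})))
```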

# Evaluating

In [10]:
eeg = eeg_eval.EEGEval()

In [11]:
data_path = r'D:\Scolaire\UdeM\IFT_6269\PROJECT\data\kaggle_small'

In [12]:
from time import time

In [13]:
t0 = time()
eeg.evaluate(data_path, algorithm, pred, cv_fold=1, verbose=2)
time() - t0

Generating features...
Scoring 1 out of 6...


  'precision', 'predicted', average, warn_for)
  'precision', 'predicted', average, warn_for)


Best params obtained: {'C': 1}
Scoring 2 out of 6...
Best params obtained: {'C': 1}
Scoring 3 out of 6...
Best params obtained: {'C': 1}
Scoring 4 out of 6...
Best params obtained: {'C': 1}
Scoring 5 out of 6...
Best params obtained: {'C': 1}
Scoring 6 out of 6...
Best params obtained: {'C': 1}


1391.637484550476

The warning means that the classifier predicted only 0, since this yields better accuracy on such imbalanced labels. So we may need to find a way to balance the data.
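
The effect can be reproduced on a toy example. The labels below are made up (3 positives out of 100) and are not taken from the dataset; the sketch only shows that a classifier answering 0 everywhere gets a high accuracy while its precision and recall are 0.

```python
def binary_scores(y_true, y_pred):
    """Accuracy, precision and recall for 0/1 labels, without any library."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Precision is undefined when nothing is predicted positive; report 0,
    # which is also what the warning says sklearn falls back to
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


y_true = [1] * 3 + [0] * 97   # synthetic, heavily imbalanced labels
y_pred = [0] * 100            # classifier that always predicts 0

print(binary_scores(y_true, y_pred))  # (0.97, 0.0, 0.0)
```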

In [15]:
eeg.result

Unnamed: 0,Algo time,Accuracy 0,Precision 0,Recall 0,F1-score 0,Accuracy 1,Precision 1,Recall 1,F1-score 1,Accuracy 2,...,Recall 3,F1-score 3,Accuracy 4,Precision 4,Recall 4,F1-score 4,Accuracy 5,Precision 5,Recall 5,F1-score 5
Raw data - LogReg,10.9,97.41,0.0,0.0,0.0,97.41,0.0,0.0,0.0,97.41,...,0.0,0.0,97.41,0.0,0.0,0.0,97.41,0.0,0.0,0.0


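Algo time is not zero, even though the filter does nothing, because the data must be loaded from the data path into memory, which takes a few minutes.

As we can see, there is an issue with the sparsity of the data: accuracy is 97.41 for every task shown, while precision, recall and F1-score are all 0.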

# Saving it

Saving the result is crucial: it makes it easy to compare different models without re-running the whole pipeline, which takes ages.

In [None]:
path_to_save = os.path.join(project_path, 'Results')

In [None]:
file_name = eeg.result.index[0]

In [None]:
eeg.save_json(os.path.join(path_to_save, file_name+'.json'))

# Finer control (if necessary)

As the labels are not balanced, you may want to balance them manually to help the classifier.

In [None]:
features = eeg.generate_features(data_path, algorithm)

You may modify the features. Note, however, that the score function needs y_train and y_test to be vectors.

Thus you still need to select a single task, as in the evaluation protocol:

In [None]:
y_train = features['y_train']
y_test = features['y_test']

# Let's say you want to see the scores for the 1st task (column index 0):
j = 0
features['y_train'] = y_train[:, j]
features['y_test'] = y_test[:, j]
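
Once the labels are a single vector, one simple way to balance them is to randomly undersample the majority class. This is only a sketch of that technique: `undersample_majority` is a hypothetical helper, `X` stands for whatever training feature matrix is stored in `features` (its key is not shown here), and `X` is assumed to be aligned row by row with `y`.

```python
import numpy as np


def undersample_majority(X, y, seed=0):
    """Keep the same number of rows for every class by dropping random rows."""
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()  # size of the rarest class
    kept = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    kept.sort()  # keep the original (temporal) order of the rows
    return X[kept], y[kept]
```

Only the training set should be balanced this way; the test set should keep its original label distribution so that the scores stay meaningful.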

In [None]:
scores, best_params = eeg.score_features(features, pred, cv_fold=1, verbose=2)