# DATA LOADING AND PREPROCESSING OF SEED GER

For more information about `mne`, see the project website: [https://mne.tools/stable/index.html](https://mne.tools/stable/index.html)

In [1]:
# EEG processing library
import mne

# Data analysis libraries
import pandas as pd
import numpy as np

# System tools
import os
from tqdm import tqdm
import re
import time
import shutil

# Suppress warnings
import warnings
warnings.filterwarnings("ignore")
mne.set_log_level("ERROR")

print(np.__version__)
print(mne.__version__)

1.23.5
1.3.1


The cnt files containing the experimental data are loaded into MNE raw objects (MNE is a library specialized in, among other things, EEG processing).
The signals are then band-pass filtered, removing all frequency components below 0.5 Hz and above 55 Hz, which are treated as noise.
Each trial corresponds to the reading of 62 signals (one per channel) while a subject watches a video labeled with an emotion (positive, negative, or neutral).
Knowing that each of the MNE raw objects contains 62 signals in which the trials are stored one after another, it is necessary to divide these signals using the start and end times for each trial.
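The split described above comes down to converting trial boundaries from seconds into sample positions. The following is a minimal sketch of that arithmetic in plain Python. The sampling frequency of 1000 Hz is an assumption chosen for illustration (the notebook reads the real value from `eeg_raw.info['sfreq']`), and `trial_sample_bounds` is a hypothetical helper, not part of MNE.

```python
def trial_sample_bounds(start_s, end_s, sfreq):
    """Return (first_sample, last_sample) for a trial given in seconds."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    first = int(round(start_s * sfreq))
    last = int(round(end_s * sfreq))
    return first, last


sfreq = 1000.0  # assumed sampling frequency in Hz, for illustration only

# Boundaries of the first trial in this dataset: 5 s to 136 s
first, last = trial_sample_bounds(5, 136, sfreq)

# Number of samples when both endpoints are kept
n_samples = last - first + 1
print(first, last, n_samples)
```

Both endpoints are counted here because `crop` in MNE keeps the sample at `tmax` by default (`include_tmax=True`).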

In [2]:
os.getcwd()

'C:\\Users\\macka\\TFM_WD\\ORI\\SEED_GER'

In [3]:
# Get a list of the names of all the files in the folder
os.chdir('C:\\Users\\macka\\TFM_WD\\ORI\\SEED_GER')
ruta='01-EEG-raw/'
archivos = os.listdir(ruta)
names=[]
# Keep only the cnt files in the names list
for archivo in archivos:
    if archivo.endswith(".cnt"):
        names.append(archivo)
        
os.chdir(ruta)
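As a possible alternative to the listing step above, the same selection can be written with `pathlib`, which also makes it easy to return the names in a deterministic order (`os.listdir` does not guarantee one). This is only a sketch: `list_cnt_files` is a hypothetical helper, and the demo runs on a temporary folder with made-up file names rather than on the real `01-EEG-raw/` directory.

```python
import tempfile
from pathlib import Path


def list_cnt_files(folder):
    """Return the sorted names of the .cnt files directly inside folder."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == ".cnt"
    )


# Demo on a temporary folder with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for fname in ["2_1.cnt", "1_1.cnt", "notes.txt"]:
        (Path(tmp) / fname).touch()
    demo_names = list_cnt_files(tmp)

print(demo_names)
```

Note that the sort is lexicographic, so a name such as `10_1.cnt` would come before `2_1.cnt`.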

In [6]:
# To measure the execution time of this code
start_time = time.time()  # Record the start time

# Information about this dataset (when each trial starts and ends in each file, the label of the trial, etc.)
trial_num=['Trial-1','Trial-2','Trial-3','Trial-4','Trial-5','Trial-6','Trial-7','Trial-8','Trial-9','Trial-10',
            'Trial-11','Trial-12','Trial-13','Trial-14','Trial-15','Trial-16','Trial-17','Trial-18','Trial-19',
           'Trial-20']

emotions= ['POSITIVE', 'NEUTRAL', 'NEGATIVE', 'NEGATIVE', 'NEUTRAL', 'POSITIVE', 'NEGATIVE', 'NEUTRAL',
           'POSITIVE', 'POSITIVE', 'NEUTRAL', 'NEGATIVE', 'NEUTRAL', 'POSITIVE', 'NEUTRAL', 'NEGATIVE', 
           'NEGATIVE', 'NEUTRAL', 'POSITIVE', 'POSITIVE']

start_seconds = [5, 166, 411, 861, 1114, 1287, 1454, 1620, 1878, 2135, 2310, 2502, 2709, 3028, 3162, 3290, 3656, 3823, 4110, 4366]
end_seconds = [136, 381, 831, 1084, 1257, 1423, 1589, 1848, 2105, 2280, 2472, 2677, 2998, 3131, 3259, 3626, 3792, 4079, 4336, 4538]

# Creation of the folders where the new files will be stored
output_folder_positive = 'SEED_GER/POSITIVE'
output_folder_neutral = 'SEED_GER/NEUTRAL'
output_folder_negative = 'SEED_GER/NEGATIVE'

for folder in (output_folder_positive, output_folder_neutral, output_folder_negative):
    os.makedirs(folder, exist_ok=True)

# Creation of the new files, separating the data of the original ones
for name in tqdm(names, desc="Processing", total=len(names),leave=False):
    
    # Load the data and drop the channels that are not used
    eeg_raw = mne.io.read_raw_cnt(name, preload=True)
    useless_ch = ['M1', 'M2', 'VEO', 'HEO']
    eeg_raw.drop_channels(useless_ch)
    
    # Apply a band-pass filter to remove noise
    l_freq = 0.5  # Low cutoff frequency (Hz)
    h_freq = 55  # High cutoff frequency (Hz)
    eeg_raw.filter(l_freq, h_freq)
    
    # Split the EEG signal into the different trials
    for trial, start, end, emotion in tqdm(zip(trial_num, start_seconds, end_seconds, emotions), desc=f"Processing {name}", total=len(trial_num),leave=False):
        
        # Trial-2 and Trial-19 are not valid
        if trial in ['Trial-2', 'Trial-19']:
            continue

        # Extract the trial data; crop() takes times in seconds,
        # so the start and end times can be passed directly
        eeg_trial = eeg_raw.copy().crop(tmin=start, tmax=end)

        # Saving the trial in a fif file
        file_basename = os.path.splitext(os.path.basename(name))[0]
        output_filename = f"{file_basename}_{trial}_{emotion}_start_{start}_end_{end}.fif"
        if emotion == 'POSITIVE':
            output_path = os.path.join(output_folder_positive, output_filename)
        elif emotion == 'NEUTRAL':
            output_path = os.path.join(output_folder_neutral, output_filename)
        elif emotion == 'NEGATIVE':
            output_path = os.path.join(output_folder_negative, output_filename)
        eeg_trial.save(output_path, overwrite=True)

elapsed_time = time.time() - start_time
print(f"The code execution lasted {elapsed_time:.2f} seconds")


Processing:   0%|          | 0/20 [00:00<?, ?it/s]
Processing 1_1.cnt:   0%|          | 0/20 [00:00<?, ?it/s]
[...]
Processing 4_3.cnt: 100%|██████████| 20/20 [00:22<00:00,  1.04it/s]
Processing:  60%|██████    | 12/20 [09:46<06:32, 49.10s/it]
[...]

The code execution lasted 981.10 seconds
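Because the trial boundaries and labels are hard-coded, it may be worth checking them before a long run. The sketch below repeats the metadata from the cell above and verifies that the four lists have the same length, that each trial ends after it starts and before the next one begins, and how many trials per emotion remain once Trial-2 and Trial-19 are skipped. The checks themselves are a suggestion, not part of the original pipeline.

```python
from collections import Counter

trial_num = [f"Trial-{i}" for i in range(1, 21)]
emotions = ['POSITIVE', 'NEUTRAL', 'NEGATIVE', 'NEGATIVE', 'NEUTRAL',
            'POSITIVE', 'NEGATIVE', 'NEUTRAL', 'POSITIVE', 'POSITIVE',
            'NEUTRAL', 'NEGATIVE', 'NEUTRAL', 'POSITIVE', 'NEUTRAL',
            'NEGATIVE', 'NEGATIVE', 'NEUTRAL', 'POSITIVE', 'POSITIVE']
start_seconds = [5, 166, 411, 861, 1114, 1287, 1454, 1620, 1878, 2135,
                 2310, 2502, 2709, 3028, 3162, 3290, 3656, 3823, 4110, 4366]
end_seconds = [136, 381, 831, 1084, 1257, 1423, 1589, 1848, 2105, 2280,
               2472, 2677, 2998, 3131, 3259, 3626, 3792, 4079, 4336, 4538]
excluded = {"Trial-2", "Trial-19"}  # trials skipped in the main loop

# All four lists must describe the same set of trials
assert len(trial_num) == len(emotions) == len(start_seconds) == len(end_seconds)

# Each trial must end after it starts and before the next trial begins
for i, (start, end) in enumerate(zip(start_seconds, end_seconds)):
    assert end > start, f"{trial_num[i]} has a non-positive duration"
    if i + 1 < len(start_seconds):
        assert end < start_seconds[i + 1], f"{trial_num[i]} overlaps the next trial"

# Count the trials per emotion that are actually saved
kept = Counter(e for t, e in zip(trial_num, emotions) if t not in excluded)
print(dict(kept))
```

With the two invalid trials removed, 18 trials remain per file, 6 for each emotion.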




# Explanations

**Noise Filters**: when choosing the cutoff frequencies of the band-pass filter, consider the following:

**Components of the signal of interest**: Depending on the analysis objective and the data characteristics, you may want to preserve certain frequency bands in the signal. For example, typical frequency bands in EEG analysis include delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), beta (12-30 Hz), and gamma (30-100 Hz). Make sure the cutoff frequencies do not remove the frequency bands of interest.

**Study Requirements**: Some studies may require analysis in specific frequency ranges. For example, in the study of brain activity related to emotions, alpha and beta frequency bands may be of particular interest. Adjust the cutoff frequencies according to the specific requirements of your study.
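To see how a given pair of cutoff frequencies relates to the bands listed above, one can compute how much of each band falls inside the passband. The sketch below uses the band limits quoted in this section and the 0.5 Hz and 55 Hz cutoffs used in the notebook. It treats the filter as an ideal brick wall, which is a simplifying assumption (a real filter has transition bands), and `band_coverage` is a hypothetical helper.

```python
# Band limits in Hz, as listed in this section
bands = {
    "delta": (0.5, 4),
    "theta": (4, 8),
    "alpha": (8, 12),
    "beta": (12, 30),
    "gamma": (30, 100),
}
l_freq, h_freq = 0.5, 55  # cutoff frequencies used in the notebook


def band_coverage(band, l_freq, h_freq):
    """Fraction of a band's width that lies inside [l_freq, h_freq]."""
    lo, hi = band
    kept = max(0.0, min(hi, h_freq) - max(lo, l_freq))
    return kept / (hi - lo)


coverage = {name: band_coverage(b, l_freq, h_freq) for name, b in bands.items()}
for name, frac in coverage.items():
    print(f"{name}: {frac:.0%} of the band lies inside the passband")
```

Under this idealization, delta through beta are fully preserved, while only the 30-55 Hz portion of the gamma band is kept.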