# Processing EEG for Stimuli Research

## 1. Importing Necessary Libraries
Explanation:
- `mne` reads and processes EEG data
- `numpy` provides numerical arrays and routines
- `matplotlib` is used for plotting
- `pandas` handles tabular data
- `os` provides file-system and path functions
- `tabulate` pretty-prints tables
- `get_psd_feature` comes from the local `utils` module and is used in Section 4.2

In [25]:
import mne
import numpy as np
import matplotlib.pyplot as plt
import os
import pandas as pd
from tabulate import tabulate

from utils import get_psd_feature

## 2. Define Working Path

The working path is the folder that holds the EEG recordings. Here it is the `EEG_EDF_Files_RSY` folder, which sits next to this notebook.

In [2]:
data_path = "EEG_EDF_Files_RSY"

## 3. Get the List of Subjects

**File Naming Convention**
`[subject_id]_[noise_type]_[read/nap]_EPOCX_[the_rest_of_timestamp].edf`

- `subject_id`: the subject number, an integer starting from 1
- `noise_type`: one of `brown`, `pink`, `white`, or `silent` (no noise)
- `read/nap`: the task the subject performs, either `read` (reading a book) or `nap` (taking a nap)
- `the_rest_of_timestamp`: the recording timestamp in the EPOCX format

The files listed below use a shortened form, `[subject_id]_[noise_type]_[read/nap].edf`, without the `EPOCX` timestamp suffix.
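
The naming convention above can be turned into a small parser. The following is a minimal sketch and not part of the original notebook: the function name `parse_recording_name` and the long-form example filename are hypothetical, and the pattern treats the `_EPOCX_...` suffix as optional so that short names such as `1_brown_nap.edf` also match.

```python
import re

# Pattern for the convention above; the EPOCX timestamp suffix is
# treated as optional (an assumption, so short names also match).
RECORDING_PATTERN = re.compile(
    r"^(?P<subject>\d+)_"
    r"(?P<noise_type>brown|pink|white|silent)_"
    r"(?P<task>read|nap)"
    r"(?:_EPOCX_(?P<timestamp>.+))?"
    r"\.edf$"
)


def parse_recording_name(filename):
    """Hypothetical helper: return the filename parts, or None if there is no match."""
    match = RECORDING_PATTERN.match(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["subject"] = int(parts["subject"])  # subject IDs are integers
    return parts


print(parse_recording_name("1_brown_nap.edf"))
print(parse_recording_name("2_pink_read_EPOCX_example.edf"))  # made-up long-form name
print(parse_recording_name("notes.txt"))  # unrelated file, returns None
```

Returning `None` for names that do not match makes it easy to skip stray files in the data folder instead of failing on them.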


In [3]:
file_list = sorted(os.listdir(data_path))
print(f"File list: {file_list}")

File list: ['1_brown_nap.edf', '1_brown_read.edf', '1_pink_nap.edf', '1_pink_read.edf', '1_silent_nap.edf', '1_silent_read.edf', '1_white_nap.edf', '1_white_read.edf', '2_brown_nap.edf', '2_brown_read.edf', '2_pink_nap.edf', '2_pink_read.edf', '2_silent_nap.edf', '2_silent_read.edf', '2_white_nap.edf', '2_white_read.edf', '3_brown_nap.edf', '3_brown_read.edf', '3_pink_nap.edf', '3_pink_read.edf', '3_silent_nap.edf', '3_silent_read.edf', '3_white_nap.edf', '3_white_read.edf', '4_brown_nap.edf', '4_brown_read.edf', '4_pink_nap.edf', '4_pink_read.edf', '4_silent_nap.edf', '4_silent_read.edf', '4_white_nap.edf', '4_white_read.edf']
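
The listing contains 32 files, which matches 4 subjects × 4 noise types × 2 tasks. A quick completeness check could look like the sketch below. `expected_recordings` and `missing_recordings` are hypothetical helpers, and the short `[subject_id]_[noise_type]_[task].edf` form seen in the listing is assumed.

```python
from itertools import product

NOISE_TYPES = ("brown", "pink", "silent", "white")
TASKS = ("nap", "read")


def expected_recordings(n_subjects=4):
    """Hypothetical helper: every filename a complete dataset would contain."""
    return sorted(
        f"{subject}_{noise}_{task}.edf"
        for subject, noise, task in product(range(1, n_subjects + 1), NOISE_TYPES, TASKS)
    )


def missing_recordings(file_list, n_subjects=4):
    """Return the expected filenames that are absent from file_list."""
    return sorted(set(expected_recordings(n_subjects)) - set(file_list))


complete = expected_recordings()
print(len(complete))                     # 32
print(missing_recordings(complete[1:]))  # ['1_brown_nap.edf']
```

In the notebook, `missing_recordings(file_list)` would return an empty list for the listing shown above.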


### 3.1. Create a Dataframe

In [4]:
data = []

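for item in file_list:
    # drop the extension first, then split the stem on underscores;
    # this also works if a timestamp suffix follows the task
    stem = os.path.splitext(item)[0]
    split_item = stem.split('_')

    id_no = split_item[0]
    noise_type = split_item[1]
    task = split_item[2]
    full_path = os.path.join(data_path, item)

    # collect one record per file; the dataframe is built in the next cell
    data.append({'subject': id_no, 'noise_type': noise_type, 'task': task, 'path': full_path})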



In [27]:
df = pd.DataFrame(data)
print(tabulate(df, headers='keys', tablefmt='fancy_grid'))

╒════╤═══════════╤══════════════╤════════╤═════════════════════════════════════╕
│    │   subject │ noise_type   │ task   │ path                                │
╞════╪═══════════╪══════════════╪════════╪═════════════════════════════════════╡
│  0 │         1 │ brown        │ nap    │ EEG_EDF_Files_RSY/1_brown_nap.edf   │
├────┼───────────┼──────────────┼────────┼─────────────────────────────────────┤
│  1 │         1 │ brown        │ read   │ EEG_EDF_Files_RSY/1_brown_read.edf  │
├────┼───────────┼──────────────┼────────┼─────────────────────────────────────┤
│  2 │         1 │ pink         │ nap    │ EEG_EDF_Files_RSY/1_pink_nap.edf    │
├────┼───────────┼──────────────┼────────┼─────────────────────────────────────┤
│  3 │         1 │ pink         │ read   │ EEG_EDF_Files_RSY/1_pink_read.edf   │
├────┼───────────┼──────────────┼────────┼─────────────────────────────────────┤
│  4 │         1 │ silent       │ nap    │ EEG_EDF_Files_RSY/1_silent_nap.edf  │
├────┼───────────┼──────────────┼────────┼─────────────────────────────────────┤
│  … │ (remaining rows not shown)

---

## 4. Processing Single Files

### 4.1. Read the EDF

In [6]:
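# load the first recording (subject 1, brown noise, nap) as an MNE Raw object
brown_nap_raw = mne.io.read_raw_edf(df['path'][0], preload=True, verbose=False)
# convert it to a pandas dataframe
brown_nap_df = brown_nap_raw.to_data_frame()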

### 4.2. Using the Pre-configured Function
**Args:**
- `dataframe`: the EEG data of one recording as a dataframe, as produced by `to_data_frame()` in Section 4.1
- `freq_type`: the frequency band to extract. It can be `alpha`, `beta`, `delta`, `theta`, or `gamma`
- `fs`: the sampling frequency of the EEG data in Hz (Default: 256)
- `len_drop`: the length of data dropped from the beginning of the recording (Default: 7680; if counted in samples, that is 30 s at 256 Hz)
- `len_keep`: the length of data kept after the initial drop (Default: 46080; if counted in samples, that is 180 s at 256 Hz)
- `plot_psd`: whether to plot the PSD of the data (Default: False)
- `save_psd`: whether to save the PSD of the data (Default: False). The default output folder is `output`
- `channel_drop`: the channels to drop from the data (Default: `None`). Must be a list of channel names in capital letters, for example: `['FP1', 'FP2']`
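
To make the `len_drop` and `len_keep` defaults easier to read, the sketch below converts them to seconds and builds the corresponding slice. It assumes both values are counted in samples, which the notebook does not state. `samples_to_seconds` and `analysis_window` are hypothetical names and are not part of `utils`.

```python
FS = 256          # default sampling frequency from the argument list
LEN_DROP = 7680   # default len_drop
LEN_KEEP = 46080  # default len_keep


def samples_to_seconds(n_samples, fs=FS):
    """Convert a sample count to seconds (assumes the count is in samples)."""
    return n_samples / fs


def analysis_window(len_drop=LEN_DROP, len_keep=LEN_KEEP):
    """Slice that skips len_drop samples and keeps the next len_keep samples."""
    return slice(len_drop, len_drop + len_keep)


print(samples_to_seconds(LEN_DROP))  # 30.0
print(samples_to_seconds(LEN_KEEP))  # 180.0
print(analysis_window())             # slice(7680, 53760, None)
```

Under that assumption, `brown_nap_df.iloc[analysis_window()]` would select the same window on a dataframe, if `get_psd_feature` trims the data this way.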

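**Returns:**
- A dictionary with the keys below. The descriptions are inferred from the key names and from the printed output, not from the function's source:
    - `sum_raw`: sum of the PSD values before band filtering
    - `avg_raw`: average of the PSD values before band filtering
    - `sum_filtered`: sum of the PSD values within the selected band
    - `avg_filtered`: average of the PSD values for the selected band
    - `rel_pow`: relative power of the selected band; in the output below it equals `sum_filtered / sum_raw`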

In [7]:
results = get_psd_feature(
    dataframe=brown_nap_df,
    freq_type='alpha',
)

In [8]:
# for every key in results, print the key and the value
for key, value in results.items():
    print(f"{key}: {value}")

sum_raw: 361839889.22784317
avg_raw: 1121.726764177656
sum_filtered: 8474707.797211967
avg_filtered: 26.272135377345872
rel_pow: 0.0234211540781111
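
The printed values are internally consistent: `rel_pow` equals `sum_filtered / sum_raw`, and also `avg_filtered / avg_raw`. The check below uses only the numbers printed above. That `get_psd_feature` computes `rel_pow` exactly this way is an inference from the output, not something the notebook states.

```python
import math

# Values copied from the output above
printed = {
    "sum_raw": 361839889.22784317,
    "avg_raw": 1121.726764177656,
    "sum_filtered": 8474707.797211967,
    "avg_filtered": 26.272135377345872,
    "rel_pow": 0.0234211540781111,
}

ratio_of_sums = printed["sum_filtered"] / printed["sum_raw"]
ratio_of_avgs = printed["avg_filtered"] / printed["avg_raw"]
# number of values that were averaged, implied by sum / average
n_values = printed["sum_raw"] / printed["avg_raw"]

print(math.isclose(ratio_of_sums, printed["rel_pow"], rel_tol=1e-9))
print(math.isclose(ratio_of_avgs, printed["rel_pow"], rel_tol=1e-9))
print(round(n_values))
```

In other words, about 2.3% of the total power falls in the alpha band for this recording. The implied count of averaged values is 322574, which factors as 14 × 23041. That would fit 14 channels and the 23041 bins of a one-sided spectrum over 46080 samples, but this reading is a guess.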


## 5. Processing Batch Files