# 🧠 NeuMa EEG Preprocessing and Feature Extraction

**Objective:**  
Explore and preprocess the NeuMa (Neuromarketing) EEG dataset from OpenNeuro (ds004588).  
We’ll:
1. Download subject data using `openneuro-py`
2. Inspect dataset structure
3. Load EEG signals for one participant
4. Perform preprocessing (filtering, referencing)
5. Extract simple features (bandpower in alpha/beta ranges)
6. Prepare a clean feature table for downstream modeling (e.g., attention/purchase prediction)
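Steps 4 and 5 can be sketched on synthetic data before touching the real recording. This is a minimal illustration, not the notebook's final pipeline: the sampling rate `FS`, the 1–40 Hz pass band, the average reference, and the alpha (8–13 Hz) and beta (13–30 Hz) band edges are all assumptions chosen for the example, and `bandpass`, `average_reference`, and `band_power` are hypothetical helper names.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, welch

FS = 300.0  # assumed sampling rate in Hz; read the real value from the recording
BANDS = {"alpha": (8.0, 13.0), "beta": (13.0, 30.0)}  # assumed band edges in Hz


def bandpass(data, fs, low=1.0, high=40.0, order=4):
    """Zero-phase Butterworth band-pass along the last (time) axis."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)


def average_reference(data):
    """Subtract the across-channel mean from every sample."""
    return data - data.mean(axis=0, keepdims=True)


def band_power(data, fs, band):
    """Sum the Welch PSD over the band for each channel (rectangle rule)."""
    freqs, psd = welch(data, fs=fs, nperseg=int(2 * fs), axis=-1)
    low, high = band
    mask = (freqs >= low) & (freqs <= high)
    df = freqs[1] - freqs[0]
    return psd[:, mask].sum(axis=-1) * df


# Synthetic 3-channel signal: 10 Hz on channel 0, 20 Hz on channel 1, noise only on channel 2
rng = np.random.default_rng(0)
t = np.arange(0, 10.0, 1.0 / FS)
raw = np.vstack([
    np.sin(2 * np.pi * 10.0 * t),
    np.sin(2 * np.pi * 20.0 * t),
    np.zeros_like(t),
]) + 0.1 * rng.standard_normal((3, t.size))

# Filter first, then re-reference, then compute one power value per channel and band
clean = average_reference(bandpass(raw, FS))
features = {name: band_power(clean, FS, band) for name, band in BANDS.items()}

for ch in range(clean.shape[0]):
    print(f"channel {ch}: alpha={features['alpha'][ch]:.4f}, beta={features['beta'][ch]:.4f}")
```

Channel 0 should show more alpha than beta power and channel 1 the reverse, which is a quick sanity check that the band masks are doing what is intended.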

#### Installing Dependencies

In [None]:
# 📦 Install dependencies
%pip install openneuro-py mne numpy pandas matplotlib scipy torch scikit-learn

#### Importing Libraries

In [5]:
# 🧩 Imports
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mne import io, Epochs, pick_types, events_from_annotations
from mne.preprocessing import ICA
import mne
from scipy.signal import welch

## 1. Download dataset subset from OpenNeuro
We’ll download only **subject S01** for testing.

In [None]:
!openneuro-py download --dataset=ds004588 --include=sub-S01

## 2. Explore folder structure
Let’s inspect what we just downloaded. This will help us locate the EEG files and any metadata.

In [42]:
import os

base_path = "ds004588"
for root, dirs, files in os.walk(base_path):
    level = root.replace(base_path, '').count(os.sep)
    indent = ' ' * 2 * (level)
    print(f"{indent}{os.path.basename(root)}/")
    subindent = ' ' * 2 * (level + 1)
    for f in files:
        if f.endswith(('.edf', '.bdf', '.vhdr', '.set')):
            print(f"{subindent}{f}")

ds004588/
  sub-S01/
    eye_tracker/
      sub-S01_task-unnamed_et.set
    eeg/
      sub-S01_task-unnamed_eeg.set


## 3. Load EEG data
We’ll use MNE to load the first subject’s EEG recording. The NeuMa dataset uses 21 dry electrodes (10–20 system).

In [57]:
# channels.tsv is tab-separated, so read_csv needs sep='\t'
eeg_channel_path = "ds004588/sub-S01/eeg/sub-S01_task-unnamed_channels.tsv"
eeg_tsv = pd.read_csv(eeg_channel_path, sep='\t')
eeg_tsv  # read_csv already returns a DataFrame

# The same approach can be used for the eye-tracking files if needed


Unnamed: 0,name,type,units
0,P3,EEG,uV
1,C3,EEG,uV
2,F3,EEG,uV
3,Fz,EEG,uV
4,F4,EEG,uV
5,C4,EEG,uV
6,P4,EEG,uV
7,Cz,EEG,uV
8,Pz,EEG,uV
9,Fp1,EEG,uV


### Available Features

In [58]:
pd.DataFrame(eeg_tsv['name']).T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,14,15,16,17,18,19,20,21,22,23
name,P3,C3,F3,Fz,F4,C4,P4,Cz,Pz,Fp1,...,O2,X3,X2,F7,F8,X1,A2,T6,T4,TRG
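The channel list contains labels such as `X1`, `X2`, `X3`, and `TRG` that are not standard 10–20 scalp positions. The sketch below separates them from the remaining channels. It assumes, without confirmation from the dataset documentation, that these four labels are auxiliary or trigger channels; `split_channels` is a hypothetical helper, and only the names visible in the output above are used (columns 10 to 13 are elided there and left out here).

```python
# Channel names visible in the output above; the elided columns are omitted on purpose
visible_names = [
    "P3", "C3", "F3", "Fz", "F4", "C4", "P4", "Cz", "Pz", "Fp1",
    "O2", "X3", "X2", "F7", "F8", "X1", "A2", "T6", "T4", "TRG",
]

# Assumption: these labels are auxiliary or trigger channels, not scalp EEG
ASSUMED_NON_EEG = {"X1", "X2", "X3", "TRG"}


def split_channels(names, non_eeg=ASSUMED_NON_EEG):
    """Return (kept, excluded) channel names, preserving the original order."""
    kept = [name for name in names if name not in non_eeg]
    excluded = [name for name in names if name in non_eeg]
    return kept, excluded


eeg_names, other_names = split_channels(visible_names)
print(f"{len(eeg_names)} channels kept, excluded: {other_names}")
```

Once the recording is loaded with MNE, a list like `eeg_names` could be passed to the `pick` method of the raw object to restrict later processing to those channels. Check the dataset's channel description before relying on this split.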


In [None]:
# Path to the EEGLAB .set file (the matching .fdt is assumed to be in the same directory)
eeg_data_path = 'ds004588/sub-S01/eeg/sub-S01_task-unnamed_eeg.set'

# Load the EEGLAB dataset
raw_eeg = mne.io.read_raw_eeglab(eeg_data_path)

# get_data() returns a NumPy array of shape (n_channels, n_samples)
eeg_data = raw_eeg.get_data()

print(f"Shape of EEG data: {eeg_data.shape}")

# Transpose so that rows are samples and columns are channels
pd.DataFrame(eeg_data).T


Reading /Users/AnantGoyal/Library/CloudStorage/OneDrive-Personal/Documents/UIUC/1st Year/neurotech/ds004588/sub-S01/eeg/sub-S01_task-unnamed_eeg.fdt
Shape of EEG data: (24, 110555)




Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,14,15,16,17,18,19,20,21,22,23
0,0.000187,-0.000878,0.000058,-0.000336,-0.000517,-0.000410,-0.000290,-0.000494,-0.000130,0.000351,...,-0.000254,-0.000008,-0.000014,0.000045,-0.000306,-0.000013,0.000112,-0.000473,-0.000650,0.0
1,0.000182,-0.000880,0.000061,-0.000340,-0.000524,-0.000410,-0.000294,-0.000496,-0.000129,0.000356,...,-0.000263,-0.000008,-0.000014,0.000044,-0.000308,-0.000013,0.000103,-0.000480,-0.000645,0.0
2,0.000186,-0.000875,0.000061,-0.000336,-0.000522,-0.000411,-0.000287,-0.000494,-0.000133,0.000359,...,-0.000266,-0.000008,-0.000013,0.000046,-0.000307,-0.000013,0.000110,-0.000479,-0.000645,0.0
3,0.000192,-0.000868,0.000068,-0.000333,-0.000520,-0.000405,-0.000283,-0.000489,-0.000120,0.000358,...,-0.000263,-0.000008,-0.000013,0.000047,-0.000312,-0.000013,0.000100,-0.000480,-0.000645,0.0
4,0.000190,-0.000875,0.000062,-0.000328,-0.000516,-0.000403,-0.000284,-0.000486,-0.000117,0.000359,...,-0.000255,-0.000008,-0.000014,0.000047,-0.000311,-0.000013,0.000111,-0.000473,-0.000637,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
110550,0.000034,-0.000620,0.000091,-0.000225,-0.000312,-0.000569,-0.000318,-0.000054,-0.000083,0.000222,...,-0.000106,-0.000008,-0.000014,0.000058,-0.000067,-0.000013,0.000023,-0.000347,-0.000469,0.0
110551,0.000033,-0.000616,0.000085,-0.000223,-0.000304,-0.000565,-0.000315,-0.000048,-0.000079,0.000212,...,-0.000099,-0.000008,-0.000014,0.000052,-0.000058,-0.000013,0.000034,-0.000355,-0.000458,0.0
110552,0.000017,-0.000623,0.000076,-0.000230,-0.000309,-0.000575,-0.000323,-0.000057,-0.000093,0.000208,...,-0.000115,-0.000008,-0.000014,0.000037,-0.000060,-0.000013,0.000033,-0.000358,-0.000473,0.0
110553,0.000028,-0.000614,0.000080,-0.000223,-0.000301,-0.000562,-0.000315,-0.000047,-0.000081,0.000213,...,-0.000101,-0.000008,-0.000013,0.000039,-0.000047,-0.000013,0.000033,-0.000341,-0.000455,0.0
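The table above has integer column labels and values on the order of 1e-4, because MNE stores EEG data in volts. A labelled view in microvolts is often easier to read. The sketch below shows one way to build it, using a small random array in place of the real recording; `to_labelled_frame` is a hypothetical helper, and the sampling rate of 300 Hz and the three channel names are placeholders for the demonstration only.

```python
import numpy as np
import pandas as pd


def to_labelled_frame(data, ch_names, sfreq):
    """Build a samples x channels DataFrame in microvolts, indexed by time in seconds."""
    data = np.asarray(data)
    if data.shape[0] != len(ch_names):
        raise ValueError("data must have one row per channel name")
    times = np.arange(data.shape[1]) / sfreq
    return pd.DataFrame(
        data.T * 1e6,  # volts -> microvolts
        columns=list(ch_names),
        index=pd.Index(times, name="time_s"),
    )


# Placeholder input: 3 channels x 5 samples, in volts (not real EEG)
rng = np.random.default_rng(0)
demo = rng.normal(scale=1e-4, size=(3, 5))
frame = to_labelled_frame(demo, ["P3", "C3", "F3"], sfreq=300.0)
print(frame)
```

With the real recording, the call would look like `to_labelled_frame(eeg_data, raw_eeg.ch_names, raw_eeg.info["sfreq"])`. Note that this scales every channel, including any trigger channel, so non-EEG columns may need to be dropped or handled separately.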
