# EEG Data Preprocessing with MNE

This notebook computes spectral connectivity features from pre-epoched, cleaned EEG data using MNE-Python and MNE-Connectivity, for use in Alzheimer's disease classification.

## Key Processing Steps
1. **Data Loading**: Load pre-epoched, cleaned EEG data (one `.npy` file per subject) derived from an OpenNeuro dataset
2. **Connectivity Analysis**: Calculate multiple connectivity measures (coherence, PLV, wPLI)
3. **Frequency Band Analysis**: Average connectivity within the standard EEG frequency bands (delta, theta, alpha, beta, gamma)
4. **Data Export**: Save processed features for downstream analysis

## Connectivity Measures
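- **Coherence (COH)**: Measures the linear dependency between two signals in the frequency domain
- **Phase Locking Value (PLV)**: Measures the consistency of the phase difference between two signals, independent of amplitude
- **Weighted Phase Lag Index (wPLI)**: Measures phase coupling while down-weighting zero-lag contributions, which makes it less sensitive to volume conduction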
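
The three measures above can all be expressed through the cross-spectrum between two channels, averaged over observations (epochs or time points). The NumPy sketch below is only an illustration under that assumption: the function name `pairwise_connectivity` and the synthetic signals are hypothetical, and it is not a description of how `mne_connectivity` computes these values internally.

```python
import numpy as np

def pairwise_connectivity(x, y):
    """Return (coh, plv, wpli) for two complex-valued spectral estimates.

    x, y: 1-D complex arrays, one spectral coefficient per observation
    (hypothetical inputs, e.g. wavelet coefficients at a single frequency).
    """
    s_xy = x * np.conj(y)            # cross-spectrum per observation
    s_xx = np.abs(x) ** 2            # auto-spectrum of x
    s_yy = np.abs(y) ** 2            # auto-spectrum of y

    # Coherence: magnitude of the mean cross-spectrum, normalised by power
    coh = np.abs(s_xy.mean()) / np.sqrt(s_xx.mean() * s_yy.mean())
    # PLV: length of the mean unit phasor of the phase difference
    plv = np.abs((s_xy / np.abs(s_xy)).mean())
    # wPLI: uses only the imaginary part, so zero-lag coupling contributes nothing
    imag = s_xy.imag
    wpli = np.abs(imag.mean()) / np.abs(imag).mean()
    return coh, plv, wpli

rng = np.random.default_rng(0)
x = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
y = x * np.exp(1j * np.pi / 4)       # same signal with a fixed 45-degree phase shift
coh, plv, wpli = pairwise_connectivity(x, y)
```

With a fixed, non-zero phase lag as in this toy example, all three measures come out at 1. The measures differ mainly in how they treat amplitude and zero-lag coupling.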

## Output
One `.npy` file per subject and connectivity measure, holding band-averaged connectivity values for every epoch, ready for machine learning models.
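
As a rough sketch of how one saved array might be turned into a feature vector for a classifier: the code below averages over epochs and keeps one value per unique channel pair and band. It assumes the saved array has shape `(n_epochs, n_channels**2, n_bands)`, which should be checked against the actual files. The helper name `to_feature_vector` and the synthetic input are hypothetical.

```python
import numpy as np

def to_feature_vector(con, n_channels):
    """Average over epochs and keep the lower triangle for each band.

    con: array ASSUMED to have shape (n_epochs, n_channels**2, n_bands).
    Returns a 1-D vector of length n_pairs * n_bands.
    """
    n_epochs, n_flat, n_bands = con.shape
    if n_flat != n_channels ** 2:
        raise ValueError("unexpected number of connections for this channel count")
    square = con.reshape(n_epochs, n_channels, n_channels, n_bands)
    mean_con = square.mean(axis=0)                   # (n_channels, n_channels, n_bands)
    rows, cols = np.tril_indices(n_channels, k=-1)   # unique channel pairs
    return mean_con[rows, cols, :].ravel()           # pairs vary slowest, bands fastest

# Synthetic stand-in for one saved file: 3 epochs, 4 channels, 5 bands
demo = np.random.default_rng(0).random((3, 4 * 4, 5))
features = to_feature_vector(demo, n_channels=4)
```

Here 4 channels give 6 unique pairs, so `features` has 6 × 5 = 30 entries. Stacking one such vector per subject would give a subjects-by-features matrix.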

In [None]:
# Install required connectivity package
!pip install mne_connectivity

Collecting mne_connectivity
  Downloading mne_connectivity-0.6.0-py3-none-any.whl.metadata (10 kB)
Collecting netCDF4>=1.6.5 (from mne_connectivity)
  Downloading netCDF4-1.6.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.8 kB)
Collecting cftime (from netCDF4>=1.6.5->mne_connectivity)
  Downloading cftime-1.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (8.6 kB)
Collecting packaging (from mne>=1.6->mne_connectivity)
  Downloading packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Downloading mne_connectivity-0.6.0-py3-none-any.whl (107 kB)
Downloading netCDF4-1.6.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.5 MB)
Downloading packaging-24.0-py3-none-any.whl (53 k

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import mne
import mne_connectivity
import time
import os

In [None]:
# Define EEG frequency bands
Freq_Bands = {
    "delta": [1.0, 4.0], 
    "theta": [4.0, 8.0], 
    "alpha": [8.0, 13.0], 
    "beta": [13.0, 30.0], 
    "gamma": [30.0, 50.0]
}

# Configuration parameters
n_freq_bands = len(Freq_Bands)
min_freq = np.min(list(Freq_Bands.values()))  # lowest band edge (1.0 Hz)
max_freq = np.max(list(Freq_Bands.values()))  # highest band edge (50.0 Hz)
sfreq = 500  # Sampling frequency in Hz

# Frequencies of interest: 0.25 Hz steps from min_freq to max_freq
freqs = np.linspace(min_freq, max_freq, int((max_freq - min_freq) * 4 + 1))

# Lower and upper band edges as tuples, in band order, for MNE functions
fmin = tuple(low for low, _ in Freq_Bands.values())
fmax = tuple(high for _, high in Freq_Bands.values())

# Define connectivity methods to compute
connectivity_methods = ["coh", "plv", "wpli"]
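
To make the role of `fmin`, `fmax` and `faverage=True` more concrete, here is a minimal sketch of band averaging over the frequency grid defined above. It assumes both band edges are inclusive, which may not match the library's exact edge handling. The helper `band_average` is hypothetical and not part of MNE.

```python
import numpy as np

bands = {"delta": (1.0, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0),
         "beta": (13.0, 30.0), "gamma": (30.0, 50.0)}
freqs = np.linspace(1.0, 50.0, 197)   # 0.25 Hz grid, as in the cell above

def band_average(values, freqs, bands):
    """Average `values` (last axis = frequency) within each band, edges inclusive."""
    out = []
    for low, high in bands.values():
        mask = (freqs >= low) & (freqs <= high)
        out.append(values[..., mask].mean(axis=-1))
    return np.stack(out, axis=-1)

# Toy input whose value at each frequency is the frequency itself,
# so each band average is simply the band's centre frequency
band_means = band_average(freqs, freqs, bands)
```

With inclusive edges, a shared boundary such as 4.0 Hz counts towards both delta and theta. That is one reason to check how the library treats band edges if exact values matter.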

In [None]:
# Setup output directories for connectivity measures
output_dirs = {
    "coh": "/kaggle/working/coh/",
    "plv": "/kaggle/working/plv/", 
    "wpli": "/kaggle/working/wpli/"
}

# Create output directories
for dir_path in output_dirs.values():
    os.makedirs(dir_path, exist_ok=True)

print("Processing connectivity measures for all subjects...")

# Process each subject
for subject in range(1, 89):
    file_path = f"/kaggle/input/epoched-cleaned/epochs_data/subject_{subject}.npy"
    
    if os.path.exists(file_path):
        print(f"Processing subject {subject}...")
        
        # Load epoched EEG data; expected shape: (n_epochs, n_channels, n_times)
        data = np.load(file_path)
        
        # Calculate connectivity measures
        start_time = time.time()
        con_time = mne_connectivity.spectral_connectivity_time(
            data,
            freqs,
            method=connectivity_methods,
            sfreq=sfreq,
            mode="cwt_morlet",
            fmin=fmin,
            fmax=fmax,
            faverage=True,
            decim=10
        )
        
        # Save one array per connectivity method; results are returned in the
        # same order as connectivity_methods
        for method, con in zip(connectivity_methods, con_time):
            out_path = os.path.join(output_dirs[method], f"subject_{subject}_{method}.npy")
            np.save(out_path, con.get_data())
        
        processing_time = time.time() - start_time
        print(f"Subject {subject} completed in {processing_time:.2f} seconds")
        
    else:
        print(f"Subject {subject} file not found, skipping.")

print("All connectivity processing completed!")

only using indices for lower-triangular matrix
Connectivity computation...
   Processing epoch 1 / 19 ...
   Processing epoch 2 / 19 ...
   Processing epoch 3 / 19 ...
   Processing epoch 4 / 19 ...
   Processing epoch 5 / 19 ...
   Processing epoch 6 / 19 ...
   Processing epoch 7 / 19 ...
   Processing epoch 8 / 19 ...
   Processing epoch 9 / 19 ...
   Processing epoch 10 / 19 ...
   Processing epoch 11 / 19 ...
   Processing epoch 12 / 19 ...
   Processing epoch 13 / 19 ...
   Processing epoch 14 / 19 ...
   Processing epoch 15 / 19 ...
   Processing epoch 16 / 19 ...
   Processing epoch 17 / 19 ...
   Processing epoch 18 / 19 ...
   Processing epoch 19 / 19 ...
[Connectivity computation done]
98.99924898147583
Subject 1 file saved.
only using indices for lower-triangular matrix
Connectivity computation...
   Processing epoch 1 / 25 ...
   Processing epoch 2 / 25 ...
   Processing epoch 3 / 25 ...
   Processing epoch 4 / 25 ...
   Processing epoch 5 / 25 ...
   Processing epoch 6 / 