# Final Bachelor's Project
## Section two: EEG Epilepsy Feature Extraction
### Author: Pouya Taghipour
### Student ID: 9933014
### Supervisors: 
- Dr. Farnaz Ghassemi
- Dr. Fatemeh Zare
- Dr. Zahra Tabanfar
### Date: Summer 2024

# Feature Extraction for Seizure Prediction
In this section, we'll focus on extracting relevant features from the preprocessed EEG data. These features will be used for training machine learning models for seizure detection and prediction.

## Importing Necessary Libraries
First, we'll import the libraries required for feature extraction and load the preprocessed EEG recording.

In [None]:
import numpy as np
import pandas as pd
import mne
from scipy.stats import entropy
from scipy.signal import welch

# Load preprocessed data
preprocessed_file_path = 'path_to_preprocessed_data.fif'
preprocessed_data = mne.io.read_raw_fif(preprocessed_file_path, preload=True)

## Feature Extraction Methods
### Time-Domain Features
#### Mean and Standard Deviation
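Calculate the mean and standard deviation of each EEG channel over time. The mean captures any residual DC offset in a channel, and the standard deviation captures how much the amplitude varies around that offset.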

In [None]:
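def extract_time_domain_features(data):
    # data has shape (n_channels, n_samples); axis=1 reduces over time,
    # so each output holds one value per channel
    mean_values = np.mean(data, axis=1)
    std_values = np.std(data, axis=1)
    return mean_values, std_values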

# Extracting time-domain features
mean_features, std_features = extract_time_domain_features(preprocessed_data.get_data())
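
As a quick sanity check of the shapes involved, the same two statistics can be computed on synthetic data. This is only a sketch: the array `demo_data` is a hypothetical stand-in for `preprocessed_data.get_data()`, and the offset and scaling applied to it are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical stand-in for preprocessed_data.get_data():
# 3 channels x 1000 samples of Gaussian noise
rng = np.random.default_rng(0)
demo_data = rng.normal(loc=0.0, scale=1.0, size=(3, 1000))
demo_data[1] += 5.0  # give channel 1 a DC offset
demo_data[2] *= 3.0  # give channel 2 a larger amplitude

# Reduce over the time axis: one value per channel
demo_mean = np.mean(demo_data, axis=1)
demo_std = np.std(demo_data, axis=1)

print(demo_mean.shape, demo_std.shape)
print(demo_mean)
print(demo_std)
```

The offset shows up only in the mean of channel 1, and the scaling shows up only in the standard deviation of channel 2, which is why the two features carry different information.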

### Frequency-Domain Features
#### Power Spectral Density (PSD)
Use Welch's method to estimate the power spectral density (PSD) of each EEG channel. The PSD shows how signal power is distributed across frequency. With `nperseg=256`, the frequency resolution of the estimate is `sfreq / 256` Hz.

In [None]:
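def extract_psd_features(data, sfreq, nperseg=256):
    # Compute one PSD estimate per channel; data has shape (n_channels, n_samples)
    psd_features = []
    for channel_data in data:
        # welch returns (frequencies, PSD); only the PSD values are kept here
        _, psd = welch(channel_data, sfreq, nperseg=nperseg)
        psd_features.append(psd)
    # Result has shape (n_channels, nperseg // 2 + 1)
    return np.array(psd_features)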

# Extracting PSD features
psd_features = extract_psd_features(preprocessed_data.get_data(), preprocessed_data.info['sfreq'])
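
The sketch below checks the Welch estimate on synthetic sine waves, where the expected spectral peaks are known. The sampling rate of 256 Hz and the array `demo` are assumptions made for the example and are not taken from the recording. It also passes the whole 2-D array to `welch` at once, which is equivalent to the per-channel loop because `welch` operates along the last axis by default.

```python
import numpy as np
from scipy.signal import welch

sfreq = 256.0  # assumed sampling rate in Hz (not taken from the real data)
t = np.arange(0, 10, 1 / sfreq)

# Two synthetic channels: a 10 Hz sine and a 20 Hz sine
demo = np.vstack([np.sin(2 * np.pi * 10 * t),
                  np.sin(2 * np.pi * 20 * t)])

# welch returns (frequencies, PSD) and works along the last axis
freqs, psd = welch(demo, fs=sfreq, nperseg=256, axis=-1)

# Frequency resolution is sfreq / nperseg
freq_res = freqs[1] - freqs[0]

# Frequency at which each channel's PSD is largest
peak_freqs = freqs[np.argmax(psd, axis=1)]

print(psd.shape)
print(freq_res)
print(peak_freqs)
```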


#### Band Power
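Extract the power in the standard EEG frequency bands: delta (0.5 to 4 Hz), theta (4 to 8 Hz), alpha (8 to 13 Hz), and beta (13 to 30 Hz). Band power is obtained by integrating the PSD over each band. Setting `relative=True` divides the result by the total power across all frequencies.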

In [None]:
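def bandpower(data, sfreq, band, window_sec=4, relative=False):
    band = np.asarray(band)
    low, high = band

    # welch returns (freqs, psd) in that order;
    # nperseg must be an integer number of samples
    nperseg = int(window_sec * sfreq)
    freqs, psd = welch(data, sfreq, nperseg=nperseg)

    freq_res = freqs[1] - freqs[0]
    # Both band edges are inclusive
    idx_band = np.logical_and(freqs >= low, freqs <= high)

    # Approximate the integral of the PSD over the band (rectangle rule)
    band_power = np.sum(psd[:, idx_band], axis=1) * freq_res

    if relative:
        # Normalize by the total power across all frequencies
        band_power /= np.sum(psd, axis=1) * freq_res

    return band_power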

# Extracting band power for different frequency bands
delta_power = bandpower(preprocessed_data.get_data(), preprocessed_data.info['sfreq'], [0.5, 4])
theta_power = bandpower(preprocessed_data.get_data(), preprocessed_data.info['sfreq'], [4, 8])
alpha_power = bandpower(preprocessed_data.get_data(), preprocessed_data.info['sfreq'], [8, 13])
beta_power = bandpower(preprocessed_data.get_data(), preprocessed_data.info['sfreq'], [13, 30])
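
One way to check a band power routine is to apply it to a signal whose dominant frequency is known. The sketch below is self-contained and uses a hypothetical helper, `bandpower_sketch`, that follows the same steps as described above (Welch PSD, then summation over the band). The 256 Hz sampling rate, the 10 Hz test tone, and the noise level are assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import welch

def bandpower_sketch(data, sfreq, band, window_sec=4, relative=False):
    """Hypothetical helper: band power per channel from a Welch PSD."""
    low, high = band
    nperseg = int(window_sec * sfreq)  # segment length in samples
    freqs, psd = welch(data, fs=sfreq, nperseg=nperseg, axis=-1)
    freq_res = freqs[1] - freqs[0]
    idx = (freqs >= low) & (freqs <= high)
    power = psd[:, idx].sum(axis=1) * freq_res
    if relative:
        power = power / (psd.sum(axis=1) * freq_res)
    return power

sfreq = 256.0  # assumed sampling rate in Hz
t = np.arange(0, 20, 1 / sfreq)
rng = np.random.default_rng(0)

# One synthetic channel: a 10 Hz (alpha-range) sine plus weak noise
demo = (np.sin(2 * np.pi * 10 * t)
        + 0.1 * rng.standard_normal(t.size))[np.newaxis, :]

bands = {'delta': (0.5, 4), 'theta': (4, 8), 'alpha': (8, 13), 'beta': (13, 30)}
rel = {name: bandpower_sketch(demo, sfreq, band, relative=True)[0]
       for name, band in bands.items()}

for name, value in rel.items():
    print(f'{name}: {value:.4f}')
```

Because the test tone lies inside the alpha band, nearly all of the relative power should fall there. Note that the band edges are inclusive on both sides, so a frequency bin exactly at 4, 8, or 13 Hz contributes to two adjacent bands.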

### Non-linear Features
#### Entropy
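Entropy measures the randomness of a distribution and can help distinguish between normal and seizure states. In the implementation here, the absolute amplitudes of each channel are normalized to sum to one and treated as a probability distribution, and the Shannon entropy of that distribution is computed with `scipy.stats.entropy`.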

In [None]:
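def extract_entropy(data):
    entropies = []
    for channel_data in data:
        # scipy normalizes the absolute amplitudes to sum to 1 and returns
        # the Shannon entropy (natural log) of the resulting distribution
        ent = entropy(np.abs(channel_data))
        entropies.append(ent)
    return np.array(entropies)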

# Extracting entropy features
entropy_features = extract_entropy(preprocessed_data.get_data())
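
A common alternative, shown here only as a sketch and not as the method used in this notebook, is to estimate the Shannon entropy from a histogram of the amplitude values. The helper name `histogram_entropy` and the choice of 64 bins are assumptions made for the example.

```python
import numpy as np
from scipy.stats import entropy

def histogram_entropy(data, n_bins=64):
    """Hypothetical helper: Shannon entropy (in bits) of each channel's
    amplitude histogram. data has shape (n_channels, n_samples)."""
    entropies = []
    for channel_data in data:
        counts, _ = np.histogram(channel_data, bins=n_bins)
        # scipy normalizes the counts to probabilities; empty bins contribute 0
        entropies.append(entropy(counts, base=2))
    return np.array(entropies)

rng = np.random.default_rng(0)
t = np.arange(5000) / 256.0

# Channel 0: uniform noise (amplitudes spread evenly over the bins)
# Channel 1: square wave (amplitudes take only two values)
demo = np.vstack([
    rng.uniform(-1.0, 1.0, size=t.size),
    np.where(np.sin(2 * np.pi * 5 * t) >= 0, 1.0, -1.0),
])

demo_entropy = histogram_entropy(demo)
print(demo_entropy)
```

With 64 bins the entropy is bounded above by 6 bits. The uniform noise comes close to that bound, while the two-level square wave cannot exceed 1 bit.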

## Combining Features
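After extracting the individual features, we can combine them into a feature matrix that can be used as input for machine learning models. Each row of the matrix corresponds to one EEG channel, and each column holds one feature computed over the whole recording.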

In [None]:
# Combine all features into a single feature matrix
feature_matrix = np.column_stack((mean_features, std_features, delta_power, theta_power, alpha_power, beta_power, entropy_features))

# Convert to DataFrame for better handling
feature_df = pd.DataFrame(feature_matrix, columns=['Mean', 'Std', 'Delta Power', 'Theta Power', 'Alpha Power', 'Beta Power', 'Entropy'])

# Save the feature matrix to a CSV file
feature_df.to_csv('extracted_features.csv', index=False)
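
Models for seizure detection are usually trained on many short segments rather than on one row per channel. The sketch below shows one possible way to compute the mean and standard deviation over non-overlapping windows. The helper `windowed_time_features`, the 2-second window length, the column naming scheme, and the synthetic input are all assumptions made for illustration and are not part of the pipeline above.

```python
import numpy as np
import pandas as pd

def windowed_time_features(data, sfreq, window_sec=2.0):
    """Hypothetical helper: mean and std per channel in non-overlapping
    windows. Returns a DataFrame with one row per window."""
    n_channels, n_samples = data.shape
    win = int(window_sec * sfreq)        # window length in samples
    n_windows = n_samples // win         # drop any incomplete final window
    trimmed = data[:, :n_windows * win]
    windows = trimmed.reshape(n_channels, n_windows, win)

    mean = windows.mean(axis=2)          # shape (n_channels, n_windows)
    std = windows.std(axis=2)

    rows = []
    for w in range(n_windows):
        row = {'window': w, 'start_sec': w * win / sfreq}
        for ch in range(n_channels):
            row[f'ch{ch}_mean'] = mean[ch, w]
            row[f'ch{ch}_std'] = std[ch, w]
        rows.append(row)
    return pd.DataFrame(rows)

# Synthetic stand-in: 2 channels, 10 seconds at an assumed 256 Hz
rng = np.random.default_rng(0)
demo = rng.standard_normal((2, 2560))
demo_df = windowed_time_features(demo, sfreq=256.0, window_sec=2.0)
print(demo_df.shape)
print(demo_df.head())
```

Each row of the resulting table can then be paired with a label for its time window, which is the layout most classifiers expect.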

## Conclusion
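We extracted time-domain (mean, standard deviation), frequency-domain (PSD, band power), and non-linear (entropy) features from the preprocessed EEG data. The mean, standard deviation, band power, and entropy features were combined into a feature matrix and saved to `extracted_features.csv`. These features can now be used to train machine learning models for seizure detection and prediction.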