# Example EEG Feature Extraction
This notebook demonstrates the feature extraction pipeline used in the ["paper"](https://github.com/Arsu-Lab/Different-Algorithms-Uncover-Different-Patterns-BrainAge-Prediction/blob/main/paper/BIBM_2023.pdf). To run the code, you will need to install the following packages.

In [None]:
!pip install mne
!pip install numpy
# pickle is part of the Python standard library; pandas is imported below
!pip install pandas
!pip install mne_features

In [None]:
import mne
import numpy as np
import pickle
import pandas as pd
from mne_features.feature_extraction import extract_features

### Loading the sample EEG data
First, we load the preprocessed EEG data for the eyes-closed and eyes-open conditions. The preprocessing steps applied are demonstrated in ["example_eeg_preprocessing.ipynb"](https://github.com/Arsu-Lab/Different-Algorithms-Uncover-Different-Patterns-BrainAge-Prediction/blob/main/example_eeg_preprocessing.ipynb).

In [None]:
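def load_pickle_object(filename):
    """Load a pickled object from the file `filename` + ".pickle"."""
    path = filename + ".pickle"
    try:
        with open(path, "rb") as file:
            return pickle.load(file)
    except Exception as error:
        # Report the failing file, then re-raise so later cells do not run on None
        print(f"Error during unpickling object from {path} (possibly unsupported):", error)
        raise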

# Load EEG frequency band data for eyes closed and eyes open conditions
eeg_closed_eyes = load_pickle_object(r"data/example_eeg/example_subj_EC_preproc")
eeg_open_eyes = load_pickle_object(r"data/example_eeg/example_subj_EO_preproc")

### Splitting data into Epochs
Next, we split the recordings into fixed-length epochs using the following function.

In [None]:
def create_epochs(eeg_data, duration, overlap):
    """Create fixed-length epochs from EEG data."""
    epochs = mne.make_fixed_length_epochs(
        eeg_data,
        duration=duration,
        overlap=overlap,
        preload=True,
        verbose=False
    )
    return np.array(epochs.get_data(units='uV'))
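
To make the roles of `duration` and `overlap` concrete, the following NumPy-only sketch cuts a toy array into fixed-length windows. The helper `sliding_epochs` is hypothetical and not part of the notebook or of MNE; it only approximates the windowing idea, and MNE's exact handling of boundaries and rounding may differ.

```python
import numpy as np

def sliding_epochs(data, sfreq, duration, overlap=0.0):
    """Cut (n_channels, n_samples) data into fixed-length windows.

    Returns an array of shape (n_epochs, n_channels, n_samples_per_epoch).
    Trailing samples that do not fill a whole window are dropped.
    """
    if not 0 <= overlap < duration:
        raise ValueError("overlap must satisfy 0 <= overlap < duration")
    window = int(round(duration * sfreq))           # samples per epoch
    step = int(round((duration - overlap) * sfreq)) # samples between epoch starts
    if window < 1 or step < 1:
        raise ValueError("duration and overlap are too small for this sampling rate")
    starts = range(0, data.shape[1] - window + 1, step)
    if len(starts) == 0:
        raise ValueError("recording is shorter than one epoch")
    return np.stack([data[:, s:s + window] for s in starts])

# Toy recording: 2 channels, 10 s at an assumed sampling rate of 100 Hz
toy = np.arange(2 * 1000, dtype=float).reshape(2, 1000)
no_overlap = sliding_epochs(toy, sfreq=100, duration=3.9)
with_overlap = sliding_epochs(toy, sfreq=100, duration=3.9, overlap=1.9)
print(no_overlap.shape, with_overlap.shape)
```

With no overlap, consecutive epochs share no samples; a positive overlap shortens the step between epoch starts and therefore yields more epochs from the same recording.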

### Specifying feature extraction methods and frequency bands

We extract features from specific frequency bands of the EEG data. To do so, we use the [mne-features](https://mne.tools/mne-features/) library, which offers a variety of built-in feature extraction methods, each analyzing a different aspect of the EEG signal. Methods in 'whole_spectrum_only' implement their own band-pass filtering and are therefore applied only to the whole frequency spectrum of interest (0.5-30 Hz).

In [None]:
# Define feature extraction configurations
feature_extraction_config = {
    'whole_spectrum_only': {
        'pow_freq_bands',
        'wavelet_coef_energy',
    },
    'methods': {
        'std', 'spect_slope', 'svd_fisher_info', 'hjorth_complexity',
        'hjorth_complexity_spect', 'ptp_amp', 'quantile', 'line_length',
        'zero_crossings', 'skewness', 'kurtosis', 'higuchi_fd',
        'samp_entropy', 'app_entropy', 'spect_entropy', 'mean', 'hurst_exp'
    }
}

# Define frequency bands of interest
freq_bands_interest = {
    'delta': [0.5, 4],
    'theta': [4, 8],
    'alpha': [8, 14],
    'beta': [14, 30],
    'whole_spec': [0.5, 30],
}
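
Because the band names are later used as lookup keys (for example when setting the frequency range of 'spect_slope'), it can help to check them before running the extraction. The sketch below uses a hypothetical helper, `check_band_keys`, and assumes the loaded data are dictionaries keyed by band name; the `None` values are placeholders for the recordings.

```python
def check_band_keys(eeg_by_band, freq_bands):
    """Validate band definitions and make sure every data key has one."""
    for name, (low, high) in freq_bands.items():
        if not 0 <= low < high:
            raise ValueError(f"Band {name!r} has an invalid range: [{low}, {high}]")
    missing = sorted(set(eeg_by_band) - set(freq_bands))
    if missing:
        raise KeyError(f"No frequency range defined for: {missing}")
    return True

# Example band definitions (same layout as the dictionary above)
example_bands = {
    'delta': [0.5, 4],
    'theta': [4, 8],
    'whole_spec': [0.5, 30],
}

# Placeholder data dictionary; real values would be the band-filtered recordings
example_data = {'delta': None, 'whole_spec': None}
print(check_band_keys(example_data, example_bands))
```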

### Extracting Features
The following function iterates over the selected methods and sets the parameters that some of them require: quantile values for the 'quantile' method, frequency band settings for 'pow_freq_bands' and 'energy_freq_bands', and the frequency range for 'spect_slope'. Methods without an entry use their default parameters.

In [None]:
def get_parameters_for_methods(methods, freq_bands, band_key):
    """
    Create parameters for the feature extraction methods.

    :param methods: Set[str] - A set of method names indicating which feature extraction methods to use.
    :param freq_bands: Dict[str, List[float]] - A dictionary where keys are band names (e.g., 'delta', 'theta')
        and values are lists of two floats representing the frequency band range.
    :param band_key: str - The key from the freq_bands dictionary indicating the specific frequency band to process.

    :return: Dict[str, Any] - A dictionary of parameters configured for the feature extraction methods.
    """

    # Initialize a dictionary to hold the parameters
    parameters = dict()

    # Iterate over each method in the methods set
    for method in methods:
        # Configure parameters for the 'quantile' method
        if method == 'quantile':
            parameters['quantile__q'] = np.array([0.05, 0.25, 0.75, 0.95])

        # Configure parameters for the 'pow_freq_bands' method
        elif method == 'pow_freq_bands':
            parameters['pow_freq_bands__freq_bands'] = freq_bands
            parameters['pow_freq_bands__log'] = True
            parameters['pow_freq_bands__ratios_triu'] = False
            parameters['pow_freq_bands__ratios'] = None

        # Configure parameters for the 'energy_freq_bands' method
        elif method == 'energy_freq_bands':
            parameters['energy_freq_bands__freq_bands'] = freq_bands

        # Configure parameters for the 'spect_slope' method
        elif method == 'spect_slope':
            parameters['spect_slope__fmin'] = np.array([freq_bands[band_key][0]])
            parameters['spect_slope__fmax'] = np.array([freq_bands[band_key][1]])
            parameters['spect_slope__with_intercept'] = True

    # Return the configured parameters
    return parameters
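
The parameter keys follow the `<function>__<parameter>` naming convention that mne-features uses for `funcs_params`. As a small inspection aid, the sketch below regroups such a flat dictionary by function name. The helper `group_params_by_function` is hypothetical, and the example values are illustrative only.

```python
from collections import defaultdict

def group_params_by_function(flat_params):
    """Group '<function>__<parameter>' keys into {function: {parameter: value}}."""
    grouped = defaultdict(dict)
    for key, value in flat_params.items():
        func_name, sep, param_name = key.partition("__")
        if not sep or not param_name:
            raise ValueError(f"Expected '<function>__<parameter>', got {key!r}")
        grouped[func_name][param_name] = value
    return dict(grouped)

# Illustrative parameters, as they might look for the delta band
example_params = {
    "quantile__q": [0.05, 0.25, 0.75, 0.95],
    "spect_slope__fmin": 0.5,
    "spect_slope__fmax": 4,
    "spect_slope__with_intercept": True,
}
grouped = group_params_by_function(example_params)
for func_name, params in grouped.items():
    print(func_name, params)
```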

The next function processes each frequency band in turn. It first splits the band's EEG data into epochs. For the 'whole_spec' band, it combines the whole-spectrum-only methods with the general ones. It then calls get_parameters_for_methods to obtain the parameters for the selected methods and finally passes everything to the extract_features function from the mne_features library.

In [None]:
def extract_features_from_eeg(eeg_data_dict, epoch_duration, epoch_overlap, feature_set=feature_extraction_config, freq_bands=freq_bands_interest):
    """
    Extract EEG features for each frequency band.

    :param eeg_data_dict: Dict[str, mne.io.Raw] - A dictionary with keys as frequency band names and values as mne.io.Raw objects.
    :param epoch_duration: float - Duration of each epoch in seconds.
    :param epoch_overlap: float - Overlap between consecutive epochs in seconds.
    :param feature_set: Dict[str, Set[str]] - A dictionary containing sets of feature extraction methods.
        The key 'whole_spectrum_only' contains methods used for the whole spectrum.
        The key 'methods' contains methods for all bands.
    :param freq_bands: Dict[str, List[float]] - A dictionary mapping frequency band names to their respective frequency ranges.

    :return: Dict[str, pd.DataFrame] - A dictionary where each key is a frequency band and each value is a DataFrame of extracted features.
    """

    # Extract methods for whole spectrum analysis and general methods from the feature_set
    whole_spec_methods = feature_set['whole_spectrum_only']
    extraction_methods = feature_set['methods']
    
    # Dictionary to store extracted features for each frequency band
    extracted_features = dict()

    # Iterate over each frequency band and its corresponding EEG data
    for band_key, eeg_data in eeg_data_dict.items():
        print(f"Extracting features in frequency band {band_key}")
        
        # Create epochs from the EEG data
        epochs = create_epochs(eeg_data.copy(), epoch_duration, epoch_overlap)

        # Select the methods for this band without modifying the shared set,
        # so whole-spectrum-only methods do not leak into bands processed later
        band_methods = set(extraction_methods)
        if band_key == 'whole_spec':
            band_methods |= set(whole_spec_methods)
        # Sort for a reproducible method order across runs
        band_methods = sorted(band_methods)
        
        # Get the parameters for the selected feature extraction methods
        parameters = get_parameters_for_methods(band_methods, freq_bands, band_key)
        
        # Extract features using the mne_features library
        features = extract_features(
            X=epochs,  # EEG epochs
            sfreq=eeg_data.info['sfreq'],  # Sampling frequency
            selected_funcs=band_methods,  # Methods for feature extraction
            funcs_params=parameters,  # Parameters for the methods
            n_jobs=5,  # Number of jobs for parallel processing
            ch_names=eeg_data.info['ch_names'],  # Channel names
            return_as_df=True  # Return the result as a pandas DataFrame
        )

        # Store the extracted features in the dictionary
        extracted_features[band_key] = features

    # Return the dictionary containing features for each frequency band
    return extracted_features
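
The per-band method selection inside the loop can also be expressed as a small pure function. This is a minimal sketch with a hypothetical helper, `methods_for_band`, that is not part of the notebook; it builds a fresh list for every band, so the configuration sets are never modified and the result does not depend on the order in which bands are processed.

```python
def methods_for_band(band_key, feature_set, whole_spectrum_key="whole_spec"):
    """Return a sorted list of method names to run for one frequency band."""
    methods = set(feature_set["methods"])  # copy, so the config stays untouched
    if band_key == whole_spectrum_key:
        methods |= set(feature_set["whole_spectrum_only"])
    return sorted(methods)

# Reduced configuration for illustration
toy_config = {
    "whole_spectrum_only": {"pow_freq_bands"},
    "methods": {"std", "mean"},
}

# Processing 'whole_spec' first does not affect the bands that follow
for band in ["whole_spec", "delta"]:
    print(band, methods_for_band(band, toy_config))
```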

### Computing the Features
The cell below runs extract_features_from_eeg for both conditions, eyes closed (EC) and eyes open (EO), using an epoch duration of 3.9 s for EC and 1.9 s for EO, with no overlap. The resulting per-band feature tables are renamed so that every column carries its state and frequency band, concatenated column-wise into one DataFrame, and averaged across epochs. The result is a single feature vector for the subject that covers both conditions and all frequency bands.

In [None]:
# Extract features for eyes closed and eyes open conditions
print("Extracting features from eyes closed EEG")
features_closed_eyes = extract_features_from_eeg(
    eeg_closed_eyes, 
    epoch_duration=3.9, epoch_overlap=0, 
    feature_set=feature_extraction_config, 
    freq_bands=freq_bands_interest
)
print("\nExtracting features from eyes open EEG")
features_open_eyes = extract_features_from_eeg(
    eeg_open_eyes, 
    epoch_duration=1.9, epoch_overlap=0, 
    feature_set=feature_extraction_config, 
    freq_bands=freq_bands_interest
)

eeg_features = {
    'EC': features_closed_eyes,
    'EO': features_open_eyes
}

print("\nAveraging features across epochs\n")

# Initialize an empty list to store the feature data
subject_features = []

# Iterate over each state ('EC' for eyes closed, 'EO' for eyes open) and its corresponding features
for state, features in eeg_features.items():
    # Iterate over each frequency band within the state and its corresponding feature data
    for band, feature_data in features.items():
        # Rename the columns of the feature data to include the state and frequency band
        feature_cols = [f"{state}_{band}_{col}" for col in feature_data.columns.values]
        feature_data.columns = feature_cols

        # Append the renamed feature data to the list
        subject_features.append(feature_data)

# Concatenate all feature dataframes along the columns to create a single dataframe
subject_features = pd.concat(subject_features, axis=1)

# Compute the mean of the features across all epochs
subject_features = subject_features.mean()

print("--- Feature Extraction Finished ---\n")
print(subject_features)
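
The renaming, concatenation, and averaging steps can be tried on toy data without any EEG recordings. The sketch below assumes simple string column names that are made up for illustration; the column labels returned by mne_features may look different. It also shows what happens when the two conditions have a different number of epochs.

```python
import pandas as pd

# Toy per-epoch feature tables; values and column names are illustrative only
toy_features = {
    "EC": {"delta": pd.DataFrame({"std_Fz": [1.0, 3.0]})},            # 2 epochs
    "EO": {"delta": pd.DataFrame({"std_Fz": [2.0, 4.0, 6.0, 8.0]})},  # 4 epochs
}

renamed = []
for state, bands in toy_features.items():
    for band, table in bands.items():
        # add_prefix returns a renamed copy and leaves the input table untouched
        renamed.append(table.add_prefix(f"{state}_{band}_"))

# Rows are aligned on the epoch index; shorter tables are padded with NaN
wide = pd.concat(renamed, axis=1)

# The mean skips NaN by default, so each column is averaged over its own epochs
subject_vector = wide.mean()
print(subject_vector)
```

Using `add_prefix` instead of assigning to `columns` keeps the original tables unchanged, so re-running the cell does not stack prefixes.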