# Explanation of Pipelines for Mental Imagery BCI

Below is a detailed explanation of each of the pipelines mentioned. For each pipeline, I'll describe:

- **Preprocessing Steps**: Any data preprocessing or transformations applied before feature extraction.
- **Feature Extraction**: Methods used to extract features from the EEG data.
- **Classifier**: The machine learning algorithm used for classification.
- **Notes**: Any additional information or considerations.

---

## 1. Common Spatial Patterns (CSP) + Logistic Regression (`CSP_LogReg`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Common Spatial Patterns (CSP)**:
  - **Purpose**: CSP projects multi-channel EEG onto a small set of spatial filters whose output variance is high for one class and low for the other.
  - **How It Works**: The filters are obtained by jointly diagonalizing the two class-average covariance matrices (a generalized eigenvalue problem); the filters at either end of the eigenvalue spectrum discriminate best between the classes.
  - **Output**: CSP transforms the EEG data into a set of features (usually the log-variance of the filtered signals), resulting in a feature matrix suitable for classification.
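The CSP steps above can be sketched with NumPy and SciPy. This is a minimal illustration, not the implementation used by the notebook's pipeline; the function names `fit_csp` and `csp_features`, the synthetic data, and the choice of four components are assumptions made for the example.

```python
import numpy as np
from scipy.linalg import eigh

def fit_csp(X, y, n_components=4):
    """X: (n_trials, n_channels, n_times); y: binary labels (0/1)."""
    covs = []
    for label in (0, 1):
        trials = X[y == label]
        # Average trace-normalised spatial covariance of this class
        covs.append(np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0))
    # Generalised eigenproblem: C0 w = lambda * (C0 + C1) w
    eigvals, eigvecs = eigh(covs[0], covs[0] + covs[1])
    order = np.argsort(eigvals)
    half = n_components // 2
    # Filters at both ends of the spectrum are the most discriminative
    picks = np.concatenate([order[:half], order[-half:]])
    return eigvecs[:, picks].T  # (n_components, n_channels)

def csp_features(X, filters):
    projected = np.einsum("fc,nct->nft", filters, X)
    var = projected.var(axis=2)
    # Normalised log-variance per spatial filter
    return np.log(var / var.sum(axis=1, keepdims=True))

# Synthetic example: class 1 has extra variance on channel 0
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 8, 256))
y = np.repeat([0, 1], 20)
X[y == 1, 0] *= 3.0
W = fit_csp(X, y)
F = csp_features(X, W)  # feature matrix of shape (n_trials, n_components)
```

The resulting matrix `F` has one row per trial and can be passed directly to a linear classifier such as logistic regression.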

### Classifier

- **Logistic Regression**:
  - A linear classifier that models the probability of class membership using the logistic function.
  - **Advantages**: Simple, interpretable, and performs well with linearly separable data.

### Notes

- **CSP Assumptions**: Works best when data is bandpass filtered to relevant frequency bands (e.g., mu and beta rhythms).
- **Binary Classification**: CSP is defined for two classes; multi-class tasks need an extension such as one-vs-rest CSP.

---

## 2. Power Spectral Density (PSD) + Support Vector Machine (SVM) (`PSD_SVM`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Power Spectral Density (PSD)**:
  - **Purpose**: PSD measures the power of the EEG signal at different frequency components.
  - **How It Works**: Using methods like Welch's method or multitaper, the EEG data is transformed into the frequency domain, and the power at each frequency bin is calculated.
  - **Output**: A feature vector representing the power distribution across frequencies for each channel.
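As a rough sketch of the PSD step, Welch's method from SciPy can be applied per channel and the in-band power flattened into one feature vector per trial. The helper name `psd_features`, the 8-30 Hz band limits, and the synthetic signal are assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch

def psd_features(X, sfreq, fmin=8.0, fmax=30.0):
    """X: (n_trials, n_channels, n_times). Returns in-band freqs and features."""
    freqs, psd = welch(X, fs=sfreq, nperseg=min(256, X.shape[-1]), axis=-1)
    band = (freqs >= fmin) & (freqs <= fmax)
    # Flatten (channels, in-band bins) into one feature vector per trial
    return freqs[band], psd[..., band].reshape(len(X), -1)

# Synthetic example: a 12 Hz oscillation on channel 0 plus weak noise
sfreq = 250.0
t = np.arange(0, 2.0, 1.0 / sfreq)
rng = np.random.default_rng(0)
X = 0.1 * rng.standard_normal((10, 3, t.size))
X[:, 0] += np.sin(2 * np.pi * 12.0 * t)
freqs, F = psd_features(X, sfreq)
```

Features are ordered channel by channel, so the first `freqs.size` columns of `F` belong to channel 0.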

### Classifier

- **Support Vector Machine (SVM)**:
  - A supervised learning algorithm that finds the optimal hyperplane to separate classes.
  - **Kernel Trick**: Can use different kernel functions (linear, polynomial, RBF) to handle non-linearly separable data.

### Notes

- **Feature Dimensionality**: PSD features can be high-dimensional; dimensionality reduction or feature selection may be beneficial.
- **SVM Advantages**: Effective in high-dimensional spaces and with clear margin separation.

---

## 3. Time Domain Features + Random Forest (`TimeDomain_RF`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Time-Domain Features**:
  - **Mean**: Average value of the EEG signal over time.
  - **Variance**: Measure of signal variability.
  - **Skewness**: Measure of asymmetry in the signal distribution.
  - **How It Works**: These statistical features are computed across the time dimension for each channel.
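A minimal NumPy sketch of these three statistics, computed along the time axis for every channel, might look like the following. The function name `time_domain_features` and the random input are assumptions for the example.

```python
import numpy as np

def time_domain_features(X):
    """X: (n_trials, n_channels, n_times) -> (n_trials, 3 * n_channels)."""
    mean = X.mean(axis=-1)
    var = X.var(axis=-1)
    centered = X - mean[..., None]
    # Skewness: third central moment divided by the standard deviation cubed
    skew = (centered ** 3).mean(axis=-1) / np.power(var, 1.5)
    return np.concatenate([mean, var, skew], axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4, 100))
F = time_domain_features(X)  # columns: 4 means, 4 variances, 4 skewness values
```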

### Classifier

- **Random Forest**:
  - An ensemble method using multiple decision trees.
  - **How It Works**: Each tree is trained on a bootstrap sample with random feature selection; the final prediction is made by aggregating the outputs of all trees (e.g., majority vote).

### Notes

- **Non-linear Relationships**: Random forests can capture non-linear relationships between features and target classes.
- **Feature Importance**: Random forests provide measures of feature importance, useful for feature selection.

---

## 4. Hilbert Transform + k-Nearest Neighbors (k-NN) (`Hilbert_KNN`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Hilbert Transform**:
  - **Purpose**: To obtain the analytic signal from a real-valued signal, allowing extraction of instantaneous amplitude and phase.
  - **How It Works**: The Hilbert transform is applied to the EEG signal to compute the amplitude envelope or phase information.
  - **Output**: Amplitude envelope of the EEG signal for each channel.
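The envelope extraction can be sketched with `scipy.signal.hilbert`, which returns the analytic signal. Averaging the envelope over time to get one value per channel is an assumption made here to obtain a compact feature vector; the pipeline itself may summarise the envelope differently.

```python
import numpy as np
from scipy.signal import hilbert

def envelope_features(X):
    """X: (n_trials, n_channels, n_times) -> mean envelope per channel."""
    analytic = hilbert(X, axis=-1)   # real signal + j * Hilbert transform
    envelope = np.abs(analytic)      # instantaneous amplitude
    return envelope.mean(axis=-1)

# A 10 Hz sinusoid of amplitude 2 has a constant envelope of 2
sfreq = 250.0
t = np.arange(500) / sfreq
X = 2.0 * np.sin(2 * np.pi * 10.0 * t)[None, None, :]
F = envelope_features(X)
```

The instantaneous phase, if needed, is available as `np.angle(analytic)`.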

### Classifier

- **k-Nearest Neighbors (k-NN)**:
  - A non-parametric classifier that assigns a class based on the majority class among the k nearest neighbors in the feature space.
  - **Distance Metrics**: Commonly uses Euclidean distance, but other metrics can be applied.

### Notes

- **Parameter Tuning**: The choice of `k` and the distance metric can significantly impact performance.
- **Computational Cost**: k-NN can be computationally expensive for large datasets.

---

## 5. Wavelet Transform + PSD + Naive Bayes (`Wavelet_PSD_NB`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Wavelet Transform**:
  - **Purpose**: Decomposes the EEG signal into time-frequency components, with fine time resolution at high frequencies and fine frequency resolution at low frequencies.
  - **How It Works**: Applies wavelet decomposition (e.g., using Daubechies wavelets) to capture signal characteristics at various scales.
  - **Output**: Wavelet coefficients representing the signal at different scales and positions.
- **Power Spectral Density (PSD)**:
  - Calculated on the wavelet coefficients to obtain power distribution across scales.
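To illustrate the idea of power per wavelet scale, here is a hand-written Haar decomposition (Haar is the simplest Daubechies wavelet, db1). This is a sketch under that assumption; the function name `haar_dwt_energies` is hypothetical, and the actual pipeline may use a different wavelet and a library implementation.

```python
import numpy as np

def haar_dwt_energies(signal, levels=3):
    """Mean squared coefficient per scale: `levels` detail bands + 1 approximation."""
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        if approx.size % 2:            # drop the last sample if the length is odd
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)   # high-pass branch
        approx = (even + odd) / np.sqrt(2.0)   # low-pass branch, decomposed further
        energies.append(np.mean(detail ** 2))
    energies.append(np.mean(approx ** 2))
    return np.array(energies)

rng = np.random.default_rng(0)
channel = rng.standard_normal(256)
features = haar_dwt_energies(channel, levels=3)  # 4 values for this channel
```

Applying the function to every channel of a trial and concatenating the results gives one feature vector per trial.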

### Classifier

- **Naive Bayes**:
  - A probabilistic classifier based on Bayes' theorem, assuming feature independence.
  - **Advantages**: Simple, fast, and works well with high-dimensional data.

### Notes

- **Feature Independence Assumption**: Naive Bayes assumes features are independent, which may not hold for EEG data.
- **Wavelet Selection**: The choice of wavelet function and decomposition level can affect feature quality.

---

## 6. Common Average Referencing (CAR) + CSP + Decision Tree (`CAR_CSP_DT`)

### Preprocessing Steps

- **Common Average Referencing (CAR)**:
  - **Purpose**: To reduce common noise across all channels.
  - **How It Works**: Subtracts the average signal across all channels from each channel's signal.
  - **Output**: Referenced EEG data with reduced artifacts.
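CAR amounts to a single subtraction along the channel axis. The sketch below assumes trials are stored as `(n_trials, n_channels, n_times)`; the function name is illustrative.

```python
import numpy as np

def common_average_reference(X):
    """Subtract the across-channel mean from every channel at each time point."""
    return X - X.mean(axis=-2, keepdims=True)

# Noise that is identical on all channels is removed completely
rng = np.random.default_rng(0)
clean = rng.standard_normal((3, 5, 50))
common_noise = rng.standard_normal((3, 1, 50))
X_car = common_average_reference(clean + common_noise)
```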

### Feature Extraction

- **Common Spatial Patterns (CSP)**:
  - As described in Pipeline 1.

### Classifier

- **Decision Tree**:
  - A tree-based classifier that splits data based on feature thresholds to maximize class separation.
  - **Advantages**: Simple to interpret, handles both numerical and categorical data.

### Notes

- **Overfitting Risk**: Decision trees can overfit; pruning or setting maximum depth can mitigate this.
- **CAR Benefits**: CAR can enhance signal-to-noise ratio, benefiting CSP feature extraction.

---

## 7. Independent Component Analysis (ICA) + Time Domain Features + SVM (`ICA_TimeDomain_SVM`)

### Preprocessing Steps

- **Independent Component Analysis (ICA)**:
  - **Purpose**: To separate mixed signals into statistically independent components.
  - **How It Works**: Decomposes EEG signals into independent sources, which can help isolate artifacts (e.g., eye blinks, muscle movements).
  - **Output**: Cleaned EEG data or components representing neural activity.

### Feature Extraction

- **Time-Domain Features**:
  - As described in Pipeline 3.

### Classifier

- **Support Vector Machine (SVM)**:
  - As described in Pipeline 2.

### Notes

- **Artifact Removal**: ICA can effectively remove artifacts when components corresponding to noise are identified and excluded.
- **Complexity**: ICA requires careful application to avoid removing neural signals of interest.

---

## 8. CSP + Linear Discriminant Analysis (LDA) (`CSP_LDA`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Common Spatial Patterns (CSP)**:
  - As described in Pipeline 1.

### Classifier

- **Linear Discriminant Analysis (LDA)**:
  - A linear classifier that projects data onto a line to maximize class separation.
  - **How It Works**: Finds a linear combination of features that best separates the classes.
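For two classes, the LDA projection has a closed form: the weight vector is the inverse pooled covariance applied to the difference of class means. The sketch below assumes equal class priors; `fit_lda` and `predict_lda` are hypothetical helper names, not the library estimator used in the pipeline.

```python
import numpy as np

def fit_lda(F, y):
    """F: (n_trials, n_features); y: binary labels (0/1)."""
    mu0, mu1 = F[y == 0].mean(axis=0), F[y == 1].mean(axis=0)
    # Pooled within-class covariance, shared by both classes
    centered = np.vstack([F[y == 0] - mu0, F[y == 1] - mu1])
    cov = centered.T @ centered / (len(F) - 2)
    w = np.linalg.solve(cov, mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1)  # threshold halfway between the class means
    return w, b

def predict_lda(F, w, b):
    return (F @ w + b > 0).astype(int)

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
F = rng.standard_normal((200, 2))
F[y == 1] += 3.0
w, b = fit_lda(F, y)
accuracy = np.mean(predict_lda(F, w, b) == y)
```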

### Notes

- **Popular Combination**: CSP + LDA is a classic approach in motor imagery BCI applications due to its simplicity and effectiveness.
- **Assumptions**: LDA assumes normally distributed features with equal covariance matrices for each class.

---

## 9. PSD + Gradient Boosting Classifier (`PSD_GB`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Power Spectral Density (PSD)**:
  - As described in Pipeline 2.

### Classifier

- **Gradient Boosting Classifier**:
  - An ensemble method that builds sequential weak learners (typically decision trees), where each new learner focuses on correcting the errors of previous ones.
  - **Advantages**: High performance, can model complex relationships.

### Notes

- **Hyperparameter Tuning**: Requires careful tuning of parameters like learning rate, number of estimators, and tree depth.
- **Overfitting Risk**: Can overfit if not properly regularized.

---

## 10. CAR + Riemannian Geometry Features + Logistic Regression (`CAR_Riemann_LogReg`)

### Preprocessing Steps

- **Common Average Referencing (CAR)**:
  - As described in Pipeline 6.

### Feature Extraction

- **Riemannian Geometry Features**:
  - **Covariance Matrices**: Compute the covariance matrix of EEG signals for each trial.
  - **Tangent Space Mapping**: Maps covariance matrices to the tangent space of the Riemannian manifold to obtain feature vectors.
  - **Purpose**: Captures the spatial covariance structure of EEG signals, which is informative for classification.
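The two steps above can be sketched in plain NumPy. Note one simplification, stated as an assumption: the reference point is the arithmetic mean of the covariance matrices, whereas dedicated libraries typically use the Riemannian mean. The names `tangent_space` and `_sym_fn` are hypothetical.

```python
import numpy as np

def _sym_fn(C, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(C)
    return (vecs * fn(vals)) @ vecs.T

def tangent_space(X):
    """X: (n_trials, n_channels, n_times) -> (n_trials, n_ch * (n_ch + 1) / 2)."""
    covs = np.array([t @ t.T / t.shape[1] for t in X])
    ref = covs.mean(axis=0)  # assumption: arithmetic mean as reference point
    ref_isqrt = _sym_fn(ref, lambda v: 1.0 / np.sqrt(v))
    rows, cols = np.triu_indices(covs.shape[1])
    # Off-diagonal terms are weighted by sqrt(2) to preserve the matrix norm
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    feats = []
    for C in covs:
        S = _sym_fn(ref_isqrt @ C @ ref_isqrt, np.log)  # matrix logarithm
        feats.append(weights * S[rows, cols])
    return np.array(feats)

rng = np.random.default_rng(0)
X = rng.standard_normal((12, 4, 200))
F = tangent_space(X)  # 4 channels -> 10 features per trial
```

The matrix logarithm requires positive-definite covariance matrices, so each trial needs more time samples than channels and the data must be full rank.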

### Classifier

- **Logistic Regression**:
  - As described in Pipeline 1.

### Notes

- **Riemannian Methods**: Effective for BCI applications due to robustness to noise and ability to capture complex signal structures.
- **Computational Cost**: Calculating covariance matrices and tangent space mapping can be computationally intensive.

---

## 11. Short-Time Fourier Transform (STFT) + SVM (`STFT_SVM`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Short-Time Fourier Transform (STFT)**:
  - **Purpose**: Analyzes the signal's frequency content over time by applying the Fourier Transform to short overlapping time windows.
  - **How It Works**: Divides the EEG signal into segments, applies windowing, and computes the Fourier Transform for each segment.
  - **Output**: Time-frequency representation of the signal.
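A short sketch of the STFT step using `scipy.signal.stft`. The window length of 128 samples, the 50% overlap, and the 8-30 Hz band selection are assumptions chosen for the example.

```python
import numpy as np
from scipy.signal import stft

sfreq = 250.0
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3, 500))  # (n_trials, n_channels, n_times)

# Z has shape (n_trials, n_channels, n_freqs, n_segments)
freqs, times, Z = stft(X, fs=sfreq, nperseg=128, noverlap=64, axis=-1)
power = np.abs(Z) ** 2

# Keep only the band of interest, then flatten into one vector per trial
band = (freqs >= 8) & (freqs <= 30)
F = power[:, :, band, :].reshape(len(X), -1)
```

Even this small example yields a wide feature matrix, which is why dimensionality reduction is usually needed before the classifier.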

### Classifier

- **Support Vector Machine (SVM)**:
  - As described in Pipeline 2.

### Notes

- **Feature Dimensionality**: STFT features can be very high-dimensional; dimensionality reduction may be necessary.
- **Time-Frequency Analysis**: Useful for capturing non-stationary properties of EEG signals.

---

## 12. Laplacian Spatial Filtering + Wavelet Packet Decomposition (WPD) + Random Forest (`Laplacian_WPD_RF`)

### Preprocessing Steps

- **Laplacian Spatial Filtering**:
  - **Purpose**: Enhances spatial resolution by emphasizing local activity and reducing volume conduction effects.
  - **How It Works**: Computes the difference between a channel and the average of its neighboring channels.
  - **Output**: Spatially filtered EEG data.
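The neighbor-average subtraction can be sketched as follows. The `neighbors` mapping is a hypothetical montage (channel 0 surrounded by channels 1 to 4); a real pipeline would derive it from the electrode layout.

```python
import numpy as np

def laplacian_filter(X, neighbors):
    """X: (n_trials, n_channels, n_times); neighbors: {channel: [neighbor indices]}."""
    out = X.copy()
    for ch, nbrs in neighbors.items():
        # Channel minus the mean of its surrounding channels
        out[:, ch] = X[:, ch] - X[:, nbrs].mean(axis=1)
    return out

# Hypothetical montage: channel 0 is surrounded by channels 1-4
neighbors = {0: [1, 2, 3, 4]}
X = np.ones((2, 5, 10))
X_lap = laplacian_filter(X, neighbors)  # activity shared with neighbors cancels
```

Channels without an entry in `neighbors` are passed through unchanged.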

### Feature Extraction

- **Wavelet Packet Decomposition (WPD)**:
  - **Purpose**: Decomposes the signal into a set of frequency subbands with both time and frequency localization.
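  - **How It Works**: Applies wavelet packet analysis to the EEG data. Unlike the standard discrete wavelet transform, which splits only the approximation branch at each level, WPD also splits the detail branch, giving evenly spaced frequency subbands.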
  - **Output**: Coefficients representing the signal's energy at various scales and positions.

### Classifier

- **Random Forest**:
  - As described in Pipeline 3.

### Notes

- **Information Preservation**: WPD can capture subtle signal features that may be relevant for classification.
- **Computational Complexity**: WPD can be computationally intensive due to the extensive decomposition.

---

## 13. Morlet Wavelets + k-NN (`Morlet_KNN`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Time-Frequency Analysis using Morlet Wavelets**:
  - **Purpose**: Captures time-frequency characteristics of EEG signals with high resolution.
  - **How It Works**: Convolves the signal with Morlet wavelets (complex exponentials windowed by a Gaussian) at each frequency of interest; the squared magnitude of the result gives the power over time.
  - **Output**: Power spectra across frequencies and time.
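A minimal sketch of Morlet time-frequency power for a single channel, written directly in NumPy. The function name `morlet_power`, the choice of five cycles per wavelet, and the test frequencies are assumptions for illustration.

```python
import numpy as np

def morlet_power(signal, sfreq, freqs, n_cycles=5.0):
    """Return power of shape (n_freqs, n_times) for a 1-D signal."""
    power = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)  # Gaussian width in seconds
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / sfreq)
        # Complex exponential windowed by a Gaussian
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t ** 2 / (2 * sigma_t ** 2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit energy
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

sfreq = 250.0
t = np.arange(500) / sfreq
signal = np.sin(2 * np.pi * 12.0 * t)
freqs = np.array([8.0, 12.0, 20.0, 30.0])
power = morlet_power(signal, sfreq, freqs)
```

The signal must be longer than the longest wavelet (the one at the lowest frequency) for `mode="same"` to return an array of the signal's length.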

### Classifier

- **k-Nearest Neighbors (k-NN)**:
  - As described in Pipeline 4.

### Notes

- **Frequency Selection**: Frequencies of interest (e.g., 8-30 Hz) are analyzed.
- **Data Dimensionality**: The resulting features may be high-dimensional, affecting computational cost.

---

## 14. Autoregressive (AR) Coefficients + Logistic Regression (`AR_LogReg`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Autoregressive (AR) Coefficients**:
  - **Purpose**: Models the EEG signal as a linear function of its previous values.
  - **How It Works**: Fits an AR model to the EEG time series, extracting the coefficients as features.
  - **Output**: AR coefficients for each channel.
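One simple way to estimate AR coefficients is an ordinary least-squares fit of each sample on its predecessors. This is a sketch of that approach (other estimators such as Yule-Walker or Burg exist); `ar_coefficients` is a hypothetical helper name.

```python
import numpy as np

def ar_coefficients(signal, order=6):
    """Least-squares fit of x[t] = sum_k a[k] * x[t - k - 1] + noise."""
    x = np.asarray(signal, dtype=float)
    # Column k holds the signal delayed by k + 1 samples
    lagged = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coeffs

# Simulate a known AR(2) process and recover its coefficients
rng = np.random.default_rng(0)
x = np.zeros(5000)
noise = rng.standard_normal(5000)
for t in range(2, 5000):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + noise[t]
coeffs = ar_coefficients(x, order=2)  # close to [0.6, -0.3]
```

For EEG, the function would be applied to each channel of a trial and the coefficients concatenated into one feature vector.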

### Classifier

- **Logistic Regression**:
  - As described in Pipeline 1.

### Notes

- **Model Order**: The choice of AR model order (number of lags) affects feature quality.
- **Stationarity Assumption**: AR models assume signal stationarity within the analysis window.

---

## 15. Mean Amplitude Features + Naive Bayes (`MeanAmp_NB`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Mean Amplitude**:
  - **Purpose**: Simple feature representing the average absolute value of the EEG signal.
  - **How It Works**: Computes the mean of the absolute values of the EEG signal over time for each channel.
  - **Output**: Feature vector of mean amplitudes.

### Classifier

- **Naive Bayes**:
  - As described in Pipeline 5.

### Notes

- **Simplicity**: This approach uses straightforward features and a simple classifier.
- **Performance**: May not capture complex signal characteristics, potentially limiting classification accuracy.

---

## 16. ICA + PSD + SVM (`ICA_PSD_SVM`)

### Preprocessing Steps

- **Independent Component Analysis (ICA)**:
  - As described in Pipeline 7.

### Feature Extraction

- **Power Spectral Density (PSD)**:
  - Calculated on the ICA components to analyze the frequency content of the independent sources.
  - **Output**: PSD features derived from the independent components.

### Classifier

- **Support Vector Machine (SVM)**:
  - As described in Pipeline 2.

### Notes

- **Artifact Removal**: ICA can help isolate neural signals from artifacts before feature extraction.
- **Component Selection**: Choosing relevant components is crucial; including artifact components can degrade performance.

---

## 17. Fast Fourier Transform (FFT) + Random Forest (`FFT_RF`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Fast Fourier Transform (FFT)**:
  - **Purpose**: Transforms the time-domain EEG signal into the frequency domain.
  - **How It Works**: Computes the discrete Fourier Transform efficiently to obtain frequency coefficients.
  - **Output**: Magnitude of the FFT coefficients for each channel.
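The FFT magnitude features can be sketched with NumPy's real-input FFT. The helper name `fft_magnitude` and the 250 Hz sampling rate are assumptions for the example.

```python
import numpy as np

def fft_magnitude(X, sfreq):
    """X: (n_trials, n_channels, n_times) -> frequencies and magnitude spectrum."""
    freqs = np.fft.rfftfreq(X.shape[-1], d=1.0 / sfreq)
    magnitude = np.abs(np.fft.rfft(X, axis=-1))
    return freqs, magnitude

# 2 s of a 10 Hz sinusoid sampled at 250 Hz: bins are spaced 250 / 500 = 0.5 Hz
sfreq = 250.0
t = np.arange(500) / sfreq
X = np.sin(2 * np.pi * 10.0 * t)[None, None, :]
freqs, mag = fft_magnitude(X, sfreq)
peak_hz = freqs[np.argmax(mag[0, 0])]
```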

### Classifier

- **Random Forest**:
  - As described in Pipeline 3.

### Notes

- **Frequency Resolution**: The bin spacing equals the sampling rate divided by the FFT length, so longer analysis windows give finer frequency resolution.
- **Feature Selection**: Not all frequency components may be informative; selecting relevant frequencies can improve performance.

---

## 18. Time-Frequency Features + Linear Discriminant Analysis (LDA) (`TimeFreq_LDA`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Time-Frequency Analysis (e.g., STFT or Wavelet Transform)**:
  - **Purpose**: Captures how the frequency content of the EEG signal changes over time.
  - **How It Works**: Applies time-frequency transformations to extract features representing both temporal and spectral information.
  - **Output**: Time-frequency representation flattened into feature vectors.

### Classifier

- **Linear Discriminant Analysis (LDA)**:
  - As described in Pipeline 8.

### Notes

- **Dimensionality Reduction**: Due to high dimensionality, techniques like PCA may be applied before classification.
- **Temporal Dynamics**: Time-frequency features can capture transient patterns associated with mental imagery.

---

## 19. EEGNet (Deep Convolutional Neural Network) (`EEGNet_CNN`)

### Preprocessing Steps

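- **Assumption**: No hand-crafted feature extraction is applied; EEGNet learns temporal and spatial filters directly from the epoched signal. Basic preprocessing such as bandpass filtering or resampling may still be applied beforehand.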

### Feature Extraction and Classification

- **EEGNet Architecture**:
  - **Purpose**: A compact CNN architecture tailored for EEG-based BCIs.
  - **How It Works**: A temporal convolution learns frequency filters, a depthwise convolution learns spatial filters for each temporal filter, and a separable convolution combines the resulting feature maps with few parameters; a final dense softmax layer performs classification.
  - **Input Shape**: Expects input data in the shape `(n_samples, n_channels, n_times, 1)`, where `n_samples` is the number of trials (epochs).
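Getting epoched data into this shape only requires adding a trailing singleton axis. The array dimensions below are hypothetical, and casting to `float32` is an assumption based on common deep-learning practice.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical epoched data: (n_trials, n_channels, n_times)
X = rng.standard_normal((73, 20, 1000))

# Add the trailing "image channel" axis expected by the 2-D convolutions
X_eegnet = X[..., np.newaxis].astype(np.float32)
```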

### Notes

- **End-to-End Learning**: Learns feature extraction and classification jointly.
- **Computational Resources**: Requires more computational power and training time compared to traditional methods.
- **Data Requirements**: Deep learning models generally require large amounts of data to avoid overfitting.

---

## 20. PSD Features + XGBoost Classifier (`PSD_XGB`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Power Spectral Density (PSD)**:
  - As described in Pipeline 2.

### Classifier

- **XGBoost Classifier**:
  - An optimized gradient boosting algorithm.
  - **Advantages**: High performance, regularization to prevent overfitting, handles missing values.

### Notes

- **Hyperparameter Tuning**: XGBoost has many parameters that can be tuned to optimize performance.
- **Feature Importance**: Provides measures of feature importance, aiding in feature selection.

---

## 21. Ensemble Methods (Stacking Classifier) (`Stacking`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Power Spectral Density (PSD)**:
  - As described in Pipeline 2.

### Classifier

- **Stacking Classifier**:
  - **Base Learners**: Combines predictions from multiple classifiers (e.g., SVM, Random Forest, k-NN).
  - **Meta-Learner**: Uses a higher-level classifier (e.g., Logistic Regression) to make the final prediction based on base learners' outputs.
  - **Advantages**: Can capture diverse patterns by leveraging strengths of different classifiers.
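A stacking setup along these lines can be sketched with scikit-learn's `StackingClassifier`. The base learners, their hyperparameters, and the synthetic features are assumptions for illustration, not the configuration used in `ml_pipelines`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic two-class feature matrix standing in for PSD features
rng = np.random.default_rng(0)
F = rng.standard_normal((80, 6))
y = np.repeat([0, 1], 40)
F[y == 1] += 1.5

stack = StackingClassifier(
    estimators=[
        ("svm", SVC(kernel="rbf", random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # the meta-learner is trained on out-of-fold base predictions
)
stack.fit(F, y)
train_acc = stack.score(F, y)
```

Training accuracy alone says little about generalization, so the stacked model should still be evaluated with an outer cross-validation loop.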

### Notes

- **Complexity**: Requires careful management to avoid overfitting due to increased model complexity.
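- **Cross-Validation**: The meta-learner should be trained on out-of-fold predictions of the base learners (internal cross-validation); otherwise it sees predictions made on data the base learners were fitted to, which leaks information.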

---

## 22. Sparse Representation Classification (SRC) (`SRC`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Raw Data or CSP Features**:
  - **Purpose**: SRC can work with raw data or features extracted using methods like CSP.
  - **How It Works**: Each class's training samples form a dictionary; test samples are represented as sparse linear combinations of dictionary atoms.

### Classifier

- **Sparse Representation Classifier (SRC)**:
  - **How It Works**: Solves an optimization problem to find the sparsest representation of a test sample in terms of the training dictionary.
  - **Prediction**: Classifies based on which class's dictionary best represents the test sample.
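The classification rule can be sketched with a Lasso-style relaxation of the sparsity problem, solved here by iterative soft-thresholding (ISTA). The function name `src_predict`, the regularization strength `lam`, and the toy dictionary are assumptions; this is one possible approximation, not the pipeline's actual solver.

```python
import numpy as np

def src_predict(D, labels, x, lam=0.05, n_iter=300):
    """D: (n_features, n_atoms) dictionary of unit-norm training samples."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = a - step * D.T @ (D @ a - x)                           # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lam * step, 0.0)   # soft threshold
    residuals = {}
    for c in np.unique(labels):
        a_c = np.where(labels == c, a, 0.0)  # keep only this class's coefficients
        residuals[c] = np.linalg.norm(x - D @ a_c)
    # The class whose atoms reconstruct x best wins
    return min(residuals, key=residuals.get)

# Toy dictionary: class 0 atoms lie near axis 0, class 1 atoms near axis 1
rng = np.random.default_rng(0)
n_features = 10
atoms0 = np.eye(n_features)[0] + 0.1 * rng.standard_normal((5, n_features))
atoms1 = np.eye(n_features)[1] + 0.1 * rng.standard_normal((5, n_features))
D = np.vstack([atoms0, atoms1]).T
D /= np.linalg.norm(D, axis=0, keepdims=True)
labels = np.repeat([0, 1], 5)

x = np.eye(n_features)[1] + 0.1 * rng.standard_normal(n_features)
x /= np.linalg.norm(x)
predicted = src_predict(D, labels, x)
```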

### Notes

- **Computational Complexity**: SRC can be computationally intensive due to the optimization involved.
- **Implementation**: In practice, approximations or simplifications (e.g., using Lasso regression) may be used.

---

## 23. Multilayer Perceptron (MLP) Neural Network (`MLP_NN`)

### Preprocessing Steps

- **Assumption**: Bandpass filtering is performed separately before this pipeline.

### Feature Extraction

- **Flattened EEG Data or Features**:
  - The EEG data is reshaped into a 2D array (samples x features), either directly or after feature extraction.

### Classifier

- **MLP Neural Network**:
  - **Architecture**: Consists of input, hidden, and output layers with non-linear activation functions.
  - **Learning**: Uses backpropagation to adjust weights and biases.

### Notes

- **Flexibility**: MLPs can model complex non-linear relationships.
- **Overfitting Risk**: Prone to overfitting if the network is too large or data is insufficient.
- **Hyperparameters**: Number of layers, neurons, activation functions, and learning rate need to be tuned.

---

## General Considerations Across Pipelines

- **Data Preprocessing**:
  - **Bandpass Filtering**: Essential to focus on frequency bands relevant to mental imagery (e.g., mu and beta rhythms, roughly 8-30 Hz).
  - **Artifact Removal**: ICA helps remove artifacts such as eye blinks, and CAR reduces noise common to all channels; both can improve feature quality.

- **Feature Extraction**:
  - **Importance**: The choice of feature extraction method significantly impacts classification performance.
  - **Dimensionality**: High-dimensional features may require dimensionality reduction or regularization to prevent overfitting.

- **Classifier Selection**:
  - **Linear vs. Non-linear**: Linear classifiers like LDA and Logistic Regression are simple and interpretable but may not capture complex patterns.
  - **Ensemble Methods**: Can improve performance by combining multiple models but may increase computational cost.
  - **Deep Learning Models**: Offer powerful feature learning capabilities but require more data and computational resources.

- **Evaluation**:
  - **Cross-Validation**: Essential for assessing model performance and generalization ability.
  - **Hyperparameter Tuning**: Optimization of model parameters is crucial for achieving the best performance.

---

## Conclusion

Each pipeline combines specific preprocessing techniques, feature extraction methods, and classifiers tailored to capture the characteristics of EEG signals associated with mental imagery tasks. The choice of pipeline depends on factors such as the nature of the EEG data, computational resources, and the specific requirements of the BCI application. By experimenting with different pipelines, researchers can identify the most effective approaches for their particular datasets and objectives.


In [3]:
import sys
sys.path.append('c:\\Users\\rokas\\Documents\\BCI\\mi-bci\\code')

# Import ml_pipelines from the file you saved earlier
from pipelines.ml_pipelines import ml_pipelines
from evaluation import train_and_evaluate

from helper_functions import setup_logger, load_procesed_data, process_mi_epochs
from datasets import Lee2019

In [4]:
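log = setup_logger("Lee_preprocess")
dataset = Lee2019()
dataset_no = 20   # dataset index used in the processed-data directory layout
paradigm = "MI"
subject = 44
run = 1
# Load the raw epochs and the autoreject-cleaned epochs for this subject and run
data = load_procesed_data(dataset_no, paradigm, subject, run, include=['epochs_raw', 'epochs_raw_autoreject'])
epochs = data["epochs_raw"]
# Further processing of the raw epochs (the log output below reports channel picking and an 8-30 Hz band-pass)
epochs_p = process_mi_epochs(epochs)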

Reading c:\Users\rokas\Documents\BCI\mi-bci\data\procesed\20\MI\44\1\s44.01_epochs_raw-epo.fif ...
    Found the data of interest:
        t =   -1200.00 ...    5200.00 ms
        0 CTF compensation matrices available
Not setting metadata
100 matching events found
No baseline correction applied
0 projection items activated
Reading c:\Users\rokas\Documents\BCI\mi-bci\data\procesed\20\MI\44\1\s44.01_epochs_raw_autoreject-epo.fif ...
    Found the data of interest:
        t =   -1200.00 ...    5200.00 ms
        0 CTF compensation matrices available
Not setting metadata
73 matching events found
No baseline correction applied
0 projection items activated
NOTE: pick_channels() is a legacy function. New code should use inst.pick(...).
Setting up band-pass filter from 8 - 30 Hz

IIR filter parameters
---------------------
Butterworth bandpass zero-phase (two-pass forward and reverse) non-causal filter:
- Filter order 20 (effective, after forward-backward)
- Cutoffs at 8.00, 30.00 Hz: -6.02, 

In [4]:
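# NOTE: epochs_for_train is not imported in the cells shown above; it must
# already be defined in the session for this cell to run.
epochs_train = epochs_for_train(epochs_p)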
results = train_and_evaluate(epochs_train, ml_pipelines, n_splits=5)

Evaluating pipeline: CSP_LogReg
Computing rank from data with rank=None
    Using tolerance 2.7e-05 (2.2e-16 eps * 20 dim * 6.1e+09  max singular value)
    Estimated rank (data): 20
    data: rank 20 computed from 20 data channels with 0 projectors
Reducing data rank from 20 -> 20
Estimating class=0 covariance using EMPIRICAL
Done.
Estimating class=1 covariance using EMPIRICAL
Done.
Computing rank from data with rank=None
    Using tolerance 2.7e-05 (2.2e-16 eps * 20 dim * 6e+09  max singular value)
    Estimated rank (data): 20
    data: rank 20 computed from 20 data channels with 0 projectors
Reducing data rank from 20 -> 20
Estimating class=0 covariance using EMPIRICAL
Done.
Estimating class=1 covariance using EMPIRICAL
Done.
Computing rank from data with rank=None
    Using tolerance 2.8e-05 (2.2e-16 eps * 20 dim * 6.2e+09  max singular value)
    Estimated rank (data): 20
    data: rank 20 computed from 20 data channels with 0 projectors
Reducing data rank from 20 -> 20
Estimatin

1 fits failed out of a total of 5.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
1 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\model_selection\_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\pipeline.py", line 469, in fit
    Xt = self._fit(X, y, routed_params)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\def

Fitting ICA to data using 1600 channels (please be patient, this may take a while)
Selecting by number: 15 components
Fitting ICA took 1.3s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.6s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.2s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.2s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.1s.
An error occurred in pipeline ICA_TimeDomain_SVM: 
All the 5 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\model_selection\_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\pipeline.py", line 469, in fit
    Xt = self._fit(X, y, routed_params)
  File "c:\Users\rokas

  info = mne.create_info(ch_names=['eeg'] * n_channels,


    Using tolerance 2.7e-05 (2.2e-16 eps * 20 dim * 6.1e+09  max singular value)
    Estimated rank (data): 20
    data: rank 20 computed from 20 data channels with 0 projectors
Reducing data rank from 20 -> 20
Estimating class=0 covariance using EMPIRICAL
Done.
Estimating class=1 covariance using EMPIRICAL
Done.
Computing rank from data with rank=None
    Using tolerance 2.7e-05 (2.2e-16 eps * 20 dim * 6e+09  max singular value)
    Estimated rank (data): 20
    data: rank 20 computed from 20 data channels with 0 projectors
Reducing data rank from 20 -> 20
Estimating class=0 covariance using EMPIRICAL
Done.
Estimating class=1 covariance using EMPIRICAL
Done.
Computing rank from data with rank=None
    Using tolerance 2.8e-05 (2.2e-16 eps * 20 dim * 6.2e+09  max singular value)
    Estimated rank (data): 20
    data: rank 20 computed from 20 data channels with 0 projectors
Reducing data rank from 20 -> 20
Estimating class=0 covariance using EMPIRICAL
Done.
Estimating class=1 covariance

  eigvals = operator(eigvals)
  eigvals = operator(eigvals)
  def isqrt(x): return 1. / np.sqrt(x)


An error occurred in pipeline CAR_Riemann_LogReg: 
All the 5 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
4 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\model_selection\_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\pipeline.py", line 469, in fit
    Xt = self._fit(X, y, routed_params)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages

  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],
  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.2s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.6s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.2s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.1s.


  info = mne.create_info(ch_names=['eeg'] * n_channels,
  info = mne.create_info(ch_names=['eeg'] * X_concat.shape[0],


Fitting ICA to data using 1600 channels (please be patient, this may take a while)


  self.ica.fit(raw)


Selecting by number: 15 components
Fitting ICA took 1.1s.
An error occurred in pipeline ICA_PSD_SVM: 
All the 5 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\model_selection\_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\base.py", line 1473, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "c:\Users\rokas\Documents\BCI\mi-bci\.pixi\envs\default\lib\site-packages\sklearn\pipeline.py", line 469, in fit
    Xt = self._fit(X, y, routed_params)
  File "c:\Users\rokas\Docume

  info = mne.create_info(ch_names=['eeg'] * n_channels,


Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.4000, Validation AUC: 0.4200
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.6500, Validation AUC: 0.8050
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.6500, Validation AUC: 0.5950
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.7000, Validation AUC: 0.7100
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.4000, Validation AUC: 0.5200
Our out-of-fold mean training accuracy is 1.0000
Our out-of-fold mean training AUC is 1.0000
Our out-of-fold mean validation accuracy is 0.5600
Our out-of-fold mean validation AUC is 0.6100
Evaluating pipeline: TimeFreq_LDA
Training Accuracy: 0.8125, Training AUC: 0.8919
Validation Accuracy: 0.7000, Validation AUC: 0.7000
Training Accuracy: 0.8250, Training AUC: 0.8894
Validation Accuracy: 0.6000, Validation AUC: 0.5800
Training Accuracy: 0.8125, Training AUC: 0.8544
Validation Accuracy: 0.7500,

Parameters: { "use_label_encoder" } are not used.



Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.6500, Validation AUC: 0.7500
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.9000, Validation AUC: 0.9900
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.8500, Validation AUC: 0.9400
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.9500, Validation AUC: 0.9800
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.8500, Validation AUC: 0.9200
Our out-of-fold mean training accuracy is 1.0000
Our out-of-fold mean training AUC is 1.0000
Our out-of-fold mean validation accuracy is 0.8400
Our out-of-fold mean validation AUC is 0.9160
Evaluating pipeline: Stacking
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.5000, Validation AUC: 0.6200
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.8000, Validation AUC: 0.9000
Training Accuracy: 1.0000, Training AUC: 1.0000
Validation Accuracy: 0.6500, Val

In [5]:
results

{'CSP_LogReg': {'train_accuracy': array([0.9875, 0.975 , 0.975 , 0.9875, 0.975 ]),
  'train_roc_auc': array([0.995625, 0.993125, 0.99125 , 0.995   , 0.995   ]),
  'val_accuracy': array([0.9 , 0.95, 0.95, 0.95, 0.9 ]),
  'val_roc_auc': array([0.97, 0.99, 1.  , 0.99, 1.  ]),
  'mean_train_accuracy': 0.9799999999999999,
  'mean_train_auc': 0.994,
  'mean_val_accuracy': 0.93,
  'mean_val_auc': 0.99},
 'PSD_SVM': {'train_accuracy': array([0.9375, 0.875 , 0.925 , 0.9   , 0.8875]),
  'train_roc_auc': array([0.99    , 0.953125, 0.97125 , 0.971875, 0.981875]),
  'val_accuracy': array([0.4 , 0.6 , 0.6 , 0.5 , 0.65]),
  'val_roc_auc': array([0.47, 0.66, 0.6 , 0.68, 0.62]),
  'mean_train_accuracy': 0.9049999999999999,
  'mean_train_auc': 0.9736249999999999,
  'mean_val_accuracy': 0.55,
  'mean_val_auc': 0.6060000000000001},
 'TimeDomain_RF': {'train_accuracy': array([1., 1., 1., 1., 1.]),
  'train_roc_auc': array([1., 1., 1., 1., 1.]),
  'val_accuracy': array([0.6 , 0.6 , 0.65, 0.6 , 0.5 ]),
  'va
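
With this many pipelines, the nested `results` dictionary is easier to read once it is ranked. The helper below is a small sketch, not part of the original notebook: `rank_pipelines` is a hypothetical name, the two scored entries are copied from the output above, and the `None` entry is an assumed placeholder for a pipeline whose fits failed.

```python
def rank_pipelines(results, key="mean_val_auc"):
    """Return (name, score) pairs sorted best-first, skipping failed pipelines."""
    scored = [
        (name, float(res[key]))
        for name, res in results.items()
        if res is not None and key in res
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Two entries copied from the output above; None stands in for a failed pipeline.
example_results = {
    "PSD_SVM": {"mean_val_accuracy": 0.55, "mean_val_auc": 0.6060000000000001},
    "CSP_LogReg": {"mean_val_accuracy": 0.93, "mean_val_auc": 0.99},
    "ICA_PSD_SVM": None,
}

ranking = rank_pipelines(example_results)
for name, score in ranking:
    print(f"{name:<12s} {score:.3f}")
```

Passing `key="mean_val_accuracy"` ranks by accuracy instead. Ranking on validation scores rather than training scores matters here, since several pipelines reach a training accuracy of 1.0 while validating near chance.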

In [None]:
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.base import clone
import numpy as np

def train_and_evaluate(X, y, pipelines, n_splits=5):
    """
    Trains and evaluates multiple pipelines using a train-validation split and computes accuracy and AUC.

    Parameters:
    - X: EEG data array of shape (n_samples, n_channels, n_times)
    - y: Labels array of shape (n_samples,)
    - pipelines: Dictionary of sklearn Pipelines
    - n_splits: Number of folds for cross-validation (not used in this version)

    Returns:
    - results: Dictionary containing accuracy and AUC scores for each pipeline
    """
    results = {}

    # Split data into training and validation sets
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.1, random_state=42, stratify=y
    )

    for name, pipeline in pipelines.items():
        print(f"Evaluating pipeline: {name}")
        try:
            # Clone the pipeline to ensure a fresh model
            clf = clone(pipeline)

            # Fit the classifier on the training data
            clf.fit(X_tr, y_tr)

            # Predict on the validation set
            pred = clf.predict(X_val)
            if hasattr(clf, "predict_proba"):
                pred_prob = clf.predict_proba(X_val)[:, 1]
            else:
                # Fall back to decision_function when predict_proba is unavailable.
                # ROC AUC only depends on the ranking of the scores, so the raw
                # values can be used directly (no rescaling, no divide-by-zero risk).
                pred_prob = clf.decision_function(X_val)

            # Compute accuracy and AUC scores
            acc_score = accuracy_score(y_val, pred)
            auc_score = roc_auc_score(y_val, pred_prob)

            print(
                f"Our accuracy on the validation set is {acc_score:0.4f} and AUC is {auc_score:0.4f}"
            )

            # Store the results
            results[name] = {'accuracy': acc_score, 'auc': auc_score}

        except Exception as e:
            print(f"An error occurred in pipeline {name}: {e}")
            results[name] = None

    return results
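
A note on the `decision_function` fallback: ROC AUC is computed from the ordering of the scores, so a strictly increasing transformation such as min-max scaling should leave it unchanged. The sketch below checks this on synthetic scores; the data and seed are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = np.array([0, 1] * 10)

# Stand-in for the output of clf.decision_function(X_val)
raw_scores = rng.normal(size=y_true.size) + y_true

# Min-max scaling is monotonic, so the ranking of the samples is preserved
scaled_scores = (raw_scores - raw_scores.min()) / (raw_scores.max() - raw_scores.min())

auc_raw = roc_auc_score(y_true, raw_scores)
auc_scaled = roc_auc_score(y_true, scaled_scores)
print(auc_raw, auc_scaled)
```

Note that the scaled values are still not calibrated probabilities; they are only safe to use for ranking-based metrics such as AUC.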


In [17]:
results_ml

{'CSP_LogReg': None,
 'PSD_SVM': None,
 'TimeDomain_RF': None,
 'Hilbert_KNN': None,
 'Wavelet_PSD_NB': None,
 'CAR_CSP_DT': None,
 'ICA_TimeDomain_SVM': None,
 'CSP_LDA': None,
 'PSD_GB': None,
 'CAR_Riemann_LogReg': None,
 'STFT_SVM': None,
 'Morlet_KNN': None,
 'AR_LogReg': None,
 'MeanAmp_NB': None,
 'ICA_PSD_SVM': None,
 'FFT_RF': None,
 'TimeFreq_LDA': None,
 'PSD_XGB': None,
 'Stacking': None,
 'SRC': None,
 'MLP_NN': None}
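
Every entry being `None` means each pipeline raised an exception that the `try`/`except` in `train_and_evaluate` caught and printed. One way to see the underlying error for a single pipeline is to re-run it with `error_score='raise'`, as the scikit-learn message in the logs suggests. The sketch below assumes a hypothetical `BrokenTransformer` and synthetic data purely to make the example runnable; `first_failure` is likewise an illustrative helper name.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline


class BrokenTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical step that always fails, standing in for a misconfigured transformer."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        raise ValueError("shape mismatch in transform")


def first_failure(pipeline, X, y, n_splits=5):
    """Return the exception raised by the first failing fold, or None if all folds succeed."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    try:
        cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy", error_score="raise")
    except Exception as exc:
        return exc
    return None


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 6))
y_demo = np.array([0, 1] * 20)

broken = Pipeline([("bad", BrokenTransformer()), ("logreg", LogisticRegression())])
healthy = Pipeline([("logreg", LogisticRegression())])

print(repr(first_failure(broken, X_demo, y_demo)))
print(first_failure(healthy, X_demo, y_demo))
```

Running the real pipelines one at a time this way, without `n_jobs=-1`, tends to give a complete traceback instead of the truncated summaries shown in the logs.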

Full

In [5]:
# Import necessary libraries
import mne
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score, KFold, StratifiedKFold
from sklearn.base import BaseEstimator, TransformerMixin, ClassifierMixin
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              VotingClassifier, StackingClassifier)
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression, Lasso
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
import xgboost as xgb
from mne.decoding import CSP, Vectorizer
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace

# Import for Deep Learning models
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Conv2D, DepthwiseConv2D, SeparableConv2D,
                                     BatchNormalization, Activation, AveragePooling2D,
                                     Dropout, Flatten, Dense)
from tensorflow.keras.constraints import max_norm
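# Note: this wrapper was removed in TensorFlow 2.12; on newer versions,
# scikeras.wrappers.KerasClassifier is the maintained replacement.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier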

# ===========================================
# Custom Transformers (Use built-in where possible)
# ===========================================

# Since some transformations are not available in built-in libraries,
# we define custom transformers only when necessary.

class ReshapeTransformer(BaseEstimator, TransformerMixin):
    """Reshapes 3D EEG data to 2D for classifier input."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        n_samples = X.shape[0]
        return X.reshape(n_samples, -1)

class PSDTransformer(BaseEstimator, TransformerMixin):
    """Extracts Power Spectral Density (PSD) features."""
    def __init__(self, sfreq=256, fmin=0.1, fmax=40):
        self.sfreq = sfreq
        self.fmin = fmin
        self.fmax = fmax
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        psd, freqs = mne.time_frequency.psd_array_multitaper(
            X, sfreq=self.sfreq, fmin=self.fmin, fmax=self.fmax, verbose=False)
        return psd.reshape(psd.shape[0], -1)

class ARTransformer(BaseEstimator, TransformerMixin):
    """Computes Autoregressive (AR) coefficients."""
    def __init__(self, order=5):
        self.order = order
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        from statsmodels.tsa.ar_model import AutoReg
        n_samples, n_channels, n_times = X.shape
        ar_features = np.zeros((n_samples, n_channels * self.order))
        for i in range(n_samples):
            for j in range(n_channels):
                model = AutoReg(X[i, j, :], lags=self.order, old_names=False)
                model_fit = model.fit()
                ar_coeffs = model_fit.params[1:]  # Exclude intercept
                ar_features[i, j * self.order:(j + 1) * self.order] = ar_coeffs
        return ar_features

# ===========================================
# Machine Learning Pipelines
# ===========================================

ml_pipelines = {}

# 1. CSP + Logistic Regression
ml_pipelines['CSP_LogReg'] = Pipeline([
    ('csp', CSP(n_components=4, reg=None, log=True, norm_trace=False)),
    ('logreg', LogisticRegression(max_iter=1000))
])

# 2. PSD + SVM
ml_pipelines['PSD_SVM'] = Pipeline([
    ('psd', PSDTransformer()),
    ('scaler', StandardScaler()),
    ('svm', SVC())
])

# 3. Time Domain Features + Random Forest
class TimeDomainTransformer(BaseEstimator, TransformerMixin):
    """Computes time-domain features like mean, variance, skewness."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        mean = np.mean(X, axis=2)
        var = np.var(X, axis=2)
        skewness = np.mean(((X - mean[:, :, np.newaxis]) ** 3), axis=2) / (var ** 1.5)
        features = np.concatenate((mean, var, skewness), axis=1)
        return features

ml_pipelines['TimeDomain_RF'] = Pipeline([
    ('time_features', TimeDomainTransformer()),
    ('rf', RandomForestClassifier())
])

# 4. Hilbert Transform + k-NN
class HilbertTransformer(BaseEstimator, TransformerMixin):
    """Applies the Hilbert transform to EEG data."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        # mne.filter exposes no public `hilbert` function; use SciPy's instead
        from scipy.signal import hilbert
        analytic_signal = hilbert(X, axis=-1)
        amplitude_envelope = np.abs(analytic_signal)
        return amplitude_envelope

ml_pipelines['Hilbert_KNN'] = Pipeline([
    ('hilbert', HilbertTransformer()),
    ('reshape', ReshapeTransformer()),
    ('scaler', StandardScaler()),
    ('knn', KNeighborsClassifier())
])

# 5. Wavelet Transform + PSD + Naive Bayes
class WaveletTransformer(BaseEstimator, TransformerMixin):
    """Applies Wavelet Transform."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        import pywt
        n_samples, n_channels, n_times = X.shape
        wavelet_features = []
        for i in range(n_samples):
            sample_features = []
            for j in range(n_channels):
                coeffs = pywt.wavedec(X[i, j, :], 'db4', level=3)
                coeffs_flat = np.concatenate([c.flatten() for c in coeffs])
                sample_features.append(coeffs_flat)
            wavelet_features.append(np.concatenate(sample_features))
        return np.array(wavelet_features)

ml_pipelines['Wavelet_PSD_NB'] = Pipeline([
    ('wavelet', WaveletTransformer()),
    ('psd', PSDTransformer()),
    ('nb', GaussianNB())
])

# 6. CAR + CSP + Decision Tree
class CARTransformer(BaseEstimator, TransformerMixin):
    """Applies Common Average Referencing (CAR)."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        car = X - np.mean(X, axis=1, keepdims=True)
        return car

ml_pipelines['CAR_CSP_DT'] = Pipeline([
    ('car', CARTransformer()),
    ('csp', CSP(n_components=4, reg=None, log=True, norm_trace=False)),
    ('dt', DecisionTreeClassifier())
])

# 7. ICA + Time Domain Features + SVM
class ICATransformer(BaseEstimator, TransformerMixin):
    """Applies Independent Component Analysis (ICA)."""
    def __init__(self, n_components=15, random_state=42):
        self.n_components = n_components
        self.random_state = random_state
        self.ica = None
    def fit(self, X, y=None):
        n_samples, n_channels, n_times = X.shape
        # Concatenate trials along time so the ICA sees n_channels signals,
        # not n_samples * n_channels pseudo-channels
        X_concat = X.transpose(1, 0, 2).reshape(n_channels, n_samples * n_times)
        self.ica = mne.preprocessing.ICA(n_components=self.n_components,
                                         random_state=self.random_state,
                                         max_iter='auto', verbose=False)
        # Channel names must be unique
        ch_names = [f'EEG{idx:03d}' for idx in range(n_channels)]
        info = mne.create_info(ch_names=ch_names, sfreq=256, ch_types='eeg')
        raw = mne.io.RawArray(X_concat, info, verbose=False)
        self.ica.fit(raw)
        return self
    def transform(self, X):
        n_samples, n_channels, n_times = X.shape
        ch_names = [f'EEG{idx:03d}' for idx in range(n_channels)]
        info = mne.create_info(ch_names=ch_names, sfreq=256, ch_types='eeg')
        X_transformed = []
        for i in range(n_samples):
            raw = mne.io.RawArray(X[i], info, verbose=False)
            # exclude=[] removes no components; list component indices here
            # to reject artifacts
            raw_ica = self.ica.apply(raw.copy(), exclude=[], verbose=False)
            X_transformed.append(raw_ica.get_data())
        return np.array(X_transformed)

ml_pipelines['ICA_TimeDomain_SVM'] = Pipeline([
    ('ica', ICATransformer(n_components=15)),
    ('time_features', TimeDomainTransformer()),
    ('svm', SVC())
])

# 8. CSP + LDA
ml_pipelines['CSP_LDA'] = Pipeline([
    ('csp', CSP(n_components=4, reg=None, log=True, norm_trace=False)),
    ('lda', LDA())
])

# 9. PSD + Gradient Boosting
ml_pipelines['PSD_GB'] = Pipeline([
    ('psd', PSDTransformer()),
    ('scaler', StandardScaler()),
    ('gb', GradientBoostingClassifier())
])

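# 10. CAR + Riemannian Geometry Features + Logistic Regression
# CAR makes each trial's covariance rank-deficient, so a shrinkage estimator
# ('oas') is used to keep the matrices positive definite for TangentSpace.
ml_pipelines['CAR_Riemann_LogReg'] = Pipeline([
    ('car', CARTransformer()),
    ('cov', Covariances(estimator='oas')),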
    ('ts', TangentSpace()),
    ('logreg', LogisticRegression(max_iter=1000))
])

# 11. STFT + SVM
class STFTTransformer(BaseEstimator, TransformerMixin):
    """Computes Short-Time Fourier Transform (STFT)."""
    def __init__(self, n_fft=256):
        self.n_fft = n_fft
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        from scipy.signal import stft
        n_samples, n_channels, n_times = X.shape
        stft_features = []
        for i in range(n_samples):
            sample_features = []
            for j in range(n_channels):
                _, _, Zxx = stft(X[i, j, :], nperseg=self.n_fft)
                sample_features.append(np.abs(Zxx).flatten())
            stft_features.append(np.concatenate(sample_features))
        return np.array(stft_features)

ml_pipelines['STFT_SVM'] = Pipeline([
    ('stft', STFTTransformer(n_fft=256)),
    ('scaler', StandardScaler()),
    ('svm', SVC())
])

# 12. Morlet Wavelet Transform + k-NN
class MorletWaveletTransformer(BaseEstimator, TransformerMixin):
    """Applies time-frequency analysis using Morlet wavelets."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        freqs = np.linspace(8, 30, num=22)
        n_samples, n_channels, n_times = X.shape
        power_features = []
        for i in range(n_samples):
            sample_features = []
            for j in range(n_channels):
                power = mne.time_frequency.tfr_array_morlet(
                    X[i:i+1, j:j+1, :], sfreq=256, freqs=freqs,
                    n_cycles=7, output='power', verbose=False)
                sample_features.append(power.flatten())
            power_features.append(np.concatenate(sample_features))
        return np.array(power_features)

ml_pipelines['Morlet_KNN'] = Pipeline([
    ('morlet', MorletWaveletTransformer()),
    ('scaler', StandardScaler()),
    ('knn', KNeighborsClassifier())
])

# 13. AR Coefficients + Logistic Regression
ml_pipelines['AR_LogReg'] = Pipeline([
    ('ar', ARTransformer(order=5)),
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(max_iter=1000))
])

# 14. Mean Amplitude Features + Naive Bayes
class MeanAmplitudeTransformer(BaseEstimator, TransformerMixin):
    """Computes the mean amplitude of the signal."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        mean_amplitude = np.mean(np.abs(X), axis=2)
        return mean_amplitude

ml_pipelines['MeanAmp_NB'] = Pipeline([
    ('mean_amp', MeanAmplitudeTransformer()),
    ('nb', GaussianNB())
])

# 15. ICA + PSD + SVM
ml_pipelines['ICA_PSD_SVM'] = Pipeline([
    ('ica', ICATransformer(n_components=15)),
    ('psd', PSDTransformer()),
    ('scaler', StandardScaler()),
    ('svm', SVC())
])

# 16. FFT + Random Forest
class FFTTransformer(BaseEstimator, TransformerMixin):
    """Computes Fast Fourier Transform (FFT)."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        fft_coeffs = np.fft.rfft(X, axis=2)
        fft_features = np.abs(fft_coeffs)
        return fft_features.reshape(X.shape[0], -1)

ml_pipelines['FFT_RF'] = Pipeline([
    ('fft', FFTTransformer()),
    ('scaler', StandardScaler()),
    ('rf', RandomForestClassifier())
])

# 17. Time-Frequency Features (STFT) + LDA
ml_pipelines['TimeFreq_LDA'] = Pipeline([
    ('stft', STFTTransformer(n_fft=256)),
    ('scaler', StandardScaler()),
    ('lda', LDA())
])

# 18. PSD Features + XGBoost Classifier
ml_pipelines['PSD_XGB'] = Pipeline([
    ('psd', PSDTransformer()),
    ('scaler', StandardScaler()),
    ('xgb', xgb.XGBClassifier(eval_metric='logloss'))
])

# 19. Ensemble Methods (Stacking Classifier)
ml_pipelines['Stacking'] = Pipeline([
    ('psd', PSDTransformer()),
    ('scaler', StandardScaler()),
    ('stacking', StackingClassifier(
        estimators=[
            ('svm', SVC(probability=True)),
            ('rf', RandomForestClassifier()),
            ('knn', KNeighborsClassifier())
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5
    ))
])

# 20. Sparse Representation Classification (SRC)
class SRCClassifier(BaseEstimator, ClassifierMixin):
    """Simulates a Sparse Representation Classifier using Lasso."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.classes_ = None
        self.dictionary_ = None
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.dictionary_ = {}
        for cls in self.classes_:
            self.dictionary_[cls] = X[y == cls].T
        return self
    def predict(self, X):
        preds = []
        for x in X:
            residuals = []
            for cls in self.classes_:
                lasso = Lasso(alpha=self.alpha, max_iter=1000)
                lasso.fit(self.dictionary_[cls], x)
                reconstruction = lasso.predict(self.dictionary_[cls])
                residual = np.linalg.norm(x - reconstruction)
                residuals.append(residual)
            preds.append(self.classes_[np.argmin(residuals)])
        return np.array(preds)

ml_pipelines['SRC'] = Pipeline([
    ('reshape', ReshapeTransformer()),
    ('scaler', StandardScaler()),
    ('src', SRCClassifier(alpha=0.1))
])

# 21. Multilayer Perceptron (MLP) Neural Network (As ML Pipeline)
ml_pipelines['MLP_NN'] = Pipeline([
    ('reshape', ReshapeTransformer()),
    ('scaler', StandardScaler()),
    ('mlp', MLPClassifier(hidden_layer_sizes=(100,), max_iter=500))
])

# ===========================================
# Neural Network Pipelines
# ===========================================

nn_pipelines = {}

# 1. EEGNet (Deep CNN)
class EEGNetTransformer(BaseEstimator, TransformerMixin):
    """Transformer that reshapes data for EEGNet."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        # Reshape X to (n_samples, n_channels, n_times, 1)
        n_samples, n_channels, n_times = X.shape
        X_reshaped = X.reshape(n_samples, n_channels, n_times, 1)
        return X_reshaped

def create_eegnet_model(n_channels, n_times, n_classes):
    """Creates an EEGNet model."""
    input_shape = (n_channels, n_times, 1)
    inputs = Input(shape=input_shape)

    # Block 1
    x = Conv2D(16, (1, 64), padding='same',
               use_bias=False)(inputs)
    x = BatchNormalization()(x)
    x = DepthwiseConv2D((n_channels, 1), use_bias=False,
                        depth_multiplier=2,
                        depthwise_constraint=max_norm(1.))(x)
    x = BatchNormalization()(x)
    x = Activation('elu')(x)
    x = AveragePooling2D((1, 4))(x)
    x = Dropout(0.25)(x)

    # Block 2
    x = SeparableConv2D(32, (1, 16), use_bias=False, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('elu')(x)
    x = AveragePooling2D((1, 8))(x)
    x = Dropout(0.25)(x)

    x = Flatten()(x)

    x = Dense(n_classes, activation='softmax')(x)

    model = Model(inputs=inputs, outputs=x)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# Placeholder function to create the model with the correct input dimensions
def create_eegnet_model_wrapper():
    n_channels = X.shape[1]
    n_times = X.shape[2]
    n_classes = len(np.unique(y))
    return create_eegnet_model(n_channels, n_times, n_classes)

nn_pipelines['EEGNet_CNN'] = Pipeline([
    ('reshape', EEGNetTransformer()),
    ('eegnet', KerasClassifier(build_fn=create_eegnet_model_wrapper, epochs=50, batch_size=16, verbose=0))
])

# ===========================================
# Training Function with K-Fold Cross-Validation
# ===========================================

def train_and_evaluate(X, y, pipelines, n_splits=5):
    """
    Trains and evaluates multiple pipelines using k-fold cross-validation.

    Parameters:
    - X: EEG data array of shape (n_samples, n_channels, n_times)
    - y: Labels array of shape (n_samples,)
    - pipelines: Dictionary of sklearn Pipelines
    - n_splits: Number of folds for cross-validation

    Returns:
    - results: Dictionary containing cross-validation scores for each pipeline
    """
    results = {}

    for name, pipeline in pipelines.items():
        print(f"Evaluating pipeline: {name}")
        try:
            if name == 'EEGNet_CNN':
                # Use StratifiedKFold for EEGNet to maintain class balance
                skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
                scores = []
                for train_index, test_index in skf.split(X, y):
                    X_train, X_test = X[train_index], X[test_index]
                    y_train, y_test = y[train_index], y[test_index]
                    pipeline.fit(X_train, y_train)
                    score = pipeline.score(X_test, y_test)
                    scores.append(score)
                scores = np.array(scores)
            else:
                kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
                scores = cross_val_score(pipeline, X, y, cv=kf, scoring='accuracy', n_jobs=-1)
            print(f"Scores: {scores}")
            print(f"Mean accuracy: {np.mean(scores):.4f}")
            results[name] = scores
        except Exception as e:
            print(f"An error occurred in pipeline {name}: {e}")
            results[name] = None
    return results

# ===========================================
# Example Usage
# ===========================================

# Assuming `X` and `y` are your EEG data and labels
# Adjust the code to match your data loading and preprocessing steps

# Example call for Machine Learning Pipelines
# results_ml = train_and_evaluate(X, y, ml_pipelines, n_splits=5)

# Example call for Neural Network Pipelines
# results_nn = train_and_evaluate(X, y, nn_pipelines, n_splits=5)
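
The `train_and_evaluate` function above only returns validation accuracy, whereas the earlier `results` output also holds per-fold training scores and AUC. The following is a sketch of one way to produce that structure with scikit-learn's `cross_validate`; it is not the notebook's actual implementation. The function name `evaluate_with_train_scores`, the `log_variance` feature, and the synthetic data are all assumptions made so the example runs on its own.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def log_variance(X):
    """Log-variance per channel for X of shape (n_samples, n_channels, n_times)."""
    return np.log(np.var(X, axis=2))


def evaluate_with_train_scores(pipeline, X, y, n_splits=5, seed=42):
    """Stratified k-fold CV returning per-fold and mean train/validation scores."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_validate(
        pipeline, X, y, cv=cv,
        scoring={"accuracy": "accuracy", "roc_auc": "roc_auc"},
        return_train_score=True,
    )
    out = {
        "train_accuracy": scores["train_accuracy"],
        "train_roc_auc": scores["train_roc_auc"],
        "val_accuracy": scores["test_accuracy"],
        "val_roc_auc": scores["test_roc_auc"],
    }
    out["mean_train_accuracy"] = float(np.mean(out["train_accuracy"]))
    out["mean_train_auc"] = float(np.mean(out["train_roc_auc"]))
    out["mean_val_accuracy"] = float(np.mean(out["val_accuracy"]))
    out["mean_val_auc"] = float(np.mean(out["val_roc_auc"]))
    return out


# Synthetic two-class data: class 1 has higher amplitude on channel 0
rng = np.random.default_rng(0)
n_trials, n_channels, n_times = 60, 4, 128
y_demo = np.array([0, 1] * (n_trials // 2))
X_demo = rng.normal(size=(n_trials, n_channels, n_times))
X_demo[y_demo == 1, 0, :] *= 2.0

demo_pipeline = Pipeline([
    ("logvar", FunctionTransformer(log_variance)),
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(max_iter=1000)),
])

demo_scores = evaluate_with_train_scores(demo_pipeline, X_demo, y_demo)
print({k: v for k, v in demo_scores.items() if k.startswith("mean_")})
```

Using `StratifiedKFold` keeps both classes in every fold, which the `roc_auc` scorer requires. Comparing `mean_train_*` with `mean_val_*` makes overfitting visible, as with the pipelines in the logs that reach a training accuracy of 1.0.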


