# LLM-EEG Framework - Phase 3: Feature Extraction & Classification

This notebook demonstrates the complete Phase 3 implementation for the LLM-EEG framework,
focused on feature extraction and classification for motor imagery EEG signals.

## Overview

**Phase 3 Components:**
- **Feature Extractors**: CSP, Band Power, Time Domain
- **Feature Pipeline**: Modular feature extraction with multiple extractors
- **Classifiers**: LDA, SVM, EEGNet
- **Evaluation**: Cross-validation, metrics, model comparison

**Building on Phase 2:**
- Data loading (BCICIV2aLoader)
- Preprocessing (Bandpass, Notch, Normalization)
- PyTorch datasets

**Performance Targets:**
- Subject-dependent accuracy: >85%
- Subject-independent accuracy: >70%
- Cohen's Kappa: >0.80
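
For a balanced four-class problem the accuracy and kappa targets are closely linked. Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (accuracy) and p_e is the agreement expected by chance, which is 0.25 when both the true labels and the predictions are spread evenly over four classes. The sketch below computes kappa from a confusion matrix with NumPy. The helper name `kappa_from_confusion` and the example matrix are illustrative and not part of the framework.

```python
import numpy as np

def kappa_from_confusion(cm):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_observed = np.trace(cm) / n
    # Chance agreement: product of row and column marginals, summed over classes
    p_expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
    return (p_observed - p_expected) / (1.0 - p_expected)

# Illustrative matrix: 4 balanced classes, 85% accuracy, errors spread evenly
cm = np.full((4, 4), 5)
np.fill_diagonal(cm, 85)
accuracy = np.trace(cm) / cm.sum()
kappa = kappa_from_confusion(cm)
print(f"accuracy={accuracy:.2f}, kappa={kappa:.2f}")
```

Under these assumptions, 85% accuracy corresponds to a kappa of 0.80, so the two subject-dependent targets describe roughly the same operating point.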

---

## Step 1: Environment Setup

In [None]:
# Step 1.1: Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Step 1.2: Clone the repository
!git clone https://github.com/erlika/llm-eeg.git
%cd llm-eeg

In [None]:
# Step 1.3: Install dependencies
!pip install -q numpy scipy mne torch scikit-learn matplotlib seaborn

In [None]:
# Step 1.4: Add src to Python path and verify imports
import sys
sys.path.insert(0, '/content/llm-eeg')

# Verify Phase 3 imports
from src.features import (
    CSPExtractor, BandPowerExtractor, TimeDomainExtractor,
    FeatureExtractorFactory, FeatureExtractionPipeline,
    create_csp_extractor, create_motor_imagery_pipeline
)
from src.classifiers import (
    LDAClassifier, SVMClassifier, EEGNetClassifier,
    ClassifierFactory, create_lda_classifier, create_svm_classifier,
    create_eegnet_classifier, list_available_classifiers
)

print("✅ Phase 3 modules imported successfully!")
print(f"\nAvailable classifiers: {list_available_classifiers()}")

## Step 2: Load and Preprocess Data (Phase 2 Review)

In [None]:
# Step 2.1: Configure data paths
import os
import numpy as np

# Update this path to your Google Drive location
DATA_DIR = '/content/drive/MyDrive/BCI_Data/dataset_2a'

# Alternative paths:
# DATA_DIR = '/content/drive/MyDrive/BCI_Competition_IV_2a'

if os.path.exists(DATA_DIR):
    files = os.listdir(DATA_DIR)
    mat_files = [f for f in files if f.endswith('.mat')]
    print(f"✅ Found {len(mat_files)} MAT files in {DATA_DIR}")
else:
    print(f"❌ Directory not found: {DATA_DIR}")
    print("Please update DATA_DIR to your dataset location")

In [None]:
# Step 2.2: Load data using Phase 2 BCICIV2aLoader
from scipy.io import loadmat
from src.core.data_types import EEGData, EventMarker

class BCICIV2aLoader:
    """Data loader for BCI Competition IV-2a dataset."""
    
    def __init__(self, sampling_rate=250, include_eog=False, trial_duration=4.0, trial_offset=0.0):
        self.sampling_rate = sampling_rate
        self.n_eeg_channels = 22
        self.include_eog = include_eog
        self.trial_duration = trial_duration
        self.trial_offset = trial_offset
        self.class_mapping = {1: 'left_hand', 2: 'right_hand', 3: 'feet', 4: 'tongue'}
        self.eeg_channel_names = [
            'Fz', 'FC3', 'FC1', 'FCz', 'FC2', 'FC4',
            'C5', 'C3', 'C1', 'Cz', 'C2', 'C4', 'C6',
            'CP3', 'CP1', 'CPz', 'CP2', 'CP4', 'P1', 'Pz', 'P2', 'POz'
        ]
        
    def load(self, file_path):
        mat_data = loadmat(file_path, struct_as_record=False, squeeze_me=True)
        data_array = mat_data['data']
        
        all_signals = []
        all_events = []
        sample_offset = 0
        
        for run_idx in range(len(data_array)):
            run = data_array[run_idx]
            signals = run.X
            n_samples = signals.shape[0]
            all_signals.append(signals)
            
            if hasattr(run, 'y') and hasattr(run.y, '__len__') and len(run.y) > 0:
                for start, label in zip(run.trial, run.y):
                    event = EventMarker(
                        sample=int(start) + sample_offset,
                        code=768 + int(label),
                        label=self.class_mapping.get(int(label), f'class_{label}')
                    )
                    all_events.append(event)
            sample_offset += n_samples
        
        signals = np.vstack(all_signals).T[:self.n_eeg_channels, :]
        
        return EEGData(
            signals=signals,
            sampling_rate=self.sampling_rate,
            channel_names=self.eeg_channel_names,
            events=all_events
        )
    
    def extract_trials(self, eeg_data, duration=None, offset=None):
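        # Use explicit None checks so that a caller-supplied 0.0 is respected
        duration = self.trial_duration if duration is None else duration
        offset = self.trial_offset if offset is None else offset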
        samples_per_trial = int(duration * self.sampling_rate)
        offset_samples = int(offset * self.sampling_rate)
        
        trials, labels = [], []
        for event in eeg_data.events:
            start = event.sample + offset_samples
            end = start + samples_per_trial
            if start < 0 or end > eeg_data.signals.shape[1]:
                continue
            trials.append(eeg_data.signals[:, start:end])
            labels.append(event.code - 769)
        
        return np.array(trials), np.array(labels)

print("✅ BCICIV2aLoader ready")

In [None]:
# Step 2.3: Load Subject A01 Training Data
loader = BCICIV2aLoader()

subject_file = os.path.join(DATA_DIR, 'A01T.mat')
eeg_data = loader.load(subject_file)
X, y = loader.extract_trials(eeg_data)

print(f"\n=== Subject A01 Data ===")
print(f"Trials shape: {X.shape}")
print(f"Labels shape: {y.shape}")
print(f"Classes: {np.unique(y)}")
print(f"Samples per class: {[np.sum(y==c) for c in range(4)]}")

In [None]:
# Step 2.4: Preprocess Data
from src.preprocessing import create_standard_pipeline

pipeline = create_standard_pipeline(
    sampling_rate=250,
    notch_freq=50.0,
    low_freq=8.0,
    high_freq=30.0,
    normalize_method='zscore'
)
pipeline.initialize()

X_processed = pipeline.process(X)
print(f"\n=== Preprocessed Data ===")
print(f"Shape: {X_processed.shape}")
print(f"Range: [{X_processed.min():.2f}, {X_processed.max():.2f}]")

In [None]:
# Step 2.5: Split Data
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_processed, y, test_size=0.2, random_state=42, stratify=y
)

print(f"\n=== Data Split ===")
print(f"Train: {X_train.shape}")
print(f"Test: {X_test.shape}")

## Step 3: CSP Feature Extraction

Common Spatial Patterns (CSP) is one of the most widely used spatial filtering techniques for motor imagery EEG classification.

**How CSP works:**
1. Learn spatial filters that maximize the variance of one class while minimizing the variance of the other
2. Project EEG data through these filters
3. Compute log-variance features
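
The three steps above can be sketched for the two-class case with NumPy and SciPy. This is a minimal illustration on synthetic data, not the implementation behind `CSPExtractor`: the function names `fit_csp_binary` and `csp_log_variance` are hypothetical, and how the framework extends CSP to four classes is not shown here.

```python
import numpy as np
from scipy.linalg import eigh

def fit_csp_binary(X_a, X_b, n_components=2):
    """Learn CSP filters from two classes of trials shaped (trials, channels, samples)."""
    def mean_cov(X):
        # Trace-normalized spatial covariance, averaged over trials
        covs = [x @ x.T / np.trace(x @ x.T) for x in X]
        return np.mean(covs, axis=0)

    C_a, C_b = mean_cov(X_a), mean_cov(X_b)
    # Generalized eigenproblem C_a w = lambda (C_a + C_b) w; eigenvalues ascend
    eigvals, eigvecs = eigh(C_a, C_a + C_b)
    order = np.argsort(eigvals)
    half = n_components // 2
    # Smallest eigenvalues favor class b, largest favor class a
    picks = np.concatenate([order[:half], order[-half:]])
    return eigvecs[:, picks].T  # (n_components, channels)

def csp_log_variance(X, filters):
    """Project trials through the filters and return normalized log-variance features."""
    projected = np.einsum('fc,tcs->tfs', filters, X)
    var = projected.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))

# Synthetic demo: class a is strong on channel 0, class b on channel 1
rng = np.random.default_rng(0)
X_a = rng.standard_normal((20, 4, 200))
X_b = rng.standard_normal((20, 4, 200))
X_a[:, 0, :] *= 3.0
X_b[:, 1, :] *= 3.0

filters = fit_csp_binary(X_a, X_b, n_components=2)
feats_a = csp_log_variance(X_a, filters)
feats_b = csp_log_variance(X_b, filters)
print(feats_a.shape, feats_a.mean(axis=0), feats_b.mean(axis=0))
```

The last filter corresponds to the largest eigenvalue, so its log-variance feature is higher for class a than for class b, which is what makes the features separable by a linear classifier.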

In [None]:
# Step 3.1: Create and fit CSP extractor
from src.features import CSPExtractor, create_csp_extractor

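# Create CSP with 6 components (for a binary problem this is 3 per class;
# the data here has 4 classes, so the split depends on CSPExtractor's multi-class strategy)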
csp = create_csp_extractor(n_components=6, sampling_rate=250)

# Fit and extract features
X_train_csp = csp.fit_extract(X_train, y_train)
X_test_csp = csp.extract(X_test)

print(f"\n=== CSP Feature Extraction ===")
print(f"Original shape: {X_train.shape}")
print(f"CSP features shape: {X_train_csp.shape}")
print(f"Feature names: {csp.get_feature_names()[:6]}")
print(f"\nFilters shape: {csp.get_spatial_filters().shape}")
print(f"Patterns shape: {csp.get_spatial_patterns().shape}")

In [None]:
# Step 3.2: Visualize CSP Spatial Patterns
import matplotlib.pyplot as plt
import numpy as np

patterns = csp.get_spatial_patterns()
n_patterns = min(6, patterns.shape[0])

fig, axes = plt.subplots(2, 3, figsize=(12, 8))
axes = axes.flatten()

for i in range(n_patterns):
    ax = axes[i]
    pattern = patterns[i]
    
    # Simple bar plot of channel weights
    ax.bar(range(len(pattern)), pattern)
    ax.set_title(f'CSP Pattern {i+1}')
    ax.set_xlabel('Channel')
    ax.set_ylabel('Weight')
    ax.axhline(y=0, color='k', linestyle='-', linewidth=0.5)

plt.tight_layout()
plt.suptitle('CSP Spatial Patterns', y=1.02, fontsize=14)
plt.show()

In [None]:
# Step 3.3: CSP Feature Visualization
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Feature distribution by class
ax1 = axes[0]
for class_idx in range(4):
    class_mask = y_train == class_idx
    ax1.scatter(
        X_train_csp[class_mask, 0], 
        X_train_csp[class_mask, 1],
        label=f'Class {class_idx}',
        alpha=0.6
    )
ax1.set_xlabel('CSP Feature 1')
ax1.set_ylabel('CSP Feature 2')
ax1.set_title('CSP Features: First 2 Components')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Box plot of features per class
ax2 = axes[1]
feature_data = [X_train_csp[y_train == c, 0] for c in range(4)]
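# The `labels=` keyword was renamed to `tick_labels=` in Matplotlib 3.9;
# setting the tick labels directly works on older and newer versions
ax2.boxplot(feature_data)
ax2.set_xticklabels(['Left Hand', 'Right Hand', 'Feet', 'Tongue'])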
ax2.set_xlabel('Class')
ax2.set_ylabel('CSP Feature 1')
ax2.set_title('CSP Feature 1 Distribution by Class')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Step 4: Band Power Feature Extraction

Band power features capture the signal power in frequency bands relevant to motor imagery:
- **Mu (8-12 Hz)**: Sensorimotor rhythm
- **Beta (12-30 Hz)**: Motor planning and execution
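
One common way to estimate band power is to average a Welch power spectral density over each band. The sketch below does this with `scipy.signal.welch` on synthetic trials. The function name `log_band_power`, the one-second Welch segment, and the channel-major feature layout are assumptions made for illustration; `BandPowerExtractor` may compute and order its features differently.

```python
import numpy as np
from scipy.signal import welch

def log_band_power(trials, sampling_rate, bands):
    """Log band power per channel and band for trials shaped (trials, channels, samples)."""
    # One-second segments give a 1 Hz frequency resolution
    freqs, psd = welch(trials, fs=sampling_rate, nperseg=sampling_rate, axis=-1)
    feats = []
    for low, high in bands.values():
        mask = (freqs >= low) & (freqs < high)
        feats.append(np.log(psd[..., mask].mean(axis=-1)))
    # Stack to (trials, channels, bands), then flatten channel-major
    return np.stack(feats, axis=-1).reshape(trials.shape[0], -1)

demo_bands = {'mu': (8, 12), 'beta_low': (12, 20), 'beta_high': (20, 30)}

# Synthetic demo: channel 0 carries a 10 Hz oscillation, channel 1 is noise only
rng = np.random.default_rng(0)
t = np.arange(1000) / 250.0
demo_trials = 0.1 * rng.standard_normal((5, 2, 1000))
demo_trials[:, 0, :] += np.sin(2 * np.pi * 10 * t)

demo_bp = log_band_power(demo_trials, sampling_rate=250, bands=demo_bands)
print(demo_bp.shape)
```

With two channels and three bands each trial yields six features, and the mu feature of channel 0 (column 0) is clearly larger than its beta features because the injected oscillation sits at 10 Hz.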

In [None]:
# Step 4.1: Band Power Extraction
from src.features import BandPowerExtractor, create_band_power_extractor

# Create band power extractor for motor imagery bands
bands = {
    'mu': (8, 12),
    'beta_low': (12, 20),
    'beta_high': (20, 30)
}

bp = create_band_power_extractor(
    bands=bands,
    sampling_rate=250,
    average_channels=False,
    log=True
)

# Extract features
X_train_bp = bp.extract(X_train)
X_test_bp = bp.extract(X_test)

print(f"\n=== Band Power Features ===")
print(f"Feature shape: {X_train_bp.shape}")
print(f"Features per trial: {X_train_bp.shape[1]} (22 channels x 3 bands)")

In [None]:
# Step 4.2: Visualize Band Power
import matplotlib.pyplot as plt

# Average band power across trials per class
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
class_names = ['Left Hand', 'Right Hand', 'Feet', 'Tongue']

for class_idx, ax in enumerate(axes.flatten()):
    class_mask = y_train == class_idx
    class_bp = X_train_bp[class_mask].mean(axis=0)
    
    # Reshape to (channels, bands)
    n_channels = 22
    n_bands = 3
    bp_reshaped = class_bp.reshape(n_channels, n_bands)
    
    im = ax.imshow(bp_reshaped.T, aspect='auto', cmap='RdBu_r')
    ax.set_xlabel('Channel')
    ax.set_ylabel('Frequency Band')
    ax.set_yticks([0, 1, 2])
    ax.set_yticklabels(['Mu', 'Beta Low', 'Beta High'])
    ax.set_title(f'{class_names[class_idx]}')
    plt.colorbar(im, ax=ax)

plt.suptitle('Average Band Power by Class', y=1.02, fontsize=14)
plt.tight_layout()
plt.show()

## Step 5: Feature Pipeline

Combine multiple feature extractors into a single pipeline for comprehensive feature representation.
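
Conceptually, such a pipeline fits each extractor on the training trials and concatenates their outputs column-wise. The sketch below shows that idea with two toy extractors. All class names here (`LogVarianceFeatures`, `MeanAbsFeatures`, `ConcatPipeline`) are hypothetical stand-ins and do not reflect the internals of `FeatureExtractionPipeline`.

```python
import numpy as np

class LogVarianceFeatures:
    """Toy extractor: log-variance of each channel."""
    def fit(self, X, y=None):
        return self
    def extract(self, X):
        return np.log(X.var(axis=2))

class MeanAbsFeatures:
    """Toy extractor: mean absolute amplitude of each channel."""
    def fit(self, X, y=None):
        return self
    def extract(self, X):
        return np.abs(X).mean(axis=2)

class ConcatPipeline:
    """Fit every extractor on the training data, then stack features side by side."""
    def __init__(self, extractors):
        self.extractors = list(extractors)
    def fit_extract(self, X, y=None):
        for extractor in self.extractors:
            extractor.fit(X, y)
        return self.extract(X)
    def extract(self, X):
        return np.hstack([extractor.extract(X) for extractor in self.extractors])

rng = np.random.default_rng(0)
demo_train = rng.standard_normal((8, 3, 100))   # (trials, channels, samples)
demo_test = rng.standard_normal((2, 3, 100))

demo_pipe = ConcatPipeline([LogVarianceFeatures(), MeanAbsFeatures()])
train_feats = demo_pipe.fit_extract(demo_train)
test_feats = demo_pipe.extract(demo_test)
print(train_feats.shape, test_feats.shape)
```

Keeping `fit_extract` for training data and `extract` for test data mirrors the usage in this notebook and ensures that supervised extractors such as CSP never see test labels.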

In [None]:
# Step 5.1: Create Feature Pipeline
from src.features import FeatureExtractionPipeline, create_motor_imagery_pipeline

# Create motor imagery optimized pipeline
# (named feature_pipeline so it does not overwrite the preprocessing `pipeline` from Step 2.4)
feature_pipeline = create_motor_imagery_pipeline(
    n_csp_components=6,
    sampling_rate=250
)

# Fit and extract
X_train_features = feature_pipeline.fit_extract(X_train, y_train)
X_test_features = feature_pipeline.extract(X_test)

print(f"\n=== Feature Pipeline ===")
print(f"Combined features shape: {X_train_features.shape}")
print(feature_pipeline.summary())

## Step 6: Classification with LDA

Linear Discriminant Analysis (LDA) is a classic and effective classifier for CSP features.
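
A single train/test split gives a noisy accuracy estimate when the number of trials is limited, so stratified k-fold cross-validation is a useful complement. The sketch below uses scikit-learn's own `LinearDiscriminantAnalysis` on synthetic four-class features rather than the framework's `LDAClassifier`, whose compatibility with `cross_val_score` is not assumed here. With real data, a supervised extractor such as CSP should be refit inside each fold rather than once on all trials.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for CSP features: 4 classes, 50 trials each, 6 features
rng = np.random.default_rng(0)
y_demo = np.repeat(np.arange(4), 50)
X_demo = rng.standard_normal((y_demo.size, 6))
X_demo[np.arange(y_demo.size), y_demo] += 4.0  # shift one feature per class

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(LinearDiscriminantAnalysis(), X_demo, y_demo, cv=cv)
print(f"Fold accuracies: {np.round(cv_scores, 3)}")
print(f"Mean accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

Reporting the mean and spread over folds makes it easier to judge whether a difference between two models is larger than the fold-to-fold variation.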

In [None]:
# Step 6.1: Train LDA Classifier
from src.classifiers import create_lda_classifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Create and train LDA on CSP features
lda = create_lda_classifier(n_classes=4)
lda.fit(X_train_csp, y_train)

# Predict
y_pred_lda = lda.predict(X_test_csp)
y_prob_lda = lda.predict_proba(X_test_csp)

# Evaluate
acc_lda = accuracy_score(y_test, y_pred_lda)

print(f"\n=== CSP + LDA Results ===")
print(f"Accuracy: {acc_lda:.4f}")
print(f"\nClassification Report:")
print(classification_report(y_test, y_pred_lda, target_names=['Left', 'Right', 'Feet', 'Tongue']))

In [None]:
# Step 6.2: Confusion Matrix
import seaborn as sns

cm = confusion_matrix(y_test, y_pred_lda)

plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Left', 'Right', 'Feet', 'Tongue'],
            yticklabels=['Left', 'Right', 'Feet', 'Tongue'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title(f'CSP + LDA Confusion Matrix (Accuracy: {acc_lda:.2%})')
plt.show()

## Step 7: Classification with SVM

A Support Vector Machine (SVM) with an RBF kernel can model non-linear decision boundaries, which may improve on LDA when the classes are not linearly separable in feature space.
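
The effect of the kernel is easiest to see on data that no straight line can separate. The sketch below compares a linear and an RBF SVM from scikit-learn on a synthetic two-ring dataset. It is a toy illustration with scikit-learn's `SVC`, not the framework's `SVMClassifier`, and the standardization step is included because RBF kernels are sensitive to feature scale.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two concentric rings: not linearly separable
X_rings, y_rings = make_circles(n_samples=300, noise=0.05, factor=0.4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_rings, y_rings, test_size=0.25, random_state=0, stratify=y_rings
)

rbf_svm = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale'))
rbf_svm.fit(X_tr, y_tr)
rbf_accuracy = rbf_svm.score(X_te, y_te)

linear_svm = make_pipeline(StandardScaler(), SVC(kernel='linear', C=1.0))
linear_svm.fit(X_tr, y_tr)
linear_accuracy = linear_svm.score(X_te, y_te)

print(f"RBF: {rbf_accuracy:.3f}, linear: {linear_accuracy:.3f}")
```

On CSP features the gap is usually much smaller than in this toy case, so both classifiers are worth comparing on the actual data.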

In [None]:
# Step 7.1: Train SVM Classifier
from src.classifiers import create_svm_classifier

# Create and train SVM with RBF kernel
svm = create_svm_classifier(kernel='rbf', C=1.0, gamma='scale', n_classes=4)
svm.fit(X_train_csp, y_train)

# Predict
y_pred_svm = svm.predict(X_test_csp)
acc_svm = accuracy_score(y_test, y_pred_svm)

print(f"\n=== CSP + SVM Results ===")
print(f"Accuracy: {acc_svm:.4f}")
print(f"\nClassification Report:")
print(classification_report(y_test, y_pred_svm, target_names=['Left', 'Right', 'Feet', 'Tongue']))
print(f"\nSupport Vectors per class: {svm.n_support_}")

## Step 8: EEGNet Deep Learning Classifier

EEGNet is a compact CNN designed specifically for EEG classification. It can learn directly from raw/preprocessed EEG without manual feature extraction.
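
The `n_samples` argument matters because it determines the size of the final dense layer. The sketch below works through that arithmetic under the assumption that the model follows the original EEGNet layout: 'same'-padded temporal and separable convolutions, a depthwise spatial convolution that collapses the channel axis, and two average-pooling stages of 4 and 8 along time. The helper `eegnet_flat_features` is hypothetical, and the repository's `EEGNetClassifier` may use different pooling sizes.

```python
def eegnet_flat_features(n_samples, F1=8, D=2, pool1=4, pool2=8):
    """Number of flattened features feeding the final dense layer.

    Assumes the original EEGNet layout; pool sizes are assumptions here.
    """
    F2 = F1 * D                          # filters in the separable convolution
    t_after_pool1 = n_samples // pool1   # time steps after the first pooling stage
    t_after_pool2 = t_after_pool1 // pool2
    return F2 * t_after_pool2

flat = eegnet_flat_features(n_samples=1000)
print(flat)
```

For 4-second trials at 250 Hz this gives 16 feature maps of 31 time steps each, so changing the trial duration or sampling rate requires rebuilding the model with the new `n_samples`.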

In [None]:
# Step 8.1: Create and Train EEGNet
import torch
from src.classifiers import create_eegnet_classifier

# Check device
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

# Create EEGNet
eegnet = create_eegnet_classifier(
    n_classes=4,
    n_channels=22,
    n_samples=1000,  # 4 seconds at 250 Hz
    F1=8,
    D=2,
    dropout_rate=0.5,
    learning_rate=0.001,
    device=device
)

print(f"\n=== EEGNet Model ===")
print(f"Parameters: {eegnet.count_parameters()}")

In [None]:
# Step 8.2: Train EEGNet
# Split training data for validation
X_train_dl, X_val_dl, y_train_dl, y_val_dl = train_test_split(
    X_train, y_train, test_size=0.15, random_state=42, stratify=y_train
)

# Train
print("\n=== Training EEGNet ===")
eegnet.fit(
    X_train_dl.astype(np.float32), 
    y_train_dl,
    validation_data=(X_val_dl.astype(np.float32), y_val_dl),
    epochs=50,
    batch_size=32,
    verbose=1
)

In [None]:
# Step 8.3: Evaluate EEGNet
y_pred_eegnet = eegnet.predict(X_test.astype(np.float32))
acc_eegnet = accuracy_score(y_test, y_pred_eegnet)

print(f"\n=== EEGNet Results ===")
print(f"Accuracy: {acc_eegnet:.4f}")
print(f"\nClassification Report:")
print(classification_report(y_test, y_pred_eegnet, target_names=['Left', 'Right', 'Feet', 'Tongue']))

In [None]:
# Step 8.4: Plot Training History
history = eegnet.get_training_history()

if history:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    # Loss
    ax1 = axes[0]
    ax1.plot(history['train_loss'], label='Train Loss')
    ax1.plot(history['val_loss'], label='Val Loss')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Loss')
    ax1.set_title('EEGNet Training Loss')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # Accuracy
    ax2 = axes[1]
    ax2.plot(history['train_accuracy'], label='Train Acc')
    ax2.plot(history['val_accuracy'], label='Val Acc')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Accuracy')
    ax2.set_title('EEGNet Training Accuracy')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

## Step 9: Model Comparison

In [None]:
# Step 9.1: Compare All Models
from sklearn.metrics import cohen_kappa_score

results = {
    'Model': ['CSP + LDA', 'CSP + SVM', 'EEGNet'],
    'Accuracy': [acc_lda, acc_svm, acc_eegnet],
    'Kappa': [
        cohen_kappa_score(y_test, y_pred_lda),
        cohen_kappa_score(y_test, y_pred_svm),
        cohen_kappa_score(y_test, y_pred_eegnet)
    ]
}

import pandas as pd
results_df = pd.DataFrame(results)
results_df = results_df.sort_values('Accuracy', ascending=False)

print("\n=== Model Comparison ===")
print(results_df.to_string(index=False))

# Visualization
fig, ax = plt.subplots(figsize=(10, 5))
x = np.arange(len(results['Model']))
width = 0.35

bars1 = ax.bar(x - width/2, results['Accuracy'], width, label='Accuracy', color='steelblue')
bars2 = ax.bar(x + width/2, results['Kappa'], width, label='Kappa', color='darkorange')

ax.set_ylabel('Score')
ax.set_title('Model Comparison: Accuracy & Kappa')
ax.set_xticks(x)
ax.set_xticklabels(results['Model'])
ax.axhline(y=0.85, color='green', linestyle='--', alpha=0.5, label='Target (85%)')
ax.legend()  # called after axhline so the target line appears in the legend
ax.set_ylim([0, 1])
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

## Step 10: Cross-Subject Evaluation (LOSO)
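
Leave-One-Subject-Out (LOSO) holds out every trial of one subject for testing and trains on all remaining subjects, so the score reflects generalization to an unseen person. The splitting logic can be sketched with scikit-learn's `LeaveOneGroupOut`, using subject IDs as groups. The data below is synthetic and the classifier is scikit-learn's LDA, so this only illustrates the fold structure, not expected accuracies on the real dataset.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneGroupOut

# Synthetic stand-in: 3 subjects, 40 trials each, 4 balanced classes, 6 features
rng = np.random.default_rng(0)
n_subjects, trials_per_subject, n_features = 3, 40, 6
y_all = np.tile(np.repeat(np.arange(4), trials_per_subject // 4), n_subjects)
groups = np.repeat(np.arange(n_subjects), trials_per_subject)
X_all = rng.standard_normal((y_all.size, n_features))
X_all[np.arange(y_all.size), y_all] += 3.0  # shift one feature per class

fold_accuracies = []
held_out = []
for train_idx, test_idx in LeaveOneGroupOut().split(X_all, y_all, groups):
    clf = LinearDiscriminantAnalysis().fit(X_all[train_idx], y_all[train_idx])
    fold_accuracies.append(clf.score(X_all[test_idx], y_all[test_idx]))
    held_out.append(np.unique(groups[test_idx]))  # exactly one subject per fold

print(f"Held-out subjects: {[int(h[0]) for h in held_out]}")
print(f"Mean accuracy: {np.mean(fold_accuracies):.3f}")
```

Each fold tests on exactly one subject, and any fitted preprocessing or feature extraction step should be trained only on the other subjects' trials.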

In [None]:
# Step 10.1: Leave-One-Subject-Out Cross-Validation
def run_loso_cv(data_dir, subjects, use_csp_lda=True):
    """
    Run Leave-One-Subject-Out cross-validation.
    
    Args:
        data_dir: Path to dataset
        subjects: List of subject IDs
        use_csp_lda: If True, use CSP+LDA; else use EEGNet
        
    Returns:
        Dict with results per subject and aggregate metrics
    """
    loader = BCICIV2aLoader()
    preproc = create_standard_pipeline(sampling_rate=250, notch_freq=50.0, low_freq=8.0, high_freq=30.0)
    preproc.initialize()
    
    results = []
    
    for test_subject in subjects:
        print(f"\nTest Subject: {test_subject}")
        
        # Collect train and test data
        X_train_all, y_train_all = [], []
        X_test_sub, y_test_sub = None, None
        
        for subject in subjects:
            file_path = os.path.join(data_dir, f"{subject}T.mat")
            if not os.path.exists(file_path):
                continue
            
            eeg_data = loader.load(file_path)
            X, y = loader.extract_trials(eeg_data)
            X = preproc.process(X)
            
            if subject == test_subject:
                X_test_sub, y_test_sub = X, y
            else:
                X_train_all.append(X)
                y_train_all.append(y)
        
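        # Skip this fold if the held-out subject's file or all training files are missing
        if X_test_sub is None or not X_train_all:
            print(f"  Skipping {test_subject}: data not found")
            continue

        X_train_all = np.concatenate(X_train_all, axis=0)
        y_train_all = np.concatenate(y_train_all, axis=0)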
        
        # Train and evaluate
        if use_csp_lda:
            csp = create_csp_extractor(n_components=6, sampling_rate=250)
            X_train_feat = csp.fit_extract(X_train_all, y_train_all)
            X_test_feat = csp.extract(X_test_sub)
            
            clf = create_lda_classifier(n_classes=4)
            clf.fit(X_train_feat, y_train_all)
            y_pred = clf.predict(X_test_feat)
        else:
            clf = create_eegnet_classifier(
                n_classes=4, n_channels=22, n_samples=1000, device='cpu'
            )
            clf.fit(X_train_all.astype(np.float32), y_train_all, epochs=30, verbose=0)
            y_pred = clf.predict(X_test_sub.astype(np.float32))
        
        acc = accuracy_score(y_test_sub, y_pred)
        kappa = cohen_kappa_score(y_test_sub, y_pred)
        
        results.append({
            'subject': test_subject,
            'accuracy': acc,
            'kappa': kappa
        })
        print(f"  Accuracy: {acc:.4f}, Kappa: {kappa:.4f}")
    
    # Aggregate
    accs = [r['accuracy'] for r in results]
    kappas = [r['kappa'] for r in results]
    
    return {
        'results': results,
        'mean_accuracy': np.mean(accs),
        'std_accuracy': np.std(accs),
        'mean_kappa': np.mean(kappas),
        'std_kappa': np.std(kappas)
    }

print("✅ LOSO CV function defined")

In [None]:
# Step 10.2: Run LOSO Cross-Validation
# Uncomment to run (takes several minutes)
# subjects = ['A01', 'A02', 'A03', 'A04', 'A05', 'A06', 'A07', 'A08', 'A09']
# loso_results = run_loso_cv(DATA_DIR, subjects, use_csp_lda=True)
# print(f"\n=== LOSO Results (CSP + LDA) ===")
# print(f"Mean Accuracy: {loso_results['mean_accuracy']:.4f} ± {loso_results['std_accuracy']:.4f}")
# print(f"Mean Kappa: {loso_results['mean_kappa']:.4f} ± {loso_results['std_kappa']:.4f}")

## Step 11: Save and Export Results

In [None]:
# Step 11.1: Export Results to Google Drive
import json
from datetime import datetime

def export_results(results_dict, output_dir):
    """Export experiment results to JSON for sharing."""
    os.makedirs(output_dir, exist_ok=True)
    
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    filename = f"phase3_results_{timestamp}.json"
    filepath = os.path.join(output_dir, filename)
    
    with open(filepath, 'w') as f:
        json.dump(results_dict, f, indent=2)
    
    print(f"Results saved to: {filepath}")
    return filepath

# Export
output_dir = '/content/drive/MyDrive/llm-eeg/experiments'
# Build the per-model metrics first so best_model can be computed from them
model_metrics = {
    'csp_lda': {'accuracy': float(acc_lda), 'kappa': float(cohen_kappa_score(y_test, y_pred_lda))},
    'csp_svm': {'accuracy': float(acc_svm), 'kappa': float(cohen_kappa_score(y_test, y_pred_svm))},
    'eegnet': {'accuracy': float(acc_eegnet), 'kappa': float(cohen_kappa_score(y_test, y_pred_eegnet))}
}
results_to_export = {
    'experiment': 'Phase 3 - Feature Extraction & Classification',
    'date': datetime.now().isoformat(),
    'subject': 'A01',
    'models': model_metrics,
    # Model with the highest test accuracy
    'best_model': max(model_metrics, key=lambda m: model_metrics[m]['accuracy'])
}

# Uncomment to save
# export_results(results_to_export, output_dir)

In [None]:
# Step 11.2: Save Models

def save_models(models_dict, output_dir):
    """Save trained models."""
    os.makedirs(output_dir, exist_ok=True)
    
    for name, model in models_dict.items():
        if hasattr(model, 'save'):
            filepath = os.path.join(output_dir, f"{name}_model.pkl")
            model.save(filepath)
            print(f"Saved {name} to {filepath}")

# Save CSP extractor and classifiers
# models_to_save = {'csp': csp, 'lda': lda, 'svm': svm}
# save_models(models_to_save, '/content/drive/MyDrive/llm-eeg/models')

## Step 12: Quick Reference & Summary

### Results Summary

Print a comprehensive summary to share with an AI assistant for analysis.

In [None]:
# Step 12.1: Generate Summary for AI Assistant
def generate_summary():
    """Generate shareable summary."""
    summary = f"""
# Phase 3 Results Summary

## Dataset
- Subject: A01
- Trials: {X.shape[0]} ({X_train.shape[0]} train, {X_test.shape[0]} test)
- Channels: {X.shape[1]}
- Samples: {X.shape[2]} (4s @ 250Hz)
- Classes: 4 (Left Hand, Right Hand, Feet, Tongue)

## Feature Extraction
- CSP: {X_train_csp.shape[1]} features ({csp._n_components} components)
- Band Power: 3 bands (Mu, Beta-Low, Beta-High)

## Classification Results
| Model | Accuracy | Kappa |
|-------|----------|-------|
| CSP + LDA | {acc_lda:.4f} | {cohen_kappa_score(y_test, y_pred_lda):.4f} |
| CSP + SVM | {acc_svm:.4f} | {cohen_kappa_score(y_test, y_pred_svm):.4f} |
| EEGNet | {acc_eegnet:.4f} | {cohen_kappa_score(y_test, y_pred_eegnet):.4f} |

## Observations
- Best performing model: {'CSP+LDA' if acc_lda > max(acc_svm, acc_eegnet) else 'CSP+SVM' if acc_svm > acc_eegnet else 'EEGNet'}
- Target (85%): {'Achieved ✅' if max(acc_lda, acc_svm, acc_eegnet) >= 0.85 else 'Not yet achieved ❌'}
"""
    return summary

print(generate_summary())

## Phase 3 Complete!

### Summary

You have completed Phase 3 (Feature Extraction & Classification), which covered:

1. **CSP Feature Extraction**: 6 components, spatial patterns visualization
2. **Band Power Features**: Mu, Beta frequency bands
3. **Feature Pipeline**: Modular multi-extractor pipeline
4. **LDA Classification**: Linear discriminant analysis on CSP features
5. **SVM Classification**: RBF kernel SVM for non-linear boundaries
6. **EEGNet**: End-to-end deep learning classifier
7. **Model Comparison**: Accuracy and Kappa metrics
8. **Cross-Validation**: LOSO setup for subject-independent evaluation
9. **Results Export**: JSON export for AI assistant sharing

### Next Steps: Phase 4 - Agent System

- Adaptive Preprocessing Agent (APA)
- Decision Validation Agent (DVA)
- Q-learning policy optimization
- Cross-trial learning

---

### Quick Reference

```python
# CSP Feature Extraction
csp = create_csp_extractor(n_components=6, sampling_rate=250)
X_csp = csp.fit_extract(X_train, y_train)
X_test_csp = csp.extract(X_test)

# LDA Classification
lda = create_lda_classifier(n_classes=4)
lda.fit(X_csp, y_train)
predictions = lda.predict(X_test_csp)

# EEGNet
eegnet = create_eegnet_classifier(n_classes=4, n_channels=22, n_samples=1000)
eegnet.fit(X_train, y_train, epochs=50)
predictions = eegnet.predict(X_test)
```