# LLM-EEG Framework - Phase 1: Setup and Test

This notebook demonstrates how to set up and test the LLM-EEG Framework in Google Colab.

**Repository**: https://github.com/erlika/llm-eeg

**Phase 1 Deliverables**:
- 9 Core Interfaces (IDataLoader, IPreprocessor, IClassifier, IAgent, etc.)
- 4 Data Types (EEGData, TrialData, EventMarker, DatasetInfo)
- ConfigManager with user-approved defaults
- ComponentRegistry for plugin architecture
- 36 Custom Exceptions
- Logging & Validation Utilities

## Step 1: Clone Repository

In [None]:
# Remove any existing clone, then clone a fresh copy
!rm -rf /content/llm-eeg
!git clone https://github.com/erlika/llm-eeg.git /content/llm-eeg
print("✅ Repository cloned successfully!")

## Step 2: Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')
print("✅ Google Drive mounted!")

## Step 3: Setup Python Path

In [None]:
import sys
import os

REPO_PATH = '/content/llm-eeg'
os.chdir(REPO_PATH)

if REPO_PATH not in sys.path:
    sys.path.insert(0, REPO_PATH)

print(f"✅ Working directory: {os.getcwd()}")
print("✅ Python path configured")

## Step 4: Install Dependencies

In [None]:
# Install core dependencies (minimal for Phase 1)
!pip install -q pyyaml numpy scipy
print("✅ Dependencies installed!")

## Step 5: Import Framework Components

In [None]:
# Import core components
from src.core import (
    # Interfaces
    IDataLoader, IPreprocessor, IFeatureExtractor, IClassifier,
    IAgent, IPolicy, IReward,
    ILLMProvider, IStorageAdapter,
    
    # Data Types
    EEGData, TrialData, EventMarker, DatasetInfo,
    
    # Configuration
    ConfigManager, get_config,
    
    # Registry
    ComponentRegistry, get_registry,
    
    # Exceptions
    BCIFrameworkError, DataLoadError, ModelNotFittedError
)

# Import utilities
from src.utils import setup_logging, get_logger

print("✅ All framework components imported successfully!")

## Step 6: Initialize Framework

In [None]:
# Setup logging
setup_logging(level='INFO')
logger = get_logger(__name__)

# Get configuration
config = get_config()

print("="*60)
print("✅ LLM-EEG Framework Initialized!")
print("="*60)
print(f"\n📁 Dataset URL: {config.get('data.google_drive.folder_url')}")
print(f"🎯 DVA Confidence Threshold: {config.get('agents.dva.confidence_threshold')}")
print(f"🤖 APA Policy Type: {config.get('agents.apa.policy.type')}")
print(f"🔄 Cross-Trial Learning: {config.get('agents.apa.cross_trial_learning')}")
print(f"🧠 LLM Provider: {config.get('llm.provider')}")
print("="*60)
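The `config.get(...)` calls above use dotted keys such as `agents.dva.confidence_threshold`. How `ConfigManager` resolves them is not shown in this notebook, so the following is only a minimal sketch of one common approach: split the key on dots and walk a nested dictionary. The helper name `get_by_dotted_key` and the `example_config` dictionary are hypothetical; the values mirror the settings listed in the Summary.

```python
from collections.abc import Mapping
from typing import Any


def get_by_dotted_key(config: Mapping, key: str, default: Any = None) -> Any:
    """Resolve 'a.b.c' against a nested mapping (hypothetical helper, not the repo's API)."""
    node: Any = config
    for part in key.split("."):
        # Stop as soon as a segment is missing or we hit a non-mapping value.
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


# Illustrative configuration; values mirror the settings in this notebook's summary.
example_config = {
    "agents": {
        "dva": {"confidence_threshold": 0.8},
        "apa": {"policy": {"type": "q_learning"}, "cross_trial_learning": True},
    },
    "llm": {"provider": "phi3"},
}

threshold = get_by_dotted_key(example_config, "agents.dva.confidence_threshold")
policy_type = get_by_dotted_key(example_config, "agents.apa.policy.type")
missing = get_by_dotted_key(example_config, "agents.dva.unknown", default="n/a")

print(threshold, policy_type, missing)
```

Returning a default instead of raising keeps lookups for optional settings short, at the cost of hiding typos in key names; a stricter variant could raise `KeyError` when no default is supplied.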

## Step 7: Test Data Types

In [None]:
import numpy as np

# Test EventMarker
event = EventMarker(sample=0, code=1, label='left_hand')
print(f"✅ EventMarker: {event}")

# Test DatasetInfo
dataset_info = DatasetInfo.for_bci_competition_iv_2a()
print("\n✅ DatasetInfo:")
print(f"   Name: {dataset_info.name}")
print(f"   Subjects: {dataset_info.n_subjects}")
print(f"   Classes: {dataset_info.n_classes} - {dataset_info.class_names}")
print(f"   Channels: {len(dataset_info.channel_names)}")
print(f"   Sampling Rate: {dataset_info.sampling_rate} Hz")

In [None]:
# Test EEGData
n_channels = 22
n_samples = 1000  # 4 seconds at 250 Hz
sampling_rate = 250

# Simulate EEG signals
signals = np.random.randn(n_channels, n_samples) * 50  # ~50 µV amplitude

# Create event markers
events = [
    EventMarker(sample=0, code=1, label='left_hand'),
    EventMarker(sample=250, code=2, label='right_hand'),
    EventMarker(sample=500, code=3, label='feet'),
    EventMarker(sample=750, code=4, label='tongue'),
]

# Create EEGData object
eeg_data = EEGData(
    signals=signals,
    sampling_rate=sampling_rate,
    channel_names=config.get('data.channel_names'),
    events=events,
    subject_id='S01',
    session_id='T'
)

print("\n✅ EEGData created:")
print(f"   {eeg_data}")
print(f"   Shape: {eeg_data.shape}")
print(f"   Duration: {eeg_data.duration_seconds:.1f} seconds")
print(f"   Events: {eeg_data.n_events}")
print(f"   Channels: {eeg_data.channel_names[:5]}...")
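The numbers in the cell above follow from simple arithmetic: duration is the sample count divided by the sampling rate, and an event's onset time is its sample index divided by the same rate. The sketch below only reproduces that arithmetic with plain Python; it makes no claim about how `EEGData.duration_seconds` is computed internally.

```python
sampling_rate_hz = 250
n_samples = 1000

# Duration in seconds = number of samples / samples per second.
duration_s = n_samples / sampling_rate_hz

# Event sample indices taken from the cell above.
event_samples = {"left_hand": 0, "right_hand": 250, "feet": 500, "tongue": 750}

# Onset time in seconds for each event marker.
event_onsets_s = {
    label: sample / sampling_rate_hz for label, sample in event_samples.items()
}

print(duration_s)      # 4.0
print(event_onsets_s)  # {'left_hand': 0.0, 'right_hand': 1.0, 'feet': 2.0, 'tongue': 3.0}
```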

In [None]:
# Test TrialData
trial_signals = np.random.randn(22, 1000) * 50

trial = TrialData(
    signals=trial_signals,
    label=0,
    label_name='left_hand',
    trial_id=1,
    subject_id='S01',
    session_id='T',
    sampling_rate=250
)

print("\n✅ TrialData created:")
print(f"   {trial}")
print(f"   Shape: {trial.signals.shape}")
print(f"   Label: {trial.label} ({trial.label_name})")

## Step 8: Test Component Registry

In [None]:
# Get registry
registry = get_registry()

# List available categories
categories = registry.get_categories()
print("✅ Component Registry Categories:")
for cat in categories:
    print(f"   • {cat}")

In [None]:
# Test registering a custom component
class MyCustomLoader:
    """Example custom data loader."""
    def __init__(self, path=''):
        self.path = path
    
    def load(self):
        return f"Loading from {self.path}"

# Register the component
registry.register('data_loader', 'my_custom', MyCustomLoader)

# List data loaders
loaders = registry.list('data_loader')
print(f"\n✅ Registered data loaders: {loaders}")

# Create instance
loader = registry.create('data_loader', 'my_custom')
print(f"✅ Created loader: {type(loader).__name__}")
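The register, list, and create calls above follow a plugin-registry pattern. As a rough illustration of the idea, and not the actual `ComponentRegistry` implementation, a registry can be a dictionary of categories that maps names to classes. The class `SimpleRegistry` and the `EchoLoader` component are hypothetical names used only for this sketch.

```python
from typing import Any, Callable, Dict, List


class SimpleRegistry:
    """Toy category/name registry (hypothetical; not the repo's ComponentRegistry)."""

    def __init__(self) -> None:
        self._components: Dict[str, Dict[str, Callable[..., Any]]] = {}

    def register(self, category: str, name: str, factory: Callable[..., Any]) -> None:
        bucket = self._components.setdefault(category, {})
        if name in bucket:
            raise ValueError(f"{category}/{name} is already registered")
        bucket[name] = factory

    def list(self, category: str) -> List[str]:
        # Unknown categories simply yield an empty list.
        return sorted(self._components.get(category, {}))

    def create(self, category: str, name: str, **kwargs: Any) -> Any:
        try:
            factory = self._components[category][name]
        except KeyError:
            raise KeyError(f"No component {name!r} in category {category!r}") from None
        # Keyword arguments are forwarded to the component's constructor.
        return factory(**kwargs)


class EchoLoader:
    """Stand-in component used only for this example."""

    def __init__(self, path: str = "") -> None:
        self.path = path


demo_registry = SimpleRegistry()
demo_registry.register("data_loader", "echo", EchoLoader)
demo_loader = demo_registry.create("data_loader", "echo", path="/tmp/example.mat")

print(demo_registry.list("data_loader"), demo_loader.path)
```

Storing classes rather than instances lets callers pass constructor arguments at creation time, which is one reason registries like this are often paired with configuration files.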

## Step 9: Test Exceptions

In [None]:
from src.core.exceptions import (
    DataLoadError, DataValidationError,
    PreprocessingError, ModelNotFittedError,
    AgentNotInitializedError, LLMNotLoadedError
)

# Test exception creation
try:
    raise DataLoadError('/path/to/file.mat', reason='File not found')
except DataLoadError as e:
    print(f"✅ DataLoadError: {e.message}")

try:
    raise ModelNotFittedError('EEGNet')
except ModelNotFittedError as e:
    print(f"✅ ModelNotFittedError: {e.message}")

# Count exceptions
from src.core import exceptions
exc_count = len([name for name in dir(exceptions) if name.endswith('Error')])
print(f"\n✅ Total exception types available: {exc_count}")
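Counting names that end in `Error` is a quick check, but it would also pick up any imported or non-exception name with that suffix. A slightly stricter count, sketched below under the assumption that every framework exception subclasses `Exception` and is defined in the module being inspected, uses `inspect.getmembers`. The helper `count_exception_classes` and the `demo_exceptions` module are hypothetical and exist only to make the example self-contained.

```python
import inspect
import types


def count_exception_classes(module: types.ModuleType, base: type = Exception) -> int:
    """Count classes defined in `module` that subclass `base` (hypothetical helper)."""
    return sum(
        1
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base) and obj.__module__ == module.__name__
    )


# Build a throwaway module so the example runs without the repository.
demo = types.ModuleType("demo_exceptions")
DemoError = type("DemoError", (Exception,), {"__module__": "demo_exceptions"})
DemoLoadError = type("DemoLoadError", (DemoError,), {"__module__": "demo_exceptions"})
demo.DemoError = DemoError
demo.DemoLoadError = DemoLoadError
demo.ValueError = ValueError        # re-exported builtin: should not be counted
demo.format_error = lambda e: str(e)  # function, not a class: should not be counted

naive_count = len([name for name in dir(demo) if name.endswith("Error")])
strict_count = count_exception_classes(demo)

print(naive_count, strict_count)  # 3 2
```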

## Step 10: Test Validation Utilities

In [None]:
from src.utils.validation import (
    check_type, check_range, check_probability,
    validate_array, validate_eeg_data
)

# Test type checking
check_type(42, int, 'test_int')
print("✅ Type validation passed")

# Test range checking
check_range(0.8, min_val=0, max_val=1, name='confidence')
print("✅ Range validation passed")

# Test probability checking
check_probability(0.8, name='threshold')
print("✅ Probability validation passed")

# Test array validation
test_array = np.random.randn(10, 22, 1000)
validate_array(test_array, expected_ndim=3, name='eeg_batch')
print("✅ Array validation passed")

# Test EEG data validation
eeg_array = np.random.randn(22, 1000)
validate_eeg_data(eeg_array, n_channels=22, name='eeg_signal')
print("✅ EEG data validation passed")
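The cell above only exercises the passing path. To show what a validator of this kind typically checks, here is a minimal sketch of a shape-and-finiteness check for a `(channels, samples)` array. The function `validate_eeg_array` is a hypothetical stand-in; the checks performed by the framework's own `validate_eeg_data` are not documented in this notebook and may differ.

```python
import numpy as np


def validate_eeg_array(array, n_channels: int, name: str = "eeg") -> np.ndarray:
    """Check type, dimensionality, channel count, and finiteness (hypothetical helper)."""
    if not isinstance(array, np.ndarray):
        raise TypeError(f"{name} must be a numpy array, got {type(array).__name__}")
    if array.ndim != 2:
        raise ValueError(f"{name} must be 2-D (channels, samples), got {array.ndim}-D")
    if array.shape[0] != n_channels:
        raise ValueError(f"{name} has {array.shape[0]} channels, expected {n_channels}")
    if not np.isfinite(array).all():
        raise ValueError(f"{name} contains NaN or infinite values")
    return array


good = np.random.randn(22, 1000)
validate_eeg_array(good, n_channels=22, name="eeg_signal")

# A NaN sample should be rejected.
bad = good.copy()
bad[0, 0] = np.nan
try:
    validate_eeg_array(bad, n_channels=22, name="eeg_signal")
    rejected = False
except ValueError as err:
    rejected = True
    print(f"Rejected as expected: {err}")
```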

## Summary

### Phase 1 Components Tested:
- ✅ Repository cloned and configured
- ✅ Google Drive mounted
- ✅ All imports working
- ✅ ConfigManager with user-approved defaults
- ✅ EEGData, TrialData, EventMarker, DatasetInfo
- ✅ ComponentRegistry
- ✅ Custom Exceptions
- ✅ Validation Utilities

### User-Approved Settings:
- DVA Confidence Threshold: 0.8
- APA Policy Type: q_learning
- Cross-Trial Learning: True
- LLM Provider: phi3

### Next Steps (Phase 2):
- Implement MatLoader for BCI Competition IV-2a .mat files
- Google Drive integration
- Preprocessing pipeline (bandpass, notch, artifact removal)
- Data validation and checkpointing

In [None]:
print("="*60)
print("🎉 PHASE 1 VERIFICATION COMPLETE!")
print("="*60)
print("\nAll Phase 1 components are working correctly.")
print("Ready for Phase 2: Data Loading & Processing")
print("="*60)