# MNE Data Analysis Pipeline with Metadata

This notebook provides tools for:
- Loading FIF files and their corresponding CSV metadata
- Concatenating multiple sessions from a subject
- Aligning neural data with behavioral metadata
- Filtering and preprocessing
- Creating epochs with metadata for trial selection
- ERP analysis by condition (accuracy, set size, etc.)
- Reaction time analysis

## 1. Imports

In [None]:
import h5py
import numpy as np
import mne
from pathlib import Path
from typing import Dict, List, Optional, Tuple
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import os
from scipy import stats as scipy_stats
import glob

# Set plotting style
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 8)

# For interactive plotting
matplotlib.use("Qt5Agg")

Channels marked as bad:
none


## 2. Metadata Loading Functions

In [2]:
def load_session_metadata(csv_path):
    """
    Load metadata from a single CSV file.
    
    Parameters
    ----------
    csv_path : str
        Path to CSV metadata file
    
    Returns
    -------
    metadata : pd.DataFrame
        Trial metadata with columns: trial_number, set_size, match, correct, 
        response, response_time, probe_letter
    """
    metadata = pd.read_csv(csv_path)
    
    # Convert to appropriate types
    numeric_cols = ['trial_number', 'set_size', 'match', 'correct', 'response', 'response_time']
    for col in numeric_cols:
        if col in metadata.columns:
            metadata[col] = pd.to_numeric(metadata[col], errors='coerce')
    
    print(f"Loaded metadata: {len(metadata)} trials")
    print(f"  Columns: {list(metadata.columns)}")
    
    # Show summary
    if 'correct' in metadata.columns:
        acc = metadata['correct'].mean() * 100
        print(f"  Accuracy: {acc:.1f}%")
    if 'response_time' in metadata.columns:
        rt = metadata['response_time'].mean()
        print(f"  Mean RT: {rt:.3f}s")
    if 'set_size' in metadata.columns:
        sizes = sorted(metadata['set_size'].dropna().unique())
        print(f"  Set sizes: {sizes}")
    
    return metadata


def load_subject_metadata(subject_dir, subject_id=None, pattern='*metadata*.csv'):
    """
    Load and concatenate metadata from all sessions for a subject.
    
    Parameters
    ----------
    subject_dir : str
        Directory containing subject's session files
    subject_id : str, optional
        Subject identifier to add as a column
    pattern : str
        Glob pattern to match metadata CSV files (default: '*metadata*.csv')
    
    Returns
    -------
    metadata_all : pd.DataFrame
        Concatenated metadata from all sessions with added 'session' column
    """
    csv_files = sorted(glob.glob(os.path.join(subject_dir, pattern)))
    
    if len(csv_files) == 0:
        raise ValueError(f"No metadata CSV files found in {subject_dir}")
    
    print(f"Found {len(csv_files)} metadata files:")
    for f in csv_files:
        print(f"  {os.path.basename(f)}")
    
    print("\n" + "="*80)
    
    metadata_list = []
    
    for session_idx, csv_file in enumerate(csv_files, start=1):
        print(f"\nLoading session {session_idx}: {os.path.basename(csv_file)}")
        
        metadata = pd.read_csv(csv_file)
        
        # Add session identifier
        metadata['session'] = session_idx
        metadata['session_file'] = os.path.basename(csv_file)
        
        if subject_id is not None:
            metadata['subject'] = subject_id
        
        # Convert types
        numeric_cols = ['trial_number', 'set_size', 'match', 'correct', 'response', 'response_time']
        for col in numeric_cols:
            if col in metadata.columns:
                metadata[col] = pd.to_numeric(metadata[col], errors='coerce')
        
        print(f"  Trials: {len(metadata)}")
        if 'correct' in metadata.columns:
            acc = metadata['correct'].mean() * 100
            print(f"  Accuracy: {acc:.1f}%")
        
        metadata_list.append(metadata)
    
    # Concatenate all sessions
    metadata_all = pd.concat(metadata_list, ignore_index=True)
    
    print("\n" + "="*80)
    print(f"\nCombined metadata:")
    print(f"  Total trials: {len(metadata_all)}")
    print(f"  Sessions: {metadata_all['session'].nunique()}")
    
    if 'correct' in metadata_all.columns:
        overall_acc = metadata_all['correct'].mean() * 100
        print(f"  Overall accuracy: {overall_acc:.1f}%")
        
        # Per-session accuracy
        print(f"\n  Per-session accuracy:")
        for session in sorted(metadata_all['session'].unique()):
            session_data = metadata_all[metadata_all['session'] == session]
            acc = session_data['correct'].mean() * 100
            print(f"    Session {session}: {acc:.1f}%")
    
    if 'response_time' in metadata_all.columns:
        overall_rt = metadata_all['response_time'].mean()
        print(f"\n  Overall mean RT: {overall_rt:.3f}s")
    
    return metadata_all
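
The type conversion used in both loaders relies on `pd.to_numeric(..., errors='coerce')`, which turns entries that cannot be parsed into NaN instead of raising. A minimal sketch of that behaviour follows; the `coerce_numeric` helper and the toy table are invented for illustration and are not part of the pipeline above.

```python
import pandas as pd

NUMERIC_COLS = ['trial_number', 'set_size', 'match', 'correct',
                'response', 'response_time']


def coerce_numeric(metadata, columns=NUMERIC_COLS):
    """Return a copy in which the listed columns are converted to numbers."""
    out = metadata.copy()
    for col in columns:
        if col in out.columns:
            out[col] = pd.to_numeric(out[col], errors='coerce')
    return out


# Toy data (made up): one response time is not a number.
toy = pd.DataFrame({
    'trial_number': ['1', '2', '3'],
    'response_time': ['0.9', 'n/a', '1.4'],
    'probe_letter': ['H', 'T', 'L'],
})
clean = coerce_numeric(toy)
n_missing = int(clean['response_time'].isna().sum())
print(clean.dtypes)
print(f"Missing RTs after coercion: {n_missing}")
```

Counting the NaN values after coercion is a cheap way to spot malformed rows before they silently lower a mean RT or accuracy figure.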

## 3. Multi-Session Concatenation Functions

In [3]:
def load_and_concatenate_subject(
    subject_dir: str,
    use_common_channels: bool = True,
    preload: bool = True,
    verbose: bool = False
) -> Tuple[mne.io.Raw, pd.DataFrame]:
    """
    Load and concatenate all FIF files and metadata for a subject.
    
    Automatically finds all .fif and .csv files in the directory.
    Matches them by sorting alphabetically.
    
    Parameters
    ----------
    subject_dir : str
        Directory containing subject's session files
    use_common_channels : bool
        If True, only keep channels common to all files
    preload : bool
        Whether to load data into memory
    verbose : bool
        Verbose output
    
    Returns
    -------
    raw_concat : mne.io.Raw
        Concatenated Raw object from all sessions
    metadata_all : pd.DataFrame
        Concatenated metadata from all sessions with 'session' column
    
    Examples
    --------
    >>> raw, metadata = load_and_concatenate_subject('data/Subject_01/')
    """
    
    print("="*80)
    print("LOADING SUBJECT DATA")
    print("="*80)
    print(f"\nDirectory: {subject_dir}")
    
    # Get all files in directory
    all_files = os.listdir(subject_dir)
    
    # Find FIF files
    fif_files = sorted([f for f in all_files if f.endswith('.fif')])
    
    # Find CSV files (look for files with .csv extension)
    csv_files = sorted([f for f in all_files if f.endswith('.csv')])
    
    # Verify we found files
    if len(fif_files) == 0:
        raise ValueError(f"No .fif files found in {subject_dir}")
    
    if len(csv_files) == 0:
        raise ValueError(f"No .csv files found in {subject_dir}")
    
    print(f"\n[1/2] Loading FIF files...")
    print(f"Found {len(fif_files)} FIF files:")
    for f in fif_files:
        print(f"  {f}")
    
    # Load FIF files
    raw_list = []
    
    for fif_file in fif_files:
        full_path = os.path.join(subject_dir, fif_file)
        raw = mne.io.read_raw_fif(full_path, preload=False, verbose=verbose)
        raw_list.append(raw)
        
        if verbose:
            print(f"  Loaded: {fif_file} ({len(raw.ch_names)} channels)")
    
    # Find and use common channels
    if use_common_channels and len(raw_list) > 1:
        common_channels = set(raw_list[0].ch_names)
        for raw in raw_list[1:]:
            common_channels &= set(raw.ch_names)
        
        common_channels = sorted(list(common_channels))
        print(f"\nUsing {len(common_channels)} common channels")
        
        # Pick common channels from all files
        for i, raw in enumerate(raw_list):
            raw_list[i] = raw.copy().pick_channels(common_channels, ordered=True)
    
    # Concatenate FIF files
    print("\nConcatenating FIF files...")
    raw_concat = mne.concatenate_raws(raw_list, preload=preload, verbose=verbose)
    
    print(f"✓ Neural data concatenated:")
    print(f"  Duration: {raw_concat.times[-1]:.2f}s")
    print(f"  Channels: {len(raw_concat.ch_names)}")
    print(f"  Sampling rate: {raw_concat.info['sfreq']} Hz")
    
    # Load metadata
    print("\n" + "="*80)
    print("[2/2] Loading metadata...")
    print(f"Found {len(csv_files)} CSV files:")
    for f in csv_files:
        print(f"  {f}")
    
    metadata_list = []
    
    for session_idx, csv_file in enumerate(csv_files, start=1):
        full_path = os.path.join(subject_dir, csv_file)
        
        if verbose:
            print(f"\n  Loading session {session_idx}: {csv_file}")
        
        metadata = pd.read_csv(full_path)
        
        # Add session identifier
        metadata['session'] = session_idx
        metadata['session_file'] = csv_file
        
        # Convert to numeric types
        numeric_cols = ['trial_number', 'set_size', 'match', 'correct', 
                       'response', 'response_time']
        for col in numeric_cols:
            if col in metadata.columns:
                metadata[col] = pd.to_numeric(metadata[col], errors='coerce')
        
        if verbose:
            print(f"    Trials: {len(metadata)}")
            if 'correct' in metadata.columns:
                acc = metadata['correct'].mean() * 100
                print(f"    Accuracy: {acc:.1f}%")
        
        metadata_list.append(metadata)
    
    # Concatenate all metadata
    metadata_all = pd.concat(metadata_list, ignore_index=True)
    
    print(f"\n✓ Metadata concatenated:")
    print(f"  Total trials: {len(metadata_all)}")
    print(f"  Sessions: {metadata_all['session'].nunique()}")
    
    if 'correct' in metadata_all.columns:
        overall_acc = metadata_all['correct'].mean() * 100
        print(f"  Overall accuracy: {overall_acc:.1f}%")
    
    if 'response_time' in metadata_all.columns:
        overall_rt = metadata_all['response_time'].mean()
        print(f"  Overall mean RT: {overall_rt:.3f}s")
    
    # Verify alignment
    print("\n" + "="*80)
    print("VERIFICATION")
    print("="*80)
    
    n_sessions_fif = len(fif_files)
    n_sessions_meta = metadata_all['session'].nunique()
    
    if n_sessions_fif == n_sessions_meta:
        print(f"✓ Session count matches: {n_sessions_fif} sessions")
    else:
        print(f"⚠ WARNING: Session count mismatch!")
        print(f"  FIF files: {n_sessions_fif}")
        print(f"  CSV files: {n_sessions_meta}")
    
    print("\n✓ Loading complete!")
    
    return raw_concat, metadata_all


def load_and_concatenate_subject_paired(
    subject_dir: str,
    fif_prefix: Optional[str] = None,
    csv_suffix: Optional[str] = None,
    use_common_channels: bool = True,
    preload: bool = True,
    verbose: bool = False
) -> Tuple[mne.io.Raw, pd.DataFrame]:
    """
    Load and concatenate with smart FIF-CSV pairing.
    
    Pairs files based on shared naming (e.g., Session_01_raw.fif with Session_01_metadata.csv).
    
    Parameters
    ----------
    subject_dir : str
        Directory containing subject's session files
    fif_prefix : str, optional
        Only load FIF files starting with this prefix
    csv_suffix : str, optional
        Only load CSV files with this suffix (e.g., 'metadata')
    use_common_channels : bool
        If True, only keep channels common to all files
    preload : bool
        Whether to load data into memory
    verbose : bool
        Verbose output
    
    Returns
    -------
    raw_concat : mne.io.Raw
        Concatenated Raw object
    metadata_all : pd.DataFrame
        Concatenated metadata
    
    Examples
    --------
    >>> # Load only files with specific naming
    >>> raw, meta = load_and_concatenate_subject_paired(
    ...     'data/Subject_01/',
    ...     csv_suffix='metadata'
    ... )
    """
    
    print("="*80)
    print("LOADING SUBJECT DATA (PAIRED MODE)")
    print("="*80)
    print(f"\nDirectory: {subject_dir}")
    
    # Get all files
    all_files = os.listdir(subject_dir)
    
    # Find FIF files
    fif_files = [f for f in all_files if f.endswith('.fif')]
    if fif_prefix:
        fif_files = [f for f in fif_files if f.startswith(fif_prefix)]
    fif_files = sorted(fif_files)
    
    # Find CSV files
    csv_files = [f for f in all_files if f.endswith('.csv')]
    if csv_suffix:
        # Case-insensitive substring match on the file name
        csv_files = [f for f in csv_files if csv_suffix.lower() in f.lower()]
    csv_files = sorted(csv_files)
    
    # Verify
    if len(fif_files) == 0:
        raise ValueError(f"No FIF files found in {subject_dir}")
    if len(csv_files) == 0:
        raise ValueError(f"No CSV files found in {subject_dir}")
    
    print(f"\n[1/2] Loading {len(fif_files)} FIF files...")
    for f in fif_files:
        print(f"  {f}")
    
    # Try to pair FIF with CSV files
    paired_files = []
    
    for fif_file in fif_files:
        # Extract base name (remove extension and common suffixes)
        base_name = fif_file.replace('.fif', '').replace('_raw', '').replace('_eeg', '')
        
        # Look for matching CSV
        matching_csv = None
        for csv_file in csv_files:
            csv_base = csv_file.replace('.csv', '').replace('_metadata', '')
            if base_name in csv_base or csv_base in base_name:
                matching_csv = csv_file
                break
        
        if matching_csv:
            paired_files.append((fif_file, matching_csv))
            if verbose:
                print(f"  Paired: {fif_file} ↔ {matching_csv}")
        else:
            # No match found, still use this FIF but warn
            paired_files.append((fif_file, None))
            print(f"  ⚠ No matching CSV for: {fif_file}")
    
    # Load FIF files
    raw_list = []
    for fif_file, _ in paired_files:
        full_path = os.path.join(subject_dir, fif_file)
        raw = mne.io.read_raw_fif(full_path, preload=False, verbose=verbose)
        raw_list.append(raw)
    
    # Common channels
    if use_common_channels and len(raw_list) > 1:
        common_channels = set(raw_list[0].ch_names)
        for raw in raw_list[1:]:
            common_channels &= set(raw.ch_names)
        common_channels = sorted(list(common_channels))
        print(f"\nUsing {len(common_channels)} common channels")
        
        for i, raw in enumerate(raw_list):
            raw_list[i] = raw.copy().pick_channels(common_channels, ordered=True)
    
    # Concatenate
    raw_concat = mne.concatenate_raws(raw_list, preload=preload, verbose=verbose)
    print(f"\n✓ Neural data: {raw_concat.times[-1]:.2f}s, {len(raw_concat.ch_names)} channels")
    
    # Load metadata
    print("\n[2/2] Loading metadata...")
    metadata_list = []
    
    for session_idx, (fif_file, csv_file) in enumerate(paired_files, start=1):
        if csv_file is None:
            print(f"  ⚠ Skipping session {session_idx} (no CSV)")
            continue
        
        full_path = os.path.join(subject_dir, csv_file)
        metadata = pd.read_csv(full_path)
        metadata['session'] = session_idx
        metadata['session_file'] = csv_file
        metadata['fif_file'] = fif_file
        
        # Convert types
        numeric_cols = ['trial_number', 'set_size', 'match', 'correct', 
                       'response', 'response_time']
        for col in numeric_cols:
            if col in metadata.columns:
                metadata[col] = pd.to_numeric(metadata[col], errors='coerce')
        
        metadata_list.append(metadata)
    
    if len(metadata_list) == 0:
        raise ValueError("No metadata files could be loaded")
    
    metadata_all = pd.concat(metadata_list, ignore_index=True)
    
    print(f"\n✓ Metadata: {len(metadata_all)} trials from {len(metadata_list)} sessions")
    
    return raw_concat, metadata_all
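
The substring test in `load_and_concatenate_subject_paired` can pair the wrong files when one base name is contained in another (for example `Session_1` and `Session_10`). A stricter alternative is to pair on an explicit session number. The sketch below assumes every file name contains a `Session_<digits>` token, as the file names in this notebook do; `session_key`, `pair_by_session` and the `Session_10` file names are hypothetical.

```python
import re

# Assumed naming scheme: each file name contains "Session_<digits>".
SESSION_RE = re.compile(r'Session_(\d+)', flags=re.IGNORECASE)


def session_key(filename):
    """Return the session number encoded in a file name, or None."""
    match = SESSION_RE.search(filename)
    return int(match.group(1)) if match else None


def pair_by_session(fif_files, csv_files):
    """Pair each FIF file with the CSV file that has the same session number."""
    csv_by_key = {session_key(f): f for f in csv_files
                  if session_key(f) is not None}
    pairs = []
    for fif in sorted(fif_files):
        key = session_key(fif)
        csv = csv_by_key.get(key) if key is not None else None
        pairs.append((fif, csv))
    return pairs


pairs = pair_by_session(
    ['Data_Subject_01_Session_10.h5_seeg_raw.fif',
     'Data_Subject_01_Session_01.h5_seeg_raw.fif'],
    ['Data_Subject_01_Session_01.h5_seeg_raw.csv',
     'Data_Subject_01_Session_10.h5_seeg_raw.csv'],
)
for fif, csv in pairs:
    print(fif, '<->', csv)
```

A FIF file without a partner is returned with `None`, so the caller can decide whether to skip that recording entirely rather than concatenate neural data that has no metadata.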


## 4. Epochs with Metadata

In [4]:
def create_epochs_with_metadata(
    raw,
    metadata,
    event_id,
    tmin=-0.2,
    tmax=0.8,
    baseline=(None, 0),
    event_name='retrieval',
    reject=None,
    preload=True
):
    """
    Create epochs with attached metadata for advanced trial selection.
    
    Parameters
    ----------
    raw : mne.io.Raw
        Raw data object
    metadata : pd.DataFrame
        Trial metadata DataFrame
    event_id : dict
        Event ID dictionary
    tmin, tmax : float
        Epoch time window
    baseline : tuple
        Baseline correction window
    event_name : str
        Which event to epoch around (default: 'retrieval')
    reject : dict, optional
        Rejection criteria
    preload : bool
        Load data into memory
    
    Returns
    -------
    epochs : mne.Epochs
        Epochs object with metadata attached
    """
    # Validate metadata
    if 'response_time' in metadata.columns:
        valid_rt = metadata['response_time'].notna()
        if not valid_rt.all():
            print(f"⚠ Warning: {(~valid_rt).sum()} trials have missing RT")
        
        rt_values = metadata.loc[valid_rt, 'response_time']
        print(f"RT stats: mean={rt_values.mean():.3f}s, "
              f"range=[{rt_values.min():.3f}, {rt_values.max():.3f}]s")
    
    # Extract ALL events
    events, event_dict = mne.events_from_annotations(raw)
    
    # Filter to only the events we want to epoch around
    target_event_code = event_id[event_name]
    event_mask = events[:, 2] == target_event_code
    filtered_events = events[event_mask]
    
    # Verify we have the right number of trials
    n_events = len(filtered_events)
    n_metadata = len(metadata)
    
    print(f"\nFound {n_events} '{event_name}' events (out of {len(events)} total events)")
    print(f"Have {n_metadata} metadata rows")
    
    if n_events != n_metadata:
        print(f"\n⚠ WARNING: Event count mismatch!")
        min_trials = min(n_events, n_metadata)
        print(f"  Using minimum: {min_trials} trials")
        
        # Trim both to match
        filtered_events = filtered_events[:min_trials]
        metadata = metadata.iloc[:min_trials].copy()
        
        print(f"  Trimmed to {len(filtered_events)} events and {len(metadata)} metadata rows")
    
    # Create epochs with FILTERED events
    epochs = mne.Epochs(
        raw,
        filtered_events,  # only the events that match event_name
        event_id={event_name: target_event_code},
        tmin=tmin,
        tmax=tmax,
        baseline=baseline,
        metadata=metadata,
        reject=reject,
        preload=preload,
        verbose=False
    )
    
    print(f"\n✓ Created {len(epochs)} epochs with metadata")
    print(f"  Time window: {tmin} to {tmax}s")
    print(f"  Baseline: {baseline}")
    
    # Show metadata columns available for selection
    if metadata is not None and len(metadata.columns) > 0:
        print(f"  Available metadata: {list(metadata.columns)}")
    
    return epochs
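
The event filtering step in `create_epochs_with_metadata` works on MNE's events array, which has one row per event in the layout `[sample index, previous value, event code]`. A minimal sketch on a made-up array shows the selection and the count check; the sample indices are invented, and the codes follow the event types printed later in this notebook.

```python
import numpy as np

# Toy events array (invented samples): [sample index, previous value, event code]
events = np.array([
    [100, 0, 1],    # encoding
    [300, 0, 5],    # retrieval
    [450, 0, 4],    # response
    [900, 0, 1],    # encoding
    [1100, 0, 5],   # retrieval
    [1230, 0, 4],   # response
])
event_id = {'encoding': 1, 'response': 4, 'retrieval': 5}

target_code = event_id['retrieval']
filtered = events[events[:, 2] == target_code]

# One metadata row is expected per selected event.
n_metadata_rows = 2
counts_match = len(filtered) == n_metadata_rows
print(filtered[:, 0], counts_match)
```

When the counts differ, trimming both to the shorter length only keeps trials aligned if the missing events are at the end of the recording, so it may be worth inspecting a mismatch per session before relying on the trimmed result.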

## 5. Usage Example: Load Subject Data

In [5]:
if __name__ == '__main__':
    # Example 1: Simple loading (all .fif and .csv files)
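    # Build the path with os.path.join: a bare backslash in a normal string
    # ('...\Subject_01') is an invalid escape sequence and triggers a SyntaxWarning.
    subject_dir = os.path.join('Data_converted_MetaData', 'Subject_01')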
    
    try:
        raw, metadata = load_and_concatenate_subject(
            subject_dir=subject_dir,
            use_common_channels=True,
            preload=True,
            verbose=True
        )
        
        print("\n" + "="*80)
        print("SUCCESS!")
        print("="*80)
        print(f"Loaded {len(raw.ch_names)} channels, {raw.times[-1]:.1f}s")
        print(f"Loaded {len(metadata)} trials")
        
    except FileNotFoundError as e:
        print(f"\nDirectory not found: {e}")
        print("Please update subject_dir to point to your data")
    except ValueError as e:
        print(f"\nError: {e}")
    
    print("\n" + "="*80)



LOADING SUBJECT DATA

Directory: Data_converted_MetaData\Subject_01

[1/2] Loading FIF files...
Found 4 FIF files:
  Data_Subject_01_Session_01.h5_seeg_raw.fif
  Data_Subject_01_Session_02.h5_seeg_raw.fif
  Data_Subject_01_Session_03.h5_seeg_raw.fif
  Data_Subject_01_Session_04.h5_seeg_raw.fif
Opening raw data file Data_converted_MetaData\Subject_01\Data_Subject_01_Session_01.h5_seeg_raw.fif...
Isotrak not found
    Range : 0 ... 79999 =      0.000 ...   399.995 secs
Ready.




  Loaded: Data_Subject_01_Session_01.h5_seeg_raw.fif (20 channels)
Opening raw data file Data_converted_MetaData\Subject_01\Data_Subject_01_Session_02.h5_seeg_raw.fif...
Isotrak not found
    Range : 0 ... 79999 =      0.000 ...   399.995 secs
Ready.
  Loaded: Data_Subject_01_Session_02.h5_seeg_raw.fif (20 channels)
Opening raw data file Data_converted_MetaData\Subject_01\Data_Subject_01_Session_03.h5_seeg_raw.fif...
Isotrak not found
    Range : 0 ... 79999 =      0.000 ...   399.995 secs
Ready.
  Loaded: Data_Subject_01_Session_03.h5_seeg_raw.fif (20 channels)
Opening raw data file Data_converted_MetaData\Subject_01\Data_Subject_01_Session_04.h5_seeg_raw.fif...
Isotrak not found
    Range : 0 ... 79999 =      0.000 ...   399.995 secs
Ready.
  Loaded: Data_Subject_01_Session_04.h5_seeg_raw.fif (20 channels)

Using 20 common channels
NOTE: pick_channels() is a legacy function. New code should use inst.pick(...).
NOTE: pick_channels() is a legacy function. New code should use inst.pick(

In [6]:
# Inspect metadata
print("\nMetadata columns:")
print(metadata.columns.tolist())

print("\nFirst few trials:")
display(metadata.head(10))

print("\nMetadata summary:")
display(metadata.describe())


Metadata columns:
['trial_number', 'set_size', 'match', 'correct', 'response', 'response_time', 'probe_letter', 'session', 'session_file']

First few trials:


Unnamed: 0,trial_number,set_size,match,correct,response,response_time,probe_letter,session,session_file
0,1.0,8.0,2.0,1.0,52.0,2.484,H,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
1,2.0,4.0,1.0,1.0,51.0,1.66775,T,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
2,3.0,8.0,2.0,1.0,52.0,1.472,L,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
3,4.0,6.0,2.0,1.0,52.0,1.30875,G,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
4,5.0,8.0,2.0,1.0,52.0,1.51625,S,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
5,6.0,4.0,1.0,1.0,51.0,0.899,R,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
6,7.0,8.0,2.0,1.0,52.0,1.132,N,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
7,8.0,8.0,1.0,1.0,51.0,1.58025,Z,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
8,9.0,4.0,2.0,1.0,52.0,0.8845,Z,1,Data_Subject_01_Session_01.h5_seeg_raw.csv
9,10.0,6.0,2.0,1.0,52.0,0.943,V,1,Data_Subject_01_Session_01.h5_seeg_raw.csv



Metadata summary:


Unnamed: 0,trial_number,set_size,match,correct,response,response_time,session
count,200.0,200.0,200.0,200.0,200.0,200.0,200.0
mean,25.5,5.97,1.51,0.925,47.775,1.304856,2.5
std,14.467083,1.683231,0.501154,0.264052,13.190692,0.669556,1.12084
min,1.0,4.0,1.0,0.0,1.0,0.591375,1.0
25%,13.0,4.0,1.0,1.0,51.0,0.918281,1.75
50%,25.5,6.0,2.0,1.0,51.0,1.089875,2.5
75%,38.0,8.0,2.0,1.0,52.0,1.401563,3.25
max,50.0,8.0,2.0,1.0,52.0,5.585375,4.0


## 6. Filtering and Preprocessing

In [7]:
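# Filter the data
# Two copies are kept: 0.1-40 Hz, and 1-40 Hz (the copy passed to ICA.fit below)
raw_filtered = raw.copy().filter(l_freq=0.1, h_freq=40.0, verbose=True)
raw_filtered_ica = raw.copy().filter(l_freq=1.0, h_freq=40.0, verbose=True)

print("\n✓ Applied bandpass filters: 0.1-40 Hz (raw_filtered), 1-40 Hz (raw_filtered_ica)")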

Filtering raw data in 4 contiguous segments
Setting up band-pass filter from 0.1 - 40 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 0.10
- Lower transition bandwidth: 0.10 Hz (-6 dB cutoff frequency: 0.05 Hz)
- Upper passband edge: 40.00 Hz
- Upper transition bandwidth: 10.00 Hz (-6 dB cutoff frequency: 45.00 Hz)
- Filter length: 6601 samples (33.005 s)

Filtering raw data in 4 contiguous segments
Setting up band-pass filter from 1 - 40 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 1.00
- Lower transition bandwidth: 1.00 Hz (-6 dB cutoff frequency: 0.50 Hz)
- Upper passband e

## 7. Create Epochs with Metadata

In [8]:
# Extract events and event_id from raw
events, event_id = mne.events_from_annotations(raw)

print("Event types:")
for event_name, event_code in event_id.items():
    n_events = len(events[events[:, 2] == event_code])
    print(f"  {event_name}: {n_events} events (code {event_code})")

Used Annotations descriptions: [np.str_('encoding'), np.str_('fixation'), np.str_('maintenance'), np.str_('response'), np.str_('retrieval')]
Event types:
  encoding: 200 events (code 1)
  fixation: 200 events (code 2)
  maintenance: 200 events (code 3)
  response: 200 events (code 4)
  retrieval: 200 events (code 5)


In [9]:
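# `epochs` must already exist (created with create_epochs_with_metadata)
# before this cell is run; otherwise it raises the NameError shown below.
evoked = epochs[0:100].average()
evoked.plot()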

NameError: name 'epochs' is not defined

In [None]:
# Create epochs with metadata
# Epoch around response event
response_epochs = create_epochs_with_metadata(
    raw=raw_filtered_ica,
    metadata=metadata,
    event_id=event_id,
    tmin=-1.02,
    tmax=0.98,
    baseline=(None, 0),
    event_name='response',  # Change to your event name
    preload=True
)
response_epochs.plot()

# Create epochs with metadata
# Epoch around retrieval event
retrieval_epochs = create_epochs_with_metadata(
    raw=raw_filtered_ica,
    metadata=metadata,
    event_id=event_id,
    tmin=-0.2,
    tmax=2,
    baseline=(-0.2, 0),
    event_name='retrieval',  # Change to your event name
    preload=True
)
retrieval_epochs.plot()

# Later cells use a generic `epochs` object. Assumption: it is the
# response-locked set, whose 199 epochs match the trial counts printed in
# section 8 (184 correct + 15 incorrect).
epochs = response_epochs

RT stats: mean=1.305s, range=[0.591, 5.585]s
Used Annotations descriptions: [np.str_('encoding'), np.str_('fixation'), np.str_('maintenance'), np.str_('response'), np.str_('retrieval')]

Found 200 'response' events (out of 1000 total events)
Have 200 metadata rows

✓ Created 199 epochs with metadata
  Time window: -1.02 to 0.98s
  Baseline: (None, 0)
  Available metadata: ['trial_number', 'set_size', 'match', 'correct', 'response', 'response_time', 'probe_letter', 'session', 'session_file']
RT stats: mean=1.305s, range=[0.591, 5.585]s
Used Annotations descriptions: [np.str_('encoding'), np.str_('fixation'), np.str_('maintenance'), np.str_('response'), np.str_('retrieval')]

Found 200 'retrieval' events (out of 1000 total events)
Have 200 metadata rows

✓ Created 196 epochs with metadata
  Time window: -0.2 to 2s
  Baseline: (-0.2, 0)
  Available metadata: ['trial_number', 'set_size', 'match', 'correct', 'response', 'response_time', 'probe_letter', 'session', 'session_file']


<mne_qt_browser._pg_figure.MNEQtBrowser at 0x1ceba592330>
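
The log above reports 200 matching events but only 199 and 196 created epochs, so some epochs were dropped during construction. MNE records the reason for every event in `epochs.drop_log`, a tuple with one entry per event in which kept epochs have an empty tuple. A minimal sketch for tallying those reasons follows; the `summarize_drop_log` helper and the toy log are illustrative and are not output from this dataset.

```python
from collections import Counter


def summarize_drop_log(drop_log):
    """Return (number of kept epochs, {drop reason: count})."""
    reasons = Counter()
    for entry in drop_log:
        for reason in entry:
            reasons[reason] += 1
    n_kept = sum(1 for entry in drop_log if len(entry) == 0)
    return n_kept, dict(reasons)


# Toy drop log (invented): two kept epochs, one epoch too close to the edge of
# the recording, and one rejected on a channel named 'CH1'.
toy_log = ((), ('TOO_SHORT',), (), ('CH1',))
n_kept, reasons = summarize_drop_log(toy_log)
print(n_kept, reasons)

# With real data (not run here):
# n_kept, reasons = summarize_drop_log(response_epochs.drop_log)
```

Because the metadata rows of dropped epochs are removed together with the epochs, `len(epochs.metadata)` stays equal to `len(epochs)` after the drop.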

In [8]:
n_components = 0.999  # float in (0, 1): keep components explaining this fraction of variance
method = 'picard'
max_iter = 500  # upper bound on fitting iterations

random_state = 42

retrieval_ica = mne.preprocessing.ICA(n_components=n_components,
                            method=method,
                            max_iter=max_iter,
                            random_state=random_state)
retrieval_ica.fit(raw_filtered_ica)

Fitting ICA to data using 19 channels (please be patient, this may take a while)
Selecting by explained variance: 15 components
Fitting ICA took 19.0s.


0,1
Method,picard
Fit parameters,max_iter=500
Fit,67 iterations on raw data (320000 samples)
ICA components,15
Available PCA components,19
Channel types,eeg
ICA components marked for exclusion,—


In [11]:
retrieval_ica.plot_components(inst=raw_filtered_ica)

    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
200 matching events found
No baseline correction applied
0 projection items activated
    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
200 matching events found
No baseline correction applied
0 projection items activated
    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
200 matching events found
No baseline correction applied
0 projection items activated
    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
200 matching events found
No baseline correction applied
0 projection items activated
    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
200 matching events found
No baseline correction applied
0 projection items activated
    Using multitaper spectrum estimation with 7 DPSS windows
Not setting metadata
200 matching events found
No baseline correction applied
0 projection items ac

<MNEFigure size 2880x1760 with 15 Axes>

In [17]:
def add_arrows(axes):
    # add arrows at 60 Hz and at those harmonics that fall inside the PSD range
    for ax in axes:
        freqs = ax.lines[-1].get_xdata()
        psds = ax.lines[-1].get_ydata()
        for freq in (60, 120, 180, 240):
            idx = np.searchsorted(freqs, freq)
            if idx >= len(freqs):
                # frequency lies above the last PSD bin (fmax); indexing
                # freqs[idx] here would raise an IndexError
                continue
            # get ymax of a small region around the freq. of interest
            y = psds[max(idx - 4, 0) : (idx + 5)].max()
            ax.arrow(
                x=freqs[idx],
                y=y + 18,
                dx=0,
                dy=-12,
                color="red",
                width=0.1,
                head_width=3,
                length_includes_head=True,
            )

raw_first_h = raw.copy().crop(tmin=0, tmax=800)
fig = raw_first_h.compute_psd(fmax=100).plot(average=True)
add_arrows(fig.axes[:2])

Effective window size : 10.240 (s)
Plotting power spectral density (dB=True).
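
`np.searchsorted` returns `len(freqs)` when the requested frequency lies above the last bin, so using the result as an index fails for any harmonic beyond `fmax`. A small sketch with a made-up frequency axis (1025 bins from 0 to 100 Hz, assumed to mirror a PSD computed with `fmax=100`) shows the guard:

```python
import numpy as np

# Made-up frequency axis: 1025 bins from 0 to 100 Hz
freqs = np.linspace(0, 100, 1025)

in_range = []
for freq in (60, 120, 180, 240):
    idx = np.searchsorted(freqs, freq)
    if idx >= len(freqs):
        # no bin at or above this frequency, so there is nothing to index
        continue
    in_range.append((freq, int(idx)))

print(in_range)
```

Only the 60 Hz entry survives the guard here, because 120, 180 and 240 Hz all lie above the last bin.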

In [None]:
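# The fitted ICA object in this notebook is `retrieval_ica`. blink_idx,
# heartbeat_idx and muscle_idx are lists of component indices that must be
# defined first, after inspecting the component plots above.
retrieval_ica.apply(raw, exclude=blink_idx + heartbeat_idx)
retrieval_ica.plot_overlay(raw, exclude=muscle_idx)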

## 8. Trial Selection Using Metadata

In [33]:
# Now you can select trials using pandas-style queries!

# Example 1: Select correct trials only
epochs_correct = epochs['correct == 1']
print(f"Correct trials: {len(epochs_correct)}")

# Example 2: Select incorrect trials
epochs_incorrect = epochs['correct == 0']
print(f"Incorrect trials: {len(epochs_incorrect)}")

# Example 3: Select high memory load trials (set size >= 6)
epochs_high_load = epochs['set_size >= 6']
print(f"High load trials (set size ≥ 6): {len(epochs_high_load)}")

# Example 4: Select match trials
epochs_match = epochs['match == 1']
print(f"Match trials: {len(epochs_match)}")

# Example 5: Complex query - correct high load trials
epochs_correct_high_load = epochs['correct == 1 and set_size >= 6']
print(f"Correct high load trials: {len(epochs_correct_high_load)}")

# Example 6: Fast responses (RT < median)
median_rt = epochs.metadata['response_time'].median()
epochs_fast = epochs[f'response_time < {median_rt}']
print(f"Fast responses (RT < {median_rt:.3f}s): {len(epochs_fast)}")

Correct trials: 184
Incorrect trials: 15
High load trials (set size ≥ 6): 128
Match trials: 97
Correct high load trials: 113
Fast responses (RT < 1.087s): 99
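
The selection strings passed to `epochs[...]` are evaluated as pandas queries on `epochs.metadata`, so an expression can be checked on a plain DataFrame before it is used on epochs. A minimal sketch on made-up trials (the values are invented for illustration):

```python
import pandas as pd

# Toy metadata (invented): four trials
toy = pd.DataFrame({
    'correct': [1, 1, 0, 1],
    'set_size': [4, 6, 8, 8],
    'response_time': [0.9, 1.2, 2.1, 1.0],
})

# Same expressions as used for epoch selection
n_correct_high = len(toy.query('correct == 1 and set_size >= 6'))
median_rt = toy['response_time'].median()
n_fast = len(toy.query(f'response_time < {median_rt}'))

print(n_correct_high, median_rt, n_fast)
```

Running the query on the DataFrame first also shows how many trials a condition contains, which helps to avoid averaging over an empty or very small selection.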


## 9. ERP Analysis by Condition

In [34]:
# Compare correct vs incorrect trials
evoked_correct = epochs['correct == 1'].average()
evoked_incorrect = epochs['correct == 0'].average()

# Plot comparison
mne.viz.plot_compare_evokeds(
    {'Correct': evoked_correct, 'Incorrect': evoked_incorrect},
    picks='eeg',
    combine='mean'
)

combining channels using "mean"
combining channels using "mean"


[<Figure size 800x600 with 1 Axes>]

In [None]:
# Compare by memory load (set size)
evoked_by_load = {}

for set_size in sorted(epochs.metadata['set_size'].dropna().unique()):
    epochs_size = epochs[f'set_size == {set_size}']
    evoked_by_load[f'Set size {int(set_size)}'] = epochs_size.average()

# Plot comparison
mne.viz.plot_compare_evokeds(
    evoked_by_load,
    picks='eeg',
    combine='mean'
)

In [None]:
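# Compare match vs non-match
# In this dataset `match` is coded 1/2 (see the metadata summary: min 1, max 2),
# so 'match == 0' would select no trials; non-match is taken to be match == 2.
evoked_match = epochs['match == 1'].average()
evoked_nonmatch = epochs['match == 2'].average()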

mne.viz.plot_compare_evokeds(
    {'Match': evoked_match, 'Non-match': evoked_nonmatch},
    picks='eeg',
    combine='mean'
)

## 10. Reaction Time Analysis with Metadata

In [9]:
# RT summary statistics
print("Reaction Time Statistics:")
print(f"  Mean: {metadata['response_time'].mean():.3f}s")
print(f"  Median: {metadata['response_time'].median():.3f}s")
print(f"  SD: {metadata['response_time'].std():.3f}s")
print(f"  Min: {metadata['response_time'].min():.3f}s")
print(f"  Max: {metadata['response_time'].max():.3f}s")

Reaction Time Statistics:
  Mean: 1.305s
  Median: 1.090s
  SD: 0.670s
  Min: 0.591s
  Max: 5.585s


In [None]:
# RT by accuracy
rt_correct = metadata_all[metadata_all['correct'] == 1]['response_time']
rt_incorrect = metadata_all[metadata_all['correct'] == 0]['response_time']

print(f"\nRT by Accuracy:")
print(f"  Correct: {rt_correct.mean():.3f}s (SD: {rt_correct.std():.3f}s)")
print(f"  Incorrect: {rt_incorrect.mean():.3f}s (SD: {rt_incorrect.std():.3f}s)")

# Statistical test
from scipy.stats import ttest_ind
t_stat, p_value = ttest_ind(rt_correct.dropna(), rt_incorrect.dropna())
print(f"  t-test: t = {t_stat:.3f}, p = {p_value:.4f}")
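
The default `ttest_ind` assumes equal variances, which may not hold for RT distributions that are right-skewed and split into groups of very different size. As a hedged alternative, the sketch below runs Welch's t-test and a rank-based Mann-Whitney U test on synthetic RTs; the simulated values and group sizes are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import mannwhitneyu, ttest_ind

rng = np.random.default_rng(1)

# Synthetic right-skewed RTs in seconds; group sizes mimic 184 vs 15 trials
rt_correct_sim = 0.5 + rng.lognormal(mean=-0.5, sigma=0.5, size=184)
rt_incorrect_sim = 0.5 + rng.lognormal(mean=-0.2, sigma=0.7, size=15)

# Welch's t-test: does not assume equal variances
t_welch, p_welch = ttest_ind(rt_correct_sim, rt_incorrect_sim, equal_var=False)

# Mann-Whitney U: rank-based, less sensitive to skew and outliers
u_stat, p_mwu = mannwhitneyu(rt_correct_sim, rt_incorrect_sim,
                             alternative="two-sided")

print(f"Welch: t = {t_welch:.3f}, p = {p_welch:.4f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {p_mwu:.4f}")
```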

In [None]:
# RT by memory load
print("\nRT by Set Size:")
for set_size in sorted(metadata_all['set_size'].dropna().unique()):
    rt_size = metadata_all[metadata_all['set_size'] == set_size]['response_time']
    print(f"  Set size {int(set_size)}: {rt_size.mean():.3f}s (SD: {rt_size.std():.3f}s, n={len(rt_size)})")
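
The same per-load summary can be produced in one `groupby` call, and a linear fit gives a rough estimate of how RT scales with set size. This is a sketch on a made-up table (values are hypothetical). Note that `count` reports non-missing RTs, whereas `len(rt_size)` in the loop above also counts trials with missing RTs.

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

# Hypothetical trials; one RT is missing on purpose
df = pd.DataFrame({
    "set_size": [4, 4, 6, 6, 8, 8, 8],
    "response_time": [0.9, 1.1, 1.2, np.nan, 1.5, 1.7, 1.6],
})

# Mean, SD and number of non-missing RTs per set size
summary = df.groupby("set_size")["response_time"].agg(["mean", "std", "count"])
print(summary)

# Slope of RT over set size (seconds per additional item), NaNs dropped first
valid = df.dropna(subset=["set_size", "response_time"])
fit = linregress(valid["set_size"], valid["response_time"])
print(f"Slope: {fit.slope:.3f} s/item")
```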

In [None]:
# Plot RT distributions
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# 1. Overall distribution
axes[0, 0].hist(metadata_all['response_time'].dropna(), bins=30, edgecolor='black', alpha=0.7)
axes[0, 0].set_xlabel('Reaction Time (s)')
axes[0, 0].set_ylabel('Count')
axes[0, 0].set_title('Overall RT Distribution')
axes[0, 0].axvline(metadata_all['response_time'].mean(), color='red', 
                    linestyle='--', label='Mean')
axes[0, 0].legend()

# 2. By accuracy
axes[0, 1].hist([rt_correct.dropna(), rt_incorrect.dropna()], 
                bins=30, label=['Correct', 'Incorrect'], alpha=0.7)
axes[0, 1].set_xlabel('Reaction Time (s)')
axes[0, 1].set_ylabel('Count')
axes[0, 1].set_title('RT by Accuracy')
axes[0, 1].legend()

# 3. By set size
rt_by_size = [metadata_all[metadata_all['set_size'] == size]['response_time'].dropna() 
              for size in sorted(metadata_all['set_size'].dropna().unique())]
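# Label via set_xticklabels: boxplot's `labels` keyword is deprecated since Matplotlib 3.9
axes[1, 0].boxplot(rt_by_size)
axes[1, 0].set_xticklabels([f'{int(s)}' for s in sorted(metadata_all['set_size'].dropna().unique())])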
axes[1, 0].set_xlabel('Set Size')
axes[1, 0].set_ylabel('Reaction Time (s)')
axes[1, 0].set_title('RT by Memory Load')

# 4. Over trials
axes[1, 1].plot(metadata_all['response_time'], 'o-', alpha=0.5, markersize=3)
axes[1, 1].set_xlabel('Trial Number')
axes[1, 1].set_ylabel('Reaction Time (s)')
axes[1, 1].set_title('RT Over Time')

# Add session boundaries
if 'session' in metadata_all.columns:
    # Use .iloc so the slice is positional regardless of the session labels
    session_boundaries = metadata_all.groupby('session').size().cumsum().iloc[:-1]
    for boundary in session_boundaries:
        axes[1, 1].axvline(boundary, color='red', linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()

## 11. Accuracy Analysis

In [None]:
# Overall accuracy
overall_acc = metadata_all['correct'].mean() * 100
print(f"Overall Accuracy: {overall_acc:.1f}%")

# Accuracy by session
print("\nAccuracy by Session:")
session_acc = metadata_all.groupby('session')['correct'].agg(['mean', 'count'])
session_acc['mean'] *= 100
session_acc.columns = ['Accuracy (%)', 'N trials']
display(session_acc)

# Accuracy by set size
print("\nAccuracy by Set Size:")
size_acc = metadata_all.groupby('set_size')['correct'].agg(['mean', 'count'])
size_acc['mean'] *= 100
size_acc.columns = ['Accuracy (%)', 'N trials']
display(size_acc)

# Accuracy by match condition
print("\nAccuracy by Match Condition:")
match_acc = metadata_all.groupby('match')['correct'].agg(['mean', 'count'])
match_acc['mean'] *= 100
match_acc.columns = ['Accuracy (%)', 'N trials']
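# Map codes to labels explicitly so this also works if one condition is absent
match_acc.index = match_acc.index.map({0: 'Non-match', 1: 'Match'})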
display(match_acc)
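
Accuracy percentages from small groups can be misleading without an uncertainty estimate. The sketch below computes a Wilson score interval for a proportion. The helper `wilson_interval` is hypothetical (not part of this notebook or of any library used here), and the counts 184 of 199 are taken from the epoch selection output earlier purely as an illustration.

```python
import math


def wilson_interval(n_correct, n_trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    p_hat = n_correct / n_trials
    denom = 1 + z**2 / n_trials
    center = (p_hat + z**2 / (2 * n_trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / n_trials + z**2 / (4 * n_trials**2)
    ) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(184, 199)
print(f"Accuracy: {184 / 199 * 100:.1f}% (95% CI: {low * 100:.1f}-{high * 100:.1f}%)")
```

The same helper could be applied per session or per set size using the `count` and `mean` columns of the tables above.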

In [None]:
# Plot accuracy
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# By session
session_acc = metadata_all.groupby('session')['correct'].mean() * 100
axes[0].bar(session_acc.index, session_acc.values)
axes[0].set_xlabel('Session')
axes[0].set_ylabel('Accuracy (%)')
axes[0].set_title('Accuracy by Session')
axes[0].set_ylim([0, 100])

# By set size
size_acc = metadata_all.groupby('set_size')['correct'].mean() * 100
axes[1].bar(size_acc.index, size_acc.values)
axes[1].set_xlabel('Set Size')
axes[1].set_ylabel('Accuracy (%)')
axes[1].set_title('Accuracy by Memory Load')
axes[1].set_ylim([0, 100])

# By match
match_acc = metadata_all.groupby('match')['correct'].mean() * 100
axes[2].bar(list(match_acc.index.map({0: 'Non-match', 1: 'Match'})), match_acc.values)
axes[2].set_ylabel('Accuracy (%)')
axes[2].set_title('Accuracy by Match Condition')
axes[2].set_ylim([0, 100])

plt.tight_layout()
plt.show()

## 12. Advanced: Single-Trial Analysis

In [None]:
# Example: Correlate single-trial ERP amplitude with RT
# Pick a channel and time window of interest
channel = 'Fz'  # Change to your channel
time_window = (0.3, 0.5)  # Time window in seconds

# Get data for correct trials only
epochs_correct = epochs['correct == 1']

# Extract single-trial amplitudes
data = epochs_correct.get_data(picks=channel)
times = epochs_correct.times

# Find time indices
time_mask = (times >= time_window[0]) & (times <= time_window[1])
amplitudes = data[:, 0, time_mask].mean(axis=1)  # Mean amplitude in window

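# Get corresponding RTs
rts = epochs_correct.metadata['response_time'].values

# Drop trials with missing RTs (NaN would make the correlation undefined)
valid = np.isfinite(rts)
amplitudes, rts = amplitudes[valid], rts[valid]

# Compute correlation
from scipy.stats import pearsonr
r, p = pearsonr(amplitudes, rts)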

print(f"\nSingle-Trial Correlation:")
print(f"  Channel: {channel}")
print(f"  Time window: {time_window[0]}-{time_window[1]}s")
print(f"  r = {r:.3f}, p = {p:.4f}")

# Plot
plt.figure(figsize=(8, 6))
plt.scatter(amplitudes * 1e6, rts, alpha=0.5)
plt.xlabel(f'{channel} Amplitude ({time_window[0]}-{time_window[1]}s) (µV)')
plt.ylabel('Response Time (s)')
plt.title(f'Single-Trial ERP-RT Correlation\nr = {r:.3f}, p = {p:.4f}')

# Add regression line
from scipy.stats import linregress
slope, intercept, _, _, _ = linregress(amplitudes, rts)
x_line = np.array([amplitudes.min(), amplitudes.max()])
y_line = slope * x_line + intercept
plt.plot(x_line * 1e6, y_line, 'r-', linewidth=2, label='Regression line')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
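
Because the RT distribution is right-skewed (maximum 5.585 s against a median of 1.090 s), a rank-based correlation may be a useful check on the Pearson result. The sketch below repeats the windowed-amplitude extraction on synthetic single-channel data and computes Spearman's rho; the sampling rate, data, and RTs are all made up for illustration.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)

# Synthetic single-channel epochs: (n_epochs, n_times), amplitudes in volts
sfreq = 100.0  # hypothetical sampling rate in Hz
times = np.arange(-0.2, 0.8, 1 / sfreq)
data_sim = rng.normal(scale=1e-6, size=(60, times.size))

# Synthetic right-skewed RTs in seconds
rts_sim = 0.6 + rng.lognormal(mean=-0.5, sigma=0.4, size=60)

# Mean amplitude per trial within the window of interest
window = (0.3, 0.5)
time_mask = (times >= window[0]) & (times <= window[1])
amplitudes_sim = data_sim[:, time_mask].mean(axis=1)

# Spearman's rho uses ranks, so extreme RTs have limited leverage
rho, p_rho = spearmanr(amplitudes_sim, rts_sim)
print(f"Spearman rho = {rho:.3f}, p = {p_rho:.4f}")
```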

## 13. Save Results

In [None]:
# Save epochs with metadata
# This preserves both the neural data and behavioral metadata
epochs.save('subject_epochs-epo.fif', overwrite=True)

print("✓ Saved epochs with metadata to 'subject_epochs-epo.fif'")
print("  You can reload later with: epochs = mne.read_epochs('subject_epochs-epo.fif')")
print("  Metadata will be accessible via: epochs.metadata")

In [27]:
"""
Diagnostic: Check Event Extraction and STI Channel
"""

import mne
import numpy as np
import h5py

def check_events_in_fif(fif_path):
    """Check if events are properly encoded in FIF file."""
    
    print("="*80)
    print("CHECKING EVENTS IN FIF FILE")
    print("="*80)
    
    # Load raw
    raw = mne.io.read_raw_fif(fif_path, preload=True, verbose=False)
    
    print(f"\nFile: {fif_path}")
    print(f"Channels: {len(raw.ch_names)}")
    print(f"Duration: {raw.times[-1]:.2f}s")
    
    # Check for STI channel
    print("\n" + "-"*80)
    print("CHECKING STI CHANNEL")
    print("-"*80)
    
    stim_channels = [ch for ch in raw.ch_names if 'STI' in ch.upper()]
    
    if len(stim_channels) == 0:
        print("❌ NO STI CHANNEL FOUND!")
        print(f"   Available channels: {raw.ch_names}")
        return False
    else:
        print(f"✓ Found {len(stim_channels)} STI channel(s): {stim_channels}")
        
        # Check STI data
        for stim_ch in stim_channels:
            stim_data = raw.get_data(picks=stim_ch)[0]
            
            print(f"\n  Channel: {stim_ch}")
            print(f"    Min: {stim_data.min()}")
            print(f"    Max: {stim_data.max()}")
            print(f"    Unique values: {np.unique(stim_data)}")
            print(f"    Non-zero samples: {np.sum(stim_data != 0)}")
            
            if np.all(stim_data == 0):
                print(f"    ❌ STI channel is all zeros!")
            else:
                print(f"    ✓ STI channel has events")
    
    # Try to extract events
    print("\n" + "-"*80)
    print("EXTRACTING EVENTS")
    print("-"*80)
    
    try:
        # Use the stim channel(s) detected above instead of a hard-coded name
        events = mne.find_events(raw, stim_channel=stim_channels, verbose=False)
        
        if len(events) == 0:
            print("❌ No events found!")
        else:
            print(f"✓ Found {len(events)} events")
            print(f"\n  Event codes: {np.unique(events[:, 2])}")
            print(f"  Event counts:")
            for code in np.unique(events[:, 2]):
                n = np.sum(events[:, 2] == code)
                print(f"    Code {code}: {n} events")
            
            print(f"\n  First 5 events:")
            print("  Sample  | Time (s) | Code")
            print("  " + "-"*35)
            for i, event in enumerate(events[:5]):
                sample, _, code = event
                time = sample / raw.info['sfreq']
                print(f"  {sample:6d}  | {time:8.2f} | {code:4d}")
    
    except Exception as e:
        print(f"❌ Error extracting events: {e}")
    
    # Check annotations
    print("\n" + "-"*80)
    print("CHECKING ANNOTATIONS")
    print("-"*80)
    
    if len(raw.annotations) == 0:
        print("❌ No annotations found!")
    else:
        print(f"✓ Found {len(raw.annotations)} annotations")
        print(f"\n  Annotation types:")
        unique_desc = np.unique(raw.annotations.description)
        for desc in unique_desc:
            n = np.sum(raw.annotations.description == desc)
            print(f"    {desc}: {n} annotations")
        
        print(f"\n  First 5 annotations:")
        print("  Time (s) | Description")
        print("  " + "-"*40)
        for i in range(min(5, len(raw.annotations))):
            print(f"  {raw.annotations.onset[i]:8.2f} | {raw.annotations.description[i]}")
    
    # Check event_id in info
    print("\n" + "-"*80)
    print("CHECKING EVENT_ID")
    print("-"*80)
    
    # info['temp'] may be None and does not survive a file I/O round trip
    temp = raw.info.get('temp')
    if isinstance(temp, dict) and 'event_id' in temp:
        event_id = temp['event_id']
        print(f"✓ Found event_id mapping:")
        for name, code in event_id.items():
            print(f"    {name}: {code}")
    else:
        print("❌ No event_id mapping found in raw.info")
    
    return True


def check_events_in_nix(nix_path):
    """Check events in NIX file."""
    
    print("\n" + "="*80)
    print("CHECKING EVENTS IN NIX FILE")
    print("="*80)
    
    with h5py.File(nix_path, 'r') as file:
        if 'data' not in file:
            print("❌ No data section")
            return
        
        block_name = list(file['data'].keys())[0]
        block = file['data'][block_name]
        
        print(f"\nBlock: {block_name}")
        
        # Check for Events group
        if 'groups' not in block:
            print("❌ No groups section")
            return
        
        if 'Events' not in block['groups']:
            print("❌ No Events group")
            print(f"   Available groups: {list(block['groups'].keys())}")
            return
        
        events_group = block['groups']['Events']
        print(f"✓ Found Events group")
        
        # Check for multi_tags
        if 'multi_tags' not in events_group:
            print("❌ No multi_tags in Events group")
            return
        
        multi_tags = events_group['multi_tags']
        n_tags = len(multi_tags)
        print(f"✓ Found {n_tags} multi_tags (trials)")
        
        # Show first trial events
        first_tag_key = list(multi_tags.keys())[0]
        first_tag = multi_tags[first_tag_key]
        
        print(f"\n  First trial: {first_tag_key}")
        
        if 'positions' in first_tag:
            positions = first_tag['positions']
            
            if 'data' in positions:
                event_times = positions['data'][()]
                print(f"    Event times: {event_times}")
            
            if 'dimensions' in positions:
                for dim_key in positions['dimensions'].keys():
                    dim = positions['dimensions'][dim_key]
                    if 'labels' in dim:
                        labels = dim['labels'][()]
                        
                        def safe_decode(val):
                            if isinstance(val, bytes):
                                return val.decode('utf-8')
                            elif isinstance(val, np.ndarray) and len(val) > 0:
                                val = val[0]
                                if isinstance(val, bytes):
                                    return val.decode('utf-8')
                            return val
                        
                        event_names = [safe_decode(label) for label in labels]
                        print(f"    Event names: {event_names}")
                        
                        # Only print timing if event times were read above
                        if 'data' in positions:
                            print(f"\n    Event timing:")
                            for name, time in zip(event_names, event_times):
                                print(f"      {name}: {time:.3f}s")


if __name__ == '__main__':
    import sys
    import os
    
    # Check FIF file
    # Raw string so backslashes in the Windows path are not read as escape sequences
    fif_file = r'Data_converted_MetaData\Subject_01\Data_Subject_01_Session_04.h5_seeg_raw.fif'  # Change to your FIF file
    
    if os.path.exists(fif_file):
        check_events_in_fif(fif_file)
    else:
        print(f"FIF file not found: {fif_file}")
        print("Please provide path to your FIF file")
    
    print("\n" + "="*80)
    
    # Check NIX file
    nix_file = 'data_nix/Data_Subject_01_Session_01.h5'  # Change to your NIX file
    
    if os.path.exists(nix_file):
        check_events_in_nix(nix_file)
    else:
        print(f"\nNIX file not found: {nix_file}")
        print("Please provide path to your NIX file")

CHECKING EVENTS IN FIF FILE

File: Data_converted_MetaData\Subject_01\Data_Subject_01_Session_04.h5_seeg_raw.fif
Channels: 20
Duration: 400.00s

--------------------------------------------------------------------------------
CHECKING STI CHANNEL
--------------------------------------------------------------------------------
✓ Found 1 STI channel(s): ['STI']

  Channel: STI
    Min: 0.0
    Max: 0.0
    Unique values: [0.]
    Non-zero samples: 0
    ❌ STI channel is all zeros!

--------------------------------------------------------------------------------
EXTRACTING EVENTS
--------------------------------------------------------------------------------
❌ No events found!

--------------------------------------------------------------------------------
CHECKING ANNOTATIONS
--------------------------------------------------------------------------------
❌ No annotations found!

--------------------------------------------------------------------------------
CHECKING EVENT_ID
--------


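The diagnostic above reports an all-zero STI channel and no annotations, so `mne.find_events` has nothing to extract from this file. One possible workaround, sketched here under assumptions, is to build an events array by hand from known onset times, for example those stored in the NIX file. The sampling rate, onsets, and codes below are hypothetical placeholders.

```python
import numpy as np

sfreq = 1000.0  # hypothetical sampling rate in Hz
onsets_s = np.array([1.0, 3.5, 6.25])  # hypothetical event onsets in seconds
codes = np.array([1, 2, 1])  # hypothetical integer event codes

# MNE events arrays have shape (n_events, 3):
# [sample index, previous value (usually 0), event code]
samples = np.round(onsets_s * sfreq).astype(int)
events = np.column_stack([samples, np.zeros_like(samples), codes])

print(events)
```

An array in this layout can be passed as `events` to `mne.Epochs`. If the recording has a non-zero `raw.first_samp`, the sample indices would likely need that offset added.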

In [None]:
# Save processed metadata
metadata_all.to_csv('subject_metadata_processed.csv', index=False)

print("✓ Saved processed metadata to 'subject_metadata_processed.csv'")

## Summary

This notebook demonstrated:

1. **Loading multi-session data**: Load and concatenate FIF files and CSV metadata from all sessions
2. **Metadata integration**: Attach behavioral metadata to epochs for advanced trial selection
3. **Flexible trial selection**: Use pandas-style queries to select trials by condition
4. **Condition-based ERP analysis**: Compare ERPs across accuracy, memory load, and match conditions
5. **Behavioral analysis**: Analyze RT and accuracy patterns
6. **Single-trial analysis**: Correlate neural activity with behavior on a trial-by-trial basis

### Key advantages of the metadata approach:

- **Flexible trial selection**: `epochs['correct == 1 and set_size >= 6']`
- **Easy subgroup analysis**: Compare conditions without re-epoching
- **Single-trial correlations**: Link neural activity to RT, accuracy, etc.
- **Preserved alignment**: Metadata rows are selected and dropped together with their epochs, so neural data and behavior stay aligned

### Next steps:

- Apply to multiple subjects
- Implement more sophisticated statistical tests
- Perform time-frequency analysis by condition
- Run source reconstruction
- Test hypotheses about early vs. late neural correlates of consciousness