# Complete EEG-LFP Preprocessing Pipeline

This notebook demonstrates the complete workflow from BIDS-format data to fully preprocessed data.

## Pipeline Overview
1. Data Inspection and Validation
2. General Cleaning (Detrending, Filtering, Resampling)
3. EEG Preprocessing (Bad Channel Detection, Re-referencing, ICA Artifact Removal, Epoching, Source Reconstruction)
4. LFP Preprocessing (Stimulus Artifact Removal, Electrode Management, Noise Reduction)
5. Joint Processing (Temporal Alignment, Frequency Band Decomposition, Normalization)
6. Quality Control and Saving

In [1]:
import pyprep, autoreject, mne_icalabel
print("pyprep version:", pyprep.__version__)
print("autoreject version:", autoreject.__version__)
print("mne-icalabel version:", mne_icalabel.__version__)



2025-11-19 17:08:39,524 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.


pyprep version: 0.5.0
autoreject version: 0.4.3
mne-icalabel version: 0.8.1
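
If one of these packages is missing, the `import` line in the cell above fails before any version is printed. A minimal sketch of a more tolerant check follows. It assumes the installed distribution names match the names printed above, and `installed_version` is a hypothetical helper that is not part of the pipeline.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("pyprep", "autoreject", "mne-icalabel"):
    print(f"{name} version: {installed_version(name) or 'not installed'}")
```

Reading the version from package metadata avoids importing the package itself, so a missing dependency is reported instead of raising.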


In [2]:
# Import required libraries
import sys
sys.path.append('/workspace/src')

import numpy as np
import mne
import matplotlib.pyplot as plt
from pathlib import Path


# Import preprocessing modules
from preprocessing import (
    DataValidator, EEGCleaner, LFPCleaner, EEGPreprocessor,
    LFPPreprocessor, JointPreprocessor, QualityControl, BIDSDerivativesSaver,
)

# Settings
mne.set_log_level('WARNING')
%matplotlib inline
plt.rcParams['figure.figsize'] = (12, 6)

## 1. Data Inspection and Validation

In [3]:
# Set the BIDS root directory
from pathlib import Path
import os

project_root = Path(os.getcwd())
print(f"Current working directory: {project_root}")

# Define paths (use relative paths)
DATA_ROOT = project_root / 'shared' / 'data' / 'raw'
RESULTS_ROOT = project_root / 'shared' / 'results'
BIDS_ROOT = project_root / 'shared' / 'data' / 'bids_dataset'

# Subject information
SUBJECT_ID = 'Wue01'
STIM_FREQ = 55  # Hz

bids_root = BIDS_ROOT  # same directory as BIDS_ROOT defined above

data_root = bids_root / 'derivatives' / 'mne-python'
subject = 'sub-001'
session = 'ses-01'
task = 'task-StimOn55HzFull2'


# Create the validator
validator = DataValidator(data_root)

# Run the full validation
validation_results = validator.run_full_validation(
    subject=subject,
    session=session,
    task=task,
    validate_lfp=True
)

# Generate the validation report
report = validator.generate_validation_report()
print(report)

2025-11-19 17:08:52,100 - preprocessing.data_validation - INFO - Starting data validation
2025-11-19 17:08:52,107 - preprocessing.data_io - INFO - Detected file format: fif (.fif)


Current working directory: /workspace


2025-11-19 17:08:53,496 - preprocessing.data_io - INFO - ✓ Loaded EEG data: fif format
2025-11-19 17:08:53,501 - preprocessing.data_io - INFO -   Sampling rate: 500.0 Hz
2025-11-19 17:08:53,503 - preprocessing.data_io - INFO -   Channels: 281
2025-11-19 17:08:53,506 - preprocessing.data_io - INFO -   Duration: 95.33 s
2025-11-19 17:08:53,507 - preprocessing.data_io - INFO - Detected file format: fif (.fif)
2025-11-19 17:08:53,528 - preprocessing.data_io - INFO - ✓ Loaded LFP data: fif format
2025-11-19 17:08:53,531 - preprocessing.data_io - INFO -   Sampling rate: 250.0 Hz
2025-11-19 17:08:53,532 - preprocessing.data_io - INFO -   Channels: 4
2025-11-19 17:08:53,534 - preprocessing.data_io - INFO -   Duration: 95.23 s
2025-11-19 17:08:53,541 - preprocessing.data_validation - INFO -   Resampling to a common sampling rate is recommended
2025-11-19 17:08:53,560 - preprocessing.data_validation - INFO - ✓ Metadata is consistent
2025-11-19 17:08:53,567 - preprocessing.data_validation - INFO - Validation complete (EEG + LFP)


EEG-LFP Data Validation Report

1. Sampling rate check
   EEG sampling rate: 500.0 Hz
   LFP sampling rate: 250.0 Hz
   Status: mismatch

2. Temporal alignment check
   Time offset: 0.000 ms
   Duration difference: 102.000 ms
   Status: alignment needed

3. Event synchronization check
   EEG events: N/A
   LFP events: N/A

4. Metadata consistency
   Status: passed
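
The first two checks in the report come down to simple arithmetic on sampling rate and sample count. The sketch below illustrates that arithmetic only; `compare_recordings` is a hypothetical helper, not the `DataValidator` implementation, and the sample counts are assumed values chosen to be consistent with the durations in the report.

```python
def compare_recordings(eeg_sfreq, eeg_n_samples, lfp_sfreq, lfp_n_samples):
    """Compare sampling rate and duration of two recordings."""
    eeg_duration = eeg_n_samples / eeg_sfreq  # seconds
    lfp_duration = lfp_n_samples / lfp_sfreq  # seconds
    return {
        "sfreq_match": eeg_sfreq == lfp_sfreq,
        "duration_diff_ms": abs(eeg_duration - lfp_duration) * 1000.0,
        # Resampling both streams to the lower rate avoids upsampling.
        "suggested_sfreq": min(eeg_sfreq, lfp_sfreq),
    }


# Hypothetical sample counts, consistent with the durations reported above.
check = compare_recordings(500.0, 47665, 250.0, 23807)
print(check["sfreq_match"])                 # False
print(round(check["duration_diff_ms"], 3))  # 102.0
print(check["suggested_sfreq"])             # 250.0
```

Choosing the lower of the two rates as the common target matches the cleaning step later in this notebook, where the EEG is downsampled to 250 Hz.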


In [4]:
# Load the data for use in later steps
eeg_raw, eeg_metadata = validator.load_eeg_data(subject, session, task)
lfp_raw, lfp_metadata = validator.load_lfp_data(subject, session, task)

print("\nEEG data info:")
print(eeg_raw.info)
print("\nLFP data info:")
print(lfp_raw.info)

2025-11-19 17:08:59,393 - preprocessing.data_io - INFO - Detected file format: fif (.fif)
2025-11-19 17:08:59,764 - preprocessing.data_io - INFO - ✓ Loaded EEG data: fif format
2025-11-19 17:08:59,767 - preprocessing.data_io - INFO -   Sampling rate: 500.0 Hz
2025-11-19 17:08:59,770 - preprocessing.data_io - INFO -   Channels: 281
2025-11-19 17:08:59,778 - preprocessing.data_io - INFO -   Duration: 95.33 s
2025-11-19 17:08:59,782 - preprocessing.data_io - INFO - Detected file format: fif (.fif)
2025-11-19 17:08:59,803 - preprocessing.data_io - INFO - ✓ Loaded LFP data: fif format
2025-11-19 17:08:59,806 - preprocessing.data_io - INFO -   Sampling rate: 250.0 Hz
2025-11-19 17:08:59,808 - preprocessing.data_io - INFO -   Channels: 4
2025-11-19 17:08:59,813 - preprocessing.data_io - INFO -   Duration: 95.23 s



EEG data info:
<Info | 12 non-empty values
 bads: []
 ch_names: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, ...
 chs: 281 EEG
 custom_ref_applied: False
 device_info: 1 item (dict)
 dig: 284 items (3 Cardinal, 281 EEG)
 file_id: 4 items (dict)
 highpass: 0.0 Hz
 lowpass: 250.0 Hz
 meas_date: 2023-11-04 04:28:41 UTC
 meas_id: 4 items (dict)
 nchan: 281
 projs: []
 sfreq: 500.0 Hz
 utc_offset: +0000
>

LFP data info:
<Info | 10 non-empty values
 bads: []
 ch_names: LFP_L, LFP_R, STIM_L, STIM_R
 chs: 2 sEEG, 2 Stimulus
 custom_ref_applied: False
 device_info: 2 items (dict)
 file_id: 4 items (dict)
 highpass: 0.0 Hz
 lowpass: 125.0 Hz
 meas_date: 2025-11-07 15:37:06 UTC
 meas_id: 4 items (dict)
 nchan: 4
 projs: []
 sfreq: 250.0 Hz
>


## 2. General Cleaning

In [5]:
# Create cleaners
eeg_cleaner = EEGCleaner()
lfp_cleaner = LFPCleaner()

# Clean EEG
print("\n=== Cleaning EEG ===")
eeg_raw_clean = eeg_cleaner.apply_eeg_cleaning(
    eeg_raw,
    target_sfreq=250.0,  # Downsample to match LFP
    line_freq=50.0,
    l_freq=1.0,
    h_freq=100.0
)
print(eeg_cleaner.get_processing_summary())

# Clean LFP (the parameters passed below are now respected)
print("\n=== Cleaning LFP ===")
lfp_raw_clean = lfp_cleaner.apply_lfp_cleaning(
    lfp_raw,
    target_sfreq=250.0,  # Keep at 250 Hz (or None to keep original)
    line_freq=50.0,
    l_freq=1.0,
    h_freq=100.0  # Now respects your parameter! (< 125 Hz Nyquist)
)
print(lfp_cleaner.get_processing_summary())

print("\n✅ Both EEG and LFP cleaned successfully!")

2025-11-19 17:09:03,775 - preprocessing.signal_cleaning - INFO - STARTING EEG CLEANING PIPELINE



=== Cleaning EEG ===


2025-11-19 17:09:05,263 - preprocessing.signal_cleaning - INFO - STANDARD CLEANING PIPELINE (EEG)
2025-11-19 17:09:05,265 - preprocessing.signal_cleaning - INFO - Original sampling rate: 500.0 Hz
2025-11-19 17:09:05,266 - preprocessing.signal_cleaning - INFO - Nyquist frequency: 250.0 Hz
2025-11-19 17:09:05,267 - preprocessing.signal_cleaning - INFO - 
[1] Resampling: 500.0 Hz → 250.0 Hz
2025-11-19 17:09:09,266 - preprocessing.signal_cleaning - INFO - ✓ Resampled to 250.0 Hz (new Nyquist: 125.0 Hz)
2025-11-19 17:09:09,267 - preprocessing.signal_cleaning - INFO - 
[2] Applying bandpass filter: 1.0-100.0 Hz
2025-11-19 17:09:10,821 - preprocessing.signal_cleaning - INFO - ✓ Applied bandpass filter: 1.0-100.0 Hz (fir)
2025-11-19 17:09:10,822 - preprocessing.signal_cleaning - INFO - 
[3] Applying notch filter at: [50.0, 100.0] Hz
2025-11-19 17:09:11,097 - preprocessing.signal_cleaning - INFO - ✓ Applied notch filter at: [50.0, 100.0] Hz
2025-11-19 17:09:11,100 - preprocessing.signal_cleanin

Processing History:
  1. resample_250.0Hz
  2. bandpass_1.0-100.0Hz
  3. notch_[50.0, 100.0]Hz


=== Cleaning LFP ===
Processing History:
  1. bandpass_1.0-100.0Hz
  2. notch_[50.0, 100.0]Hz


✅ Both EEG and LFP cleaned successfully!
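
The log above shows notch filters at [50.0, 100.0] Hz for a 50 Hz line frequency after resampling to 250 Hz. One plausible rule that reproduces this list is sketched below: keep each harmonic that lies below the Nyquist frequency and does not exceed the low-pass edge. This is an assumption about the selection logic, not the code inside `EEGCleaner`, and `notch_frequencies` is a hypothetical name.

```python
def notch_frequencies(line_freq, sfreq, h_freq=None):
    """List the line-noise harmonics that can be notched at a given sampling rate.

    Assumed rule: keep every multiple of line_freq that lies strictly below
    the Nyquist frequency and, if a low-pass edge is given, at or below it.
    """
    nyquist = sfreq / 2.0
    upper = nyquist if h_freq is None else min(nyquist, h_freq)
    freqs = []
    k = 1
    while k * line_freq < nyquist and k * line_freq <= upper:
        freqs.append(k * line_freq)
        k += 1
    return freqs


print(notch_frequencies(50.0, 250.0, h_freq=100.0))  # [50.0, 100.0]
print(notch_frequencies(50.0, 500.0))                # [50.0, 100.0, 150.0, 200.0]
```

At 250 Hz the third harmonic (150 Hz) is above the 125 Hz Nyquist frequency, which is why it cannot appear in the notch list.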


## 3. EEG-Specific Preprocessing

In [10]:
import importlib
import preprocessing.eeg_preprocessing

# Reload the module so that recent edits to the source file take effect
importlib.reload(preprocessing.eeg_preprocessing)
from preprocessing.eeg_preprocessing import EEGPreprocessor



In [11]:
preprocessor2 = EEGPreprocessor()

# Step 1: Detect bad channels (EEG channels only; REF CZ and misc channels are excluded)
print("\n=== Step 1: Bad channel detection ===")
eeg_temp2, bad_channels2 = preprocessor2.mark_bad_channels(
    eeg_raw_clean,
    method='pyprep',
    ransac=True,      # RANSAC enabled for this run; False is faster and more stable
    copy=True
)
print(f"Detected {len(bad_channels2)} bad channels: {bad_channels2}")

2025-11-19 17:39:43,988 - preprocessing.eeg_preprocessing - INFO - DETECTING BAD CHANNELS WITH PYPREP
2025-11-19 17:39:44,007 - preprocessing.eeg_preprocessing - INFO - Total channels: 281
2025-11-19 17:39:44,012 - preprocessing.eeg_preprocessing - INFO - EEG channels (by type): 281
2025-11-19 17:39:44,013 - preprocessing.eeg_preprocessing - INFO - EEG channels (filtered): 280


2025-11-19 17:39:44,122 - preprocessing.eeg_preprocessing - INFO - Channels for pyprep: 280
2025-11-19 17:39:44,125 - preprocessing.eeg_preprocessing - INFO - Detection methods: correlation, deviation, HF noise, RANSAC



=== Step 1: Bad channel detection ===


2025-11-19 17:39:44,694 - preprocessing.eeg_preprocessing - INFO - [1] Correlation detection...
2025-11-19 17:39:47,576 - preprocessing.eeg_preprocessing - INFO - [2] Deviation detection...
2025-11-19 17:39:47,886 - preprocessing.eeg_preprocessing - INFO - [3] HF noise detection...
2025-11-19 17:39:48,369 - preprocessing.eeg_preprocessing - INFO - [4] RANSAC detection...
2025-11-19 17:40:05,916 - preprocessing.eeg_preprocessing - INFO - 
2025-11-19 17:40:05,917 - preprocessing.eeg_preprocessing - INFO - DETECTION RESULTS
2025-11-19 17:40:05,919 - preprocessing.eeg_preprocessing - INFO - Found 39 bad channels:
2025-11-19 17:40:05,920 - preprocessing.eeg_preprocessing - INFO -   ['31', '18', '19', '228', '71', '168', '50', '245', '52', '173', '3', '167', '251', '56', '246', '243', '21', '55', '69', '70', '261', '166', '61', '58', '279', '190', '82', '174', '74', '179', '28', '22', '1', '172', '250', '27', '65', '60', '244']
2025-11-19 17:40:05,922 - preprocessing.eeg_preprocessing - INFO

Detected 39 bad channels: ['31', '18', '19', '228', '71', '168', '50', '245', '52', '173', '3', '167', '251', '56', '246', '243', '21', '55', '69', '70', '261', '166', '61', '58', '279', '190', '82', '174', '74', '179', '28', '22', '1', '172', '250', '27', '65', '60', '244']
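
To see what RANSAC contributes, the list above can be compared with the 22 channels reported by the `ransac=False` run in the next cell. The sketch below copies both lists from the notebook output and uses plain set arithmetic; the variable names are illustrative.

```python
# Bad channels reported with ransac=True (output above).
with_ransac = ['31', '18', '19', '228', '71', '168', '50', '245', '52', '173',
               '3', '167', '251', '56', '246', '243', '21', '55', '69', '70',
               '261', '166', '61', '58', '279', '190', '82', '174', '74', '179',
               '28', '22', '1', '172', '250', '27', '65', '60', '244']

# Bad channels reported with ransac=False (output of the next cell).
without_ransac = ['18', '228', '71', '168', '3', '56', '246', '243', '55', '69',
                  '70', '166', '58', '190', '179', '28', '22', '1', '172', '250',
                  '65', '60']

# Channels flagged in only one of the two runs, sorted numerically.
only_ransac = sorted(set(with_ransac) - set(without_ransac), key=int)
only_basic = sorted(set(without_ransac) - set(with_ransac), key=int)

print(len(only_ransac), only_ransac)
print(len(only_basic), only_basic)
```

Every channel found without RANSAC is also found with it, so in this recording RANSAC only adds channels (17 of them) and removes none.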


In [6]:
# ================================================
# Step-by-Step Preprocessing (WITH ICLabel & Autoreject)
# ================================================

preprocessor = EEGPreprocessor()

# Step 1: Detect bad channels (EEG channels only; REF CZ and misc channels are excluded)
print("\n=== Step 1: Bad channel detection ===")
eeg_temp, bad_channels = preprocessor.mark_bad_channels(
    eeg_raw_clean,
    method='pyprep',
    ransac=False,      # False = faster and more stable
    copy=True
)
print(f"Detected {len(bad_channels)} bad channels: {bad_channels}")

# Step 2: Interpolate bad channels
if bad_channels:
    print("\n=== Step 2: Interpolating bad channels ===")
    eeg_temp = preprocessor.interpolate_bad_channels(eeg_temp, copy=False)
else:
    print("\n=== Step 2: No bad channels to interpolate ===")

# Step 3: Re-reference (three-step method)
print("\n=== Step 3: Re-referencing (before ICA) ===")
print("✓ ICLabel recommends applying an average reference before ICA")
eeg_temp = preprocessor.apply_average_reference(
    eeg_temp,
    ref_channel='REF CZ',
    copy=False
)

# Step 4: ICA using infomax (recommended for ICLabel)
print("\n=== Step 4: ICA ===")
eeg_temp, ica = preprocessor.apply_ica(
    eeg_temp,
    n_components=30,
    method='infomax',
    copy=False
)

# Step 5: Detect artifacts automatically with ICLabel (no EOG/ECG channels required)
print("\n=== Step 5: ICLabel automatic artifact classification ===")
eeg_preprocessed = preprocessor.apply_ica_cleaning(
    eeg_temp,
    ica,
    auto_detect=True,
    use_iclabel=True,           # Use ICLabel (recommended)
    brain_threshold=0.5,        # Brain threshold
    artifact_threshold=0.5,     # Artifact threshold
    exclude_labels=['eye', 'heart', 'muscle', 'line_noise', 'channel_noise'],
    copy=False
)


=== Step 1: Bad channel detection ===


2025-11-19 17:11:46,566 - preprocessing.eeg_preprocessing - INFO - DETECTING BAD CHANNELS WITH PYPREP
2025-11-19 17:11:46,608 - preprocessing.eeg_preprocessing - INFO - Total channels: 281
2025-11-19 17:11:46,611 - preprocessing.eeg_preprocessing - INFO - EEG channels (by type): 281
2025-11-19 17:11:46,625 - preprocessing.eeg_preprocessing - INFO - EEG channels (filtered): 280
2025-11-19 17:11:47,574 - preprocessing.eeg_preprocessing - INFO - Channels for pyprep: 280
2025-11-19 17:11:47,576 - preprocessing.eeg_preprocessing - INFO - Detection methods: correlation, deviation, HF noise
2025-11-19 17:11:51,023 - preprocessing.eeg_preprocessing - INFO - [1] Correlation detection...
2025-11-19 17:11:54,128 - preprocessing.eeg_preprocessing - INFO - [2] Deviation detection...
2025-11-19 17:11:55,340 - preprocessing.eeg_preprocessing - INFO - [3] HF noise detection...
2025-11-19 17:11:56,306 - preprocessing.eeg_preprocessing - INFO - [4] RANSAC: skipped (not needed for filtered channels)
2025

Detected 22 bad channels: ['18', '228', '71', '168', '3', '56', '246', '243', '55', '69', '70', '166', '58', '190', '179', '28', '22', '1', '172', '250', '65', '60']

=== Step 2: Interpolating bad channels ===


2025-11-19 17:11:56,803 - preprocessing.eeg_preprocessing - INFO - ✓ Interpolated 22 channels
2025-11-19 17:11:56,812 - preprocessing.eeg_preprocessing - INFO - APPLYING AVERAGE REFERENCE
2025-11-19 17:11:56,815 - preprocessing.eeg_preprocessing - INFO - 
[Step 1] Re-reference to REF CZ
2025-11-19 17:11:56,915 - preprocessing.eeg_preprocessing - INFO - ✓ Re-referenced to REF CZ
2025-11-19 17:11:56,917 - preprocessing.eeg_preprocessing - INFO - 
[Step 2] Set REF CZ as 'misc'
  raw.set_channel_types({ref_channel: 'misc'})
2025-11-19 17:11:56,920 - preprocessing.eeg_preprocessing - INFO - ✓ REF CZ has been set to 'misc'
2025-11-19 17:11:56,921 - preprocessing.eeg_preprocessing - INFO - 
[Step 3] Average reference
2025-11-19 17:11:56,954 - preprocessing.eeg_preprocessing - INFO - ✓ Applied average reference



=== Step 3: Re-referencing (before ICA) ===
✓ ICLabel recommends applying an average reference before ICA


2025-11-19 17:11:57,198 - preprocessing.eeg_preprocessing - INFO - APPLYING ICA
2025-11-19 17:11:57,205 - preprocessing.eeg_preprocessing - INFO - Components: 30
2025-11-19 17:11:57,209 - preprocessing.eeg_preprocessing - INFO - Method: infomax
2025-11-19 17:11:57,210 - preprocessing.eeg_preprocessing - INFO - Fitting ICA...



=== Step 4: ICA ===


2025-11-19 17:12:05,065 - preprocessing.eeg_preprocessing - INFO - ✓ Fitted 30 components
2025-11-19 17:12:05,659 - preprocessing.eeg_preprocessing - INFO - Fraction of eeg variance explained by all components: 0.997 (99.7%)
2025-11-19 17:12:05,661 - preprocessing.eeg_preprocessing - INFO - 
2025-11-19 17:12:05,662 - preprocessing.eeg_preprocessing - INFO - DETECTING ARTIFACTS
2025-11-19 17:12:05,667 - preprocessing.eeg_preprocessing - INFO - Method: ICLabel automatic classification
2025-11-19 17:12:05,669 - preprocessing.eeg_preprocessing - INFO - 
2025-11-19 17:12:05,671 - preprocessing.eeg_preprocessing - INFO - CLASSIFYING ICA COMPONENTS WITH ICLabel
2025-11-19 17:12:05,676 - preprocessing.eeg_preprocessing - INFO - 
[1] Running ICLabel neural network classifier...
  ic_labels = label_components(raw, ica, method='iclabel')
  ic_labels = label_components(raw, ica, method='iclabel')



=== Step 5: ICLabel automatic artifact classification ===


2025-11-19 17:12:10,250 - preprocessing.eeg_preprocessing - INFO - Debug: labels_pred type: <class 'list'>
2025-11-19 17:12:10,252 - preprocessing.eeg_preprocessing - INFO - Debug: labels_pred length: 30
2025-11-19 17:12:10,253 - preprocessing.eeg_preprocessing - INFO - Debug: labels_pred_proba shape: (30,)
2025-11-19 17:12:10,256 - preprocessing.eeg_preprocessing - INFO - Detected 1D max probabilities (new mne-icalabel format)
2025-11-19 17:12:10,259 - preprocessing.eeg_preprocessing - INFO - ✓ Will use labels_pred directly for component selection
2025-11-19 17:12:10,260 - preprocessing.eeg_preprocessing - INFO -   (This is the correct behavior for newer versions)
2025-11-19 17:12:10,262 - preprocessing.eeg_preprocessing - INFO - 
[2] Classification results:
2025-11-19 17:12:10,266 - preprocessing.eeg_preprocessing - INFO - 
[3] Component details:
2025-11-19 17:12:10,273 - preprocessing.eeg_preprocessing - INFO -   IC 0: brain                (p=0.990)
2025-11-19 17:12:10,277 - preproc
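
The log states that the predicted labels are used directly for component selection. A minimal sketch of such a rule is shown below: a component is removed when its label is in `exclude_labels` and its probability reaches `artifact_threshold`. This is an assumption about how `apply_ica_cleaning` behaves, `select_components_to_exclude` is a hypothetical name, and the label strings follow the `exclude_labels` argument used in this notebook (the strings returned by mne-icalabel itself may differ and would need mapping).

```python
def select_components_to_exclude(labels, probabilities, exclude_labels,
                                 artifact_threshold=0.5):
    """Return indices of ICA components to remove.

    A component is excluded when its predicted label is in exclude_labels
    and the classifier's probability for that label reaches the threshold.
    """
    excluded = []
    for idx, (label, prob) in enumerate(zip(labels, probabilities)):
        if label in exclude_labels and prob >= artifact_threshold:
            excluded.append(idx)
    return excluded


# Hypothetical classifier output for five components.
labels = ["brain", "eye", "muscle", "other", "line_noise"]
probabilities = [0.99, 0.93, 0.41, 0.60, 0.77]
exclude_labels = ["eye", "heart", "muscle", "line_noise", "channel_noise"]

print(select_components_to_exclude(labels, probabilities, exclude_labels))  # [1, 4]
```

Component 2 is labelled as muscle but stays in the data because its probability is below the threshold, and component 3 stays because "other" is not in the exclusion list.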

In [None]:
# ================================================
# Step-by-Step Preprocessing (WITH ICLabel & Autoreject)
# ================================================

preprocessor = EEGPreprocessor()

# Step 1: Detect bad channels (EEG channels only; REF CZ and misc channels are excluded)
print("\n=== Step 1: Bad channel detection ===")
eeg_temp, bad_channels = preprocessor.mark_bad_channels(
    eeg_raw_clean,
    method='pyprep',
    ransac=False,      # False = faster and more stable
    copy=True
)
print(f"Detected {len(bad_channels)} bad channels: {bad_channels}")

# Step 2: Interpolate bad channels
if bad_channels:
    print("\n=== Step 2: Interpolating bad channels ===")
    eeg_temp = preprocessor.interpolate_bad_channels(eeg_temp, copy=False)
else:
    print("\n=== Step 2: No bad channels to interpolate ===")

# Step 3: Re-reference (three-step method)
print("\n=== Step 3: Re-referencing (before ICA) ===")
print("✓ ICLabel recommends applying an average reference before ICA")
eeg_temp = preprocessor.apply_average_reference(
    eeg_temp,
    ref_channel='REF CZ',
    copy=False
)

# Step 4: ICA using infomax (recommended for ICLabel)
print("\n=== Step 4: ICA ===")
eeg_temp, ica = preprocessor.apply_ica(
    eeg_temp,
    n_components=30,
    method='infomax',
    copy=False
)

# Step 5: Detect artifacts automatically with ICLabel (no EOG/ECG channels required)
print("\n=== Step 5: ICLabel automatic artifact classification ===")
eeg_preprocessed = preprocessor.apply_ica_cleaning(
    eeg_temp,
    ica,
    auto_detect=True,
    use_iclabel=True,           # Use ICLabel (recommended)
    brain_threshold=0.5,        # Brain threshold
    artifact_threshold=0.5,     # Artifact threshold
    exclude_labels=['eye', 'heart', 'muscle', 'line_noise', 'channel_noise'],
    copy=False
)

# Step 6: Create fixed-length epochs (optional)
print("\n=== Step 6: Creating epochs ===")
epochs = preprocessor.create_fixed_length_epochs(
    eeg_preprocessed,
    duration=2.0,       # 2-second epochs
    overlap=0.5,        # Overlap between consecutive epochs (s)
    copy=False
)

# Step 7: Clean epochs automatically with Autoreject (optional), using 'mark' mode
print("\n=== Step 7: Autoreject cleaning ===")
print("✓ Using 'mark' mode to keep all epochs")
epochs_clean, ar, reject_log, bad_epochs_idx = preprocessor.apply_autoreject(
    epochs,
    reject_mode='mark',  # 'mark': keep all epochs and only flag the bad ones
    n_jobs=2             # Number of parallel jobs (adjust to your CPU)
)

# Show the results
print("\n" + "="*60)
print("Preprocessing complete!")
print("="*60)
print(f"Total epochs: {len(epochs_clean)}")
print(f"Bad epochs: {len(bad_epochs_idx)}")
print(f"Bad epoch indices: {bad_epochs_idx}")
print(f"Good epochs: {len(epochs_clean) - len(bad_epochs_idx)}")

# Show the processing history
print(preprocessor.get_processing_summary())

2025-10-25 14:05:19,735 - preprocessing.eeg_preprocessing - INFO - DETECTING BAD CHANNELS WITH PYPREP


2025-10-25 14:05:19,754 - preprocessing.eeg_preprocessing - INFO - Total channels: 280
2025-10-25 14:05:19,756 - preprocessing.eeg_preprocessing - INFO - EEG channels (by type): 280
2025-10-25 14:05:19,758 - preprocessing.eeg_preprocessing - INFO - EEG channels (filtered): 280
2025-10-25 14:05:19,871 - preprocessing.eeg_preprocessing - INFO - Channels for pyprep: 280
2025-10-25 14:05:19,874 - preprocessing.eeg_preprocessing - INFO - Detection methods: correlation, deviation, HF noise



=== Step 1: Bad channel detection ===


2025-10-25 14:05:20,411 - preprocessing.eeg_preprocessing - INFO - [1] Correlation detection...
2025-10-25 14:05:24,923 - preprocessing.eeg_preprocessing - INFO - [2] Deviation detection...
2025-10-25 14:05:26,126 - preprocessing.eeg_preprocessing - INFO - [3] HF noise detection...
2025-10-25 14:05:26,567 - preprocessing.eeg_preprocessing - INFO - [4] RANSAC: skipped (not needed for filtered channels)
2025-10-25 14:05:26,568 - preprocessing.eeg_preprocessing - INFO - 
2025-10-25 14:05:26,569 - preprocessing.eeg_preprocessing - INFO - DETECTION RESULTS
2025-10-25 14:05:26,571 - preprocessing.eeg_preprocessing - INFO - Found 14 bad channels:
2025-10-25 14:05:26,571 - preprocessing.eeg_preprocessing - INFO -   ['19', '31', '256', '280', '64', '17', '267', '232', '33', '233', '274', '75', '57', '51']
2025-10-25 14:05:26,572 - preprocessing.eeg_preprocessing - INFO -   Correlation: ['17', '19', '31', '33', '51', '57', '64', '75', '232', '233', '256', '267', '274', '280']
2025-10-25 14:05:26

Detected 14 bad channels: ['19', '31', '256', '280', '64', '17', '267', '232', '33', '233', '274', '75', '57', '51']

=== Step 2: Interpolating bad channels ===


2025-10-25 14:05:27,245 - preprocessing.eeg_preprocessing - INFO - ✓ Interpolated 14 channels
2025-10-25 14:05:27,261 - preprocessing.eeg_preprocessing - INFO - APPLYING AVERAGE REFERENCE
2025-10-25 14:05:27,267 - preprocessing.eeg_preprocessing - INFO - Direct average reference
2025-10-25 14:05:27,377 - preprocessing.eeg_preprocessing - INFO - ✓ Applied average reference
2025-10-25 14:05:27,383 - preprocessing.eeg_preprocessing - INFO - APPLYING ICA
2025-10-25 14:05:27,385 - preprocessing.eeg_preprocessing - INFO - Components: 30
2025-10-25 14:05:27,388 - preprocessing.eeg_preprocessing - INFO - Method: infomax
2025-10-25 14:05:27,389 - preprocessing.eeg_preprocessing - INFO - Fitting ICA...



=== Step 3: Re-referencing (before ICA) ===
✓ ICLabel recommends applying an average reference before ICA

=== Step 4: ICA ===


2025-10-25 14:05:35,106 - preprocessing.eeg_preprocessing - INFO - ✓ Fitted 30 components
2025-10-25 14:05:35,110 - preprocessing.eeg_preprocessing - INFO - 
2025-10-25 14:05:35,111 - preprocessing.eeg_preprocessing - INFO - DETECTING ARTIFACTS
2025-10-25 14:05:35,111 - preprocessing.eeg_preprocessing - INFO - Method: ICLabel automatic classification
2025-10-25 14:05:35,112 - preprocessing.eeg_preprocessing - INFO -   (Does not require EOG/ECG channels)
2025-10-25 14:05:35,112 - preprocessing.eeg_preprocessing - INFO - 
2025-10-25 14:05:35,113 - preprocessing.eeg_preprocessing - INFO - CLASSIFYING ICA COMPONENTS WITH ICLabel
2025-10-25 14:05:35,113 - preprocessing.eeg_preprocessing - INFO - Note: This method does NOT require EOG/ECG channels
2025-10-25 14:05:35,113 - preprocessing.eeg_preprocessing - INFO -       It works with pure EEG data using learned patterns
2025-10-25 14:05:35,114 - preprocessing.eeg_preprocessing - INFO - 
[1] Running ICLabel neural network classifier...
  ic_la


=== Step 5: ICLabel automatic artifact classification ===


2025-10-25 14:05:42,594 - preprocessing.eeg_preprocessing - INFO - Debug: labels_pred type: <class 'list'>
2025-10-25 14:05:42,595 - preprocessing.eeg_preprocessing - INFO - Debug: labels_pred length: 30
2025-10-25 14:05:42,597 - preprocessing.eeg_preprocessing - INFO - Debug: labels_pred_proba shape: (30,)
2025-10-25 14:05:42,597 - preprocessing.eeg_preprocessing - INFO - Detected 1D max probabilities (new mne-icalabel format)
2025-10-25 14:05:42,599 - preprocessing.eeg_preprocessing - INFO - ✓ Will use labels_pred directly for component selection
2025-10-25 14:05:42,600 - preprocessing.eeg_preprocessing - INFO -   (This is the correct behavior for newer versions)
2025-10-25 14:05:42,601 - preprocessing.eeg_preprocessing - INFO - 
[2] Classification results:
2025-10-25 14:05:42,604 - preprocessing.eeg_preprocessing - INFO - 
[3] Component details:
2025-10-25 14:05:42,607 - preprocessing.eeg_preprocessing - INFO -   IC 0: brain                (p=0.944)
2025-10-25 14:05:42,608 - preproc


=== Step 6: Creating epochs ===

=== Step 7: Autoreject cleaning ===
✓ Using 'mark' mode to keep all epochs


2025-10-25 14:09:54,890 - preprocessing.eeg_preprocessing - INFO - ✓ AutoReject fitted
2025-10-25 14:09:54,892 - preprocessing.eeg_preprocessing - INFO - 
[2] Applying cleaning...
2025-10-25 14:10:11,170 - preprocessing.eeg_preprocessing - INFO - 
2025-10-25 14:10:11,171 - preprocessing.eeg_preprocessing - INFO - AUTOREJECT RESULTS
2025-10-25 14:10:11,173 - preprocessing.eeg_preprocessing - INFO - Total epochs: 63
2025-10-25 14:10:11,173 - preprocessing.eeg_preprocessing - INFO -   - Good/Interpolated: 62
2025-10-25 14:10:11,174 - preprocessing.eeg_preprocessing - INFO -   - Rejected: 1 (1.6%)
2025-10-25 14:10:11,175 - preprocessing.eeg_preprocessing - INFO - 
Rejected epoch indices: [0]
2025-10-25 14:10:11,176 - preprocessing.eeg_preprocessing - INFO - Rejected epoch numbers: [0]
2025-10-25 14:10:11,178 - preprocessing.eeg_preprocessing - INFO - 
Mode: MARK - Keeping all epochs, marking bad ones
2025-10-25 14:10:11,208 - preprocessing.eeg_preprocessing - INFO - ✓ All 63 epochs retaine


Preprocessing complete!
Total epochs: 63
Bad epochs: 1
Bad epoch indices: [0]
Good epochs: 62

Processing History:
  1. marked_14_bad
  2. interpolated_14
  3. avg_ref_direct
  4. ica_30
  5. removed_2_ica
  6. epochs_2.0s_63
  7. autoreject_1marked


In [6]:
# ================================================
# Complete EEG Preprocessing
# ================================================

from preprocessing.eeg_preprocessing import preprocess_eeg_complete

reject_mode = 'mark'  # 'mark' keeps all epochs and flags bad ones; 'drop' removes them

result = preprocess_eeg_complete(
    eeg_raw_clean,              # Filtered data

    # Bad channel detection
    detect_bad_channels=True,   # Use pyprep
    ransac=False,               # False = more stable (recommended)
    interpolate=True,           # Interpolate bad channels

    # ICA settings
    apply_ica=True,             # Apply ICA
    n_ica_components=30,        # Number of ICA components

    # ICLabel automatic classification (no EOG/ECG needed)
    use_iclabel=True,           # Use ICLabel (recommended)
    brain_threshold=0.5,        # Brain threshold
    artifact_threshold=0.5,     # Artifact threshold
    exclude_labels=['eye', 'heart', 'muscle', 'line_noise', 'channel_noise'],

    # Reference BEFORE ICA (ICLabel recommendation)
    apply_reference=True,       # Apply average reference
    ref_channel='REF CZ',       # Reference channel
    drop_reference_channel=False,  # Keep the REF channel in the data
    reference_before_ica=True,  # Apply BEFORE ICA

    # Epoch creation and cleaning
    create_epochs=True,         # Create fixed-length epochs
    epoch_overlap=0.5,          # 0.5-second overlap
    epoch_tmin=-0.5,            # Start 0.5 s before the event
    epoch_tmax=2.0,             # End 2.0 s after the event
    epoch_baseline=(-0.5, 0),   # Apply baseline correction

    # Autoreject cleaning
    apply_autoreject=True,      # Use Autoreject
    autoreject_reject_mode=reject_mode,  # Keep all epochs, mark bad ones
    autoreject_n_jobs=4         # Parallel jobs
)

# Access results
epochs = result['epochs']                    # Epochs before Autoreject
epochs_clean = result['epochs_clean']        # Clean epochs (after Autoreject)
bad_idx = result['bad_epochs_idx']           # Indices of bad epochs: [3, 7, 15, ...]
preprocessor = result['preprocessor']       # Preprocessor object
ar = result['autoreject']                    # AutoReject object
reject_log = result['reject_log']            # Rejection log

print(" EEG Preprocessing Completed!")
print(f"Bad Channels: {preprocessor.bad_channels}")
print(preprocessor.get_processing_summary())

# Inspect the Autoreject results
if epochs_clean is not None:
    print(f"  Total epochs: {len(epochs)}")
    print(f"  After cleaning: {len(epochs_clean)}")
    print(f"  Reject Rate: {100*(len(epochs)-len(epochs_clean))/len(epochs):.1f}%")



2025-10-28 01:33:26,447 - preprocessing.eeg_preprocessing - INFO - 
2025-10-28 01:33:26,449 - preprocessing.eeg_preprocessing - INFO - EEG PREPROCESSING PIPELINE (WITH ICLabel & Autoreject)
2025-10-28 01:33:26,451 - preprocessing.eeg_preprocessing - INFO - 
[STEP 1] Detecting bad channels
2025-10-28 01:33:26,458 - preprocessing.eeg_preprocessing - INFO - DETECTING BAD CHANNELS WITH PYPREP
2025-10-28 01:33:26,469 - preprocessing.eeg_preprocessing - INFO - Total channels: 281
2025-10-28 01:33:26,476 - preprocessing.eeg_preprocessing - INFO - EEG channels (by type): 280
2025-10-28 01:33:26,483 - preprocessing.eeg_preprocessing - INFO - EEG channels (filtered): 280
2025-10-28 01:33:26,541 - preprocessing.eeg_preprocessing - INFO - Channels for pyprep: 280
2025-10-28 01:33:26,543 - preprocessing.eeg_preprocessing - INFO - Detection methods: correlation, deviation, HF noise
2025-10-28 01:33:26,999 - preprocessing.eeg_preprocessing - INFO - [1] Correlation detection...
2025-10-28 01:33:30,381

 EEG Preprocessing Completed!
Bad Channels: ['1', '190', '70', '18', '179', '65', '71', '58', '166', '243', '172', '228', '22', '28', '168', '60', '56', '3', '246', '69', '55', '250']

Processing History:
  1. marked_22_bad
  2. interpolated_22
  3. avg_ref_via_REF CZ
  4. ica_30
  5. removed_14_ica
  6. epochs_2.5s_47
  7. autoreject_0marked
  Total epochs: 47
  After cleaning: 47
  Reject Rate: 0.0%
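
The processing history lists `avg_ref_via_REF CZ`, the last stage of the re-referencing step. An average reference subtracts, at every time point, the mean across channels from each channel. The sketch below shows only that final operation in plain Python on made-up numbers; it is not the `apply_average_reference` implementation, and it leaves out the preceding re-reference to REF CZ.

```python
def average_reference(data):
    """Subtract the across-channel mean from every sample.

    data is a list of channels, each a list of samples of equal length.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    referenced = [list(channel) for channel in data]
    for t in range(n_samples):
        mean_t = sum(channel[t] for channel in data) / n_channels
        for ch in range(n_channels):
            referenced[ch][t] -= mean_t
    return referenced


# Three hypothetical channels, two samples each (arbitrary units).
demo_data = [[1.0, 4.0], [2.0, 5.0], [6.0, 0.0]]
print(average_reference(demo_data))  # [[-2.0, 1.0], [-1.0, 2.0], [3.0, -3.0]]
```

After the operation the channels sum to zero at each time point, which is the defining property of an average reference.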


In [7]:
# Retrieve the results
eeg_preprocessed = result['raw']          # Cleaned continuous data
epochs = result['epochs']                  # Original epochs
epochs_clean = result['epochs_clean']      # Epochs after Autoreject
preprocessor = result['preprocessor']      # Preprocessor object
ica = result['ica']                        # ICA object
ar = result['autoreject']                  # AutoReject object
reject_log = result['reject_log']          # Rejection log

print("\n Preprocessing complete!")
print(f"Bad channels: {preprocessor.bad_channels}")
print(preprocessor.get_processing_summary())


# Inspect the Autoreject results
if epochs_clean is not None:
    print(f"\nAutoreject statistics:")
    print(f"  Total epochs: {len(epochs)}")
    print(f"  After cleaning: {len(epochs_clean)}")
    print(f"  Rejection rate: {100*(len(epochs)-len(epochs_clean))/len(epochs):.1f}%")



 Preprocessing complete!
Bad channels: ['228', '250', '60', '166', '190', '168', '58', '55', '1', '56', '18', '22', '246', '65', '243', '172', '179', '71', '28', '70', '3', '69']

Processing History:
  1. marked_22_bad
  2. interpolated_22
  3. avg_ref_via_REF CZ
  4. ica_30
  5. removed_14_ica
  6. epochs_2.5s_47
  7. autoreject_8marked

Autoreject statistics:
  Total epochs: 47
  After cleaning: 47
  Rejection rate: 0.0%
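
In 'mark' mode all epochs are kept, so the difference `len(epochs) - len(epochs_clean)` is zero even though the history reports `autoreject_8marked`. A rate based on the flagged indices is more informative in that case. The sketch below assumes the indices are available as a list, as in `result['bad_epochs_idx']`; `reject_rate_percent` is a hypothetical helper and the index values are made up.

```python
def reject_rate_percent(n_epochs, bad_epoch_indices):
    """Percentage of epochs flagged as bad, whether or not they were dropped."""
    if n_epochs == 0:
        return 0.0
    return 100.0 * len(set(bad_epoch_indices)) / n_epochs


# Hypothetical indices: eight flagged epochs out of 47.
bad_idx_example = [0, 3, 7, 15, 21, 30, 38, 46]
print(f"{reject_rate_percent(47, bad_idx_example):.1f}%")  # 17.0%
```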


In [7]:
# Check the epoch time window and baseline settings
print(f"tmin: {epochs_clean.tmin}")
print(f"tmax: {epochs_clean.tmax}")
print(f"Baseline: {epochs_clean.baseline}")



tmin: -0.5
tmax: 1.996
Baseline: (-0.5, 0.0)
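
The reported `tmax` is 1.996 s rather than the requested 2.0 s. At 250 Hz one sample lasts 0.004 s, so 1.996 s is the last sample before 2.0 s. A likely explanation, stated here as an assumption, is that the final sample is dropped so that consecutive epochs do not share a sample. The hypothetical helper below reproduces the time axis from integer sample indices.

```python
def epoch_times(tmin, tmax_requested, sfreq, include_endpoint=False):
    """Sample times of one epoch, computed from integer sample indices."""
    first = round(tmin * sfreq)
    last = round(tmax_requested * sfreq)
    if not include_endpoint:
        last -= 1  # drop the final sample so back-to-back epochs do not share it
    return [i / sfreq for i in range(first, last + 1)]


times = epoch_times(-0.5, 2.0, 250.0)
print(times[0], times[-1], len(times))  # -0.5 1.996 625
```

Each epoch therefore holds 625 samples, which is exactly 2.5 s at 250 Hz.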


In [None]:
from pathlib import Path
import mne

# Create the output directory
save_dir = Path('/workspace/shared/temp')
save_dir.mkdir(parents=True, exist_ok=True)

# Save the preprocessed epochs
epochs_file = save_dir / 'epochs_clean-epo.fif'
epochs_clean.save(epochs_file, overwrite=True)
print(f"✓ Epochs saved: {epochs_file}")

## Loading section
# from pathlib import Path
# import mne

# save_dir = Path('/workspace/shared/temp')
# epochs_clean = mne.read_epochs(save_dir / 'epochs_clean-epo.fif', preload=True)
# print(f"✓ Loaded {len(epochs_clean)} epochs")

✓ Epochs saved: /workspace/shared/temp/epochs_clean-epo.fif


In [5]:
# Load the saved epochs
from pathlib import Path
import mne

save_dir = Path('/workspace/shared/temp')
epochs_clean = mne.read_epochs(save_dir / 'epochs_clean-epo.fif', preload=True)
print(f"✓ Loaded {len(epochs_clean)} epochs")

✓ Loaded 47 epochs


## Source Reconstruction



In [9]:
from preprocessing.align_headmodel import make_trans_from_coordinates

# Path configuration
SUBJECT_NAME = 'Roessner_Gerhard'
EEG_FILE = DATA_ROOT / SUBJECT_NAME / 'eeg' / 'Stim_On_55Hz_Full2.mff'
HEAD_MODEL = DATA_ROOT / SUBJECT_NAME / 'headmodel_ROESSNER.mat'
COORD_XML = EEG_FILE / 'coordinates.xml'
OUT_TRANS = f"/workspace/shared/data/bids_dataset/derivatives/mne-python/sub-001/{SUBJECT_NAME}-trans.fif"

# Run ICP coregistration
trans, mean_mm, p95_mm = make_trans_from_coordinates(
    raw=epochs_clean,  # a Raw object can be passed instead
    coordinates_xml=COORD_XML,
    ft_headmodel_mat=HEAD_MODEL,
    out_trans_path=OUT_TRANS,
    max_iter=60,
    overwrite=True
)

print(f"Saved trans: {OUT_TRANS}")
print(f"Coarse alignment QA — mean distance: {mean_mm:.1f} mm, 95th pct: {p95_mm:.1f} mm")

Saved trans: /workspace/shared/data/bids_dataset/derivatives/mne-python/sub-001/Roessner_Gerhard-trans.fif
Coarse alignment QA — mean distance: 1.2 mm, 95th pct: 3.3 mm
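
The two QA numbers summarize the distances between the transformed electrode positions and the head surface. The sketch below shows one way such a summary can be computed, using linear interpolation for the percentile. The percentile convention used inside `make_trans_from_coordinates` is not shown in this notebook, so this is an assumption, and the distances are made up.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (q / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def alignment_summary(distances_mm):
    """Mean and 95th-percentile distance, in millimetres."""
    mean_value = sum(distances_mm) / len(distances_mm)
    return mean_value, percentile(distances_mm, 95)


# Hypothetical electrode-to-scalp distances in mm.
demo_mean, demo_p95 = alignment_summary([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"mean: {demo_mean:.1f} mm, 95th pct: {demo_p95:.1f} mm")
```

Reporting the 95th percentile next to the mean makes it easier to spot a few badly placed electrodes that the mean alone would hide.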


### BUG: forward calculations do not use the FEM head model

The current implementation does not use FEM head models for forward calculations. MNE does not natively support FEM forward solutions either, so the pipeline temporarily falls back to a spherical model.

### Possible solutions:
1. Install Duneuro (pyduneuro)
2. Use SimBio or OpenMEEG

In [8]:
import sys
import importlib

# Make sure the latest source reconstruction module is loaded
if 'preprocessing.source_reconstruction' in sys.modules:
    importlib.reload(sys.modules['preprocessing.source_reconstruction'])
else:
    importlib.import_module('preprocessing.source_reconstruction')

from preprocessing.source_reconstruction import run_source_reconstruction_pipeline

SUBJECT_NAME = 'Roessner_Gerhard'
OUT_TRANS = f"/workspace/shared/data/bids_dataset/derivatives/mne-python/sub-001/{SUBJECT_NAME}-trans.fif"

# To limit memory usage, the number of epochs used for source reconstruction can be capped
MAX_EPOCHS_FOR_SOURCE = 47

results = run_source_reconstruction_pipeline(
    epochs=epochs_clean,
    headmodel_file='/workspace/shared/data/raw/Roessner_Gerhard/headmodel_ROESSNER.mat',
    atlas_dir='/workspace/shared/data/raw/AAL3v2_for_SPM12/AAL3',
    trans_file=OUT_TRANS,
    method='sLORETA',
    lambda2=1.0/9.0,
    noise_cov_method='auto',
    noise_cov_reg=0.1,
    max_epochs=MAX_EPOCHS_FOR_SOURCE,
    random_state=42,
    n_jobs=2
)

epoch_subset_indices = results['epoch_indices']
print(f"Using {len(epoch_subset_indices)}/{len(epochs_clean)} epochs for source reconstruction")
print(f"✓ Number of sources: {results['fwd']['nsource']}")
print(f"✓ Number of ROIs: {len(results['roi_timeseries'])}")
print(f"Noise covariance strategy: {results['noise_cov_strategy']}")


INFO:preprocessing.source_reconstruction:
INFO:preprocessing.source_reconstruction:SOURCE RECONSTRUCTION PIPELINE v2.0
INFO:preprocessing.source_reconstruction:With Real Coregistration
INFO:preprocessing.source_reconstruction:
[STEP 0] Validating electrode positions...
INFO:preprocessing.source_reconstruction:  ✓ Found 280 EEG positions and 3 fiducials
INFO:preprocessing.source_reconstruction:Epochs to process: 47
INFO:preprocessing.source_reconstruction:
[STEP 0c] Preparing channel list for forward/inverse modelling...
INFO:preprocessing.source_reconstruction:  Dropping non-EEG channels: REF CZ
INFO:preprocessing.source_reconstruction:  EEG data marked with custom reference; creating average-reference copy
INFO:preprocessing.source_reconstruction:  Added average reference projector for inverse modelling
INFO:preprocessing.source_reconstruction:  Channels retained for modelling: 280
INFO:preprocessing.source_reconstruction:
[STEP 0.5] Loading head->MRI transform...
INFO:preprocessing.s

Using 47/47 epochs for source reconstruction
✓ Number of sources: 166
✓ Number of ROIs: 166
Noise covariance strategy: baseline
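
The call above uses `method='sLORETA'` with `lambda2=1.0/9.0`, which corresponds to the common convention `lambda2 = 1 / SNR**2` with an assumed SNR of 3. The sketch below builds a toy sLORETA kernel in NumPy to show what the regularisation and the standardisation step do. It assumes whitened data (identity noise covariance) and one fixed-orientation column per source, which is simpler than the 280 × 498 free-orientation leadfield in this notebook; `sloreta_kernel` is a hypothetical name.

```python
import numpy as np

def sloreta_kernel(leadfield, lambda2):
    """Toy sLORETA inverse kernel for whitened data and fixed orientations."""
    n_ch = leadfield.shape[0]
    gram = leadfield @ leadfield.T
    # Scale the regularisation to the average sensor-space power
    reg = lambda2 * np.trace(gram) / n_ch
    mne_kernel = leadfield.T @ np.linalg.inv(gram + reg * np.eye(n_ch))
    # sLORETA: standardise each source by the diagonal of the resolution matrix
    resolution_diag = np.einsum("ij,ji->i", mne_kernel, leadfield)
    return mne_kernel / np.sqrt(resolution_diag)[:, None]

snr = 3.0
lambda2 = 1.0 / snr ** 2

rng = np.random.default_rng(42)
G = rng.normal(size=(32, 100))      # toy leadfield: 32 sensors, 100 sources
K = sloreta_kernel(G, lambda2)

true_source = 17
estimate = K @ G[:, true_source]    # noise-free topography of a single source
peak = int(np.argmax(np.abs(estimate)))
print(peak == true_source)
```

For a single noise-free source the standardised estimate peaks at the true location, which is the property that usually motivates sLORETA over the plain minimum-norm estimate.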


In [18]:
# Access results
stc = results['stc']                      # Average source estimate
stcs_epochs = results['stcs_epochs']      # Per-epoch estimates
roi_timeseries = results['roi_timeseries']  # ROI time series

In [15]:
# Make sure the latest version of the validation module is loaded
if 'preprocessing.validate_source_reconstruction' in sys.modules:
    importlib.reload(sys.modules['preprocessing.validate_source_reconstruction'])
else:
    importlib.import_module('preprocessing.validate_source_reconstruction')

from preprocessing.validate_source_reconstruction import validate_preparation

# Basic validation
validate_preparation(epochs_clean)

# # Full validation
# validate_preparation(
#     epochs=epochs_clean,
#     trans_file=OUT_TRANS,  # the trans file written by the ICP coregistration cell
#     headmodel_file='/workspace/shared/data/raw/Roessner_Gerhard/headmodel_ROESSNER.mat',
#     coord_xml='/workspace/shared/data/raw/Roessner_Gerhard/eeg/Stim_On_55Hz_Full2.mff/coordinates.xml',
#     plot=True
# )


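Source reconstruction preparation validator v2.0

[Check 1/5] Basic epochs information
  Channels: 281
  Epochs: 47
  Sampling rate: 250.0 Hz
  Time window: [-0.500, 1.996] s
  ✓ Epochs information OK

[Check 2/5] Electrode positions (critical!)
  ✓ Found 280 EEG electrode positions
  ✓ Found 3 fiducials
  Position statistics:
    Center: [-0.0000, -0.0074, -0.0763] m
    Max radius: 0.1280 m
  ✓ Electrode position plot saved

[Check 3/5] Head-to-MRI transform (trans file)
  ⚠️  No trans file provided
  → Falling back to the identity transform (reduced accuracy)
  → Consider running ICP coregistration:
     from preprocessing.align_headmodel import make_trans_from_coordinates
     trans, mean_mm, p95_mm = make_trans_from_coordinates(...)

[Check 4/5] Head model
  ⚠️  No head model file provided

[Check 5/5] Data quality
  Data shape: (47, 281, 625)
  Data range: [-4.048929e-04, 4.556057e-04]
  Standard deviation: 6.581642e-05
  ✓ No NaN/Inf in data
  ✓ Data amplitude plausible

Validation summary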


True
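
Check 5/5 above looks at shape, finiteness and amplitude of the epochs array. A minimal NumPy sketch of that kind of check follows. The helper name `check_epochs_array` and the 1 mV amplitude ceiling are assumptions chosen for illustration, not values read from `validate_preparation`.

```python
import numpy as np

def check_epochs_array(data, max_abs_volts=1e-3):
    """Summarise an (n_epochs, n_channels, n_times) array given in volts."""
    report = {
        "shape": data.shape,
        "finite": bool(np.isfinite(data).all()),
        "peak": float(np.nanmax(np.abs(data))),
        "std": float(np.nanstd(data)),
    }
    # Plausible only if every sample is finite and below the assumed ceiling
    report["amplitude_ok"] = report["finite"] and report["peak"] < max_abs_volts
    return report

rng = np.random.default_rng(0)
good = rng.normal(scale=6.6e-5, size=(47, 8, 625))   # scale similar to the output above
bad = good.copy()
bad[0, 0, 0] = np.nan

good_report = check_epochs_array(good)
bad_report = check_epochs_array(bad)
print(good_report["amplitude_ok"], bad_report["amplitude_ok"])
```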

In [12]:
# Make sure the latest version of the validation module is loaded
if 'preprocessing.validate_source_reconstruction' in sys.modules:
    importlib.reload(sys.modules['preprocessing.validate_source_reconstruction'])
else:
    importlib.import_module('preprocessing.validate_source_reconstruction')
    
from preprocessing.validate_source_reconstruction import validate_results

validate_results(results)


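Source reconstruction results validator v1.0

[Check 1/5] Key fields of the results dictionary
  ✓ stc: weighted-average source time course (mne.SourceEstimate)
  ✓ stcs_epochs: list of per-epoch source time courses
  ✓ roi_timeseries: ROI-aggregated time series
  ✓ src: discrete source space definition
  ✓ fwd: forward model
  ✓ inv: inverse operator
  ✓ noise_covariance: noise covariance

[Check 2/5] Average source time course (stc)
  ❌ stc is not an mne.SourceEstimate

[Check 3/5] Per-epoch source time courses (stcs_epochs)
  ❌ Source time courses of the following epochs have problems: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

[Check 4/5] ROI time series
  ✓ Number of ROIs: 166, consistent time lengths

[Check 5/5] Metadata integrity
  ✓ Source space contains 1 region with 166 vertices in total
  ✓ Detected 166 ROI indices
  ✓ Forward model dimensions: 280 channels × 498 degrees of freedom
  ✓ Inverse operator created
  ✓ Noise covariance diagonal range: [4.581e-10, 2.592e-08]

  ✓ ROI time series preview saved as source_reconstruction_roi_preview.png

❌ The results check found problems; fix them as indicated above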


False
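
The validator rejects `stc` on a strict type check. One possible explanation, stated here as an assumption because the validator's source is not shown, is that estimates built on a discrete source space are returned by MNE as `mne.VolSourceEstimate`, which is not a subclass of `mne.SourceEstimate`. A structural check that only relies on the attributes shared by MNE source estimates (`data`, `times`, `vertices`) avoids that pitfall. The helper name below is hypothetical.

```python
from types import SimpleNamespace

import numpy as np

def looks_like_source_estimate(obj):
    """Duck-typed check: has data, times, vertices, and a matching time axis."""
    required = ("data", "times", "vertices")
    if not all(hasattr(obj, name) for name in required):
        return False
    # The last data axis is time in MNE source estimates
    return obj.data.shape[-1] == len(obj.times)

# Stand-ins for a 166-source, 625-sample estimate and for a broken result
fake_stc = SimpleNamespace(
    data=np.zeros((166, 625)),
    times=np.arange(625) / 250.0 - 0.5,
    vertices=[np.arange(166)],
)
not_stc = {"data": np.zeros((166, 625))}

ok_fake = looks_like_source_estimate(fake_stc)
ok_dict = looks_like_source_estimate(not_stc)
print(ok_fake, ok_dict)
```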

In [16]:
# 1. Check forward solution
fwd = results['fwd']
leadfield = fwd['sol']['data']

print("Forward solution check:")
print(f"  Leadfield shape: {leadfield.shape}")
print(f"  Leadfield range: [{np.min(leadfield):.2e}, {np.max(leadfield):.2e}]")
print(f"  Leadfield mean: {np.mean(leadfield):.2e}")
print(f"  Leadfield std: {np.std(leadfield):.2e}")

# 2. Check noise covariance
inv = results['inv']
noise_cov = results['noise_covariance']

print("\nNoise covariance check:")
print(f"  Covariance shape: {noise_cov['data'].shape}")
print(f"  Covariance range: [{np.min(noise_cov['data']):.2e}, {np.max(noise_cov['data']):.2e}]")
print(f"  Diagonal mean: {np.mean(np.diag(noise_cov['data'])):.2e}")

# 3. Check epochs data range
epochs_data = epochs_clean.get_data()
print("\nEpochs data check:")
print(f"  Data shape: {epochs_data.shape}")
print(f"  Data range: [{np.min(epochs_data):.2e}, {np.max(epochs_data):.2e}]")
print(f"  Data mean: {np.mean(epochs_data):.2e}")
print(f"  Data std: {np.std(epochs_data):.2e}")

# 4. Check baseline
baseline_data = epochs_clean.copy().crop(tmin=-0.5, tmax=0).get_data()
print("\nBaseline data check:")
print(f"  Baseline shape: {baseline_data.shape}")
print(f"  Baseline std: {np.std(baseline_data):.2e}")

Forward solution check:
  Leadfield shape: (280, 498)
  Leadfield range: [-4.76e+02, 5.92e+02]
  Leadfield mean: 1.20e+00
  Leadfield std: 3.96e+01

Noise covariance check:
  Covariance shape: (280, 280)
  Covariance range: [-1.52e-08, 2.59e-08]
  Diagonal mean: 4.76e-09

Epochs data check:
  Data shape: (47, 281, 625)
  Data range: [-4.05e-04, 4.56e-04]
  Data mean: -9.50e-16
  Data std: 6.58e-05

Baseline data check:
  Baseline shape: (47, 281, 126)
  Baseline std: 6.57e-05
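
The diagnostics above print ranges and means only. Because the pipeline log reports an average-reference projector, the noise covariance is also expected to lose one rank, which range statistics do not reveal. The sketch below adds symmetry, minimum eigenvalue and numerical rank checks. The helper name `covariance_health` and the relative tolerance of `1e-10` are assumptions.

```python
import numpy as np

def covariance_health(cov, rel_tol=1e-10):
    """Report symmetry error, smallest eigenvalue and numerical rank."""
    symmetry_error = float(np.max(np.abs(cov - cov.T)))
    eigvals = np.linalg.eigvalsh((cov + cov.T) / 2.0)
    # Eigenvalues below rel_tol * largest are treated as zero
    rank = int(np.sum(eigvals > rel_tol * eigvals.max()))
    return {
        "symmetry_error": symmetry_error,
        "min_eig": float(eigvals.min()),
        "rank": rank,
    }

rng = np.random.default_rng(0)
data = rng.normal(size=(8, 5000))
data -= data.mean(axis=0, keepdims=True)   # average reference removes one degree of freedom
cov = np.cov(data)

health = covariance_health(cov)
print(health["rank"], "of", cov.shape[0])
```

With real results, `results['noise_covariance']['data']` from the cell above could be passed in place of the toy `cov`.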


## 4. LFP-Specific Preprocessing

In [18]:
# Create the LFP preprocessor
lfp_prep = LFPPreprocessor()

# 4.1 Parse electrode contacts
print("\n=== Parse electrode contacts ===")
electrode_info = lfp_prep.parse_electrode_contacts(lfp_raw_clean)

print(f"\nLeft electrodes: {electrode_info['left']}")
print(f"Right electrodes: {electrode_info['right']}")

INFO:preprocessing.lfp_preprocessing:Parsing electrode contact information...
INFO:preprocessing.lfp_preprocessing:✓ Left electrode: 2 contacts
INFO:preprocessing.lfp_preprocessing:✓ Right electrode: 2 contacts



=== Parse electrode contacts ===

Left electrodes: ['LFP_L', 'STIM_L']
Right electrodes: ['LFP_R', 'STIM_R']
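
The grouping printed above can be reproduced from channel names alone. The sketch below assumes that the hemisphere is encoded as a trailing `_L` or `_R` suffix, as in the four channel names shown; how `parse_electrode_contacts` actually decides is not visible here, and `group_by_hemisphere` is a hypothetical name.

```python
def group_by_hemisphere(ch_names):
    """Split channel names into left/right lists based on a _L/_R suffix."""
    groups = {"left": [], "right": []}
    for name in ch_names:
        suffix = name.rsplit("_", 1)[-1].upper()
        if suffix == "L":
            groups["left"].append(name)
        elif suffix == "R":
            groups["right"].append(name)
        # Names without a recognised suffix are ignored
    return groups

demo_groups = group_by_hemisphere(["LFP_L", "LFP_R", "STIM_L", "STIM_R", "ECG"])
print(demo_groups)
```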


In [18]:
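# 4.2 Remove stimulation artifacts (only if DBS stimulation is present)
has_stimulation = False  # set to True if the recording contains stimulation

if has_stimulation:
    print("\n=== Remove stimulation artifacts ===")
    
    # Extract the stimulation events
    stim_events = None  # must be extracted from the data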
    
    lfp_raw_clean = lfp_prep.remove_stimulation_artifacts(
        lfp_raw_clean,
        stim_events=stim_events,
        method='template',
        window=(-0.005, 0.01),
        copy=False
    )
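
The cell above selects `method='template'` with a window of -5 ms to +10 ms around each pulse. A minimal sketch of template subtraction in NumPy follows: cut a segment around every pulse, average the segments into a template, and subtract that template at each pulse. This illustrates the general technique under the assumption of stable pulse shapes; it is not the code of `remove_stimulation_artifacts`, and the pulse shape and sample positions are made up.

```python
import numpy as np

def subtract_stim_template(signal, stim_samples, sfreq, window=(-0.005, 0.01)):
    """Subtract the average peri-stimulus waveform at every pulse."""
    start = int(round(window[0] * sfreq))
    stop = int(round(window[1] * sfreq))
    # Keep only pulses whose window lies fully inside the recording
    valid = [s for s in stim_samples if s + start >= 0 and s + stop <= len(signal)]
    if not valid:
        return signal.copy()
    segments = np.stack([signal[s + start:s + stop] for s in valid])
    template = segments.mean(axis=0)
    cleaned = signal.copy()
    for s in valid:
        cleaned[s + start:s + stop] -= template
    return cleaned

sfreq = 1000.0
rng = np.random.default_rng(0)
signal = 0.01 * rng.normal(size=5000)          # background activity
stim_samples = np.arange(100, 4900, 100)       # hypothetical pulse onsets
artifact = np.hanning(15)                      # stand-in pulse shape, 15 ms long
for s in stim_samples:
    signal[s - 5:s + 10] += artifact

cleaned = subtract_stim_template(signal, stim_samples, sfreq)
print(np.abs(signal).max().round(2), np.abs(cleaned).max().round(2))
```

The template also contains the average of the background activity inside the window, so any neural response that is phase-locked to the pulses is removed together with the artifact.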

In [19]:
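# 4.3 Apply bipolar referencing
print("\n=== Apply bipolar referencing ===")
lfp_raw_bipolar = lfp_prep.apply_bipolar_reference(
    lfp_raw_clean,
    copy=True  # keep the monopolar data
)


=== Apply bipolar referencing ===


INFO:preprocessing.lfp_preprocessing:✓ Bipolar reference applied, 2 bipolar channels created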
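
A bipolar derivation is the sample-wise difference between two contacts, which cancels whatever both contacts share. The sketch below shows this with NumPy. Pairing `LFP_L` with `STIM_L` and `LFP_R` with `STIM_R` is an assumption based on the contact lists printed earlier, and `bipolar_derivation` is a hypothetical helper; on an MNE `Raw` object, `mne.set_bipolar_reference` with `anode` and `cathode` lists does the equivalent.

```python
import numpy as np

def bipolar_derivation(data, ch_names, pairs):
    """Return anode-minus-cathode signals and their channel names."""
    index = {name: i for i, name in enumerate(ch_names)}
    out = np.stack([data[index[a]] - data[index[c]] for a, c in pairs])
    names = [f"{a}-{c}" for a, c in pairs]
    return out, names

rng = np.random.default_rng(0)
common = rng.normal(size=1000)          # noise shared by all contacts
local = 0.1 * rng.normal(size=1000)     # activity seen by one contact only
data = np.vstack([common + local, common, common, common])
names = ["LFP_L", "STIM_L", "LFP_R", "STIM_R"]

bipolar, bipolar_names = bipolar_derivation(
    data, names, [("LFP_L", "STIM_L"), ("LFP_R", "STIM_R")]
)
print(bipolar_names, bipolar.shape)
```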


In [20]:
# 4.4 Improve the signal-to-noise ratio
print("\n=== Improve SNR ===")
lfp_raw_enhanced = lfp_prep.enhance_snr(
    lfp_raw_bipolar,
    method='car',
    copy=True
)

INFO:preprocessing.lfp_preprocessing:✓ Common average reference (CAR) applied



=== Improve SNR ===
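
The log above reports a common average reference applied to the two bipolar channels. Assuming CAR here means subtracting the across-channel mean at every sample (the implementation of `enhance_snr` is not shown), two channels turn into exact mirror images of each other, which is worth keeping in mind before any left/right comparison. The sketch below demonstrates the effect on random data.

```python
import numpy as np

def common_average_reference(data):
    """Subtract the across-channel mean from every channel."""
    return data - data.mean(axis=0, keepdims=True)

rng = np.random.default_rng(1)
two_channels = rng.normal(size=(2, 1000))      # independent toy signals
car = common_average_reference(two_channels)

# With two channels: car[0] = (x0 - x1) / 2 and car[1] = -(x0 - x1) / 2
corr = float(np.corrcoef(car)[0, 1])
print(round(corr, 6))
```

With only two channels, the re-referenced pair carries a single independent signal, so CAR is usually reserved for montages with many contacts.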


In [21]:
# 4.5 Optional: wavelet denoising or smoothing
use_wavelet_denoising = False  # optional
use_smoothing = False  # optional

if use_wavelet_denoising:
    print("\n=== Wavelet denoising ===")
    lfp_raw_enhanced = lfp_prep.apply_wavelet_denoising(
        lfp_raw_enhanced,
        wavelet='db4',
        level=4,
        copy=False
    )

if use_smoothing:
    print("\n=== Smoothing ===")
    lfp_raw_enhanced = lfp_prep.apply_smoothing(
        lfp_raw_enhanced,
        window_length=11,
        polyorder=3,
        copy=False
    )

print(lfp_prep.get_processing_summary())

LFP processing steps:
  1. parse_electrode_contacts
  2. apply_bipolar_reference
  3. enhance_snr_car



In [None]:
# 4.6 Create LFP epochs aligned to the EEG events
print("\n=== Create LFP epochs aligned to the EEG events ===")

# Create the joint preprocessor here because this cell runs before Section 5
joint_prep = JointPreprocessor()

(
    eeg_epochs_sync,
    lfp_epochs_sync,
    kept_epoch_indices,
    dropped_epoch_indices,
) = joint_prep.align_lfp_to_eeg_epochs(
    eeg_epochs=epochs_clean,
    lfp_raw=lfp_raw_enhanced,
    preload=True,
    drop_bad_from_eeg=True,
)

print(f"Kept {len(eeg_epochs_sync)} / {len(epochs_clean)} epochs")
if len(dropped_epoch_indices) > 0:
    print(f"Dropped EEG epoch indices: {dropped_epoch_indices.tolist()}")

# -----------------------------------------------------------------------------------------------------------------------------------------------------------------------

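In [None]:
# 4.6 (alternative) Create LFP epochs directly with mne.Epochs
print("\n=== Create LFP epochs ===")

# Reuse the events of the cleaned EEG epochs. This assumes that the LFP recording
# shares the EEG sampling rate and sample origin, which is not verified here.
events = epochs_clean.events
event_id = epochs_clean.event_id

lfp_epochs = mne.Epochs(
    lfp_raw_enhanced,
    events,
    event_id,
    tmin=-0.5,
    tmax=1.5,
    baseline=(-0.5, 0),
    preload=True,
    verbose=False
)

print(f"Kept {len(lfp_epochs)} epochs")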
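
Epoching a continuous recording comes down to slicing a fixed window around each event sample, dropping events whose window runs off the recording, and subtracting the baseline mean. The NumPy sketch below shows those three steps with the same `tmin`, `tmax` and `baseline` values as the cell above. The function name `extract_epochs` and the event positions are hypothetical, and the sketch assumes at least one event survives.

```python
import numpy as np

def extract_epochs(data, event_samples, sfreq, tmin=-0.5, tmax=1.5, baseline=(-0.5, 0.0)):
    """Cut (n_channels, n_times) data into baseline-corrected epochs."""
    start = int(round(tmin * sfreq))
    stop = int(round(tmax * sfreq)) + 1          # include the tmax sample
    b0 = int(round((baseline[0] - tmin) * sfreq))
    b1 = int(round((baseline[1] - tmin) * sfreq)) + 1
    kept, epochs = [], []
    for i, s in enumerate(event_samples):
        if s + start < 0 or s + stop > data.shape[1]:
            continue                             # window leaves the recording: drop
        seg = data[:, s + start:s + stop].astype(float)
        seg -= seg[:, b0:b1].mean(axis=1, keepdims=True)
        epochs.append(seg)
        kept.append(i)
    return np.stack(epochs), kept

sfreq = 250.0
rng = np.random.default_rng(0)
lfp = rng.normal(size=(2, 5000)) + 3.0           # toy data with a DC offset
event_samples = [50, 1000, 2000, 4900]           # first and last are too close to the edges

epochs_arr, kept = extract_epochs(lfp, event_samples, sfreq)
print(epochs_arr.shape, kept)
```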

## 5. Joint Processing

In [None]:
# Create the joint preprocessor
joint_prep = JointPreprocessor()

# 5.1 Align time windows
print("\n=== Align time windows ===")
eeg_aligned, lfp_aligned = joint_prep.align_time_windows(
    eeg_raw_clean,
    lfp_raw_enhanced,
    crop_to='shorter'
)

In [None]:
# 5.2 Synchronize epochs
print("\n=== Synchronize epochs ===")
eeg_epochs_sync, lfp_epochs_sync = joint_prep.synchronize_epochs(
    eeg_epochs,
    lfp_epochs,
    tolerance=0.001
)
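
Synchronising two epoch sets with `tolerance=0.001` suggests pairing events whose onset times differ by at most 1 ms. The sketch below shows one way to do that with a sorted search. It is an assumption about what `synchronize_epochs` does, written with hypothetical names and made-up onset times.

```python
import numpy as np

def match_events(eeg_times, lfp_times, tolerance=0.001):
    """Return (eeg_index, lfp_index) pairs whose times agree within tolerance."""
    eeg_times = np.asarray(eeg_times, dtype=float)
    lfp_times = np.asarray(lfp_times, dtype=float)
    order = np.argsort(lfp_times)
    sorted_lfp = lfp_times[order]
    positions = np.searchsorted(sorted_lfp, eeg_times)
    pairs = []
    for i, (t, p) in enumerate(zip(eeg_times, positions)):
        # The nearest LFP event is either just before or just after position p
        candidates = [c for c in (p - 1, p) if 0 <= c < len(sorted_lfp)]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(sorted_lfp[c] - t))
        if abs(sorted_lfp[best] - t) <= tolerance:
            pairs.append((i, int(order[best])))
    return pairs

pairs = match_events([1.0, 2.0, 3.0], [1.0004, 2.5, 3.0009])
print(pairs)
```

Epochs without a partner inside the tolerance (the second EEG event here) would be dropped from both sets.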

In [None]:
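# 5.3 Frequency band decomposition
print("\n=== Frequency band decomposition ===")

# Define the frequency bands of interest (Hz)
freq_bands = {
    'theta': (4, 8),
    'alpha': (8, 13),
    'low_beta': (13, 20),   # beta split at 20 Hz is an assumption; adjust to your convention
    'high_beta': (20, 30),
    'gamma': (30, 100)
}

# Extract the EEG frequency bands
eeg_bands = joint_prep.extract_frequency_bands(
    eeg_aligned,
    bands=freq_bands,
    method='filter'
)

# Extract the LFP frequency bands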
lfp_bands = joint_prep.extract_frequency_bands(
    lfp_aligned,
    bands=freq_bands,
    method='filter'
)
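
Band decomposition means producing one band-limited copy of the signal per entry in `freq_bands`. The sketch below does this with a brick-wall FFT mask, which is a deliberately simple stand-in and not the FIR or IIR filtering that `method='filter'` presumably applies. Bands are treated as half-open intervals `[low, high)` so that adjacent bands never share a frequency bin.

```python
import numpy as np

def fft_bandpass(x, sfreq, low, high):
    """Zero every FFT bin outside [low, high) and transform back."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sfreq)
    spectrum[(freqs < low) | (freqs >= high)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

sfreq = 250.0
t = np.arange(1000) / sfreq                      # 4 s of data
alpha_part = np.sin(2 * np.pi * 10 * t)
gamma_part = 0.5 * np.sin(2 * np.pi * 40 * t)
x = alpha_part + gamma_part

demo_bands = {"alpha": (8, 13), "gamma": (30, 100)}
decomposed = {name: fft_bandpass(x, sfreq, lo, hi) for name, (lo, hi) in demo_bands.items()}
print({name: round(float(np.std(sig)), 3) for name, sig in decomposed.items()})
```

Brick-wall masks ring on real data with sharp transients, which is one reason windowed FIR filters are normally preferred.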

In [None]:
# 5.4 Compute band power
print("\n=== Compute band power ===")

eeg_band_power = joint_prep.compute_band_power(
    eeg_epochs_sync,
    bands=freq_bands,
    method='welch'
)

lfp_band_power = joint_prep.compute_band_power(
    lfp_epochs_sync,
    bands=freq_bands,
    method='welch'
)

# Show the shape of each band power array
for band in freq_bands.keys():
    print(f"  EEG {band}: {eeg_band_power[band].shape}")
    print(f"  LFP {band}: {lfp_band_power[band].shape}")
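
Welch band power averages windowed periodograms over overlapping segments and then integrates the spectrum across each band. The sketch below is a compact NumPy version with a Hann window and 50 % overlap. The segment length of 256 samples and the function names are assumptions; how `compute_band_power` is parameterised is not shown.

```python
import numpy as np

def welch_psd(x, sfreq, nperseg=256):
    """One-sided Welch PSD with a Hann window and 50 % overlap."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = sfreq * np.sum(window ** 2)
    segments = [x[i:i + nperseg] for i in range(0, len(x) - nperseg + 1, step)]
    psds = [np.abs(np.fft.rfft((s - s.mean()) * window)) ** 2 / scale for s in segments]
    psd = np.mean(psds, axis=0)
    psd[1:-1] *= 2.0                             # fold negative frequencies (nperseg even)
    return np.fft.rfftfreq(nperseg, d=1.0 / sfreq), psd

def band_power(freqs, psd, low, high):
    """Integrate the PSD over [low, high)."""
    mask = (freqs >= low) & (freqs < high)
    return float(np.sum(psd[mask]) * (freqs[1] - freqs[0]))

sfreq = 250.0
t = np.arange(2500) / sfreq
x = np.sin(2 * np.pi * 10 * t)                   # unit-amplitude 10 Hz sine, power 0.5

freqs, psd = welch_psd(x, sfreq)
alpha_power = band_power(freqs, psd, 8, 13)
theta_power = band_power(freqs, psd, 4, 8)
print(round(alpha_power, 3), theta_power < alpha_power)
```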

In [None]:
# 5.5 Normalization
print("\n=== Normalization ===")

# Normalize the band power
for band in freq_bands.keys():
    eeg_band_power[band] = joint_prep.normalize_signals(
        eeg_band_power[band],
        method='zscore',
        axis=-1
    )
    
    lfp_band_power[band] = joint_prep.normalize_signals(
        lfp_band_power[band],
        method='zscore',
        axis=-1
    )
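
Z-scoring along `axis=-1` removes the mean and scales to unit variance along the last axis only. What that last axis represents depends on the band power layout, which is not shown in this notebook: if the arrays are `(n_epochs, n_channels)`, the normalisation runs across channels within each epoch. The sketch below assumes nothing about the layout and adds a guard against constant rows; `zscore` is a hypothetical helper.

```python
import numpy as np

def zscore(x, axis=-1, eps=1e-12):
    """Z-score along one axis; constant slices map to zeros instead of NaN."""
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / np.maximum(std, eps)

rng = np.random.default_rng(0)
power = rng.gamma(shape=2.0, size=(47, 4))       # toy band power, 47 epochs x 4 channels
power[0] = 1.0                                   # one constant row

z = zscore(power, axis=-1)
print(z.mean(axis=-1)[:3].round(6), z.std(axis=-1)[:3].round(6))
```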

In [None]:
# 5.6 Prepare data for connectivity analysis
print("\n=== Prepare connectivity data ===")

connectivity_data = joint_prep.prepare_connectivity_data(
    eeg_epochs_sync,
    lfp_epochs_sync
)

print(f"Connectivity data shape: {connectivity_data['data'].shape}")
print(f"Number of EEG channels: {connectivity_data['n_eeg']}")
print(f"Number of LFP channels: {connectivity_data['n_lfp']}")

print(joint_prep.get_processing_summary())
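
The keys printed above (`data`, `n_eeg`, `n_lfp`) suggest that the connectivity input stacks both modalities along the channel axis. The sketch below shows that layout with NumPy, assuming both inputs are `(n_epochs, n_channels, n_times)` arrays with matching epoch count and length. It is an illustration, not the code of `prepare_connectivity_data`.

```python
import numpy as np

def stack_modalities(eeg, lfp):
    """Concatenate EEG and LFP epochs along the channel axis."""
    if eeg.shape[0] != lfp.shape[0] or eeg.shape[2] != lfp.shape[2]:
        raise ValueError("EEG and LFP epochs must share epoch count and length")
    return {
        "data": np.concatenate([eeg, lfp], axis=1),
        "n_eeg": eeg.shape[1],
        "n_lfp": lfp.shape[1],
    }

eeg_demo = np.zeros((47, 280, 625))
lfp_demo = np.ones((47, 2, 625))
stacked = stack_modalities(eeg_demo, lfp_demo)
print(stacked["data"].shape, stacked["n_eeg"], stacked["n_lfp"])
```

Keeping the channel counts next to the array makes it easy to slice the EEG-to-LFP block out of a connectivity matrix later.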

## 6. Quality Control and Saving

In [None]:
# Create the quality control helper
output_dir = './qc_outputs'
qc = QualityControl(output_dir=output_dir)

# 6.1 Plot the power spectrum comparison
print("\n=== Generate quality control plots ===")

qc.plot_psd_comparison(
    eeg_raw_orig,
    eeg_raw_clean,
    save_path=f'{output_dir}/eeg_psd_comparison.png'
)

qc.plot_psd_comparison(
    lfp_raw_orig,
    lfp_raw_enhanced,
    save_path=f'{output_dir}/lfp_psd_comparison.png'
)

In [None]:
# 6.2 Plot the signal comparison
qc.plot_signal_comparison(
    eeg_raw_orig,
    eeg_raw_clean,
    duration=5.0,
    channel_idx=0,
    save_path=f'{output_dir}/eeg_signal_comparison.png'
)

qc.plot_signal_comparison(
    lfp_raw_orig,
    lfp_raw_enhanced,
    duration=5.0,
    channel_idx=0,
    save_path=f'{output_dir}/lfp_signal_comparison.png'
)

In [None]:
# 6.3 Plot epoch quality
qc.plot_epochs_quality(
    eeg_epochs_sync,
    save_path=f'{output_dir}/eeg_epochs_quality.png'
)

qc.plot_epochs_quality(
    lfp_epochs_sync,
    save_path=f'{output_dir}/lfp_epochs_quality.png'
)

In [None]:
# 6.4 Plot the frequency band decomposition
qc.plot_frequency_bands(
    eeg_bands,
    channel_idx=0,
    duration=5.0,
    save_path=f'{output_dir}/eeg_frequency_bands.png'
)

qc.plot_frequency_bands(
    lfp_bands,
    channel_idx=0,
    duration=5.0,
    save_path=f'{output_dir}/lfp_frequency_bands.png'
)

In [None]:
# 6.5 Compute the signal-to-noise ratio
print("\n=== Compute SNR ===")

eeg_snr = qc.compute_snr(eeg_raw_clean)
lfp_snr = qc.compute_snr(lfp_raw_enhanced)

In [None]:
# 6.6 Generate the quality report
print("\n=== Generate quality report ===")

# Collect all processing steps
all_processing_steps = (
    eeg_cleaner.processing_history +
    eeg_prep.processing_log +
    lfp_cleaner.processing_history +
    lfp_prep.processing_log +
    joint_prep.processing_log
)

# Collect the quality metrics
quality_metrics = {
    'n_eeg_epochs': len(eeg_epochs_sync),
    'n_lfp_epochs': len(lfp_epochs_sync),
    'eeg_mean_snr': np.mean(list(eeg_snr.values())),
    'lfp_mean_snr': np.mean(list(lfp_snr.values())),
    'n_bad_channels': len(bad_channels)
}

# Generate the report
qc_report = qc.generate_qc_report(
    preprocessing_steps=all_processing_steps,
    metrics=quality_metrics,
    save_path=f'{output_dir}/quality_control_report.txt'
)

print(qc_report)

In [None]:
# 6.7 Save as BIDS derivatives
print("\n=== Save BIDS derivatives ===")

saver = BIDSDerivativesSaver(
    bids_root=bids_root,
    derivatives_name='preprocessing'
)

# Save the preprocessed continuous data
saver.save_preprocessed_raw(
    eeg_raw_clean,
    subject=subject,
    session=session,
    task=task,
    datatype='eeg',
    suffix='eeg',
    run=run,
    description='clean'
)

saver.save_preprocessed_raw(
    lfp_raw_enhanced,
    subject=subject,
    session=session,
    task=task,
    datatype='ieeg',
    suffix='ieeg',
    run=run,
    description='clean'
)

# Save the epochs
saver.save_epochs(
    eeg_epochs_sync,
    subject=subject,
    session=session,
    task=task,
    datatype='eeg',
    run=run,
    description='clean'
)

saver.save_epochs(
    lfp_epochs_sync,
    subject=subject,
    session=session,
    task=task,
    datatype='ieeg',
    run=run,
    description='clean'
)

# Save the processing metadata
processing_info = {
    'preprocessing_steps': all_processing_steps,
    'quality_metrics': quality_metrics,
    'bad_channels': bad_channels,
    'artifact_components': artifact_comps['all'],
    'frequency_bands': freq_bands
}

saver.save_derivative_metadata(
    processing_info,
    subject=subject,
    session=session
)

print("\n✓ Preprocessing complete! All results have been saved.")
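
BIDS derivative files are named by joining entities in a fixed order (`sub`, `ses`, `task`, optional `run`, `desc`) and ending with the suffix and extension. The sketch below builds such a path with `pathlib`. It assumes, as in this notebook, that `subject`, `session` and `task` already carry their `sub-`, `ses-` and `task-` prefixes; the `.fif` extension and the function name `derivative_path` are assumptions, and the actual layout written by `BIDSDerivativesSaver` may differ.

```python
from pathlib import Path

def derivative_path(root, pipeline, subject, session, task,
                    datatype, suffix, desc, ext, run=None):
    """Build a BIDS-style derivatives path from pre-prefixed entities."""
    entities = [subject, session, task]
    if run is not None:
        entities.append(f"run-{run}")
    entities.append(f"desc-{desc}")
    name = "_".join(entities + [suffix]) + ext
    return Path(root) / "derivatives" / pipeline / subject / session / datatype / name

demo_path = derivative_path(
    "bids_dataset", "preprocessing",
    "sub-001", "ses-01", "task-StimOn55HzFull2",
    datatype="eeg", suffix="eeg", desc="clean", ext=".fif",
)
print(demo_path.as_posix())
```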

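## Summary

The preprocessing pipeline is complete and includes:

1. ✓ Data validation and inspection
2. ✓ General cleaning (detrending, filtering, resampling)
3. ✓ EEG preprocessing (bad channel interpolation, re-referencing, ICA, epoching)
4. ✓ LFP preprocessing (artifact removal, bipolar referencing, SNR enhancement)
5. ✓ Joint processing (temporal alignment, frequency band decomposition, normalization)
6. ✓ Quality control and saving

Possible next steps:
- Cross-channel connectivity analysis
- Time-frequency analysis
- Phase-amplitude coupling (PAC)
- Functional connectivity
- Statistical analysis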