A modular pipeline for classifying High-Frequency Oscillations (HFOs) in intracranial EEG (iEEG) as artifactual or non-artifactual. The full pipeline is implemented in MATLAB, with a Python implementation available for the classification step.
📄 This pipeline is described in: Tan, S. B., et al. (2026). A comprehensive, physician-trained algorithm to remove artifactual false positive High Frequency Oscillations in long-term intracranial EEG. Journal of Neural Engineering. (accepted)
MIAF was evaluated on a fully held-out test cohort (2024) after all training and feature selection were completed on a separate cohort (2022). Key results on the held-out test set:
| Metric | Before MIAF | After MIAF |
|---|---|---|
| Precision (PPV) | 86% | 98% (p < 0.001) |
| Specificity | 0% | 92% [89–95%] |
| Sensitivity (naHFOs) | 100% | 73% [71–75%] |
| Sensitivity (trueHFOs) | 100% | 86% [84–88%] |
| AUROC | — | 0.92 [0.90–0.93] |
| AUPRC | — | 0.99 [0.98–0.99] |
Confidence intervals computed with Clopper-Pearson (specificity/sensitivity) and bootstrapping (AUROC/AUPRC).
- Removed 92.4% of artifacts while retaining 73.1% of non-artifactual HFOs
- Reduced artifact contamination from 13.7% → 1.6% of all detections (12.1% drop)
- After filtering, 98.4% of remaining HFOs were non-artifactual
Applied to 49+ million HFO detections from the full hospitalization of all 35 patients:
- Improved HFO correlation with seizure onset zone (SOZ) in 76.5% of patients (26/34)
- Improved HFO correlation with resected volume in 88.9% of patients (16/18)
- Enables HFO analysis across all states of vigilance (not restricted to NREM sleep)
Compared logistic regression, SVM, support vector regression, and a 3-layer neural network across 6 signal source configurations. Binary logistic regression on intracranial data (HFO channel + intracranial CAR) was selected as the final model — statistically comparable to top alternatives, with superior interpretability and no dependence on scalp EEG.
MIAF outperformed two alternative artifact detectors (qHFO detector alone; qHFO + muscle artifact + background activity filters) on both ROC and precision-recall curves.
A. Overview of the Pipeline
B. Repository Structure
C. Installation & Setup
D. Running the Pipeline
D1. Full MATLAB Pipeline
D2. HFO Classification Step Only (Python)
E. Step-by-Step Usage (1–3)
1. Preprocessing
2. User-Defined HFO Detection
3. HFO Classification
F. Output Files
G. Example Datasets for Visualization
H. Visualizing EEG + HFO Events in EEGLAB
I. Citation
The MIAF pipeline consists of three modular stages:
-
EEG Preprocessing
Remove bad channels, apply common average reference (CAR), and bandpass filter signals. -
HFO Detection (User-Defined)
You supply your own algorithm that outputs candidate HFO events. -
HFO Classification
A logistic regression classifier assigns each event to:- artifact
- non-artifactual HFO (naHFO)
- invalid
graph TD
A[Raw EEG] --> B[Step 1: Preprocessing]
B --> C[Preprocessed EEG]
C --> D[Step 2: User HFO Detection]
D --> E[Detected HFO Events]
E --> F[Step 3: HFO Classification]
F --> G[Classification .csv Output]
MIAF/
├── input_data/
├── data_for_visualization/
├── matlab/
│ ├── demo_full_pipeline.m
│ ├── preprocess_eeg_for_hfo_detection.m
│ ├── classify_events.m
│ ├── preprocessing/
│ ├── features/
│ ├── classification/
│ ├── utils/
│ └── requirements/
├── python/
│ ├── demo_classify_events.py
│ ├── classify_events.py
│ ├── classification/
│ ├── utils/
│ └── requirements.txt
├── model/
│ └── model.json
└── outputs/
Note: The outputs/ folder is created during runtime and is not tracked in the repository.
Option 1 — Download ZIP
- Click Code → Download ZIP
- Extract and open the folder in MATLAB
Option 2 — Git clone
git clone https://github.com/tshibei/MIAF.git
cd MIAFIn MATLAB, run from matlab/requirements/:
check_requirementspython -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r python/requirements.txtIn MATLAB, run all steps (preprocessing, feature extraction, and classification) from matlab/:
demo_full_pipelineThis script:
- Loads example inputs
- Runs preprocessing
- Extracts event features
- Performs HFO classification
- Writes outputs to the
outputs/folder
Generated files include:
preprocessed_eeg.matpreprocessed_channel_info.matclassified_hfo_events.csvhfo_event_features.mat
If preprocessing and detection have already been completed, you may run only Step 3 using Python.
Run from python/:
python demo_classify_events.pyThis script:
- Loads detected HFO events
- Reads preprocessed EEG (MATLAB v7.3 via
h5py) - Extracts features
- Applies logistic regression using
model.json - Produces the same outputs as the MATLAB version
MATLAB and Python produce identical classification results.
Below are the pipeline’s three steps in detail.
Place the following files in input_data/:
1. Raw EEG (raw_eeg.mat)
-
Format: MATLAB v7.3
-
Example:
example_raw_eeg.mat -
It must contain the variable:
Variable Description datatime × channels matrix of EEG recordings
• Minimum recommended duration: ~10 minutes
• Rows represent time points, columns represent channels
2. Channel Information (channel_info.mat)
-
Example:
example_channel_info.mat -
It must contain the variables:
Variable Description ecog_chan_idx1-based indices of ECoG channels (column vector), or empty array []if none existdepth_chan_idx1-based indices of depth (SEEG) channels (column vector), or empty array []if none exist
3. Sampling Rate (sampling_rate.mat)
- Example:
example_sampling_rate.mat - It must contain the variable:
Variable Description fsSampling rate in Hz (float)
preprocess_eeg_for_hfo_detection( ...
'input_data_dir', 'input_data', ...
'raw_eeg_file', 'raw_eeg.mat', ...
'channel_info_file', 'channel_info.mat', ...
'sampling_rate_file', 'sampling_rate.mat', ...
'output_dir', 'outputs', ...
'preprocessed_eeg_file', 'preprocessed_eeg.mat', ...
'preprocessed_channel_info_file', 'preprocessed_channel_info.mat');The following outputs are generated in the outputs/ folder.
preprocessed_eeg.mat
| Variable | Description |
|---|---|
ecog_hfo_raw |
CAR-referenced ECoG signals (raw, unfiltered) |
ecog_car_raw |
Common average reference (CAR) signal for ECoG |
ecog_hfo_filt |
Bandpass-filtered ECoG signals (80–500 Hz) |
ecog_car_filt |
Bandpass-filtered ECoG CAR signal |
depth_hfo_raw |
CAR-referenced depth signals (raw, unfiltered) |
depth_car_raw |
Common average reference signal for depth channels |
depth_hfo_filt |
Bandpass-filtered depth signals (80–500 Hz) |
depth_car_filt |
Bandpass-filtered depth CAR signal |
All matrices preserve the original time × channels organization.
preprocessed_channel_info.mat
| Field | Description |
|---|---|
ecog_chan_idx |
1-based indices of ECoG channels |
depth_chan_idx |
1-based indices of depth (SEEG) channels |
good_chan_mask |
Logical vector indicating valid channels (true = good, false = excluded), determined using automated noise detection via invalid_epoch_detector |
You may use any HFO detection method. Detection must operate on the following variables from preprocessed_eeg.mat in the outputs/ folder generated in Step 1.
| Variable | Description |
|---|---|
ecog_hfo_filt |
[time × channels] matrix of common average referenced and bandpass filtered ECoG channels |
depth_hfo_filt |
[time × channels] matrix of common average referenced and bandpass filtered depth channels |
Detection should be performed separately for ECoG and depth channels, and the outputs should be concatenated (see below).
Save the following variables into detected_hfo_events.mat in the input_data/ folder (see example in example_detected_hfo_events.mat).
| Variable | Description |
|---|---|
start_idx |
Start sample index (1-based), column vector |
end_idx |
End sample index (1-based), column vector |
chan_idx |
Channel index (1-based), column vector |
All variables must be column vectors of equal length.
Available in both MATLAB and Python.
The model file stored in the model/ folder, model.json, contains a pre-trained logistic regression model stored as a dictionary with the following fields:
| Field | Type | Description |
|---|---|---|
coefficients |
List[float] |
Regression weights (β) for each feature, ordered to match features. |
intercept |
float |
Intercept term (β₀) added to the linear predictor. |
threshold |
float |
Probability cutoff for binary classification (e.g., probability ≥ threshold → class 1). |
features |
List[str] |
Ordered list of feature names used by the model. |
mean |
List[float] |
Feature-wise mean values used for z-score normalization. |
std |
List[float] |
Feature-wise standard deviation values used for z-score normalization. |
In MATLAB, run from matlab/
classify_events( ...
'input_data_dir', 'input_data', ...
'detected_hfo_events_file', 'detected_hfo_events.mat', ...
'sampling_rate_file', 'sampling_rate.mat', ...
'model_dir', 'model', ...
'model_file', 'model.json', ...
'output_dir', 'outputs', ...
'preprocessed_eeg_file', 'preprocessed_eeg.mat', ...
'preprocessed_channel_info_file', 'preprocessed_channel_info.mat', ...
'classified_hfo_events_file', 'classified_hfo_events.csv', ...
'hfo_event_features_file', 'hfo_event_features.mat');Alternatively,
Run Python code from python/
python classify_events.py \
--input-data-dir input_data \
--detected-hfo-events-file detected_hfo_events.mat \
--sampling-rate-file sampling_rate.mat \
--model-dir model \
--model-file model.json \
--output-dir outputs \
--preprocessed-eeg-file preprocessed_eeg.mat \
--preprocessed-channel-info-file preprocessed_channel_info.mat \
--classified-hfo-events-file classified_hfo_events.csv \
--hfo-event-features-file hfo_event_features.matoutputs/hfo_event_features.mat
| Variable | Description |
|---|---|
X |
[events × features] matrix of extracted features for each detected HFO event • Rows represent individual HFO events • Columns represent feature values used for classification |
outputs/classified_hfo_events.csv
| Column | Description |
|---|---|
start_idx |
Event start index |
end_idx |
Event end index |
chan_idx |
Channel |
prob_HFO |
Predicted probability |
is_HFO_thresh_0_95 |
Binary label |
is_bad_chan |
1 = bad channel, 0 = good channel |
is_nan_event |
1 = invalid event (invalid scores or event detected on bad channels), 0 = valid event |
Located in:
data_for_visualization/
Each example contains:
example_raw_eeg_{n}.set: EEG signalsexample_detected_hfo_events_{n}.set: Detected HFO events with ground-truth labels (brain / artifact)example_classified_hfo_events_{n}.set: MIAF-classified HFO events with classification labels (artifact / non-artifactual HFO (naHFO) / invalid)
These are EEGLAB .set datasets for visualization only.
Download from:
https://sccn.ucsd.edu/eeglab/download.php
- In EEGLAB: File → Load existing dataset(s)
- Open the
.setfiles indata_for_visualization/
Open the scroll viewer:
Plot → Channel data (scroll)
Use the arrows to scroll through the EEG.
Three representative EEG epochs are shown to illustrate how the classifier behaves under different signal conditions:
This epoch contains one labeled artifact that is visually obvious. The classifier successfully identified the labeled artifact as well as additional unlabeled artifacts present in the segment. This demonstrates the model’s ability to generalize to artifact morphology even beyond the manually labeled examples.
This epoch includes one labeled HFO arising from neural activity. The classifier correctly identified the labeled HFO; however, several additional brain-generated HFOs occurring within a brief window were misclassified as artifacts. These HFOs are subtle and not superimposed on clear spikes, making them difficult to distinguish from artifacts, especially given that the detector was tuned conservatively to prioritize artifact removal.
This epoch contains a mixture of muscle artifacts and brain-generated HFOs. A scalp EEG trace is also displayed for reference. Prominent HFOs appear on sharp waves, but significant muscle activity is present as well, making it difficult to differentiate the two using a single channel alone. The classifier successfully distinguished the muscle artifacts from the brain-generated HFOs.
If you use MIAF in your research, please cite:
@article{tan2026miaf,
author = {Tan, Shi Bei and Gliske, Stephen V. and John, Neha Sara and Kerr, Wesley T. and Mihaylova, Temenuzhka and Smith, Garnett and McNamara, Nancy and Beimer, Nicholas and Romanowski, Erin Fedak and Stacey, William C.},
title = {A comprehensive, physician-trained algorithm to remove artifactual false positive {High Frequency Oscillations} in long-term intracranial {EEG}},
journal = {Journal of Neural Engineering},
year = {2026},
note = {Accepted}
}See LICENSE.md for full license details.