# End-to-End Blink Refinement and Validation

This notebook demonstrates the complete workflow: loading a raw EOG signal, refining blink annotations, and validating the results against ground-truth blink counts. We will use the `prepare_refined_segments` function, which encapsulates the entire process.

In [1]:
from pathlib import Path
import pandas as pd
from pyear.utils import prepare_refined_segments

# Get the project root directory (assumes the notebook runs one level below it)
PROJECT_ROOT = Path().resolve().parent

# Define file paths
RAW_FILE = PROJECT_ROOT / "unitest" / "ear_eog.fif"
GROUND_TRUTH_FILE = PROJECT_ROOT / "unitest" / "ear_eog_blink_count_epoch.csv"

# Load ground truth data
ground_truth = pd.read_csv(GROUND_TRUTH_FILE)

## The `prepare_refined_segments` function

The `prepare_refined_segments` function is a high-level utility that handles the entire blink refinement process. Here’s a breakdown of what it does behind the scenes:

1.  **Loads the Raw Data:** If you provide a file path, it loads the `mne.io.Raw` object.
2.  **Slices into Epochs:** It divides the continuous recording into 30-second segments (or epochs).
3.  **Refines Blink Timings:** For each segment, it identifies the precise start, peak, and end of every blink using the `refine_blinks_from_epochs` function. This is the core of the refinement process, where it analyzes the signal to find the exact moments of eye closure and opening.
4.  **Updates Annotations:** Finally, it replaces the original, rough annotations in each segment with the new, precise ones.

The function returns two things: a list of the processed `mne.io.Raw` segments and a list of dictionaries containing the detailed refined blink information.
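The slicing step can be illustrated in isolation. The following is a minimal sketch, not the `pyear` implementation: `epoch_bounds` is a hypothetical helper, and the 1800-second duration is only an assumption chosen so that it yields 60 epochs of 30 seconds.

```python
import math


def epoch_bounds(duration_s, epoch_len_s=30.0):
    """Return (start, stop) times in seconds for consecutive fixed-length epochs.

    Hypothetical helper for illustration; the last epoch is shortened if the
    recording length is not a multiple of the epoch length.
    """
    n_epochs = math.ceil(duration_s / epoch_len_s)
    return [
        (i * epoch_len_s, min((i + 1) * epoch_len_s, duration_s))
        for i in range(n_epochs)
    ]


# Illustrative 1800 s recording, i.e. 60 epochs of 30 s
bounds = epoch_bounds(1800.0)
print(len(bounds), bounds[0], bounds[-1])
# → 60 (0.0, 30.0) (1770.0, 1800.0)
```

Each `(start, stop)` pair marks the window that one segment would cover in the continuous recording.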

In [3]:
segments, refined_blinks = prepare_refined_segments(RAW_FILE, channel="EOG-EEG-eog_vert_left")

INFO:pyear.utils.raw_preprocessing:Preparing raw segments for blink features
  raw = mne.io.read_raw_fif(raw, preload=False, verbose=False)
INFO:pyear.utils.epochs:Slicing raw into epochs (30.0s)
Cropping epochs: 100%|██████████| 60/60 [00:00<00:00, 82.93epoch/s]
INFO:pyear.utils.refinement:Refining blinks across 60 segments
INFO:pyear.utils.refinement:Refined 132 blink annotations
INFO:pyear.utils.raw_preprocessing:Updating annotations for 60 segments


## Validating the Results

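Now that we have the refined segments, we can count the blinks in each one and compare the counts to our ground truth data. The count uses every annotation in a segment, so it assumes the segments carry only blink annotations. Matching counts show that refinement preserved the number of blinks per epoch; they do not by themselves confirm that each blink's timing is correct.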

In [4]:
# Count blinks in each segment
refined_counts = [len(segment.annotations) for segment in segments]

# Create a DataFrame for comparison
validation_df = pd.DataFrame({
    'Epoch': ground_truth['epoch_id'],
    'Ground Truth Blinks': ground_truth['blink_count'],
    'Refined Blinks': refined_counts[:len(ground_truth)]
})

# Check if the counts match
validation_df['Match'] = validation_df['Ground Truth Blinks'] == validation_df['Refined Blinks']

print("Validation Results:")
display(validation_df)

Validation Results:


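|   | Epoch | Ground Truth Blinks | Refined Blinks | Match |
|---|-------|---------------------|----------------|-------|
| 0 | 0 | 2 | 2 | True |
| 1 | 1 | 1 | 1 | True |
| 2 | 2 | 0 | 0 | True |
| 3 | 3 | 0 | 0 | True |
| 4 | 4 | 1 | 1 | True |
| 5 | 5 | 0 | 0 | True |
| 6 | 6 | 1 | 1 | True |
| 7 | 7 | 0 | 0 | True |
| 8 | 8 | 1 | 1 | True |
| 9 | 9 | 1 | 1 | True |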

## Known Limitation: Boundary-Spanning Blinks

In the full validation table (only the first ten epochs are shown above), the blink counts for epochs 31 and 55 do not match the ground truth. This is a known limitation of the current version of the processing pipeline.

The discrepancy occurs because a single blink annotation can span the boundary between two consecutive 30-second segments. For example, a blink might start 29.9 seconds into epoch 31 and end 0.1 seconds into epoch 32. The current refinement logic does not yet handle this edge case correctly, leading to an inaccurate count in the affected epochs.

**TODO:** Future work will address this by implementing a mechanism to merge or correctly attribute blinks that cross epoch boundaries.
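One possible attribution rule, sketched here under stated assumptions, is to assign each blink to the epoch that contains its onset and to flag blinks that cross a boundary. This is not the `pyear` implementation or the planned fix: `spans_boundary` and `count_by_onset` are hypothetical helpers, the blink times are toy values, epochs are assumed to be 0-based, and the ground truth may follow a different convention.

```python
from collections import Counter


def spans_boundary(onset_s, duration_s, epoch_len_s=30.0):
    """Return True if a blink starts in one epoch and ends in a later one.

    Times are in seconds from the start of the continuous recording.
    """
    first = int(onset_s // epoch_len_s)
    last = int((onset_s + duration_s) // epoch_len_s)
    return first != last


def count_by_onset(blinks, epoch_len_s=30.0):
    """Count blinks per epoch, attributing each blink to its onset epoch.

    `blinks` is an iterable of (onset_s, duration_s) pairs.
    """
    return Counter(int(onset // epoch_len_s) for onset, _ in blinks)


# Toy blinks for illustration only; the second one starts 29.9 s into
# epoch 31 (absolute time 959.9 s) and ends 0.1 s into epoch 32.
toy_blinks = [(12.0, 0.3), (959.9, 0.2), (975.0, 0.25)]

crossing = [b for b in toy_blinks if spans_boundary(*b)]
counts = count_by_onset(toy_blinks)
print(crossing)
print(counts[31], counts[32])
```

With onset attribution, the crossing blink is counted once, in epoch 31, rather than being split or dropped. Attributing by peak time instead of onset would be an equally plausible choice.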

### Visualizing the Discrepancy

To better understand the issue, let's plot the EOG signal for the problematic epochs (31 and 55) and their subsequent epochs (32 and 56). We will overlay the refined blink annotations to see exactly where the refinement process is placing the blink markers.