# Persyst Spike Detection

This is a simple test of how sensitive Persyst's spike detection is to bias caused by artifacts elsewhere in a recording. If the detector analyzes the signal in local windows, as it appears to from visual inspection, we expect little to no bias.

This test is done in three stages:
1. Find three artifact-heavy EEG recordings. For each one, define a window of artifact-free EEG and create a clipped file that contains just that window.
2. Run Persyst's spike detector on both sets of files (not shown here).
3. Compare the spikes detected within the good window in the whole and clipped files, and check whether they are the same.

## 0. Basic Setup

In [1]:
from pathlib import Path
import pandas as pd
import numpy as np

import mne

import datetime

In [2]:
root = Path("D:/OneDriveParent/OneDrive - Johns Hopkins/persystSpikeCheck")
sourcedir = root / "whole_recordings"
outdir = root / "clipped_recordings"

Here we define the good windows, which were selected manually through prior visual analysis.

In [3]:
# only want to export once
export = False
# [start, stop] times in seconds
patient_map = {
    "3": [0, 300],
    "20": [0, 300],
    "29": [30, 330],
}

EDF export currently requires the channel types to be set manually, which is done here.

In [4]:
eeg_channels = ["Fp1", "Fp2", "F3", "F4", "F7", "F8", "P3", "P4", "C3", "C4", "P7", "P8", "O1", "O2", "T3", "T4", "T5",
            "T6", "T7", "T8", "Cz", "Pz", "Fz"]

In [5]:
def _get_real_type(ch_name):
    for eeg_ch in eeg_channels:
        if eeg_ch in ch_name and "POL" not in ch_name:
            return "eeg"
    return "misc"

In [6]:
def get_real_types(base_names):
    type_dict = dict()
    for ch in base_names:
        type_dict[ch] = _get_real_type(ch)
    return type_dict
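
To illustrate what this mapping produces, here is a minimal sketch that applies the same substring rule to a few channel names. The names in `example_names` and the helper `guess_type` are hypothetical, not taken from the recordings. It also shows two consequences of substring matching: `SpO2` contains `O2`, which is presumably why the `POL` exclusion is needed, and a name such as `DC3` would be typed as EEG because it contains `C3`.

```python
EEG_CHANNELS = ["Fp1", "Fp2", "F3", "F4", "F7", "F8", "P3", "P4", "C3", "C4", "P7", "P8",
                "O1", "O2", "T3", "T4", "T5", "T6", "T7", "T8", "Cz", "Pz", "Fz"]


def guess_type(ch_name):
    """Same rule as _get_real_type: 'eeg' if a 10-20 label appears and 'POL' does not."""
    if "POL" in ch_name:
        return "misc"
    return "eeg" if any(label in ch_name for label in EEG_CHANNELS) else "misc"


# Hypothetical channel names, chosen to exercise each branch of the rule
example_names = ["EEG Fp1-Ref", "EEG T3-Ref", "POL SpO2", "POL DC3", "DC3", "EKG"]
example_types = {name: guess_type(name) for name in example_names}

for name, ch_type in example_types.items():
    print(f"{name:12s} -> {ch_type}")
```

If a recording does contain non-EEG channels whose names happen to include a 10-20 label, an exact match on the label portion of the name would be a safer rule.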

## 1. Clip and Export

In [7]:
if export:
    for subject, (start_sec, stop_sec) in patient_map.items():
        raw_fpath = sourcedir / f"{subject}.edf"
        out_fpath = outdir / f"{subject}.edf"
        raw = mne.io.read_raw_edf(raw_fpath)
        # crop() operates in place, so raw_crop and raw refer to the same object
        raw_crop = raw.crop(start_sec, stop_sec)
        # If there is a meas date set, we want to shift the cropped start time by the same amount
        if raw.info['meas_date'] is not None:
            tdelta = datetime.timedelta(seconds=start_sec)
            raw_crop.set_meas_date(raw.info['meas_date'] + tdelta)
        # Manually set the channel types to eeg or misc for non-eeg.
        raw_crop.set_channel_types(get_real_types(raw_crop.ch_names))
        raw_crop.export(out_fpath)
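
The measurement-date shift above is plain `datetime` arithmetic. As a small sketch, using a hypothetical measurement date (the real one comes from the EDF header), the clipped file for subject 29 would start 30 seconds after the whole recording:

```python
import datetime

start_sec = 30  # window start for subject "29" in patient_map
# Hypothetical measurement date, for illustration only
meas_date = datetime.datetime(2020, 1, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)

# timedelta's positional arguments are (days, seconds), so these two are equivalent
assert datetime.timedelta(0, start_sec) == datetime.timedelta(seconds=start_sec)

shifted = meas_date + datetime.timedelta(seconds=start_sec)
print(shifted.isoformat())  # 2020-01-01T12:00:30+00:00
```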

## 3. Compare Detections

In [8]:
subjects = list(patient_map.keys())
whole_fpaths = [sourcedir / f"{sub}-archive.lay" for sub in subjects]
clipped_fpaths = [outdir / f"{sub}-archive.lay" for sub in subjects]

Only want to look at the annotations that were for spikes

In [9]:
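def filter_annotations_for_spikes(annotations):
    # Returns the indices of the annotations to delete, i.e. those that are NOT spikes.
    # This is a substring match, so descriptions such as 'polyspike admixed' are kept too.
    inds = [ind for ind, annot in enumerate(annotations) if 'spike ' not in annot['description']]
    return inds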
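
Because the test is a substring match on `'spike '`, descriptions such as `polyspike admixed` and `z spike wave r front` also survive the filter, which is why they appear in the comparison tables. The sketch below contrasts the substring rule with a stricter prefix rule, using plain dicts in place of MNE annotations. It assumes that Persyst's automatic detections always start with `spike `, which is inferred from the output in this notebook and is not a documented guarantee. The function names and the `eye blink` entry are hypothetical.

```python
# Plain dicts standing in for MNE annotations; descriptions copied from the output
# in this notebook, except "eye blink", which is a hypothetical non-spike annotation.
annotations = [
    {"onset": 0.017, "duration": 0.0, "description": "polyspike admixed"},
    {"onset": 26.165, "duration": 0.0, "description": "spike t3-at3 0.84"},
    {"onset": 0.01, "duration": 0.0, "description": "z spike wave r front"},
    {"onset": 5.0, "duration": 0.0, "description": "eye blink"},
]


def non_spike_indices_substring(annots):
    """Indices to delete under the substring rule used in the notebook."""
    return [i for i, a in enumerate(annots) if "spike " not in a["description"]]


def non_spike_indices_prefix(annots):
    """Indices to delete under a stricter rule: keep only descriptions starting with 'spike '."""
    return [i for i, a in enumerate(annots) if not a["description"].startswith("spike ")]


print(non_spike_indices_substring(annotations))  # [3]
print(non_spike_indices_prefix(annotations))     # [0, 2, 3]
```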

In [15]:
for subject, w_fpath, c_fpath in zip(subjects, whole_fpaths, clipped_fpaths):
    raw = mne.io.read_raw_persyst(w_fpath)
    raw_clip = mne.io.read_raw_persyst(c_fpath)
    # Crop the whole recording to the good window so that both recordings share a time scale,
    # since we only care about annotations within the cropped frame.
    # This also deletes the annotations outside the window.
    raw = raw.crop(patient_map[subject][0], patient_map[subject][1])
    raw_annot = raw.annotations
    # crop() only resets the annotation onsets if orig_time is None, so we shift them manually otherwise.
    if raw_annot.orig_time is not None:
        raw_annot.onset -= patient_map[subject][0]
    clip_annot = raw_clip.annotations
    raw_annot.delete(filter_annotations_for_spikes(raw_annot))
    clip_annot.delete(filter_annotations_for_spikes(clip_annot))
    
    
    raw_df = pd.DataFrame(raw_annot)
    clip_df = pd.DataFrame(clip_annot)

    print(f"Looking at: {subject}")
    # Outer merge on all three columns; left = clipped recording, right = whole recording
    confirm_df = pd.merge(
        clip_df, raw_df, on=['onset', 'duration', 'description'], how='outer', indicator='Found'
    ).sort_values('onset').reset_index(drop=True)
    confirm_df = confirm_df[confirm_df.columns.drop(list(confirm_df.filter(regex='orig_time')))]
    display(confirm_df)
    

Loading D:\OneDriveParent\OneDrive - Johns Hopkins\persystSpikeCheck\whole_recordings\3-archive.lay
Loading D:\OneDriveParent\OneDrive - Johns Hopkins\persystSpikeCheck\clipped_recordings\3-archive.lay
Looking at: 3


| | onset | duration | description | Found |
|---|---|---|---|---|
| 0 | 0.017 | 0.0 | polyspike admixed | left_only |
| 1 | 26.165 | 0.0 | spike t3-at3 0.84 | both |
| 2 | 120.915 | 0.0 | spike f7-af7 1.01 | both |
| 3 | 172.0 | 0.0 | polyspike admixed | right_only |
| 4 | 172.57 | 0.0 | spike fz-afz 1.02 | both |
| 5 | 172.845 | 0.0 | spike f7-af7 0.85 | both |
| 6 | 212.4 | 0.0 | spike f7-af7 0.94 | both |


Loading D:\OneDriveParent\OneDrive - Johns Hopkins\persystSpikeCheck\whole_recordings\20-archive.lay
Loading D:\OneDriveParent\OneDrive - Johns Hopkins\persystSpikeCheck\clipped_recordings\20-archive.lay
Looking at: 20


| | onset | duration | description | Found |
|---|---|---|---|---|
| 0 | 17.865 | 0.0 | spike fz-afz 0.87 | both |
| 1 | 18.965 | 0.0 | spike fz-afz 1.05 | both |
| 2 | 25.435 | 0.0 | spike fz-afz 1.05 | both |
| 3 | 26.145 | 0.0 | spike fz-afz 0.98 | left_only |
| 4 | 26.145 | 0.0 | spike fz-afz 0.96 | right_only |
| 5 | 27.65 | 0.0 | spike fz-afz 0.88 | both |
| 6 | 28.67 | 0.0 | spike cz-acz 0.99 | left_only |
| 7 | 28.67 | 0.0 | spike cz-acz 1.00 | right_only |
| 8 | 29.365 | 0.0 | spike fz-afz 1.03 | both |
| 9 | 37.51 | 0.0 | spike fz-afz 0.85 | both |


Loading D:\OneDriveParent\OneDrive - Johns Hopkins\persystSpikeCheck\whole_recordings\29-archive.lay
Loading D:\OneDriveParent\OneDrive - Johns Hopkins\persystSpikeCheck\clipped_recordings\29-archive.lay
Looking at: 29


| | onset | duration | description | Found |
|---|---|---|---|---|
| 0 | 0.01 | 0.0 | z spike wave r front | left_only |
| 1 | 14.165 | 0.0 | spike t5-at5 0.91 | left_only |
| 2 | 14.165 | 0.0 | spike t5-at5 0.89 | right_only |
| 3 | 66.765 | 0.0 | spike t5-at5 0.89 | left_only |
| 4 | 66.765 | 0.0 | spike t5-at5 0.88 | right_only |
| 5 | 68.79 | 0.0 | spike t5-at5 1.05 | both |
| 6 | 69.62 | 0.0 | spike f3-af3 0.84 | both |
| 7 | 70.48 | 0.0 | spike f7-af7 0.85 | left_only |
| 8 | 70.48 | 0.0 | spike f7-af7 0.83 | right_only |
| 9 | 71.045 | 0.0 | spike t5-at5 1.04 | left_only |


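The recording for patient 29 is by far the worst. Spike onsets are often identical between the whole and clipped recordings, but the number at the end of the description differs slightly, so the rows are reported as `left_only` and `right_only` instead of `both`. We read that number as the spike's duration (the `duration` column itself is 0.0 throughout). This is often not a problem, since we rarely care about a spike's duration unless we want to classify an event as a true spike or a sharp wave.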
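
Because `description` is one of the merge keys, two detections with the same onset but a different trailing number end up as separate `left_only` and `right_only` rows. One way to pair them is to split that number into its own column and merge on the onset and the remaining label. The sketch below assumes every description ends in a numeric value, as the spike rows above do. The helper `split_description` and the column names `label` and `value` are hypothetical, and the rows are copied from the patient 20 output.

```python
import pandas as pd

# Three rows from the patient 20 output: clipped (left) and whole (right) recordings
clip_df = pd.DataFrame({
    "onset": [25.435, 26.145, 28.67],
    "duration": [0.0, 0.0, 0.0],
    "description": ["spike fz-afz 1.05", "spike fz-afz 0.98", "spike cz-acz 0.99"],
})
raw_df = pd.DataFrame({
    "onset": [25.435, 26.145, 28.67],
    "duration": [0.0, 0.0, 0.0],
    "description": ["spike fz-afz 1.05", "spike fz-afz 0.96", "spike cz-acz 1.00"],
})


def split_description(df):
    """Split 'spike fz-afz 0.98' into label 'spike fz-afz' and numeric value 0.98."""
    out = df.copy()
    parts = out["description"].str.rsplit(" ", n=1, expand=True)
    out["label"] = parts[0]
    out["value"] = parts[1].astype(float)
    return out.drop(columns="description")


# Merge on onset, duration and label only, so the trailing value no longer splits rows
merged = pd.merge(
    split_description(clip_df), split_description(raw_df),
    on=["onset", "duration", "label"], how="outer",
    suffixes=("_clip", "_whole"), indicator="Found",
).sort_values("onset").reset_index(drop=True)
merged["value_diff"] = (merged["value_clip"] - merged["value_whole"]).abs()

print(merged)
```

With this pairing, all three example rows are reported as `both`, and `value_diff` shows how far the trailing numbers differ between the clipped and whole recordings.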

In [19]:
raw = mne.io.read_raw_edf("D:/OneDriveParent/OneDrive - Johns Hopkins/30Hz_FilterCheck/raw/7- lots of artifact.edf")
raw.set_channel_types(get_real_types(raw.ch_names))
raw.export("D:/OneDriveParent/OneDrive - Johns Hopkins/30Hz_FilterCheck/sourcedata/7.edf")

Extracting EDF parameters from D:\OneDriveParent\OneDrive - Johns Hopkins\30Hz_FilterCheck\raw\7- lots of artifact.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 260199  =      0.000 ...  1300.995 secs...


POL SpO2, POL EtCO2, POL Pulse, POL CO2Wave
  raw = mne.io.read_raw_edf("D:/OneDriveParent/OneDrive - Johns Hopkins/30Hz_FilterCheck/raw/7- lots of artifact.edf")
