# Using DSP to improve the accuracy of an EEG classifier

**Authors**
1. Kasra Lekan (kl5sq)
2. Derek Johnson (dej3tc)
3. Fiji Marcelin (fm4cg)

## Experimental Setup
**Signal Data**: We applied a peak-finding algorithm along with band-pass and Butterworth filters to sleep-stage EEG data. The data come from a study of “slow-wave microcontinuity” during sleep [1].
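
As a rough illustration of the filtering step, the sketch below applies a zero-phase Butterworth band-pass filter using SciPy. The 100 Hz sampling rate, the 0.5 to 30 Hz pass band, the filter order, and the helper name `butter_bandpass` are all assumptions chosen for illustration; they are not values taken from our pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def butter_bandpass(signal, low_hz, high_hz, fs, order=4):
    """Zero-phase Butterworth band-pass filter (hypothetical helper)."""
    # Second-order sections are numerically more stable than (b, a) coefficients.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering cancels the phase shift of a single pass.
    return sosfiltfilt(sos, signal)

fs = 100.0  # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
# A 5 Hz component inside the pass band plus a 40 Hz component outside it.
noisy = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
filtered = butter_bandpass(noisy, low_hz=0.5, high_hz=30.0, fs=fs)
```

In this toy signal the 40 Hz component is strongly attenuated while the 5 Hz component passes almost unchanged.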

**MNE Package**: The MNE package ...

In [7]:
from mne_data_setup import *

## Method 1: Peak Finder
### Theory
I implemented four peak-finder algorithms (naive_logical_find_peaks, naive_mathematical_find_peaks, peak_typing_finder, noisy_find_peaks). The first two naive implementations are my own design, while the other two follow algorithms written by others. Any of these algorithms can also identify valleys by multiplying the signal data by -1 before running it. Only noisy_find_peaks is specifically designed to handle noisy signal data such as EEG.
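
The valley trick can be sketched with any peak finder. The example below uses `scipy.signal.find_peaks` [6] as a stand-in rather than one of the four algorithms above, applied to the same kind of sinusoidal test signal used later in this report.

```python
import numpy as np
from scipy.signal import find_peaks

t = np.arange(0, 3, 0.01)
signal = np.sin(np.pi * t) - np.sin(0.5 * np.pi * t)

# Local maxima of the signal.
peak_idx, _ = find_peaks(signal)
# Local minima, found as the maxima of the negated signal.
valley_idx, _ = find_peaks(-signal)
```

Negation flips the signal about zero, so every local minimum becomes a local maximum at the same index and the returned indices can be used on the original signal directly.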

Each algorithm was compared against the MNE peak-finding implementation as a baseline, on both a sinusoidal test signal and the EEG data described above.

1. naive_logical_find_peaks – Checks whether each value is greater than its neighbors in the signal. The number of neighbors in the comparison is controlled by a parameter.
2. naive_mathematical_find_peaks – Performs peak detection in three steps: (1) root mean square, (2) peak-to-average ratio, (3) first-order logic. The method therefore assumes that the underlying data follows a particular distribution, i.e. a peak occurs where the squared signal value divided by the root mean square (RMS) is larger than at its neighboring values. A threshold is used in an attempt to handle any noise present in the dataset.
3. peak_typing_finder – This algorithm is notable for its time efficiency and for handling various kinds of peaks based on edge type (e.g. None, 'rising', 'falling', 'both'). Otherwise, the algorithm is quite similar to naive_logical_find_peaks.
4. noisy_find_peaks – 
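
The neighbor comparison described for naive_logical_find_peaks can be sketched as follows. This is a minimal illustration under my own assumptions, not the code in `peak_finder`: the function name `neighbor_peaks` and the parameter `n_neighbors` are hypothetical, and a sample counts as a peak only if it strictly exceeds every sample within `n_neighbors` positions on both sides.

```python
import numpy as np

def neighbor_peaks(signal, n_neighbors=1):
    """Return indices whose value strictly exceeds all neighbors within n_neighbors.

    Hypothetical illustration; not the project's naive_logical_find_peaks.
    """
    signal = np.asarray(signal, dtype=float)
    peaks = []
    # Skip the edges, where a full window of neighbors is not available.
    for i in range(n_neighbors, len(signal) - n_neighbors):
        left = signal[i - n_neighbors:i]
        right = signal[i + 1:i + 1 + n_neighbors]
        if signal[i] > left.max() and signal[i] > right.max():
            peaks.append(i)
    return np.array(peaks, dtype=int)

noisy = np.array([0.0, 1.0, 0.5, 0.8, 0.2, 3.0, 0.1, 0.4, 0.0])
small_window = neighbor_peaks(noisy, n_neighbors=1)  # every small bump counts
large_window = neighbor_peaks(noisy, n_neighbors=3)  # only the dominant maximum remains
```

Widening the comparison window discards minor bumps, which is why the window size has such a strong effect on the number of peaks reported for noisy data.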

### Implementation & Performance

#### MNE Baselines

In [8]:
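import mne
import numpy as np

from peak_finder import PeakFinder as pf

# Synthetic test signal: difference of two sinusoids, 300 samples on [0, 3).
t = np.arange(0, 3, 0.01)
signal_sin = np.sin(np.pi*t) - np.sin(0.5*np.pi*t)
mne_sin_peak_locs, mne_sin_peak_mags = mne.preprocessing.peak_finder(signal_sin)

# First channel of the EEG recording (raw_train is expected to come from mne_data_setup).
signal_eeg = raw_train.get_data()[0]
# Despite its name, this returns a ratio of lengths rounded to 4 decimals, not a percentage.
format_percent = lambda x, y: np.round(len(x)/len(y), 4)
mne_eeg_peak_locs, mne_eeg_peak_mags = mne.preprocessing.peak_finder(signal_eeg)

def success_metrics(results, signal='eeg', string=""):
    """Print peak counts for `results` relative to the signal length and the MNE baseline."""
    if signal == 'eeg':
        signal_data = signal_eeg
        mne_peak_locs = mne_eeg_peak_locs
    elif signal == 'sin':
        signal_data = signal_sin
        mne_peak_locs = mne_sin_peak_locs
    else:
        raise ValueError(f"unknown signal {signal!r}; expected 'eeg' or 'sin'")

    # Peak indices reported by both this method and MNE (exact index matches only).
    common_peaks = np.intersect1d(results, mne_peak_locs)
    common_peaks_len = len(common_peaks)

    results_len = len(results)
    # Fraction of samples in the signal that were flagged as peaks.
    peak_to_signal_ratio = format_percent(results, signal_data)

    # Number of peaks found divided by the number of MNE peaks (not an intersection ratio).
    actual_to_predicted_peak_count_ratio = format_percent(results, mne_peak_locs)

    print(string + f"Peaks: {results_len} ({peak_to_signal_ratio}), Intersect Num: {common_peaks_len} ({actual_to_predicted_peak_count_ratio})")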

Found 2 significant peaks
Found 29454 significant peaks


**MNE Commentary**: The MNE algorithm performs well on noisy data while keeping time complexity low.

#### naive_logical_find_peaks

In [9]:
peaks_eeg = {}
success_metrics(mne_eeg_peak_locs, signal='eeg', string="MNE: ")
distances = [15, 35, 50, 100, 155]
for distance in distances:
    peaks_eeg[distance] = pf.naive_logical_find_peaks(signal_eeg, min_distance=distance)
    success_metrics(peaks_eeg[distance], signal='eeg', string=f"Distance: {distance}, ")

print('\n')

peaks_sin = {}
success_metrics(mne_sin_peak_locs, signal='sin', string="MNE: ")
for distance in distances:
    peaks_sin[distance] = pf.naive_logical_find_peaks(signal_sin, min_distance=distance)
    success_metrics(peaks_sin[distance], signal='sin', string=f"Distance: {distance}, ")

MNE: Peaks: 29454 (0.0037), Intersect Num: 29454 (1.0)
Distance: 15, Peaks: 188047 (0.0237), Intersect Num: 28906 (6.3844)
Distance: 35, Peaks: 87758 (0.011), Intersect Num: 27407 (2.9795)
Distance: 50, Peaks: 63595 (0.008), Intersect Num: 25730 (2.1591)
Distance: 100, Peaks: 34871 (0.0044), Intersect Num: 21150 (1.1839)
Distance: 155, Peaks: 26413 (0.0033), Intersect Num: 18193 (0.8968)


MNE: Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Distance: 15, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Distance: 35, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Distance: 50, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Distance: 100, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Distance: 155, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)


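**Results Commentary**: In terms of run time, naive_logical_find_peaks was slower than MNE. It also performed poorly at distinguishing peaks in the noisy EEG dataset: at a distance of 15 it reported 6.38 times as many peaks as MNE, and at a distance of 155 it reported slightly fewer peaks than MNE (0.90 times as many) while sharing only 18193 of MNE's 29454 peak locations (about 62%). On the clean sinusoidal signal it matched both MNE peaks at every distance.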

#### naive_mathematical_find_peaks

In [10]:
success_metrics(mne_eeg_peak_locs, signal='eeg', string="MNE: ")
ind_eeg_naive_mathematical_find_peaks = pf.naive_mathematical_find_peaks(signal_eeg)
success_metrics(ind_eeg_naive_mathematical_find_peaks, signal='eeg')

print('\n')

success_metrics(mne_sin_peak_locs, signal='sin', string="MNE: ")
ind_sin_naive_mathematical_find_peaks = pf.naive_mathematical_find_peaks(signal_sin)
success_metrics(ind_sin_naive_mathematical_find_peaks, signal='sin')

MNE: Peaks: 29454 (0.0037), Intersect Num: 29454 (1.0)
Peaks: 2128087 (0.2677), Intersect Num: 29070 (72.2512)


MNE: Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Peaks: 2 (0.0067), Intersect Num: 1 (1.0)


**Results Commentary**: In terms of run time, naive_mathematical_find_peaks performed similarly to MNE. However, it performed poorly at distinguishing peaks in the noisy EEG dataset, identifying 72.25 times as many peaks as MNE. On the sinusoidal signal it found two peaks, only one of which coincides exactly with an MNE peak location.

#### peak_typing_finder

In [11]:
minimum_height = 4e-5
edges = ['rising', 'falling', 'both', None]
success_metrics(mne_eeg_peak_locs, signal='eeg', string="MNE: ")
for edge in edges:
    ind_eeg_peak_typing_finder = pf.peak_typing_finder(signal_eeg, minimum_height=minimum_height, minimum_distance=1, edge=edge)
    success_metrics(ind_eeg_peak_typing_finder, signal='eeg', string=f"Edge: {edge}, ")

print('\n')

success_metrics(mne_sin_peak_locs, signal='sin', string="MNE: ")
for edge in edges:
    ind_sin_peak_typing_finder = pf.peak_typing_finder(signal_sin, minimum_height=minimum_height, minimum_distance=1, edge=edge)
    success_metrics(ind_sin_peak_typing_finder, signal='sin', string=f"Edge: {edge}, ")

MNE: Peaks: 29454 (0.0037), Intersect Num: 29454 (1.0)
Edge: rising, Peaks: 161038 (0.0203), Intersect Num: 25484 (5.4674)
Edge: falling, Peaks: 161118 (0.0203), Intersect Num: 25323 (5.4702)
Edge: both, Peaks: 162401 (0.0204), Intersect Num: 25484 (5.5137)
Edge: None, Peaks: 159755 (0.0201), Intersect Num: 25323 (5.4239)


MNE: Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Edge: rising, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Edge: falling, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Edge: both, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)
Edge: None, Peaks: 2 (0.0067), Intersect Num: 2 (1.0)


**Results Commentary**: In terms of run time, peak_typing_finder performed as well as or better than MNE. It performed moderately well at distinguishing peaks in the noisy EEG dataset, identifying about 5.47 times as many peaks as MNE while covering roughly 86 to 87% of the MNE peaks. The edge type did not significantly affect performance on this data, likely due to the noise in the EEG dataset.
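
The two figures quoted in this commentary come from different ratios, which the short calculation below makes explicit using the counts printed for `edge='rising'`. The variable names are my own, and the third ratio (`precision`) is an extra quantity not printed by success_metrics; it treats the MNE peaks as the reference, which is an assumption rather than ground truth.

```python
mne_peaks = 29454      # MNE baseline peak count on the EEG channel
found_peaks = 161038   # peak_typing_finder, edge='rising'
shared_peaks = 25484   # peak indices reported by both methods

count_ratio = found_peaks / mne_peaks    # how many times more peaks than MNE
coverage = shared_peaks / mne_peaks      # fraction of MNE peaks recovered
precision = shared_peaks / found_peaks   # fraction of reported peaks that MNE also reports

print(f"count ratio: {count_ratio:.2f}, coverage: {coverage:.1%}, precision: {precision:.1%}")
```

A high coverage combined with a large count ratio means that most MNE peaks are recovered, but only a small share of the reported peaks correspond to an MNE peak.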

@Derek and @Fiji

## Experiment Results

## Challenges
...

## Work Breakdown

Kasra Lekan: 
- Coding the experimental setup (including background research on MNE and how to modify the underlying data)
- Peak Detection Algorithm

Derek Johnson: 
- Butterworth Filter

Fiji Marcelin: 
- Band Pass Filter

All:
- Combining filters and testing classification accuracy

## References
1. B. Kemp, A. H. Zwinderman, B. Tuk, H. A. C. Kamphuisen, and J. J. L. Oberyé. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Transactions on Biomedical Engineering, 47(9):1185–1194, 2000. doi:10.1109/10.867928.
    Dataset for analysis
2. https://mne.tools/stable/index.html
    Implementation of the foundational EEG signal pipeline
3. https://neuraldatascience.io/intro.html
    E-book that covers analysis of EEG data in the frequency domain.
4. https://en.wikipedia.org/wiki/Band-pass_filter
5. https://en.wikipedia.org/wiki/Butterworth_filter
6. https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html
7. http://www.scholarpedia.org/article/Electroencephalogram
    An overview of electroencephalogram (EEG) collection.
