# Filtering - Pre-Processing of the SEEG Signal (1/2)
This notebook presents **stage 1 of the pre-processing** that the SEEG signal goes through before being fed to the spiking neural network (SNN). The pre-processing stages are as follows:
1. **Filtering**: The SEEG signal is bandpass filtered to remove noise and artifacts. The bandpass filter is a Butterworth design and, since we are working with *iEEG*, the signal is filtered in the ripple and fast ripple (FR) bands. The co-occurrence of high-frequency oscillations (HFOs) in both bands is considered a strong predictor of post-surgical seizure freedom, because it delineates an "HFO area" that approximates the epileptogenic zone (EZ).
2. **Signal-to-Spike Conversion**: To interface and communicate with the silicon neurons in the SNN, the filtered SEEG signal must be converted to spikes.

## Filtering
Depending on the EEG modality, the signal is filtered in different frequency bands. Since this notebook handles intracranial recordings (*iEEG* or *sEEG*), the signal is filtered in both the ripple band (80-250 Hz) and the fast ripple (FR) band (250-500 Hz). HFOs that co-occur in both bands are the events of interest, since they delineate the "HFO area" used to approximate the epileptogenic zone (EZ) and to predict post-surgical seizure freedom.

The filter is implemented in different ways depending on the setup it will run on.
1. **Neuromorphic Hardware**: The filter is implemented using analog filters. 
2. **Software Simulation**: *Butterworth filters* are utilized since they are a good approximation of the tuned *Tow-Thomas* architectures implemented in hardware.

The frequency response of the *Butterworth filter* is maximally flat in the passband and rolls off towards 0 in the stopband.
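
To make the "maximally flat" statement concrete, the sketch below probes the gain of a ripple-band Butterworth filter at a few frequencies. It is only an illustration: the order (4) and the probe frequencies are arbitrary choices, and the filter is designed in second-order sections (`output="sos"`), which is not how the cells later in this notebook build it.

```python
import numpy as np
from scipy.signal import butter, sosfreqz

fs = 2048                      # sampling rate used in this notebook (Hz)
lowcut, highcut = 80, 250      # ripple band edges (Hz)

# Illustrative design: order 4 is an arbitrary choice for this sketch
sos = butter(4, [lowcut, highcut], btype="band", fs=fs, output="sos")

# Evaluate the magnitude response at a few hand-picked frequencies (Hz)
probe_freqs = np.array([20.0, 80.0, 120.0, 150.0, 250.0, 600.0])
_, response = sosfreqz(sos, worN=probe_freqs, fs=fs)
gain = np.abs(response)

for freq, g in zip(probe_freqs, gain):
    print(f"{freq:6.1f} Hz -> gain {g:.4f}")
```

The gain should stay close to 1 well inside the band, drop to about 1/sqrt(2) (the -3 dB point) at the two cutoff frequencies, and fall towards 0 outside the band.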

### Check WD (change if necessary) and file loading

In [146]:
# Show current directory
import os
curr_dir = os.getcwd()
print(curr_dir)

# Check if the current WD is the file location
if "/src/hfo/filter" not in os.getcwd():
    # Set working directory to this file location
    file_location = f"{os.getcwd()}/thesis-lava/src/hfo/filter"
    print("File Location: ", file_location)

    # Change the current working Directory
    os.chdir(file_location)

    # New Working Directory
    print("New Working Directory: ", os.getcwd())

PATH_TO_FILE = '' # 'src/hfo/'  # This is needed if the WD is not the same as the file location

/home/monkin/Desktop/feup/thesis/thesis-lava/src/hfo/filter


In [147]:
import numpy as np
import math

seeg_file_name = "seeg_synthetic_humans.npy"
recorded_data = np.load(f"{PATH_TO_FILE}data/{seeg_file_name}")

print("Data shape: ", recorded_data.shape)
print("First time steps: ", recorded_data[:10])

Data shape:  (245760, 960)
First time steps:  [[ 3.2352024e-01 -1.3235390e+00 -5.9668809e-01 ... -1.9608999e+00
  -1.9769822e-01 -1.2078454e+00]
 [-6.9759099e-04 -3.5122361e+00 -4.8766956e-01 ... -5.8757830e+00
  -7.4400985e-01 -5.1096064e-01]
 [ 1.9026639e+00 -5.6726017e+00  9.8274893e-01 ... -6.6182971e+00
  -8.3053267e-01 -8.1596655e-01]
 ...
 [ 3.2172418e+00 -8.4650068e+00  1.5216088e+00 ... -4.1081657e+00
   2.0085973e-01 -4.7539668e+00]
 [ 1.7725919e+00 -9.4744024e+00  1.6776791e+00 ... -4.1469693e+00
   1.6412770e+00 -3.4672713e+00]
 [ 7.8109097e-01 -1.0500931e+01  2.3717029e+00 ... -5.1762242e+00
   1.0715837e+00 -4.4489903e+00]]


## Define the Filter

In [148]:
from scipy.signal import butter, lfilter

# ================================================================ #
# ============ Butterworth Filter Coefficients =================== #
# ================================================================ #
def butter_bandpass(lowcut, highcut, sampling_freq, order=5):
    """
    This function generates the coefficients of a Butterworth bandpass filter.
    @lowcut, highcut (int): cutoff frequencies (Hz) of the bandpass filter
    @sampling_freq (float): sampling frequency (Hz) of the wideband signal
    @order (int): filter order

    - return b, a (arrays): numerator and denominator coefficients to be applied to the wideband signal
    """
    nyq = 0.5 * sampling_freq   # Nyquist frequency
    low = lowcut / nyq          # Normalizing the cutoff frequencies
    high = highcut / nyq        # Normalizing the cutoff frequencies

    return butter(order, [low, high], btype='band')    

# ================================================================ #
# ====================== Butterworth Filters ===================== #
# ================================================================ #
def butter_bandpass_filter(data, lowcut, highcut, sampling_freq, order=5):
    """
    This function applies the filtering coefficients calculated above to the wideband signal (original signal).
    @data (array): Array with the amplitude values of the wideband signal.
    @lowcut, highcut (int): cutoff frequencies for the bandpass filter.
    @sampling_freq (float): sampling frequency of the original signal.
    @order (int): filter order.

    - return (array): Array with the amplitude values of the filtered signal.
    """
    coef_b, coef_a = butter_bandpass(lowcut, highcut, sampling_freq, order)

    return lfilter(coef_b, coef_a, data)
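
The `(b, a)` transfer-function form used above can become numerically sensitive as the filter order grows (an order-9 bandpass has 18 poles), and the SciPy documentation recommends second-order sections for general-purpose filtering. The sketch below shows what such a variant could look like. `butter_bandpass_filter_sos` is a hypothetical helper, not part of this notebook, and the two-tone test signal is made up for the example.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def butter_bandpass_filter_sos(data, lowcut, highcut, sampling_freq, order=5, axis=0):
    """Hypothetical SOS-based counterpart of butter_bandpass_filter (assumption, not the notebook's code)."""
    sos = butter(order, [lowcut, highcut], btype="band", fs=sampling_freq, output="sos")
    return sosfilt(sos, data, axis=axis)

fs = 2048
t = np.arange(4 * fs) / fs                    # 4 seconds of samples
in_band = np.sin(2 * np.pi * 150 * t)         # 150 Hz tone, inside the ripple band
out_band = np.sin(2 * np.pi * 20 * t)         # 20 Hz tone, outside the ripple band

filtered = butter_bandpass_filter_sos(in_band + out_band, 80, 250, fs, order=9)

# Compare RMS values after the first second, once the start-up transient has decayed
steady = slice(fs, None)
rms_in_band = np.sqrt(np.mean(in_band[steady] ** 2))
rms_filtered = np.sqrt(np.mean(filtered[steady] ** 2))
print(f"in-band tone RMS: {rms_in_band:.4f}, filtered signal RMS: {rms_filtered:.4f}")
```

If the filter behaves as intended, the filtered signal keeps roughly the energy of the 150 Hz tone alone. Like `lfilter`, `sosfilt` is causal and introduces a phase delay; `scipy.signal.sosfiltfilt` is the zero-phase alternative for offline analysis.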
    

## Define Global Parameters of the Experiment

In [149]:
sampling_rate = 2048    # 2048 Hz
input_duration = 120 * (10**3)    # 120000 ms or 120 seconds
num_samples = recorded_data.shape[0]    # 2048 * 120 = 245760
num_channels = recorded_data.shape[1]   # 960

x_step = 1/sampling_rate * (10**3)  # 0.48828125 ms

### Extract a window of channels from the SEEG data
Let's define the window first.

If we want to extract a single channel, set the variable `is_single_channel` to `True` and the variable `min_channel_idx` to the desired channel number.

In [150]:
is_single_channel = False   # Set to True if you want to use only one channel

# Define the window of channels to be used
BRAIN_REGION_IDX = 1
BRAIN_REGION_OFFSET = BRAIN_REGION_IDX * 120
SNR_OFFSET = 90     # Choose the highest SNR (channels 90-119 within the region)
min_channel_idx = BRAIN_REGION_OFFSET + SNR_OFFSET
max_channel_idx = min_channel_idx + 30

if is_single_channel:
    # Set the window to size 1
    max_channel_idx = min_channel_idx + 1

In [151]:
from utils.io import preview_np_array
seeg_window = recorded_data[:, min_channel_idx:max_channel_idx]

preview_np_array(seeg_window, "SEEG Window")

SEEG Window Shape: (245760, 30).
Preview: [[ 9.6835774e-01  7.7345985e-01  7.6677883e-01  6.1160040e-01
  -8.8918343e-02 ...  1.3314213e+00  5.2814152e-02 -1.0629585e+00
   1.7152629e+00 -7.4064404e-01]
 [ 1.1097224e+00  2.0653048e+00  1.7075658e+00  1.0589859e+00
   4.1997379e-01 ...  5.9186798e-01  6.9132549e-01 -8.0105793e-01
   2.2942369e+00 -1.2228327e+00]
 [ 2.5909829e+00  1.7661601e+00  3.0228820e+00  9.3765438e-01
   2.8263158e-01 ...  1.2598464e+00  3.9316627e-01 -6.9141585e-01
   2.7291439e+00  7.2643161e-01]
 [ 3.4566371e+00  2.4618118e+00  3.3156486e+00  2.8687816e+00
   3.8806129e-01 ... -1.7202418e-01 -4.0790819e-02 -1.3614488e+00
   2.9467266e+00 -4.7984955e-01]
 [ 3.6298239e+00  3.0240128e+00  5.3321171e+00  4.7688065e+00
  -7.6529050e-01 ... -6.7479324e-01  1.1853703e-01 -1.5241061e+00
   2.5720155e+00  2.7392292e-01]
 ...
 [ 2.3135878e+01  4.8931633e+01  3.5224194e+01 -5.6878113e+01
   7.1059384e+00 ... -4.2149609e+01 -3.0201139e+01  3.3220646e+01
  -1.0734201e+01  1.

## Apply the Butterworth filter to each channel

In [152]:
# Apply the Butterworth filter to the window of channels in the Ripple Band
ripple_lowcut_freq = 80
ripple_highcut_freq = 250
BUTTER_FILTER_ORDER = 9

ripple_band_seeg_window = [ butter_bandpass_filter(seeg_window[:, i], ripple_lowcut_freq, ripple_highcut_freq, sampling_rate, BUTTER_FILTER_ORDER) for i in range(seeg_window.shape[1]) ]
ripple_band_seeg_window = np.array(ripple_band_seeg_window).T
preview_np_array(ripple_band_seeg_window, "Ripple Band SEEG Window", edge_items=3)

Ripple Band SEEG Window Shape: (245760, 30).
Preview: [[ 1.44691902e-06  1.15570282e-06  1.14572004e-06 ... -1.58827133e-06
   2.56294383e-06 -1.10666946e-06]
 [ 2.13031407e-05  1.87770931e-05  1.81070225e-05 ... -2.27610921e-05
   3.82254449e-05 -1.68525406e-05]
 [ 1.52432562e-04  1.45216651e-04  1.38967373e-04 ... -1.55646279e-04
   2.73891591e-04 -1.20129523e-04]
 ...
 [ 3.92131829e-01  2.53294667e+00 -1.61170218e+00 ... -1.66417493e+00
  -1.26089363e+00  2.79554420e-01]
 [ 6.56078410e-01  3.00621041e+00 -2.24920785e+00 ... -7.83048151e-01
  -2.38567670e+00  3.61910718e-01]
 [ 1.06512350e+00  2.94551573e+00 -2.64752234e+00 ...  1.46409238e-01
  -3.16311593e+00  2.17212640e-01]]


In [153]:
# Apply the Butterworth filter to the window of channels in the Fast Ripple Band
fr_lowcut_freq = 250
fr_highcut_freq = 500

fr_band_seeg_window = [ butter_bandpass_filter(seeg_window[:, i], fr_lowcut_freq, fr_highcut_freq, sampling_rate, BUTTER_FILTER_ORDER) for i in range(seeg_window.shape[1]) ]
fr_band_seeg_window = np.array(fr_band_seeg_window).T
preview_np_array(fr_band_seeg_window, "FR Band SEEG Window", edge_items=3)

FR Band SEEG Window Shape: (245760, 30).
Preview: [[ 2.80009862e-05  2.23653281e-05  2.21721399e-05 ... -3.07364566e-05
   4.95984597e-05 -2.14164276e-05]
 [ 1.99537996e-04  1.93467647e-04  1.81967995e-04 ... -2.06971117e-04
   3.62944891e-04 -1.63432224e-04]
 [ 4.39056782e-04  5.45779942e-04  5.19070343e-04 ... -3.47581087e-04
   7.80731229e-04 -3.22186277e-04]
 ...
 [ 9.96073296e-01  5.22409334e-01  8.91212760e-02 ...  5.44964048e-02
  -3.23298156e-01 -2.31611379e-03]
 [ 3.70795050e-01 -9.22678456e-02  7.12282639e-01 ...  8.55518893e-02
  -4.94126390e-01  4.93363841e-01]
 [-4.20540216e-01 -2.94000993e-01  7.46649220e-01 ...  2.16137002e-01
  -2.02953484e-01  7.06477583e-01]]


Apply the Butterworth filter in the combined Ripple+FR Band

In [154]:
# Apply the Butterworth filter to the window of channels in the Combined Ripple and Fast Ripple Band
both_band_seeg_window = [ butter_bandpass_filter(seeg_window[:, i], ripple_lowcut_freq, fr_highcut_freq, sampling_rate, BUTTER_FILTER_ORDER) for i in range(seeg_window.shape[1]) ]
both_band_seeg_window = np.array(both_band_seeg_window).T
preview_np_array(both_band_seeg_window, "Both Bands SEEG Window", edge_items=3)

Both Bands SEEG Window Shape: (245760, 30).
Preview: [[ 1.23342556e-03  9.85178420e-04  9.76668603e-04 ... -1.35392128e-03
   2.18478048e-03 -9.43379958e-04]
 [ 1.17188033e-02  1.08618422e-02  1.03350798e-02 ... -1.23323947e-02
   2.11761606e-02 -9.43953317e-03]
 [ 4.89985753e-02  5.12966914e-02  4.88565274e-02 ... -4.66048632e-02
   8.79189093e-02 -3.80077474e-02]
 ...
 [ 2.63210851e+00  2.64870860e+00 -3.09298858e+00 ...  3.10937766e-01
  -2.41210200e+00 -9.81901531e-01]
 [ 1.85221251e+00  2.34025783e+00 -3.61656477e+00 ...  1.01354877e-01
  -1.70941771e+00 -1.48195304e+00]
 [ 8.16442552e-01  1.40049197e+00 -3.17550705e+00 ...  9.22226037e-02
  -1.39753644e+00 -2.06974462e+00]]
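
The three cells above filter one column at a time and transpose the result. `lfilter` also accepts an `axis` argument, so the whole `(samples, channels)` window can be filtered in a single call. The sketch below checks that both approaches agree on random stand-in data; the array size and the filter order (3) are arbitrary choices for the example.

```python
import numpy as np
from scipy.signal import butter, lfilter

rng = np.random.default_rng(0)
fs = 2048
window = rng.standard_normal((fs, 4))   # stand-in for seeg_window: (samples, channels)

b, a = butter(3, [80, 250], btype="band", fs=fs)

# Per-channel loop, as done in the cells above
per_channel = np.array([lfilter(b, a, window[:, i]) for i in range(window.shape[1])]).T

# Single call filtering along the time axis
all_at_once = lfilter(b, a, window, axis=0)

print("Same result:", np.allclose(per_channel, all_at_once))
```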


## Import the Markers (Annotated Events) 
The markers are stored in a numpy structured array of shape (num_channels, num_events):
- Each row holds the events of one channel
- Each event is composed of 3 fields: `label`, `position` and `duration`

In [155]:
markers_seeg_file_name = "seeg_synthetic_humans_markers.npy"
markers = np.load(f"{PATH_TO_FILE}data/{markers_seeg_file_name}")

preview_np_array(markers, "Markers", edge_items=3)

Markers Shape: (960, 42).
Preview: [[('Spike+Ripple+Fast-Ripple',   1000.  , 0.)
  ('Spike+Ripple+Fast-Ripple',   4537.6 , 0.)
  ('Ripple+Fast-Ripple',   7610.84, 0.) ... ('Ripple', 113024.  , 0.)
  ('Fast-Ripple', 116549.  , 0.) ('Spike+Ripple', 119000.  , 0.)]
 [('Spike+Fast-Ripple',   1000.  , 0.)
  ('Spike+Ripple+Fast-Ripple',   3849.12, 0.)
  ('Ripple+Fast-Ripple',   7010.25, 0.) ...
  ('Fast-Ripple', 114176.  , 0.) ('Spike+Fast-Ripple', 116672.  , 0.)
  ('Fast-Ripple', 119000.  , 0.)]
 [('Fast-Ripple',   1000.  , 0.)
  ('Spike+Ripple+Fast-Ripple',   4357.42, 0.)
  ('Fast-Ripple',   7062.01, 0.) ... ('Spike+Fast-Ripple', 113759.  , 0.)
  ('Ripple+Fast-Ripple', 116295.  , 0.) ('Spike', 119000.  , 0.)]
 ...
 [('Spike+Fast-Ripple',   1000.  , 0.) ('Spike',   3671.88, 0.)
  ('Fast-Ripple',   6912.6 , 0.) ... ('Ripple', 114088.  , 0.)
  ('Spike', 116028.  , 0.) ('Spike+Ripple', 119000.  , 0.)]
 [('Spike+Fast-Ripple',   1000.  , 0.) ('Fast-Ripple',   3782.23, 0.)
  ('Ripple+Fast-Ripple'
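
As a small illustration of working with records like the ones previewed above, the sketch below builds a toy structured array and derives two quantities from it. The field names `label`, `position` and `duration` are the ones the plotting code later in this notebook indexes; the dtype widths are assumptions, and positions are assumed to be in milliseconds, in line with the time axis of the plot.

```python
import numpy as np

# Assumed dtype: only the field names come from the notebook, the widths are guesses
marker_dtype = np.dtype([("label", "U32"), ("position", "f8"), ("duration", "f8")])

toy_markers = np.array(
    [
        [("Ripple", 1000.0, 0.0), ("Fast-Ripple", 4537.6, 0.0)],
        [("Spike", 1000.0, 0.0), ("Ripple+Fast-Ripple", 7610.84, 0.0)],
    ],
    dtype=marker_dtype,
)

sampling_rate = 2048  # Hz, as defined in the global parameters

# Convert marker positions (assumed ms) to the nearest sample index
sample_idx = np.round(toy_markers["position"] / 1000 * sampling_rate).astype(int)

# Flag events whose label contains an HFO component (Ripple or Fast-Ripple)
is_hfo = np.char.find(toy_markers["label"], "Ripple") >= 0

print("Sample indices:\n", sample_idx)
print("Contains an HFO:\n", is_hfo)
```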

### Define the set of channels the markers will be extracted from

In [156]:
channels_used = set(range(min_channel_idx, max_channel_idx))
print("Channels used: ", channels_used)

Channels used:  {210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239}


## Visualize the filtered signals

In [157]:
# Interactive Plot for the HFO detection
# bokeh docs: https://docs.bokeh.org/en/2.4.1/docs/first_steps/first_steps_1.html

from utils.line_plot import create_fig  # Import the function to create the figure
from bokeh.models import Range1d

# Define the x and y values
# Should the first input start at 0 or x_step?
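# Build the time axis (ms) from integer sample indices: np.arange with a float step
# can yield an off-by-one length due to rounding, whereas this always gives num_samples points
x = (np.arange(1, num_samples + 1) * x_step).tolist()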

## Create the Plot

In [158]:
# Create the plot
# List of tuples containing the y values and the legend label
hfo_y_arrays = []

PLOT_RIPPLE_BAND = False
PLOT_FR_BAND = False
PLOT_BOTH_BAND = True

if is_single_channel:
    # Add the Ripple and FR bands of the single channel
    hfo_y_arrays.append((ripple_band_seeg_window[:, 0], f"Ripple Band Ch. {min_channel_idx}"))
    hfo_y_arrays.append((fr_band_seeg_window[:, 0], f"Fast Ripple Band Ch. {min_channel_idx}"))
else:
    # Add the Ripple, FR and both bands of each channel in the range defined below
    min_hfo_idx = 0
    max_hfo_idx = 8
    if PLOT_RIPPLE_BAND:
        for hfo_idx in range(min_hfo_idx, max_hfo_idx, 1):
            hfo_y_arrays.append((ripple_band_seeg_window[:, hfo_idx], f"Ripple Band Ch. {min_channel_idx + hfo_idx}"))
    if PLOT_FR_BAND:
        for hfo_idx in range(min_hfo_idx, max_hfo_idx, 1):
            hfo_y_arrays.append((fr_band_seeg_window[:, hfo_idx], f"Fast Ripple Band Ch. {min_channel_idx + hfo_idx}"))
    if PLOT_BOTH_BAND:
        for hfo_idx in range(min_hfo_idx, max_hfo_idx, 1):
            hfo_y_arrays.append((both_band_seeg_window[:, hfo_idx], f"Both Bands Ch. {min_channel_idx + hfo_idx}"))


# Create the SEEG Voltage plot
hfo_plot = create_fig(
    title="SEEG Voltage dynamics of Filtered Both Bands", 
    x_axis_label='time (ms)', 
    y_axis_label='Voltage (μV)',
    x=x, 
    y_arrays=hfo_y_arrays, 
    sizing_mode="stretch_both", 
    tools="pan, box_zoom, wheel_zoom, hover, undo, redo, zoom_in, zoom_out, reset, save",
    tooltips="Data point @x: @y",
    legend_location="top_right",
    legend_bg_fill_color="navy",
    legend_bg_fill_alpha=0.1,
    # y_range=Range1d(-0.05, 1.05)
)

# If there are more than 30 channels, hide the legend
if len(hfo_y_arrays) > 30:
    # Hide the legend
    hfo_plot.legend.visible = False

## Add Box Annotations to the plot to identify the marked HFOs (ground truth)

In [159]:
from bokeh.models import BoxAnnotation
# from utils.line_plot import color_map

show_markers = False    # Boolean to show the markers

color_map = {                  
    'Spike': 'red',
    'Fast-Ripple': 'blue',
    'Ripple': 'green',  
    'Spike+Ripple': 'yellow',
    'Spike+Fast-Ripple': 'pink',
    'Ripple+Fast-Ripple': 'cyan',
    'Spike+Ripple+Fast-Ripple': 'black'
}

confidence_range = 100          # TODO: Check this value. When the duration is missing (0), we consider the 200ms window around the marked position 
visited_markers = {}    # Avoid inserting multiple boxes for the same marker (only one of each label)
use_visited = False     # Boolean controlling if we remove duplicate markers
plot_instant = True     # Boolean to plot the markers as instant events or as boxes
instant_width = 100 # 20       # Width of the instant event for visualization purposes

if show_markers:
    for ch_idx in channels_used:
        channel_markers = markers[ch_idx]
        # print("channel_markers", channel_markers)
        for marker in channel_markers:
            # print("marker:", marker)
            
            if use_visited:
                # Check if the marker has already been visited and skip it if it has
                if marker['position'] in visited_markers:
                    visited_labels = visited_markers[marker['position']]    # Get the labels that already have an annotation for this position
                    if marker['label'] in visited_labels:
                        # print("Skipping marker", marker['position'], marker['label'])
                        continue    # Skip this marker
                    else:
                        visited_labels.append(marker['label'])  # Add the label to the visited labels
                else:
                    visited_markers[marker['position']] = [marker['label']] # Add the marker to the visited markers

            # Add a box annotation for each marker
            has_duration = marker['duration'] > 0
            
            confidence_constant = 0 if plot_instant or has_duration else confidence_range

            left = marker['position'] - confidence_constant
            right = marker['position'] + confidence_constant + instant_width
            box_color = color_map[marker['label']]  # Choose a color according to the label
            
            # if left < min_t or right > max_t:
            #     continue    # Skip this marker
            

            box = BoxAnnotation(left=left, right=right, fill_color=box_color, fill_alpha=0.35)
            # print("Added marker for channel: ", ch_idx, " at position: ", left)
            hfo_plot.add_layout(box)

## Show the Plot

In [160]:
import bokeh.plotting as bplt

showPlot = True
if showPlot:
    bplt.show(hfo_plot)

## Export the plot to a file

In [161]:
export = False
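# Use min_channel_idx in both cases: min_hfo_idx is only defined when is_single_channel is False
file_name = f"filtered_seeg_ch{min_channel_idx}" if is_single_channel else f"filtered_seeg_ch{min_channel_idx}-{max_channel_idx - 1}"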

if export:
    file_path = f"{PATH_TO_FILE}plots/synthetic/{file_name}.html"

    # Customize the output file settings
    bplt.output_file(filename=file_path, title="SEEG Data - Filtered Voltage dynamics across time")

    # Save the plot
    bplt.save(hfo_plot)

## Export the filtered signals to a numpy file

### Get the relevant Markers

In [162]:
# Save the relevant markers in a variable
relevant_markers = markers[min_channel_idx:max_channel_idx]
preview_np_array(relevant_markers, "Relevant Markers", edge_items=3)

Relevant Markers Shape: (30, 42).
Preview: [[('Spike+Fast-Ripple',   1000.  , 0.) ('Spike',   3550.29, 0.)
  ('Spike+Fast-Ripple',   7136.23, 0.) ... ('Spike', 112707.  , 0.)
  ('Ripple', 115751.  , 0.) ('Fast-Ripple', 119000.  , 0.)]
 [('Spike+Ripple+Fast-Ripple',   1000.  , 0.)
  ('Ripple+Fast-Ripple',   3881.35, 0.)
  ('Spike+Fast-Ripple',   6403.32, 0.) ... ('Ripple', 111854.  , 0.)
  ('Fast-Ripple', 115339.  , 0.) ('Spike', 119000.  , 0.)]
 [('Spike+Fast-Ripple',   1000.  , 0.)
  ('Ripple+Fast-Ripple',   3666.02, 0.)
  ('Spike+Ripple+Fast-Ripple',   6256.35, 0.) ...
  ('Spike', 114276.  , 0.) ('Ripple', 116753.  , 0.)
  ('Fast-Ripple', 119000.  , 0.)]
 ...
 [('Spike',   1000.  , 0.) ('Spike',   4355.96, 0.)
  ('Spike',   7033.69, 0.) ... ('Ripple', 113413.  , 0.)
  ('Ripple+Fast-Ripple', 116427.  , 0.)
  ('Spike+Ripple+Fast-Ripple', 119000.  , 0.)]
 [('Ripple',   1000.  , 0.) ('Ripple+Fast-Ripple',   3895.51, 0.)
  ('Spike',   7462.4 , 0.) ...
  ('Spike+Ripple+Fast-Ripple', 112991

In [163]:
EXPORT_FILTERED_SIGNAL = True
file_name = f"filtered_seeg_ch{min_channel_idx}" if is_single_channel else f"filtered_seeg_ch{min_channel_idx}-{max_channel_idx-1}"
if EXPORT_FILTERED_SIGNAL:
    # Export the filtered signals
    np.save(f"{PATH_TO_FILE}results/synthetic/{file_name}_ripple_band.npy", ripple_band_seeg_window)
    np.save(f"{PATH_TO_FILE}results/synthetic/{file_name}_fr_band.npy", fr_band_seeg_window)
    np.save(f"{PATH_TO_FILE}results/synthetic/{file_name}_both_bands.npy", both_band_seeg_window)

    # Export the markers
    np.save(f"{PATH_TO_FILE}results/synthetic/{file_name}_markers.npy", relevant_markers)
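
`np.save` does not create missing directories, so the `results/synthetic/` folder has to exist before the export cell runs. The sketch below shows one way to guard against that and to confirm that both a signal array and a structured marker array survive a save/load round trip. It writes toy data to a temporary directory; the file names and the marker dtype are placeholders, not the notebook's real outputs.

```python
import os
import tempfile

import numpy as np

# Toy stand-ins for a filtered window and its markers (dtype widths are assumptions)
signal_demo = np.arange(12, dtype=np.float32).reshape(6, 2)
markers_demo = np.array(
    [[("Ripple", 1000.0, 0.0)], [("Spike", 2500.0, 0.0)]],
    dtype=[("label", "U32"), ("position", "f8"), ("duration", "f8")],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_dir = os.path.join(tmp_dir, "results", "synthetic")
    os.makedirs(out_dir, exist_ok=True)   # np.save would fail if this folder were missing

    signal_path = os.path.join(out_dir, "demo_both_bands.npy")
    markers_path = os.path.join(out_dir, "demo_markers.npy")
    np.save(signal_path, signal_demo)
    np.save(markers_path, markers_demo)

    # Reload while the temporary directory still exists
    loaded_signal = np.load(signal_path)
    loaded_markers = np.load(markers_path)

print("Signal round trip OK:", np.array_equal(signal_demo, loaded_signal))
print("Marker fields:", loaded_markers.dtype.names)
```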