This notebook lists the functions we used, with a brief description of what each one does and how we used it in our implementation.

In [None]:
import mne, os
import xml.etree.ElementTree as ET
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np

The first set contains functions from existing Python packages (os and mne). The main documentation pages for these packages are:
https://docs.python.org/3/library/os.html
https://mne.tools/stable/index.html

The next cell builds the file paths with the os package. data_path points to an EDF file, which contains the signal data we want to read in; meta_path points to the accompanying .rml file.
Documentation: https://docs.python.org/3/library/os.path.html#module-os.path

In [None]:
data_root = os.path.join(os.getcwd(), 'Examples 2024', '00000016-APDx20974') # this is an example participant ID
data_path = os.path.join(data_root, '00000016-APDx20974[001].edf')
meta_path = os.path.join(data_root, '00000016-APDx20974.rml')

The next cell loads the EDF file chosen in the previous cell into 'raw' and prints information about it. We then select the channels we want to plot or inspect; the selection is stored in 'selected_channels'.
This is done with the mne Python package. Documentation:
https://mne.tools/stable/generated/mne.io.read_raw_edf.html
https://mne.tools/stable/generated/mne.Info.html

Sample implementation: 

In [None]:
# Load an EDF file
raw = mne.io.read_raw_edf(data_path, preload=True) # data path is the file's location

# Print information about the file
print(raw.info)

# Print all channel names to review them
print(raw.info['ch_names'])
# the channel names tell us which signals we are looking at

# Subset to only the EEG channels and print general data.
# pick() modifies the object in place, so we work on a copy to keep 'raw' intact.
selected_channels = raw.copy().pick(['EEG C3-A2', 'EEG C4-A1', 'EEG O1-A2', 'EEG O2-A1'])

display(selected_channels)

Output: 

The next cell plots the EEG channels.
Documentation: https://mne.tools/dev/generated/mne.io.Raw.html#mne.io.Raw.plot (see 'plot')

In [None]:
# Visualize the EEG time series for the selected channels
selected_channels.plot(start=0, duration=30, scalings='auto', title='Selected EEG Channels')
plt.show()

Output: 

The next cell plots the power spectral density (PSD) of one channel, estimated with the multitaper method.
More info on multitapers can be found here: https://ieeexplore.ieee.org/document/6767046

In [None]:
# To visualize the multitaper spectral estimation
# Compute the Power Spectral Density (PSD) using the multitaper method
channel_names = ['EEG C3-A2']
# pick() is the current replacement for the legacy pick_channels() method
selected_data = raw.copy().pick(channel_names)
sfreq = selected_data.info['sfreq']
# Extract the data from the Raw object
data, times = selected_data[:, :]

psds, freqs = mne.time_frequency.psd_array_multitaper(data, sfreq=sfreq, fmin=0.1, fmax=40, adaptive=False, normalization='length', verbose=True)
# Plot the Power Spectral Density (PSD) for the selected EEG channels
plt.semilogy(freqs, psds.T, label='Multitaper PSD')
plt.xlabel('Frequency (Hz)')
plt.ylabel('Power Spectral Density (log scale)')  # semilogy plots raw power on a log axis, not dB
plt.title('Multitaper PSD for EEG C3-A2')
plt.xlim([0.1, 40])  # Adjust the frequency range as needed
plt.legend()
plt.show()

Output: 

We will also use the make_fixed_length_events function from the mne package to split our data into one-hour segments. This is the new idea that Dr. Kinney wants to implement in his paper. Documentation: https://mne.tools/stable/generated/mne.make_fixed_length_events.html#mne.make_fixed_length_events

The next section contains functions that we created ourselves. 

This first function allows us to iterate through a dataframe and return the values in a specified column. In most cases, we use this to return a list of IDs associated with participants. 

In [5]:
def iterframe(frame, col):
    temp = []
    for item in frame[col]:
        temp.append(item)
    print(temp)
    return temp

Example implementation: 

In [3]:
df

Unnamed: 0,Timestamp,ID,Subject's sex at birth
0,2024-02-24 14:17:06.549,00000016-APDx20974,M
1,2024-02-24 14:17:11.560,00000020-APDx20974,F
2,2024-02-24 14:17:15.794,00000040-APDx20067,M
3,2024-02-24 14:17:20.771,00000057-APDx20067,F


In [8]:
IDs = iterframe(df, 'ID')

['00000016-APDx20974', '00000020-APDx20974', '00000040-APDx20067', '00000057-APDx20067']
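
For comparison, pandas can build the same list without an explicit loop. The sketch below uses a small stand-in dataframe (two rows copied from the table above) and `Series.tolist()`. The names `column_values` and `example_df` are ours for illustration and are not part of the original notebook.

```python
import pandas as pd

def column_values(frame, col):
    """Return the values of one column as a plain Python list."""
    return frame[col].tolist()

# Stand-in dataframe with two rows from the example table
example_df = pd.DataFrame({
    'ID': ['00000016-APDx20974', '00000020-APDx20974'],
    "Subject's sex at birth": ['M', 'F'],
})

example_ids = column_values(example_df, 'ID')
print(example_ids)
```

Unlike iterframe, this version does not print the list as a side effect, which may be preferable when the list is long.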


The next function is used to filter the dataset based on a specified value in a specified column. It returns a new dataframe that only contains rows that have the desired value. A function like this could be used to filter the dataset based on smoking/non-smoking status. 

In [9]:
def extract_rows(frame, col, val):
    temp = frame[frame[col] == val]
    return temp

Example implementation: 

In [10]:
new_df = extract_rows(df, "Subject's sex at birth", 'M')

In [11]:
new_df

Unnamed: 0,Timestamp,ID,Subject's sex at birth
0,2024-02-24 14:17:06.549,00000016-APDx20974,M
2,2024-02-24 14:17:15.794,00000040-APDx20067,M


We will add several new functions: 

The first will take the median of a subset of the data so we can plot the median power at each frequency for comparison. 
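
A minimal sketch of what the median function could look like, assuming the PSDs of the subset have already been stacked into one array with a row per recording and a column per frequency. The function name `median_psd` and the toy numbers are hypothetical.

```python
import numpy as np

def median_psd(psds):
    """Median power at each frequency across recordings.

    psds: array-like of shape (n_recordings, n_freqs).
    Returns an array of shape (n_freqs,).
    """
    psds = np.asarray(psds, dtype=float)
    return np.median(psds, axis=0)

# Toy example: three recordings, three frequency bins (made-up values)
toy_psds = np.array([[1.0, 2.0, 3.0],
                     [2.0, 4.0, 6.0],
                     [9.0, 1.0, 5.0]])
toy_median = median_psd(toy_psds)
print(toy_median)
```

The median is taken down each column, so one unusually noisy recording has limited influence on the summary curve.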

The second function will standardize the starting power so that each spectrum we plot starts at the same point, and we can look at the differences after that.
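
One possible reading of this step, sketched under our own assumptions: divide each spectrum by its power at the lowest frequency bin, so every curve starts at 1. The name `standardize_start` is hypothetical, and the sketch assumes the first bin is never zero.

```python
import numpy as np

def standardize_start(psds):
    """Scale each spectrum so that its first frequency bin equals 1.

    psds: array-like of shape (n_recordings, n_freqs).
    Assumes the power in the first bin of every row is non-zero.
    """
    psds = np.asarray(psds, dtype=float)
    # psds[:, :1] keeps a column shape so the division broadcasts per row
    return psds / psds[:, :1]

# Toy example with made-up values
toy_spectra = np.array([[2.0, 1.0, 0.5],
                        [8.0, 4.0, 1.0]])
toy_standardized = standardize_start(toy_spectra)
print(toy_standardized)
```

If the spectra are plotted in log units instead, subtracting the first value would play the same role as dividing here.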

The third function we need filters time based on the number of observations present in the dataset. We want to make sure that each sample has the same number of observations, or that we have the same amount of time sleeping for each person when we do comparisons. 
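
As a rough illustration of equalizing the number of observations, the sketch below truncates every recording to the length of the shortest one. It assumes each recording is an array whose last axis is time; the name `trim_to_shortest` is hypothetical, and truncating from the end is only one of several reasonable choices.

```python
import numpy as np

def trim_to_shortest(recordings):
    """Truncate each recording so all share the same number of samples.

    recordings: list of arrays whose last axis is time.
    Returns a new list; the inputs are not modified.
    """
    n_keep = min(rec.shape[-1] for rec in recordings)
    return [rec[..., :n_keep] for rec in recordings]

# Toy example: two recordings with 2 channels and different lengths
toy_recordings = [np.zeros((2, 10)), np.ones((2, 7))]
toy_trimmed = trim_to_shortest(toy_recordings)
print([rec.shape for rec in toy_trimmed])
```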

The last function we will implement blocks time into one-hour chunks. That is the new and cool addition to Dr. Kinney's paper. As mentioned earlier, it will use the make_fixed_length_events function from the mne package.
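
To make the idea of one-hour blocks concrete, here is a NumPy-only stand-in that computes where each full block would start, in sample indices. It is not the mne function itself; in the notebook we would expect to call `mne.make_fixed_length_events` with `duration=3600.0` instead. The name `hour_block_starts` and the 200 Hz sampling rate are assumptions for illustration.

```python
import numpy as np

def hour_block_starts(n_samples, sfreq, block_seconds=3600.0):
    """Start indices of the full fixed-length blocks in a recording.

    n_samples: total number of samples in the recording.
    sfreq: sampling frequency in Hz.
    Returns (starts, block_len); a trailing partial block is dropped.
    """
    block_len = int(round(block_seconds * sfreq))
    n_blocks = n_samples // block_len
    starts = np.arange(n_blocks) * block_len
    return starts, block_len

# Toy example: 2.5 hours of data at an assumed 200 Hz
toy_sfreq = 200.0
toy_n_samples = int(2.5 * 3600 * toy_sfreq)
toy_starts, toy_block_len = hour_block_starts(toy_n_samples, toy_sfreq)
print(toy_starts, toy_block_len)
```

Dropping the trailing partial block keeps every segment the same length, which matters if the blocks are compared against each other.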

Notes: 

We are missing a few functions because we needed to show Dr. Kinney what we had done so far and get his input on our approach before writing everything we need.

We may also need to implement another function that filters out unwanted time segments in the data, for example times when the participant woke up during the night. We are waiting to hear whether these times have already been removed from the data, and will proceed based on the response.
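
If such a filter turns out to be necessary, one simple approach is a boolean mask over the time axis. The sketch below assumes the unwanted segments are available as (start, stop) pairs in seconds; that format, the name `keep_mask`, and the toy intervals are all hypothetical.

```python
import numpy as np

def keep_mask(times, bad_intervals):
    """Boolean mask that is False inside any unwanted interval.

    times: 1-D array of sample times in seconds.
    bad_intervals: iterable of (start, stop) pairs; start is included,
    stop is excluded.
    """
    times = np.asarray(times, dtype=float)
    mask = np.ones(times.shape, dtype=bool)
    for start, stop in bad_intervals:
        mask &= ~((times >= start) & (times < stop))
    return mask

# Toy example: samples at 0..9 s, with two made-up wake periods
toy_times = np.arange(10.0)
toy_mask = keep_mask(toy_times, [(2.0, 4.0), (7.0, 8.0)])
print(toy_times[toy_mask])
```

The same mask can be applied along the time axis of the signal array, for example `data[:, toy_mask]`, to drop the matching samples.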