# Data Preprocessing

#### This notebook...
* ...extracts the meta-data of a group of .edf files
* ...converts the EEG-data (.edf) into spectrograms and saves them as images.

#### Future Considerations:
* Data cleaning --> remove invalid data
* Conversion of the EEG-data? E.g. a Fourier transformation.
* Make sure that all EEG recordings span the same time frame --> uniform image size and scale
* Should we change the color-mapping?
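
Two of the considerations above, removing invalid data and giving every recording the same time frame, could be sketched roughly as follows. This is only an illustration on synthetic data: the helper names `drop_invalid_channels` and `fix_length` are hypothetical, and treating a channel as "invalid" when it contains non-finite values or is constant is an assumption, not a rule taken from the dataset.

```python
import numpy as np

def drop_invalid_channels(data: np.ndarray) -> np.ndarray:
    """Keep only channels that are finite everywhere and not constant.

    `data` is assumed to have shape (n_channels, n_samples).
    """
    finite = np.isfinite(data).all(axis=1)             # no NaN or inf in the channel
    not_flat = data.max(axis=1) != data.min(axis=1)    # the signal actually varies
    return data[finite & not_flat]

def fix_length(data: np.ndarray, fs: float, seconds: float) -> np.ndarray:
    """Crop or zero-pad every channel to exactly `seconds` of signal."""
    n_target = int(round(fs * seconds))
    n_current = data.shape[1]
    if n_current >= n_target:
        return data[:, :n_target]
    pad = n_target - n_current
    return np.pad(data, ((0, 0), (0, pad)), mode='constant')

# Synthetic stand-in for one recording: 3 channels, 1000 samples each
rng = np.random.default_rng(0)
demo = rng.normal(size=(3, 1000))
demo[1] = 0.0          # a flat (constant) channel
demo[2, 10] = np.nan   # a channel with a missing value

cleaned = drop_invalid_channels(demo)
uniform = fix_length(cleaned, fs=400, seconds=2)
print(cleaned.shape, uniform.shape)
```

Cropping from the start of the recording is the simplest choice; depending on the data, a window taken from the end or the middle may be more appropriate.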

---

In [None]:
import numpy as np
import pandas as pd
import os 
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy import signal
from scipy.fft import fftshift
import math
import mne
from IPython import get_ipython

In [None]:
base_dir = os.getcwd()
print(f'current directory: {base_dir}')

dataset_dir = os.path.join(base_dir, 'datasets/ZhangWamsley2019/Data/PSG')
print(f'datasets directory: {dataset_dir}')

### Look at the file meta-data

In [65]:
file_ex = 'subject010_Morning.edf'

data = mne.io.read_raw_edf(os.path.join(dataset_dir, file_ex), verbose=False)
raw_data = data.get_data()
info = data.info
channels = data.ch_names

# Display meta-data
info

| Field | Value |
|---|---|
| Measurement date | June 01, 2015 07:26:08 GMT |
| Experimenter | Unknown |
| Participant | 10 |
| Digitized points | Not available |
| Good channels | 63 EEG |
| Bad channels | |
| EOG channels | Not available |
| ECG channels | Not available |
| Sampling frequency | 400.00 Hz |
| Highpass | 0.00 Hz |
| Lowpass | 200.00 Hz |


In [62]:
# As seen in the meta-data, the sampling frequency is 400.00 Hz
freq = 400
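
Instead of hard-coding the value, the sampling frequency could also be read from the recording's meta-data, where MNE stores it under the `'sfreq'` key of `info`. A minimal sketch, assuming an MNE `Raw` object such as `data` from the cell above (the helper name `sampling_frequency` is hypothetical):

```python
def sampling_frequency(raw) -> float:
    """Return the sampling frequency in Hz stored in an MNE Raw object's meta-data."""
    return float(raw.info['sfreq'])

# In this notebook the call would look like:
# freq = sampling_frequency(data)
```

This keeps `freq` correct if a file in the dataset was recorded at a different rate.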

### Convert EEG-data (.edf) to images

In [None]:
def read_edf_data(file_path) -> np.ndarray:
  """ Reads EEG-data in .edf format and returns it as an array of shape (n_channels, n_samples).
  """
  data = mne.io.read_raw_edf(file_path, verbose=False)
  raw_data = data.get_data()
  return raw_data

# ------------------------------------------------------------------------------------

def create_spectrogram(data, fs, impath='', save=False) -> None:
  """ 
  Converts EEG-data into a spectrogram image.
  :param data: numpy array of EEG-data (plt.specgram expects a 1-D signal, i.e. a single channel).
  :param fs: the sampling frequency of the EEG-data in Hz.
  :param impath: path where the image will be saved.
  :param save: determines if the image will be saved or just shown.
  """
  plt.specgram(data, Fs=fs, NFFT=1024)
  if save:
    plt.axis('off')
    plt.savefig(impath, dpi=300, pad_inches=0.0, transparent=True, bbox_inches='tight')
    plt.close()  # start the next spectrogram on a fresh figure
  else:
    plt.show()

# ------------------------------------------------------------------------------------

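def convert_all(dataset_dir, base_dir, freq):
  """
  Converts every .edf file in dataset_dir into a spectrogram image saved under base_dir/datasets/images.
  """
  image_dir = os.path.join(base_dir, 'datasets/images')
  os.makedirs(image_dir, exist_ok=True)
  for filename in os.listdir(dataset_dir):
    if not filename.endswith('.edf'):
      continue  # skip anything that is not an EDF recording
    file_path = os.path.join(dataset_dir, filename)
    data = read_edf_data(file_path)
    create_spectrogram(data, freq, os.path.join(image_dir, filename[:-4]), True)

In [None]:
# Convert all .edf files located at `dataset_dir`
convert_all(dataset_dir, base_dir, freq)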
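
One thing to keep in mind with the conversion above: `get_data()` returns an array of shape (n_channels, n_samples), whereas `specgram` is documented for a 1-D signal. A per-channel variant might look like the sketch below, which also touches on the open color-mapping and scale questions by passing an explicit `cmap` and fixed `vmin`/`vmax`. The function name `channel_spectrogram_png`, the `'magma'` colormap, the dB limits, and the synthetic two-channel signal are all assumptions made for illustration, not choices taken from this notebook.

```python
import io

import numpy as np
from matplotlib.figure import Figure

def channel_spectrogram_png(signal_1d, fs, cmap='viridis', vmin=None, vmax=None, nfft=1024):
    """Render a single channel as a spectrogram and return the PNG bytes."""
    fig = Figure()                      # standalone figure, no global pyplot state
    ax = fig.subplots()
    ax.specgram(signal_1d, Fs=fs, NFFT=nfft, cmap=cmap, vmin=vmin, vmax=vmax)
    ax.axis('off')
    buffer = io.BytesIO()
    fig.savefig(buffer, format='png', dpi=100, bbox_inches='tight', pad_inches=0.0)
    return buffer.getvalue()

# Synthetic stand-in for a recording: two noisy sine channels, 10 s at 400 Hz
fs = 400
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
demo = np.vstack([
    np.sin(2 * np.pi * 10 * t),
    np.sin(2 * np.pi * 40 * t),
]) + 0.01 * rng.normal(size=(2, t.size))

# One image per channel, all sharing the same colormap and color limits (in dB)
images = [channel_spectrogram_png(channel, fs, cmap='magma', vmin=-120, vmax=0)
          for channel in demo]
print(len(images))
```

Using the same `vmin` and `vmax` for every image means a given color corresponds to the same power level across recordings; suitable limits would have to be chosen from the real data.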

Notebook by Alicia HH Larsen

MedAI, Artificial Intelligence Track, TU/e Honors

2024-03-21