## Unit operations to access the raw data and label file

This notebook contains test cases and demos for functions and modules in the seizurecast package.

In [8]:
from file_io import * 
import os
import numpy as np
import pandas as pd
import mne
import matplotlib.pyplot as plt

In [10]:
# relabeling config
LEN_PRE = 15
LEN_POS = 60
SEC_GAP = 0

In [13]:
train_path = '../tusz_1_5_2/edf/train'
tcp_type = '01_tcp_ar'
patient_group = '004'
patient = '00000492'
session = 's003_2003_07_18'
token = '00000492_s003_t001'
token_path = os.path.join(train_path, tcp_type, patient_group, patient, session, token)

# Read 1 token file
fsamp_mont, sig_mont, labels_mont = read_1_token(token_path)
np.shape(fsamp_mont), np.shape(sig_mont), np.shape(labels_mont)

((22,), (22, 73600), (22,))

In [14]:
# Sort channels if montages are different
sig_mont = sort_channel(sig_mont, labels_mont, STD_CHANNEL_01_AR)
np.shape(sig_mont)

(22, 73600)

In [15]:
# Intervals that have been annotated
# 00000492_s003_t001 0.0000 33.1425 bckg 1.0000
# 00000492_s003_t001 33.1425 53.0000 seiz 1.0000
# 00000492_s003_t001 53.0000 184.0000 bckg 1.0000
intvs, labls = load_tse_bi(token_path)
np.shape(intvs), np.shape(labls)

((3, 2), (3,))
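The internals of `load_tse_bi` are not shown in this notebook. As a rough illustration of what reading such annotations involves, the sketch below parses lines in the five-field layout quoted in the comments above (token, start, stop, label, probability). The helper name `parse_annotation_lines` is hypothetical and is not part of seizurecast, and real annotation files may carry header lines that this sketch simply skips.

```python
import numpy as np


def parse_annotation_lines(lines):
    """Parse lines of the form '<token> <start> <stop> <label> <prob>'.

    Hypothetical helper for illustration; not the seizurecast implementation.
    """
    intervals, labels = [], []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank, header, or malformed lines
        _, start, stop, label, _ = fields
        intervals.append((float(start), float(stop)))
        labels.append(label)
    return np.array(intervals), np.array(labels)


example_lines = [
    "00000492_s003_t001 0.0000 33.1425 bckg 1.0000",
    "00000492_s003_t001 33.1425 53.0000 seiz 1.0000",
    "00000492_s003_t001 53.0000 184.0000 bckg 1.0000",
]
parsed_intvs, parsed_labls = parse_annotation_lines(example_lines)
print(parsed_intvs.shape, parsed_labls.shape)
```

For the three lines above this gives an interval array of shape (3, 2) and a label array of shape (3,), the same shapes as the cell output.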

In [16]:
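# Relabel intervals by assigning pre- and post-seizure stages
# The pre-seizure stage is the LEN_PRE seconds preceding the seizure (SEC_GAP sets the gap before onset);
# the post-seizure stage is the LEN_POS seconds following it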
intvs, labls = relabel_tse_bi(intvs=intvs, labels=labls, len_pre=LEN_PRE, len_post=LEN_POS, sec_gap=SEC_GAP)
np.shape(intvs), np.shape(labls)

((5, 2), (5,))
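To make the jump from three intervals to five concrete, here is a minimal sketch of one way such relabeling could work. It is an assumption-laden illustration, not the body of `relabel_tse_bi`: the function name `relabel_sketch`, the stage names `'pre'`, `'post'`, and `'gap'`, and the handling of short background intervals are all hypothetical.

```python
def relabel_sketch(intvs, labels, len_pre, len_post, sec_gap=0.0):
    """Carve 'pre' and 'post' stages out of background around seizures.

    Hypothetical helper: stage names and edge-case handling are assumptions.
    """
    new_intvs, new_labels = [], []
    for i, ((start, stop), label) in enumerate(zip(intvs, labels)):
        if label != "bckg":
            new_intvs.append((start, stop))
            new_labels.append(label)
            continue

        # Background that follows a seizure starts with a post-seizure stage.
        post_end = start
        if i > 0 and labels[i - 1] == "seiz":
            post_end = min(start + len_post, stop)

        # Background that precedes a seizure ends with a pre-seizure stage,
        # optionally separated from the onset by a gap of sec_gap seconds.
        pre_start = pre_end = stop
        if i + 1 < len(labels) and labels[i + 1] == "seiz":
            pre_end = max(stop - sec_gap, post_end)
            pre_start = max(pre_end - len_pre, post_end)

        pieces = [
            (start, post_end, "post"),
            (post_end, pre_start, "bckg"),
            (pre_start, pre_end, "pre"),
            (pre_end, stop, "gap"),
        ]
        for a, b, name in pieces:
            if b > a:  # skip empty pieces
                new_intvs.append((a, b))
                new_labels.append(name)
    return new_intvs, new_labels


example_intvs = [(0.0, 33.1425), (33.1425, 53.0), (53.0, 184.0)]
example_labls = ["bckg", "seiz", "bckg"]
new_intvs, new_labls = relabel_sketch(
    example_intvs, example_labls, len_pre=15, len_post=60, sec_gap=0
)
print(new_labls)
```

With the annotations of this token, the first background interval is split into background plus a 15-second pre-seizure stage, and the last one into a 60-second post-seizure stage plus background, which yields five intervals.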

## Segment data into 1 second per piece

1. Chop seizure
   By comparing the sample times (derived from the sampling rate) with the annotated times, we extract chunks of seizure signal.
2. Chop pre-ictal
   Chop from 10 to 20 seconds preceding seizures.
3. Chop background
   Take 10 minutes preceding seizures and 10 minutes after seizures.

In [19]:
# Segment data into 1 second per piece
fsamp = int(np.mean(fsamp_mont))
dataset, labels = signal_to_dataset(raw=sig_mont, fsamp=fsamp, intvs=intvs, labels=labls)
print('before:\t', np.shape(sig_mont))
print('after:\t', np.shape(dataset))
assert np.shape(dataset)[0] == np.shape(labels)[0]

before:	 (22, 73600)
after:	 (181, 22, 400)
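The reshaping from `(channels, samples)` to `(segments, channels, samples_per_segment)` can be sketched with plain numpy. The helpers `segment_signal` and `label_segments` below are hypothetical and only illustrate the idea; they do not reproduce `signal_to_dataset`, which returned 181 segments for this token, whereas a plain one-second split of 73600 samples at 400 Hz would give 184.

```python
import numpy as np


def segment_signal(raw, fsamp, seg_len_sec=1):
    """Cut a (n_channels, n_samples) array into fixed-length segments.

    Returns an array of shape (n_segments, n_channels, seg_len).
    A trailing partial segment is dropped.
    """
    raw = np.asarray(raw)
    n_channels, n_samples = raw.shape
    seg_len = int(fsamp * seg_len_sec)
    n_segments = n_samples // seg_len
    trimmed = raw[:, : n_segments * seg_len]
    # Split the time axis, then move the segment axis to the front.
    return trimmed.reshape(n_channels, n_segments, seg_len).transpose(1, 0, 2)


def label_segments(n_segments, intvs, labels, seg_len_sec=1):
    """Give each segment the label of the interval containing its start time."""
    out = []
    for k in range(n_segments):
        t = k * seg_len_sec
        found = None
        for (start, stop), label in zip(intvs, labels):
            if start <= t < stop:
                found = label
                break
        out.append(found)
    return out


# Toy input: 2 channels, 10 samples, 4 samples per second.
toy_raw = np.arange(20).reshape(2, 10)
toy_segments = segment_signal(toy_raw, fsamp=4)
toy_labels = label_segments(3, [(0.0, 1.5), (1.5, 3.0)], ["a", "b"])
print(toy_segments.shape, toy_labels)
```

Labeling by segment start time is only one possible convention; a segment that straddles an interval boundary could equally be dropped or labeled by majority overlap.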


# Gotcha: montage issue
For each EDF file, we cannot assume that a given channel index (for example, the first channel) always corresponds to the same physical electrode location.
1. Set a standard set of channel labels and a standard order.
   Ideally I would use a data frame, but I will first check what format others have used. The order can be arbitrary, but I will again check what others have used first.
2. Read the EDF file and its montage.
   This can be done using the aforementioned functions from pystream.
3. Convert the EDF reading to the standard format.
   This can be done using numpy and pandas.
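Step 3 can be sketched as a label-based reordering of the signal rows. The function `reorder_channels` below is a hypothetical stand-in, not the `sort_channel` used earlier, and the channel labels in the example are illustrative rather than the contents of `STD_CHANNEL_01_AR`.

```python
import numpy as np


def reorder_channels(signals, labels, standard_labels):
    """Return the rows of `signals` arranged in the order of `standard_labels`.

    signals: array of shape (n_channels, n_samples)
    labels: channel labels of `signals`, one per row, as read from the file
    standard_labels: the desired channel order
    """
    index_of = {label: i for i, label in enumerate(labels)}
    missing = [label for label in standard_labels if label not in index_of]
    if missing:
        raise ValueError(f"channels missing from recording: {missing}")
    order = [index_of[label] for label in standard_labels]
    return np.asarray(signals)[order]


# Illustrative labels only; the recording lists them in a different order.
file_labels = ["T3-T5", "FP1-F7", "F7-T3"]
standard = ["FP1-F7", "F7-T3", "T3-T5"]
file_signals = np.array([[30, 31], [10, 11], [20, 21]])
sorted_signals = reorder_channels(file_signals, file_labels, standard)
print(sorted_signals)
```

Raising on a missing channel is a deliberate choice in this sketch: silently filling gaps would hide exactly the montage mismatch this section warns about.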