### Experiment: Pre-processing and frequency sampling

**Question**: Is it possible to train a model on unpreprocessed EEG data and still attain similar performance levels?

**Hypothesis**: The model will perform worse, but if still similar then the added value of not having to (manually) preprocess EEG data is very valuable and opens up a multitude of applications.

**Result**:

#### Part 1: Preparing data
To use hmp.utils.read_mne_data() and epoch the information, the files should be in .fif format, this replicates automated preprocessing as done in https://github.com/GWeindel/hsmm_mvpy/blob/main/tutorials/sample_data/eeg/0022.ipynb excepting resampling to 100Hz

In [1]:
import mne
from pathlib import Path
import hsmm_mvpy as hmp
import pandas as pd
import numpy as np
import xarray as xr
from shared.data import add_stage_dimension

2023-10-21 11:47:27.610894: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [2]:
# Set up paths and file locations
data_path = Path("/mnt/d/thesis/sat1/")
behavioral_data_path = data_path / "ExperimentData/ExperimentData"
output_path = Path("data/sat1/unpreprocessed")

subj_ids = [
    subj_id.name.split("-")[1][:4] for subj_id in (data_path / "eeg4").glob("*.vhdr")
]
subj_files = [
    str(output_path / f"unprocessed_{subj_id}_epo.fif") for subj_id in subj_ids
]
behavioral_files = [
    str(behavioral_data_path / f"{subj_id}-cnv-sat3_ET.csv") for subj_id in subj_ids
]

In [None]:
# Replacing preprocessing done in https://github.com/GWeindel/hsmm_mvpy/blob/main/tutorials/sample_data/eeg/0022.ipynb
# with only the necessary (non-manual) parts, like adding metadata for processing in HMP package, more info in link above
for subject_id in subj_ids:
    print(f"Processing subject: {subject_id}")
    subject_id_short = subject_id.replace("0", "")
    raw = mne.io.read_raw_brainvision(
        data_path / "eeg4" / f"MD3-{subject_id}.vhdr", preload=False
    )
    raw.set_channel_types(
        {"EOGh": "eog", "EOGv": "eog", "A1": "misc", "A2": "misc"}
    )  # Declare type to avoid confusion with EEG channels
    raw.rename_channels({"FP1": "Fp1", "FP2": "Fp2"})  # Naming convention
    raw.set_montage("standard_1020")  # Standard 10-20 electrode montage
    raw.rename_channels({"Fp1": "FP1", "Fp2": "FP2"})

    behavioral_path = behavioral_data_path / f"{subject_id}-cnv-sat3_ET.csv"
    behavior = pd.read_csv(behavioral_path, sep=";")[
        [
            "stim",
            "resp",
            "RT",
            "cue",
            "movement",
        ]
    ]
    behavior["movement"] = behavior.apply(
        lambda row: "stim_left"
        if row["movement"] == -1
        else ("stim_right" if row["movement"] == 1 else np.nan),
        axis=1,
    )
    behavior["resp"] = behavior.apply(
        lambda row: "resp_left"
        if row["resp"] == 1
        else ("resp_right" if row["resp"] == 2 else np.nan),
        axis=1,
    )
    # Merging together the exeperimental conditions info to have the format condition/stimulus/response
    behavior["trigger"] = (
        behavior["cue"] + "/" + behavior["movement"] + "/" + behavior["resp"]
    )
    # Filtering out < 300 and > 3000 Reaction times
    behavior["RT"] = behavior.apply(
        lambda row: 0
        if row["RT"] < 300
        else (0 if row["RT"] > 3000 else float(row["RT"]) / 1000),
        axis=1,
    )
    epochs = mne.io.read_epochs_fieldtrip(
        data_path / "eeg1" / f"data{subject_id_short}.mat", info=raw.info
    )
    epochs.rename_channels({"FP1": "Fp1", "FP2": "Fp2"})  # Naming convention
    epochs.set_montage("easycap-M1")
    epochs.filter(1, 35)  # Bandwidth filter from van Maanen, Portoles & Borst (2021)
    epochs.crop(tmin=-0.250)
    epochs.set_eeg_reference("average")
    epochs.metadata = behavior
    epochs.save(
        output_path / f"unprocessed_{subject_id}_epo.fif", overwrite=True, verbose=False
    )  # Saving EEG mne format

In [4]:
output_path_data = Path("data/sat1/data_unprocessed_500hz.nc")
# Run if data_unprocessed.nc does not exist or should be rewritten
data = hmp.utils.read_mne_data(
    subj_files,
    epoched=True,
    lower_limit_RT=0.2,
    upper_limit_RT=2,
    verbose=False,
    subj_idx=subj_ids,
    rt_col="RT",
)
data.to_netcdf(output_path_data)

Processing participant data/sat1/unpreprocessed/unprocessed_0001_epo.fif's epoched eeg
198 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0001_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0002_epo.fif's epoched eeg
200 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0002_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0003_epo.fif's epoched eeg
191 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0003_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0004_epo.fif's epoched eeg
200 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0004_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0005_epo.fif's epoched eeg
190 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0005_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0006_epo.fif's epoched eeg
200 trials were retaine

In [5]:
output_path_data = Path("data/sat1/data_unprocessed_100hz.nc")
# Run if data_unprocessed.nc does not exist or should be rewritten
data = hmp.utils.read_mne_data(
    subj_files,
    epoched=True,
    lower_limit_RT=0.2,
    upper_limit_RT=2,
    sfreq=100,
    verbose=False,
    subj_idx=subj_ids,
    rt_col="RT",
)
data.to_netcdf(output_path_data)

Processing participant data/sat1/unpreprocessed/unprocessed_0001_epo.fif's epoched eeg
198 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0001_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0002_epo.fif's epoched eeg
200 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0002_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0003_epo.fif's epoched eeg
191 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0003_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0004_epo.fif's epoched eeg
200 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0004_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0005_epo.fif's epoched eeg
190 trials were retained for participant data/sat1/unpreprocessed/unprocessed_0005_epo.fif
Processing participant data/sat1/unpreprocessed/unprocessed_0006_epo.fif's epoched eeg
200 trials were retaine

#### Use information from stage_data to split unprocessed data

##### 500Hz

In [7]:
data_path = Path("data/sat1/stage_data.nc")
merge_dataset = xr.load_dataset(Path("data/sat1/data_unprocessed_500hz.nc"))
output_data = add_stage_dimension(data_path, merge_dataset)

Finding stage changes
Combining segments


In [None]:
output_path = Path("data/sat1/split_stage_data_unprocessed_500hz.nc")
output_data.to_netcdf(output_path)

##### 100Hz

In [4]:
data_path = Path("data/sat1/stage_data.nc")
merge_dataset = xr.load_dataset(Path("data/sat1/data_unprocessed_100hz.nc"))
output_data = add_stage_dimension(data_path, merge_dataset)

Finding stage changes


Combining segments


In [None]:
output_path = Path("data/sat1/split_stage_data_unprocessed_100hz.nc")
output_data.to_netcdf(output_path)

### Part 2: Experiment

In [1]:
import tensorflow as tf
import gc
from pathlib import Path
from shared.data import add_stage_dimension
from shared.training import split_data_on_participants, train_and_evaluate, k_fold_cross_validate, get_compile_kwargs
from shared.normalization import *
from shared.models import SAT1Base, SAT1Topological, SAT1Deep
from shared.utilities import print_results
%env TF_FORCE_GPU_ALLOW_GROWTH=true
%env TF_GPU_ALLOCATOR=cuda_malloc_async

2023-10-22 12:42:41.782883: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


env: TF_FORCE_GPU_ALLOW_GROWTH=true
env: TF_GPU_ALLOCATOR=cuda_malloc_async


In [2]:
logs_path = Path("logs/exp_preprocessing/")

##### 2a: Processed 100Hz (control)

In [3]:
data_path = Path("data/sat1/split_stage_data.nc")
data = xr.load_dataset(data_path)

In [4]:
tf.keras.backend.clear_session()
model = SAT1Base(len(data.channels), len(data.samples), len(data.labels))
model.compile(**get_compile_kwargs())
train_kwargs = {
    "logs_path": logs_path,
    "additional_info": {"preprocessing": "default_100hz"},
    "additional_name": f"preprocessing-default_100hz",
}
results = k_fold_cross_validate(
    data, model, 5, normalization_fn=norm_dummy, train_kwargs=train_kwargs
)
print_results(results)
del model
gc.collect()

2023-10-22 11:52:16.341668: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 11:52:16.370608: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 11:52:16.370695: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 11:52:16.373135: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 11:52:16.373196: I tensorflow/compile

Fold 1: test fold: ['0009' '0017' '0001' '0024' '0012']
Epoch 1/20


2023-10-22 11:52:19.943660: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8600
2023-10-22 11:52:20.724324: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:606] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-10-22 11:52:21.079425: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f76cae28870 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-10-22 11:52:21.079465: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-10-22 11:52:21.086161: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:255] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-10-22 11:52:21.194820: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the p

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Fold 1: Accuracy: 0.8353838582677166
Fold 1: F1-Score: 0.8330978666130642
Fold 2: test fold: ['0010' '0014' '0002' '0023' '0006']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Fold 2: Accuracy: 0.84375
Fold 2: F1-Score: 0.8436405304267435
Fold 3: test fold: ['0003' '0013' '0016' '0004' '0005']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Fold 3: Accuracy: 0.811495983935743
Fold 3: F1-Score: 0.8151901275123891
Fold 4: test fold: ['0021' '0018' '0022' '0019' '0025']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Fold 4: Accuracy: 0.8394709543568465
Fold 4: F1-Score: 0.8426136161138619
Fold 5: test fold: ['0008' '0011' '0015' '0020' '0007']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Fold 5: Accuracy: 0.835
Fold 5: F1-Score: 0.83810667

2738

##### 2b: Unprocessed 100Hz

In [3]:
data_path = Path("data/sat1/split_stage_data_unprocessed_100hz.nc")
data = xr.load_dataset(data_path)

In [4]:
tf.keras.backend.clear_session()
model = SAT1Base(len(data.channels), len(data.samples), len(data.labels))
model.compile(**get_compile_kwargs())
train_kwargs = {
    "logs_path": logs_path,
    "additional_info": {"preprocessing": "unprocessed_100hz"},
    "additional_name": f"preprocessing-unprocessed_100hz",
}
results = k_fold_cross_validate(
    data, model, 5, normalization_fn=norm_dummy, train_kwargs=train_kwargs
)
print_results(results)
del model
gc.collect()

2023-10-22 12:03:15.749600: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 12:03:15.781017: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 12:03:15.781098: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 12:03:15.783884: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 12:03:15.783955: I tensorflow/compile

Fold 1: test fold: ['0009' '0017' '0001' '0024' '0012']
Epoch 1/20


2023-10-22 12:03:19.193036: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8600
2023-10-22 12:03:19.887180: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:606] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-10-22 12:03:20.207850: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f46992122a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-10-22 12:03:20.207886: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-10-22 12:03:20.214038: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:255] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-10-22 12:03:20.321117: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the p

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Fold 1: Accuracy: 0.8364043824701195
Fold 1: F1-Score: 0.8328285937549774
Fold 2: test fold: ['0010' '0014' '0002' '0023' '0006']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Fold 2: Accuracy: 0.8447265625
Fold 2: F1-Score: 0.8441836548868815
Fold 3: test fold: ['0003' '0013' '0016' '0004' '0005']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Fold 3: Accuracy: 0.8197791164658634
Fold 3: F1-Score: 0.8190640975705458
Fold 4: test fold: ['0021' '0018' '0022' '0019' '0025']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Fold 4: Accuracy: 0.8449170124481328
Fold 4: F1-Score: 0.8461616286010555
Fold 5: test fold: ['0008' '0011' '0015' '0020' '0007']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Fold 5: Accuracy: 0.82375
Fold 5: F1-Score: 0.8264685526435415
Average Accuracy: 0.8339154

2738

##### 2c: Unprocessed 500Hz

In [1]:
import tensorflow as tf
from pathlib import Path
from shared.data import add_stage_dimension, preprocess
from shared.training import split_data_on_participants, train_and_evaluate, k_fold_cross_validate, get_compile_kwargs
from shared.normalization import *
from shared.models import SAT1Base, SAT1Topological, SAT1Deep
from shared.utilities import print_results
import gc
import numpy as np
%env TF_FORCE_GPU_ALLOW_GROWTH=true
%env TF_GPU_ALLOCATOR=cuda_malloc_async

2023-10-22 14:05:47.320456: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


env: TF_FORCE_GPU_ALLOW_GROWTH=true
env: TF_GPU_ALLOCATOR=cuda_malloc_async


In [2]:
data_path = Path("data/sat1/split_stage_data_unprocessed_500hz.nc")
data = xr.load_dataset(data_path)
logs_path = Path("logs/exp_preprocessing/")
train_kwargs = {
    "logs_path": logs_path,
    "additional_info": {"preprocessing": "unprocessed_500hz"},
    "additional_name": f"preprocessing-unprocessed_500hz",
}

In [3]:
model = SAT1Deep(len(data.channels), len(data.samples), len(data.labels))
results = k_fold_cross_validate(
    data,
    model,
    5,
    normalization_fn=norm_dummy,
    train_kwargs=train_kwargs,
    fold_indices=[0, 1],
)
print_results(results)
del model
gc.collect()

2023-10-22 13:09:08.061577: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:09:08.162950: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:09:08.163030: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:09:08.165627: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:09:08.165695: I tensorflow/compile

Fold 1: test fold: ['0009' '0017' '0001' '0024' '0012']
Epoch 1/20


2023-10-22 13:09:17.135346: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8600
2023-10-22 13:09:19.200727: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:606] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-10-22 13:09:20.789983: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f1aa6b96a90 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-10-22 13:09:20.790024: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-10-22 13:09:20.819824: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:255] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-10-22 13:09:21.009145: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the p

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Fold 1: Accuracy: 0.841
Fold 1: F1-Score: 0.8219912512255565
Fold 2: test fold: ['0010' '0014' '0002' '0023' '0006']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Fold 2: Accuracy: 0.8558823529411764
Fold 2: F1-Score: 0.8338465274921759
Average Accuracy: 0.8484411764705881
Average F1-Score: 0.8279188893588663


3336

In [3]:
model = SAT1Deep(len(data.channels), len(data.samples), len(data.labels))
results = k_fold_cross_validate(
    data,
    model,
    5,
    normalization_fn=norm_dummy,
    train_kwargs=train_kwargs,
    fold_indices=[2, 3],
)
print_results(results)
del model
gc.collect()

2023-10-22 13:30:22.209215: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:30:22.232962: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:30:22.233045: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:30:22.234969: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 13:30:22.235042: I tensorflow/compile

Fold 3: test fold: ['0003' '0013' '0016' '0004' '0005']
Epoch 1/20


2023-10-22 13:30:29.239415: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8600
2023-10-22 13:30:30.556445: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:606] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-10-22 13:30:31.867626: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f7022c32ba0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-10-22 13:30:31.867664: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-10-22 13:30:31.873752: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:255] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-10-22 13:30:31.983137: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the p

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Fold 3: Accuracy: 0.8399697580645161
Fold 3: F1-Score: 0.8243986016414416
Fold 4: test fold: ['0021' '0018' '0022' '0019' '0025']
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Fold 4: Accuracy: 0.8697698744769874
Fold 4: F1-Score: 0.8549819536621215
Average Accuracy: 0.8548698162707518
Average F1-Score: 0.8396902776517816


3336

In [3]:
model = SAT1Deep(len(data.channels), len(data.samples), len(data.labels))
results = k_fold_cross_validate(
    data,
    model,
    5,
    normalization_fn=norm_dummy,
    train_kwargs=train_kwargs,
    fold_indices=[4],
)
print_results(results)
del model
gc.collect()

2023-10-22 14:06:00.253962: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 14:06:00.280441: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 14:06:00.280528: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 14:06:00.282928: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:07:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-22 14:06:00.282998: I tensorflow/compile

Fold 5: test fold: ['0008' '0011' '0015' '0020' '0007']
Epoch 1/20


2023-10-22 14:06:07.823139: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8600
2023-10-22 14:06:09.105714: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:606] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-10-22 14:06:10.426596: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f6002dab290 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-10-22 14:06:10.426629: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-10-22 14:06:10.432873: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:255] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-10-22 14:06:10.542178: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the p

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Fold 5: Accuracy: 0.8255522088353414
Fold 5: F1-Score: 0.806006600446103
Average Accuracy: 0.8255522088353414
Average F1-Score: 0.806006600446103


956

In [4]:
accuracies = [
    0.841,
    0.8558823529411764,
    0.8399697580645161,
    0.8697698744769874,
    0.8255522088353414
]
f1_scores = [
    0.8219912512255565,
    0.8338465274921759,
    0.8243986016414416,
    0.8549819536621215,
    0.806006600446103
]
print(np.mean(accuracies))
print(np.mean(f1_scores))

0.8464348388636042
0.8282449868934796


#### Deprecated due to memory leak issue

In [None]:
tf.keras.backend.clear_session()
model = SAT1Deep(len(data.channels), len(data.samples), len(data.labels))
# model.compile(**get_compile_kwargs())
train_kwargs = {
    "logs_path": logs_path,
    "additional_info": {"preprocessing": "unprocessed_500hz"},
    "additional_name": f"preprocessing-unprocessed_500hz",
}
results = k_fold_cross_validate(
    data, model, 5, normalization_fn=norm_dummy, train_kwargs=train_kwargs
)
print_results(results)
del model
gc.collect()

In [13]:
# View results in Tensorboard
! tensorboard --logdir logs/exp_preprocessing/


NOTE: Using experimental fast data loading logic. To disable, pass
    "--load_fast=false" and report issues on GitHub. More details:
    https://github.com/tensorflow/tensorboard/issues/4784

Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.13.0 at http://localhost:6006/ (Press CTRL+C to quit)
^C


: 