# Gesture Recognition with CAPG DB-a Dataset Using 3D CNN with EMGNet Architecture (one subject for testing)

In this preliminary effort, we attempt hand gesture recognition on the CAPG DB-a dataset.
We use the EMGNet architecture and training procedure, but instead of computing a continuous wavelet transform (CWT) of the signals, we apply a 3D CNN to sequences of 2D images (one 8 x 16 frame per time step).
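
To make "sequences of 2D images" concrete, here is a minimal sketch of the reshape that this notebook later passes as `input_postprocessor`. The array sizes and random values are placeholders, not data from CAPG DB-a. It assumes each window arrives as an `N x L x 128` array.

```python
import numpy as np

# Placeholder sizes for illustration: 4 windows, 64 time steps, 128 electrodes
num_windows, seq_len, num_channels = 4, 64, 128
rng = np.random.default_rng(0)
windows = rng.standard_normal((num_windows, seq_len, num_channels)).astype(np.float32)

# Same reshape as the notebook's input_postprocessor:
# N x L x C  ->  N x 1 x L x 8 x 16  (one 8 x 16 frame per time step)
volumes = windows.reshape(windows.shape[0], 1, windows.shape[1], 8, 16)
print(volumes.shape)

# The reshape is row-major, so channel index c lands at row c // 16, column c % 16
c = 37
assert volumes[0, 0, 5, c // 16, c % 16] == windows[0, 5, c]
```

Whether row `c // 16`, column `c % 16` matches the physical electrode layout depends on how the columns are ordered in the parquet files, which this sketch does not verify.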

In this version:

- EMG data is normalized with the recorded MVC (maximum voluntary contraction) data.
- The **EMGNet** architecture will be used, along with its training procedure.
- A **3D CNN** will be adopted into the EMGNet architecture.
- **Raw EMG data** will be used; there will be no preprocessing or feature engineering.
- **Training data:** 17 subjects
- **Test data:** 1 subject
- K-fold cross-validation will be performed, with each of the 18 subjects held out once for testing.

**NOTE** This code has been tested with:
```
    numpy version:        1.23.5
    scipy version:        1.9.3
    sklearn version:      1.2.0
    seaborn version:      0.12.1
    pandas version:       1.5.2
    torch version:        1.12.1+cu113
    matplotlib version:   3.6.2
    CUDA version:         11.2
```

## 1- Preliminaries

### Imports

In [1]:
%pip install pyarrow
%pip install fastparquet

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install PyWavelets

Note: you may need to restart the kernel to use updated packages.


In [3]:
import sys, os
direc = os.getcwd()
print("Current Working Directory is: ", direc)
KUACC = False
if "scratch" in direc: # We are using the cluster
    KUACC = True
    homedir = os.path.expanduser("~")
    os.chdir(os.path.join(homedir,"comp541-project/capg_3dcnn/"))
    direc = os.getcwd()
    print("Current Working Directory is now: ", direc)
sys.path.append("../src/")
sys.path.append("../data/")
import torch
import torch.nn as nn
from datasets_torch import *
from models_torch import *
from utils_torch import *
from datetime import datetime
import pandas as pd
import numpy as np
import scipy as sp
import sklearn
import seaborn as sns
from sklearn.preprocessing import OneHotEncoder, StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, classification_report, confusion_matrix, accuracy_score, f1_score
import matplotlib
import matplotlib.pyplot as plt
from copy import deepcopy
import statistics
import json
from IPython.display import display
#from cwt import calculate_wavelet_vector, calculate_wavelet_dataset

# Print versions
print("numpy version:       ", np.__version__)
print("scipy version:       ", sp.__version__)
print("sklearn version:     ", sklearn.__version__)
print("seaborn version:     ", sns.__version__)
print("pandas version:      ", pd.__version__)
print("torch version:       ", torch.__version__)
print("matplotlib version:  ", matplotlib.__version__)


# Checking to see if CUDA is available for us
print("Checking to see if PyTorch recognizes GPU...")
print(torch.cuda.is_available())

# Whether to use latex rendering in plots throughout the notebook
USE_TEX = False
FONT_SIZE = 12

# Setting matplotlib plotting variables
if USE_TEX:
    plt.rcParams.update({
        "text.usetex": True,
        "font.size": FONT_SIZE,
        "font.family": "serif",
        "font.serif": ["Computer Modern Roman"]
    })
else:
    plt.rcParams.update({
        "text.usetex": False,
        "font.size": FONT_SIZE,
        "font.family": "serif",
        "font.serif": ["Times New Roman"]
    })

# Do not plot figures inline (only useful for cluster)
%matplotlib

Current Working Directory is:  /scratch/users/edincer22
Current Working Directory is now:  /scratch/users/edincer22/comp541-project/capg_3dcnn
numpy version:        1.24.1
scipy version:        1.9.3
sklearn version:      1.2.0
seaborn version:      0.12.2
pandas version:       1.5.3
torch version:        1.12.1+cu113
matplotlib version:   3.6.2
Checking to see if PyTorch recognizes GPU...
True
Using matplotlib backend: <object object at 0x2b09bdebd0b0>


In [4]:
import torch
print(torch.cuda.is_available())

True


## 2- Hyperparameters and Settings

### General settings of the study

In [23]:
k_fold_study = {
    'code':'capg_3dcnn/capg_dba_v003',
    'package':'torch',
    'dataset':'capg',
    'subdataset':'dba',
    "training_accuracies": [],
    "validation_accuracies": [],
    "testset_accuracies": [],
    "history_training_loss": [],
    "history_training_metrics": [],
    "history_validation_loss": [],
    "history_validation_metrics": [],
    "preprocessing":None,
    "feature_engineering":None,
    "k_fold_mode":"1 subject for testing",
    "global_downsampling":10
}

In [29]:
hparams = {
    "model_name": autoname("capg_3dcnn_dba_v003"),
    # General hyperparameters
    "in_features": 128,
    "out_features": 1,
    # Sequence hyperparameters
    "in_seq_len_sec": 0.16,
    "out_seq_len_sec": 0,
    "data_sampling_rate_Hz": 1000.0,
    "data_downsampling": 5,
    "sequence_downsampling": 1,
    "in_seq_len": 0,
    "out_seq_len": 0,
    # Convolution blocks
    "num_conv_blocks": 4,
    "conv_dim": 3,
    "conv_params": None,
    "conv_channels": [16, 32, 32, 64],
    "conv_kernel_size": 3,
    "conv_padding": "same",
    "conv_stride": 1,
    "conv_dilation": 1,
    "conv_activation": "ReLU",
    "conv_activation_params": None,#{"negative_slope": 0.1},
    "conv_norm_layer_type": "BatchNorm",
    "conv_norm_layer_position": "before",
    "conv_norm_layer_params": None,
    "conv_dropout": None,
    "pool_type": [None, None, None, "AdaptiveAvg"],
    "pool_kernel_size": 2,
    "pool_padding": 0,
    "pool_stride": 1,
    "pool_dilation": 1,
    "pool_params": None,
    "min_image_size": 1,
    "adaptive_pool_output_size": [1,1,1],
    # Fully connected blocks
    "dense_width": "auto",
    "dense_depth": 0,
    "dense_activation": "ReLU",
    "dense_activation_params": None,
    "output_activation": None,
    "output_activation_params": None,
    "dense_norm_layer_type": None,
    "dense_norm_layer_position": None,
    "dense_norm_layer_params": None,
    "dense_dropout": None,
    # Training procedure
    "l2_reg": 0.0001,
    "batch_size": 512,
    "epochs": 40,
    "validation_data": [0.05,'testset'],
    "validation_tolerance_epochs": 1000,
    "learning_rate": 0.01,
    "learning_rate_decay_gamma": 0.9,
    "loss_function": "CrossEntropyLoss",
    "optimizer": "Adam",
    "optimizer_params": None
}
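
As a rough guide to what `learning_rate`, `learning_rate_decay_gamma` and `epochs` imply together, the sketch below tabulates the step size over training. It assumes the learning rate is multiplied by gamma once per epoch, as `torch.optim.lr_scheduler.ExponentialLR` does; the training helper used in this notebook may apply the decay differently.

```python
# Values copied from the hparams dictionary above
learning_rate = 0.01
gamma = 0.9
epochs = 40

def lr_at_epoch(epoch, lr0=learning_rate, g=gamma):
    """Learning rate in effect during the given 0-indexed epoch (assumed exponential decay)."""
    return lr0 * g ** epoch

schedule = [lr_at_epoch(e) for e in range(epochs)]
print(f"first epoch: {schedule[0]:.6f}")
print(f"last epoch:  {schedule[-1]:.6f}")
print(f"overall reduction factor: {schedule[0] / schedule[-1]:.1f}")
```

Under this assumption, the step size shrinks by roughly two orders of magnitude over the 40 epochs.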

## 3- Data Processing

### Load and concatenate data

In [30]:
data_dir = "../data/CAPG/parquet"
def load_single_capg_dataset(data_dir, db_str:str="dba"):
    data_lst = []
    for i,file in enumerate(os.listdir(data_dir)):
        if file.endswith(".parquet") and db_str in file:
            print("Loading file: ", file)
            data_lst.append(pd.read_parquet(os.path.join(data_dir, file)))
    data = pd.concat(data_lst, axis=0, ignore_index=True)
    return data
dba_tot = load_single_capg_dataset(data_dir, db_str="dba")
dba_mvc = dba_tot.loc[dba_tot["gesture"].isin([100, 101])]
dba = dba_tot.loc[~dba_tot["gesture"].isin([100, 101])]
print("dba_tot shape: ", dba_tot.shape)
print("dba_mvc shape: ", dba_mvc.shape)
print("dba shape: ", dba.shape)
print("Columns: ")
print(dba_tot.columns)
print("Description: ")
print(dba.iloc[:,:3].describe())

Loading file:  dba_subj_18.parquet
Loading file:  dba_subj_15.parquet
Loading file:  dba_subj_7.parquet
Loading file:  dba_subj_1.parquet
Loading file:  dba_subj_16.parquet
Loading file:  dba_subj_5.parquet
Loading file:  dba_subj_13.parquet
Loading file:  dba_subj_10.parquet
Loading file:  dba_subj_6.parquet
Loading file:  dba_subj_2.parquet
Loading file:  dba_subj_12.parquet
Loading file:  dba_subj_8.parquet
Loading file:  dba_subj_4.parquet
Loading file:  dba_subj_3.parquet
Loading file:  dba_subj_14.parquet
Loading file:  dba_subj_17.parquet
Loading file:  dba_subj_9.parquet
Loading file:  dba_subj_11.parquet
dba_tot shape:  (1476000, 131)
dba_mvc shape:  (36000, 131)
dba shape:  (1440000, 131)
Columns: 
Index(['subject', 'gesture', 'trial', 'b_1_c_1', 'b_1_c_2', 'b_1_c_3',
       'b_1_c_4', 'b_1_c_5', 'b_1_c_6', 'b_1_c_7',
       ...
       'b_8_c_7', 'b_8_c_8', 'b_8_c_9', 'b_8_c_10', 'b_8_c_11', 'b_8_c_12',
       'b_8_c_13', 'b_8_c_14', 'b_8_c_15', 'b_8_c_16'],
      dtype='obje
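
The printed shapes can be cross-checked against the structure of the dataset. The sketch below uses the subject, gesture and trial counts reported later in this notebook's log (18, 8 and 10); the per-trial sample count is derived from them, not read from the dataset documentation.

```python
# Counts taken from this notebook's outputs
num_subjects, num_gestures, num_trials = 18, 8, 10
gesture_rows = 1_440_000   # dba shape[0]
mvc_rows = 36_000          # dba_mvc shape[0]
total_columns = 131

# Rows per (subject, gesture, trial) combination
samples_per_trial = gesture_rows // (num_subjects * num_gestures * num_trials)
print("samples per trial:", samples_per_trial)

# MVC rows per subject (gestures 100 and 101 together)
mvc_rows_per_subject = mvc_rows // num_subjects
print("MVC rows per subject:", mvc_rows_per_subject)

# Three metadata columns (subject, gesture, trial); the rest are EMG channels
emg_channels = total_columns - 3
print("EMG channels:", emg_channels, "= 8 x 16:", emg_channels == 8 * 16)
```

At the 1000 Hz sampling rate listed in the hyperparameters, this corresponds to one second of signal per trial.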

### Normalize EMG Data

Here, the recorded MVC values are used to normalize the EMG data: each channel is divided by its maximum value over the MVC recordings.

In [31]:
max_mvc = dba_mvc.iloc[:,3:].max(axis=0)
del dba_mvc
# print("max_mvc for 5 first channels: ")
# print(max_mvc[:5])
# print("shape of max_mvc: ", max_mvc.shape)
# print("max of dba before normalization: (first five)")
# print(dba.iloc[:,3:].max(axis=0)[:5])
dba.iloc[:,3:] = dba.iloc[:,3:].div(max_mvc, axis=1)
# print("max of dba after normalization: (first five)")
# print(dba.iloc[:,3:].max(axis=0)[:5])

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  dba.iloc[:,3:] = dba.iloc[:,3:].div(max_mvc, axis=1)
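
The warning above appears because `dba` was created by slicing `dba_tot`, so pandas cannot tell whether the assignment should also affect the parent frame. A minimal sketch of one way to avoid it, on a toy frame with made-up values and only two channels, is to take an explicit `.copy()` before assigning:

```python
import pandas as pd

# Toy stand-in for dba_tot (made-up values); gestures 100 and 101 are the MVC recordings
dba_tot = pd.DataFrame({
    "subject": [1, 1, 1, 1],
    "gesture": [1, 2, 100, 101],
    "trial":   [1, 1, 1, 1],
    "b_1_c_1": [0.1, 0.2, 0.4, 0.5],
    "b_1_c_2": [0.3, 0.1, 0.6, 0.2],
})
is_mvc = dba_tot["gesture"].isin([100, 101])
emg_cols = dba_tot.columns[3:]

# Per-channel maximum over the MVC recordings
max_mvc = dba_tot.loc[is_mvc, emg_cols].max(axis=0)

# .copy() makes dba an independent frame, so the assignment below is unambiguous
dba = dba_tot.loc[~is_mvc].copy()
dba[emg_cols] = dba[emg_cols].div(max_mvc, axis=1)
print(dba)
```

The arithmetic is the same as in the cell above; only the way the frame is derived changes.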


## 4- k-fold study

### EMGNet model

In [32]:

class EMGNet(PyTorchSmartModule):
    def __init__(self, hparams):
        super(EMGNet, self).__init__(hparams)
        self.prep_block = nn.Sequential(
            nn.BatchNorm3d(1),
            nn.ReLU()
        )
        self.main_block = Conv_Network(hparams)
    
    def forward(self, x):
        x = self.prep_block(x)
        x = self.main_block(x)
        return x
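
To see how a sample would move through this model, the sketch below traces tensor shapes by hand. It assumes `Conv_Network` stacks the blocks exactly as the hyperparameters describe: stride 1 with `"same"` padding, no pooling except adaptive average pooling to `[1, 1, 1]` after the last block. The internals of `Conv_Network` are not shown in this notebook, so treat this as an approximation.

```python
# (C, D, H, W) of one input sample, as printed in the k-fold log of this notebook
input_shape = (1, 64, 8, 16)
conv_channels = [16, 32, 32, 64]             # hparams["conv_channels"]
pool_type = [None, None, None, "AdaptiveAvg"]  # hparams["pool_type"]
adaptive_pool_output_size = (1, 1, 1)

shape = input_shape
trace = [shape]
for out_channels, pool in zip(conv_channels, pool_type):
    # "same" padding with stride 1 keeps D, H, W; only the channel count changes
    shape = (out_channels,) + shape[1:]
    if pool == "AdaptiveAvg":
        shape = (shape[0],) + adaptive_pool_output_size
    trace.append(shape)

for i, s in enumerate(trace):
    print(f"after stage {i}: {s}")

# Values per sample in the largest intermediate activation (before pooling)
largest_activation = max(conv_channels) * 64 * 8 * 16
flat_features = trace[-1][0]
print("largest activation per sample:", largest_activation)
print("features after pooling:", flat_features)
```

Because no pooling happens before the last block, every convolution operates on the full 64 x 8 x 16 volume, which is consistent with the long per-epoch training times reported below.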

### Perform k-fold cross-validation study
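
The loop in the next cell holds out one subject per fold and trains on the rest. Stripped of data handling, the splitting logic can be sketched as follows; `leave_one_subject_out` is a hypothetical helper written for this illustration, not a function from the project's `utils_torch` module.

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_subjects, test_subjects) pairs, holding out one subject per fold."""
    subject_ids = sorted(set(subject_ids))
    for test_subject in subject_ids:
        train_subjects = [s for s in subject_ids if s != test_subject]
        yield train_subjects, [test_subject]

# Subjects are numbered 1..18 in this dataset
folds = list(leave_one_subject_out(range(1, 19)))
print("number of folds:", len(folds))
print("fold 1 test subjects:", folds[0][1])
print("fold 1 training subject count:", len(folds[0][0]))
```

Each fold trains a fresh model, so the total cost is 18 full training runs.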

In [33]:

# Define input columns
input_cols = list(dba.iloc[:,3:].columns)

# Infer the total number of subjects from the data
num_subjects = dba['subject'].nunique()

ds = k_fold_study['global_downsampling']


for k in range(num_subjects):
#for k in range(1):
    
    print("\n#################################################################")
    print("Using subject %d for testing ..." % (k+1))
    print("#################################################################\n")
    
    subj_for_testing = [k+1]
    
    # Reset out_features to 1 (a single label column) for data generation (this is buggy behavior and should be fixed)
    hparams['out_features'] = 1
    
    # Get processed data cell
    # No CWT is applied in this version; input_postprocessor below reshapes inputs to N x 1 x L x 8 x 16
    print("Generating data cell ...")
    data_processed = generate_cell_array(
        dba, hparams,
        subjects_column="subject", conditions_column="gesture", trials_column="trial",
        input_cols=input_cols, output_cols=["gesture"], specific_conditions=None,
        input_preprocessor=None,
        output_preprocessor=None,
        # Reshape N x L x C data (C=128) to N x C' x D x H x W where C'=1, D=L, H=8, W=16
        input_postprocessor=lambda arr: arr.reshape(arr.shape[0], 1, arr.shape[1], 8, 16),
        output_postprocessor = lambda arr:(arr-1).squeeze(), # torch CrossEntropyLoss needs (N,) array of 0-indexed class labels
        subjects_for_testing=subj_for_testing, 
        trials_for_testing=None,
        input_scaling=False, output_scaling=False, input_forward_facing=True, output_forward_facing=True, 
        data_squeezed=False,
        input_towards_future=False, output_towards_future=False, 
        output_include_current_timestep=True,
        use_filtered_data=False, #lpcutoff=CUTOFF, lporder=FILT_ORDER, lpsamplfreq=SAMPL_FREQ,
        return_data_arrays_orig=False,
        return_data_arrays_processed=False,
        return_train_val_test_arrays=False,
        return_train_val_test_data=True,
        verbosity=1
    )
    
    # Set out_features to the number of gesture classes (8) for the model (this is buggy behavior and should be fixed)
    hparams['out_features'] = 8
    
    print("Extracting downsampled input and output data from the datacell ...")
    # Inputs MUST have correct shape
    x_train = data_processed["x_train"][::ds]
    x_val = data_processed["x_val"][::ds]
    x_test = data_processed["x_test"][::ds]
    # Outputs MUST be zero-indexed class labels
    y_train = data_processed["y_train"][::ds]
    y_val = data_processed["y_val"][::ds]
    y_test = data_processed["y_test"][::ds]
    print("x_train shape: ", x_train.shape)
    print("x_val shape: ", x_val.shape)
    print("x_test shape: ", x_test.shape)
    print("y_train shape: ", y_train.shape)
    print("y_val shape: ", y_val.shape)
    print("y_test shape: ", y_test.shape)
    del data_processed
    # Make datasets from training, validation and test sets
    print("Generating the TensorDataset objects ...")
    train_set = TensorDataset(torch.from_numpy(x_train).float(), torch.from_numpy(y_train).long())
    val_set = TensorDataset(torch.from_numpy(x_val).float(), torch.from_numpy(y_val).long())
    test_set = TensorDataset(torch.from_numpy(x_test).float(), torch.from_numpy(y_test).long())
    
    # If it is the first iteration of the loop, save the hyperparameters dictionary in the k-fold study dictionary
    if k==0:
        k_fold_study['hparams'] = hparams
    
    # Construct model
    print("Constructing the model ...")
    hparams['input_shape'] = list(x_train.shape[1:])
    hparams['output_shape'] = [8]
    print("Model input shape: ", hparams['input_shape'])
    print("Model output shape: ", hparams['output_shape'])
    model = EMGNet(hparams)
    if k == 0: print(model)
    
    # Train model
    print("Training the model ...")
    # history = train_pytorch_model(
    #     model, [train_set, val_set], batch_size=1024, loss_str='crossentropy', optimizer_str='adam', 
    #     optimizer_params={'weight_decay':0.0001}, loss_function_params=None, learnrate=0.1, 
    #     learnrate_decay_gamma=0.95, epochs=200, validation_patience=1000000, 
    #     verbose=1, script_before_save=True, saveto=None, num_workers=0)
    history = model.train_model([train_set, val_set], verbose=1)    
    
    # Update relevant fields in the k-fold study dictionary
    print("Updating the dictinoary for logging ...")
    k_fold_study['history_training_loss'].append(history["training_loss"])
    k_fold_study["history_validation_loss"].append(history["validation_loss"])
    k_fold_study["history_training_metrics"].append(history["training_metrics"])
    k_fold_study["history_validation_metrics"].append(history["validation_metrics"])
    k_fold_study["training_accuracies"].append(history["training_metrics"][-1])
    k_fold_study["validation_accuracies"].append(history["validation_metrics"][-1])
    
    # Evaluate the model on the test set
    print("Evaluating the model on the test set ...")
    # results = evaluate_pytorch_model(model, test_set, loss_str='crossentropy', loss_function_params=None,
    # batch_size=1024, device_str="cuda", verbose=True, num_workers=0)
    results = model.evaluate_model(test_set, verbose=True)
    k_fold_study["testset_accuracies"].append(results["metrics"])
    print("Done with this fold of the K-fold study.")

print("Done with the K-fold study.")


#################################################################
Using subject 1 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [1]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Size of output dictionary in bytes:  9438404197
Done.

Extracting downsampled input and output data from the datacell ...
x_train shape:  (27200, 1, 64, 8, 16)
x_val shape:  (80, 1, 64, 8, 16)
x_test shape:  (1520, 1, 64, 8, 16)
y_train shape:  (27200,)
y_val shape:  (80,)
y_test shape:  (1520,)
Generating the TensorDataset

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:18<00:00, 67.96s/it]


Finished Training.
Training process took 2718.48 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 4.4478 | Accuracy: 0.2664
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 2 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [2]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:17<00:00, 67.93s/it]


Finished Training.
Training process took 2717.09 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 3.7788 | Accuracy: 0.3632
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 3 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [3]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:16<00:00, 67.91s/it]


Finished Training.
Training process took 2716.49 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 2.9185 | Accuracy: 0.3664
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 4 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [4]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:18<00:00, 67.96s/it]


Finished Training.
Training process took 2718.28 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 2.9183 | Accuracy: 0.4026
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 5 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [5]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:17<00:00, 67.93s/it]


Finished Training.
Training process took 2717.29 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 3.4492 | Accuracy: 0.3737
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 6 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [6]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:17<00:00, 67.94s/it]


Finished Training.
Training process took 2717.60 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 6.0269 | Accuracy: 0.2026
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 7 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [7]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:14<00:00, 67.86s/it]


Finished Training.
Training process took 2714.44 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 5.6618 | Accuracy: 0.2651
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 8 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [8]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:16<00:00, 67.92s/it]


Finished Training.
Training process took 2716.71 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 3.5937 | Accuracy: 0.2625
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 9 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [9]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...
Si

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:14<00:00, 67.85s/it]


Finished Training.
Training process took 2714.07 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 1.8230 | Accuracy: 0.4974
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 10 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [10]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...


Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:14<00:00, 67.86s/it]


Finished Training.
Training process took 2714.28 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 5.8002 | Accuracy: 0.1586
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 11 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [11]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...


Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:17<00:00, 67.94s/it]


Finished Training.
Training process took 2717.61 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 10.8547 | Accuracy: 0.0993
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 12 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [12]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...

Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:18<00:00, 67.96s/it]


Finished Training.
Training process took 2718.22 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 4.7330 | Accuracy: 0.2757
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 13 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [13]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...


Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:18<00:00, 67.95s/it]


Finished Training.
Training process took 2718.02 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 4.3196 | Accuracy: 0.2461
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 14 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [14]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...


Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:17<00:00, 67.95s/it]


Finished Training.
Training process took 2717.84 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 8.9482 | Accuracy: 0.1836
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 15 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [15]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...


Training Progress: 100%|████████████████████████████████████████████| 40/40 [45:14<00:00, 67.86s/it]


Finished Training.
Training process took 2714.35 seconds.
Done training.
Updating the dictinoary for logging ...
Evaluating the model on the test set ...
Preparing data...
selected device:  cuda
Evaluating model...
[██████████████████████████] Loss: 1.9904 | Accuracy: 0.3691
Done.
Done with this fold of the K-fold study.

#################################################################
Using subject 16 for testing ...
#################################################################

Generating data cell ...
# subjects:    18
# conditions:  8
# trials:      10


subjects used for testing:    [16]
conditions used for testing:  []
trials used for testing:      []


Iterating through all trials ...

Concatenating arrays and generating outputs ...
Validation data source:   testset
Validation data portion:  0.05
x_train:  (272000, 1, 64, 8, 16)
y_train:  (272000,)
x_val:  (800, 1, 64, 8, 16)
y_val:  (800,)
x_test:  (15200, 1, 64, 8, 16)
y_test:  (15200,)
Constructing output dictionary ...


Training Progress:  98%|██████████████████████████████████████████▉ | 39/40 [44:54<01:09, 69.09s/it]


KeyboardInterrupt: 

In [None]:
x_test.shape

(1520, 1, 64, 8, 16)

In [None]:
import os
os.getcwd()

'/scratch/users/edincer22/comp541-project/capg_3dcnn'

In [None]:
torch.save(model, "capg_3dcnn_new")

In [None]:
model_1 = torch.load("capg_3dcnn_new")

In [None]:
model_1

EMGNet(
  (prep_block): Sequential(
    (0): BatchNorm3d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (1): ReLU()
  )
  (main_block): Conv_Network(
    (net): Sequential(
      (0): Conv_Block(
        (net): Sequential(
          (0): Conv3d(1, 16, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=same)
          (1): BatchNorm3d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU()
        )
      )
      (1): Conv_Block(
        (net): Sequential(
          (0): Conv3d(16, 32, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=same)
          (1): BatchNorm3d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU()
        )
      )
      (2): Conv_Block(
        (net): Sequential(
          (0): Conv3d(32, 32, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=same)
          (1): BatchNorm3d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU()
        )
  

### Saving k-fold study

In [None]:
print("Dumping the JSON file ...")
# Use a context manager so the file is flushed and closed after writing.
with open(make_path("../results/"+hparams['model_name']+"/k_fold_study.json"), "w") as f:
    json.dump(k_fold_study, f, indent=4)
print("Saved the JSON file.")
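
The training log above ends with a `KeyboardInterrupt` during the fold for subject 16, while the dump in this cell sits after the whole K-fold loop. One way to avoid losing finished folds is to write the study to disk after every fold. The following is a minimal sketch of that idea, assuming the study is a JSON-serializable dictionary; `save_study_atomic`, `demo_study`, and the numbers in the usage part are hypothetical and are not part of `utils_torch` or of the results in this notebook.

```python
import json
import os
import tempfile


def save_study_atomic(study, path):
    """Write `study` as JSON to `path` without leaving a half-written file.

    Hypothetical helper: the data goes to a temporary file in the same
    directory first and is then renamed over the target path.
    """
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(study, f, indent=4)
        # Replace the target in one step (same filesystem as the temp file).
        os.replace(tmp_path, path)
    except BaseException:
        # Also covers KeyboardInterrupt: remove the temp file, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Illustrative usage with placeholder values (not results from this notebook).
demo_study = {"testset_accuracies": []}
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "k_fold_study.json")
for fold_accuracy in (0.5, 0.75):  # made-up numbers
    demo_study["testset_accuracies"].append(fold_accuracy)
    save_study_atomic(demo_study, demo_path)  # checkpoint after each fold

with open(demo_path) as f:
    reloaded = json.load(f)
```

With this pattern, an interrupted run still leaves a valid JSON file containing every fold that completed before the interruption.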

### Saving general statistics

In [None]:
print("Saving the general statistics ...")
trn_acc_arr = np.array(k_fold_study["training_accuracies"])
val_acc_arr = np.array(k_fold_study["validation_accuracies"])
tst_acc_arr = np.array(k_fold_study["testset_accuracies"])
general_dict = {"training_accuracy":trn_acc_arr, "validation_accuracy":val_acc_arr, "testset_accuracy":tst_acc_arr}
general_results = pd.DataFrame(general_dict)
print("Description of general results:")
general_results_describe = general_results.describe()
display(general_results_describe)
general_results_describe.to_csv(
    make_path("../results/"+hparams['model_name']+"/general_results.csv"), header=True, index=True)
print("Saved general statistics.")
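
`DataFrame.describe()` reports the sample standard deviation (`ddof=1`), whereas the `np.std` calls in the plotting cell further down default to the population standard deviation (`ddof=0`), so the two spreads are not directly comparable. The following is a minimal sketch of the difference, using only the two test accuracies printed in the log above (0.1836 and 0.3691); the full study contains more folds, so these numbers are illustrative rather than a summary of the results.

```python
import statistics

# Two test accuracies printed in the log above; the full study has more folds.
test_accuracies = [0.1836, 0.3691]

mean_acc = statistics.mean(test_accuracies)
sample_std = statistics.stdev(test_accuracies)       # ddof=1, as in describe()
population_std = statistics.pstdev(test_accuracies)  # ddof=0, as in np.std's default

print(f"mean:           {mean_acc:.4f}")
print(f"sample std:     {sample_std:.4f}")
print(f"population std: {population_std:.4f}")
```

With few folds the gap between the two estimates is large, so it helps to state which one a reported "std" refers to.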

### Plotting training histories

In [None]:
# import numpy as np
# import json
# import pandas as pd

In [None]:
# k_fold_study = json.load(open("../results/capg_replica_dba_v002_2023_01_07_20_07_25/k_fold_study.json", "r"))

In [None]:
print("Plotting the training curve ...")
train_loss = np.array(k_fold_study["history_training_loss"])
val_loss = np.array(k_fold_study["history_validation_loss"])
train_acc = np.array(k_fold_study["history_training_metrics"])
val_acc = np.array(k_fold_study["history_validation_metrics"])

print("Shape of train_loss: ", train_loss.shape)

train_loss_mean = np.mean(train_loss, axis=0)
train_loss_std = np.std(train_loss, axis=0)# / 2
val_loss_mean = np.mean(val_loss, axis=0)
val_loss_std = np.std(val_loss, axis=0)# / 2
train_acc_mean = np.mean(train_acc, axis=0)
train_acc_std = np.std(train_acc, axis=0)# / 2
val_acc_mean = np.mean(val_acc, axis=0)
val_acc_std = np.std(val_acc, axis=0)# / 2

print("Shape of train_loss_mean: ", train_loss_mean.shape)
print("Shape of train_loss_std: ", train_loss_std.shape)

n_epochs = train_loss_mean.shape[0]
epochs = np.arange(1, n_epochs+1)
plt.figure(figsize=(8,8), dpi=100)
plt.subplot(2,1,1)
plt.grid(True)
plt.plot(epochs, train_loss_mean, label="Training", color="blue")
plt.fill_between(epochs, train_loss_mean-train_loss_std, train_loss_mean+train_loss_std, 
                 color='blue', alpha=0.2)
plt.plot(epochs, val_loss_mean, label="Validation", color="orange")
plt.fill_between(epochs, val_loss_mean-val_loss_std, val_loss_mean+val_loss_std,
                 color='orange', alpha=0.2)
plt.ylabel("Loss")
plt.legend(loc="upper right")
plt.subplot(2,1,2)
plt.grid(True)
plt.plot(epochs, train_acc_mean, color="blue")
plt.fill_between(epochs, train_acc_mean-train_acc_std, train_acc_mean+train_acc_std,
                 color='blue', alpha=0.2)
plt.plot(epochs, val_acc_mean, color="orange")
plt.fill_between(epochs, val_acc_mean-val_acc_std, val_acc_mean+val_acc_std,
                 color='orange', alpha=0.2)
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.subplots_adjust(hspace=0.2)
plt.savefig(make_path("../results/"+k_fold_study['hparams']['model_name']+"/training_history.png"), dpi=300)

print("Done plotting the training curve.")
print("ALL DONE. GOOD BYE!")
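
The cell above stacks the per-fold histories with `np.array(...)` and averages over `axis=0`, which presumes that every fold logged the same number of epochs. If a fold that stopped early (like the interrupted one in the log above) ends up in the study, its history would be shorter than the others. The following is a minimal sketch, in plain Python, of computing the mean curve and the ±1 standard deviation band after truncating all folds to the shortest history. `mean_std_band` is a hypothetical helper, the demo values are made up, and truncation is one possible choice rather than what this notebook does.

```python
import statistics


def mean_std_band(histories):
    """Return (mean, lower, upper) per epoch across folds.

    `histories` is a list of per-fold lists (one value per epoch).
    All folds are truncated to the shortest history so that every
    epoch is averaged over the same number of folds.
    """
    if not histories:
        raise ValueError("histories must contain at least one fold")
    n_epochs = min(len(h) for h in histories)
    mean, lower, upper = [], [], []
    for epoch in range(n_epochs):
        values = [h[epoch] for h in histories]
        m = statistics.mean(values)
        s = statistics.pstdev(values)  # population std, like np.std's default
        mean.append(m)
        lower.append(m - s)
        upper.append(m + s)
    return mean, lower, upper


# Made-up loss histories for three folds; the last fold stopped one epoch early.
demo = [[1.0, 0.5, 0.25], [2.0, 1.5, 0.75], [3.0, 1.0]]
mean, lower, upper = mean_std_band(demo)
```

The three returned lists can be passed to `plt.plot` and `plt.fill_between` in place of the `*_mean` and `*_std` arrays used above.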

# Transfer learning (TL) part