<a href="https://colab.research.google.com/github/bermanlabemory/gait_signatures/blob/main/Gait_Signatures_Script_7_Save_Hyperparameter_Results_Single_Folder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## This script saves hyperparameter results from all models in a single folder.

Otherwise, you have to scour Google Drive folders (if doing this online) to manually pull each file.
____________
____________
**NOTE**: This file was ONLY used to select model hyperparameters and generate supplementary figures.

First, Scripts 1-6 must be run to generate the models that are analyzed here.
____________
____________
**Steps:** 
1. Load training and validation loss for each model hyperparameter combination
  1. Concatenate all training and validation loss values into a single file, along with model specs, and save 
1. Load short/long-time gait signature alignment results for each model hyperparameter combination
  1. Concatenate all alignment results into a single file and save

____________
The gait signature alignment results are stored in a [subject x initialization x model] array.
____________
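To make this layout concrete, the following minimal sketch maps a flat model index onto the (initialization, model) slots of such an array. It assumes that all runs of one hyperparameter combination are listed contiguously, as in the `_run_1` ... `_run_10` naming used later in this notebook. The names `n_runs`, `n_models`, and `slot_for` are illustrative and do not appear in the original code.

```python
# Illustrative values (assumptions): 10 initializations for each of 3 models.
n_runs = 10
n_models = 3

def slot_for(j, n_runs):
    """Return (run_index, model_index) for flat model index j."""
    # Runs of one model are contiguous, so the quotient selects the model
    # and the remainder selects the run within that model.
    model_index, run_index = divmod(j, n_runs)
    return run_index, model_index

slots = [slot_for(j, n_runs) for j in range(n_runs * n_models)]
print(slots[0], slots[9], slots[10], slots[29])  # (0, 0) (9, 0) (0, 1) (9, 2)
```

For `n_runs = 10`, this gives the same indices as the `j % 10` and `int(np.ceil((j+1)/10))-1` expressions used in the alignment cells.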

**Created by**: Michael C. Rosenberg

**Date**: 11/11/22

**Step 0**: Mount (connect to) the Google Drive folder where you want to save the simulation results and model parameters.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
#drive.mount("/content/drive", force_remount=True)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# check python version 
from platform import python_version

print(python_version())

# check tensorflow version
import tensorflow as tf
print(tf.__version__)

# Check the total RAM of the runtime
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

3.8.16
2.9.2
Your runtime has 27.3 gigabytes of available RAM

You are using a high-RAM runtime!


**Step 1**: Import necessary packages and functions to develop model

In [None]:
from keras.models import Sequential
from keras.layers import Dense
# from keras.layers.recurrent import LSTM # Deprecated :(
from tensorflow.python.keras.layers.recurrent import LSTM
from sklearn.model_selection import train_test_split
import sklearn.model_selection as model_selection
import matplotlib.pyplot as plt
import math
import keras as k
import pandas as pd
import numpy as np
from copy import copy
import scipy.io
from sklearn.decomposition import PCA

from scipy.signal import find_peaks
from scipy import interpolate
from scipy.special import iv
from numpy import sin,cos,pi,array,linspace,cumsum,asarray,dot,ones
from pylab import plot, legend, axis, show, randint, randn, std,lstsq

from sklearn.manifold import MDS
import csv
import os
from tqdm import tqdm 

In [None]:
# Define functions
%cd /content/drive/My Drive/NeuromechanicsLab/GaitSignatures/

# Ensure fourierseries.py is in the current working directory
!ls -l fourierseries.py

import fourierseries
import util
import phaser
import dataloader
# Preprocess data for a single subject - to be sent to modeling frameworks
def find_phase(k):
    """
    Detrend and compute the phase estimate using Phaser
    INPUT:
      k -- dataframe
    OUTPUT:
      k -- dataframe
    """
    #l = ['hip_flexion_l','hip_flexion_r'] # Phase variables = hip flexion angles
    y = np.array(k)
    print(y.shape)
    y = util.detrend(y.T).T
    print(y.shape)
    phsr = phaser.Phaser(y=y)
    k[:] = phsr.phaserEval(y)[0,:]
    return k

# Interpolation function
def vonMies(t,t_0, b):
    out = np.exp(b*np.cos(t-t_0))/(2*pi*iv(0, b))
    return out

/content/drive/My Drive/NeuromechanicsLab/GaitSignatures
-rw------- 1 root root 7259 May 29  2020 fourierseries.py
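
The `vonMies` function above evaluates the von Mises density exp(b*cos(t - t_0)) / (2*pi*I0(b)), where I0 is the modified Bessel function of the first kind of order zero. As a sanity check, the sketch below re-implements it and confirms numerically that it integrates to one over a full period and peaks at `t_0`. It uses NumPy's `np.i0` instead of `scipy.special.iv` only to keep the example free of SciPy. The name `von_mises_kernel` and the values of `t_0` and `b` are illustrative assumptions.

```python
import numpy as np

def von_mises_kernel(t, t_0, b):
    """Von Mises density centred at t_0 with concentration b."""
    # np.i0 is the modified Bessel function of the first kind, order 0,
    # used here in place of scipy.special.iv(0, b) (an assumption for brevity).
    return np.exp(b * np.cos(t - t_0)) / (2 * np.pi * np.i0(b))

# One full period on a uniform grid (endpoint dropped so it is not counted twice).
t = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
dt = t[1] - t[0]
density = von_mises_kernel(t, t_0=0.5, b=4.0)

area = np.sum(density) * dt            # should be close to 1
peak_location = t[np.argmax(density)]  # should be close to t_0
print(area, peak_location)
```

Larger values of `b` concentrate the kernel more tightly around `t_0`, while `b = 0` gives a uniform density of 1/(2*pi).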


**Step 2**: Add the Google Drive project folder to the Python module search path

In [None]:
# The path to save the models and read data from
path = '/content/drive/My Drive/NeuromechanicsLab/GaitSignatures/'

# Insert the directory
import sys
sys.path.insert(0,path)

**Step 3**: Load in data and specify variables/parameters

In [None]:
# Non-changing variables 
newSavePath = '/content/drive/My Drive/SETYOURFILEPATH/GaitSignatures/00_HyperparamResults/' # Path where you want to save your results

# number of trials in dataset 
trialnum = 72 # 72 total trials

# number of samples in each trial
trialsamp = 1500

# number of features collected per trial
feats = 6

#Batch size - same as the number of traintrials
batch_size = trialnum

# Number of Layers
numlayers = 1

# Number of training epochs. If this script has been run previously, enter a value
# greater than the one used before and rerun the script.
finalepoch = 10000

# load the input data/kinematics
datafilepath = path + 'PareticvsNonP_RNNData.csv' #input data
all_csvnp = np.loadtxt(datafilepath,delimiter=',').T

# reshape all the input data into a tensor
all_inputdata_s = all_csvnp.reshape(trialnum,trialsamp,feats) 
csvnp = all_inputdata_s
print('original input data shape is: '+ str(all_csvnp.shape ))
print('input data reshaped is: '+ str(all_inputdata_s.shape))

original input data shape is: (108000, 6)
input data reshaped is: (72, 1500, 6)
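
The reshape above only recovers individual trials correctly if the rows of the 2-D array are stored trial after trial, because NumPy reshapes in row-major order. The toy sketch below illustrates this under that assumption, with small made-up sizes (`n_trials`, `n_samples`, `n_feats`) standing in for `trialnum`, `trialsamp`, and `feats`.

```python
import numpy as np

# Toy sizes (assumptions) standing in for trialnum=72, trialsamp=1500, feats=6.
n_trials, n_samples, n_feats = 3, 4, 2

# 2-D array with one row per sample and trials stacked back to back.
flat = np.arange(n_trials * n_samples * n_feats).reshape(n_trials * n_samples, n_feats)

# Row-major reshape: trial i takes rows i*n_samples .. (i+1)*n_samples - 1.
tensor = flat.reshape(n_trials, n_samples, n_feats)

trial = 1
rows = slice(trial * n_samples, (trial + 1) * n_samples)
same = np.array_equal(tensor[trial], flat[rows])
print(tensor.shape, same)  # (3, 4, 2) True
```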


# Training vs validation loss

In [None]:
# generate a list of models and corresponding parameters to test 
test_model_nodes = [128, 256,512,1024] 
seqs = [249,499,749] #lookback parameter

# Override the grid above to process a single hyperparameter combination
test_model_nodes = [512]
seqs = [499] #lookback parameter

# run multiple model architectures many times to test stability of cost function outputs
runs = 1 # Number of times to train recurrent neural network (RNN) models, starting from random initial conditions. We used this for hyperparameter selection
test_model_seq = np.repeat(seqs, runs) # Specify each model's hyperparameters

count = np.arange(runs)

All_nodes = np.empty([0,1], dtype='int')
All_seq = np.empty([0,1],dtype='int')
All_valseg = np.empty([0,1],dtype='int')
All_trainseg = np.empty([0,1],dtype='int')
All_modelname = []
All_mod_name = []
count = np.empty([0,1],dtype='int'); #initialize model run -- this serves as the model run ID number
ct = 0
for a in test_model_nodes:
  for b in test_model_seq:
    count = np.append(count,  ct + 1 )
    #if statement for valseg, trainseg based on sequence length
    if int(b) == 249:
      trainseg = 4
      valseg = 2
    elif int(b) == 499: 
      trainseg = 2
      valseg = 1
    elif int(b) == 749:
      trainseg = 1
      valseg = 1

    # Store resulting model structures and training plans
    All_nodes = np.append(All_nodes, a) 
    All_seq = np.append(All_seq, int(b))
    All_valseg = np.append(All_valseg, valseg)
    All_trainseg = np.append(All_trainseg, trainseg)
    All_modelname = np.append(All_modelname, 'UNIT_' + str(a) + '_LB_' + str(b) + '_run_' + str(count[-1]) + '/' )
    All_mod_name = np.append(All_mod_name, 'UNIT_' + str(a) + '_LB_' + str(b) + '_run_' + str(count[-1]) )

    if ct+1 < runs:
      ct += 1
    else:
      ct = 0
print(All_mod_name)

**Load training and validation loss**

In [None]:
TR = np.empty([0,1], dtype='float') # Minimum training loss, one entry per model
VAL = np.empty([0,1], dtype='float') # Minimum validation loss, one entry per model
ND = np.empty([0,1], dtype='int') # nodes
LB = np.empty([0,1], dtype='int') # Lookback

for j in range(len(All_mod_name)):
  if j != 47: # model index 47 is skipped (hard-coded exclusion)
    # extract path to store each model and generated data
    savepath = path + All_modelname[j]
    mod_name = All_mod_name[j]

    #print(savepath + mod_name + '_MIN_train_loss.npy')
    if os.path.exists(savepath + mod_name + '_MIN_train_loss.npy'):
      tempTr= np.load(savepath + mod_name + '_MIN_train_loss.npy')
      tempVal= np.load(savepath + mod_name + '_MIN_val_loss.npy')
      TR = np.append(TR,tempTr)
      VAL = np.append(VAL,tempVal)
      LB = np.append(LB, int(mod_name[mod_name.find('LB')+3:mod_name.find('LB')+6]))
      ND = np.append(ND, int(mod_name[mod_name.find('IT_')+3:mod_name.find('_LB')]))

scipy.io.savemat(newSavePath + 'trainValLoss.mat',{'TR':TR, 'VAL':VAL, 'numnode':ND, 'numLB':LB})
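
The cell above recovers the node count and lookback from each model name by slicing at fixed offsets, which relies on the lookback having exactly three digits. A regular expression is one way to make that parsing independent of the number of digits. The following is a minimal sketch, assuming every name follows the `UNIT_<nodes>_LB_<lookback>_run_<id>` pattern built earlier. `NAME_PATTERN` and `parse_model_name` are hypothetical names, not part of the original script.

```python
import re

# Pattern for names such as 'UNIT_512_LB_499_run_1' (assumed naming scheme).
NAME_PATTERN = re.compile(r"UNIT_(\d+)_LB_(\d+)_run_(\d+)")

def parse_model_name(mod_name):
    """Return (nodes, lookback, run) parsed from a model name."""
    match = NAME_PATTERN.fullmatch(mod_name)
    if match is None:
        raise ValueError(f"Unexpected model name: {mod_name!r}")
    nodes, lookback, run = (int(group) for group in match.groups())
    return nodes, lookback, run

print(parse_model_name("UNIT_512_LB_499_run_1"))    # (512, 499, 1)
print(parse_model_name("UNIT_1024_LB_749_run_10"))  # (1024, 749, 10)
```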

# Gait Signature alignment

In [None]:
# generate a list of models and corresponding parameters to test 
test_model_nodes = [128, 256,512,1024] 
seqs = [249,499,749] #lookback parameter

# Override the grid above to process a single hyperparameter combination
test_model_nodes = [512]
seqs = [499] #lookback parameter

# run multiple model architectures many times to test stability of cost function outputs
runs = 1 # Number of times to train recurrent neural network (RNN) models, starting from random initial conditions. We used this for hyperparameter selection
test_model_seq = np.repeat(seqs, runs) # Specify each model's hyperparameters

count = np.arange(runs)

All_nodes = np.empty([0,1], dtype='int')
All_seq = np.empty([0,1],dtype='int')
All_valseg = np.empty([0,1],dtype='int')
All_trainseg = np.empty([0,1],dtype='int')
All_modelname = []
All_mod_name = []
count = np.empty([0,1],dtype='int'); #initialize model run -- this serves as the model run ID number
ct = 0
for a in test_model_nodes:
  for b in test_model_seq:
    count = np.append(count,  ct + 1 )
    #if statement for valseg, trainseg based on sequence length
    if int(b) == 249:
      trainseg = 4
      valseg = 2
    elif int(b) == 499: 
      trainseg = 2
      valseg = 1
    elif int(b) == 749:
      trainseg = 1
      valseg = 1

    # Store resulting model structures and training plans
    All_nodes = np.append(All_nodes, a) 
    All_seq = np.append(All_seq, int(b))
    All_valseg = np.append(All_valseg, valseg)
    All_trainseg = np.append(All_trainseg, trainseg)
    All_modelname = np.append(All_modelname, 'UNIT_' + str(a) + '_LB_' + str(b) + '_run_' + str(count[-1]) + '/' )
    All_mod_name = np.append(All_mod_name, 'UNIT_' + str(a) + '_LB_' + str(b) + '_run_' + str(count[-1]) )

    if ct+1 < runs:
      ct += 1
    else:
      ct = 0
print(All_mod_name)

30
['UNIT_256_LB_499_run_1' 'UNIT_256_LB_499_run_2' 'UNIT_256_LB_499_run_3'
 'UNIT_256_LB_499_run_4' 'UNIT_256_LB_499_run_5' 'UNIT_256_LB_499_run_6'
 'UNIT_256_LB_499_run_7' 'UNIT_256_LB_499_run_8' 'UNIT_256_LB_499_run_9'
 'UNIT_256_LB_499_run_10' 'UNIT_256_LB_749_run_1' 'UNIT_256_LB_749_run_2'
 'UNIT_256_LB_749_run_3' 'UNIT_256_LB_749_run_4' 'UNIT_256_LB_749_run_5'
 'UNIT_256_LB_749_run_6' 'UNIT_256_LB_749_run_7' 'UNIT_256_LB_749_run_8'
 'UNIT_256_LB_749_run_9' 'UNIT_256_LB_749_run_10' 'UNIT_512_LB_499_run_1'
 'UNIT_512_LB_499_run_2' 'UNIT_512_LB_499_run_3' 'UNIT_512_LB_499_run_4'
 'UNIT_512_LB_499_run_5' 'UNIT_512_LB_499_run_6' 'UNIT_512_LB_499_run_7'
 'UNIT_512_LB_499_run_8' 'UNIT_512_LB_499_run_9' 'UNIT_512_LB_499_run_10']
/content/drive/My Drive/NeuromechanicsLab/GaitSignatures/UNIT_256_LB_499_run_1/UNIT_256_LB_499_run_1_bestwhole.h5
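
The model names printed above follow the pattern `UNIT_<nodes>_LB_<lookback>_run_<id>`, with run IDs cycling from 1 to `runs` within each hyperparameter combination. The same list can be built more compactly with `itertools.product` and a lookup table for the segment counts. This is a sketch of an equivalent construction, not the code used to train the models. `SEGMENTS` and `build_model_specs` are illustrative names, and the segment values are copied from the `if`/`elif` chain in the cell above.

```python
from itertools import product

# lookback -> (trainseg, valseg), copied from the if/elif chain in the notebook.
SEGMENTS = {249: (4, 2), 499: (2, 1), 749: (1, 1)}

def build_model_specs(nodes_list, lookbacks, runs):
    """Return one spec dict per (nodes, lookback, run) combination."""
    specs = []
    # product varies the last argument fastest, so runs of one
    # hyperparameter combination stay contiguous in the output.
    for nodes, lookback, run in product(nodes_list, lookbacks, range(1, runs + 1)):
        trainseg, valseg = SEGMENTS[lookback]
        specs.append({
            "name": f"UNIT_{nodes}_LB_{lookback}_run_{run}",
            "nodes": nodes,
            "lookback": lookback,
            "trainseg": trainseg,
            "valseg": valseg,
        })
    return specs

specs = build_model_specs([256, 512], [499, 749], runs=2)
print([spec["name"] for spec in specs])
```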


**Load short & long-time gait signature alignment**

In [None]:
n_models = len(All_mod_name) // runs # Number of hyperparameter combinations
ST = np.zeros((trialnum,runs,n_models)) # [subject x runs x models]
R2_ST = np.zeros((trialnum,runs,n_models)) # [subject x runs x models]

for j in range(len(All_mod_name)):
  # extract path to store each model and generated data
  savepath = path + All_modelname[j]
  mod_name = All_mod_name[j]

  # Load results
  temp = scipy.io.loadmat(savepath + mod_name + '_shortTimeAlignment.mat')
  # Load Euclidean distance alignment to identify bogus trials
  EuclidAlignment = temp['EuclideanAlignment']
  # Runs of one model are contiguous in All_mod_name: index as [run, model]
  ST[:,j % runs, j // runs] = EuclidAlignment[:,0]

  # Extract alignment results
  R2 = temp['R2']
  R2_ST[:,j % runs, j // runs] = R2[:,0]

# Remove any short-time sigs where clean cycles were not identified
badInds, temp =  np.where(np.any(ST == 0, axis = 1)) 
ST = np.delete(ST, np.unique(badInds), 0)
R2_ST = np.delete(R2_ST, np.unique(badInds), 0)


# LONG TIME ####################
n_models = len(All_mod_name) // runs # Number of hyperparameter combinations
LT = np.zeros((trialnum,runs,n_models)) # [subject x runs x models]
R2_LT = np.zeros((trialnum,runs,n_models)) # [subject x runs x models]
print(np.shape(LT))

for j in range(len(All_mod_name)):
  savepath = path + All_modelname[j]
  mod_name = All_mod_name[j]

  # Load long-time alignment results
  temp = scipy.io.loadmat(savepath + mod_name + '_LongTimeAligment.mat')
  # Extract R-squared
  R2 = temp['R2']
  # Runs of one model are contiguous in All_mod_name: index as [run, model]
  R2_LT[:,j % runs, j // runs] = R2[:,0]

# Remove bad trials (same trials as short-time alignment)
R2_LT = np.delete(R2_LT, np.unique(badInds), 0)

# Save
scipy.io.savemat(newSavePath + 'gaitSigAlignment.mat',{'R2_ST':R2_ST, 'R2_LT':R2_LT})

(72, 10, 3)
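
The bad-trial removal in the cell above treats an exact zero in the short-time alignment array as a sign that no clean gait cycles were found for that subject, and drops the subject from every result array. The toy sketch below shows the same idea with a boolean mask on made-up data. The array `scores` and its sizes are illustrative assumptions, not values from the study.

```python
import numpy as np

# Toy alignment array [subject x runs x models] with strictly positive entries.
rng = np.random.default_rng(0)
scores = rng.uniform(0.5, 1.0, size=(5, 2, 3))

# Mark two subjects as having a failed (zero) alignment in some run/model.
scores[1, 0, 2] = 0.0
scores[3, 1, 0] = 0.0

# A subject is bad if any of its run/model entries is exactly zero.
bad = np.any(scores == 0, axis=(1, 2))
bad_inds = np.flatnonzero(bad)

# Keep only subjects without any zero entries.
clean = scores[~bad]
print(bad_inds, clean.shape)  # [1 3] (3, 2, 3)
```

Reusing the same mask on every result array keeps the subject axis aligned across the short-time and long-time results, which is what reusing `badInds` does in the notebook.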
