Long Short-Term Memory (LSTM) is a type of Recurrent Neural Network (RNN) designed to mitigate the vanishing-gradient problem of traditional RNNs. LSTM networks can learn long-term dependencies in sequential data and are widely used in applications such as speech recognition, language modeling, and time series forecasting.

The key feature of an LSTM cell is its ability to selectively remember and forget information using its memory cell and gates. The memory cell holds the cell state, a running memory that is separate from the hidden state and is carried forward from one time step to the next. The input gate determines how much new information is allowed into the memory cell, and the forget gate decides how much of the previous cell state to retain. Finally, the output gate controls how much of the current cell state is exposed as the hidden state, which is the cell's output. Each gate is a sigmoid layer whose output, a value between 0 and 1, is applied by element-wise multiplication. The candidate values written to the memory cell, and the cell state passed through the output gate, go through a tanh function.
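As a rough illustration of the gate mechanics described above, here is a minimal sketch of a single LSTM time step in plain Python. The names `lstm_step`, `preactivation`, and the `params` dictionary layout are assumptions made for this example; Keras implements the same idea internally with its own weight layout.

```python
import math

def sigmoid(z):
    """Logistic function, squashing z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def preactivation(gate_params, x, h_prev):
    """Compute W @ x + U @ h_prev + b using plain Python lists."""
    W, U, b = gate_params
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h_prev))
        + b_k
        for w_row, u_row, b_k in zip(W, U, b)
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step. params maps a gate name to (W, U, b)."""
    # Gates: sigmoid outputs in (0, 1) that scale other vectors element-wise
    f = [sigmoid(z) for z in preactivation(params["forget"], x, h_prev)]
    i = [sigmoid(z) for z in preactivation(params["input"], x, h_prev)]
    o = [sigmoid(z) for z in preactivation(params["output"], x, h_prev)]
    # Candidate values proposed for writing into the memory cell
    g = [math.tanh(z) for z in preactivation(params["candidate"], x, h_prev)]
    # New cell state: keep part of the old state, add part of the candidate
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # New hidden state: the output gate filters the squashed cell state
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

# Toy example: 2 input features, 1 hidden unit, all weights and biases zero
zero_gate = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step([1.0, -1.0], h_prev=[0.0], c_prev=[2.0], params=params)
```

With all-zero weights every gate sits at 0.5 and the candidate is 0, so the cell state is simply halved. A large positive forget bias would instead carry the old cell state through almost unchanged, which is how the cell retains long-term information.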

To train an LSTM model, we need to define the architecture, loss function, and optimization method. The input to the model is a sequence of vectors, and the output is either a sequence of predicted values or a single prediction for the whole sequence. The architecture can stack multiple LSTM layers with different numbers of hidden units, dropout rates, and activation functions. The loss function depends on the type of problem: Mean Squared Error (MSE) for regression, or Binary Cross Entropy (BCE) for binary classification. The optimization method can be Stochastic Gradient Descent (SGD), Adam, or another variant.
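To make the two loss functions concrete, the following sketch computes them by hand in plain Python. The function names and the `eps` clipping value are assumptions for illustration; in the notebook itself Keras computes the loss when given `loss='binary_crossentropy'`.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average squared difference, suited to regression targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Average negative log-likelihood of 0/1 labels given predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        # Clip probabilities away from 0 and 1 so log() stays finite
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# A confident correct prediction is penalized far less than a confident wrong one
good = binary_cross_entropy([1, 0], [0.9, 0.1])
bad = binary_cross_entropy([1, 0], [0.1, 0.9])
```

BCE fits the fall / no-fall task here because the model's final layer is a single sigmoid unit that outputs a probability.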

To get the best accuracy out of our LSTM model, we can follow this experimental procedure:

1. Preprocess the data: The input data should be normalized, and the sequences should be padded or truncated to a fixed length. We can also use techniques like data augmentation or feature engineering to enhance the data quality.

2. Split the data: We need to split the data into training, validation, and test sets. The training set is used to update the model parameters, the validation set is used to monitor the model performance and avoid overfitting, and the test set is used to evaluate the final model accuracy.

3. Define the architecture: We can start with a simple LSTM model and gradually increase the complexity by adding more layers or hidden units. Normalization layers can be added to the architecture, and training-time techniques like early stopping or learning rate scheduling can further improve the model performance.

4. Train the model: As a starting point, we can use the Adam optimizer with a learning rate of 0.001 and a batch size of 32. We can train the model for 50 epochs and monitor the validation loss to avoid overfitting. We can also use gradient clipping to prevent exploding gradients and weight decay to reduce overfitting.

5. Evaluate the model: We can use the test set to evaluate the final model. For regression problems, we can compute metrics like Mean Squared Error (MSE), Root Mean Squared Error (RMSE), or Mean Absolute Error (MAE). For classification problems such as fall detection, we can compute accuracy, precision, recall, and a confusion matrix. We can also visualize the predicted values against the ground truth to gain insights into the model behavior.

By following these steps, we can create an LSTM model that achieves high accuracy and generalizes well to new data. The key to success is to experiment with different architectures, optimization methods, and hyperparameters and choose the ones that work best for our specific problem.
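The preprocessing and splitting steps of the procedure can be sketched as follows. The helpers `pad_or_truncate` and `three_way_split`, and the fractions used, are hypothetical names and values chosen for this example. The split below keeps the original order, which is one reasonable choice for time-ordered data; shuffling first is another.

```python
def pad_or_truncate(seq, length, pad_value=0.0):
    """Return seq cut, or right-padded with pad_value, to exactly `length` items."""
    return list(seq[:length]) + [pad_value] * max(0, length - len(seq))

def three_way_split(items, val_frac=0.1, test_frac=0.2):
    """Split items, in order, into training, validation, and test parts."""
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    n_train = n - n_val - n_test
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Sequences of different lengths become fixed-length inputs
sequences = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0, 7.0, 8.0, 9.0]]
fixed = [pad_or_truncate(s, length=4) for s in sequences]

# Ten items split into training, validation, and test sets
train, val, test = three_way_split(list(range(10)))
```

Only the training part should be used to fit anything data-dependent (for example a scaler), so that the validation and test parts stay unseen until evaluation.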

LSTM Model
1. Make sure all packages are installed (Windows)

*   pip install numpy
*   pip install pandas
*   pip install matplotlib
*   pip install seaborn
*   pip install scikit-learn
*   pip install keras
*   pip install tensorflow

1. Make sure all packages are installed (Mac and/or Anaconda environment)

*   conda install numpy
*   conda install pandas
*   conda install matplotlib
*   conda install seaborn
*   conda install scikit-learn
*   conda install keras
*   conda install tensorflow

The `os` and `glob` modules used below are part of the Python standard library and need no installation.

Links:

Kaggle Data: https://www.kaggle.com/harnoor343/fall-detection-accelerometer-data 

Potential Issues:
 * can I reshape before putting in LSTM layer :(

Todo:
* understand acceleration data compared to previous acceleration data
* normalize it?
* let's use fall data only first cases (), then all the cases


In [None]:
#DO NOT RUN IF YOU ARE NOT USING WINDOWS AND/OR NOT USING ANACONDA
%pip install numpy
%pip install pandas
# os is part of the Python standard library, so it needs no install
%pip install matplotlib
%pip install seaborn
%pip install scikit-learn
%pip install "tensorflow-gpu<2.10"
%pip install "tensorflow<2.10"
%pip install "keras<2.10"

In [None]:
%pip list

In [None]:
# If you are on Google Colab run this
# from google.colab import drive
# drive.mount('/content/drive')

In [1]:
import numpy as np
import pandas as pd
import glob
import os
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn 
import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Flatten, Conv1D, MaxPooling1D
import sklearn.model_selection

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

The data is stored like this:
['SubjectID', 'Device', 'ActivityID', 'TrialNo', 'Acc_x', 'Acc_y', 'Acc_z', 'Gyr_x', 'Gyr_y', 'Gyr_z']
- The label (output) is derived from the ActivityID
    - These labels are specified in function get_activity()
    - labels <= 100 are non-fall activities. 100 < labels <= 135 are fall activities
    - the model output is the binary column 'IsFall': 1 for a fall activity, 0 otherwise

In [2]:
def data_collection():
    # create the directory 'normalized' if it does not exist
    os.makedirs('./normalized', exist_ok=True)

    # Import the CSV file.
    # On Google Colab, read the file from your Google Drive: find the file in your
    # drive, copy its path, and replace the path passed to read_csv below.
    df = pd.read_csv('FallAllD2.csv')

    return df

In [3]:
df = data_collection()

In [4]:

# convert all columns to float32
df = df.astype('float32')
#df = data_normalize(df) #Don't know if I need to normalize
print('data collected')
print(df.shape)
original_df = df.copy()

data collected
(31439800, 12)


In [None]:
df = original_df.copy()
print(df.head(5))
# only keep `SubjectID` == 1
#df = df[df['SubjectID'] == 1]
print(df.head(5))

In [10]:
def data_label(df):
    # add a new column called "IsFall" that is 1 if the ActivityID > 100, and 0 if it is not
    # (vectorized comparison; assign returns a new DataFrame, so a filtered slice is not modified in place)
    return df.assign(IsFall=(df['ActivityID'] > 100).astype('int8'))

def data_split(df):
    df = data_label(df)
    # split the data into features and labels
    x = df[['Device','Acc_x','Acc_y','Acc_z', 'Gyr_x', 'Gyr_y', 'Gyr_z']]
    #y = df['ActivityID']
    y = df['IsFall']
    
    # Split first so the scaler is fit on the training data only (no test-set leakage)
    x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.2)

    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)
    x_test = scaler.transform(x_test)

    # LSTM layers expect input of shape (samples, timesteps, features); here timesteps = 1
    x_train = x_train.reshape((x_train.shape[0], 1, x_train.shape[1]))
    x_test = x_test.reshape((x_test.shape[0], 1, x_test.shape[1]))
    print('x y shape: ', x_train.shape, y_train.shape)

    # the following is not needed, the reshape is already done
    if False:
        # reshape the labels into column vectors of shape (n, 1)
        y_train = np.array(y_train).reshape(-1, 1)
        y_test = np.array(y_test).reshape(-1, 1)
        # reshape the features for the LSTM layer (still to be fixed)
        steps = x_test.shape[1]
        x_train = np.array(x_train).reshape(x_train.shape[0], steps, 1)
        x_test = np.array(x_test).reshape(x_test.shape[0], steps, 1)

    return x_train, x_test, y_train, y_test

df = original_df.copy()
# only keep the rows where 'Device' == 2
df = df[df['Device'] == 2]
print(df.head(1))
x_train, x_test, y_train, y_test = data_split(df)
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

print("Data Split Done")


   Unnamed: 0.1  Unnamed: 0  SubjectID  Device  ActivityID  TrialNo   Acc_x  \
0           0.0         0.0        1.0     1.0       101.0      1.0  4974.0   

   Acc_y  Acc_z  Gyr_x  Gyr_y  Gyr_z  
0  803.0  684.0  581.0  184.0  204.0  
x y shape:  (25151840, 1, 7) (25151840,)
(25151840, 1, 7) (25151840,)
(6287960, 1, 7) (6287960,)
Data Split Done
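With a shape of (samples, 1, 7), each sample is a sequence of length 1, so the LSTM never sees more than one reading at a time. One possible answer to the reshape question in "Potential Issues" is to group consecutive readings into windows before training. The sketch below is only an illustration: `make_windows`, the `window` and `stride` values, and the rule that a window counts as a fall if any reading in it is a fall are all assumptions, not part of the original notebook.

```python
def make_windows(samples, labels, window, stride):
    """Group consecutive samples into overlapping windows.

    samples: list of per-timestep feature lists, in time order
    labels:  list of 0/1 labels, one per sample
    Returns (xs, ys) where each x has `window` timesteps and each y is 1
    if any sample inside the window is labelled 1 (an assumed rule).
    """
    xs, ys = [], []
    for start in range(0, len(samples) - window + 1, stride):
        xs.append(samples[start:start + window])
        ys.append(max(labels[start:start + window]))
    return xs, ys

# Toy data: 6 timesteps with 2 features each; a fall is marked at timestep 3
samples = [[float(i), float(i * 10)] for i in range(6)]
labels = [0, 0, 0, 1, 0, 0]
xs, ys = make_windows(samples, labels, window=3, stride=2)
```

Calling `np.asarray(xs)` on the result gives an array of shape (n_windows, window, n_features), which is the (samples, timesteps, features) layout the LSTM layer expects. Windows should be built per recording (for example per SubjectID, ActivityID and TrialNo group) so that no window mixes readings from two different trials.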


In [None]:
from sklearn.metrics import confusion_matrix
def model_create(x_train):
  model = Sequential()
  model.add(LSTM(512, input_shape=(x_train.shape[1], x_train.shape[2])))
  model.add(Dropout(0.2))
  
  # add a Flatten layer using x_train as input shape
  #model.add(Flatten(input_shape=x_train.shape[1:]))

  model.add(Dense(128, activation='relu'))
  model.add(Dropout(0.3))

  # for activity classification, we need 136 neurons in the output layer and categorical crossentropy as the loss function
  #model.add(Dense(136, activation='softmax'))
  #model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  
  # for IsFall classification, we need 1 neuron in the output layer and binary crossentropy as the loss function
  model.add(Dense(1, activation='sigmoid'))
  model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
  return model

from google.colab import files  # Colab-only helper for downloading files

# Base name shared by the saved model and the summary text file
file_name = 'model_device2_aggregate'

def train_and_accuracy_model(model, x_train, x_test, y_train, y_test, file_name):

  # train the model, holding out 10% of the training data for validation
  model.fit(x_train, y_train, epochs=10, batch_size=256, validation_split=0.1)

  # evaluate the model on the test set
  test_loss, test_acc = model.evaluate(x_test, y_test)
  print('Test accuracy:', test_acc)
  model.summary()
  print("Confusion Matrix")
  # threshold the sigmoid outputs at 0.5 to get 0/1 class predictions
  y_pred = (model.predict(x_test) > 0.5).astype(int).ravel()
  confusion_mtx = confusion_matrix(y_test, y_pred)
  # normalize each row so entries are fractions of the true class
  confusion_mtx_percent = confusion_mtx / confusion_mtx.sum(axis=1)[:, np.newaxis]
  print(confusion_mtx_percent)

  # Save Model Summary and Confusion Matrix into a text file
  with open(file_name + '.txt', 'w') as f:
    model.summary(print_fn=lambda x: f.write(x + '\n'))
    f.write('Confusion Matrix \n')
    f.write(str(confusion_mtx_percent))
  files.download(file_name + '.txt')  # Download to local machine

  return model

# Create and Train Model
model = model_create(x_train)
print("Model Created")
model = train_and_accuracy_model(model, x_train, x_test, y_train, y_test, file_name)
print("Model Trained")

# Save Model to Local Machine
model.save(file_name + '.h5')
files.download(file_name + '.h5')  # Download to local machine
print("Model downloaded to local machine")

# Save Model to Google Drive (requires the drive.mount cell above to have been run)
model.save('/content/drive/My Drive/' + file_name + '.h5')
print("Model saved to Google Drive")
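Accuracy alone can hide poor fall detection if one class dominates the data (whether that holds here is worth checking). As a small sketch, precision and recall for the fall class can be read off the raw confusion matrix. The function name `binary_metrics` and the example counts are made up for illustration; the matrix layout assumed is the one `sklearn.metrics.confusion_matrix` uses for labels 0 and 1, with rows as true classes and columns as predicted classes.

```python
def binary_metrics(confusion):
    """Compute metrics from a 2x2 confusion matrix [[tn, fp], [fn, tp]]."""
    (tn, fp), (fn, tp) = confusion
    total = tn + fp + fn + tp
    # Precision: of all predicted falls, how many were real falls
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all real falls, how many were detected
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tn + tp) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up counts: 100 non-fall samples and 20 fall samples
metrics = binary_metrics([[90, 10], [5, 15]])
```

For a fall detector, recall on the fall class is often the number to watch, since a missed fall is usually more costly than a false alarm.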


Functions that are not used

In [17]:
def get_activity(num = 0):
    # a dictionary that maps the case number (ActivityID) to its string label
    activity_dict = {
    101: 'Fall F, walking, trip',
    102: 'Fall F, walking, trip, rec.',
    103: 'Fall F, walking, slip',
    104: 'Fall F, walking, slip, rec.',
    105: 'Fall F, walking, slip, rot.',
    106: 'Fall F, walking, slip, rot., rec.',
    107: 'Fall B, walking, slip',
    108: 'Fall B, walking, slip, rec.',
    109: 'Fall B, walking, slip, rot.',
    110: 'Fall B, walking, slip rot., rec.',
    111: 'Fall F, walking, syncope',
    112: 'Fall B, walking, syncope',
    113: 'Fall L, walking, syncope',
    114: 'Fall, syncope, table',
    115: 'Fall F, try sit',
    116: 'Fall F, try sit, rec.',
    117: 'Fall B, try sit',
    118: 'Fall B, try sit, rec.',
    119: 'Fall L, try sit',
    120: 'Fall L, try sit, rec.',
    121: 'Fall F, jog, trip',
    122: 'Fall F, jog, trip, rec.',
    123: 'Fall F, jog, slip',
    124: 'Fall F, jog, slip, rec.',
    125: 'Fall F, jog, slip, rot.',
    126: 'Fall F, jog, slip, rot., rec.',
    127: 'Fall L, bed',
    128: 'Fall L, bed, rec.',
    129: 'Fall F, chair, syncope',
    130: 'Fall B, chair, syncope',
    131: 'Fall L, chair, syncope',
    132: 'Fall F, syncope',
    133: 'Fall B, syncope',
    134: 'Fall L, syncope',
    135: 'Fall, syncope, slide over a wall',
    1: 'Start clap hands',
    2: 'Clap hands',
    3: 'Stop clap hands',
    4: 'Clap hands 1',
    5: 'Start wave hands',
    6: 'wave hands',
    7: 'Stop wave hands',
    8: 'Raising hand up',
    9: 'Moving hand down',
    10: 'Move hand up -> down',
    11: 'Hand shaking',
    12: 'Beating a table',
    13: 'Sitting down',
    14: 'Standing up',
    15: 'Fail to stand up',
    16: 'Lying down',
    17: 'Turning while lying',
    18: 'Rising up',
    19: 'Start walking',
    20: 'Walking slowly',
    21: 'Stop walking',
    22: 'Walking quickly',
    23: 'Stumbling',
    24: 'Jogging slowly',
    25: 'Jogging quickly',
    26: 'Jumping slightly',
    27: 'Jumping strongly',
    28: 'B...'
    }
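    # fall back to a placeholder for IDs missing from the dictionary (such as the default 0)
    return activity_dict.get(num, 'Unknown activity')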

In [3]:
#Debugging function
def data_plot(df):
    #debugging function. does not need to be called, or return anything

    # plot the first 100 rows of the columns 'Acc_x', 'Acc_y' and 'Acc_z'
    df[['Acc_x', 'Acc_y', 'Acc_z']].iloc[:100].plot()

    # Versus Head (the same)
    #df[['AccelerationX_fft', 'AccelerationY_fft', 'AccelerationZ_fft']].head(100).plot()
    

In [None]:
def data_normalize(df): #Not in use
    # for each acceleration and gyroscope column,
    # add a new column that is the FFT of the original column
    # (np.fft.fft returns complex values)
    for c in ['Acc_x', 'Acc_y', 'Acc_z', 'Gyr_x', 'Gyr_y', 'Gyr_z']:
        df[c + '_fft'] = np.fft.fft(df[c])
        
    return df

def data_categorical(df): #Not in use
    # convert column 'Device' to numerical data
    df['Device'] = df['Device'].astype('category')
    df['Device'] = df['Device'].cat.codes

    return df