#Implementing Human Activity Recognition (HAR) using Keras with CNN and LSTM algorithms

**Objective:** To implement Human Activity Recognition using Keras with CNN and LSTM

**Data**: The UCI Human Activity Recognition (HAR) dataset is used. It records six activities: WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, and LAYING. The dataset is partitioned into train and test sets.
The data is a time-series representation and is suitable for classification tasks.

**Model architecture used:** CNN and LSTM, trained from scratch in this notebook (no pre-trained weights are loaded)

**Description:** CNN and LSTM layers are combined to perform Human Activity Recognition. Keras, as shipped with the TensorFlow framework, is used to build the convolutional layers and the LSTM model.

**Libraries used**: NumPy, pandas, scikit-learn, Matplotlib, and TensorFlow

**Algorithms used**: CNN and LSTM

**Reference used**: [Correct link to be updated](https://www.kaggle.com/datasets/meetnagadia/human-action-recognition-har-dataset)


In [1]:
!pip install datasets


In [1]:
pip install rarfile



In [4]:
!pip install patool



In [2]:
pip install tensorflow pandas numpy scikit-learn



In [17]:
pip install tensorflow pandas numpy scikit-learn urllib3



###**Importing required libraries and adding activity labels**

In [28]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense, Dropout
import urllib.request
import zipfile
import os

# Define activity labels
ACTIVITIES = {
    0: 'WALKING',
    1: 'WALKING_UPSTAIRS',
    2: 'WALKING_DOWNSTAIRS',
    3: 'SITTING',
    4: 'STANDING',
    5: 'LAYING'
}

###**Defining the download_and_extract_dataset() function to download the data from the UCI website**

In [29]:
def download_and_extract_dataset():
    """Download and extract the UCI HAR dataset"""
    if not os.path.exists("UCI HAR Dataset"):
        print("Downloading dataset...")
        try:
            urllib.request.urlretrieve(
                "https://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip",
                "UCI HAR Dataset.zip"
            )
            print("Download complete!")

            print("Extracting files...")
            with zipfile.ZipFile("UCI HAR Dataset.zip", 'r') as zip_ref:
                zip_ref.extractall()

            os.remove("UCI HAR Dataset.zip")
            print("Extraction complete!")
        except Exception as e:
            print(f"Error during download/extraction: {str(e)}")
            return False
    else:
        print("Dataset directory already exists.")
    return True

###**Defining the load_data() function to load the train and test sets from the UCI HAR dataset**

In [30]:
def load_data():
    """Load and prepare the UCI HAR dataset"""
    try:
        print("Loading training data...")
        # sep=r'\s+' splits on runs of whitespace; it replaces the deprecated delim_whitespace=True
        X_train = pd.read_csv('UCI HAR Dataset/train/X_train.txt', sep=r'\s+', header=None)
        y_train = pd.read_csv('UCI HAR Dataset/train/y_train.txt', names=['activity'], header=None)

        print("Loading test data...")
        X_test = pd.read_csv('UCI HAR Dataset/test/X_test.txt', sep=r'\s+', header=None)
        y_test = pd.read_csv('UCI HAR Dataset/test/y_test.txt', names=['activity'], header=None)

        print(f"Training data shape: {X_train.shape}")
        print(f"Test data shape: {X_test.shape}")

        # Reshape data for CNN-LSTM: each 561-feature row becomes a sequence of length 1
        n_timesteps = 1
        n_features = 561

        print("Reshaping data...")
        X_train_reshaped = np.array(X_train).reshape(-1, n_timesteps, n_features)
        X_test_reshaped = np.array(X_test).reshape(-1, n_timesteps, n_features)

        print(f"Reshaped training data shape: {X_train_reshaped.shape}")
        print(f"Reshaped test data shape: {X_test_reshaped.shape}")

        # Convert labels to categorical
        y_train_cat = tf.keras.utils.to_categorical(y_train - 1)
        y_test_cat = tf.keras.utils.to_categorical(y_test - 1)

        print("Data preparation complete!")
        return X_train_reshaped, y_train_cat, X_test_reshaped, y_test_cat, y_test

    except Exception as e:
        print(f"Error during data loading: {str(e)}")
        raise
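
The reshape and label-encoding steps in `load_data()` can be illustrated on a small synthetic array. This is a minimal sketch using NumPy only: the array `X_demo`, the labels `y_demo`, and the `np.eye`-based one-hot encoding are stand-ins chosen for illustration (the notebook itself uses `tf.keras.utils.to_categorical`).

```python
import numpy as np

n_timesteps, n_features, n_classes = 1, 561, 6

# Hypothetical stand-in for X_train: 4 samples x 561 features
X_demo = np.zeros((4, n_features))
# Hypothetical labels in the dataset's 1-based encoding (1..6)
y_demo = np.array([1, 3, 6, 2])

# Same reshape as load_data(): (samples, features) -> (samples, timesteps, features)
X_demo_reshaped = X_demo.reshape(-1, n_timesteps, n_features)

# Shift labels to 0-based, then one-hot encode by indexing an identity matrix
y_demo_cat = np.eye(n_classes)[y_demo - 1]

print(X_demo_reshaped.shape)  # (4, 1, 561)
print(y_demo_cat.shape)       # (4, 6)
print(y_demo_cat[0])          # [1. 0. 0. 0. 0. 0.]
```

Subtracting 1 matters because the label files count activities from 1, while the `ACTIVITIES` dictionary and the softmax output are indexed from 0.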

###**Defining the create_model() function to build a Sequential model that stacks a convolutional layer with 'relu' activation and LSTM layers**

In [31]:
def create_model(n_timesteps, n_features, n_classes):
    """Create CNN-LSTM model"""
    print("Creating model...")
    model = Sequential([
        # Declare the input shape with an explicit Input layer instead of input_shape=
        tf.keras.Input(shape=(n_timesteps, n_features)),
        Conv1D(filters=64, kernel_size=1, activation='relu'),
        MaxPooling1D(pool_size=1),
        Dropout(0.2),

        LSTM(64, return_sequences=True),
        Dropout(0.2),
        LSTM(32),
        Dropout(0.2),

        Dense(32, activation='relu'),
        Dense(n_classes, activation='softmax')
    ])

    print("Compiling model...")
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model
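
To get a feel for the size of this architecture, the trainable parameters of each layer can be worked out by hand. The sketch below assumes Keras defaults (bias terms enabled, a single bias vector per LSTM gate) and the values used in this notebook (561 input features, 6 classes); the helper functions are hypothetical names introduced for illustration, and the printed numbers are computed here rather than copied from the notebook's `model.summary()` output.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One kernel of shape (kernel_size, in_channels) per filter, plus one bias per filter
    return kernel_size * in_channels * filters + filters

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return input_dim * units + units

layer_params = {
    "Conv1D(64, kernel_size=1)": conv1d_params(1, 561, 64),
    "LSTM(64)": lstm_params(64, 64),
    "LSTM(32)": lstm_params(64, 32),
    "Dense(32)": dense_params(32, 32),
    "Dense(6)": dense_params(32, 6),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:<28}{count:>8,}")
print(f"{'Total':<28}{total_params:>8,}")
```

The pooling and dropout layers add no parameters. With `kernel_size=1` and a single timestep, the convolution acts like a dense projection from 561 features to 64 channels.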

###**Defining the predict_activities() function to run the model on the test set and print the prediction statistics**

In [32]:
def predict_activities(model, X_test, y_test):
    """Predict activities and show results"""
    # Make predictions
    print("\nPredicting activities...")
    y_pred = model.predict(X_test)
    y_pred_classes = np.argmax(y_pred, axis=1)


    y_test_classes = y_test['activity'].values - 1

    # Print some sample predictions
    print("\nSample Activity Predictions:")
    print("================================")
    print("Actual Activity | Predicted Activity")
    print("--------------------------------")

    # Show first 20 predictions
    for actual, pred in zip(y_test_classes[:20], y_pred_classes[:20]):
        print(f"{ACTIVITIES[actual]:<15} | {ACTIVITIES[pred]}")

    # Calculate accuracy
    correct_predictions = sum(y_test_classes == y_pred_classes)
    total_predictions = len(y_test_classes)
    accuracy = (correct_predictions / total_predictions) * 100

    print("\nPrediction Statistics:")
    print(f"Total samples: {total_predictions}")
    print(f"Correct predictions: {correct_predictions}")
    print(f"Accuracy: {accuracy:.2f}%")

    # Print activity-wise accuracy
    print("\nActivity-wise Accuracy:")
    print("================================")
    for activity_id, activity_name in ACTIVITIES.items():
        mask = y_test_classes == activity_id
        if np.any(mask):
            activity_accuracy = (sum((y_pred_classes == activity_id) & mask) / sum(mask)) * 100
            print(f"{activity_name:<20}: {activity_accuracy:.2f}%")
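
The activity-wise accuracy above is the diagonal of a confusion matrix divided by its row sums. A minimal NumPy sketch of that relationship follows; the `confusion_matrix` helper and the three-class demo labels are hypothetical and only serve to illustrate the computation, they are not part of the notebook.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly when the same (row, col) pair repeats
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Hypothetical labels for three classes, for illustration only
y_true_demo = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred_demo = np.array([0, 1, 1, 1, 2, 0, 2])

cm_demo = confusion_matrix(y_true_demo, y_pred_demo, n_classes=3)

# Correct predictions per class divided by the number of samples of that class
per_class_accuracy = cm_demo.diagonal() / cm_demo.sum(axis=1)

print(cm_demo)
print(per_class_accuracy)
```

Unlike the per-class percentages alone, the off-diagonal entries show which activities are confused with each other. The division assumes every class occurs at least once in `y_true_demo`.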

In [33]:
def main():
    print("Starting HAR classification program...")

    if not download_and_extract_dataset():
        print("Failed to prepare dataset. Exiting...")
        return

    try:
        # Load and prepare data
        print("\nPreparing data...")
        X_train, y_train, X_test, y_test_cat, y_test = load_data()

        # Model parameters
        n_timesteps = 1
        n_features = 561
        n_classes = 6

        # Create and compile model
        print("\nPreparing model...")
        model = create_model(n_timesteps, n_features, n_classes)

        print("\nModel summary:")
        model.summary()

        # Train model
        print("\nTraining model...")
        history = model.fit(
            X_train, y_train,
            epochs=10,
            batch_size=32,
            validation_split=0.2,
            verbose=1
        )

        # Evaluate and predict
        print("\nEvaluating model...")
        loss, accuracy = model.evaluate(X_test, y_test_cat, verbose=1)
        print(f"\nTest accuracy: {accuracy*100:.2f}%")

        # Show predictions
        predict_activities(model, X_test, y_test)

    except Exception as e:
        print(f"\nAn error occurred: {str(e)}")
        raise

if __name__ == "__main__":
    main()

Starting HAR classification program...
Dataset directory already exists.

Preparing data...
Loading training data...
Loading test data...
Training data shape: (7352, 561)
Test data shape: (2947, 561)
Reshaping data...
Reshaped training data shape: (7352, 1, 561)
Reshaped test data shape: (2947, 1, 561)
Data preparation complete!

Preparing model...
Creating model...
Compiling model...

Model summary:

Training model...
Epoch 1/10
184/184 ━━━━━━━━━━━━━━━━━━━━ 5s 11ms/step - accuracy: 0.3544 - loss: 1.4658 - val_accuracy: 0.6608 - val_loss: 0.7003
Epoch 2/10
184/184 ━━━━━━━━━━━━━━━━━━━━ 2s 13ms/step - accuracy: 0.7135 - loss: 0.6067 - val_accuracy: 0.9069 - val_loss: 0.2588
Epoch 3/10
184/184 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.8538 - loss: 0.3642 - val_accuracy: 0.9171 - val_loss: 0.2251
Epoch 4/10
184/184 ━━━━━━━━━━━━━━━━━━━━ 2s 8ms/step - accuracy: 0.9058 - loss: 0.2441 - val_accuracy: 0.9198 - val_loss: 0.2281
Epoch 5/10
184/184 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.9279 - loss: 0.2025 - val_accuracy: 0.9334 - val_loss: 0.2051
Epoch 6/10
184/184 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.9395 - loss: 0.1563 - val_accuracy: 0.9395 - val_loss: 0.1365
Epoch 7
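
The `184/184` counter in the training log follows from the arguments passed to `model.fit`. The short calculation below is a sketch that assumes Keras holds out the last `validation_split` fraction of the training arrays for validation, as its documentation describes; the variable names are chosen here for illustration.

```python
import math

n_train_total = 7352      # rows in X_train, from the shapes printed earlier
validation_split = 0.2
batch_size = 32

# The leading part is used for fitting, the tail is held out for validation
n_fit = math.floor(n_train_total * (1.0 - validation_split))
n_val = n_train_total - n_fit

# The last batch of an epoch may be smaller than batch_size, hence the ceiling
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_fit, n_val, steps_per_epoch)  # 5881 1471 184
```

Because the validation samples come from the end of the file rather than a random draw, the `val_accuracy` values reflect that particular slice of the training set.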

After training for 10 epochs, the CNN-LSTM model reaches good accuracy. LSTM layers are typically used to recognize activities over longer sequences because they can retain long-range dependencies; in this notebook, however, each sample is a single timestep of 561 features, so that capability is not exercised. Prediction statistics are computed and displayed once `main()` has called all of the helper functions: the output lists sample predicted activities along with the performance metrics. The test set contains 2947 samples in total.

Of these, 2703 were predicted correctly, for an accuracy of 91.72%.