Human Activity Recognition (HAR) via Smartphone Sensor Data

Project Overview

This repository contains a complete end-to-end Machine Learning pipeline designed to classify human physical activities (Walking, Walking Upstairs, Walking Downstairs, Sitting, Standing, and Laying) using continuous time-series data captured from smartphone accelerometers and gyroscopes.

Developed during a 3-day collaborative sprint, this project follows a standard data science workflow: exploratory data analysis (EDA), a traditional machine learning baseline, and finally a custom Deep Neural Network built to improve on that baseline's accuracy.

The Dataset & Real-World Application

Both models are trained on the UCI HAR Dataset. They play the same role as the core "engine" that runs inside modern smartwatches (e.g., Apple Watch, Fitbit) for health monitoring and workout tracking.

Input: 561 statistical features extracted from 2.56-second sliding windows (128 readings at 50 Hz, with 50% overlap) of triaxial linear acceleration and triaxial angular velocity.
Output: Multiclass classification across 6 distinct activity states.
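
To make the windowing concrete, here is a minimal sketch of how a raw sensor stream could be cut into 2.56-second windows and summarised. It is an illustration only: the UCI HAR Dataset ships with its 561 features already computed, the helper names (`sliding_windows`, `window_features`) are hypothetical, and the sine wave is a synthetic stand-in for one accelerometer axis.

```python
import math
from statistics import mean, pstdev

SAMPLE_RATE_HZ = 50          # sampling rate documented for the UCI HAR recordings
WINDOW_SIZE = 128            # 2.56 s * 50 Hz
STEP = WINDOW_SIZE // 2      # 50% overlap between consecutive windows


def sliding_windows(signal, size=WINDOW_SIZE, step=STEP):
    """Yield fixed-width, overlapping windows; a trailing partial window is dropped."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def window_features(window):
    """Summarise one window with a few simple statistics (illustrative subset only)."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "min": min(window),
        "max": max(window),
    }


# Synthetic single-axis signal (assumption): 5.12 s of a 2 Hz sine wave.
signal = [math.sin(2 * math.pi * 2 * t / SAMPLE_RATE_HZ) for t in range(256)]
windows = list(sliding_windows(signal))
features = [window_features(w) for w in windows]
print(len(windows), len(windows[0]))
```

With 256 samples and a step of 64, the loop produces three overlapping windows of 128 readings each.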

Architecture & Methodology

We structured the project as a head-to-head comparison, pitting a traditional ML ensemble against a Deep Learning approach.

Data Pipeline & EDA: Downloaded raw signals, verified class balance, and visualized sensor waveform variance (e.g., high-variance walking waves vs. low-variance resting flatlines).
The Baseline (Scikit-Learn): Trained a Random Forest Classifier (100 estimators) to establish a performance floor. Random Forests need little tuning, are relatively robust against overfitting on tabular data, and expose feature importances for inspection.
The Deep Learning Challenger (PyTorch): Engineered a Deep Feedforward Neural Network (DNN) with PyTorch.
- Architecture: 561 Input Nodes → 256 Hidden (ReLU) → Dropout (0.3) → 128 Hidden (ReLU) → 6 Output Nodes.
- Optimization: Adam Optimizer (lr=0.001) and CrossEntropyLoss, trained over 50 epochs.
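
The architecture and optimizer settings listed above can be sketched as follows. This is a reconstruction from the description, not the notebook's actual code: the class name `HARNet` and the random stand-in batch are assumptions. Note that the dataset's activity labels run from 1 to 6, while `CrossEntropyLoss` expects class indices starting at 0, so labels need to be shifted before training.

```python
import torch
from torch import nn


class HARNet(nn.Module):
    """Feedforward classifier: 561 -> 256 -> Dropout(0.3) -> 128 -> 6."""

    def __init__(self, n_features=561, n_classes=6):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 256),
            nn.ReLU(),
            nn.Dropout(p=0.3),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),  # raw logits; CrossEntropyLoss applies log-softmax
        )

    def forward(self, x):
        return self.layers(x)


model = HARNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Random stand-in batch (assumption): 32 windows, labels already shifted to 0..5.
features = torch.randn(32, 561)
labels = torch.randint(0, 6, (32,))

# One training step; a full run would loop over the data for 50 epochs.
model.train()
optimizer.zero_grad()
logits = model(features)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```

Call `model.eval()` before computing test accuracy so that dropout is disabled during inference.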

Results & Evaluation

Model	Accuracy	Training Time	Complexity
Random Forest (Baseline)	92.57%	Instant	Low
PyTorch Neural Network	94.16%	Moderate (50 Epochs)	High

Key Data Science Insight: The "Static Posture" Problem

While the PyTorch DNN reached 94.16% overall accuracy, a closer look at the confusion matrices reveals a weakness shared by both models: differentiating between SITTING and STANDING.
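
The kind of confusion-matrix check described above can be sketched in plain Python. The label lists are made up for illustration and are not results from the notebooks; `pair_confusion` is a hypothetical helper. Rows are true classes and columns are predicted classes, the same layout that `sklearn.metrics.confusion_matrix` uses.

```python
ACTIVITIES = [
    "WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
    "SITTING", "STANDING", "LAYING",
]


def confusion_matrix(y_true, y_pred, n_classes):
    """Build an n_classes x n_classes count matrix (rows: true, columns: predicted)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def pair_confusion(matrix, a, b):
    """Count samples of class a predicted as b, plus class b predicted as a."""
    return matrix[a][b] + matrix[b][a]


# Made-up labels (assumption), using 0-based class indices into ACTIVITIES.
y_true = [3, 3, 3, 4, 4, 4, 0, 0, 5, 5]
y_pred = [3, 4, 3, 4, 3, 4, 0, 0, 5, 5]

cm = confusion_matrix(y_true, y_pred, len(ACTIVITIES))
sit, stand = ACTIVITIES.index("SITTING"), ACTIVITIES.index("STANDING")
mixed_up = pair_confusion(cm, sit, stand)
accuracy = sum(cm[i][i] for i in range(len(ACTIVITIES))) / len(y_true)
print(mixed_up, accuracy)
```

In this toy example every misclassification falls in the SITTING/STANDING pair, which is the pattern to look for in the real matrices.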

The "Why": In the UCI HAR recordings the smartphone is worn at the waist, where it stays nearly motionless and in a similar orientation in both postures. Once the transition movement is complete, the gravitational acceleration profiles for sitting and standing are virtually indistinguishable to the sensors.
Future Work: To break this plateau, future iterations would require engineering specific temporal features that capture the transition (the act of standing up or sitting down) rather than relying solely on static 2.56-second windows.
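
One possible shape for such a temporal feature is the change in a window's mean relative to the previous window. The sketch below is purely hypothetical: the function names, the choice of feature, and the synthetic signal (a steady 1.0 g reading with a brief bump standing in for a posture change) are all assumptions, not part of the current pipeline.

```python
from statistics import mean


def window_means(signal, size=128, step=64):
    """Mean of each 128-sample window, taken with 50% overlap."""
    return [mean(signal[s:s + size]) for s in range(0, len(signal) - size + 1, step)]


def transition_deltas(means):
    """Change in mean relative to the previous window (0.0 for the first window)."""
    return [0.0] + [cur - prev for prev, cur in zip(means, means[1:])]


# Synthetic vertical-axis signal (assumption): steady reading, a short bump
# during the posture change, then the same steady reading again.
signal = [1.0] * 192 + [1.3] * 64 + [1.0] * 256

means = window_means(signal)
deltas = transition_deltas(means)
print(means[0] == means[-1], [round(d, 2) for d in deltas])
```

The first and last windows have identical means, so a purely static feature cannot separate them, while the non-zero deltas mark where the transition happened.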

Repository Structure

This project was built collaboratively using branch-based version control.

har.ipynb: Shakshitha's initial branch for Data Retrieval, EDA, and Baseline Modeling.
preprocessing.ipynb: Utsav's initial branch for PyTorch tensor formatting and architecture design.
Final_HAR_Showdown.ipynb: The master integration notebook. Start here. This notebook combines the entire pipeline from data ingestion to the final dual-model showdown.

How to Run

This project is fully self-contained and requires no local file downloads.

Open Final_HAR_Showdown.ipynb in Google Colab.
Click Runtime > Run all.
The notebook will automatically fetch the UCI dataset via wget, preprocess the tensors, train both models, and output the final confusion matrices.

Contributors

Utsav Saxena (@Utsav-exe) - Deep Learning Architecture & PyTorch Integration
Shakshitha M (@shagitjams) - Data Pipeline, Exploratory Analysis, & Baseline Modeling
