The objective of this case study is to build a model that predicts human activities (Walking, Walking Upstairs, Walking Downstairs, Standing, Sitting, or Laying) from smartphone sensors: the accelerometer, which measures acceleration, and the gyroscope, which measures angular velocity.
The dataset was collected from 30 people performing different activities with a smartphone attached to their waist. It consists of the tri-axial accelerometer and gyroscope readings recorded over time. So, we have 6 time series as input to a 6-class classification problem.
The sensor signals (accelerometer and gyroscope) were pre-processed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 sec and 50% overlap (128 readings/window).
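The windowing step can be illustrated with a minimal sketch. It assumes the signal is a NumPy array of shape (n_samples, n_channels); the function name `sliding_windows` and the synthetic input are hypothetical and only stand in for the real, noise-filtered sensor data. Note that 128 readings over 2.56 sec implies a 50 Hz sampling rate.

```python
import numpy as np

def sliding_windows(signal, window_size=128, overlap=0.5):
    """Split a (n_samples, n_channels) signal into overlapping fixed-width windows."""
    step = int(window_size * (1 - overlap))  # 64 samples for 50% overlap
    n_windows = (len(signal) - window_size) // step + 1
    if n_windows <= 0:
        # Signal is shorter than one window: return an empty batch
        return np.empty((0, window_size, signal.shape[1]))
    return np.stack(
        [signal[i * step : i * step + window_size] for i in range(n_windows)]
    )

# 128 readings per 2.56 s window implies 50 readings per second
sampling_rate_hz = round(128 / 2.56)

# Synthetic stand-in (assumption): 10 s of 6-channel data,
# i.e. 3 accelerometer axes + 3 gyroscope axes
rng = np.random.default_rng(0)
signal = rng.normal(size=(10 * sampling_rate_hz, 6))

windows = sliding_windows(signal)
print(windows.shape)  # (n_windows, 128, 6)
```

With 50% overlap, the second half of each window is identical to the first half of the next one, which is why consecutive records in the dataset are strongly correlated.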
For each record, the dataset provides:
- Triaxial acceleration from the accelerometer (total acceleration) and the estimated body acceleration.
- Triaxial Angular velocity from the gyroscope.
- A 561-feature vector with time and frequency domain variables.
- Its activity label (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).
- An identifier of the subject who carried out the experiment.
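Exploratory analysis of the data showed the following:
- Number of data points per activity: the classes are balanced enough that imbalance is not a problem.
- The activities fall into two groups: stationary (Sitting, Standing, Laying) and moving (Walking, Walking Upstairs, Walking Downstairs).
- The magnitude of the acceleration separates the static and dynamic activities well.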
- t-SNE plot: there is a lot of confusion between the Standing and Sitting activities.
- We have 561 hand-crafted features, engineered by domain experts and derived from the frequency and amplitude variation of the time series data.
- Apart from that, we have the raw time series collected directly from the sensors, which are fed into deep learning algorithms that learn the features automatically and make the predictions.
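To make the observations about acceleration magnitude and the time- and frequency-domain features more concrete, here is a minimal sketch. It is not the dataset's actual 561-feature pipeline: the function names, the three features, and the synthetic "static" and "dynamic" windows are all assumptions chosen for illustration. The 50 Hz rate follows from 128 readings per 2.56 sec window.

```python
import numpy as np

def acceleration_magnitude(acc_xyz):
    """Euclidean norm over the last axis: (..., 3) -> (...)."""
    return np.linalg.norm(acc_xyz, axis=-1)

def window_features(window, sampling_rate_hz=50.0):
    """A few illustrative features for one (n_readings, 3) acceleration window."""
    mag = acceleration_magnitude(window)
    # Remove the mean so the constant (gravity) component does not dominate the spectrum
    spectrum = np.abs(np.fft.rfft(mag - mag.mean()))
    freqs = np.fft.rfftfreq(len(mag), d=1.0 / sampling_rate_hz)
    return {
        "mag_mean": float(mag.mean()),                          # time domain
        "mag_std": float(mag.std()),                            # time domain
        "dominant_freq_hz": float(freqs[np.argmax(spectrum)]),  # frequency domain
    }

# Synthetic windows (assumption): a static window is gravity plus small noise,
# a dynamic window adds a 2 Hz oscillation on the vertical axis
rng = np.random.default_rng(0)
t = np.arange(128) / 50.0
static = np.tile([0.0, 0.0, 1.0], (128, 1)) + rng.normal(scale=0.01, size=(128, 3))
dynamic = static.copy()
dynamic[:, 2] += 0.5 * np.sin(2 * np.pi * 2.0 * t)

static_features = window_features(static)
dynamic_features = window_features(dynamic)
print(static_features)
print(dynamic_features)
```

In this toy setup the standard deviation of the magnitude is far larger for the dynamic window than for the static one, which is the kind of separation the magnitude observation refers to.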
Machine Learning Models
All the relevant machine learning models were hyperparameter-tuned on the 561 hand-crafted features for the human activity recognition problem.

| Algorithm | Test Accuracy |
|---|---|
| Logistic Regression | 96.27% |
| Linear SVC | 96.61% |
| RBF SVM classifier | 96.27% |
| Decision Tree | 86.43% |
| Random Forest | 91.31% |
| Gradient Boosting DT | 91.31% |

The best-performing model is the Linear Support Vector Classifier.
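As an illustration of how such tuning might look for the Linear SVC, here is a minimal scikit-learn sketch. The data is a synthetic stand-in (6 well-separated classes, 20 features) rather than the real 561-feature vectors, and the grid of `C` values and the 3-fold cross-validation are assumptions, not the settings used in this study.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic stand-in for the feature matrix (assumption): 6 classes, 20 features
rng = np.random.default_rng(0)
n_classes, n_features = 6, 20
centers = rng.normal(scale=3.0, size=(n_classes, n_features))

def make_split(n_per_class):
    """Draw n_per_class noisy points around each class center."""
    X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y

X_train, y_train = make_split(50)
X_test, y_test = make_split(20)

# Tune the regularization strength C with cross-validated grid search
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LinearSVC(max_iter=5000), param_grid=param_grid, cv=3)
search.fit(X_train, y_train)

# Accuracy of the refit best estimator on the held-out split
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 4))
```

The same pattern applies to the other models in the table by swapping the estimator and its parameter grid.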
Deep Learning Models
In deep learning, LSTM is one of the most effective algorithms for raw time-series data. The following configurations were tried:
- 1-layer LSTM with 128 hidden units and dropout = 0.5: 92.53%
- 1-layer LSTM with 324 hidden units and dropout = 0.6: 92.22%
- 1-layer LSTM with 324 hidden units and dropout = 0.6: 89%
- 2-layer LSTM with h1 = 128 and h2 = 64 hidden units and dropout of 0.2 and 0.5 respectively: 92.77%
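The first configuration above could be sketched in Keras roughly as follows. This is an assumption-laden sketch, not the code used in this study: the function name `build_lstm`, the placement of dropout after the LSTM layer, the optimizer, and the loss are all illustrative choices. The input shape assumes 128 readings per window and the 6 time series mentioned earlier.

```python
from tensorflow import keras

def build_lstm(timesteps=128, n_signals=6, hidden_units=128, dropout=0.5, n_classes=6):
    """1-layer LSTM classifier over raw sensor windows (illustrative sketch)."""
    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_signals)),
        keras.layers.LSTM(hidden_units),       # returns the final hidden state only
        keras.layers.Dropout(dropout),         # assumption: dropout applied after the LSTM
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    # Assumed training setup: one-hot labels with categorical cross-entropy
    model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
    return model

model = build_lstm()
model.summary()
```

A 2-layer variant would stack a second `LSTM` layer, with `return_sequences=True` set on the first so that the second layer receives the full sequence.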
Relying on a single algorithm is not always the best idea. This became clear when we applied a CNN and got an accuracy of 92.80%, which is slightly better than the LSTM approach.

| Algorithm | Test Accuracy |
|---|---|
| LSTM | 92.77% |
| CNN | 92.80% |
| Divide and Conquer-Based with CNN | 94.6% |
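The divide-and-conquer idea can be sketched as a two-stage prediction: first decide whether a window is dynamic or static, then hand it to a 3-class classifier specialized for that group. The sketch below shows only this routing logic. The function `predict_two_stage` and the toy stand-in classifiers are hypothetical; in the actual approach the second-stage classifiers would be trained 1D CNNs, and the test data sharpening step from the referenced paper is not shown.

```python
import numpy as np

DYNAMIC = ["WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS"]
STATIC = ["SITTING", "STANDING", "LAYING"]

def predict_two_stage(windows, is_dynamic, classify_dynamic, classify_static):
    """Route each window to a group-specific 3-class classifier.

    is_dynamic(windows)       -> boolean array of shape (n,)
    classify_dynamic(windows) -> integer indices (0..2) into DYNAMIC
    classify_static(windows)  -> integer indices (0..2) into STATIC
    """
    windows = np.asarray(windows)
    dynamic_mask = np.asarray(is_dynamic(windows), dtype=bool)
    labels = np.empty(len(windows), dtype=object)
    if dynamic_mask.any():
        idx = classify_dynamic(windows[dynamic_mask])
        labels[dynamic_mask] = [DYNAMIC[i] for i in idx]
    if (~dynamic_mask).any():
        idx = classify_static(windows[~dynamic_mask])
        labels[~dynamic_mask] = [STATIC[i] for i in idx]
    return labels

# Toy stand-ins (assumptions) so the routing can be run end to end
toy_is_dynamic = lambda w: w.std(axis=(1, 2)) > 0.1           # high variance -> dynamic
toy_classify_dynamic = lambda w: np.zeros(len(w), dtype=int)  # always index 0
toy_classify_static = lambda w: np.ones(len(w), dtype=int)    # always index 1

flat = np.zeros((128, 3))
wavy = np.sin(np.linspace(0, 20, 128))[:, None] * np.ones((1, 3))
labels = predict_two_stage(
    np.stack([flat, wavy]), toy_is_dynamic, toy_classify_dynamic, toy_classify_static
)
print(list(labels))
```

Splitting the problem this way lets each second-stage model focus on the harder within-group distinctions, such as Sitting versus Standing.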
References
- Deep Learning Models for Human Activity Recognition, machinelearningmastery.com
- Applied AI Course
- Divide and Conquer-Based 1D CNN Human Activity Recognition Using Test Data Sharpening (paper)