ML application on wearable devices to predict a person's activities.

angelxd84130/Activity-Analyzer-with-Wearable


Activity-Analyzer-with-Wearable

Create an ELT system to classify a person’s activities by analyzing the data collected by wearable devices.
View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Roadmap
  4. Contact
  5. Acknowledgements

About The Project

Download the data from the Huawei Challenge website and upload it to an AWS Aurora database, so that the large dataset is stored in a cloud service. Following the ELT (Extract, Load, Transform) approach, use SQL to pull the required data from the cloud and use machine learning algorithms to classify a person's activities. The trained model can run on wearable devices or mobile phones to classify people's activities, and could even help estimate travel speed and trajectory, which is useful for sports bracelets, maps, and anti-loss trackers.

Here's why:

  • Real-world data is messy and needs careful handling
  • The ELT approach scales better to future big data processing
  • A trained model can be reused across a wide range of applications and products
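The load-then-transform flow described above can be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for AWS Aurora, and the table name `hand_motion` and its columns are hypothetical, not the project's actual schema.

```python
import sqlite3

# In-memory SQLite stands in for the cloud database (assumption); with
# Aurora you would open a connection through a MySQL or PostgreSQL driver.
conn = sqlite3.connect(":memory:")

# Load: store the raw readings as-is. Table and column names are hypothetical.
conn.execute(
    "CREATE TABLE hand_motion "
    "(time_ms INTEGER, acc_x REAL, acc_y REAL, acc_z REAL, label TEXT)"
)
raw_rows = [
    (0, 0.1, 0.0, 9.8, "still"),
    (10, 1.2, 0.3, 10.4, "walking"),
    (20, None, 0.2, 9.9, "walking"),  # a row with a missing reading
    (30, 2.5, 0.9, 12.1, "run"),
]
conn.executemany("INSERT INTO hand_motion VALUES (?, ?, ?, ?, ?)", raw_rows)

# Transform: SQL selects only complete rows and the columns the model needs.
query = """
    SELECT time_ms, acc_x, acc_y, acc_z, label
    FROM hand_motion
    WHERE acc_x IS NOT NULL AND acc_y IS NOT NULL AND acc_z IS NOT NULL
    ORDER BY time_ms
"""
clean_rows = conn.execute(query).fetchall()
print(len(clean_rows), "complete rows loaded")
```

Keeping the raw rows in the database and filtering at query time is what distinguishes ELT from ETL: the transformation can be changed later without re-uploading the data.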

Feature Engineering

The dataset records a person carrying four sensors (on the hand, hip, backpack, and torso) while performing eight activities: still, walking, run, bike, bus, car, train, and subway. Each sensor continuously records amplitude vectors (x, y, z): x represents the forward/backward movement of the body, y represents the left/right movement, and z represents the vertical movement (jumping/squatting).

[Figure: Sensors]

Inspection of the data shows that each sensor provides several kinds of amplitude measurements. Among them, Acceleration, Gyroscope, and Magnetometer have the most complete and stable data, so the following experiments focus on these three.

Data Modeling

How to identify the eight activities is the central question of this experiment. After detailed discussion, we concluded that the xyz amplitude recorded during an activity is the best signal for judging what the user is doing.
The xyz amplitude a user generates during each activity should fall within a reference range: for example, the per-second forward (x) amplitude of walking, or the forward (x), left/right (y), and up/down (z) shaking of riding a bicycle.
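One way to inspect such reference ranges is to compute the minimum and maximum of each axis per activity. The sketch below uses made-up readings and hypothetical column names (`acc_x`, `acc_y`, `acc_z`, `label`); it is not taken from the project's code.

```python
import pandas as pd

# Made-up readings for illustration only; not values from the dataset.
readings = pd.DataFrame({
    "label": ["walking", "walking", "walking", "bike", "bike", "bike"],
    "acc_x": [1.0, 1.4, 1.2, 2.0, 2.6, 2.3],
    "acc_y": [0.2, 0.3, 0.1, 0.8, 1.1, 0.9],
    "acc_z": [0.5, 0.7, 0.6, 0.4, 0.5, 0.3],
})

# Per-activity min and max of each axis give a rough reference range.
ranges = readings.groupby("label").agg(["min", "max"])
print(ranges)
```

If the ranges of two activities overlap heavily on every axis, a range-based classifier will likely struggle to separate them.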

[Figure: Sensor]

Because the xyz amplitude of each activity falls within a certain range, a decision tree is our first-choice algorithm: it learns threshold ranges for each activity and classifies on them efficiently.

Data Processing

Because the amount of data differs between activities, we equalize the number of examples per activity to keep training balanced: we extracted 160,000 examples for each activity from the dataset
(130,000 examples for training and 30,000 examples for testing).
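A balanced per-activity draw like this could look as follows. The helper `balanced_split` and the toy data are hypothetical, and the counts are scaled down from 130,000 + 30,000 to 13 + 3 per activity so the sketch runs instantly.

```python
import numpy as np
import pandas as pd

def balanced_split(df, label_col, n_train, n_test, seed=0):
    """Draw n_train + n_test rows per label, then split each draw in two."""
    train_parts, test_parts = [], []
    for _, group in df.groupby(label_col):
        picked = group.sample(n=n_train + n_test, random_state=seed)
        train_parts.append(picked.iloc[:n_train])
        test_parts.append(picked.iloc[n_train:])
    return pd.concat(train_parts), pd.concat(test_parts)

# Toy imbalanced data: 50 "still" rows, 30 "walking" rows, 20 "run" rows.
rng = np.random.default_rng(0)
labels = ["still"] * 50 + ["walking"] * 30 + ["run"] * 20
toy = pd.DataFrame({"acc_x": rng.normal(size=100), "label": labels})

# 13 + 3 per class mirrors the 130,000 + 30,000 ratio at a small scale.
train, test = balanced_split(toy, "label", n_train=13, n_test=3)
print(train["label"].value_counts().to_dict())
```

Sampling without replacement requires every activity to have at least `n_train + n_test` rows; `DataFrame.sample` raises an error otherwise.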

[Figure: DataDistribution]

First, the data from the four body parts is joined on time, and then the features needed for machine learning are extracted. Each body part contributes 9 features (the x, y, z values of the three sensor types), so with four body parts the training set contains 4 × 9 + 1 (time) = 37 features.
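The time-based join can be sketched with pandas. The column naming scheme (`hand_acc_x` and so on), the helper `make_part_frame`, and the random values are assumptions for illustration; only the shape of the result follows the description above.

```python
from functools import reduce

import numpy as np
import pandas as pd

BODY_PARTS = ["hand", "hip", "backpack", "torso"]
SENSORS = ["acc", "gyr", "mag"]  # acceleration, gyroscope, magnetometer
AXES = ["x", "y", "z"]

def make_part_frame(part, times, rng):
    """Build a toy frame with a time column plus 9 features for one body part."""
    data = {"time": times}
    for sensor in SENSORS:
        for axis in AXES:
            data[f"{part}_{sensor}_{axis}"] = rng.normal(size=len(times))
    return pd.DataFrame(data)

rng = np.random.default_rng(0)
times = np.arange(5)
frames = [make_part_frame(part, times, rng) for part in BODY_PARTS]

# Inner-join on time so each row holds all four body parts at one timestamp.
combined = reduce(lambda left, right: left.merge(right, on="time"), frames)
print(combined.shape)
```

An inner join drops timestamps that are missing from any body part, which is one simple way to keep only rows where all four sensors reported.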

[Figure: DataDescription]

Next, consider how the decision tree is built: the tree grows new branches whenever it meets examples that do not fit its existing ones. Randomly shuffling the time-ordered activity examples is intended to keep the tree structure balanced and to reduce overfitting.
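A minimal sketch of the shuffling step, using a toy frame with hypothetical `time` and `label` columns. Shuffling also keeps a later train/test split from being dominated by one contiguous stretch of a single activity.

```python
import pandas as pd

# Toy time-ordered data: each activity occupies one contiguous stretch.
ordered = pd.DataFrame({
    "time": range(9),
    "label": ["still"] * 3 + ["walking"] * 3 + ["run"] * 3,
})

# sample(frac=1) returns every row in random order; the seed is arbitrary.
shuffled = ordered.sample(frac=1, random_state=42).reset_index(drop=True)
print(shuffled["label"].tolist())
```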

Data Training & Testing

Use 1,040,000 (81.25%) examples as training data and 240,000 (18.75%) examples as testing data.
Train and test with the decision tree implementation in the scikit-learn library; the final accuracy is 71.61%.
Then draw the decision tree and the confusion matrix.
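The training and testing steps might look roughly like this with scikit-learn. The data here is synthetic and uses only three activities and three features, so the printed accuracy says nothing about the 71.61% reported for the real dataset; the hyperparameters are library defaults, not necessarily the project's settings.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

ACTIVITIES = ["still", "walking", "run"]  # subset of the eight, for brevity

# Synthetic stand-in data: each activity gets its own mean amplitude.
rng = np.random.default_rng(0)
X = np.vstack(
    [rng.normal(loc=i * 3.0, scale=1.0, size=(200, 3)) for i in range(3)]
)
y = np.repeat(ACTIVITIES, 200)

# test_size=0.1875 mirrors the 81.25% / 18.75% split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1875, stratify=y, random_state=0
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred, labels=ACTIVITIES)
print(f"accuracy: {accuracy:.4f}")
print(matrix)
```

To draw the fitted tree, `sklearn.tree.plot_tree(model)` can be used together with matplotlib; limiting `max_depth` in the plot call keeps large trees readable.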

[Figure: Tree]

Evaluation

The confusion matrix not only shows the prediction accuracy for each activity, but also helps identify which activities have similar data and are easily mistaken for one another.
With further parameter tuning, model optimization, and integration, the accuracy of activity classification is expected to improve.
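Both readings of the confusion matrix can be computed directly. The matrix below is made up for illustration (four activities, 100 true examples each) and is not the project's result.

```python
import numpy as np

ACTIVITIES = ["still", "walking", "run", "bike"]

# Made-up confusion matrix (rows = true activity, columns = predicted).
matrix = np.array([
    [90,  5,  0,  5],
    [10, 70, 15,  5],
    [ 0, 20, 75,  5],
    [ 5,  5,  5, 85],
])

# Per-activity accuracy (recall): correct predictions / true examples.
per_class = matrix.diagonal() / matrix.sum(axis=1)

# Zero the diagonal to find the largest off-diagonal (most confused) cell.
errors = matrix.copy()
np.fill_diagonal(errors, 0)
true_idx, pred_idx = np.unravel_index(errors.argmax(), errors.shape)

for name, score in zip(ACTIVITIES, per_class):
    print(f"{name}: {score:.2f}")
print(f"most confused: {ACTIVITIES[true_idx]} predicted as {ACTIVITIES[pred_idx]}")
```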

[Figure: ConfusionMatrix]

Built With

Getting Started

  1. Download the dataset from the Huawei Challenge website
  2. Data Processing
    • Deal with missing data
    • Combine the data from the 4 sensors on the user's body
  3. Data Modeling
    • Decision Tree
    • Random Forest (improvement)
  4. Check the plots to see the prediction results
    • Accuracies
    • Confusion Matrix
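For the modeling step, a decision tree and a random forest can be compared side by side as sketched below. The data is synthetic (8 classes, 36 features, generated with `make_classification`), and `n_estimators=100` is the scikit-learn default rather than a tuned value, so the numbers only show how to run the comparison, not what to expect on the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 8-class data with 36 features stands in for the sensor table.
X, y = make_classification(
    n_samples=2000, n_features=36, n_informative=12,
    n_classes=8, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1875, stratify=y, random_state=0
)

# Fit both models on the same split so their scores are comparable.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    X_train, y_train
)

tree_acc = tree.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print(f"decision tree: {tree_acc:.4f}, random forest: {forest_acc:.4f}")
```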

Roadmap

See the open issues for a list of proposed features (and known issues).

Contact

Yu-Chieh Wang - LinkedIn
email: angelxd84130@gmail.com

Acknowledgements
