Human activity recognition is one of the important problems in computer vision. It involves accurately identifying the activity being performed by a person or a group of people.
In this project, we classify yoga poses. The dataset contains 19 types of asanas and about 29K images for training a machine learning model.
Yoga pose estimation has multiple applications, such as building a mobile application that acts as a yoga trainer.
This is a Kaggle competition. The competition link is: Link
The competition ran for one month, and submissions had to be made on a weekly basis.
We are given around 29K training images covering 19 types of asanas, and the images are taken from 4 different camera angles. The camera angles of the training and test images differ: the training images are taken from 3 camera angles, and the test images are taken from the fourth. The task is to predict the yoga poses in the test data using machine learning.
I used two techniques to predict yoga poses:
First, I built a simple CNN model to predict yoga poses. Initially, it reached 35% accuracy on the test data. After analysing the images, I applied appropriate data augmentation techniques, which improved accuracy from 35% to 76% with the same model. The data augmentation techniques I used are:
- Padding
- HorizontalFlip with Padding
- RandomPerspective with Padding
- HorizontalFlip with RandomPerspective and Padding
The detailed analysis can be found in Report.pdf under the Week 1 section.
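To make the padding and flip steps concrete, here is a minimal sketch in plain Python. It is an illustration and not the code used in this project: the function names are hypothetical, the image is a toy 2-D list of pixel values, and padding to a centred square is an assumption about how the padding was applied. The perspective warp is left out to keep the sketch short; in practice, a library such as torchvision provides `Pad`, `RandomHorizontalFlip`, and `RandomPerspective` transforms.

```python
import random


def pad_to_square(img, fill=0):
    """Pad a 2-D image (list of rows) with `fill` so that height == width.

    Assumption: the original content is centred in the padded square.
    """
    h, w = len(img), len(img[0])
    size = max(h, w)
    top = (size - h) // 2
    left = (size - w) // 2
    out = [[fill] * size for _ in range(size)]
    for r in range(h):
        for c in range(w):
            out[top + r][left + c] = img[r][c]
    return out


def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def augment(img, flip_prob=0.5, rng=random):
    """Pad first, then flip with probability `flip_prob`."""
    out = pad_to_square(img)
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    return out
```

Padding before flipping keeps the body centred in both the original and mirrored versions, so the two variants differ only in left/right orientation.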
Second, after analysing the images, I observed that if we remove the background and focus only on the human body, the views from different camera angles differ by 90 degrees; that is, the body appears to rotate about the y-axis if we take the image to lie in the X-Y plane. So, if we can find the coordinates of the different body parts, a neural network can learn from those coordinates, and the features we need are X, Y, Z, X², and Z². I used PoseNet to find the X-Y coordinates of 17 key points of the body. The remaining problem is the Z coordinate, which PoseNet does not provide. To obtain it, I calculated the bone lengths of the human body and used them to compute the z-coordinate.
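The geometry behind this idea can be sketched as follows. This is a hedged illustration, not the exact method from the report: it assumes an orthographic projection, a bone length already known in the same units as the image coordinates, and it recovers only the magnitude of the depth difference between two joints (the sign stays ambiguous). The function names `depth_from_bone` and `rotate_about_y` are hypothetical.

```python
import math


def depth_from_bone(p, q, bone_length):
    """Return |dz| between joints p and q given their 2-D positions.

    Assumes orthographic projection, so that
    bone_length**2 == dx**2 + dy**2 + dz**2.
    The value is clamped at 0 when the projected length exceeds the
    assumed bone length (for example, because of keypoint noise).
    """
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    planar_sq = dx * dx + dy * dy
    return math.sqrt(max(bone_length ** 2 - planar_sq, 0.0))


def rotate_about_y(point, angle_deg):
    """Rotate a 3-D point (x, y, z) about the y-axis by angle_deg degrees."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (
        x * math.cos(a) + z * math.sin(a),
        y,
        -x * math.sin(a) + z * math.cos(a),
    )
```

A rotation about the y-axis leaves Y unchanged and preserves X² + Z², while a 90 degree rotation swaps the roles of X and Z (up to sign). That is one way to see why squared coordinates can be useful features when the camera angle changes.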
The detailed analysis can be found in Report.pdf under the Week 2 section.