3D pose estimation from a single image captured by a monocular RGB camera. The approach runs in real time and offers:
- Robustness to a wide variety of poses in the wild
- Multi-person support
- Video processing at up to 15 FPS
- Illumination invariance
The proposed solution obtains a temporally consistent, full 3D skeletal human pose from a single RGB camera. The system has two main components.
- First, a Convolutional Neural Network (CNN) regresses 2D and 3D joint locations from the monocular input image. It is trained on annotated 3D human pose datasets, with additional annotated 2D human pose datasets used to improve performance in the wild.
- Second, the regressed joint positions are combined with kinematic skeleton fitting to produce a temporally stable, camera-relative, full 3D skeletal pose. The core of the method is a CNN that predicts 2D joint positions and root-relative (pelvis-relative) 3D joint positions in real time.
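The relationship between root-relative and camera-relative joint positions can be illustrated with a minimal sketch. This is not the project's code: the function names, the `root_index` parameter, and the choice of millimetres as the unit are assumptions made for illustration only.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def to_root_relative(joints_cam: List[Point3D], root_index: int = 0) -> List[Point3D]:
    """Express every joint relative to the root (pelvis) joint.

    `root_index` is a hypothetical convention: the position of the pelvis
    in the joint list. The real model may order its joints differently.
    """
    rx, ry, rz = joints_cam[root_index]
    return [(x - rx, y - ry, z - rz) for x, y, z in joints_cam]


def to_camera_relative(joints_rel: List[Point3D], root_cam: Point3D) -> List[Point3D]:
    """Shift root-relative joints by the root's position in camera space."""
    rx, ry, rz = root_cam
    return [(x + rx, y + ry, z + rz) for x, y, z in joints_rel]


# Two illustrative joints (pelvis and one other joint), in millimetres.
example_cam = [(0.0, 0.0, 3000.0), (100.0, -200.0, 3050.0)]
example_rel = to_root_relative(example_cam)
example_back = to_camera_relative(example_rel, example_cam[0])
```

Predicting root-relative positions removes the global translation from the regression target; the camera-relative pose is recovered afterwards once the root position is known.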
Apart from this, the trained model is able to:
- Crop the person from a single RGB image in real time
- Track the pose smoothly over time
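One simple way to picture cropping and smooth tracking together is a bounding box derived from the 2D joints and smoothed across frames. The sketch below uses an exponential moving average as a stand-in; it is not the tracking method used by this project, and the function names, the `margin` and `alpha` parameters, and the nested-list image format are all assumptions.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(joints_2d: List[Tuple[float, float]], margin: float = 0.2) -> Box:
    """Box around all 2D joints, enlarged by `margin` times its size on each side."""
    xs = [x for x, _ in joints_2d]
    ys = [y for _, y in joints_2d]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return (min(xs) - margin * width, min(ys) - margin * height,
            max(xs) + margin * width, max(ys) + margin * height)


def smooth_box(previous: Optional[Box], current: Box, alpha: float = 0.5) -> Box:
    """Exponential moving average of the box; higher `alpha` follows the new frame faster."""
    if previous is None:
        return current
    x0, y0, x1, y1 = (alpha * c + (1.0 - alpha) * p for p, c in zip(previous, current))
    return (x0, y0, x1, y1)


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Crop a row-major image (list of rows) to the box, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x0, y0 = max(0, int(box[0])), max(0, int(box[1]))
    x1, y1 = min(width, int(box[2]) + 1), min(height, int(box[3]) + 1)
    return [row[x0:x1] for row in image[y0:y1]]
```

In a video loop, the box from the previous frame would be passed as `previous`, so that the crop does not jitter when individual joint predictions are noisy.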
Add the model weights to Projects/models, update the image/video path in 3d_Pose_Estimation.py, and run:
python 3d_Pose_Estimation.py
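Before running the script, it can help to confirm that the expected files are in place. The following is a hedged sketch, not part of the project: `missing_paths` is a hypothetical helper, and it assumes the command is run from the directory that contains both `Projects/models` and `3d_Pose_Estimation.py`, which may not match the actual repository layout.

```python
from pathlib import Path
from typing import List


def missing_paths(root: Path, relative_paths: List[str]) -> List[str]:
    """Return the entries of `relative_paths` that do not exist under `root`."""
    return [p for p in relative_paths if not (root / p).exists()]


if __name__ == "__main__":
    # Paths taken from the instructions above; their location relative to
    # the working directory is an assumption.
    expected = ["Projects/models", "3d_Pose_Estimation.py"]
    missing = missing_paths(Path("."), expected)
    if missing:
        print("Missing before run:", ", ".join(missing))
    else:
        print("All expected paths found.")
```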
- OS : Ubuntu 18.04 LTS
- Processor : Intel® Core™ i7 3610QM @ 2.3 GHz
- RAM : 8 GB
- Graphics : NVIDIA® GeForce® GT 630M with 2 GB
For further details about the proposed approach and more poses, refer to the project report.