# Pose Estimation
Using MediaPipe


## Importing the necessary packages

In [57]:
import cv2
import mediapipe as mp


In [58]:
mpPose = mp.solutions.pose
pose = mpPose.Pose()
mpDraw = mp.solutions.drawing_utils

## Task - 1

Processing a single image for pose estimation

In [59]:

img = cv2.imread('Gallery/2.png')  # load the image from disk
if img is None:
    raise FileNotFoundError("Could not read 'Gallery/2.png'")
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # cv2 reads images as BGR, so convert to RGB
results = pose.process(imgRGB)  # get the pose results from the image
if results.pose_landmarks:
    # draw the landmarks and the connections between them on the original image
    mpDraw.draw_landmarks(img, results.pose_landmarks, mpPose.POSE_CONNECTIONS)
    h, w, c = img.shape
    for id, lm in enumerate(results.pose_landmarks.landmark):
        # landmark coordinates are normalized, so scale them to pixel coordinates
        cx, cy = int(lm.x * w), int(lm.y * h)
        cv2.circle(img, (cx, cy), 5, (255, 0, 0), cv2.FILLED)

cv2.imwrite('out.png', img)
cv2.imshow("Image", img)
cv2.waitKey(0)  # wait for a key press before closing the window
cv2.destroyAllWindows()
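
MediaPipe returns landmark coordinates normalized to the image size, which is why the cell above multiplies them by the image width and height. As a minimal sketch, that step can be isolated in a small helper. The name `to_pixel` is hypothetical and is not part of OpenCV or MediaPipe.

```python
def to_pixel(x_norm, y_norm, width, height):
    """Scale normalized landmark coordinates to integer pixel coordinates."""
    return int(x_norm * width), int(y_norm * height)


# Example: a landmark in the horizontal centre, a quarter of the way down,
# on a 640x480 image.
print(to_pixel(0.5, 0.25, 640, 480))  # (320, 120)
```

With a real result, the call would look like `to_pixel(lm.x, lm.y, w, h)` inside the landmark loop.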

# Explanation

The code in the section above shows how to detect and estimate the pose in a single image. The image 'out.png' is the output of the pose estimation model from the MediaPipe library.
The Python packages used are:

1. OpenCV

What is OpenCV?
Computer vision is the field concerned with how images and videos are stored and manipulated, and with retrieving data from them. It is a part of artificial intelligence and plays a major role in autonomous cars, object detection, robotics, object tracking, etc.

 

OpenCV is an open-source library mainly used for computer vision, image processing, and machine learning. It is well suited to real-time data: with OpenCV we can process images and videos so that an algorithm can identify objects such as cars, traffic signals, and number plates, as well as faces or even human handwriting. Combined with other data analysis libraries, OpenCV can process images and videos in whatever way is required.

More information about OpenCV is available here (https://opencv.org/)

The library we use alongside OpenCV-Python is MediaPipe.




2. MediaPipe

The library used is [mediapipe](https://learnopencv.com/introduction-to-mediapipe/). It is developed by Google and is used for computer vision applications.
It serves multiple purposes and can be used for different tasks.

What is MediaPipe?
MediaPipe is a framework mainly used for building pipelines over multimodal data such as audio, video, or any time series data. With the MediaPipe framework, an ML pipeline can be built that combines inference models (for example, TensorFlow or TFLite) with media processing functions.

 
Applications of Mediapipe:
 -> Cutting edge ML models using Mediapipe
 -> Face Detection
 -> Multi-hand Tracking
 -> Hair Segmentation
 -> Object Detection and Tracking
 -> Objectron: 3D Object Detection and Tracking
 -> AutoFlip: Automatic video cropping pipeline
 -> Pose Estimation


## Task - 2

Here I have used the same technologies and packages as in the previous section.

The small additions are the following:
1. The while loop: it steps through the images (frames) one by one.
2. cv2.VideoCapture(): it reads the video file; cv2.VideoCapture(0) reads the video from the webcam instead.
3. cv2.VideoWriter_fourcc(): it specifies the codec that cv2.VideoWriter uses to write the output video file, so that I am able to share it with you.
4. landmarks: I was unable to find a built-in method to get only specific landmarks, and hence I used the following code:

```
required_landmarks = [0,12,11,13,14, 25,26,27,28]
landmarks = [i for i in range(len(results.pose_landmarks.landmark)) if i in required_landmarks]
```
This gives the indices of the required landmarks mentioned in the Word file, so that only the landmarks at those indices are processed.
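
As a sketch of an alternative, the same selection can be written as a lookup from index to a readable name, which makes the printed output easier to follow. The names below follow MediaPipe's documented pose landmark numbering, but the `REQUIRED_LANDMARKS` table, the `select_landmarks` helper, and the `Landmark` stand-in are hypothetical and exist only for this illustration.

```python
from collections import namedtuple

# Stand-in for a MediaPipe landmark so the sketch runs without a camera.
Landmark = namedtuple("Landmark", ["x", "y", "z", "visibility"])

# Hypothetical lookup table: landmark index -> readable name.
REQUIRED_LANDMARKS = {
    0: "nose",
    11: "left_shoulder",
    12: "right_shoulder",
    13: "left_elbow",
    14: "right_elbow",
    25: "left_knee",
    26: "right_knee",
    27: "left_ankle",
    28: "right_ankle",
}


def select_landmarks(all_landmarks, required=REQUIRED_LANDMARKS):
    """Return {name: landmark} for the required indices only."""
    return {name: all_landmarks[idx] for idx, name in required.items()}


# Example with 33 dummy landmarks (the pose model returns 33 points).
dummy = [Landmark(i / 100, i / 100, 0.0, 1.0) for i in range(33)]
selected = select_landmarks(dummy)
print(list(selected))
```

With a real result, the call would be `select_landmarks(results.pose_landmarks.landmark)`, since that sequence supports indexing.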

Note - You can uncomment the VideoCapture('Gallery/a.mp4') line and comment out the VideoCapture(0) line to see the same task performed on the video file.

In [60]:
mpPose = mp.solutions.pose
pose = mpPose.Pose()
mpDraw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
# cap = cv2.VideoCapture('Gallery/a.mp4')
# take the output size from the capture; a size mismatch can leave the output file empty
frame_size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, frame_size)
required_landmarks = [0,12,11,13,14, 25,26,27,28]
while True:
    success, img = cap.read()
    if not success:  # camera unavailable or end of the video file
        break
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    results = pose.process(imgRGB)

    if results.pose_landmarks:
        # indices of the landmarks we want to keep
        landmarks = [i for i in range(len(results.pose_landmarks.landmark)) if i in required_landmarks]
        # mpDraw.draw_landmarks(img,results.pose_landmarks)
        h, w, c = img.shape
        for id, lm in enumerate(results.pose_landmarks.landmark):
            if id in required_landmarks:
                print(id, lm)
                cx, cy = int(lm.x*w), int(lm.y*h)
                cv2.circle(img, (cx, cy), 5, (255,0,0), cv2.FILLED)

    # show and save each frame once, after all circles are drawn
    cv2.imshow("Image", img)
    out.write(img)
    if cv2.waitKey(1) == ord('q'):
        break

# release the camera and the writer, then close the window
cap.release()
out.release()
cv2.destroyAllWindows()

0 x: 0.45083341002464294
y: 0.17173419892787933
z: -0.5295770764350891
visibility: 0.9999860525131226

11 x: 0.5220150947570801
y: 0.28873056173324585
z: -0.24774160981178284
visibility: 0.9999487400054932

12 x: 0.38049861788749695
y: 0.28750714659690857
z: -0.25611838698387146
visibility: 0.9999428987503052

13 x: 0.5214977860450745
y: 0.43411362171173096
z: -0.1654447764158249
visibility: 0.9690458178520203

14 x: 0.3696163296699524
y: 0.44776955246925354
z: -0.19700360298156738
visibility: 0.9764620661735535

25 x: 0.49167734384536743
y: 0.7463185787200928
z: 0.025084365159273148
visibility: 0.9942060112953186

26 x: 0.42743152379989624
y: 0.7472224831581116
z: 0.12288936227560043
visibility: 0.9890666007995605

27 x: 0.5165785551071167
y: 0.927878737449646
z: 0.1898217797279358
visibility: 0.9946910738945007

28 x: 0.4345349669456482
y: 0.8997609615325928
z: 0.3867786228656769
visibility: 0.9886799454689026

0 x: 0.44780057668685913
y: 0.17280635237693787
z: -0.5373184680938721
vi
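
Each printed landmark carries a `visibility` score alongside its normalized `x`, `y`, and `z` values. One possible use of that score, sketched below, is to skip landmarks the model is unsure about before drawing them. The helper `filter_visible`, the `Landmark` stand-in, and the 0.5 threshold are assumptions made for illustration and are not taken from the notebook or from MediaPipe.

```python
from collections import namedtuple

# Stand-in for a MediaPipe landmark so the sketch runs without a camera.
Landmark = namedtuple("Landmark", ["x", "y", "z", "visibility"])


def filter_visible(landmarks, min_visibility=0.5):
    """Return (index, landmark) pairs whose visibility meets the threshold."""
    return [
        (idx, lm)
        for idx, lm in enumerate(landmarks)
        if lm.visibility >= min_visibility
    ]


sample = [
    Landmark(0.45, 0.17, -0.53, 0.99),  # confidently detected
    Landmark(0.52, 0.29, -0.25, 0.30),  # low visibility, skipped
    Landmark(0.38, 0.29, -0.26, 0.97),  # confidently detected
]
kept = filter_visible(sample)
print([idx for idx, _ in kept])  # [0, 2]
```

The threshold is a tuning choice: a higher value draws fewer but more reliable points.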

## Task - 3