# Create Keypoints Dataset

In this notebook we are going to build our own pose classification dataset in CSV format, which we can then use to train a pose classifier.

The csv file will be created by extracting the keypoints (x, y coordinates of every body landmark) from images that contain human body poses. 

The primary material we need to work with is therefore images. There are several ways to collect them:

- Use the `00-collect-images-from-webcam` python script

- Download public image datasets from Kaggle, e.g. the [yoga pose image dataset](https://www.kaggle.com/datasets/ujjwalchowdhury/yoga-pose-classification), and organise the images in the respective folders

- Record videos of yourself or others (as long as you have their consent) performing a specific pose in each clip, and then use `ffmpeg` to extract the frames for each class. This process will produce folders of frames, similar to what you get from the first option. Some quick instructions for `ffmpeg`:

Download it from [here](https://ffmpeg.org/download.html). Then open a terminal, move into the directory (`cd PATH`) that contains the video you want to turn into frames, create a folder to save the frames in, and run the command:

`ffmpeg -i file_name.mov -r 1 -s WxH -f image2 folder_name/%03d.jpeg`

This will extract one frame per second from the video and write the frames to files named 001.jpeg, 002.jpeg, etc. Images will be rescaled to the given WxH values (you can skip `-s WxH` if you do not wish to resize your images). Look [here](https://ffmpeg.org/ffmpeg.html) for more information on `ffmpeg`.
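
If you have one video per class, the command above can also be assembled from Python. The sketch below is only an illustration: the helper names `build_ffmpeg_command` and `extract_frames` and the example file names are hypothetical, and actually running `extract_frames` assumes `ffmpeg` is installed and available on your PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, out_dir, fps=1, size=None):
    """Build the ffmpeg argument list for extracting frames from one video.

    Hypothetical helper: mirrors the command shown above and leaves out
    the -s option when no size is given.
    """
    cmd = ["ffmpeg", "-i", str(video_path), "-r", str(fps)]
    if size is not None:
        cmd += ["-s", size]  # e.g. "640x480"
    cmd += ["-f", "image2", str(Path(out_dir) / "%03d.jpeg")]
    return cmd


def extract_frames(video_path, out_dir, fps=1, size=None):
    """Create the output folder, then run ffmpeg (which must be installed)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, out_dir, fps, size), check=True)


# Only print the command here; file and folder names are placeholders
print(build_ffmpeg_command("file_name.mov", "folder_name"))
```

Calling `extract_frames` once per video, each with its own output folder, would give one folder of frames per class.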

## Call and Test YOLOv8-Pose to Extract Keypoints from an Image

Code adapted from this [repo](https://github.com/Alimustoofaa/YoloV8-Pose-Keypoint-Classification/tree/master).

First let's do some imports:

In [None]:
import cv2
import os
import glob
import csv
import pandas as pd
from PIL import Image
from pydantic import BaseModel
from ultralytics import YOLO

If the previous cell returns an error, uncomment the next line and run it to install pydantic

In [None]:
# !pip install pydantic

##### Download and load the pre-trained YOLO model for pose estimation

In [None]:
model = YOLO(model='yolov8m-pose.pt')

##### List the names of the classes/labels in our dataset folder

In [None]:
# You might want to rename your folders with more meaningful names if they are currently named with numbers
# You can do that manually and then re-run this cell to see all your names listed
!ls ../data/my-data/my-pose-classification-dataset

##### Select a sample image from our dataset and apply the keypoint predictions from YOLO on it

In [None]:
image = cv2.imread('../data/my-data/my-pose-classification-dataset/folder_name/sample_file_name.jpg')
result = model.predict(image, save=False)[0]
result.keypoints[0]

##### Visualise the predictions of YOLO on our sample image

In [None]:
Image.fromarray(cv2.cvtColor(result.plot(), cv2.COLOR_BGR2RGB))

## Generate Dataset of Keypoints

##### Keypoint extraction

YOLOv8 pose estimation returns 17 keypoints:
* NOSE:           int = 0
* LEFT_EYE:       int = 1
* RIGHT_EYE:      int = 2
* LEFT_EAR:       int = 3
* RIGHT_EAR:      int = 4
* LEFT_SHOULDER:  int = 5
* RIGHT_SHOULDER: int = 6
* LEFT_ELBOW:     int = 7
* RIGHT_ELBOW:    int = 8
* LEFT_WRIST:     int = 9
* RIGHT_WRIST:    int = 10
* LEFT_HIP:       int = 11
* RIGHT_HIP:      int = 12
* LEFT_KNEE:      int = 13
* RIGHT_KNEE:     int = 14
* LEFT_ANKLE:     int = 15
* RIGHT_ANKLE:    int = 16

Since YOLO does not return the keypoints with their respective labels, we will create the following class for easy access to the data.

In [None]:
class GetKeypoint(BaseModel):
    NOSE:           int = 0
    LEFT_EYE:       int = 1
    RIGHT_EYE:      int = 2
    LEFT_EAR:       int = 3
    RIGHT_EAR:      int = 4
    LEFT_SHOULDER:  int = 5
    RIGHT_SHOULDER: int = 6
    LEFT_ELBOW:     int = 7
    RIGHT_ELBOW:    int = 8
    LEFT_WRIST:     int = 9
    RIGHT_WRIST:    int = 10
    LEFT_HIP:       int = 11
    RIGHT_HIP:      int = 12
    LEFT_KNEE:      int = 13
    RIGHT_KNEE:     int = 14
    LEFT_ANKLE:     int = 15
    RIGHT_ANKLE:    int = 16

get_keypoint = GetKeypoint()
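
To illustrate what the named indices are for, here is a minimal sketch with made-up coordinates. It uses the standard-library `enum.IntEnum` rather than the pydantic class above, purely so that it runs without extra dependencies; `KeypointIndex` and the sample values are illustrative assumptions.

```python
from enum import IntEnum


class KeypointIndex(IntEnum):
    """Illustrative stand-in for a few of the 17 indices listed above."""
    NOSE = 0
    LEFT_SHOULDER = 5
    RIGHT_SHOULDER = 6


# Made-up normalised (x, y) pairs for 17 keypoints: index i -> (i/100, i/50)
keypoints = [(i / 100, i / 50) for i in range(17)]

# Named access instead of remembering that the left shoulder is index 5
x, y = keypoints[KeypointIndex.LEFT_SHOULDER]
print(x, y)
```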

##### Create a function for extracting keypoints while using the class above

In [None]:
def extract_keypoint(keypoint):
    # nose
    nose_x, nose_y = keypoint[get_keypoint.NOSE]
    # eye
    left_eye_x, left_eye_y = keypoint[get_keypoint.LEFT_EYE]
    right_eye_x, right_eye_y = keypoint[get_keypoint.RIGHT_EYE]
    # ear
    left_ear_x, left_ear_y = keypoint[get_keypoint.LEFT_EAR]
    right_ear_x, right_ear_y = keypoint[get_keypoint.RIGHT_EAR]
    # shoulder
    left_shoulder_x, left_shoulder_y = keypoint[get_keypoint.LEFT_SHOULDER]
    right_shoulder_x, right_shoulder_y = keypoint[get_keypoint.RIGHT_SHOULDER]
    # elbow
    left_elbow_x, left_elbow_y = keypoint[get_keypoint.LEFT_ELBOW]
    right_elbow_x, right_elbow_y = keypoint[get_keypoint.RIGHT_ELBOW]
    # wrist
    left_wrist_x, left_wrist_y = keypoint[get_keypoint.LEFT_WRIST]
    right_wrist_x, right_wrist_y = keypoint[get_keypoint.RIGHT_WRIST]
    # hip
    left_hip_x, left_hip_y = keypoint[get_keypoint.LEFT_HIP]
    right_hip_x, right_hip_y = keypoint[get_keypoint.RIGHT_HIP]
    # knee
    left_knee_x, left_knee_y = keypoint[get_keypoint.LEFT_KNEE]
    right_knee_x, right_knee_y = keypoint[get_keypoint.RIGHT_KNEE]
    # ankle
    left_ankle_x, left_ankle_y = keypoint[get_keypoint.LEFT_ANKLE]
    right_ankle_x, right_ankle_y = keypoint[get_keypoint.RIGHT_ANKLE]
    
    return [
        nose_x, nose_y,
        left_eye_x, left_eye_y,
        right_eye_x, right_eye_y,
        left_ear_x, left_ear_y,
        right_ear_x, right_ear_y,
        left_shoulder_x, left_shoulder_y,
        right_shoulder_x, right_shoulder_y,
        left_elbow_x, left_elbow_y,
        right_elbow_x, right_elbow_y,
        left_wrist_x, left_wrist_y,
        right_wrist_x, right_wrist_y,
        left_hip_x, left_hip_y,
        right_hip_x, right_hip_y,
        left_knee_x, left_knee_y,
        right_knee_x, right_knee_y,        
        left_ankle_x, left_ankle_y,
        right_ankle_x, right_ankle_y
    ]
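
Because the returned list simply follows the index order 0 to 16, the same flattening can be written as a loop. This is an alternative sketch, not the function used in the rest of the notebook; `flatten_keypoints` is a hypothetical name and the input is plain Python data rather than YOLO output.

```python
def flatten_keypoints(keypoint):
    """Flatten 17 (x, y) pairs into [x0, y0, x1, y1, ..., x16, y16]."""
    if len(keypoint) != 17:
        raise ValueError(f"expected 17 keypoints, got {len(keypoint)}")
    flat = []
    for x, y in keypoint:
        flat.extend([x, y])
    return flat


# Made-up input: keypoint i sits at (i, i + 0.5)
sample = [(float(i), i + 0.5) for i in range(17)]
flat = flatten_keypoints(sample)
print(len(flat), flat[:4])
```

The explicit version above is longer, but it makes the column order easy to read next to the CSV header.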

##### Create our list of labels/classes based on the folders' names:

In [None]:
dataset_root = '../data/my-data/my-pose-classification-dataset'
# skip hidden entries such as .DS_Store
pose_list = sorted((f for f in os.listdir(dataset_root) if not f.startswith(".")), key=str.lower)
pose_list
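
The listing above keeps every non-hidden entry, so a stray file in the dataset folder would be treated as a class. A slightly stricter variant, sketched here with `pathlib`, keeps only directories. `list_pose_folders` is a hypothetical helper, and the demo builds a throwaway folder tree with made-up class names so it can run anywhere.

```python
import tempfile
from pathlib import Path


def list_pose_folders(root):
    """Return sub-folder names of root, skipping hidden entries and files."""
    names = [
        p.name for p in Path(root).iterdir()
        if p.is_dir() and not p.name.startswith(".")
    ]
    return sorted(names, key=str.lower)


# Demo on a temporary directory with made-up class names
with tempfile.TemporaryDirectory() as tmp:
    for name in ["warrior", "Tree", ".cache"]:
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a class")
    demo_classes = list_pose_folders(tmp)

print(demo_classes)
```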

##### Run through all the images of our dataset and extract the keypoints

In [None]:
dataset_csv = []
for pose in pose_list:
    # collect .jpg and .jpeg files (frames extracted with ffmpeg above are saved as .jpeg)
    image_path_list = sorted(
        p for p in glob.glob(f'{dataset_root}/{pose}/*')
        if p.lower().endswith(('.jpg', '.jpeg'))
    )
    for image_path in image_path_list:
        # get image_name
        image_name = os.path.basename(image_path)
        # read the image as a numpy array; skip files OpenCV cannot read
        image = cv2.imread(image_path)
        if image is None:
            continue

        # detect pose using yolov8-pose
        results = model.predict(image, save=False)[0]
        # get the normalised values of x and y
        # (.cpu() is needed before .numpy() when the model runs on a GPU)
        results_keypoint = results.keypoints.xyn.cpu().numpy()

        # one entry per detected person
        for result_keypoint in results_keypoint:
            if len(result_keypoint) == 17:
                keypoint_list = extract_keypoint(result_keypoint)
                # insert image_name and label at index 0 and 1, for the first 2 columns of the csv file
                keypoint_list.insert(0, image_name)
                keypoint_list.insert(1, pose)
                dataset_csv.append(keypoint_list)
        # break  (uncomment to stop after the first image while testing)
    # break  (uncomment to stop after the first class while testing)
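
The `xyn` attribute holds coordinates normalised by the image size, so the values do not depend on the image resolution. The arithmetic is sketched below with made-up pixel values; `normalise_point` is a hypothetical helper for illustration, not part of ultralytics.

```python
def normalise_point(x_px, y_px, width, height):
    """Scale pixel coordinates to the 0-1 range by image width and height."""
    return x_px / width, y_px / height


# Made-up example: a point at pixel (320, 120) in a 640x480 image
nx, ny = normalise_point(320, 120, width=640, height=480)
print(nx, ny)
```

Working with normalised values means images of different sizes produce comparable rows in the dataset.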


##### Save the dataset_csv list in a csv file

In [None]:
# write csv
header = [
    'image_name',
    'label',
    # nose
    'nose_x',
    'nose_y',
    # left eye
    'left_eye_x',
    'left_eye_y',
    # right eye
    'right_eye_x',
    'right_eye_y',
    # left ear
    'left_ear_x',
    'left_ear_y',
    # right ear
    'right_ear_x',
    'right_ear_y',
    # left shoulder
    'left_shoulder_x',
    'left_shoulder_y',
    # right shoulder
    'right_shoulder_x',
    'right_shoulder_y',
    # left elbow
    'left_elbow_x',
    'left_elbow_y',
    # right elbow
    'right_elbow_x',
    'right_elbow_y',
    # left wrist
    'left_wrist_x',
    'left_wrist_y',
    # right wrist
    'right_wrist_x',
    'right_wrist_y',
    # left hip
    'left_hip_x',
    'left_hip_y',
    # right hip
    'right_hip_x',
    'right_hip_y',
    # left knee
    'left_knee_x',
    'left_knee_y',
    # right knee
    'right_knee_x',
    'right_knee_y',
    # left ankle
    'left_ankle_x',
    'left_ankle_y',
    # right ankle
    'right_ankle_x',
    'right_ankle_y'
]

with open('../data/my-data/poses_keypoints.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write multiple rows
    writer.writerows(dataset_csv)
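
The header follows a regular pattern, so it could also be generated from the keypoint names. The following is a minimal sketch that writes to an in-memory buffer instead of a file; `KEYPOINT_NAMES`, `build_header`, and the sample row are illustrative assumptions.

```python
import csv
import io

KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def build_header():
    """Return ['image_name', 'label', 'nose_x', 'nose_y', ...] (36 columns)."""
    columns = ["image_name", "label"]
    for name in KEYPOINT_NAMES:
        columns += [f"{name}_x", f"{name}_y"]
    return columns


generated_header = build_header()

# One made-up row: file name, label, then 34 placeholder coordinates
row = ["001.jpeg", "tree"] + [0.5] * 34

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(generated_header)
writer.writerow(row)
print(len(generated_header), len(row))
```

Checking that every row has as many values as the header is a cheap way to catch a mismatch before training.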


Our `data/my-data` folder should now contain the dataset of keypoints in CSV format, with one (optional) `image_name` column, one `label` column, and 34 columns holding the x and y values of the 17 keypoints.

##### Get the first 5 rows of the csv file we just saved (without the image_name)

In [None]:
df = pd.read_csv('../data/my-data/poses_keypoints.csv')
df = df.drop('image_name', axis=1)
df.head()
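
Besides looking at the first rows, it can help to count how many rows each class ended up with, since the extraction loop only adds a row for each detection that has 17 keypoints. Below is a standard-library sketch using made-up CSV content; `count_labels` is a hypothetical helper, and in the notebook you would pass an open handle to `poses_keypoints.csv` instead of the sample buffer.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file):
    """Count rows per value of the 'label' column in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["label"] for row in reader)


# Made-up stand-in for the saved file (only two keypoint columns shown)
sample_csv = io.StringIO(
    "image_name,label,nose_x,nose_y\n"
    "001.jpeg,tree,0.51,0.20\n"
    "002.jpeg,tree,0.49,0.22\n"
    "001.jpeg,warrior,0.40,0.31\n"
)
counts = count_labels(sample_csv)
print(dict(counts))
```

With the DataFrame loaded above, `df['label'].value_counts()` gives the same tally.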

### Tasks:

**Task 1:** Run all the cells in this notebook to create your own dataset. The dataset should have at least 3 classes, with at least 200 instances in each class.

**Task 2:** Visualise the YOLO predictions for different samples from your dataset, to test how it works on different poses.

**Bonus Task:** 

If you created the initial image datasets based on the provided `00-collect-images-from-webcam` python script, you can also try recording small clips with your phone and then use `ffmpeg` on them to extract frames and organise them in the respective folders in your data. Then use this notebook again to create a different csv file for your classifier.
