# Offline evaluation of Pose Estimation models using PyCocoTools

This example shows how to evaluate pose estimation models using the `COCOeval` class from the `pycocotools` Python package.
Although we provide a ready-to-use metric class that computes average precision (AP) and average recall (AR), the 
evaluation protocol it follows during validation differs slightly from the one pycocotools uses for academic evaluation.

In particular:

## SG

* In SG, during training and validation, all images are resized to a fixed size (640x640 by default) using an aspect-ratio-preserving resize of the longest side followed by padding.
* Our metric evaluates AP/AR at the resolution of the resized and padded images, **not at the resolution of the original image**.
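
To make the resize step concrete, here is a minimal sketch of how a longest-side resize followed by padding changes keypoint coordinates. The helper names `letterbox_params` and `to_original_coords` are hypothetical, and padding only on the bottom and right is an assumption made for illustration; the actual SuperGradients transforms may pad differently.

```python
def letterbox_params(height, width, target=640):
    """Return (scale, resized_h, resized_w, pad_bottom, pad_right) for a square target."""
    scale = target / max(height, width)
    resized_h = round(height * scale)
    resized_w = round(width * scale)
    return scale, resized_h, resized_w, target - resized_h, target - resized_w


def to_original_coords(keypoints_xy, scale):
    """Map (x, y) pairs from the resized image back to the original resolution.

    Assumes padding was added only on the bottom/right, so no offset is needed.
    """
    return [(x / scale, y / scale) for x, y in keypoints_xy]


# A 1000x500 (height x width) image is scaled by 0.64 to 640x320, then padded on the right.
scale, resized_h, resized_w, pad_bottom, pad_right = letterbox_params(1000, 500)
print(scale, resized_h, resized_w, pad_bottom, pad_right)
print(to_original_coords([(160.0, 320.0)], scale))
```

Any keypoint predicted in the 640x640 frame therefore lives in a different coordinate system than the ground-truth annotations, which are stored at the original image resolution.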


## COCOeval

* COCOeval does not resize images: pose predictions are evaluated at the resolution of the original image.

Because of this discrepancy, the metrics reported by the `PoseEstimationMetrics` class are usually slightly lower (by about 1 AP) than those 
you would get for the same model with COCOeval. 

For this reason, this example shows how to compute metrics with COCOeval for the pose estimation models available in SuperGradients.

## Instantiate the model for evaluation

First, let's instantiate the model we are going to evaluate. 
You can either use pretrained weights or provide the path to your own trained checkpoint.

```python
from super_gradients.common.object_names import Models
from super_gradients.training import models

# This is how you can load your own checkpoint instead of the pretrained one
model = models.get(
    Models.YOLO_NAS_POSE_L,
    num_classes=17,
    checkpoint_path="G:/super-gradients/checkpoints/coco2017_yolo_nas_pose_l_ckpt_best.pth",
)
```
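This example is meant to use pretrained weights for simplicity. As the TODO in the next cell notes, it currently loads a local checkpoint until the weights are available on S3.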

In [2]:
from super_gradients.common.object_names import Models
from super_gradients.training import models

model = models.get(
    Models.YOLO_NAS_POSE_L,
    num_classes=17,
    checkpoint_path="G:/super-gradients/checkpoints/coco2017_yolo_nas_pose_l_ckpt_best.pth",
    # pretrained_weights="coco_pose" TODO: Replace when weights are on S3
).cuda()



## Prepare COCO validation data

Next, we obtain the list of images in the COCO2017 validation set and load their annotations.
Either set the `COCO_ROOT_DIR` environment variable to the location of the COCO2017 data on your machine, or edit the default path directly.

In [15]:
import os
COCO_DATA_DIR = os.environ.get("COCO_ROOT_DIR", "g:/coco2017")
os.listdir(COCO_DATA_DIR)

['annotations', 'images']
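
If the listing looks different on your machine, a quick check of the two paths used in the following cells can save a confusing error later. This is only a sketch: `missing_coco_paths` is a hypothetical helper, and the relative paths are the ones this example relies on.

```python
import os


def missing_coco_paths(coco_root):
    """Return the paths this example needs that are missing under coco_root."""
    required = [
        os.path.join("images", "val2017"),
        os.path.join("annotations", "person_keypoints_val2017.json"),
    ]
    return [p for p in required if not os.path.exists(os.path.join(coco_root, p))]


# Same default root as in the cell above.
missing = missing_coco_paths(os.environ.get("COCO_ROOT_DIR", "g:/coco2017"))
if missing:
    print("Missing under COCO root:", missing)
```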

Once the data is in place, we can load it:

In [1]:
from pycocotools.cocoeval import COCOeval

In [4]:
from pycocotools.coco import COCO

images_path = os.path.join(COCO_DATA_DIR, "images/val2017")
image_files = [os.path.join(images_path, x) for x in os.listdir(images_path)]

gt_annotations_path = os.path.join(COCO_DATA_DIR, "annotations/person_keypoints_val2017.json")
gt = COCO(gt_annotations_path)

loading annotations into memory...


In [9]:
predictions = model.predict(
    image_files, conf=0.01, iou=0.7, pre_nms_max_predictions=300, post_nms_max_predictions=20, fuse_model=False
)

Predicting Images: 100%|██████████| 5000/5000 [01:41<00:00, 49.44it/s]
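
The conversion cell that follows writes the predictions in the COCO keypoint results format. As a rough sketch of what a single entry looks like, the hypothetical helper `to_coco_keypoint_result` below flattens one pose of 17 `(x, y, score)` rows into the 51-value `keypoints` list and derives the image id from the file name, as the conversion code does. The pose values and the file name are made up for illustration.

```python
import os


def to_coco_keypoint_result(image_file, pose, score, category_id=1):
    """Build one COCO keypoint result from a pose given as rows of (x, y, score)."""
    # The image id is the integer value of the file name without its extension.
    image_id = int(os.path.splitext(os.path.basename(image_file))[0])
    # Flatten to [x1, y1, s1, x2, y2, s2, ...]; works for nested lists and NumPy arrays.
    flat = [float(v) for row in pose for v in row]
    return {"image_id": image_id, "category_id": category_id, "keypoints": flat, "score": float(score)}


# Dummy pose with 17 joints (illustrative values only).
dummy_pose = [(10.0 + i, 20.0 + i, 0.9) for i in range(17)]
result = to_coco_keypoint_result("val2017/000000000139.jpg", dummy_pose, score=0.8)
print(result["image_id"], len(result["keypoints"]))
```

The real conversion below additionally attaches a bounding box computed from the keypoint extents.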


In [12]:
import copy
import json_tricks as json  # third-party package; unlike the standard json module, it can serialize NumPy values
import collections
import numpy as np
import tempfile

def predictions_to_coco(predictions, image_files):
    """Convert per-image predictions into a list of COCO keypoint results."""
    predicted_poses = []
    predicted_scores = []
    non_empty_image_ids = []
    for image_file, image_predictions in zip(image_files, predictions):
        # The COCO image id is the integer value of the file name without its extension
        non_empty_image_ids.append(int(os.path.splitext(os.path.basename(image_file))[0]))
        predicted_poses.append(image_predictions.prediction.poses)
        predicted_scores.append(image_predictions.prediction.scores)

    coco_pred = _coco_convert_predictions_to_dict(predicted_poses, predicted_scores, non_empty_image_ids)
    return coco_pred

def _coco_process_keypoints(keypoints):
    tmp = keypoints.copy()
    if keypoints[:, 2].max() > 0:
        num_keypoints = keypoints.shape[0]
        for i in range(num_keypoints):
            tmp[i][0:3] = [float(keypoints[i][0]), float(keypoints[i][1]), float(keypoints[i][2])]

    return tmp

def _coco_convert_predictions_to_dict(predicted_poses, predicted_scores, image_ids):
    kpts = collections.defaultdict(list)
    for poses, scores, image_id_int in zip(predicted_poses, predicted_scores, image_ids):

        for person_index, kpt in enumerate(poses):
            area = (np.max(kpt[:, 0]) - np.min(kpt[:, 0])) * (np.max(kpt[:, 1]) - np.min(kpt[:, 1]))
            kpt = _coco_process_keypoints(kpt)
            kpts[image_id_int].append({"keypoints": kpt[:, 0:3], "score": float(scores[person_index]), "image": image_id_int, "area": area})

    # No OKS-based NMS is applied here: keep all detections for every image.
    # Resulting layout: image x person x (keypoints)
    oks_nmsed_kpts = list(kpts.values())

    classes = ["__background__", "person"]
    _class_to_coco_ind = {cls: i for i, cls in enumerate(classes)}

    data_pack = [
        {"cat_id": _class_to_coco_ind[cls], "cls_ind": cls_ind, "cls": cls, "ann_type": "keypoints", "keypoints": oks_nmsed_kpts}
        for cls_ind, cls in enumerate(classes)
        if not cls == "__background__"
    ]

    results = _coco_keypoint_results_one_category_kernel(data_pack[0], num_joints=17)
    return results

def _coco_keypoint_results_one_category_kernel(data_pack, num_joints: int):
    cat_id = data_pack["cat_id"]
    keypoints = data_pack["keypoints"]
    cat_results = []

    for img_kpts in keypoints:
        if len(img_kpts) == 0:
            continue

        _key_points = np.array([img_kpts[k]["keypoints"] for k in range(len(img_kpts))])
        key_points = np.zeros((_key_points.shape[0], num_joints * 3), dtype=np.float32)

        for ipt in range(num_joints):
            key_points[:, ipt * 3 + 0] = _key_points[:, ipt, 0]
            key_points[:, ipt * 3 + 1] = _key_points[:, ipt, 1]
            # keypoints score.
            key_points[:, ipt * 3 + 2] = _key_points[:, ipt, 2]

        for k in range(len(img_kpts)):
            kpt = key_points[k].reshape((num_joints, 3))
            left_top = np.amin(kpt, axis=0)
            right_bottom = np.amax(kpt, axis=0)

            w = right_bottom[0] - left_top[0]
            h = right_bottom[1] - left_top[1]

            cat_results.append(
                {
                    "image_id": img_kpts[k]["image"],
                    "category_id": cat_id,
                    "keypoints": list(key_points[k]),
                    "score": img_kpts[k]["score"],
                    "bbox": list([left_top[0], left_top[1], w, h]),
                }
            )

    return cat_results

coco_pred = predictions_to_coco(predictions, image_files)

with tempfile.TemporaryDirectory() as td:
    res_file = os.path.join(td, "keypoints_coco2017_results.json")

    with open(res_file, "w") as f:
        json.dump(coco_pred, f)

    coco_dt = copy.deepcopy(gt)
    coco_dt = coco_dt.loadRes(res_file)

    coco_evaluator = COCOeval(gt, coco_dt, iouType="keypoints")
    coco_evaluator.evaluate()  # run per image evaluation
    coco_evaluator.accumulate()  # accumulate per image results
    coco_evaluator.summarize()  # display summary metrics of results

Loading and preparing results...
DONE (t=2.81s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *keypoints*
DONE (t=17.44s).
Accumulating eval
DONE (t=0.85s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.675
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.880
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.748
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.633
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.744
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.731
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.918
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.796
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.686
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.795

In [13]:
coco_evaluator.stats

array([0.67541086, 0.88030538, 0.74760717, 0.63302696, 0.74432253,
       0.73090365, 0.91829345, 0.7961272 , 0.68634253, 0.79483463])
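
The ten entries of `stats` follow the order of the summary printed above. A small sketch that attaches names to them; the short labels are chosen here for readability and are not part of pycocotools, and the values are copied from the output above.

```python
# Labels follow the order of the printed keypoint summary (the labels themselves are illustrative).
STAT_NAMES = [
    "AP", "AP@0.50", "AP@0.75", "AP (medium)", "AP (large)",
    "AR", "AR@0.50", "AR@0.75", "AR (medium)", "AR (large)",
]

# Values copied from the `coco_evaluator.stats` output above.
stats = [0.67541086, 0.88030538, 0.74760717, 0.63302696, 0.74432253,
         0.73090365, 0.91829345, 0.7961272, 0.68634253, 0.79483463]

named_stats = dict(zip(STAT_NAMES, stats))
for name, value in named_stats.items():
    print(f"{name:<12} {value:.3f}")
```

In the notebook itself you would pass `coco_evaluator.stats` instead of the hard-coded list.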