# <font style="color:blue">DensePose Inference using Detectron2</font>
The DensePose project in Detectron2 provides two tools for visualizing datasets and running inference on test images.
- **Apply Net**
    - A tool to print or visualize DensePose results on a set of images. It has two modes: `dump`, which saves DensePose model results to a pickle file, and `show`, which visualizes them on images.

- **Query Db**
    - A tool to print or visualize DensePose data from a dataset. It has two modes: `print`, which writes dataset entries to standard output, and `show`, which visualizes them on images.

We will use Apply Net in this notebook. Query Db visualizes dataset entries, which will be useful during training in the next notebook.
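
As a rough illustration of the two Apply Net modes, the sketch below builds the corresponding command lines as Python argument lists. It assumes the `apply_net.py` script in `projects/DensePose` and the positional argument order described in the DensePose tool documentation (mode, config, model, input, plus a comma-separated list of visualizations for `show`). The helper name `build_apply_net_cmd` and the image paths are hypothetical.

```python
import shlex


def build_apply_net_cmd(mode, config, model, input_spec, vis_specs=None, output=None):
    """Build an argv list for apply_net.py (hypothetical helper, not part of DensePose)."""
    if mode not in ("dump", "show"):
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["python", "apply_net.py", mode, config, model, input_spec]
    if mode == "show":
        # show mode takes the visualizations as one comma-separated argument
        if not vis_specs:
            raise ValueError("show mode needs at least one visualization")
        cmd.append(",".join(vis_specs))
    if output is not None:
        cmd += ["--output", output]
    return cmd


config = "configs/densepose_rcnn_R_50_FPN_s1x.yaml"
model = "model_final_162be9.pkl"

# Placeholder input and output paths, for illustration only
show_cmd = build_apply_net_cmd("show", config, model, "image.jpg",
                               vis_specs=["dp_segm", "bbox"], output="image_vis.png")
dump_cmd = build_apply_net_cmd("dump", config, model, "image.jpg", output="results.pkl")

print(shlex.join(show_cmd))
print(shlex.join(dump_cmd))
```

The resulting lists can be passed to `subprocess.run` from inside `detectron2/projects/DensePose` once the repository and the model weights are in place.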

## <font style="color:green">1. Setup Code</font>

The tools above are part of the DensePose project inside the Detectron2 repository, so we first install the dependencies and then clone and install Detectron2.

In [None]:
# install dependencies
!pip install -U torch torchvision cython
!pip install -U 'git+https://github.com/facebookresearch/fvcore.git' 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
import torch, torchvision
torch.__version__

In [None]:
!git clone https://github.com/facebookresearch/detectron2 detectron2
!pip install -e detectron2

fatal: destination path 'detectron2' already exists and is not an empty directory.
Obtaining file:///content/detectron2
  Preparing metadata (setup.py) ... done
Collecting pycocotools>=2.0.2 (from detectron2==0.6)
  Using cached pycocotools-2.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.1 kB)
Collecting fvcore<0.1.6,>=0.1.5 (from detectron2==0.6)
  Using cached fvcore-0.1.5.post20221221-py3-none-any.whl
Using cached pycocotools-2.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (458 kB)
Installing collected packages: fvcore, pycocotools, detectron2
  Attempting uninstall: fvcore
    Found existing installation: fvcore 0.1.6
    Uninstalling fvcore-0.1.6:
      Successfully uninstalled fvcore-0.1.6
  Attempting uninstall: pycocotools
    Found existing installation: pycocotools 2.0
    Uninstalling pycocotools-2.0:
      Successfully uninstalled pycocotools-2.0
  Attempting uninstall: detectron2
    Found existing installation: det

In [None]:
%cd detectron2/projects/DensePose

/content/detectron2/projects/DensePose


## <font style="color:green">2. Import Config and Model files</font>

The DensePose config file used here is `detectron2/projects/DensePose/configs/densepose_rcnn_R_50_FPN_s1x.yaml`.

Model weights are listed <a href="https://github.com/facebookresearch/detectron2/blob/master/projects/DensePose/doc/MODEL_ZOO.md" target="_blank">here</a>. From that page, we use a model from the improved baselines with the original fully convolutional head.

In [None]:
import urllib.request  # importing urllib alone does not guarantee that urllib.request is loaded

def download(url, filepath):
    response = urllib.request.urlretrieve(url, filepath)
    return response

In [None]:
download("https://dl.fbaipublicfiles.com/densepose/densepose_rcnn_R_50_FPN_s1x/165712039/model_final_162be9.pkl",
         "model_final_162be9.pkl")

('model_final_162be9.pkl', <http.client.HTTPMessage at 0x7f18533dc790>)
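
Because the weights file is large, it can help to skip the download when the file is already on disk, for example after re-running the notebook. The following is a minimal sketch of that idea. The helper name `download_if_missing` and the injectable `fetch` argument are hypothetical additions, and the fake fetcher exists only so the example runs without network access.

```python
import os
import tempfile
import urllib.request


def download_if_missing(url, filepath, fetch=urllib.request.urlretrieve):
    """Download url to filepath unless a non-empty file is already there.

    Returns (filepath, downloaded), where downloaded tells whether a fetch happened.
    """
    if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
        return filepath, False  # reuse the existing copy
    fetch(url, filepath)
    return filepath, True


# Offline demonstration: a fake fetcher stands in for the network call.
def fake_fetch(url, filepath):
    with open(filepath, "wb") as f:
        f.write(b"weights")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "model.pkl")
    _, first = download_if_missing("https://example.com/model.pkl", target, fetch=fake_fetch)
    _, second = download_if_missing("https://example.com/model.pkl", target, fetch=fake_fetch)

print(first, second)
```

With the default `fetch`, the call behaves like the `download` function above on the first run and becomes a no-op on later runs.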

Building on the main functions of Apply Net, let's see how to run inference on a video using Detectron2's DensePose.

## <font style="color:green">3. Inference on Video</font>

### 3.1. Import Libraries

In [None]:
import os
import sys

print("File location using os.getcwd():", os.getcwd())


File location using os.getcwd(): /content/detectron2/projects/DensePose


In [None]:
sys.path.insert(0,'/content/detectron2')

In [None]:
import cv2
import numpy as np

from typing import ClassVar, Dict

from detectron2.config import get_cfg
from detectron2.structures.instances import Instances
from detectron2.engine.defaults import DefaultPredictor

from densepose import add_densepose_config
from densepose.vis.base import CompoundVisualizer
from densepose.vis.bounding_box import ScoredBoundingBoxVisualizer
from densepose.vis.extractor import CompoundExtractor, create_extractor

from densepose.vis.densepose_results import (
    DensePoseResultsContourVisualizer,
    DensePoseResultsFineSegmentationVisualizer,
    DensePoseResultsUVisualizer,
    DensePoseResultsVVisualizer,
)

### Import Visualizers
- The `VISUALIZERS` object defined below lists the available visualization methods: contour, segmentation, U coordinates, V coordinates and bounding box.

**Sample visualizer method**:

```
class DensePoseResultsFineSegmentationVisualizer(DensePoseMaskedColormapResultsVisualizer):
    def __init__(self, inplace=True, cmap=cv2.COLORMAP_PARULA, alpha=0.7):
        super(DensePoseResultsFineSegmentationVisualizer, self).__init__(
            _extract_i_from_iuvarr,
            _extract_i_from_iuvarr,
            inplace,
            cmap,
            alpha,
            val_scale=255.0 / DensePoseDataRelative.N_PART_LABELS,
        )
```

From the above, we can see that the segmentation visualizer **`DensePoseResultsFineSegmentationVisualizer`** builds on **`DensePoseMaskedColormapResultsVisualizer`**, which in turn extends **`DensePoseResultsVisualizer`**, and that it uses helper functions such as **`_extract_i_from_iuvarr`**.

```
class DensePoseMaskedColormapResultsVisualizer(DensePoseResultsVisualizer):
    def __init__(
        self,
        data_extractor,
        segm_extractor,
        inplace=True,
        cmap=cv2.COLORMAP_PARULA,
        alpha=0.7,
        val_scale=1.0,
    ):
        self.mask_visualizer = MatrixVisualizer(
            inplace=inplace, cmap=cmap, val_scale=val_scale, alpha=alpha
        )
        self.data_extractor = data_extractor
        self.segm_extractor = segm_extractor

    def create_visualization_context(self, image_bgr: Image):
        return image_bgr

    def context_to_image_bgr(self, context):
        return context

    def get_image_bgr_from_context(self, context):
        return context

    def visualize_iuv_arr(self, context, iuv_arr, bbox_xywh):
        image_bgr = self.get_image_bgr_from_context(context)
        matrix = self.data_extractor(iuv_arr)
        segm = self.segm_extractor(iuv_arr)
        mask = np.zeros(matrix.shape, dtype=np.uint8)
        mask[segm > 0] = 1
        image_bgr = self.mask_visualizer.visualize(image_bgr, mask, matrix, bbox_xywh)
        return image_bgr


def _extract_i_from_iuvarr(iuv_arr):
    return iuv_arr[0, :, :]


def _extract_u_from_iuvarr(iuv_arr):
    return iuv_arr[1, :, :]


def _extract_v_from_iuvarr(iuv_arr):
    return iuv_arr[2, :, :]
```
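
To make the channel layout concrete, here is a small NumPy sketch with a toy IUV array. The array values are made up for illustration. Only the indexing and the mask construction mirror the code above.

```python
import numpy as np

# Toy IUV array of shape (3, H, W): channel 0 = part index I, 1 = U, 2 = V.
iuv_arr = np.zeros((3, 2, 3), dtype=np.uint8)
iuv_arr[0] = [[0, 1, 1], [0, 2, 0]]     # I: 0 means background
iuv_arr[1] = [[0, 10, 20], [0, 30, 0]]  # U coordinates
iuv_arr[2] = [[0, 40, 50], [0, 60, 0]]  # V coordinates

# Same slicing as the _extract_*_from_iuvarr helpers
part_index = iuv_arr[0, :, :]
u_coords = iuv_arr[1, :, :]

# Same mask construction as in visualize_iuv_arr
mask = np.zeros(part_index.shape, dtype=np.uint8)
mask[part_index > 0] = 1

print(mask)
print(u_coords[mask == 1])
```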


```
class DensePoseResultsVisualizer(object):
    def visualize(self, image_bgr: Image, densepose_result: Optional[DensePoseResult]) -> Image:
        if densepose_result is None:
            return image_bgr
        context = self.create_visualization_context(image_bgr)
        for i, result_encoded_w_shape in enumerate(densepose_result.results):
            iuv_arr = DensePoseResult.decode_png_data(*result_encoded_w_shape)
            bbox_xywh = densepose_result.boxes_xywh[i]
            self.visualize_iuv_arr(context, iuv_arr, bbox_xywh)
        image_bgr = self.context_to_image_bgr(context)
        return image_bgr
```

- The `visualize` method of `DensePoseResultsVisualizer` decodes the DensePose result data to get `iuv_arr` and the corresponding bounding boxes.
- `visualize_iuv_arr` extracts `matrix` and `segm` from `iuv_arr`. Because the selected visualization format is segmentation and the I channel is itself the part-wise segmentation, `matrix` and `segm` are the same here. Other visualizations may use `_extract_u_from_iuvarr` or `_extract_v_from_iuvarr` instead.
- A mask is generated from the segmentation.
- `mask_visualizer` uses `MatrixVisualizer`, defined in `densepose/vis/base.py`, which:
    - resizes the matrix and the mask to the bounding box width and height,
    - multiplies the matrix by `val_scale`, clips the values to the range [0, 255] and converts the result to an image format,
    - applies a colormap to the matrix image and colors the original image accordingly.
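
The steps listed above can be sketched with NumPy alone. This is a simplified stand-in and not the DensePose implementation: the function name `overlay_matrix` is hypothetical, the resize is nearest-neighbour, a plain blue-to-red ramp replaces the OpenCV colormap, and the bounding box is assumed to lie fully inside the image.

```python
import numpy as np


def overlay_matrix(image_bgr, mask, matrix, bbox_xywh, val_scale=1.0, alpha=0.7):
    """Simplified illustration of resize, scale, clip, colorize and blend."""
    x, y, w, h = [int(v) for v in bbox_xywh]
    # Nearest-neighbour resize of matrix and mask to the bounding box size
    rows = np.arange(h) * matrix.shape[0] // h
    cols = np.arange(w) * matrix.shape[1] // w
    matrix_r = matrix[rows][:, cols]
    mask_r = mask[rows][:, cols]
    # Scale, clip to [0, 255] and convert to 8-bit values
    values = np.clip(matrix_r.astype(np.float32) * val_scale, 0, 255).astype(np.uint8)
    # Stand-in colormap: blue for low values, red for high values (BGR order)
    colored = np.stack([255 - values, np.zeros_like(values), values], axis=-1)
    # Blend the colors into a copy of the image, only where the mask is set
    out = image_bgr.copy()
    region = out[y:y + h, x:x + w]
    blended = (1 - alpha) * region + alpha * colored
    region[mask_r > 0] = blended[mask_r > 0].astype(np.uint8)
    return out


image = np.zeros((4, 4, 3), dtype=np.uint8)
matrix = np.array([[0, 12], [24, 0]], dtype=np.uint8)  # toy part indices
mask = (matrix > 0).astype(np.uint8)
# val_scale spreads the 24 part labels over the 0-255 range
out = overlay_matrix(image, mask, matrix, (1, 1, 2, 2), val_scale=255.0 / 24, alpha=1.0)
print(out[1, 2], out[2, 1])
```

Pixels where the mask is zero keep their original color, which is why only the detected body parts appear tinted in the output.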

In [None]:
## Visualizer methods
VISUALIZERS: Dict[str, type] = {  # module-level mapping, so ClassVar does not apply here
    "dp_contour": DensePoseResultsContourVisualizer,
    "dp_segm": DensePoseResultsFineSegmentationVisualizer,
    "dp_u": DensePoseResultsUVisualizer,
    "dp_v": DensePoseResultsVVisualizer,
    "bbox": ScoredBoundingBoxVisualizer,
}

### 3.2. Setup Config
- `get_cfg` loads the default Detectron2 config, and `add_densepose_config` adds the DensePose-specific options, which can be viewed at `detectron2/projects/DensePose/densepose/config.py`.
- The config file is then merged in and the path to the model weights file is set.

In [None]:
def setConfig():
    cfg = get_cfg()
    add_densepose_config(cfg)

    cfg.merge_from_file("configs/densepose_rcnn_R_50_FPN_s1x.yaml")
    cfg.MODEL.DEVICE = "cuda"
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

    cfg.MODEL.WEIGHTS = "model_final_162be9.pkl"

    return cfg
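
`setConfig` hard-codes `"cuda"`, which fails on a runtime without a GPU. A minimal sketch of a fallback is shown below. The helper name `pick_device` is hypothetical, and it relies only on `torch.cuda.is_available()`.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to use
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

Inside `setConfig`, the assignment would then read `cfg.MODEL.DEVICE = pick_device()`. Inference on CPU works but is considerably slower.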

### 3.3. Visualizer and Extractor
- Initializes a visualizer and an extractor for each visualization type given in `vis_specs`.
- Multiple visualization formats can be selected simultaneously; they are handled by `CompoundVisualizer` and `CompoundExtractor`.
- These methods extract contour, segmentation or point information from the IUV mapping output by DensePose.

In [None]:
def getVisAndExtract(vis_specs):
    visualizers = []
    extractors = []
    for vis_spec in vis_specs:
        vis = VISUALIZERS[vis_spec]()
        visualizers.append(vis)
        extractor = create_extractor(vis)
        extractors.append(extractor)
    visualizer = CompoundVisualizer(visualizers)
    extractor = CompoundExtractor(extractors)

    return extractor, visualizer
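
The compound classes follow a simple composite pattern: one object fans a call out to several children. The toy classes below illustrate the idea with plain Python values. `ToyCompoundExtractor` and `ToyCompoundVisualizer` are hypothetical stand-ins, not the DensePose classes, and they assume that the extractors and visualizers are paired by position.

```python
class ToyCompoundExtractor:
    """Runs every extractor on the same input and returns the results in order."""

    def __init__(self, extractors):
        self.extractors = extractors

    def __call__(self, instances):
        return [extractor(instances) for extractor in self.extractors]


class ToyCompoundVisualizer:
    """Applies each visualizer in turn, pairing it with the matching datum."""

    def __init__(self, visualizers):
        self.visualizers = visualizers

    def visualize(self, image, data):
        assert len(data) == len(self.visualizers)
        for visualizer, datum in zip(self.visualizers, data):
            image = visualizer(image, datum)
        return image


# Toy "instances" and "image": a dict and a list of drawn layers
instances = {"parts": 5, "boxes": 2}
extractor = ToyCompoundExtractor([lambda inst: inst["parts"], lambda inst: inst["boxes"]])
visualizer = ToyCompoundVisualizer([
    lambda img, d: img + [f"segm:{d}"],
    lambda img, d: img + [f"bbox:{d}"],
])

data = extractor(instances)
layers = visualizer.visualize([], data)
print(layers)
```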

### 3.4. Create context
```
context = {
            "extractor": extractor,
            "visualizer": visualizer,
            "out_fname": args.output,
            "entry_idx": 0,
        }
```
- Apply Net creates a context object with the visualizer, extractor, output filename and entry index, as shown above. Here, we keep only the `visualizer` and `extractor` keys.

In [None]:
def createContext(extractor, visualizer):
    context = {
        "extractor": extractor,
        "visualizer": visualizer
    }

    return context

### 3.5. Predict Image
- The extractor takes the IUV mapping of the people detected in the image from the predictor output, in the DensePoseOutput format.
- The visualizer converts this output into a viewable form such as contours, points or segmentation, drawn on a grayscale copy of the frame.

In [None]:
def predict(img, predictor, context):
    outputs = predictor(img)['instances']
    image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image = np.tile(image[:, :, np.newaxis], [1, 1, 3])
    data = context["extractor"](outputs)
    image_vis = context["visualizer"].visualize(image, data)
    return image_vis
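
`predict` converts the frame to grayscale and then tiles it back to three channels, so that colored overlays can be drawn on a gray background. The NumPy part of that step can be checked on a tiny array. The pixel values below are made up.

```python
import numpy as np

# A 2x2 single-channel "grayscale image"
gray = np.array([[10, 20], [30, 40]], dtype=np.uint8)

# Add a channel axis, then repeat it three times along that axis
gray_3ch = np.tile(gray[:, :, np.newaxis], [1, 1, 3])

print(gray_3ch.shape)
print(gray_3ch[0, 1])
```

All three channels hold the same values, so the image still looks gray but has the shape a BGR visualizer expects.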

The video inference function runs detection on every 10th frame; change `n_frame` to adjust this.

In [None]:
def inferenceOnVideo(videoPath, predictor, context):
    cap = cv2.VideoCapture(videoPath)
    cnt = 0
    n_frame = 10  # run DensePose on every n_frame-th frame

    output_frames = []

    while True:
        ret, im = cap.read()

        # Stop when the video ends or a frame cannot be read
        if not ret:
            break

        if cnt % n_frame == 0:
            output = predict(im, predictor, context)
            output_frames.append(output)

        cnt = cnt + 1

    cap.release()

    # Nothing to write if the video could not be opened or had no frames
    if not output_frames:
        return

    height, width, _ = output_frames[0].shape
    size = (width, height)
    # Write the processed frames to out.mp4 at 10 frames per second
    out = cv2.VideoWriter("out.mp4", cv2.VideoWriter_fourcc(*'mp4v'), 10, size)

    for frame in output_frames:
        out.write(frame)

    out.release()
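
The sampling rate and the output frame rate together determine how long `out.mp4` is. The sketch below works this out for a given frame count. The helper names are hypothetical, and the 300-frame clip is an assumed example rather than a property of `test_cut.mp4`.

```python
def sampled_frame_indices(total_frames, n_frame=10):
    """Indices of the frames that are passed to the predictor."""
    return list(range(0, total_frames, n_frame))


def output_duration_seconds(total_frames, n_frame=10, out_fps=10):
    """Length of the written video: kept frames divided by the output frame rate."""
    return len(sampled_frame_indices(total_frames, n_frame)) / out_fps


indices = sampled_frame_indices(95)
duration = output_duration_seconds(300)

print(indices)
print(duration)
```

For a source recorded at 30 frames per second, 300 frames last 10 seconds, while the output lasts 3 seconds, so the result plays back faster than the original unless the writer's frame rate is lowered to match.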

### 3.6. Main Execution
- Define the visualization formats to use in `vis_specs`, chosen from {'bbox', 'dp_segm', 'dp_contour', 'dp_u', 'dp_v'}.
- Initialize the config.
- Initialize Detectron2's default predictor.
- Create the visualizer and extractor based on `vis_specs`.
- Create the context with the objects needed for prediction.
- Compile all frames predicted by DensePose into the output video `out.mp4`.


**Download <a href="https://www.dropbox.com/s/kk4zjqcfm5yf1cp/test_cut.mp4?dl=1" target="_blank">test_cut.mp4</a>**

In [None]:
download('https://www.dropbox.com/s/kk4zjqcfm5yf1cp/test_cut.mp4?dl=1', 'test_cut.mp4')

('test_cut.mp4', <http.client.HTTPMessage at 0x7f1744db8e90>)

In [None]:
vis_specs = ['dp_segm', 'bbox']

cfg = setConfig()

##Initialize predictor
predictor = DefaultPredictor(cfg)

extractor, visualizer = getVisAndExtract(vis_specs)

context = createContext(extractor, visualizer)

inferenceOnVideo("test_cut.mp4", predictor, context)



## <font style="color:green">References</font>

- <a href="https://github.com/facebookresearch/detectron2" target="_blank">https://github.com/facebookresearch/detectron2</a>
- <a href="http://densepose.org/" target="_blank">http://densepose.org/</a>
- <a href="https://research.fb.com/downloads/densepose/" target="_blank">https://research.fb.com/downloads/densepose/</a>
- <a href="https://arxiv.org/pdf/1802.00434.pdf" target="_blank">https://arxiv.org/pdf/1802.00434.pdf</a>