# <font style="color:blue">Train DensePose with detectron2</font>

In this module, we walk through training DensePose with detectron2, using the COCO dataset with DensePose annotations.

## <font style="color:green">1. Setup Code</font>

To use detectron2's DensePose training module, we first set up the detectron2 code.

In [None]:
# install dependencies
!pip install -U torch torchvision cython
!pip install -U 'git+https://github.com/facebookresearch/fvcore.git' 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
import torch, torchvision
torch.__version__

Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch)
  Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.5.147 (from torch)
  Downloading nvidia_curand_cu12-10.3.5

'2.6.0+cu124'

In [None]:
!git clone https://github.com/facebookresearch/detectron2 detectron2
!pip install -e detectron2

Cloning into 'detectron2'...
remote: Enumerating objects: 15837, done.
remote: Counting objects: 100% (65/65), done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 15837 (delta 35), reused 14 (delta 14), pack-reused 15772 (from 2)
Receiving objects: 100% (15837/15837), 6.41 MiB | 18.44 MiB/s, done.
Resolving deltas: 100% (11533/11533), done.
Obtaining file:///content/detectron2
  Preparing metadata (setup.py) ... done
Collecting pycocotools>=2.0.2 (from detectron2==0.6)
  Downloading pycocotools-2.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.1 kB)
Collecting fvcore<0.1.6,>=0.1.5 (from detectron2==0.6)
  Downloading fvcore-0.1.5.post20221221.tar.gz (50 kB)
  Preparing metadata (setup.py) ... done
Collecting iopath<0.1.10,>=0.1.7 (from detectron2==0.6)
  Downloading iopath-0.1.9-py3-

In [None]:
!pip install av



**Note**: Before running the DensePose evaluation later on, we need to edit `densepose_coco_evaluation.py` to avoid the unused-argument error caused by `timeout_sec`. The `PathManager.get_local_path` call around line 143 must receive the path as its only argument. The original call
```
pdist_matrix_fpath = PathManager.get_local_path(
            "https://dl.fbaipublicfiles.com/densepose/data/Pdist_matrix.pkl", timeout_sec=120
        )
```
should be changed to
```
pdist_matrix_fpath = PathManager.get_local_path(
            "https://dl.fbaipublicfiles.com/densepose/data/Pdist_matrix.pkl"
        )
```
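The manual edit above can also be scripted. The following is a minimal sketch that works on the source text only: the helper name `strip_timeout_arg` is hypothetical, and it assumes the argument appears literally as `, timeout_sec=<integer>`.

```python
import re

# Matches a trailing ", timeout_sec=<integer>" keyword argument.
_TIMEOUT_ARG = re.compile(r",\s*timeout_sec\s*=\s*\d+")


def strip_timeout_arg(source: str) -> str:
    """Return `source` with every `, timeout_sec=<int>` argument removed."""
    return _TIMEOUT_ARG.sub("", source)


# The call as it appears before the edit.
before = (
    'pdist_matrix_fpath = PathManager.get_local_path(\n'
    '    "https://dl.fbaipublicfiles.com/densepose/data/Pdist_matrix.pkl", timeout_sec=120\n'
    ')\n'
)
after = strip_timeout_arg(before)
print(after)
```

To patch the file itself, read it with `pathlib.Path(...).read_text()`, pass the text through `strip_timeout_arg`, and write the result back with `write_text()`. The file most likely sits at `densepose/evaluation/densepose_coco_evaluation.py` inside the DensePose project directory, but that location is an assumption to check in your checkout.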

In [None]:
%cd detectron2/projects/DensePose

/content/detectron2/projects/DensePose


## <font style="color:green">2. Dataset Preparation</font>

We download the 2014 val images from the <a href="http://cocodataset.org/#download">COCO website</a>. We chose the val images (`6GB`) over the train images (`13GB`) because the download is smaller.

**[Download the COCO val2014 Dataset](http://images.cocodataset.org/zips/val2014.zip)**

After downloading the COCO val2014 dataset, unzip it in the current directory.

The annotation files can be downloaded using <a href="https://github.com/facebookresearch/DensePose/blob/master/DensePoseData/get_DensePose_COCO.sh">this script</a>.


The annotations cover 1500 images from the val set. Of these, we use 1000 images to create the train, val and test datasets.

The train, val and test datasets follow the directory structure expected by the detectron2 training module.

```
datasets
|
|-->coco
       |
       |-->annotations
       |       |-->densepose_train2014.json
       |       |-->densepose_valminusminival2014.json
       |       |-->densepose_minival2014.json
       |
       |-->train2014
       |
       |-->val2014
```
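Before training, it can help to confirm that the layout above is in place. The sketch below checks only for the paths listed in the tree. The helper name `missing_entries` is hypothetical, and the root directory `datasets` is assumed to be relative to the current working directory.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED = [
    "coco/annotations/densepose_train2014.json",
    "coco/annotations/densepose_valminusminival2014.json",
    "coco/annotations/densepose_minival2014.json",
    "coco/train2014",
    "coco/val2014",
]


def missing_entries(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]


# An empty list means every expected file and folder was found.
print(missing_entries("datasets"))
```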

The script that splits the images and annotations into train, val and test sets is available at **[GenerateDatasets.ipynb](https://www.dropbox.com/s/rio3wx75srgndal/GenerateDatasets.ipynb?dl=1)**.
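To give a rough idea of what such a split involves, here is a minimal sketch that is not taken from the linked notebook. It assumes a COCO-style dictionary with `images` and `annotations` lists. The function name `split_coco`, the split sizes and the seed are all hypothetical.

```python
import random


def split_coco(coco, sizes, seed=0):
    """Split a COCO-style dict into disjoint subsets with the given image counts."""
    images = list(coco["images"])
    random.Random(seed).shuffle(images)  # reproducible shuffle
    splits, start = [], 0
    for size in sizes:
        chosen = images[start:start + size]
        ids = {img["id"] for img in chosen}
        splits.append({
            "images": chosen,
            # Keep only the annotations that belong to the chosen images.
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
            "categories": coco.get("categories", []),
        })
        start += size
    return splits


# Toy data standing in for a real annotation file.
toy = {
    "images": [{"id": i} for i in range(10)],
    "annotations": [{"id": 100 + i, "image_id": i} for i in range(10)],
    "categories": [{"id": 1, "name": "person"}],
}
train, val, test = split_coco(toy, sizes=(6, 2, 2))
print(len(train["images"]), len(val["images"]), len(test["images"]))
```

Each resulting dictionary can be written with `json.dump` to the annotation file names shown in the directory tree.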

In [None]:
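import urllib.request  # the submodule must be imported explicitly

def download(url, filepath):
    # Download `url` to `filepath`; returns a (filename, headers) tuple
    response = urllib.request.urlretrieve(url, filepath)
    return response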

The prepared data and annotations are available as a zip archive <a href="https://www.dropbox.com/s/biptqvnd7r35962/datasets.zip?dl=1">here</a>. Unzipping it reproduces the directory structure shown above.

**[Download Prepared Dataset](https://www.dropbox.com/s/biptqvnd7r35962/datasets.zip?dl=1)**

**Let's download the dataset and unzip it by running the code cells below.**

In [1]:
prepared_data_link = 'https://www.dropbox.com/s/biptqvnd7r35962/datasets.zip?dl=1'
dataset_zip = 'datasets.zip'

download(prepared_data_link, dataset_zip)


In [None]:
import zipfile

def unzip(zip_filepath, target_dir):
    with zipfile.ZipFile(zip_filepath,'r') as zip_file:
        zip_file.extractall(target_dir)
    return

In [None]:
unzip(dataset_zip, ".")

In [None]:
# !ls datasets/coco/annotations

## <font style="color:green">3. Training</font>

### 3.1. Import Libraries

In [None]:
import logging
import os
import av
from collections import OrderedDict

import matplotlib.pyplot as plt

import detectron2.utils.comm as comm
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import CfgNode, get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer, default_argument_parser, default_setup, hooks, launch
from detectron2.evaluation import COCOEvaluator, DatasetEvaluators, verify_results
from detectron2.modeling import DatasetMapperTTA
from detectron2.utils.logger import setup_logger
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.utils.visualizer import Visualizer

In [None]:
from densepose import (
    DensePoseCOCOEvaluator,
    DensePoseGeneralizedRCNNWithTTA,
    add_dataset_category_config,
    add_densepose_config,
    load_from_cfg,
)
from densepose.data import DatasetMapper, build_detection_test_loader, build_detection_train_loader

### 3.2. Visualize Dataset

We will use `query_db.py` from detectron2's DensePose project to visualize the training dataset.

- QueryDb is a tool to print or visualize DensePose data from a dataset. It has two modes: `print`, which writes dataset entries to standard output, and `show`, which visualizes them on images. Usage:
    - `python query_db.py print [-h] [-v] [--max-entries N] <dataset> <selector>`
    - `python query_db.py show [-h] [-v] [--max-entries N] [--output <image_file>] <dataset> <selector> <visualizations>`
    
There are three mandatory arguments:

- `<dataset>`, DensePose dataset specification, from which to select the entries (e.g. `densepose_coco_2014_train`).
- `<selector>`, dataset entry selector, which can be a single specification or a comma-separated list of specifications of the form `field[:type]=value` for an exact match with the value, or `field[:type]=min-max` for a range of values.
- `<visualizations>`, visualizations specifier; currently available visualizations are:
    - bbox - bounding boxes of annotated persons;
    - dp_i - annotated points colored according to the containing part;
    - dp_pts - annotated points in green color;
    - dp_segm - segmentation masks for annotated persons;
    - dp_u - annotated points colored according to their U coordinate in part parameterization;
    - dp_v - annotated points colored according to their V coordinate in part parameterization;
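The selector syntax described above can be illustrated with a small parser. This is a sketch of the grammar only, not the parser that `query_db.py` uses. The function name `parse_selector`, the supported type names and the default type `str` are assumptions, and negative numbers are not handled.

```python
def parse_selector(selector):
    """Parse comma-separated 'field[:type]=value' or 'field[:type]=min-max' specs."""
    casts = {"int": int, "float": float, "str": str}
    parsed = []
    for spec in selector.split(","):
        lhs, _, rhs = spec.partition("=")
        field, _, type_name = lhs.partition(":")
        cast = casts[type_name or "str"]  # assumed default type: str
        if "-" in rhs and type_name in ("int", "float"):
            # Numeric range of the form min-max.
            low, high = rhs.split("-", 1)
            parsed.append((field, "range", (cast(low), cast(high))))
        else:
            # Exact match with a single value.
            parsed.append((field, "exact", cast(rhs)))
    return parsed


print(parse_selector("image_id:int=785"))
print(parse_selector("image_id:int=100-200"))
```

The first call corresponds to the selector used in the `show` command in this notebook, which picks the single entry whose `image_id` equals 785.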


In [None]:
!python query_db.py show densepose_coco_2014_train image_id:int=785 bbox,dp_i -v

[03/31 09:13:25 query_db]: Loading dataset densepose_coco_2014_train
[03/31 09:13:26 query_db]: Loaded dataset densepose_coco_2014_train in 0.746s
Traceback (most recent call last):
  File "/content/detectron2/projects/DensePose/query_db.py", line 250, in <module>
    main()
  File "/content/detectron2/projects/DensePose/query_db.py", line 246, in main
    args.func(args)
  File "/content/detectron2/projects/DensePose/query_db.py", line 89, in execute
    cls.execute_on_entry(entry, context)
  File "/content/detectron2/projects/DensePose/query_db.py", line 171, in execute_on_entry
    image_vis = visualizer.visualize(image, datas)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/detectron2/projects/DensePose/densepose/vis/base.py", line 188, in visualize
    image = visualizer.visualize(image, data[i])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/detectron2/projects/DensePose/densepose/vis/densepose_data_points.py", line 67, in 

In [None]:
import cv2
import matplotlib.pyplot as plt

img =  cv2.imread("output.0001.png")

if img is None:
    print("Error: Could not load image. Please check the file path and format.")
else:
    plt.figure(figsize=(12, 12))
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # Convert color format
    plt.show()

Error: Could not load image. Please check the file path and format.


### 3.3. Setup Config

- Starting from the default config, the DensePose-specific config and the dataset category config are added. More details of the DensePose-specific config can be seen <a href="https://github.com/facebookresearch/detectron2/blob/master/projects/DensePose/densepose/config.py">here</a>.
- The following config options are passed as arguments:
    - Model config path
    - Batch size
    - Number of iterations
    - Learning rate
    - Number of workers
- `freeze()` makes the `CfgNode` and all of its children immutable.
- A logger is set up to record logs during training.
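Config overrides reach `cfg.merge_from_list` as a flat list of alternating keys and values. The sketch below builds such a list from a dictionary. The helper name `to_opts` is hypothetical, and the keys and values are the ones this notebook passes later through `opts`.

```python
def to_opts(overrides):
    """Flatten {'KEY': value} into ['KEY', 'value', ...] for `merge_from_list`."""
    opts = []
    for key, value in overrides.items():
        opts.extend([key, str(value)])
    return opts


opts = to_opts({
    "SOLVER.IMS_PER_BATCH": 4,     # batch size
    "SOLVER.BASE_LR": 0.0005,      # learning rate
    "SOLVER.MAX_ITER": 200,        # number of iterations
    "DATALOADER.NUM_WORKERS": 0,   # number of data loading workers
})
print(opts)
```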


In [None]:
def setup(args):
    cfg = get_cfg()
    add_dataset_category_config(cfg)
    add_densepose_config(cfg)
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    cfg.freeze()
    default_setup(cfg, args)
    # Setup logger for "densepose" module
    setup_logger(output=cfg.OUTPUT_DIR, distributed_rank=comm.get_rank(), name="densepose")
    return cfg

### 3.4. Densepose Methods

- Evaluator
    - `DensePoseCOCOEvaluator` is added to the list of evaluators.
        - Similar to Intersection over Union (IoU) for object detection and Object Keypoint Similarity (OKS) for keypoints on the COCO dataset, DensePose uses Geodesic Point Similarity (GPS) for AP/AR calculation. GPS is based on geodesic distances on the template mesh between the collected ground-truth points and the estimated surface coordinates for the same image points. More details can be found <a href="https://github.com/facebookresearch/DensePose/blob/master/challenge/2018_COCO_DensePose/evaluation.md">here</a>.
- DataLoaders
    - A custom `DatasetMapper` converts the DensePose data format to the detectron2 format for data loading.
- Test Time Augmentation
    - Applies data augmentation to the test set during evaluation and runs inference on the augmented data.
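To make the GPS idea concrete, here is a minimal sketch of the score for one person instance, written as the mean of a Gaussian of the geodesic distances. The function name is hypothetical. The normalizing constant `kappa=0.255` is an assumption based on the original DensePose evaluation description, and the evaluator in detectron2 may differ in detail (the log labels its metric OGPS).

```python
import math


def geodesic_point_similarity(distances, kappa=0.255):
    """Mean of exp(-d^2 / (2 * kappa^2)) over per-point geodesic distances."""
    if not distances:
        raise ValueError("at least one annotated point is required")
    total = sum(math.exp(-(d ** 2) / (2 * kappa ** 2)) for d in distances)
    return total / len(distances)


# Perfect predictions give a score of 1; the score decays as distances grow.
print(geodesic_point_similarity([0.0, 0.0, 0.0]))
print(geodesic_point_similarity([0.1, 0.3, 0.5]))
```

A perfect match scores 1 and the score falls toward 0 as the predicted surface points move away from the annotated ones, which is what lets AP/AR be thresholded on it in the same way as IoU or OKS.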

In [None]:
class Trainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg: CfgNode, dataset_name, output_folder=None):
        if output_folder is None:
            output_folder = os.path.join(cfg.OUTPUT_DIR, "inference")
        evaluators = [COCOEvaluator(dataset_name, cfg, True, output_folder)]
        if cfg.MODEL.DENSEPOSE_ON:
            evaluators.append(DensePoseCOCOEvaluator(dataset_name, True, output_folder))
        return DatasetEvaluators(evaluators)

    @classmethod
    def build_test_loader(cls, cfg: CfgNode, dataset_name):
        return build_detection_test_loader(cfg, dataset_name, mapper=DatasetMapper(cfg, False))

    @classmethod
    def build_train_loader(cls, cfg: CfgNode):
        return build_detection_train_loader(cfg, mapper=DatasetMapper(cfg, True))

    @classmethod
    def test_with_TTA(cls, cfg: CfgNode, model):
        logger = logging.getLogger("detectron2.trainer")
        # In the end of training, run an evaluation with TTA
        # Only support some R-CNN models.
        logger.info("Running inference with test-time augmentation ...")
        transform_data = load_from_cfg(cfg)
        model = DensePoseGeneralizedRCNNWithTTA(cfg, model, transform_data, DatasetMapperTTA(cfg))
        evaluators = [
            cls.build_evaluator(
                cfg, name, output_folder=os.path.join(cfg.OUTPUT_DIR, "inference_TTA")
            )
            for name in cfg.DATASETS.TEST
        ]
        res = cls.test(cfg, model, evaluators)
        res = OrderedDict({k + "_TTA": v for k, v in res.items()})
        return res

### 3.5. Main function
- The custom `Trainer` defined above is instantiated with the config defined above.
- In evaluation-only mode, `Trainer.build_model` builds the model, `DetectionCheckpointer` loads the model weights, and the model is evaluated on the `cfg.DATASETS.TEST` dataset, which is `densepose_coco_2014_minival` here.
- If test-time augmentation is enabled, its evaluation results are added to the previous evaluation results.

In [None]:
def main(args):
    cfg = setup(args)

    if args.eval_only:
        model = Trainer.build_model(cfg)
        DetectionCheckpointer(model, save_dir=cfg.OUTPUT_DIR).resume_or_load(
            cfg.MODEL.WEIGHTS, resume=args.resume
        )
        res = Trainer.test(cfg, model)
        if cfg.TEST.AUG.ENABLED:
            res.update(Trainer.test_with_TTA(cfg, model))
        if comm.is_main_process():
            verify_results(cfg, res)
        return res

    trainer = Trainer(cfg)
    trainer.resume_or_load(resume=args.resume)
    if cfg.TEST.AUG.ENABLED:
        trainer.register_hooks(
            [hooks.EvalHook(0, lambda: trainer.test_with_TTA(cfg, trainer.model))]
        )
    return trainer.train()
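An evaluation-only run goes through the same `main` with different arguments. The sketch below only builds such an argument object and does not call `main`. It uses `SimpleNamespace` from the standard library as a stand-in for the attribute dictionary, the helper name `make_args` is hypothetical, and the weights path `output/model_final.pth` is an assumption about where training saved its final checkpoint. `default_setup` may read further attributes that are not listed here.

```python
from types import SimpleNamespace


def make_args(eval_only=False, weights=None, extra_opts=()):
    """Build an object exposing the attributes that `setup` and `main` read."""
    opts = list(extra_opts)
    if weights is not None:
        # Point the config at a trained checkpoint.
        opts += ["MODEL.WEIGHTS", weights]
    return SimpleNamespace(
        config_file="configs/densepose_rcnn_R_50_FPN_s1x.yaml",
        resume=False,
        eval_only=eval_only,
        opts=opts,
    )


eval_args = make_args(eval_only=True, weights="output/model_final.pth")
print(eval_args.eval_only, eval_args.opts)
```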

**Run the code cell below to install `attrdict` if it is not already installed. The argument cell that follows uses `EasyDict` from `easydict` instead of `AttrDict`.**

In [None]:
!pip install attrdict

Collecting attrdict
  Downloading attrdict-2.0.1-py2.py3-none-any.whl.metadata (6.7 kB)
Downloading attrdict-2.0.1-py2.py3-none-any.whl (9.9 kB)
Installing collected packages: attrdict
Successfully installed attrdict-2.0.1


In [None]:
#from attrdict import AttrDict
from easydict import EasyDict as edict

if __name__ == "__main__":
    attr_dict = {'resume': False,
                 'config_file': 'configs/densepose_rcnn_R_50_FPN_s1x.yaml',
                 'eval_only': False,
                 'machine_rank': 0,
                 'num_gpus': 1,  # main() is called directly, so training runs in one process on one GPU
                 'num_machines': 1,
                 'opts': ['SOLVER.IMS_PER_BATCH', '4',
                          'SOLVER.BASE_LR', '0.0005',
                          'SOLVER.MAX_ITER', '200',
                          'DATALOADER.NUM_WORKERS', '0']
                }

    #args = AttrDict(attr_dict)
    args = edict(attr_dict)  # Using edict instead of AttrDict

    main(args)

[03/31 09:49:32 detectron2]: Rank of current process: 0. World size: 1
[03/31 09:49:33 detectron2]: Environment info:
-------------------------------  -----------------------------------------------------------------
sys.platform                     linux
Python                           3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
numpy                            2.0.2
detectron2                       0.6 @/content/detectron2/detectron2
Compiler                         GCC 11.4
CUDA compiler                    CUDA 12.5
detectron2 arch flags            7.5
DETECTRON2_ENV_MODULE            <not set>
PyTorch                          2.6.0+cu124 @/usr/local/lib/python3.11/dist-packages/torch
PyTorch debug build              False
torch._C._GLIBCXX_USE_CXX11_ABI  False
GPU available                    Yes
GPU 0                            Tesla T4 (arch=7.5)
Driver version                   550.54.15
CUDA_HOME                        /usr/local/cuda
Pillow                           11

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]


[03/31 09:50:05 d2.utils.events]:  eta: 0:04:17  iter: 19  total_loss: 9.155  loss_cls: 0.1396  loss_box_reg: 0.003253  loss_densepose_U: 0.7747  loss_densepose_V: 0.7355  loss_densepose_I: 3.239  loss_densepose_S: 3.606  loss_rpn_cls: 0.6596  loss_rpn_loc: 0.05507    time: 1.4209  last_time: 1.3900  data_time: 0.1385  last_data_time: 0.1000   lr: 9.275e-05  max_mem: 5030M
[03/31 09:50:41 d2.utils.events]:  eta: 0:03:53  iter: 39  total_loss: 7.644  loss_cls: 0.03489  loss_box_reg: 0.003572  loss_densepose_U: 0.3065  loss_densepose_V: 0.2805  loss_densepose_I: 3.113  loss_densepose_S: 3.231  loss_rpn_cls: 0.5588  loss_rpn_loc: 0.04286    time: 1.4824  last_time: 1.4586  data_time: 0.1436  last_data_time: 0.0679   lr: 0.00013775  max_mem: 5249M
[03/31 09:51:09 d2.utils.events]:  eta: 0:03:23  iter: 59  total_loss: 7.126  loss_cls: 0.1299  loss_box_reg: 0.05707  loss_densepose_U: 0.3675  loss_densepose_V: 0.3016  loss_densepose_I: 3.088  loss_densepose_S: 2.807  loss_rpn_cls: 0.4349  los

Pdist_matrix.pkl: 1.52GB [01:49, 13.8MB/s]                            
  k = (n * (n - 1) / 2) - (n - i) * ((n - i) - 1) / 2 + j - i - 1
  k = (n * (n - 1) / 2) - (n - i) * ((n - i) - 1) / 2 + j - i - 1


[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]: DensePose evaluation DONE (t=122.08s).
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]: Accumulating evaluation results...
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]: Categories: [np.int64(0)]
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]: Final: max precision 0.041916167664670656, min precision 0.0
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]: DONE (t=0.01s).
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]:  Average Precision  (AP) @[ OGPS=0.50:0.95 | area=   all | maxDets= 20 ] = 0.001
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]:  Average Precision  (AP) @[ OGPS=0.50      | area=   all | maxDets= 20 ] = 0.010
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluation]:  Average Precision  (AP) @[ OGPS=0.75      | area=   all | maxDets= 20 ] = 0.000
[03/31 10:03:38 densepose.evaluation.densepose_coco_evaluati

Above, we have seen how DensePose can be trained using detectron2. To obtain better-performing models in less time, the model should be trained on multiple GPUs simultaneously with a learning rate scheduler. In this demo, we used a single GPU only to show the training process.
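The `lr` values in the training log already show a scheduler at work: the rate ramps up during the first iterations. The sketch below reproduces the two logged values with a linear warmup. The warmup factor `0.1` and the warmup length of `200` iterations are assumptions inferred from those two log lines, not values read from the config file, and the function name `warmup_lr` is hypothetical.

```python
def warmup_lr(iteration, base_lr=0.0005, warmup_factor=0.1, warmup_iters=200):
    """Linear warmup: scale base_lr from warmup_factor up to 1 over warmup_iters."""
    if iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters
    return base_lr * (warmup_factor * (1 - alpha) + alpha)


# Iterations 19 and 39 are the ones reported in the training log.
for it in (19, 39):
    print(it, f"{warmup_lr(it):.4e}")
```

With longer runs, a decay schedule after the warmup and a batch size scaled to the number of GPUs would typically be configured through the `SOLVER` options.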

## <font style="color:green">References</font>

- https://github.com/facebookresearch/detectron2/blob/master/projects/DensePose/train_net.py
- https://detectron2.readthedocs.io/
- https://github.com/facebookresearch/DensePose