## DeltaCNN

To import the **deltacnn** library, we first need to clone the [DeltaCNN GitHub repository](https://github.com/facebookresearch/DeltaCNN).

We then need to install the DeltaCNN framework. The README.md recommends running setup.py directly:
```
python setup.py install --user
```
However, this did not work for us. Instead, we **install with pip**, pointing it directly at the cloned `DeltaCNN` folder.

In [None]:
!git clone https://github.com/facebookresearch/DeltaCNN.git

!pip install /content/DeltaCNN/

Cloning into 'DeltaCNN'...
remote: Enumerating objects: 55, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 55 (delta 14), reused 37 (delta 5), pack-reused 0
Unpacking objects: 100% (55/55), 1.53 MiB | 13.82 MiB/s, done.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing /content/DeltaCNN
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: torchdeltacnn
  Building wheel for torchdeltacnn (setup.py) ... done
  Created wheel for torchdeltacnn: filename=torchdeltacnn-0.0.0-cp39-cp39-linux_x86_64.whl size=17505467 sha256=ebd1b2f84c534ff886102e0ad9acc91cdeceabd5b017c2ba7d08eeccb73b8662
  Stored in directory: /tmp/pip-ephem-wheel-cache-d2xze6pc/wheels/79/07/be/b2e778efc1014ae6aaec3e3783a32762efddf8f6437493ea96
Successfully built torchdeltacnn
Installing collected packages: torchdeltacnn
Successfully inst

We can now import the deltacnn package along with other modules such as torch and numpy:

In [None]:
import torch
import numpy

import deltacnn
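
Before running the heavier cells below, it can help to confirm that the build actually produced an importable package. The following is a minimal sketch using only the standard library; the helper name `is_importable` is ours and is not part of DeltaCNN.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate `module_name` without importing it."""
    return importlib.util.find_spec(module_name) is not None


# `deltacnn` is the package name imported above; torch and numpy are its usual companions.
for name in ("torch", "numpy", "deltacnn"):
    print(f"{name}: {'found' if is_importable(name) else 'MISSING'}")
```

A `MISSING` line for `deltacnn` would suggest that the pip build step did not complete, even if its log looked clean.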

## Pose ResNet

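The code below is adapted from Microsoft's [human-pose-estimation.pytorch repository](https://github.com/microsoft/human-pose-estimation.pytorch/tree/49f3f4458c9d5917c75c37a6db48c6a0d7cd89a1), with the layers in `pose_resnet.py` swapped for their DeltaCNN counterparts.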

In [None]:
%cd /content/

/content


In [None]:
!git clone https://github.com/microsoft/human-pose-estimation.pytorch.git

Cloning into 'human-pose-estimation.pytorch'...
remote: Enumerating objects: 197, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 197 (delta 1), reused 1 (delta 0), pack-reused 188
Receiving objects: 100% (197/197), 73.71 KiB | 443.00 KiB/s, done.
Resolving deltas: 100% (87/87), done.


In [None]:
%%writefile /content/human-pose-estimation.pytorch/requirements.txt
EasyDict==1.7
opencv-python>=3.4.1.15
Cython
scipy
pandas
pyyaml
json_tricks
scikit-image
tensorboardX>=1.2
torchvision

Overwriting /content/human-pose-estimation.pytorch/requirements.txt


In [None]:
!pip install -r /content/human-pose-estimation.pytorch/requirements.txt

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting EasyDict==1.7
  Downloading easydict-1.7.tar.gz (6.2 kB)
  Preparing metadata (setup.py) ... done
Collecting json_tricks
  Downloading json_tricks-3.16.1-py2.py3-none-any.whl (27 kB)
Collecting tensorboardX>=1.2
  Downloading tensorboardX-2.6-py2.py3-none-any.whl (114 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 114.5/114.5 KB 13.8 MB/s eta 0:00:00
Building wheels for collected packages: EasyDict
  Building wheel for EasyDict (setup.py) ... done
  Created wheel for EasyDict: filename=easydict-1.7-py3-none-any.whl size=6120 sha256=c532c26e4c077307449189b4366f8692527a46095790fa7ce1e472d4925f26bd
  Stored in directory: /root/.cache/pip/wheels/45/8c/cf/effa1da676cabed573e1a13ed9cb03d2455f4f6fd2bbce2c25
Successfully built EasyDict
Installing collected packages: json_tricks, EasyDict, tensorboardX
  Attempting uninsta

In [None]:
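%%writefile /content/human-pose-estimation.pytorch/lib/models/pose_resnet.py
# Upstream pose_resnet.py with its convolution, batch-norm, activation and pooling layers replaced by DeltaCNN (deltacnn.DC*) equivalents.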


# ------------------------------------------------------------------------------
# Copyright (c) Microsoft
# Licensed under the MIT License.
# Written by Bin Xiao (Bin.Xiao@microsoft.com)
# ------------------------------------------------------------------------------

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import logging

import torch
import torch.nn as nn
from collections import OrderedDict

import deltacnn


BN_MOMENTUM = 0.1
logger = logging.getLogger(__name__)


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return deltacnn.DCConv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=True)


class BasicBlock(deltacnn.DCModule):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = deltacnn.DCBatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.relu = deltacnn.DCActivation(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = deltacnn.DCBatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.downsample = downsample
        self.stride = stride

        self.sparse_add = deltacnn.DCAdd(activation = "relu")

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out = self.sparse_add(out, residual)
        #out = self.relu(out)

        return out


class Bottleneck(deltacnn.DCModule):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = deltacnn.DCConv2d(inplanes, planes, kernel_size=1, bias=True)
        self.bn1 = deltacnn.DCBatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv2 = deltacnn.DCConv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=True)
        self.bn2 = deltacnn.DCBatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv3 = deltacnn.DCConv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=True)
        self.bn3 = deltacnn.DCBatchNorm2d(planes * self.expansion,
                                  momentum=BN_MOMENTUM)
        self.relu = deltacnn.DCActivation(inplace=True)
        self.downsample = downsample
        self.stride = stride

        self.sparsify = deltacnn.DCSparsify()
        self.densify = deltacnn.DCDensify()

        self.sparse_add = deltacnn.DCAdd(activation = "relu")

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)


        if self.downsample is not None:
            residual = self.downsample(x)

        out = self.sparse_add(residual, out)

        #out = self.relu(out)

        return out


class PoseResNet(deltacnn.DCModule):

    def __init__(self, block, layers, cfg, **kwargs):
        self.inplanes = 64
        extra = cfg.MODEL.EXTRA
        self.deconv_with_bias = extra.DECONV_WITH_BIAS

        super(PoseResNet, self).__init__()
        self.conv1 = deltacnn.DCConv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=True)
        self.bn1 = deltacnn.DCBatchNorm2d(64, momentum=BN_MOMENTUM)
        self.relu = deltacnn.DCActivation(inplace=True)
        self.maxpool = deltacnn.DCMaxPooling(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)

        self.sparsify = deltacnn.DCSparsify()
        self.densify = deltacnn.DCDensify()

        # used for deconv layers
        self.deconv_layers = self._make_deconv_layer(
            extra.NUM_DECONV_LAYERS,
            extra.NUM_DECONV_FILTERS,
            extra.NUM_DECONV_KERNELS,
        )

        self.final_layer = deltacnn.DCConv2d(
            in_channels=extra.NUM_DECONV_FILTERS[-1],
            out_channels=cfg.MODEL.NUM_JOINTS,
            kernel_size=extra.FINAL_CONV_KERNEL,
            stride=1,
            padding=1 if extra.FINAL_CONV_KERNEL == 3 else 0
        )

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                deltacnn.DCConv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=True),
                deltacnn.DCBatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

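    def _get_deconv_cfg(self, deconv_kernel, index):
        # Each supported kernel size is paired with the padding that makes a
        # stride-2 transposed convolution exactly double the spatial size.
        if deconv_kernel == 4:
            padding = 1
            output_padding = 0
        elif deconv_kernel == 3:
            padding = 1
            output_padding = 1
        elif deconv_kernel == 2:
            padding = 0
            output_padding = 0
        else:
            raise ValueError(
                'Unsupported deconv kernel size: {}'.format(deconv_kernel))

        return deconv_kernel, padding, output_padding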

    def _make_deconv_layer(self, num_layers, num_filters, num_kernels):
        assert num_layers == len(num_filters), \
            'ERROR: num_deconv_layers is different from len(num_deconv_filters)'
        assert num_layers == len(num_kernels), \
            'ERROR: num_deconv_layers is different from len(num_deconv_kernels)'

        layers = []
        for i in range(num_layers):
            kernel, padding, output_padding = \
                self._get_deconv_cfg(num_kernels[i], i)

            planes = num_filters[i]
            layers.append(
                deltacnn.DCConvTranspose2d(
                    in_channels=self.inplanes,
                    out_channels=planes,
                    kernel_size=kernel,
                    stride=2,
                    padding=padding,
                    output_padding=output_padding,
                    bias=self.deconv_with_bias))
            layers.append(deltacnn.DCBatchNorm2d(planes, momentum=BN_MOMENTUM))
            layers.append(deltacnn.DCActivation(inplace=True))
            self.inplanes = planes

        return nn.Sequential(*layers)

    def forward(self, x):
        x = x.cuda().contiguous(memory_format=torch.channels_last)
        x = self.sparsify(x)

        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.deconv_layers(x)
        x = self.final_layer(x)

        x = self.densify(x)

        logger.debug('completed one forward pass')

        return x

    def init_weights(self, pretrained=''):
        if os.path.isfile(pretrained):
            logger.info('=> init deconv weights from normal distribution')
            for name, m in self.deconv_layers.named_modules():
                if isinstance(m, nn.ConvTranspose2d):
                    logger.info('=> init {}.weight as normal(0, 0.001)'.format(name))
                    logger.info('=> init {}.bias as 0'.format(name))
                    nn.init.normal_(m.weight, std=0.001)
                    if self.deconv_with_bias:
                        nn.init.constant_(m.bias, 0)
                elif isinstance(m, nn.BatchNorm2d):
                    logger.info('=> init {}.weight as 1'.format(name))
                    logger.info('=> init {}.bias as 0'.format(name))
                    nn.init.constant_(m.weight, 1)
                    nn.init.constant_(m.bias, 0)
            logger.info('=> init final conv weights from normal distribution')
            for m in self.final_layer.modules():
                if isinstance(m, nn.Conv2d):
                    # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                    # `name` from the deconv loop above is stale here, so log the layer explicitly
                    logger.info('=> init final_layer.weight as normal(0, 0.001)')
                    logger.info('=> init final_layer.bias as 0')
                    nn.init.normal_(m.weight, std=0.001)
                    nn.init.constant_(m.bias, 0)

            # pretrained_state_dict = torch.load(pretrained)
            logger.info('=> loading pretrained model {}'.format(pretrained))
            # self.load_state_dict(pretrained_state_dict, strict=False)
            checkpoint = torch.load(pretrained)
            if isinstance(checkpoint, OrderedDict):
                state_dict = checkpoint
            elif isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
                state_dict_old = checkpoint['state_dict']
                state_dict = OrderedDict()
                # delete 'module.' because it is saved from DataParallel module
                for key in state_dict_old.keys():
                    if key.startswith('module.'):
                        # state_dict[key[7:]] = state_dict[key]
                        # state_dict.pop(key)
                        state_dict[key[7:]] = state_dict_old[key]
                    else:
                        state_dict[key] = state_dict_old[key]
            else:
                raise RuntimeError(
                    'No state_dict found in checkpoint file {}'.format(pretrained))
            self.load_state_dict(state_dict, strict=False)
        else:
            logger.info('=> init weights from normal distribution')
            for m in self.modules():
                if isinstance(m, deltacnn.DCConv2d):
                    # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                    nn.init.normal_(m.weight, std=0.001)
                    # nn.init.constant_(m.bias, 0)
                elif isinstance(m, deltacnn.DCBatchNorm2d):
                    nn.init.constant_(m.weight, 1)
                    nn.init.constant_(m.bias, 0)
                elif isinstance(m, deltacnn.DCConvTranspose2d):
                    nn.init.normal_(m.weight, std=0.001)
                    if self.deconv_with_bias:
                        nn.init.constant_(m.bias, 0)


resnet_spec = {18: (BasicBlock, [2, 2, 2, 2]),
               34: (BasicBlock, [3, 4, 6, 3]),
               50: (Bottleneck, [3, 4, 6, 3]),
               101: (Bottleneck, [3, 4, 23, 3]),
               152: (Bottleneck, [3, 8, 36, 3])}


def get_pose_net(cfg, is_train, **kwargs):
    num_layers = cfg.MODEL.EXTRA.NUM_LAYERS
    style = cfg.MODEL.STYLE

    block_class, layers = resnet_spec[num_layers]

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    model = PoseResNet(block_class, layers, cfg, **kwargs)

    if is_train and cfg.MODEL.INIT_WEIGHTS:
        model.init_weights(cfg.MODEL.PRETRAINED)
    
    model.to(device, memory_format=torch.channels_last) # set the network in channels last mode
    model.process_filters() # convert filters into DeltaCNN format
        
    return model



Overwriting /content/human-pose-estimation.pytorch/lib/models/pose_resnet.py
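
One way to sanity-check the architecture written above is to trace the spatial size through it. The stem (a stride-2 convolution followed by stride-2 max pooling) and the stride-2 stages `layer2` through `layer4` shrink the input by a factor of 32, and each stride-2 transposed convolution configured by `_get_deconv_cfg` doubles it again. The sketch below applies the standard PyTorch output-size formulas in plain Python. The function names are hypothetical, and it assumes dilation 1 and three deconvolution layers.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding, output_padding):
    """Output length of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# (padding, output_padding) per deconv kernel size, as in _get_deconv_cfg
DECONV_CFG = {4: (1, 0), 3: (1, 1), 2: (0, 0)}


def heatmap_size(image_size, deconv_kernels=(4, 4, 4)):
    """Trace one spatial dimension from the input image to the output heatmap."""
    size = conv_out(image_size, kernel=7, stride=2, padding=3)  # conv1
    size = conv_out(size, kernel=3, stride=2, padding=1)        # maxpool
    for _ in range(3):                                          # layer2..layer4
        size = conv_out(size, kernel=3, stride=2, padding=1)
    for kernel in deconv_kernels:                               # deconv_layers
        padding, output_padding = DECONV_CFG[kernel]
        size = deconv_out(size, kernel, 2, padding, output_padding)
    return size


print(heatmap_size(256), heatmap_size(192))  # prints: 64 48
```

The overall stride is therefore 4: a 256-pixel side maps to 64 heatmap pixels and a 192-pixel side to 48, whichever of the three supported deconvolution kernel sizes is used.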


In [None]:
%cd /content/human-pose-estimation.pytorch/lib/
!make

/content/human-pose-estimation.pytorch/lib
cd nms; python setup.py build_ext --inplace; rm -rf build; cd ../../
running build_ext
cythoning cpu_nms.pyx to cpu_nms.c
  tree = Parsing.p_module(s, pxd, full_module_name)
cythoning gpu_nms.pyx to gpu_nms.cpp
  tree = Parsing.p_module(s, pxd, full_module_name)
building 'cpu_nms' extension
creating build
creating build/temp.linux-x86_64-3.9
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.9/dist-packages/numpy/core/include -I/usr/include/python3.9 -c cpu_nms.c -o build/temp.linux-x86_64-3.9/cpu_nms.o -Wno-cpp -Wno-unused-function
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 -Wl,-Bsymbolic-functions -g -fwrapv -O2 -g -fstack-protector-str

In [None]:
%cd /content/
!git clone https://github.com/cocodataset/cocoapi.git
%cd /content/cocoapi/PythonAPI

!make install

/content
Cloning into 'cocoapi'...
remote: Enumerating objects: 975, done.
remote: Total 975 (delta 0), reused 0 (delta 0), pack-reused 975
Receiving objects: 100% (975/975), 11.72 MiB | 25.93 MiB/s, done.
Resolving deltas: 100% (576/576), done.
/content/cocoapi/PythonAPI
# install pycocotools to the Python site-packages
python setup.py build_ext install
running build_ext
cythoning pycocotools/_mask.pyx to pycocotools/_mask.c
  tree = Parsing.p_module(s, pxd, full_module_name)
building 'pycocotools._mask' extension
creating build
creating build/common
creating build/temp.linux-x86_64-3.9
creating build/temp.linux-x86_64-3.9/pycocotools
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.9/dist-packages/numpy/core/include -I../common -I/usr/in

In [None]:
!mkdir -p /content/human-pose-estimation.pytorch/models/pytorch/pose_coco

%cd /content/human-pose-estimation.pytorch/models/pytorch/pose_coco/
!gdown --fuzzy "https://drive.google.com/file/d/1DIhf0DoyHjTkk_14BshTAdbgaa9ApnET/view?usp=share_link"
%cd /content/human-pose-estimation.pytorch/


/content/human-pose-estimation.pytorch/models/pytorch/pose_coco
Downloading...
From: https://drive.google.com/uc?id=1DIhf0DoyHjTkk_14BshTAdbgaa9ApnET
To: /content/human-pose-estimation.pytorch/models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar
100% 136M/136M [00:00<00:00, 152MB/s]
/content/human-pose-estimation.pytorch
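
Checkpoints saved from a `DataParallel` model carry a `module.` prefix on every key, which `init_weights` in `pose_resnet.py` strips before calling `load_state_dict`. Whether the file downloaded here needs this is not shown in the notebook. In isolation, the step can be sketched as follows; the helper name `strip_module_prefix` is hypothetical, and the toy dictionary stands in for real tensors.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of `state_dict` with a leading `prefix` removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Toy stand-in for a checkpoint; in practice the values would be tensors.
toy = OrderedDict([("module.conv1.weight", 1), ("bn1.bias", 2)])
print(list(strip_module_prefix(toy)))  # prints: ['conv1.weight', 'bn1.bias']
```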


## Dataset processing

In [None]:
!mkdir -p /content/human-pose-estimation.pytorch/data/

In [None]:
!pip install fiftyone

import fiftyone as fo
import fiftyone.zoo as foz

fo.config.dataset_zoo_dir = "/content/human-pose-estimation.pytorch/data/"

dataset = foz.load_zoo_dataset(
    "coco-2017",
    splits=["validation"]
)


#dataset = fo.Dataset.from_dir(
#    dataset_type = fo.types.COCODetectionDataset,
#    label_types = ["detections", "segmentations", "keypoints"],
#   dataset_dir = "/root/fiftyone/coco-2017/validation",
#   labels_path ="/root/fiftyone/coco-2017/raw/person_keypoints_val2017.json")

#session = fo.launch_app(dataset)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fiftyone
  Downloading fiftyone-0.20.1-py3-none-any.whl (7.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.3/7.3 MB 45.5 MB/s eta 0:00:00
Collecting argcomplete
  Downloading argcomplete-3.0.5-py3-none-any.whl (40 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.2/40.2 KB 5.4 MB/s eta 0:00:00
Collecting pprintpp
  Downloading pprintpp-0.4.0-py2.py3-none-any.whl (16 kB)
Collecting boto3
  Downloading boto3-1.26.107-py3-none-any.whl (135 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 135.6/135.6 KB 16.2 MB/s eta 0:00:00
Collecting hypercorn>=0.13.2
  Downloading Hypercorn-0.14.3-py3-none-any.whl (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.9/57.9 KB 7.4 MB/s eta 0:00:00
Collecting starlet
INFO:fiftyone.migrations.runner:Migrating database to v0.20.1
Downloading split 'validation' to '/content/human-pose-estimation.pytorch/data/coco-2017/validation' if necessary
Downloading annotations to '/content/human-pose-estimation.pytorch/data/coco-2017/tmp-download/annotations_trainval2017.zip'
 100% |██████|    1.9Gb/1.9Gb [13.9s elapsed, 0s remaining, 251.3Mb/s]
Extracting annotations to '/content/human-pose-estimation.pytorch/data/coco-2017/raw/instances_val2017.json'
Downloading images to '/content/human-pose-estimation.pytorch/data/coco-2017/tmp-download/val2017.zip'
 100% |██████|    6.1Gb/6.1Gb [24.0s elapsed, 0s remaining, 272.8Mb/s]
Extracting images to '/content/human-pose-estimation.pytorch/data/coco-2017/validation/data'
Writing annotations to '/content/human-pose-estimation.pytorch/data/coco-2017/validation/labels.json'
Dataset info written to '/content/human-pose-estimation.pytorch/data/coco-2017/info.json'
Loading 'coco-2017' split 'validation'
 100% |███████████████| 5000/5000 [31.8s elapsed, 0s remaining, 161.7 samples/s]
Dataset 'coco-2017-validation' created


In [None]:
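import os

# Rename fiftyone's download layout to the one the pose-estimation repo expects:
# data/coco/annotations and data/coco/images/val2017.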
os.rename("/content/human-pose-estimation.pytorch/data/coco-2017","/content/human-pose-estimation.pytorch/data/coco")
os.rename("/content/human-pose-estimation.pytorch/data/coco/raw","/content/human-pose-estimation.pytorch/data/coco/annotations")
os.rename("/content/human-pose-estimation.pytorch/data/coco/validation","/content/human-pose-estimation.pytorch/data/coco/images")
os.rename("/content/human-pose-estimation.pytorch/data/coco/images/data","/content/human-pose-estimation.pytorch/data/coco/images/val2017")

In [None]:
%%writefile /content/human-pose-estimation.pytorch/lib/core/config.py

# ------------------------------------------------------------------------------
# Copyright (c) Microsoft
# Licensed under the MIT License.
# Written by Bin Xiao (Bin.Xiao@microsoft.com)
# ------------------------------------------------------------------------------

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import yaml

import numpy as np
from easydict import EasyDict as edict


config = edict()

config.OUTPUT_DIR = ''
config.LOG_DIR = ''
config.DATA_DIR = ''
config.GPUS = '0'
config.WORKERS = 4
config.PRINT_FREQ = 20

# Cudnn related params
config.CUDNN = edict()
config.CUDNN.BENCHMARK = True
config.CUDNN.DETERMINISTIC = False
config.CUDNN.ENABLED = True

# pose_resnet related params
POSE_RESNET = edict()
POSE_RESNET.NUM_LAYERS = 50
POSE_RESNET.DECONV_WITH_BIAS = False
POSE_RESNET.NUM_DECONV_LAYERS = 3
POSE_RESNET.NUM_DECONV_FILTERS = [256, 256, 256]
POSE_RESNET.NUM_DECONV_KERNELS = [4, 4, 4]
POSE_RESNET.FINAL_CONV_KERNEL = 1
POSE_RESNET.TARGET_TYPE = 'gaussian'
POSE_RESNET.HEATMAP_SIZE = [64, 64]  # width * height, ex: 24 * 32
POSE_RESNET.SIGMA = 2

MODEL_EXTRAS = {
    'pose_resnet': POSE_RESNET,
}

# common params for NETWORK
config.MODEL = edict()
config.MODEL.NAME = 'pose_resnet'
config.MODEL.INIT_WEIGHTS = True
config.MODEL.PRETRAINED = ''
config.MODEL.NUM_JOINTS = 16
config.MODEL.IMAGE_SIZE = [256, 256]  # width * height, ex: 192 * 256
config.MODEL.EXTRA = MODEL_EXTRAS[config.MODEL.NAME]

config.MODEL.STYLE = 'pytorch'

config.LOSS = edict()
config.LOSS.USE_TARGET_WEIGHT = True

# DATASET related params
config.DATASET = edict()
config.DATASET.ROOT = ''
config.DATASET.DATASET = 'coco'
config.DATASET.TRAIN_SET = 'train'
config.DATASET.TEST_SET = 'valid'
config.DATASET.DATA_FORMAT = 'jpg'
config.DATASET.HYBRID_JOINTS_TYPE = ''
config.DATASET.SELECT_DATA = False

# training data augmentation
config.DATASET.FLIP = True
config.DATASET.SCALE_FACTOR = 0.25
config.DATASET.ROT_FACTOR = 30

# train
config.TRAIN = edict()

config.TRAIN.LR_FACTOR = 0.1
config.TRAIN.LR_STEP = [90, 110]
config.TRAIN.LR = 0.001

config.TRAIN.OPTIMIZER = 'adam'
config.TRAIN.MOMENTUM = 0.9
config.TRAIN.WD = 0.0001
config.TRAIN.NESTEROV = False
config.TRAIN.GAMMA1 = 0.99
config.TRAIN.GAMMA2 = 0.0

config.TRAIN.BEGIN_EPOCH = 0
config.TRAIN.END_EPOCH = 140

config.TRAIN.RESUME = False
config.TRAIN.CHECKPOINT = ''

config.TRAIN.BATCH_SIZE = 32
config.TRAIN.SHUFFLE = True

# testing
config.TEST = edict()

# size of images for each device
config.TEST.BATCH_SIZE = 32
# Test Model Epoch
config.TEST.FLIP_TEST = False
config.TEST.POST_PROCESS = True
config.TEST.SHIFT_HEATMAP = True

config.TEST.USE_GT_BBOX = False
# nms
config.TEST.OKS_THRE = 0.5
config.TEST.IN_VIS_THRE = 0.0
config.TEST.COCO_BBOX_FILE = ''
config.TEST.BBOX_THRE = 1.0
config.TEST.MODEL_FILE = ''
config.TEST.IMAGE_THRE = 0.0
config.TEST.NMS_THRE = 1.0

# debug
config.DEBUG = edict()
config.DEBUG.DEBUG = False
config.DEBUG.SAVE_BATCH_IMAGES_GT = False
config.DEBUG.SAVE_BATCH_IMAGES_PRED = False
config.DEBUG.SAVE_HEATMAPS_GT = False
config.DEBUG.SAVE_HEATMAPS_PRED = False


def _update_dict(k, v):
    if k == 'DATASET':
        if 'MEAN' in v and v['MEAN']:
            v['MEAN'] = np.array([eval(x) if isinstance(x, str) else x
                                  for x in v['MEAN']])
        if 'STD' in v and v['STD']:
            v['STD'] = np.array([eval(x) if isinstance(x, str) else x
                                 for x in v['STD']])
    if k == 'MODEL':
        if 'EXTRA' in v and 'HEATMAP_SIZE' in v['EXTRA']:
            if isinstance(v['EXTRA']['HEATMAP_SIZE'], int):
                v['EXTRA']['HEATMAP_SIZE'] = np.array(
                    [v['EXTRA']['HEATMAP_SIZE'], v['EXTRA']['HEATMAP_SIZE']])
            else:
                v['EXTRA']['HEATMAP_SIZE'] = np.array(
                    v['EXTRA']['HEATMAP_SIZE'])
        if 'IMAGE_SIZE' in v:
            if isinstance(v['IMAGE_SIZE'], int):
                v['IMAGE_SIZE'] = np.array([v['IMAGE_SIZE'], v['IMAGE_SIZE']])
            else:
                v['IMAGE_SIZE'] = np.array(v['IMAGE_SIZE'])
    for vk, vv in v.items():
        if vk in config[k]:
            config[k][vk] = vv
        else:
            raise ValueError("{}.{} does not exist in config.py".format(k, vk))


def update_config(config_file):
    exp_config = None
    with open(config_file) as f:
        exp_config = edict(yaml.full_load(f))
        for k, v in exp_config.items():
            if k in config:
                if isinstance(v, dict):
                    _update_dict(k, v)
                else:
                    if k == 'SCALES':
                        config[k][0] = (tuple(v))
                    else:
                        config[k] = v
            else:
                raise ValueError("{} does not exist in config.py".format(k))


def gen_config(config_file):
    cfg = dict(config)
    for k, v in cfg.items():
        if isinstance(v, edict):
            cfg[k] = dict(v)

    with open(config_file, 'w') as f:
        yaml.dump(dict(cfg), f, default_flow_style=False)


def update_dir(model_dir, log_dir, data_dir):
    if model_dir:
        config.OUTPUT_DIR = model_dir

    if log_dir:
        config.LOG_DIR = log_dir

    if data_dir:
        config.DATA_DIR = data_dir

    config.DATASET.ROOT = os.path.join(
            config.DATA_DIR, config.DATASET.ROOT)

    config.TEST.COCO_BBOX_FILE = os.path.join(
            config.DATA_DIR, config.TEST.COCO_BBOX_FILE)

    config.MODEL.PRETRAINED = os.path.join(
            config.DATA_DIR, config.MODEL.PRETRAINED)


def get_model_name(cfg):
    name = cfg.MODEL.NAME
    full_name = cfg.MODEL.NAME
    extra = cfg.MODEL.EXTRA
    if name in ['pose_resnet']:
        name = '{model}_{num_layers}'.format(
            model=name,
            num_layers=extra.NUM_LAYERS)
        deconv_suffix = ''.join(
            'd{}'.format(num_filters)
            for num_filters in extra.NUM_DECONV_FILTERS)
        full_name = '{height}x{width}_{name}_{deconv_suffix}'.format(
            height=cfg.MODEL.IMAGE_SIZE[1],
            width=cfg.MODEL.IMAGE_SIZE[0],
            name=name,
            deconv_suffix=deconv_suffix)
    else:
        raise ValueError('Unknown model: {}'.format(cfg.MODEL))

    return name, full_name


if __name__ == '__main__':
    import sys
    gen_config(sys.argv[1])


Overwriting /content/human-pose-estimation.pytorch/lib/core/config.py
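
`update_config` and `_update_dict` above overlay the experiment YAML on the defaults and raise a `ValueError` for any key the defaults do not define, which catches typos in experiment files early. A stripped-down sketch of that behavior with plain dictionaries follows. It uses no YAML or EasyDict, the function name `overlay` is hypothetical, and unlike the original it recurses to any depth rather than handling one nested level. The values are illustrative only.

```python
def overlay(defaults, overrides, path=""):
    """Recursively copy `overrides` into `defaults`, rejecting unknown keys."""
    for key, value in overrides.items():
        full_key = f"{path}{key}"
        if key not in defaults:
            raise ValueError(f"{full_key} does not exist in the defaults")
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            overlay(defaults[key], value, path=f"{full_key}.")
        else:
            defaults[key] = value
    return defaults


defaults = {"GPUS": "0", "MODEL": {"NUM_JOINTS": 16, "IMAGE_SIZE": [256, 256]}}
overlay(defaults, {"MODEL": {"NUM_JOINTS": 17, "IMAGE_SIZE": [192, 256]}})
print(defaults["MODEL"])  # prints: {'NUM_JOINTS': 17, 'IMAGE_SIZE': [192, 256]}
```

Failing loudly on unknown keys is the useful property here: a misspelled option in an experiment file stops the run instead of being silently ignored.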


In [None]:
%cd /content/human-pose-estimation.pytorch/

!python pose_estimation/valid.py \
    --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml \
    --flip-test \
    --model-file models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar

/content/human-pose-estimation.pytorch
=> creating output/coco/pose_resnet_50/256x192_d256x3_adam_lr1e-3
=> creating log/coco/pose_resnet_50/256x192_d256x3_adam_lr1e-3_2023-04-06-19-00
Namespace(cfg='experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml', frequent=100, gpus=None, workers=None, model_file='models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar', use_detect_bbox=False, flip_test=True, post_process=False, shift_heatmap=False, coco_bbox_file=None)
{'CUDNN': {'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True},
 'DATASET': {'DATASET': 'coco',
             'DATA_FORMAT': 'jpg',
             'FLIP': True,
             'HYBRID_JOINTS_TYPE': '',
             'ROOT': 'data/coco/',
             'ROT_FACTOR': 40,
             'SCALE_FACTOR': 0.3,
             'SELECT_DATA': False,
             'TEST_SET': 'val2017',
             'TRAIN_SET': 'val2017'},
 'DATA_DIR': '',
 'DEBUG': {'DEBUG': True,
           'SAVE_BATCH_IMAGES_GT': True,
           'SAVE_BATCH_IMAGES_PR

In [None]:
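# pip cannot take conda's `cudatoolkit=11.3` or `-c pytorch -c conda-forge`: it reads `-c` as a
# constraints file, which is why the original command failed with "Could not open requirements file".
# Assumption: the CUDA 11.1 wheels are an acceptable substitute for cudatoolkit 11.3.
# Changing the torch version after building DeltaCNN will likely require rebuilding it.
!pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html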

In [None]:
%cd /content/human-pose-estimation.pytorch/
!python pose_estimation/train.py \
    --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml

/content/human-pose-estimation.pytorch
=> creating output/coco/pose_resnet_18/256x192_d256x3_adam_lr1e-3
=> creating log/coco/pose_resnet_18/256x192_d256x3_adam_lr1e-3_2023-04-06-19-35
Namespace(cfg='experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml', frequent=100, gpus=None, workers=None)
{'CUDNN': {'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True},
 'DATASET': {'DATASET': 'coco',
             'DATA_FORMAT': 'jpg',
             'FLIP': True,
             'HYBRID_JOINTS_TYPE': '',
             'ROOT': 'data/coco/',
             'ROT_FACTOR': 40,
             'SCALE_FACTOR': 0.3,
             'SELECT_DATA': False,
             'TEST_SET': 'val2017',
             'TRAIN_SET': 'val2017'},
 'DATA_DIR': '',
 'DEBUG': {'DEBUG': True,
           'SAVE_BATCH_IMAGES_GT': True,
           'SAVE_BATCH_IMAGES_PRED': True,
           'SAVE_HEATMAPS_GT': True,
           'SAVE_HEATMAPS_PRED': True},
 'GPUS': '0',
 'LOG_DIR': 'log',
 'LOSS': {'USE_TARGET_WEIGHT': True},
 'MODEL': {'