
Can noise be added to dataset? #1357

Closed · YuktiADY opened this issue May 5, 2022 · 68 comments
Labels: kind/discussion (community discussion)

Comments

YuktiADY commented May 5, 2022

Hello Team,

I was training the HRNet model and trying to improve its accuracy. I have trained the model many times, which may lead to overfitting.

I would like to know whether it is possible to augment the data with random noise in MMPose.

Where should I look in the MMPose code, and how can we do this?

Please suggest!

liqikai9 (Collaborator) commented May 6, 2022

Hi, you can add your custom data pipeline which can handle data preprocessing. For more detail, please refer to this tutorial: https://github.com/open-mmlab/mmpose/blob/master/docs/en/tutorials/3_data_pipeline.md#extend-and-use-custom-pipelines
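Following that tutorial, a minimal sketch of such a custom step might look like the following (the class name AddGaussianNoise and its parameters are hypothetical, not an existing MMPose transform; the PIPELINES registry import is the one the tutorial uses):

import numpy as np

from mmpose.datasets import PIPELINES


@PIPELINES.register_module()
class AddGaussianNoise:
    """Hypothetical transform that adds pixel-level Gaussian noise to the image."""

    def __init__(self, sigma=5.0, prob=0.5):
        self.sigma = sigma  # noise standard deviation, in pixel values
        self.prob = prob    # probability of applying the noise

    def __call__(self, results):
        if np.random.rand() < self.prob:
            img = results['img'].astype(np.float32)
            noise = np.random.normal(0.0, self.sigma, size=img.shape)
            results['img'] = np.clip(img + noise, 0, 255).astype(np.uint8)
        return results

Once registered, it could be referenced in a config as dict(type='AddGaussianNoise', sigma=5.0, prob=0.5), placed before ToTensor so that it operates on the raw image array.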

liqikai9 self-assigned this May 6, 2022

YuktiADY (Author) commented May 6, 2022

I mean, where can we look in the MMPose code to see if there is a possibility to add noise to the dataset?

YuktiADY (Author) commented May 6, 2022

Hi, you can add your custom data pipeline which can handle data preprocessing. For more detail, please refer to this tutorial: https://github.com/open-mmlab/mmpose/blob/master/docs/en/tutorials/3_data_pipeline.md#extend-and-use-custom-pipelines

In the linked tutorial, is this the noise being added?

dict(
    type='NormalizeTensor',
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]),

Sorry to ask such questions; I am new to this topic and need help.

YuktiADY (Author) commented May 6, 2022

This snippet is already present in the config I am training with:

dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
    type='NormalizeTensor',
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]),

liqikai9 (Collaborator) commented May 6, 2022

if there is a possibility to add noise to the dataset?

What does the noise mean here? If you mean the randomness in data preprocessing, you can find some pipelines here: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py, in which TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, and TopDownGetRandomScaleRotation can each perform a different data augmentation randomly, with a custom probability.

liqikai9 (Collaborator) commented May 6, 2022

This snippet is already present in the config I am training with:

dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
    type='NormalizeTensor',
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]),

These three pipelines (TopDownAffine, ToTensor, NormalizeTensor) do not introduce any randomness (or the noise you mean) while preparing the data.

YuktiADY (Author) commented May 6, 2022

if there is a possibility to add noise to the dataset?

What does the noise mean here? If you mean the randomness in data preprocessing, you can find some pipelines here: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py, in which TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, and TopDownGetRandomScaleRotation can each perform a different data augmentation randomly, with a custom probability.

As I understand it, augmenting the data with noise means we want to avoid overfitting and improve the performance of our model.

So, in your view, what does augmenting the data with noise mean?
Do I need to add these four classes?

YuktiADY (Author) commented May 6, 2022

Also, by looking into the MMPose code, can we augment the data with random noise?

liqikai9 (Collaborator) commented May 6, 2022

That depends on your needs. I think you can try these pipelines. By the way, which dataset are you using?

TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, and TopDownGetRandomScaleRotation can each perform a different data augmentation randomly, with a custom probability.

YuktiADY (Author) commented May 6, 2022

I have concatenated the COCO and THEODORE datasets.

YuktiADY (Author) commented May 6, 2022

So, is it possible to augment the data with random noise?

liqikai9 (Collaborator) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation.
They can perform data augmentation with random noise.

jin-s13 (Collaborator) commented May 6, 2022

@YuktiADY You can use Albumentations in MMPose; it supports various kinds of augmentation approaches.
https://mmpose.readthedocs.io/en/latest/papers/techniques.html#albumentations-information-2020

jin-s13 (Collaborator) commented May 6, 2022

For more information about albumentations, please check https://albumentations.ai/
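For instance, pixel-level noise could be injected through MMPose's Albumentation wrapper with a snippet along these lines (a sketch: GaussNoise and RandomBrightnessContrast are standard Albumentations transforms, and the var_limit and p values here are illustrative, not tuned):

dict(
    type='Albumentation',
    transforms=[
        dict(type='GaussNoise', var_limit=(10.0, 50.0), p=0.5),
        dict(type='RandomBrightnessContrast', p=0.3),
    ]),

Such a step would sit in train_pipeline after TopDownAffine and before ToTensor, so the noise acts on the cropped image array.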

YuktiADY (Author) commented May 6, 2022

@YuktiADY You can use Albumentations in MMPose; it supports various kinds of augmentation approaches. https://mmpose.readthedocs.io/en/latest/papers/techniques.html#albumentations-information-2020

I will check.

YuktiADY (Author) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation. They can perform data augmentation with random noise.

Do I simply have to add these classes to the config, with no other changes required?

YuktiADY (Author) commented May 6, 2022

@YuktiADY You can use Albumentations in MMPose; it supports various kinds of augmentation approaches. https://mmpose.readthedocs.io/en/latest/papers/techniques.html#albumentations-information-2020

Will the approach of adding the above pipelines also work?

jin-s13 (Collaborator) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation. They can perform data augmentation with random noise.

Do I simply have to add these classes to the config, with no other changes required?

These are also augmentation approaches: shifting the center, flipping, cropping the box, scaling, and rotation.
But I think what you want is to add pixel-level noise or RGB jittering, right? If so, Albumentations will meet your requirements.

YuktiADY (Author) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation. They can perform data augmentation with random noise.

Do I simply have to add these classes to the config, with no other changes required?

These are also augmentation approaches: shifting the center, flipping, cropping the box, scaling, and rotation. But I think what you want is to add pixel-level noise or RGB jittering, right? If so, Albumentations will meet your requirements.

I first want to check, based on the MMPose code, whether it is possible to add noise.
If yes, is it possible to augment the data with random noise, and how can we do that?


YuktiADY (Author) commented May 6, 2022

These are different augmentation approaches, like shifting the center, flipping, and cropping the box.
If I want to augment the data with random noise, how can I do that? Will the above pipelines work? I mean, those pipelines contain methods like flipping, etc.

liqikai9 (Collaborator) commented May 6, 2022

For flipping, you may try TopDownRandomFlip: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py#L93

YuktiADY (Author) commented May 6, 2022

For flipping, you may try TopDownRandomFlip: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py#L93

Okay, so this one is for flipping. What about adding noise? Which one augments the data with random noise?
Random noise is also another augmentation method; correct me if I am wrong.

liqikai9 (Collaborator) commented May 6, 2022

Flipping can be viewed as a method of augmenting data with random noise, as it randomly flips the image.

If you want to add pixel-level noise to the data, you can use Albumentations in MMPose.
An example of its use can be found here: https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192_coarsedropout.py#L108

YuktiADY (Author) commented May 6, 2022

Flipping can be viewed as a method of augmenting data with random noise, as it randomly flips the image.

If you want to add pixel-level noise to the data, you can use Albumentations in MMPose. An example of its use can be found here: https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192_coarsedropout.py#L108

Okay, thank you. Understood.

YuktiADY (Author) commented May 6, 2022

For flipping, you may try TopDownRandomFlip: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py#L93

So, besides adding this class to the config, are any other changes required?

YuktiADY (Author) commented May 6, 2022

I just saw that dict(type='TopDownRandomFlip', flip_prob=0.5) is already there in the config.

liqikai9 (Collaborator) commented May 6, 2022

You can try to use this in your config and see if it has better results.

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownGetBboxCenterScale', padding=1.25),
    dict(type='TopDownRandomShiftBboxCenter', shift_factor=0.16, prob=0.3),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
###########################
# add the Albumentation here
    dict(
        type='Albumentation',
        transforms=[
            dict(
                type='CoarseDropout',
                max_holes=8,
                max_height=40,
                max_width=40,
                min_holes=1,
                min_height=10,
                min_width=10,
                p=0.5),
        ]),
###########################
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

liqikai9 (Collaborator) commented May 6, 2022

I suggest you read more details about the implementation of Albumentation in MMPose: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/shared_transform.py#L190
Then change the parameters according to your needs. Hope this helps!

YuktiADY (Author) commented Jun 9, 2022

Is Simple Baseline 2D an algorithm? Because if I say I am using the Simple Baseline 2D algorithm, I am indirectly saying that I am using ResNet.
The only difference between ResNet and HRNet is that HRNet uses a different feature extractor. Is that the main difference?

liqikai9 (Collaborator) commented Jun 9, 2022

Is Simple Baseline 2D an algorithm?

Yes, you can put it that way.

Is that the main difference?

Yes, HRNet and ResNet are two different feature extractors.

YuktiADY (Author) commented:

How can I test the pre-trained models that were trained on COCO and evaluate them on my test dataset?
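For reference, a typical MMPose 0.x evaluation call looks roughly like this (a sketch: the config and checkpoint paths are placeholders, and the config's val/test entries must point at your own annotation files):

python tools/test.py \
    configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_384x288.py \
    hrnet_w48_coco_384x288-314c8528_20200708.pth \
    --eval mAP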

YuktiADY (Author) commented Jun 16, 2022

I made these changes in the config:
_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/coco_wholebody.py']
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'

test_pipeline = val_pipeline

data_root = 'data/coco'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TopDownCocoWholeBodyDataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TopDownCocoWholeBodyDataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

The script I used for testing is this:

./mmpose/tools/dist_test.sh ./FES_Results_COCO/hrnet_w348_coco_wholebody_388x288.py "/home/yukti/Downloads/hrnet_w48_coco_384x288-314c8528_20200708.pth" 1 --eval mAP

But I am getting a size mismatch error:

load checkpoint from local path: /home/yukti/Downloads/hrnet_w48_coco_384x288-314c8528_20200708.pth
The model and loaded state dict do not match exactly

size mismatch for keypoint_head.final_layer.weight: copying a param with shape torch.Size([17, 48, 1, 1]) from checkpoint, the shape in current model is torch.Size([133, 48, 1, 1]).
size mismatch for keypoint_head.final_layer.bias: copying a param with shape torch.Size([17]) from checkpoint, the shape in current model is torch.Size([133]).

Is what I am doing above correct?

Awaiting your response.

liqikai9 (Collaborator) commented:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'

This checkpoint seems to be a model trained on the COCO dataset, not on the COCO-WholeBody dataset. Please choose another appropriate checkpoint file. You can find one here.

YuktiADY (Author) commented Jun 17, 2022

Please find my changes in the config below.

I used the COCO-WholeBody checkpoint, but I still get the following.

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'
resume_from = None

data_root = 'TheodorePlusV2Dataset'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_keypoints_scenario2.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

I get a size mismatch error and AP = 0.0:
The model and loaded state dict do not match exactly

size mismatch for keypoint_head.final_layer.weight: copying a param with shape torch.Size([133, 48, 1, 1]) from checkpoint, the shape in current model is torch.Size([17, 48, 1, 1]).
size mismatch for keypoint_head.final_layer.bias: copying a param with shape torch.Size([133]) from checkpoint, the shape in current model is torch.Size([17]).
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 736/736, 19.5 task/s, elapsed: 38s, ETA: 0s
Loading and preparing results...
DONE (t=0.03s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=0.12s).
Accumulating evaluation results...
DONE (t=0.01s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.001
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.025
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
AP: 2.3954008304056212e-05
AP (L): 0.0
AP (M): 0.0002155860747365059
AP .5: 7.984669434685404e-05
AP .75: 0.0
AR: 0.0004081632653061224
AR (L): 0.0
AR (M): 0.025
AR .5: 0.0013605442176870747
AR .75: 0.0

When I gave the checkpoint for COCO, it gave results with AP.

I even tried testing the ResNet-50 model with the checkpoint for the COCO dataset: it ran and gave results, but when I run it with the COCO-WholeBody setup it gives the size mismatch error. I even checked the number of keypoints.

Is there a problem if we give a checkpoint for COCO while the dataset type in the config is TopDownCocoWholeBodyDataset?
Will there be a difference in the AP results?

liqikai9 (Collaborator) commented:

Could you please provide the config for your model? It seems the model you are using outputs 17 channels, but the checkpoint you are using outputs 133 channels.

size mismatch for keypoint_head.final_layer.weight: copying a param with shape torch.Size([133, 48, 1, 1]) from checkpoint, the shape in current model is torch.Size([17, 48, 1, 1]).

Please use the model that matches your expected output.
If you would like to test on a dataset like the COCO-WholeBody dataset (which has 133 keypoints and thus needs to output 133 channels), please use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'

If you would like to test on a dataset like the COCO dataset (which has 17 keypoints and thus needs to output 17 channels), please use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'
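As a quick sanity check before testing, the head shape stored in a checkpoint can be inspected directly (a sketch: MMPose checkpoints keep their weights under the 'state_dict' key, and the key name below is taken from the error message above):

import torch

ckpt = torch.load('hrnet_w48_coco_384x288-314c8528_20200708.pth', map_location='cpu')
w = ckpt['state_dict']['keypoint_head.final_layer.weight']
print(w.shape)  # torch.Size([17, 48, 1, 1]) -> a 17-keypoint (COCO-style) head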

YuktiADY (Author) commented:

Please find the config below

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=10, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup=None,
    # warmup='linear',
    # warmup_iters=500,
    # warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 210
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

# model settings
model = dict(
    type='TopDown',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrnet_w48-8ef0771d.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(48, 96)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(48, 96, 192)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(48, 96, 192, 384))),
    ),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=48,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_layers=0,
        extra=dict(final_conv_kernel=1, ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[288, 384],
    heatmap_size=[72, 96],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    #bbox_file='data/coco/person_detection_results/'
    #'COCO_val2017_detections_AP_H_56_person.json',
    bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_bboxes_scenario1.json',
    #bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_bboxes_scenario2.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=3),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data_root = 'TheodorePlusV2Dataset'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_keypoints_scenario2.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

Also, I would like to ask: if lr = 5e-4 and I reduce it by a factor of 10, then lr = 5e-5, right? Which of the two learning rates is best?


liqikai9 (Collaborator) commented:

Please change this parameter according to your dataset:

channel_cfg = dict(
    num_output_channels=17,

Also, if lr = 5e-4 and I reduce it by a factor of 10, then lr = 5e-5? Which is best out of these two?

I am afraid these hyperparameters may need to be tuned on your own dataset.

YuktiADY (Author) commented Jun 18, 2022

Please change this parameter according to your dataset:

channel_cfg = dict(
    num_output_channels=17,

I am afraid these hyperparameters may need to be tuned on your own dataset.

Yes, earlier it was 133, so I changed it to 17. Do you mean only num_output_channels, or num_keypoints as well?

Is the lr incorrect? Because by default the config has lr = 5e-4.

liqikai9 (Collaborator) commented:

Do you mean only num_output_channels, or num_keypoints as well?

I mean the number of channels should match the number of output channels you need for your own dataset.

YuktiADY (Author) commented Jun 18, 2022

Yes, I changed it, but it is still not working.

Can you check this part? I guess there is some problem in it.

test_pipeline = val_pipeline

data_root = 'TheodorePlusV2Dataset'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_keypoints_scenario2.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

How is it working when I give the checkpoint for COCO?
Also, does it make a difference to test these pre-trained models with the COCO checkpoint rather than the COCO-WholeBody one? Are the AP results affected?

liqikai9 (Collaborator) commented:

How many keypoints does TheodorePlusV2Dataset have?
If it is 17, this is a COCO-style dataset; please set the channels to 17 and use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'

If it is 133, this is a COCO-WholeBody-style dataset; please set the channels to 133 and use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'

YuktiADY (Author) commented:

Hello,

Where can I find this file: python3.x/site-packages/pycocoapi/cocoeval.py?

ly015 (Member) commented Jun 23, 2022

You can try pip show pycocotools.
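A quick sketch for locating the file (note that, judging from the tracebacks later in this thread, the environment resolves cocoeval.py from xtcocotools, MMPose's extended COCO API, rather than from pycocotools):

pip show xtcocotools
python -c "import xtcocotools.cocoeval as m; print(m.__file__)"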

YuktiADY (Author) commented:

Hello,

I am trying to evaluate only 13 keypoints, so I made a few changes in the cocoeval.py file with slicing, and I am getting this error:

Traceback (most recent call last):
  File "./mmpose/tools/test.py", line 174, in <module>
    main()
  File "./mmpose/tools/test.py", line 168, in main
    results = dataset.evaluate(outputs, cfg.work_dir, eval_config)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 318, in evaluate
    info_str = self._do_python_keypoint_eval(res_file)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 372, in _do_python_keypoint_eval
    coco_eval.evaluate()
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 262, in evaluate
    for imgId in p.imgIds
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 263, in <genexpr>
    for catId in catIds}
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 407, in computeOks
    e = (dx**2 + dy**2) / vars / (gt['area'] + np.spacing(1)) / 2
ValueError: operands could not be broadcast together with shapes (13,) (17,)

liqikai9 (Collaborator) commented:

I am trying to evaluate only 13 keypoints, so I made a few changes in the cocoeval.py file with slicing

It seems the number of keypoints in your dataset, which is 13, does not match COCO's 17 keypoints.

Could you show the modifications you made?

YuktiADY (Author) commented Jun 25, 2022

I made these changes in cocoeval.py. Actually, I am only evaluating 13 of the 17 keypoints.

1. I changed this:

    #self.sigmas = np.array(
    #    [.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0
    self.sigmas = np.array(
        [.26, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

2. I added slicing for g and d in the computeOks function:

    g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])
    d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])

I checked everywhere but could not resolve this error. Please give your suggestions.

YuktiADY (Author) commented:

Yes, I set it to 17 only.

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

liqikai9 (Collaborator) commented:

g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])
d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])

What do these two lines do?

And make sure you made the modifications in
/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py

YuktiADY (Author) commented Jun 26, 2022

g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])
d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])

What do these two lines do?

And make sure you made the modifications in /home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py

Actually, in our dataset the nose keypoint occupies array indices 0, 1, 2, so we slice the nose with [0:3] and the rest with [15:], excluding the keypoints of the eyes and ears. In our dataset the eyes and ears are not detected; it is based on 13 keypoints. So we are evaluating only 13 keypoints, and I added these two slices to exclude the eyes and ears, which are not present.

But I am getting this error. Please advise.
Yes, I made the changes in the path above only.

I added the changes here; I even tried to print, but the print is not working either:

        elif p.iouType == 'keypoints_righthand':
            g = np.array(gt['righthand_kpts'])
        else:
            g = np.array(gt['keypoints'])

        g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])  # added
        xg = g[0::3]; yg = g[1::3]; vg = g[2::3]
        k1 = np.count_nonzero(vg > 0)
        bb = gt['bbox']
        x0 = bb[0] - bb[2]; x1 = bb[0] + bb[2] * 2
        y0 = bb[1] - bb[3]; y1 = bb[1] + bb[3] * 2
        for i, dt in enumerate(dts):
            if p.iouType == 'keypoints_wholebody':
                body_dt = dt['keypoints']
                foot_dt = dt['foot_kpts']
                face_dt = dt['face_kpts']
                lefthand_dt = dt['lefthand_kpts']
                righthand_dt = dt['righthand_kpts']
                wholebody_dt = body_dt + foot_dt + face_dt + lefthand_dt + righthand_dt
                d = np.array(wholebody_dt)
            elif p.iouType == 'keypoints_foot':
                d = np.array(dt['foot_kpts'])
            elif p.iouType == 'keypoints_face':
                d = np.array(dt['face_kpts'])
            elif p.iouType == 'keypoints_lefthand':
                d = np.array(dt['lefthand_kpts'])
            elif p.iouType == 'keypoints_righthand':
                d = np.array(dt['righthand_kpts'])
            else:
                d = np.array(dt['keypoints'])

            d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])  # added
            #print(d.size)
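The slicing arithmetic itself can be checked in isolation (a standalone sketch with a dummy annotation):

import numpy as np

# COCO-style keypoints are stored flat as (x, y, v) triples:
# indices 0:3 hold the nose, 3:15 the eyes and ears, 15: the remaining body joints.
kpts = list(range(17 * 3))                # dummy flattened 17-keypoint annotation
sliced = np.array(kpts[0:3] + kpts[15:])  # keep the nose, drop eyes and ears
print(sliced.size // 3)                   # -> 13 joints remain

So the slices do yield 13 keypoints; the (13,) vs (17,) broadcast error therefore points to a sigmas array somewhere that still has the other length.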

liqikai9 (Collaborator) commented Jun 27, 2022

self.sigmas = np.array(
    [.26, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

Did you change this line in cocoeval.py? I think this may be the problem.

According to this line, we are using the sigmas from the dataset info.

You can see how the sigmas are chosen here.

So actually, with your modification, you are still using the original COCO sigmas, which have 17 numbers, instead of your expected sigmas.

You can add the sigmas in the dataset info like this.

YuktiADY (Author) commented Jun 27, 2022

Yes, I have added the sigmas in cocoeval.py.

I have changed the dataset info too, as below:

sigmas=[
    0.026, 0.079, 0.079, 0.072, 0.072, 0.062, 0.062,
    0.107, 0.107, 0.087, 0.087, 0.089, 0.089
]

But I am still getting this error:
File "./mmpose/tools/test.py", line 174, in
main()
File "./mmpose/tools/test.py", line 168, in main
results = dataset.evaluate(outputs, cfg.work_dir, eval_config)
File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 311, in evaluate
keep = nms(img_kpts, oks_thr, sigmas=self.sigmas)
File "/home/yukti/mmpose/mmpose/mmpose/core/post_processing/nms.py", line 116, in oks_nms
sigmas, vis_thr)
File "/home/yukti/mmpose/mmpose/mmpose/core/post_processing/nms.py", line 81, in oks_iou
e = (dx
2 + dy**2) / vars / ((a_g + a_d[n_d]) / 2 + np.spacing(1)) / 2
ValueError: operands could not be broadcast together with shapes (17,) (13,)

Actually, the error comes from top_down_coco_dataset.py, which takes the config from coco.py, and there it uses these 17 sigma values:

0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072, 0.062,
0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089

I tried giving my 13 sigma values there but get the same error as above.

Do I need to change the sigma values in this coco.py file too?

I am not sure where the ValueError with shapes (17,) and (13,) comes from.
The error still persists. Please give your suggestions.

liqikai9 (Collaborator) commented:

May I see the config you used to instantiate the dataset object? I guess the dataset info file used to instantiate the dataset may be mistakenly specified as coco.py. In top_down_coco_dataset.py, we use coco.py as the default dataset info file.

YuktiADY (Author) commented:

## THEODORE.PY

dataset_info = dict(
    dataset_name='theodore',
    paper_info=dict(
        author='Lin, Tsung-Yi and Maire, Michael and '
        'Belongie, Serge and Hays, James and '
        'Perona, Pietro and Ramanan, Deva and '
        r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
        title='Learning from THEODORE',
        container='CVPR',
        year='2014',
        homepage='http://cocodataset.org/',
    ),
    keypoint_info={
        0:
        dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
        1:
        dict(
            name='left_eye',
            id=1,
            color=[51, 153, 255],
            type='upper',
            swap='right_eye'),
        2:
        dict(
            name='right_eye',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='left_eye'),
        3:
        dict(
            name='left_ear',
            id=3,
            color=[51, 153, 255],
            type='upper',
            swap='right_ear'),
        4:
        dict(
            name='right_ear',
            id=4,
            color=[51, 153, 255],
            type='upper',
            swap='left_ear'),
        5:
        dict(
            name='left_shoulder',
            id=5,
            color=[0, 255, 0],
            type='upper',
            swap='right_shoulder'),
        6:
        dict(
            name='right_shoulder',
            id=6,
            color=[255, 128, 0],
            type='upper',
            swap='left_shoulder'),
        7:
        dict(
            name='left_elbow',
            id=7,
            color=[0, 255, 0],
            type='upper',
            swap='right_elbow'),
        8:
        dict(
            name='right_elbow',
            id=8,
            color=[255, 128, 0],
            type='upper',
            swap='left_elbow'),
        9:
        dict(
            name='left_wrist',
            id=9,
            color=[0, 255, 0],
            type='upper',
            swap='right_wrist'),
        10:
        dict(
            name='right_wrist',
            id=10,
            color=[255, 128, 0],
            type='upper',
            swap='left_wrist'),
        11:
        dict(
            name='left_hip',
            id=11,
            color=[0, 255, 0],
            type='lower',
            swap='right_hip'),
        12:
        dict(
            name='right_hip',
            id=12,
            color=[255, 128, 0],
            type='lower',
            swap='left_hip'),
        13:
        dict(
            name='left_knee',
            id=13,
            color=[0, 255, 0],
            type='lower',
            swap='right_knee'),
        14:
        dict(
            name='right_knee',
            id=14,
            color=[255, 128, 0],
            type='lower',
            swap='left_knee'),
        15:
        dict(
            name='left_ankle',
            id=15,
            color=[0, 255, 0],
            type='lower',
            swap='right_ankle'),
        16:
        dict(
            name='right_ankle',
            id=16,
            color=[255, 128, 0],
            type='lower',
            swap='left_ankle')
    },
    skeleton_info={
        0:
        dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('right_ankle', 'right_knee'), id=2, color=[255, 128, 0]),
        3:
        dict(link=('right_knee', 'right_hip'), id=3, color=[255, 128, 0]),
        4:
        dict(link=('left_hip', 'right_hip'), id=4, color=[51, 153, 255]),
        5:
        dict(link=('left_shoulder', 'left_hip'), id=5, color=[51, 153, 255]),
        6:
        dict(link=('right_shoulder', 'right_hip'), id=6, color=[51, 153, 255]),
        7:
        dict(
            link=('left_shoulder', 'right_shoulder'),
            id=7,
            color=[51, 153, 255]),
        8:
        dict(link=('left_shoulder', 'left_elbow'), id=8, color=[0, 255, 0]),
        9:
        dict(
            link=('right_shoulder', 'right_elbow'), id=9, color=[255, 128, 0]),
        10:
        dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
        11:
        dict(link=('right_elbow', 'right_wrist'), id=11, color=[255, 128, 0]),
        12:
        dict(link=('left_eye', 'right_eye'), id=12, color=[51, 153, 255]),
        13:
        dict(link=('nose', 'left_eye'), id=13, color=[51, 153, 255]),
        14:
        dict(link=('nose', 'right_eye'), id=14, color=[51, 153, 255]),
        15:
        dict(link=('left_eye', 'left_ear'), id=15, color=[51, 153, 255]),
        16:
        dict(link=('right_eye', 'right_ear'), id=16, color=[51, 153, 255]),
        17:
        dict(link=('left_ear', 'left_shoulder'), id=17, color=[51, 153, 255]),
        18:
        dict(
            link=('right_ear', 'right_shoulder'), id=18, color=[51, 153, 255])
    },
    joint_weights=[
        1., 1., 1., 1., 1., 1., 1., 1.2, 1.2, 1.5, 1.5, 1., 1., 1.2, 1.2, 1.5,
        1.5
    ],
    sigmas=[
        0.026, 0.079, 0.079, 0.072, 0.072, 0.062, 0.062,
        0.107, 0.107, 0.087, 0.087, 0.089, 0.089
    ])

CONFIG FILE (HRNET_W48_384 x 288)

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
#load_from = None
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'
#resume_from = '/home/yukti/mmpose/theodore_2022-05-24/epoch_50.pth'
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=5)
evaluation = dict(interval=1, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[30, 45])
total_epochs = 60
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

# model settings
model = dict(
    type='TopDown',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrnet_w48-8ef0771d.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(48, 96)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(48, 96, 192)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(48, 96, 192, 384))),
    ),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=48,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_layers=0,
        extra=dict(final_conv_kernel=1, ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[288, 384],
    heatmap_size=[72, 96],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    #bbox_file='/mnt/dst_datasets/own_omni_dataset/theodore_plus_v2/coco_annotations/person_bbox_valid.json',
    #bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario2/person_bboxes_scenario2.json',
    bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_bboxes_scenario1.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(
        type='Albumentation',
        transforms=[
            dict(
                type='GaussNoise',
                var_limit=(10.0, 50.0)),
        ]),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=3),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

# dataset settings
dataset_type = 'TheodorePlusV2Dataset'

data_root = '/mnt/dst_datasets/own_omni_dataset/theodore_plus_v2'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=[
        dict(
            type=dataset_type,
            ann_file=f'{data_root}/coco_annotations/person_keypoints_train.json',
            img_prefix=f'{data_root}/train/img_png/',
            data_cfg=data_cfg,
            pipeline=train_pipeline,
            dataset_info={{_base_.dataset_info}}),
        dict(
            type='TopDownCocoDataset',
            ann_file=f'/mnt/data/yjin/coco/annotations/person_keypoints_train2017.json',
            img_prefix=f'/mnt/data/yjin/coco/images/train2017/',
            data_cfg=data_cfg,
            pipeline=train_pipeline,
            dataset_info={{_base_.dataset_info}}),
    ],
    val=dict(
        type=dataset_type,
        #ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario2/person_keypoints_scenario2.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/valid/img_png/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type=dataset_type,
        #ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario2/person_keypoints_scenario2.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #img_prefix=f'{data_root}/valid/img_png/',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

liqikai9 (Collaborator) commented:

File "/home/yukti/mmpose/mmpose/mmpose/core/post_processing/nms.py", line 81, in oks_iou e = (dx2 + dy**2) / vars / ((a_g + a_d[n_d]) / 2 + np.spacing(1)) / 2
ValueError: operands could not be broadcast together with shapes (17,) (13,)

I think this error may be that the keypoints prediction results' shape is (17,) but the sigmas is (13,) so their sizes did not match

Please double-check the shape of your prediction results and the sigmas.

Since you have excluded the keypoints of the eyes and ears and modified cocoeval.py, maybe you can try to change this line, https://github.com/jin-s13/xtcocoapi/blob/master/xtcocotools/cocoeval.py#L319, directly into:

sigmas = np.array([
    0.026, 0.079, 0.079, 0.072, 0.072, 0.062, 0.062,
    0.107, 0.107, 0.087, 0.087, 0.089, 0.089
])

and see if the error still exists.

jin-s13 (Collaborator) commented Jun 29, 2022

To keep each issue focused on one question, I will close this issue for now.
Please open a new issue if you have any other questions. :)
