
Can noise be added to dataset? #1357

Closed · YuktiADY opened this issue May 5, 2022 · 68 comments
Labels: kind/discussion (community discussion)

Comments

YuktiADY commented May 5, 2022

Hello Team,

I was training the HRNet model and trying to improve its accuracy. I have trained the model many times, which may lead to overfitting.

I would like to know whether it is possible to augment the data with random noise in MMPose.

Where should I look in the MMPose code, and how can we do this?

Please suggest!

liqikai9 (Collaborator) commented May 6, 2022

Hi, you can add your custom data pipeline which can handle data preprocessing. For more detail, please refer to this tutorial: https://github.com/open-mmlab/mmpose/blob/master/docs/en/tutorials/3_data_pipeline.md#extend-and-use-custom-pipelines
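Following that tutorial, a minimal sketch of such a custom step might look like the following (the class name AddGaussianNoise and its parameters are hypothetical, not an existing MMPose transform; the PIPELINES registry import is the one the tutorial uses):

import numpy as np

from mmpose.datasets import PIPELINES


@PIPELINES.register_module()
class AddGaussianNoise:
    """Hypothetical transform that adds pixel-level Gaussian noise to the image."""

    def __init__(self, sigma=5.0, prob=0.5):
        self.sigma = sigma  # noise standard deviation, in pixel values
        self.prob = prob    # probability of applying the noise

    def __call__(self, results):
        if np.random.rand() < self.prob:
            img = results['img'].astype(np.float32)
            noise = np.random.normal(0.0, self.sigma, size=img.shape)
            results['img'] = np.clip(img + noise, 0, 255).astype(np.uint8)
        return results

Once registered, it could be referenced in a config as dict(type='AddGaussianNoise', sigma=5.0, prob=0.5), placed before ToTensor so that it operates on the raw image array.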

liqikai9 self-assigned this May 6, 2022

YuktiADY (Author) commented May 6, 2022

I mean, where can we look in the MMPose code to see if there is a possibility to add noise to the dataset?

YuktiADY (Author) commented May 6, 2022

Hi, you can add your custom data pipeline which can handle data preprocessing. For more detail, please refer to this tutorial: https://github.com/open-mmlab/mmpose/blob/master/docs/en/tutorials/3_data_pipeline.md#extend-and-use-custom-pipelines

In the linked tutorial, is this the noise being added?

dict(
    type='NormalizeTensor',
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]),

Sorry to ask such questions; I am new to this topic and need help.

YuktiADY (Author) commented May 6, 2022

This snippet is already present in the config I am training with:

dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
    type='NormalizeTensor',
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]),

liqikai9 (Collaborator) commented May 6, 2022

if there is a possibility to add noise to the dataset?

What does the noise mean here? If you mean the randomness in data preprocessing, you can find some pipelines here: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py, in which TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, and TopDownGetRandomScaleRotation can each perform a different data augmentation randomly, with a custom probability.

liqikai9 (Collaborator) commented May 6, 2022

This snippet is already present in the config I am training with:

dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
    type='NormalizeTensor',
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]),

These three pipelines (TopDownAffine, ToTensor, NormalizeTensor) do not introduce any randomness (or the noise you mean) while preparing the data.

YuktiADY (Author) commented May 6, 2022

if there is a possibility to add noise to the dataset?

What does the noise mean here? If you mean the randomness in data preprocessing, you can find some pipelines here: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py, in which TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, and TopDownGetRandomScaleRotation can each perform a different data augmentation randomly, with a custom probability.

As I understand it, augmenting the data with noise means we want to avoid overfitting and improve the performance of our model.

So, in your view, what does augmenting the data with noise mean?
Do I need to add these four classes?

YuktiADY (Author) commented May 6, 2022

Also, by looking into the MMPose code, can we augment the data with random noise?

liqikai9 (Collaborator) commented May 6, 2022

That depends on your needs. I think you can try these pipelines. By the way, which dataset are you using?

TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, and TopDownGetRandomScaleRotation can each perform a different data augmentation randomly, with a custom probability.

YuktiADY (Author) commented May 6, 2022

I have concatenated the COCO and THEODORE datasets.

YuktiADY (Author) commented May 6, 2022

So, is it possible to augment the data with random noise?

liqikai9 (Collaborator) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation.
They can perform data augmentation with random noise.

jin-s13 (Collaborator) commented May 6, 2022

@YuktiADY You can use Albumentations in MMPose; it supports various kinds of augmentation approaches.
https://mmpose.readthedocs.io/en/latest/papers/techniques.html#albumentations-information-2020

jin-s13 (Collaborator) commented May 6, 2022

For more information about albumentations, please check https://albumentations.ai/
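For instance, pixel-level noise could be injected through MMPose's Albumentation wrapper with a snippet along these lines (a sketch: GaussNoise and RandomBrightnessContrast are standard Albumentations transforms, and the var_limit and p values here are illustrative, not tuned):

dict(
    type='Albumentation',
    transforms=[
        dict(type='GaussNoise', var_limit=(10.0, 50.0), p=0.5),
        dict(type='RandomBrightnessContrast', p=0.3),
    ]),

Such a step would sit in train_pipeline after TopDownAffine and before ToTensor, so the noise acts on the cropped image array.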

YuktiADY (Author) commented May 6, 2022

@YuktiADY You can use Albumentations in MMPose; it supports various kinds of augmentation approaches. https://mmpose.readthedocs.io/en/latest/papers/techniques.html#albumentations-information-2020

I will check.

YuktiADY (Author) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation. They can perform data augmentation with random noise.

Do I simply have to add these classes to the config, with no other changes required?

YuktiADY (Author) commented May 6, 2022

@YuktiADY You can use Albumentations in MMPose; it supports various kinds of augmentation approaches. https://mmpose.readthedocs.io/en/latest/papers/techniques.html#albumentations-information-2020

Will the approach of adding the above pipelines also work?

jin-s13 (Collaborator) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation. They can perform data augmentation with random noise.

Do I simply have to add these classes to the config, with no other changes required?

These are also augmentation approaches: shifting the center, flipping, cropping the box, scaling, and rotation.
But I think what you want is to add pixel-level noise or RGB jittering, right? If so, Albumentations will meet your requirements.

YuktiADY (Author) commented May 6, 2022

Yes, you can try to add one or several of these pipelines in your config: TopDownRandomShiftBboxCenter, TopDownRandomFlip, TopDownHalfBodyTransform, TopDownGetRandomScaleRotation. They can perform data augmentation with random noise.

Do I simply have to add these classes to the config, with no other changes required?

These are also augmentation approaches: shifting the center, flipping, cropping the box, scaling, and rotation. But I think what you want is to add pixel-level noise or RGB jittering, right? If so, Albumentations will meet your requirements.

I first want to check, based on the MMPose code, whether it is possible to add noise.
If yes, is it possible to augment the data with random noise, and how can we do that?


YuktiADY (Author) commented May 6, 2022

These are different augmentation approaches, like shifting the center, flipping, and cropping the box.
If I want to augment the data with random noise, how can I do that? Will the above pipelines work? I mean, those pipelines contain methods like flipping, etc.

liqikai9 (Collaborator) commented May 6, 2022

For flipping, you may try TopDownRandomFlip: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py#L93

YuktiADY (Author) commented May 6, 2022

For flipping, you may try TopDownRandomFlip: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py#L93

Okay, so this one is for flipping. What about adding noise? Which one augments the data with random noise?
Random noise is also another augmentation method; correct me if I am wrong.

liqikai9 (Collaborator) commented May 6, 2022

Flipping can be viewed as a method of augmenting data with random noise, as it randomly flips the image.

If you want to add pixel-level noise to the data, you can use Albumentations in MMPose.
An example of its use can be found here: https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192_coarsedropout.py#L108

YuktiADY (Author) commented May 6, 2022

Flipping can be viewed as a method of augmenting data with random noise, as it randomly flips the image.

If you want to add pixel-level noise to the data, you can use Albumentations in MMPose. An example of its use can be found here: https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192_coarsedropout.py#L108

Okay, thank you. Understood.

YuktiADY (Author) commented May 6, 2022

For flipping, you may try TopDownRandomFlip: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/top_down_transform.py#L93

So, besides adding this class to the config, are any other changes required?

YuktiADY (Author) commented May 6, 2022

I just saw that dict(type='TopDownRandomFlip', flip_prob=0.5) is already there in the config.

liqikai9 (Collaborator) commented May 6, 2022

You can try to use this in your config and see if it has better results.

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownGetBboxCenterScale', padding=1.25),
    dict(type='TopDownRandomShiftBboxCenter', shift_factor=0.16, prob=0.3),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
###########################
# add the Albumentation here
    dict(
        type='Albumentation',
        transforms=[
            dict(
                type='CoarseDropout',
                max_holes=8,
                max_height=40,
                max_width=40,
                min_holes=1,
                min_height=10,
                min_width=10,
                p=0.5),
        ]),
###########################
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

liqikai9 (Collaborator) commented May 6, 2022

I suggest you read more details about the implementation of Albumentation in MMPose: https://github.com/open-mmlab/mmpose/blob/master/mmpose/datasets/pipelines/shared_transform.py#L190
Then change the parameters according to your needs. Hope this helps!

YuktiADY (Author) commented Jun 9, 2022

Is Simple Baseline 2D an algorithm? Because if I say I am using the Simple Baseline 2D algorithm, I am indirectly saying that I am using ResNet.
The only difference between ResNet and HRNet is that HRNet uses a different feature extractor. Is that the main difference?

liqikai9 (Collaborator) commented Jun 9, 2022

Is Simple Baseline 2D an algorithm?

Yes, you can put it that way.

Is that the main difference?

Yes, HRNet and ResNet are two different feature extractors.

YuktiADY (Author) commented:

How can I test the pre-trained models that were trained on COCO and evaluate them on my test dataset?
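For reference, a typical MMPose 0.x evaluation call looks roughly like this (a sketch: the config and checkpoint paths are placeholders, and the config's val/test entries must point at your own annotation files):

python tools/test.py \
    configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_384x288.py \
    hrnet_w48_coco_384x288-314c8528_20200708.pth \
    --eval mAP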

YuktiADY (Author) commented Jun 16, 2022

I made these changes in the config:
_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/coco_wholebody.py']
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'

test_pipeline = val_pipeline

data_root = 'data/coco'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TopDownCocoWholeBodyDataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TopDownCocoWholeBodyDataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

The script I used for testing is this:

./mmpose/tools/dist_test.sh ./FES_Results_COCO/hrnet_w348_coco_wholebody_388x288.py "/home/yukti/Downloads/hrnet_w48_coco_384x288-314c8528_20200708.pth" 1 --eval mAP

But I am getting a size mismatch error:

load checkpoint from local path: /home/yukti/Downloads/hrnet_w48_coco_384x288-314c8528_20200708.pth
The model and loaded state dict do not match exactly

size mismatch for keypoint_head.final_layer.weight: copying a param with shape torch.Size([17, 48, 1, 1]) from checkpoint, the shape in current model is torch.Size([133, 48, 1, 1]).
size mismatch for keypoint_head.final_layer.bias: copying a param with shape torch.Size([17]) from checkpoint, the shape in current model is torch.Size([133]).

Is what I am doing above correct?

Awaiting your response.

liqikai9 (Collaborator) commented:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'

This checkpoint seems to be a model trained on the COCO dataset, not on the COCO-WholeBody dataset. Please choose another appropriate checkpoint file. You can find one here.

YuktiADY (Author) commented Jun 17, 2022

Please find my changes in the config below.

I used the COCO-WholeBody checkpoint, but I still get the following.

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'
resume_from = None

data_root = 'TheodorePlusV2Dataset'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_keypoints_scenario2.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

I get a size mismatch error and AP = 0.0:
The model and loaded state dict do not match exactly

size mismatch for keypoint_head.final_layer.weight: copying a param with shape torch.Size([133, 48, 1, 1]) from checkpoint, the shape in current model is torch.Size([17, 48, 1, 1]).
size mismatch for keypoint_head.final_layer.bias: copying a param with shape torch.Size([133]) from checkpoint, the shape in current model is torch.Size([17]).
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 736/736, 19.5 task/s, elapsed: 38s, ETA: 0s
Loading and preparing results...
DONE (t=0.03s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=0.12s).
Accumulating evaluation results...
DONE (t=0.01s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.001
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.025
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
AP: 2.3954008304056212e-05
AP (L): 0.0
AP (M): 0.0002155860747365059
AP .5: 7.984669434685404e-05
AP .75: 0.0
AR: 0.0004081632653061224
AR (L): 0.0
AR (M): 0.025
AR .5: 0.0013605442176870747
AR .75: 0.0

When I gave the checkpoint for COCO, it gave results with AP.

I even tried testing the ResNet-50 model with the checkpoint for the COCO dataset: it ran and gave results, but when I run it with the COCO-WholeBody setup it gives the size mismatch error. I even checked the number of keypoints.

Is there a problem if we give a checkpoint for COCO while the dataset type in the config is TopDownCocoWholeBodyDataset?
Will there be a difference in the AP results?

liqikai9 (Collaborator) commented:

Could you please provide the config for your model? It seems the model you are using outputs 17 channels, but the checkpoint you are using outputs 133 channels.

size mismatch for keypoint_head.final_layer.weight: copying a param with shape torch.Size([133, 48, 1, 1]) from checkpoint, the shape in current model is torch.Size([17, 48, 1, 1]).

Please use the model that matches your expected output.
If you would like to test on a dataset like the COCO-WholeBody dataset (which has 133 keypoints and thus needs to output 133 channels), please use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'

If you would like to test on a dataset like the COCO dataset (which has 17 keypoints and thus needs to output 17 channels), please use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'
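As a quick sanity check before testing, the head shape stored in a checkpoint can be inspected directly (a sketch: MMPose checkpoints keep their weights under the 'state_dict' key, and the key name below is taken from the error message above):

import torch

ckpt = torch.load('hrnet_w48_coco_384x288-314c8528_20200708.pth', map_location='cpu')
w = ckpt['state_dict']['keypoint_head.final_layer.weight']
print(w.shape)  # torch.Size([17, 48, 1, 1]) -> a 17-keypoint (COCO-style) head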

YuktiADY (Author) commented:

Please find the config below

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=10, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup=None,
    # warmup='linear',
    # warmup_iters=500,
    # warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 210
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

# model settings
model = dict(
    type='TopDown',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrnet_w48-8ef0771d.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(48, 96)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(48, 96, 192)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(48, 96, 192, 384))),
    ),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=48,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_layers=0,
        extra=dict(final_conv_kernel=1, ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[288, 384],
    heatmap_size=[72, 96],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    #bbox_file='data/coco/person_detection_results/'
    #'COCO_val2017_detections_AP_H_56_person.json',
    bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_bboxes_scenario1.json',
    #bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_bboxes_scenario2.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=3),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data_root = 'TheodorePlusV2Dataset'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_keypoints_scenario2.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

Also, I would like to ask: if lr = 5e-4 and I reduce it by a factor of 10, then lr = 5e-5, right? Which of the two learning rates is best?


liqikai9 (Collaborator) commented:

Please change this parameter according to your dataset:

channel_cfg = dict(
    num_output_channels=17,

Also, if lr = 5e-4 and I reduce it by a factor of 10, then lr = 5e-5? Which is best out of these two?

I am afraid these hyperparameters may need to be tuned on your own dataset.

YuktiADY (Author) commented Jun 18, 2022

Please change this parameter according to your dataset:

channel_cfg = dict(
    num_output_channels=17,

I am afraid these hyperparameters may need to be tuned on your own dataset.

Yes, earlier it was 133, so I changed it to 17. Do you mean only num_output_channels, or num_keypoints as well?

Is the lr incorrect? Because by default the config has lr = 5e-4.

liqikai9 (Collaborator) commented:

Do you mean only num_output_channels, or num_keypoints as well?

I mean the number of channels should match the number of output channels you need for your own dataset.

YuktiADY (Author) commented Jun 18, 2022

Yes, I changed it, but it is still not working.

Can you check this part? I guess there is some problem in it.

test_pipeline = val_pipeline

data_root = 'TheodorePlusV2Dataset'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type='TopDownCocoWholeBodyDataset',
        ann_file=f'{data_root}/annotations/coco_wholebody_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/old_annotations/person_keypoints_scenario2.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TheodorePlusV2Dataset',
        #ann_file=f'{data_root}/annotations/coco_wholebody_val_v1.0.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

How is it working when I give the checkpoint for COCO?
Also, does it make a difference to test these pre-trained models with the COCO checkpoint rather than the COCO-WholeBody one? Are the AP results affected?

liqikai9 (Collaborator) commented:

How many keypoints does TheodorePlusV2Dataset have?
If it is 17, this is a COCO-style dataset; please set the channels to 17 and use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'

If it is 133, this is a COCO-WholeBody-style dataset; please set the channels to 133 and use a checkpoint like this:

load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet_w48_coco_wholebody_384x288-6e061c6a_20200922.pth'

YuktiADY (Author) commented:

Hello,

Where can I find this file: python3.x/site-packages/pycocoapi/cocoeval.py?

ly015 (Member) commented Jun 23, 2022

You can try pip show pycocotools.
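A quick sketch for locating the file (note that, judging from the tracebacks later in this thread, the environment resolves cocoeval.py from xtcocotools, MMPose's extended COCO API, rather than from pycocotools):

pip show xtcocotools
python -c "import xtcocotools.cocoeval as m; print(m.__file__)"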

YuktiADY (Author) commented:

Hello,

I am trying to evaluate only 13 keypoints, so I made a few changes in the cocoeval.py file with slicing, and I am getting this error:

Traceback (most recent call last):
  File "./mmpose/tools/test.py", line 174, in <module>
    main()
  File "./mmpose/tools/test.py", line 168, in main
    results = dataset.evaluate(outputs, cfg.work_dir, eval_config)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 318, in evaluate
    info_str = self._do_python_keypoint_eval(res_file)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 372, in _do_python_keypoint_eval
    coco_eval.evaluate()
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 262, in evaluate
    for imgId in p.imgIds
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 263, in <genexpr>
    for catId in catIds}
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 407, in computeOks
    e = (dx**2 + dy**2) / vars / (gt['area'] + np.spacing(1)) / 2
ValueError: operands could not be broadcast together with shapes (13,) (17,)

liqikai9 (Collaborator) commented:

I am trying to evaluate only 13 keypoints, so I made a few changes in the cocoeval.py file with slicing

It seems the number of keypoints in your dataset, which is 13, does not match COCO's 17 keypoints.

Could you show the modifications you made?

YuktiADY (Author) commented Jun 25, 2022

I made these changes in cocoeval.py. Actually, I am only evaluating 13 of the 17 keypoints.

1. I changed this:

    #self.sigmas = np.array(
    #    [.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0
    self.sigmas = np.array(
        [.26, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

2. I added slicing for g and d in the computeOks function:

    g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])
    d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])

I checked everywhere but could not resolve this error. Please give your suggestions.

YuktiADY (Author) commented:

Yes, I set it to 17 only.

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

liqikai9 (Collaborator) commented:

g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])
d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])

What do these two lines do?

And make sure you made the modifications in
/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py

YuktiADY (Author) commented Jun 26, 2022

g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])
d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])

What do these two lines do?

And make sure you made the modifications in /home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py

Actually, in our dataset the nose keypoint occupies array indices 0, 1, 2, so we slice the nose with [0:3] and the rest with [15:], excluding the keypoints of the eyes and ears. In our dataset the eyes and ears are not detected; it is based on 13 keypoints. So we are evaluating only 13 keypoints, and I added these two slices to exclude the eyes and ears, which are not present.

But I am getting this error. Please advise.
Yes, I made the changes in the path above only.

I added the changes here; I even tried to print, but the print is not working either:

        elif p.iouType == 'keypoints_righthand':
            g = np.array(gt['righthand_kpts'])
        else:
            g = np.array(gt['keypoints'])

        g = np.array(gt['keypoints'][0:3] + gt['keypoints'][15:])  # added
        xg = g[0::3]; yg = g[1::3]; vg = g[2::3]
        k1 = np.count_nonzero(vg > 0)
        bb = gt['bbox']
        x0 = bb[0] - bb[2]; x1 = bb[0] + bb[2] * 2
        y0 = bb[1] - bb[3]; y1 = bb[1] + bb[3] * 2
        for i, dt in enumerate(dts):
            if p.iouType == 'keypoints_wholebody':
                body_dt = dt['keypoints']
                foot_dt = dt['foot_kpts']
                face_dt = dt['face_kpts']
                lefthand_dt = dt['lefthand_kpts']
                righthand_dt = dt['righthand_kpts']
                wholebody_dt = body_dt + foot_dt + face_dt + lefthand_dt + righthand_dt
                d = np.array(wholebody_dt)
            elif p.iouType == 'keypoints_foot':
                d = np.array(dt['foot_kpts'])
            elif p.iouType == 'keypoints_face':
                d = np.array(dt['face_kpts'])
            elif p.iouType == 'keypoints_lefthand':
                d = np.array(dt['lefthand_kpts'])
            elif p.iouType == 'keypoints_righthand':
                d = np.array(dt['righthand_kpts'])
            else:
                d = np.array(dt['keypoints'])

            d = np.array(dt['keypoints'][0:3] + dt['keypoints'][15:])  # added
            #print(d.size)
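The slicing arithmetic itself can be checked in isolation (a standalone sketch with a dummy annotation):

import numpy as np

# COCO-style keypoints are stored flat as (x, y, v) triples:
# indices 0:3 hold the nose, 3:15 the eyes and ears, 15: the remaining body joints.
kpts = list(range(17 * 3))                # dummy flattened 17-keypoint annotation
sliced = np.array(kpts[0:3] + kpts[15:])  # keep the nose, drop eyes and ears
print(sliced.size // 3)                   # -> 13 joints remain

So the slices do yield 13 keypoints; the (13,) vs (17,) broadcast error therefore points to a sigmas array somewhere that still has the other length.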

liqikai9 (Collaborator) commented Jun 27, 2022

self.sigmas = np.array(
    [.26, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

Did you change this line in cocoeval.py? I think this may be the problem.

According to this line, we are using the sigmas from the dataset info.

You can see how the sigmas are chosen here.

So actually, with your modification, you are still using the original COCO sigmas, which have 17 numbers, instead of your expected sigmas.

You can add the sigmas in the dataset info like this.

YuktiADY (Author) commented Jun 27, 2022

Yes, I have added the sigmas in cocoeval.py.

I have changed the dataset info too, as below:

sigmas=[
    0.026, 0.079, 0.079, 0.072, 0.072, 0.062, 0.062,
    0.107, 0.107, 0.087, 0.087, 0.089, 0.089
]

But I am still getting this error:
File "./mmpose/tools/test.py", line 174, in
main()
File "./mmpose/tools/test.py", line 168, in main
results = dataset.evaluate(outputs, cfg.work_dir, eval_config)
File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 311, in evaluate
keep = nms(img_kpts, oks_thr, sigmas=self.sigmas)
File "/home/yukti/mmpose/mmpose/mmpose/core/post_processing/nms.py", line 116, in oks_nms
sigmas, vis_thr)
File "/home/yukti/mmpose/mmpose/mmpose/core/post_processing/nms.py", line 81, in oks_iou
e = (dx
2 + dy**2) / vars / ((a_g + a_d[n_d]) / 2 + np.spacing(1)) / 2
ValueError: operands could not be broadcast together with shapes (17,) (13,)

Actually, the error comes from top_down_coco_dataset.py, which takes the config from coco.py, and there it uses these 17 sigma values:

0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072, 0.062,
0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089

I tried giving my 13 sigma values there but get the same error as above.

Do I need to change the sigma values in this coco.py file too?

I am not sure where the ValueError with shapes (17,) and (13,) comes from.
The error still persists. Please give your suggestions.

liqikai9 (Collaborator) commented:

May I see the config you used to instantiate the dataset object? I guess the dataset info file used to instantiate the dataset may be mistakenly specified as coco.py. In top_down_coco_dataset.py, we use coco.py as the default dataset info file.

YuktiADY (Author) commented:

## THEODORE.PY

dataset_info = dict(
    dataset_name='theodore',
    paper_info=dict(
        author='Lin, Tsung-Yi and Maire, Michael and '
        'Belongie, Serge and Hays, James and '
        'Perona, Pietro and Ramanan, Deva and '
        r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
        title='Learning from THEODORE',
        container='CVPR',
        year='2014',
        homepage='http://cocodataset.org/',
    ),
    keypoint_info={
        0:
        dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
        1:
        dict(
            name='left_eye',
            id=1,
            color=[51, 153, 255],
            type='upper',
            swap='right_eye'),
        2:
        dict(
            name='right_eye',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='left_eye'),
        3:
        dict(
            name='left_ear',
            id=3,
            color=[51, 153, 255],
            type='upper',
            swap='right_ear'),
        4:
        dict(
            name='right_ear',
            id=4,
            color=[51, 153, 255],
            type='upper',
            swap='left_ear'),
        5:
        dict(
            name='left_shoulder',
            id=5,
            color=[0, 255, 0],
            type='upper',
            swap='right_shoulder'),
        6:
        dict(
            name='right_shoulder',
            id=6,
            color=[255, 128, 0],
            type='upper',
            swap='left_shoulder'),
        7:
        dict(
            name='left_elbow',
            id=7,
            color=[0, 255, 0],
            type='upper',
            swap='right_elbow'),
        8:
        dict(
            name='right_elbow',
            id=8,
            color=[255, 128, 0],
            type='upper',
            swap='left_elbow'),
        9:
        dict(
            name='left_wrist',
            id=9,
            color=[0, 255, 0],
            type='upper',
            swap='right_wrist'),
        10:
        dict(
            name='right_wrist',
            id=10,
            color=[255, 128, 0],
            type='upper',
            swap='left_wrist'),
        11:
        dict(
            name='left_hip',
            id=11,
            color=[0, 255, 0],
            type='lower',
            swap='right_hip'),
        12:
        dict(
            name='right_hip',
            id=12,
            color=[255, 128, 0],
            type='lower',
            swap='left_hip'),
        13:
        dict(
            name='left_knee',
            id=13,
            color=[0, 255, 0],
            type='lower',
            swap='right_knee'),
        14:
        dict(
            name='right_knee',
            id=14,
            color=[255, 128, 0],
            type='lower',
            swap='left_knee'),
        15:
        dict(
            name='left_ankle',
            id=15,
            color=[0, 255, 0],
            type='lower',
            swap='right_ankle'),
        16:
        dict(
            name='right_ankle',
            id=16,
            color=[255, 128, 0],
            type='lower',
            swap='left_ankle')
    },
    skeleton_info={
        0:
        dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('right_ankle', 'right_knee'), id=2, color=[255, 128, 0]),
        3:
        dict(link=('right_knee', 'right_hip'), id=3, color=[255, 128, 0]),
        4:
        dict(link=('left_hip', 'right_hip'), id=4, color=[51, 153, 255]),
        5:
        dict(link=('left_shoulder', 'left_hip'), id=5, color=[51, 153, 255]),
        6:
        dict(link=('right_shoulder', 'right_hip'), id=6, color=[51, 153, 255]),
        7:
        dict(
            link=('left_shoulder', 'right_shoulder'),
            id=7,
            color=[51, 153, 255]),
        8:
        dict(link=('left_shoulder', 'left_elbow'), id=8, color=[0, 255, 0]),
        9:
        dict(
            link=('right_shoulder', 'right_elbow'), id=9, color=[255, 128, 0]),
        10:
        dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
        11:
        dict(link=('right_elbow', 'right_wrist'), id=11, color=[255, 128, 0]),
        12:
        dict(link=('left_eye', 'right_eye'), id=12, color=[51, 153, 255]),
        13:
        dict(link=('nose', 'left_eye'), id=13, color=[51, 153, 255]),
        14:
        dict(link=('nose', 'right_eye'), id=14, color=[51, 153, 255]),
        15:
        dict(link=('left_eye', 'left_ear'), id=15, color=[51, 153, 255]),
        16:
        dict(link=('right_eye', 'right_ear'), id=16, color=[51, 153, 255]),
        17:
        dict(link=('left_ear', 'left_shoulder'), id=17, color=[51, 153, 255]),
        18:
        dict(
            link=('right_ear', 'right_shoulder'), id=18, color=[51, 153, 255])
    },
    joint_weights=[
        1., 1., 1., 1., 1., 1., 1., 1.2, 1.2, 1.5, 1.5, 1., 1., 1.2, 1.2, 1.5,
        1.5
    ],
    sigmas=[
        0.026, 0.079, 0.079, 0.072, 0.072, 0.062, 0.062,
        0.107, 0.107, 0.087, 0.087, 0.089, 0.089
    ])

CONFIG FILE (HRNET_W48_384 x 288)

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
#load_from = None
load_from = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_384x288-314c8528_20200708.pth'
#resume_from = '/home/yukti/mmpose/theodore_2022-05-24/epoch_50.pth'
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=5)
evaluation = dict(interval=1, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[30, 45])
total_epochs = 60
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

# model settings
model = dict(
    type='TopDown',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrnet_w48-8ef0771d.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(48, 96)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(48, 96, 192)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(48, 96, 192, 384))),
    ),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=48,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_layers=0,
        extra=dict(final_conv_kernel=1, ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[288, 384],
    heatmap_size=[72, 96],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    #bbox_file='/mnt/dst_datasets/own_omni_dataset/theodore_plus_v2/coco_annotations/person_bbox_valid.json',
    #bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario2/person_bboxes_scenario2.json',
    bbox_file='/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_bboxes_scenario1.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(
        type='Albumentation',
        transforms=[
            dict(
                type='GaussNoise',
                var_limit=(10.0, 50.0)),
        ]),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=3),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

# dataset settings
dataset_type = 'TheodorePlusV2Dataset'

data_root = '/mnt/dst_datasets/own_omni_dataset/theodore_plus_v2'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=[
        dict(
            type=dataset_type,
            ann_file=f'{data_root}/coco_annotations/person_keypoints_train.json',
            img_prefix=f'{data_root}/train/img_png/',
            data_cfg=data_cfg,
            pipeline=train_pipeline,
            dataset_info={{_base_.dataset_info}}),
        dict(
            type='TopDownCocoDataset',
            ann_file=f'/mnt/data/yjin/coco/annotations/person_keypoints_train2017.json',
            img_prefix=f'/mnt/data/yjin/coco/images/train2017/',
            data_cfg=data_cfg,
            pipeline=train_pipeline,
            dataset_info={{_base_.dataset_info}}),
    ],
    val=dict(
        type=dataset_type,
        #ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario2/person_keypoints_scenario2.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        #img_prefix=f'{data_root}/valid/img_png/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type=dataset_type,
        #ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        #ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario2/person_keypoints_scenario2.json',
        ann_file=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/coco_annotations_final_corrected_2022/person_keypoints_scenario1.json',
        #img_prefix=f'{data_root}/valid/img_png/',
        img_prefix=f'/mnt/dst_datasets/own_omni_dataset/FES_keypoints/scenario1/JPEGImages/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

liqikai9 (Collaborator) commented:

File "/home/yukti/mmpose/mmpose/mmpose/core/post_processing/nms.py", line 81, in oks_iou e = (dx2 + dy**2) / vars / ((a_g + a_d[n_d]) / 2 + np.spacing(1)) / 2
ValueError: operands could not be broadcast together with shapes (17,) (13,)

I think this error may be that the keypoints prediction results' shape is (17,) but the sigmas is (13,) so their sizes did not match

Please double-check the shape of your prediction results and the sigmas.

Since you have excluded the keypoints of the eyes and ears and modified cocoeval.py, maybe you can try to change this line, https://github.com/jin-s13/xtcocoapi/blob/master/xtcocotools/cocoeval.py#L319, directly into:

sigmas = np.array([
    0.026, 0.079, 0.079, 0.072, 0.072, 0.062, 0.062,
    0.107, 0.107, 0.087, 0.087, 0.089, 0.089
])

and see if the error still exists.

jin-s13 (Collaborator) commented Jun 29, 2022

To keep each issue focused on one question, I will close this issue for now.
Please open a new issue if you have any other questions. :)
