Is regression based method better than heatmap based? #7

Closed
Xiangyu-CAS opened this Issue Apr 12, 2018 · 3 comments

Xiangyu-CAS commented Apr 12, 2018

Hi~ Recently, I have read a lot of papers on keypoint localization and found that many of them can be grouped into three kinds based on the loss (direct regression, dense heatmap prediction, and one-hot semantic segmentation). I am curious which approach is better.

It seems your work regresses the 16 points directly. Is it possible to get better performance by predicting heatmaps densely?

YuliangXiu Apr 12, 2018

Owner

Your question is GREAT, I have been thinking about it these days, too!
I initially chose the regression method because inference speed is what I care about most. Personally, I guess one-hot masks (e.g. Mask R-CNN) and heatmaps (e.g. Stacked Hourglass Network) may achieve better mAP but are probably slower than direct regression (just a guess, I have not verified it myself). I plan to run some experiments comparing these three choices to figure out which one best balances the accuracy-speed tradeoff.
What do you think of these three choices? I'd like to hear your opinion on this problem. (I noticed you have reimplemented several popular pose estimators.)

Xiangyu-CAS Apr 12, 2018

I would appreciate it if you could share your progress on this problem after the experiments :)

As for me, I have not carried out many experiments, so I can only guess based on conference papers.
A paper [1] says heatmaps are better than regression, and I have noticed that nearly all papers have chosen heatmaps since the idea was first proposed at ECCV 2014. So it may well be true that heatmaps are better than regression.
However, it is really hard to tell which is better between heatmaps and one-hot masks, because both have achieved decent performance on the MSCOCO dataset. In my view, the one-hot mask is promising, since softmax loss seems to perform better in classification, but it may be harder to train. For example, suppose a network predicts a keypoint at position (1, 1) and the GT is actually (0, 1). With a heatmap or regression the loss is small, but with a one-hot mask the loss is large (larger even than for a blank prediction).
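
To make the (1, 1) vs. (0, 1) example concrete, here is a rough sketch in plain Python. Everything beyond the two coordinates is an assumption made for illustration: the 5x5 grid, the Gaussian with sigma = 1 used for the heatmap target, the peak logit of 5.0 standing in for a "confident" prediction, and all helper names. It is not code from this repository.

```python
import math

H, W = 5, 5      # assumed toy grid size
GT = (0, 1)      # ground-truth keypoint (x, y), taken from the example
PRED = (1, 1)    # off-by-one prediction (x, y), taken from the example


def regression_loss(pred, gt):
    """Squared L2 distance between predicted and true coordinates."""
    return sum((p - g) ** 2 for p, g in zip(pred, gt))


def gaussian_heatmap(center, sigma=1.0):
    """Flattened H*W heatmap with an unnormalized Gaussian at `center`."""
    cx, cy = center
    return [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            for y in range(H) for x in range(W)]


def heatmap_mse(pred_map, gt_map):
    """Mean squared error between two flattened heatmaps."""
    return sum((p - g) ** 2 for p, g in zip(pred_map, gt_map)) / len(gt_map)


def peaked_logits(center, peak=5.0):
    """Logits that are `peak` at `center` and 0 everywhere else."""
    return [peak if (x, y) == center else 0.0
            for y in range(H) for x in range(W)]


def softmax_ce(logits, gt):
    """Cross-entropy of a softmax over all pixels against a one-hot target."""
    gx, gy = gt
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[gy * W + gx]


gt_map = gaussian_heatmap(GT)
blank = [0.0] * (H * W)  # "blank" prediction: all-zero heatmap / uniform logits

reg_off = regression_loss(PRED, GT)
heat_off = heatmap_mse(gaussian_heatmap(PRED), gt_map)
heat_blank = heatmap_mse(blank, gt_map)
ce_off = softmax_ce(peaked_logits(PRED), GT)
ce_blank = softmax_ce(blank, GT)

print("regression, off by one:", reg_off)
print("heatmap MSE, off by one vs. blank:", heat_off, heat_blank)
print("softmax CE, off by one vs. blank:", ce_off, ce_blank)
```

Under these assumptions the off-by-one heatmap is penalized less than a blank one, while the confident off-by-one softmax prediction is penalized more than uniform logits. The size of the gap depends on the assumed sigma and peak value.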

BTW, My classmate used deeplab_v3 as baseline to predict keypoints on FashionAI challenge and got 13% without any augmentation. My heatmap baseline got 12% with augmentation.

Please don't hesitate to point out my mistakes : )

[1] Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification ; CVPR 2018

YuliangXiu Apr 12, 2018

Owner

There is one paper [1] that compares regression (R1), heatmap (H1), and the one-hot mask (H2, H3) on the 3D pose estimation track. Performance varies slightly depending on the task and metric, and the paper does not discuss speed. I will run some related experiments and report the final results to you when finished.
[1] Integral Human Pose Regression
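
For context, the core idea in [1] is to read a coordinate out of a heatmap as an expectation (soft-argmax) instead of a hard argmax, so a heatmap network can be trained with a regression loss. Below is a minimal sketch of that operation in plain Python. The function name and the toy inputs are hypothetical, and this is not the paper's implementation.

```python
import math


def soft_argmax_2d(logits):
    """Expected (x, y) under a softmax over a 2D grid of heatmap logits."""
    flat = [v for row in logits for v in row]
    m = max(flat)  # subtract the max for numerical stability
    weights = [[math.exp(v - m) for v in row] for row in logits]
    total = sum(sum(row) for row in weights)
    x = sum(w * j for row in weights for j, w in enumerate(row)) / total
    y = sum(w * i for i, row in enumerate(weights) for w in row) / total
    return x, y


# Uniform logits: the expectation is the grid center.
uniform = [[0.0] * 3 for _ in range(3)]
print(soft_argmax_2d(uniform))

# Two equal peaks at (x=1, y=1) and (x=2, y=1): the expectation lies between
# them, a sub-pixel location that a hard argmax cannot return.
NEG = -1e9  # effectively zero probability after the softmax
two_peaks = [[NEG] * 4, [NEG, 0.0, 0.0, NEG], [NEG] * 4]
print(soft_argmax_2d(two_peaks))
```

Because the output is a weighted average, it is differentiable with respect to the logits. It is also pulled toward any probability mass far from the peak, which is one reason results can vary with task and metric.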

@YuliangXiu YuliangXiu closed this Apr 13, 2018
