
Scaling YOLOv3 predictions instead of resize may have a better performance? #14

Closed
xtyDoge opened this issue Jul 25, 2019 · 3 comments
Labels
enhancement (New feature or request), priority (Issues with priority)

Comments


xtyDoge commented Jul 25, 2019

YOLOv3 usually predicts a rectangular bounding box, and I see that you use a transform to resize the cropped image before feeding it into HRNet. This stretches the image and makes the pose estimation network perform badly.
I tried making the YOLOv3 prediction regions have the same h/w ratio as the HRNet input. For example, HRNet requires image_resolution (256, 256), but YOLOv3 gives a rectangular region. What we should do is compute the center of the YOLO prediction region and expand the region into a square around that center.
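
To make the distortion concrete, here is a small sketch of how much a plain resize scales each axis of a detection crop. The helper name `stretch_factors` and the example box size are illustrative assumptions, not part of the project.

```python
def stretch_factors(box_w, box_h, target_w, target_h):
    """Return (scale_x, scale_y, anisotropy) for a plain resize of a crop.

    An anisotropy of 1.0 means the aspect ratio is preserved; larger values
    mean one axis is stretched more than the other.
    """
    scale_x = target_w / box_w
    scale_y = target_h / box_h
    anisotropy = max(scale_x, scale_y) / min(scale_x, scale_y)
    return scale_x, scale_y, anisotropy


# Hypothetical 100 x 300 (width x height) person box resized to 256 x 256.
sx, sy, aniso = stretch_factors(100, 300, 256, 256)
print(f"scale_x={sx:.2f}, scale_y={sy:.2f}, anisotropy={aniso:.2f}")
```

In this example the width is scaled about three times more than the height, which is the kind of stretching described above.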

In SimpleHRNet._predict_single, I modified it by adding:

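# Assumes image is an H x W x C array (e.g. loaded with OpenCV).
# Half the side of the square: half of the longer box edge.
squareLen = max(x2 - x1, y2 - y1) // 2
centerX = x1 + (x2 - x1) // 2
centerY = y1 + (y2 - y1) // 2
# Expand to a square around the center, clamped to the image bounds.
x1 = max(0, centerX - squareLen)
x2 = min(image.shape[1], centerX + squareLen)  # image width
y1 = max(0, centerY - squareLen)
y2 = min(image.shape[0], centerY + squareLen)  # image height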

It works on pose_hrnet_w32_256x256 with MPII annotations.
Maybe you can add it to your project :D
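
For readers who want to try the idea outside of `SimpleHRNet._predict_single`, the snippet above can be wrapped in a standalone function. This is only a sketch under stated assumptions: the function name `expand_box_to_square` is hypothetical, and the image size is passed in explicitly as `img_w` and `img_h` (with a NumPy image these would be `image.shape[1]` and `image.shape[0]`).

```python
def expand_box_to_square(x1, y1, x2, y2, img_w, img_h):
    """Expand a box to a square around its center, clamped to the image.

    Coordinates are integer pixels; (x1, y1) is the top-left corner and
    (x2, y2) the bottom-right corner.
    """
    half = max(x2 - x1, y2 - y1) // 2   # half of the longer box edge
    cx = x1 + (x2 - x1) // 2            # box center
    cy = y1 + (y2 - y1) // 2
    return (
        max(0, cx - half),
        max(0, cy - half),
        min(img_w, cx + half),
        min(img_h, cy + half),
    )


# Hypothetical tall detection in a 640 x 480 image.
print(expand_box_to_square(300, 100, 400, 400, img_w=640, img_h=480))
# (200, 100, 500, 400)
```

Note that clamping to the image bounds means a box near the border can still end up non-square; padding the crop would be one way to handle that case, but it is not part of the snippet above.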

@stefanopini
Owner

That's a cool idea, thank you! I will surely add it to the project, probably as an option.

Have you also tested it with models pre-trained on COCO and bounding boxes with a 4/3 aspect ratio (such as pose_hrnet_w32_256x192 and pose_hrnet_w48_384x288)?
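
The same idea can be generalized from a square to any target aspect ratio. The following is a minimal sketch, not the code that was later committed to the project: the function name `fit_box_to_aspect_ratio` and its parameters are hypothetical, and it assumes that a resolution such as 256x192 denotes height x width.

```python
def fit_box_to_aspect_ratio(x1, y1, x2, y2, target_w, target_h, img_w, img_h):
    """Grow a box around its center until width/height matches the target.

    Only the shorter side (relative to the target ratio) is enlarged, so the
    original detection always stays inside the new box. The result is then
    clamped to the image bounds.
    """
    box_w, box_h = x2 - x1, y2 - y1
    if box_w <= 0 or box_h <= 0:
        raise ValueError("box must have positive width and height")

    cx, cy = x1 + box_w / 2, y1 + box_h / 2
    target_ratio = target_w / target_h

    if box_w / box_h > target_ratio:
        box_h = box_w / target_ratio   # box too wide: grow the height
    else:
        box_w = box_h * target_ratio   # box too tall: grow the width

    return (
        max(0, int(round(cx - box_w / 2))),
        max(0, int(round(cy - box_h / 2))),
        min(img_w, int(round(cx + box_w / 2))),
        min(img_h, int(round(cy + box_h / 2))),
    )


# Hypothetical tall detection, adapted to a 192 (w) x 256 (h) network input.
print(fit_box_to_aspect_ratio(300, 100, 400, 420, 192, 256, img_w=640, img_h=480))
# (230, 100, 470, 420)
```

With `target_w == target_h` this reduces to the square case discussed earlier in the thread, up to rounding.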

stefanopini added the enhancement and priority labels Aug 1, 2019
Author

xtyDoge commented Aug 4, 2019

@stefanopini Sorry, I haven't. Maybe you can try it.

xtyDoge closed this as completed Aug 4, 2019
stefanopini added a commit that referenced this issue Sep 29, 2019
Adapt detection bounding boxes to match HRNet input aspect ratio (as suggested by xtyDoge in issue #14). Huge accuracy improvement in the multiperson setting.
@stefanopini
Owner

Added in commit af40b39.
Thank you for the suggestion!
