What is pixel_std for? #94

Open
PaTricksStar opened this issue Feb 19, 2019 · 5 comments

@PaTricksStar

PaTricksStar commented Feb 19, 2019

https://github.com/Microsoft/human-pose-estimation.pytorch/blob/c3a30c0e1f83e73b3038b1a443becf6b4a19cf1f/lib/dataset/JointsDataset.py#L31
I reviewed the code and found that pixel_std seems to represent the std of the human bbox area, right?
But why do we need to normalize the bbox scale, and why is pixel_std set to 200?

@rafikg

rafikg commented May 29, 2019

@PaTricksStar, I have the same question. Also, what is the purpose of scale = scale * 1.25 in this function?

def _xywh2cs(self, x, y, w, h):
        center = np.zeros((2), dtype=np.float32)
        center[0] = x + w * 0.5
        center[1] = y + h * 0.5

        if w > self.aspect_ratio * h:
            h = w * 1.0 / self.aspect_ratio
        elif w < self.aspect_ratio * h:
            w = h * self.aspect_ratio
        scale = np.array(
            [w * 1.0 / self.pixel_std, h * 1.0 / self.pixel_std],
            dtype=np.float32)
        if center[0] != -1:
            scale = scale * 1.25

        return center, scale

@wanghao14

@Gouiaa This is also what confuses me.
@leoxiaobin Could you please answer our questions?

@annopackage

@PaTricksStar @leoxiaobin @wanghao14 @rafikg
Have you solved this? I am confused about it.

@PaTricksStar
Author

I think it is just a hyperparameter representing the default w/h of the bounding box.
Just leave it alone.
Or you can try emailing the author to verify.

@lqduc

lqduc commented Oct 1, 2020

I think it is just the way they store the bbox h and w. They divide h and w by 200, and then get h and w back in get_affine_transform by multiplying scale by 200. It is just a hyperparameter, and you could choose another number.
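
To illustrate the round trip described above, here is a minimal sketch. The helper names box_to_scale and scale_to_box are hypothetical and do not come from the repository; they only mimic dividing by pixel_std when the box is stored and multiplying by it when the box is used.

```python
import numpy as np

PIXEL_STD = 200.0  # the constant discussed in this issue


def box_to_scale(w, h, pixel_std=PIXEL_STD):
    """Hypothetical helper: store the box size in units of pixel_std."""
    return np.array([w / pixel_std, h / pixel_std], dtype=np.float32)


def scale_to_box(scale, pixel_std=PIXEL_STD):
    """Hypothetical helper: recover the box size in pixels."""
    return scale * pixel_std


# A 300 x 400 pixel box is stored as scale = (1.5, 2.0) ...
scale = box_to_scale(300.0, 400.0)
# ... and multiplying by the same constant gives the pixel size back.
size = scale_to_box(scale)

# Any other constant works too, as long as both sides agree on it.
size_alt = scale_to_box(box_to_scale(300.0, 400.0, pixel_std=100.0), pixel_std=100.0)
```

If this reading is right, the value 200 only changes the stored numbers, not the recovered box, provided the same constant is used in both places.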

@rafikg As I said above, scale is just another representation of the bbox h and w. I think they multiply scale by 1.25 to expand the bbox, in case the bbox fits the human body too tightly, which would lead to information loss.
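
A standalone sketch of that expansion, loosely modeled on the _xywh2cs snippet quoted earlier in the thread. The function name, the default aspect_ratio of 0.75, and the padding parameter are assumptions for illustration, and the center[0] != -1 check from the quoted code is left out for brevity.

```python
import numpy as np


def xywh_to_center_scale(x, y, w, h, aspect_ratio=0.75, pixel_std=200.0, padding=1.25):
    """Illustrative version of the quoted function; defaults are assumed, not taken from the repo."""
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)

    # Grow the shorter side so the box matches the target aspect ratio.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    elif w < aspect_ratio * h:
        w = h * aspect_ratio

    # Store the size in units of pixel_std, then enlarge it by the padding factor.
    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32) * padding
    return center, scale


# A 120 x 200 box is first widened to 150 x 200 (ratio 0.75),
# then enlarged by 1.25 to a 187.5 x 250 crop around the same center.
center, scale = xywh_to_center_scale(100, 50, 120, 200)
crop_w, crop_h = scale * 200.0
```

The center does not move; only the extent of the crop grows, which leaves a margin around the person.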
