input image normalization #48

Open
khanhha opened this issue Jan 15, 2021 · 4 comments

Comments

@khanhha

khanhha commented Jan 15, 2021

Hello,

sorry for asking you too many questions.

I would like to know where the input images are actually normalized to ImageNet mean and standard deviation?

As far as I can tell, the code only subtracts pixel_means = np.array([[[123.68, 116.78, 103.94]]]) from the input image; there is no division by a standard deviation.
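To make the resulting value range concrete, here is a minimal sketch of mean-only subtraction. Only `pixel_means` comes from the thread; the tiny test image and the RGB channel order are assumptions for illustration, not the repository's actual preprocessing code.

```python
import numpy as np

# Per-channel means quoted in the issue (RGB order is an assumption).
pixel_means = np.array([[[123.68, 116.78, 103.94]]])

# Hypothetical 4x4 image that contains both extremes of the 8-bit range.
image = np.zeros((4, 4, 3), dtype=np.float64)
image[2:, :, :] = 255.0

# Mean subtraction only; broadcasting applies the means to every pixel.
centered = image - pixel_means

# Per-channel extremes after centering: -mean at the bottom, 255 - mean at the top.
channel_min = centered.min(axis=(0, 1))
channel_max = centered.max(axis=(0, 1))
print(channel_min)
print(channel_max)
```

Under these assumptions the centered image spans roughly -124 to 151 rather than a unit-scale range, because nothing is divided by a standard deviation.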

image

The input heatmaps are in the range of [0, 255.0]

image

I can see two problems here

  • the input value ranges are large (>> 1.0); does it cause any gradient problems?
  • the input image and the input heatmaps are not on the same scale. The input images have pixel_means subtracted from them, while the heatmap values are in [0, 255].

Could you please describe this kind of normalization in a bit more detail? Or did I miss something in your code?

Thanks!

@mks0601
Owner

mks0601 commented Jan 15, 2021

  • the input value ranges are large (>> 1.0); does it cause any gradient problems?
    -> No. Many works simply feed images with pixel values in 0~255 into their networks.

  • the input image and the input heatmaps are not on the same scale. The input images have pixel_means subtracted from them, while the heatmap values are in [0, 255].
    -> The output scale does not have to be the same as that of the input.

@khanhha
Author

khanhha commented Jan 15, 2021

Thanks for your answer.

-> The output scale does not have to be the same as that of the input:
--> Sorry for the misunderstanding. I mean the scale difference between the two kinds of input: the input image and the input keypoint heatmaps. The image has a value range of roughly [-126, 126], and the input heatmap has a value range of [0, 255]. They are concatenated into a multi-channel input before being passed into the convolution operations, so I suppose that they should be normalized to the same scale.
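The mismatch described here can be reproduced with a small sketch. Everything except `pixel_means` is hypothetical: the random image, the single Gaussian heatmap with a peak of 255, and the channel-last layout are assumptions for illustration, not the repository's code.

```python
import numpy as np

pixel_means = np.array([[[123.68, 116.78, 103.94]]])
height, width = 64, 48  # hypothetical input resolution

# Hypothetical random 8-bit image, centered by mean subtraction only.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(height, width, 3)).astype(np.float64)
centered = image - pixel_means

# Hypothetical Gaussian heatmap for one keypoint, scaled so its peak is 255.
ys, xs = np.mgrid[0:height, 0:width]
cy, cx, sigma = 32, 24, 3.0
heatmap = 255.0 * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

# Concatenate along the channel axis: 3 image channels + 1 heatmap channel.
net_input = np.concatenate([centered, heatmap[..., None]], axis=-1)

# Print the range of each channel to see the differing scales side by side.
for c in range(net_input.shape[-1]):
    channel = net_input[..., c]
    print(c, channel.min(), channel.max())
```

In this sketch the image channels are signed and roughly zero-centered, while the heatmap channel is non-negative and mostly near zero with a narrow peak at 255.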

Anyway, your normalization technique produces accurate results, so that means it works. Maybe it's just a matter of training convergence.

Best

@mks0601
Owner

mks0601 commented Jan 16, 2021

Although they are concatenated, their scales do not have to be in the same range. The learnable weights can account for the scale difference, so I think you don't have to worry about that.
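One way to see why the weights can absorb a per-channel scale is the sketch below. It uses a plain linear layer (equivalent to a 1x1 convolution) on hypothetical random data; the shapes and the 1/255 factor are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-channel input: 3 image channels plus 1 heatmap channel.
x = rng.normal(size=(5, 4))
# Weights of a linear layer mapping 4 input channels to 2 outputs.
w = rng.normal(size=(4, 2))

# Rescale only the heatmap channel, e.g. from [0, 255] down to [0, 1].
scale = np.array([1.0, 1.0, 1.0, 1.0 / 255.0])
x_rescaled = x * scale

# Dividing the matching weight rows by the same factors undoes the rescaling.
w_rescaled = w / scale[:, None]

# Both layers compute the same output, so either input scale is representable.
assert np.allclose(x @ w, x_rescaled @ w_rescaled)
```

This only shows that a network can represent the same function at either scale. It says nothing about how quickly training converges, which is the concern raised earlier in the thread.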

@khanhha
Author

khanhha commented Jan 16, 2021

Thank you very much
