question about standardization #55

yuffon · 2019-03-08T02:37:26Z

I find that in the training and testing phases, the dataset is standardized as a whole batch, with the mean and variance computed from the entire CIFAR-10 dataset.
However, when the model is deployed, images are fed in individually. How should we preprocess each image?
What mean and variance should we use when the input image belongs to one of the same categories but is not part of the CIFAR-10 dataset?

liuzhuang13 · 2019-03-08T21:15:35Z

Hi @yuffon, I think we just subtract the training mean and divide by the training std at test time as well; this is standard in machine learning.
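A minimal sketch of what reusing the training statistics could look like, assuming images stored as `(N, H, W, C)` NumPy arrays. The helper names `channel_stats` and `normalize` are hypothetical, and the random arrays only stand in for CIFAR-10; this is not the repository's actual preprocessing code.

```python
import numpy as np

def channel_stats(train_images):
    """Per-channel mean and std over an (N, H, W, C) training array."""
    x = train_images.astype(np.float64)
    mean = x.mean(axis=(0, 1, 2))
    std = x.std(axis=(0, 1, 2))
    return mean, std

def normalize(image, mean, std):
    """Normalize one (H, W, C) image, or a batch, with fixed statistics."""
    return (image.astype(np.float64) - mean) / std

# Random stand-in data with CIFAR-10's image shape; not the real dataset.
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(1000, 32, 32, 3), dtype=np.uint8)
mean, std = channel_stats(train)

# At deployment, a single image is normalized with the stored training stats;
# nothing is recomputed from the incoming image.
single = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
out = normalize(single, mean, std)
```

The statistics are computed once from the training set and saved with the model, so a lone image at deployment is treated exactly like a training image.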

yuffon · 2019-03-09T02:49:29Z

> Hi @yuffon I think we just subtract the training means and stds in all testing, this is standard in machine learning.

Yes, this is common in ML. But what if the input image is not from the 10 categories in the CIFAR-10 dataset?
If I want to do transfer learning or other tasks, how should I preprocess the data?
I think per-image standardization is more convenient.
So I replaced the whole-batch standardization with per-image standardization. The accuracy did not converge until I added a BN layer without scale and shift variables at the start of the DenseNet. I have not checked the final accuracy in different scenarios.
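For reference, per-image standardization can be sketched as below. This is an illustration, not the code used in the experiment above; the function name `per_image_standardize` is hypothetical, and the lower bound on the std follows my understanding of how `tf.image.per_image_standardization` guards against uniform images.

```python
import numpy as np

def per_image_standardize(image):
    """Scale one image to zero mean and unit variance using its own stats."""
    x = image.astype(np.float64)
    # Lower bound on the std avoids division by zero for uniform images.
    min_std = 1.0 / np.sqrt(x.size)
    return (x - x.mean()) / max(x.std(), min_std)

# Random stand-in image with CIFAR-10's shape; not real data.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
standardized = per_image_standardize(image)

# A uniform image maps to all zeros instead of producing NaNs.
flat = per_image_standardize(np.full((32, 32, 3), 128, dtype=np.uint8))
```

Here the mean and std are taken over all pixels and channels of the single image, so no dataset-level statistics are needed.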

liuzhuang13 · 2019-03-09T04:31:44Z

I don't think one should preprocess each image with its own mean and std; that causes different inputs to be changed by different amounts.

CIFAR is not a good dataset to transfer from. I think if you do transfer learning, it's better to start with ImageNet models. In that case, you still normalize with the ImageNet training data's mean and std.
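One way to read the concern about inputs being changed by different amounts is sketched below, under assumptions: two images that differ only in brightness and contrast become identical after per-image standardization, while fixed statistics keep them distinct. The values `train_mean` and `train_std` are hypothetical placeholders, not real CIFAR or ImageNet statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.6, size=(32, 32, 3))
# Same content with higher contrast and brightness.
brighter = 1.5 * image + 0.1

def per_image(x):
    """Standardize with the image's own mean and std."""
    return (x - x.mean()) / x.std()

# Hypothetical fixed statistics standing in for training-set values.
train_mean, train_std = 0.5, 0.25

def fixed(x):
    """Normalize with the same fixed statistics for every image."""
    return (x - train_mean) / train_std

same_after_per_image = np.allclose(per_image(image), per_image(brighter))
same_after_fixed = np.allclose(fixed(image), fixed(brighter))
print(same_after_per_image, same_after_fixed)  # True False
```

Whether discarding that brightness and contrast information helps or hurts depends on the task; the sketch only shows that the two schemes treat it differently.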

yuffon · 2019-03-11T00:41:46Z

TensorFlow's official ResNet model repository uses per-image standardization:
https://github.com/tensorflow/models/tree/master/official/resnet

yuffon · 2019-03-18T05:24:46Z

> I don't think one should preprocess the images by its own mean and stds, that causes different input to be changed by different amounts.
>
> CIFAR is not a good dataset to transfer from, I think if you do transfer learning it's better to start with ImageNet models. In this case, you still subtract imageNet training data's mean and stds.

If I build a new network trained on CIFAR-10 but want to test it on images outside the CIFAR-10 dataset, to observe its behavior on out-of-distribution data, how should I normalize the data?

liuzhuang13 · 2019-03-26T23:15:04Z

I think the common practice is to normalize the data the same way it was normalized during training (using the training stats).
