Origin of the means and stds used for preprocessing? #1439

Closed

pmeier opened this issue Oct 9, 2019 · 17 comments
Comments

@pmeier
Collaborator

pmeier commented Oct 9, 2019

Does anyone remember how exactly we came about the channel means and stds we use for the preprocessing?

transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

I think the first mention of the preprocessing in this repo is in #39. In that issue @soumith points to https://github.com/pytorch/examples/tree/master/imagenet for reference. If you look at the history of main.py, the commit pytorch/examples@27e2a46 first introduced the values. Unfortunately, it contains no explanation, hence my question.

Specifically, I'm seeking answers to the following questions:

  • Are these values rounded, floored, or even ceiled?
  • Did we use only the images in the training set of ImageNet or additionally the images of the validation set?
  • Did we perform any kind of resizing or cropping on each image before the calculations were performed?

I've tested some combinations and will post my results here.

| Parameters | mean | std |
| --- | --- | --- |
| train set only, no resizing / cropping | [0.4803, 0.4569, 0.4083] | [0.2806, 0.2736, 0.2877] |
| train set only, resize to 256 and center crop to 224 | [0.4845, 0.4541, 0.4025] | [0.2724, 0.2637, 0.2761] |
| train set only, center crop to 224 | [0.4701, 0.4340, 0.3832] | [0.2845, 0.2733, 0.2805] |

While the means match fairly well, the stds differ significantly.
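
For reference, dataset-wide per-channel statistics like the ones in the table can be computed with something along the lines of the following sketch. The ImageNet root path, batch size, and worker count are placeholders, and the transform corresponds to the second row:

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)
loader = DataLoader(dataset, batch_size=256, num_workers=8)

# accumulate per-channel sums over all pixels of the training set
n_pixels = 0
channel_sum = torch.zeros(3)
channel_sum_sq = torch.zeros(3)
for imgs, _ in loader:
    n_pixels += imgs.numel() // imgs.shape[1]
    channel_sum += imgs.sum(dim=(0, 2, 3))
    channel_sum_sq += (imgs ** 2).sum(dim=(0, 2, 3))

mean = channel_sum / n_pixels
std = (channel_sum_sq / n_pixels - mean ** 2).sqrt()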


Update:

The process for obtaining the values of mean and std was roughly equivalent to the following, but the concrete subset that was used is lost:

import torch
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.PILToTensor(), T.ConvertImageDtype(torch.float)])
dataset = datasets.ImageNet(".", split="train", transform=transform)

# `subset` stands for the unknown random subset of the training images;
# this is the part of the original procedure that is lost.
means = []
stds = []
for img, _ in subset(dataset):
    # per-image, per-channel statistics
    means.append(img.mean(dim=(1, 2)))
    stds.append(img.std(dim=(1, 2)))

# average the per-image statistics over the subset
mean = torch.stack(means).mean(dim=0)
std = torch.stack(stds).mean(dim=0)

See #1965 for the reproduction experiments.

@nizhib

nizhib commented Oct 10, 2019

You need to go deeper ;)

https://github.com/facebook/fb.resnet.torch/blob/master/datasets/imagenet.lua

-- Computed from random subset of ImageNet training images
local meanstd = {
   mean = { 0.485, 0.456, 0.406 },
   std = { 0.229, 0.224, 0.225 },
}

@pmeier
Collaborator Author

pmeier commented Oct 11, 2019

For my project I need to know the covariances between the channels. Since they are not part of the current implementation, my hope was that I could calculate them myself if I knew the images and preprocessing that were used. Unfortunately

random subset

gives me little hope that I'll be able to do that. I suppose no one remembers how this random subset was selected?

Should we investigate this further? I'm a little anxious that we simply use this normalization for all our models without being able to reproduce it.

@fmassa
Member

fmassa commented Oct 14, 2019

@colesbury do you have more information here to clarify the mean / std for ImageNet that we use?

@soumith
Member

soumith commented Oct 14, 2019

afaik we calculated the mean / std to use by running one pass over the training set of ImageNet

@soumith
Member

soumith commented Oct 14, 2019

that being said, I see that the std is not matching. Possibly a bug of the past or some detail that we completely forgot about :-/

@apple2373

Can we put a batch normalization layer before the input so that the mean / std are computed automatically at training time?
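
A minimal sketch of that idea, prepending a BatchNorm2d over the three input channels to an off-the-shelf model (the choice of resnet18 is only illustrative):

import torch
from torch import nn
from torchvision import models

# learn (running estimates of) the input statistics instead of using fixed constants
model = nn.Sequential(
    nn.BatchNorm2d(3),
    models.resnet18(),
)

x = torch.rand(2, 3, 224, 224)  # images in [0, 1], no Normalize transform applied
print(model(x).shape)           # torch.Size([2, 1000])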

@pmeier
Collaborator Author

pmeier commented Oct 21, 2019

@apple2373 We are currently implementing the transforms for tensors in order to be able to use them within a model (see #1375). Whether we want to include them within the models is AFAIK still up for discussion (see #782).
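
For illustration, normalization baked into a model could look roughly like the sketch below; the InputNormalization module is hypothetical and not part of torchvision:

import torch
from torch import nn

class InputNormalization(nn.Module):
    # hypothetical module applying the fixed mean / std to a batch of tensors
    def __init__(self, mean, std):
        super().__init__()
        self.register_buffer("mean", torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std

normalize = InputNormalization(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
x = torch.rand(2, 3, 224, 224)
print(normalize(x).shape)  # torch.Size([2, 3, 224, 224])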

@pmeier
Collaborator Author

pmeier commented Oct 21, 2019

@fmassa @soumith

Any update on this? Do we investigate further or keep it as is?

@fmassa
Member

fmassa commented Oct 21, 2019

@pmeier I don't know if we will ever be able to get back those numbers, given that they seem to have been computed on a randomly-sampled part of the dataset.

If we really want to see if this has any impact, we would need to run multiple end-to-end trainings with the new mean / std and see if they bring any noticeable improvement.

@pmeier
Collaborator Author

pmeier commented Oct 21, 2019

I don't think we would get a significant improvement (or decline) in performance. I just think we shouldn't use numbers that are not reproducible. A change like this is of course a lot of work, BC-breaking, etc., but we don't know what the future brings. Maybe this becomes significant in the future, and then it's even harder to correct.

@fmassa
Member

fmassa commented Oct 21, 2019

I don't think we would get a significant improvement (or decline) in performance. I just think we shouldn't use numbers that are not reproducible.

I agree. But given the scale of how things would break with such a change, I think we should just live with it for now, and maybe document somewhere the findings you have shown here.

@colesbury
Member

It's been almost four years, so I don't remember, but I probably just used the mean / std from the previous Lua ImageNet training script:

https://github.com/soumith/imagenet-multiGPU.torch/blob/deb5466a16e54ec7a69fe027e5fbcd3c1bfb49cc/donkey.lua#L161-L187

It uses the average standard deviation of an individual image's channels instead of an estimate of the standard deviation across the entire dataset.
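
A small illustration of the difference between the two estimators, with random data standing in for real images:

import torch

imgs = torch.rand(100, 3, 32, 32)  # fake "dataset", purely illustrative

# (a) average of per-image, per-channel standard deviations
per_image_std = imgs.reshape(100, 3, -1).std(dim=2).mean(dim=0)

# (b) standard deviation over all pixels of the whole dataset
dataset_std = imgs.transpose(0, 1).reshape(3, -1).std(dim=1)

print(per_image_std, dataset_std)

For real images (b) comes out larger than (a), because it also captures the variation of the per-image means across the dataset, which is consistent with the std mismatch reported above.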

I don't think we should change the mean / std, nor do I see any reproducibility issue. The scientific result here is the neural network, not the mean / std values, especially since the exact choice does not matter as long as the values approximately whiten the input.

@nizhib

nizhib commented Oct 23, 2019

A change like this is of course a lot of work, BC breaking etc, but we don't know what the future brings.

These numbers have become a standard for most neural networks created so far, so it's not just a lot of work: one would need to retrain hundreds of neural networks (approx. 2 GPU-weeks each for a model like ResNet-50) and create pull requests for all the pretrainedmodels / DPN / Wide ResNet / etc. repos all over GitHub, just to adjust the normalizing std by 0.05. What future could justify this?

@fmassa
Member

fmassa commented Oct 25, 2019

Following the discussion that we had here, I agree with @colesbury's and @nizhib's points above.

@pmeier would you like to send a PR adding some summary of the discussion that we had here, including @colesbury's comment on how those numbers were obtained?

@pmeier
Collaborator Author

pmeier commented Oct 28, 2019

I'm fully booked for the next few weeks, so this will take some time.

@Stannislav

Maybe the reason the stds don't match is that torch.std was originally called with unbiased=False?
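
For context, the two variants only differ by the Bessel correction (illustrative values only):

import torch

x = torch.rand(224 * 224)        # one channel's worth of pixels
print(x.std(unbiased=True))      # sample std, divides by n - 1 (the default)
print(x.std(unbiased=False))     # population std, divides by n
# the two differ by a factor of sqrt(n / (n - 1)), which is tiny for n = 224 * 224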

@pmeier
Collaborator Author

pmeier commented Jul 7, 2020

@Stannislav In #1965 I've managed to get pretty close to the original numbers.
