
Subtract mean image or pixel in both training and inference #169

Closed
lukeyeager opened this issue Jul 16, 2015 · 7 comments

@lukeyeager
Member

Thanks to Yuguang Lee for pointing this out to me on digits-users.

Problem

Currently, DIGITS subtracts the mean image during training (see here) and the mean pixel during inference (see here). That's both illogical and incorrect [but still works pretty well, which is why I hadn't noticed it].

Explanation

I moved to subtracting the mean pixel during inference because it simplifies the inference path. I'll give an example to explain:

Let's say you have 256x256 images in your dataset and your crop size is 227. During training, Caffe first subtracts the 256x256 mean image from the input image, and then takes a random 227x227 crop of the image before passing it to your network. But at inference time, the network only knows that images are supposed to be 227x227. So DIGITS resizes your test image to 227x227 and then subtracts the mean pixel across the whole image before passing it to the network.
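The mismatch can be sketched in plain NumPy (synthetic arrays stand in for the dataset image and the mean file; `center_crop` is a hypothetical helper, and a center crop stands in for Caffe's random training crop):

```python
import numpy as np

def center_crop(img, size):
    """Take a centered size x size crop from an HxWxC array."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

rng = np.random.default_rng(0)
image = rng.uniform(0, 255, (256, 256, 3))         # stand-in for a dataset image
mean_image = rng.uniform(100, 150, (256, 256, 3))  # stand-in for the 256x256 mean

# Training path: subtract the full 256x256 mean image, then crop to 227x227.
train_input = center_crop(image - mean_image, 227)

# Old DIGITS inference path: crop/resize to 227x227 first, then subtract a
# single per-channel mean pixel.
mean_pixel = mean_image.mean(axis=(0, 1))
infer_input = center_crop(image, 227) - mean_pixel

# The two paths disagree pixel-by-pixel, which is the hybrid described above.
print(np.abs(train_input - infer_input).max() > 0)  # → True
```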

Solution

DIGITS should allow you to do either mean pixel subtraction or mean image subtraction, but it shouldn't do this strange hybrid.

Mean pixel

  1. For training - provide transform_param.mean_value (see here).
  2. For inference - resize test image to crop size and then subtract the mean pixel. DIGITS already does this.

Mean image

  1. For training - provide transform_param.mean_file (see here). DIGITS already does this.
  2. For inference - resize test image to the mean file size, subtract mean, resize again to the crop size.
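A sketch of that mean-image inference path in NumPy (constant arrays for clarity; `nn_resize` is a hypothetical nearest-neighbor stand-in for whatever interpolating resize DIGITS actually uses):

```python
import numpy as np

def nn_resize(img, out_h, out_w):
    """Nearest-neighbor resize of an HxWxC array (stand-in for a real resize)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]

mean_image = np.full((256, 256, 3), 120.0)  # hypothetical 256x256 mean file
test_image = np.full((480, 640, 3), 200.0)  # arbitrary-size test image

# 1. Resize the test image to the mean file size.
resized = nn_resize(test_image, 256, 256)
# 2. Subtract the full mean image.
centered = resized - mean_image
# 3. Resize again to the network's crop size.
net_input = nn_resize(centered, 227, 227)
print(net_input.shape)  # → (227, 227, 3)
```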
@lukeyeager lukeyeager added the bug label Jul 16, 2015
lukeyeager added a commit that referenced this issue Jul 17, 2015
In the future, DIGITS should allow the user to subtract EITHER the mean
image OR the mean pixel. For now, let's at least be consistent.
@lukeyeager
Member Author

Temporary fix for master in 3ea12e6 and for digits-2.0 in 7143260.

Still want to enable selection of either option in the UI.

@andredubbel

Hi @lukeyeager

Not sure if this is the correct way of bringing this up, but I stumbled upon this while trying to figure out why my pycaffe implementation gives different output than DIGITS at test time, and I had some thoughts I wanted to share:

Since the test image is being resized without cropping, I guess the assumption is that the test image has already been cropped to the proper scale. Wouldn't it then be more correct to crop the mean image before subtraction rather than resizing it and thus changing the scale of whatever "mean object" it depicts?

Probably won't have a very big impact either way but feels like this way should be closer to what happens during training.
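For what it's worth, the difference between the two treatments of the mean shows up in a few lines of NumPy (a gradient array stands in for the mean file; the nearest-neighbor index trick is a stand-in for a real resize):

```python
import numpy as np

mean_image = np.arange(256 * 256 * 3, dtype=float).reshape(256, 256, 3)

# Cropping the mean preserves the scale of whatever "mean object" it depicts.
top = (256 - 227) // 2
cropped_mean = mean_image[top:top + 227, top:top + 227]

# Resizing the mean rescales its content to 227x227 instead.
idx = np.arange(227) * 256 // 227
resized_mean = mean_image[idx][:, idx]

print(cropped_mean.shape == resized_mean.shape)  # → True
print(np.allclose(cropped_mean, resized_mean))   # → False
```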

@lukeyeager
Member Author

Good question @andredubbel. I dug into it some more and learned a little more:

Training with caffe -train:

Using caffe.io.Transformer:

  • The mean that you provide must already be cropped
    • Again, not an issue for this discussion, but what do you do if you want to train with pycaffe? It seems you wouldn't be able to replicate the command-line results exactly without random crop windows for the mean image.

In order to match the VAL phase of training exactly when we test an image in DIGITS, we would need to do the following:

  1. Crop the mean according to crop_size
  2. Initialize the Transformer object with the cropped mean
  3. Resize test images to the original dataset size (i.e. 256x256)
  4. Crop test image to crop size (i.e. 227x227)
    • Can't let the Transformer do the 256->227 conversion because it resizes instead of crops
  5. Use the Transformer to preprocess the image
    • Subtract the mean, etc.
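Ignoring the Transformer object itself, the arithmetic of those five steps can be sketched in NumPy (sizes follow the 256/227 example; the helpers are hypothetical, and nearest-neighbor resizing stands in for real interpolation):

```python
import numpy as np

def center_crop(img, size):
    """Centered size x size crop of an HxWxC array (Caffe's VAL-phase crop)."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def nn_resize(img, out_h, out_w):
    """Nearest-neighbor resize (stand-in for a real resize)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]

mean_image = np.random.default_rng(1).uniform(100, 150, (256, 256, 3))
test_image = np.random.default_rng(2).uniform(0, 255, (480, 640, 3))

# 1. Crop the mean according to crop_size.
cropped_mean = center_crop(mean_image, 227)
# 2. (A real implementation would initialize the Transformer with this mean.)
# 3. Resize the test image to the original dataset size.
resized = nn_resize(test_image, 256, 256)
# 4. Crop the test image to crop size, so the Transformer's resize is a noop.
cropped = center_crop(resized, 227)
# 5. Preprocess: subtract the cropped mean.
net_input = cropped - cropped_mean
print(net_input.shape)  # → (227, 227, 3)
```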

Here's what we're currently doing instead:

  1. Resize mean to crop_size
  2. Initialize the Transformer with the resized mean
  3. Resize test image to the original dataset size
  4. Use the Transformer to preprocess the image
    • Resize to crop_size
    • Subtract the resized mean

Things we would need to change:

  1. Crop the mean image instead of resizing to crop_size
  2. Crop images before passing them to Transformer.preprocess (then the resize within Transformer will be a noop)

Those seem like pretty reasonable changes to me.

@andredubbel

Wow, thanks for a very thorough response @lukeyeager

Things we would need to change:

  1. Crop the mean image instead of resizing to crop_size
  2. Crop images before passing them to Transformer (then the resize within Transformer will be a noop)

I definitely agree with point 2, point 1 I think is a question of preference if not correctness.

I tend to think of the training data's crop size as its "actual" size, and of the input size as that size plus padding. The question then is whether you expect test images to have the same padding or not.

When deploying the model you most likely won't be doing any padding and if you wanted DIGITS to match a deployed system you might be better off just sticking to the squashing. That is, unless you want to have the option of oversampling in which case some padding might be needed :)

In the end I think I would be happy with either solution, as long as it's clear what's expected of the test images.

@willishf

Looks like I have run into the mean pixel vs. mean image problem while comparing results from the DIGITS 5.0 interface to the method used in example.py. It appears to have a significant impact, and searching the forum I can only find a discussion with no solution.

I have already trained a model with mean image subtraction, which was the default, and it took 3 days to train. Testing the model through the DIGITS web interface, I am very happy with the results: for the primary target I want to identify there are minimal false positives on the other test images and near-perfect results on the primary image. I'm working toward an April 1st deadline for a cancer tumor classification challenge, so I don't have time to do anything significantly different at this point.

The images I am submitting to example.py are already 256x256, so they don't need to be resized. The images I submitted to Caffe for training were also 256x256, so resizing shouldn't be the source of the problem.

I am passing the mean file on the command line via --mean mean.binaryproto, which triggers the `if mean_file` branch in get_transformer(deploy_file, mean_file=None).

The comment there clearly indicates `# set mean pixel` as the method. Looking through the code (I have no familiarity with the API) I don't see an obvious path to subtracting the mean image, though I assume it should be a trivial change.

Is it possible that someone who is familiar with the API could look at the code and provide the changes required to do mean image subtraction instead of mean pixel subtraction?

The model that looks very good through the DIGITS web interface is not performing well on test images via the example.py approach. I already have a pipeline tested with example.py over many, many images, so I'm hoping a one- or two-line code change will improve the results.

@karimhasebou

@willishf were you able to fix the problem? I have a model that was trained in DIGITS and gets 96% validation accuracy in DIGITS. Once I load it into pycaffe, it gets 10% on the same validation data.

@willishf

willishf commented Aug 30, 2017 via email
