
image.scale vs OpenCV resize #188

Closed · achalddave opened this issue Sep 6, 2016 · 9 comments

@achalddave
It seems that OpenCV's and this library's resize methods give different results, even though both use bilinear interpolation by default. It may be useful to document this discrepancy if it is expected.

I posted code and the example images in this Gist: https://gist.github.com/achalddave/d9e7a6416996c648c6e75355e3f87df1

@soumith (Member) commented Sep 6, 2016

@achalddave it looks like one of them uses "floor" to determine some sizes, and the other uses "ceil".

@soumith (Member) commented Sep 6, 2016

I don't think we explicitly tried to match OpenCV, so this is probably expected.
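Two implementations can both be "bilinear" and still disagree: besides floor vs. ceil on output sizes, they may map output pixels to source coordinates with different conventions (half-pixel centers vs. aligned corners). The following is a minimal 1-D numpy sketch, not OpenCV's or Torch's actual code, illustrating how the two conventions produce different samples from the same input:

```python
import numpy as np

def resample_1d(src, new_len, half_pixel=True):
    """Bilinear 1-D resample under two common coordinate conventions.

    half_pixel=True maps output pixel *centers* to input centers;
    half_pixel=False aligns the first and last samples ("align corners").
    Both are bilinear interpolation, yet they disagree in general.
    """
    src = np.asarray(src, dtype=float)
    n = len(src)
    out = np.empty(new_len)
    for i in range(new_len):
        if half_pixel:
            x = (i + 0.5) * (n / new_len) - 0.5
        else:
            x = i * (n - 1) / (new_len - 1) if new_len > 1 else 0.0
        x = min(max(x, 0.0), n - 1)          # clamp at the borders
        lo = int(np.floor(x))
        hi = min(lo + 1, n - 1)
        t = x - lo
        out[i] = (1 - t) * src[lo] + t * src[hi]
    return out

row = [0.0, 10.0, 20.0, 30.0]
a = resample_1d(row, 3, half_pixel=True)   # [1.667, 15.0, 28.333]
b = resample_1d(row, 3, half_pixel=False)  # [0.0, 15.0, 30.0]
print(a)
print(b)
```

The endpoints differ even on this tiny row, which is exactly the kind of per-pixel discrepancy that accumulates across a full image.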

@achalddave (Author) commented Sep 6, 2016

👍

It seemed to throw off the relative rankings of class labels from a pretrained network, so I figured I'd check. I suppose anyone trying to replicate a Caffe model's results exactly can always use the torch opencv library.

@soumith (Member) commented Sep 6, 2016

was it a HUGE difference, or was it minor? :)

@achalddave (Author)

I want to say it's minor. I ran the network on images loaded using OpenCV and using Torch. In both cases, I used the image library's 'crop' function.

The top 10 categories look relatively different, but their scores are all fairly low (<0.08 [not shown in output below]), so that might be why.

Loading frames
Avg opencv image prediction 0.0020533881613644
Avg torch image prediction 0.0020533880204356
Max absolute difference 0.058509038761258
Sum of squared errors   0.0074829880344148
Top 10 OpenCV labels
 163
 359
 429
 392
 144
 364
 409
 287
 205
  46
[torch.LongTensor of size 10]

Top 10 Torch labels
 403
 364
 409
  46
 428
 287
 392
 359
 144
 429
[torch.LongTensor of size 10]

Top 10 OpenCV scores
0.01 *
 7.1058
 5.7152
 3.6673
 3.5534
 3.1041
 2.7137
 2.6046
 2.4163
 2.1523
 2.1312
[torch.DoubleTensor of size 10]

Top 10 Torch scores
0.01 *
 4.3588
 4.0891
 3.5908
 2.9616
 2.9594
 2.9134
 2.4286
 2.4174
 1.6333
 1.4935
[torch.DoubleTensor of size 10]
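The numbers above show why the rankings shuffle: the maximum absolute score difference (~0.058) is small, but the top scores themselves are all under 0.08, so tiny perturbations are enough to reorder nearly tied classes. A small sketch with hypothetical scores (not the actual values from this run) makes the effect concrete:

```python
import numpy as np

# Hypothetical class scores from two preprocessing pipelines. The
# perturbation is tiny, but because several classes score nearly the
# same, the descending-order ranking changes.
opencv_scores = np.array([0.071, 0.057, 0.0565, 0.036, 0.031])
torch_scores = opencv_scores + np.array([0.0, -0.002, 0.002, 0.0, 0.0])

print("max abs difference:", np.abs(opencv_scores - torch_scores).max())
print("OpenCV ranking:", np.argsort(-opencv_scores))  # [0 1 2 3 4]
print("Torch  ranking:", np.argsort(-torch_scores))   # [0 2 1 3 4]
```

A 0.002 shift, far smaller than the 0.058 observed here, already swaps two ranks when the underlying scores are within 0.001 of each other.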

@zhangxiangxiao (Contributor)

Adversarial sample by size flooring :P

@senthilps8

This might be more relevant now.
I loaded a Torch-trained model into PyTorch using load_lua, and the results are drastically different. Specifically, this is a model for inpainting.
With cv2.resize:
[image attachment: predicted_cv2]

With image.scale(), torch.save() and load_lua():
[image attachment: predicted]

@soumith (Member) commented May 2, 2017

@senthilps8 if you look at the image after cv2.resize, you might see that it is scaled very differently than the image out of image.scale.

Just do this for both images:

print(img.mean(), img.std(), img.min(), img.max())

Try renormalizing the image out of cv2.resize to have the same range as the one out of image.scale.
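The suggested renormalization can be done with a simple linear rescale. Here is a minimal numpy sketch (the helper name `match_range` is made up for illustration) that maps one image's [min, max] onto another's:

```python
import numpy as np

def match_range(img, ref):
    """Linearly rescale `img` so its [min, max] matches `ref`'s.

    Both inputs are float arrays, e.g. images already mapped to
    roughly [-1, 1] as in the snippets in this thread.
    """
    img = np.asarray(img, dtype=float)
    ref = np.asarray(ref, dtype=float)
    span = img.max() - img.min()
    if span == 0:
        return np.full_like(img, ref.min())  # degenerate: constant image
    unit = (img - img.min()) / span          # map to [0, 1]
    return unit * (ref.max() - ref.min()) + ref.min()

a = np.array([-0.976, 0.0, 1.0])    # cv2.resize-like range
b = np.array([-0.960, 0.1, 0.997])  # image.scale-like range
out = match_range(a, b)
print(out.min(), out.max())  # now equals b's min and max
```

Note this only aligns the ranges; it will not undo differences in the interpolation itself, so it is a diagnostic step rather than a full fix.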

@senthilps8 commented May 2, 2017

@soumith Are you referring to the difference between Torch images being in [0, 1] and cv2 images being in [0, 255]? If so, I forgot to mention that I divide the cv2 images by 255.0 to normalize them. If not, is there a normalization step in image.scale that I'm not aware of? Here's a snippet of the preprocessing I use for cv2 (I've also tried all the other interpolation modes).

    iminput = cv2.imread(imPath)
    iminput = cv2.resize(iminput, (inputSize, inputSize), interpolation=cv2.INTER_LINEAR)
    iminput = iminput.swapaxes(0, 2).swapaxes(1, 2)  # HWC -> CHW
    iminput = (iminput / 255.0) * 2.0 - 1            # scale to [-1, 1]
    iminput = iminput[::-1, :, :]                    # BGR -> RGB channel flip
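The axis gymnastics in that snippet are easy to get wrong, so it may help to verify them in isolation. This sketch uses plain numpy (no cv2 needed) to confirm that the swapaxes pair is the usual HWC-to-CHW transpose and that reversing the leading axis flips the channel order:

```python
import numpy as np

# A tiny fake image: H=2, W=3, C=3, with distinct values per element.
hwc = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# swapaxes(0, 2).swapaxes(1, 2) is equivalent to transpose(2, 0, 1):
# both place element [c, h, w] = hwc[h, w, c].
chw = hwc.swapaxes(0, 2).swapaxes(1, 2)
assert np.array_equal(chw, hwc.transpose(2, 0, 1))

# Reversing the leading (channel) axis swaps BGR <-> RGB.
rgb = chw[::-1, :, :]
assert np.array_equal(rgb[0], chw[2])
assert np.array_equal(rgb[2], chw[0])

print(chw.shape)  # (3, 2, 3): channels first
```

So the preprocessing itself is consistent, which points back at the resize step as the source of the remaining discrepancy.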

Also, I did try printing out the min, max, etc. as you suggested.
With cv2.resize:
('Image: Min, Max, Mean, Stdv: ', -0.9764705896377563, 1.0, -0.0761005503667415, 0.41822466298177663)
With lua image:
('Image: Min, Max, Mean, Stdv: ', -0.9600126147270203, 0.9971910715103149, -0.08448679641393635, 0.4320645987985845)
