Two identical image diff significant #41

Closed
chekun opened this issue Sep 2, 2016 · 10 comments
@chekun

chekun commented Sep 2, 2016

Hi, I have two identical images, but the algorithm outputs two different signatures.

First image:

test01

Second image:

test02

Code below:

from image_match.goldberg import ImageSignature
gis = ImageSignature()
test01 = gis.generate_signature('test01.jpg')
test02 = gis.generate_signature('test02.jpg')
gis.normalized_distance(test01, test02)

which outputs

0.70823708184882128

Is that right?

@rhsimplex
Owner

Nope, it should return 0.0. I'll try to replicate this Monday and see what's going on.

Thanks for reporting the issue.

@rhsimplex rhsimplex added the bug label Sep 3, 2016
@rhsimplex
Owner

OK, I only get a distance of 0.09, which should be under the threshold for matching. However, these images should be identical.

In [1]: from image_match.goldberg import ImageSignature

In [2]: gis = ImageSignature()

In [3]: path1 = 'https://cloud.githubusercontent.com/assets/1967804/18188884/53e77c7a-70e8-11e6-8e27-98c196f1e242.jpg'

In [4]: path2 = 'https://cloud.githubusercontent.com/assets/1967804/18188887/5bbba7aa-70e8-11e6-93b9-3e881bc03e66.jpg'

In [5]: sig1 = gis.generate_signature(path1)

In [6]: sig2 = gis.generate_signature(path2)

In [7]: gis.normalized_distance(sig1, sig2)
Out[7]: 0.094653959692538772

What's going on? I think one of these images is greyscale, and one is color.

In [8]: from skimage.io import imread

In [9]: imread(path1).shape
Out[9]: (1334, 750, 3)

In [10]: imread(path2).shape
Out[10]: (1334, 750)

So skimage (which image-match uses to load images) converts a color image to greyscale floats in the range 0 to 1, but if the image is already 8-bit greyscale, it leaves it as uint8 in the range 0 to 255!

In [12]: imread(path1, as_grey=True)
Out[12]: 
array([[ 1.,  1.,  1., ...,  1.,  1.,  1.],
       [ 1.,  1.,  1., ...,  1.,  1.,  1.],
       [ 1.,  1.,  1., ...,  1.,  1.,  1.],
       ..., 
       [ 1.,  1.,  1., ...,  1.,  1.,  1.],
       [ 1.,  1.,  1., ...,  1.,  1.,  1.],
       [ 1.,  1.,  1., ...,  1.,  1.,  1.]])

In [13]: imread(path2, as_grey=True)
Out[13]: 
array([[255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       ..., 
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)

Hence the slightly different signatures. The correct fix would be to get rid of the skimage dependency and use PIL directly. A quick fix might be to detect whether an image is uint8 and, if so, divide it by 255.
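
For reference, the rescaling step of that quick fix could look roughly like the sketch below. The helper name `to_unit_range` is hypothetical and not part of image-match or skimage; it assumes the image is already loaded as a NumPy array and that any non-uint8 input is already a float image in the 0 to 1 range. It only illustrates the rescaling, not a verified fix for the distance.

```python
import numpy as np


def to_unit_range(image):
    """Return the image as float64, rescaling 8-bit data to the 0..1 range.

    Hypothetical helper for illustration; not part of image-match.
    """
    image = np.asarray(image)
    if image.dtype == np.uint8:
        # 8-bit images span 0..255, so divide to match float images.
        return image.astype(np.float64) / 255.0
    # ASSUMPTION: anything else is already a float image in 0..1.
    return image.astype(np.float64)


# Synthetic stand-ins for the two arrays shown above (all-white pixels).
grey_uint8 = np.full((2, 3), 255, dtype=np.uint8)
grey_float = np.ones((2, 3))

print(np.allclose(to_unit_range(grey_uint8), to_unit_range(grey_float)))
```

After rescaling, both arrays share the same dtype and value range, so any remaining difference would come from the pixel data itself.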

@rhsimplex
Owner

Interestingly, when I try dividing by 255, I get the same distance value as you: 0.70823708184882128. Maybe something has changed with skimage. Can you tell me the output of:

import skimage
print(skimage.__version__)

?

I'm using version 0.12.3

@chekun
Author

chekun commented Sep 6, 2016

@rhsimplex , Thank you for your reply.

I use the pavlov/match Docker image, pavlov/match@9b7df7ecc867,

and the skimage version is exactly 0.12.3.

Would using PIL directly fix the problem? Is there any plan to replace skimage?

@rhsimplex
Owner

rhsimplex commented Sep 6, 2016

If you're using pavlov's match, then the real problem is that it's using an out-of-date image-match build. Their Dockerfile has the line:

pip install git+https://github.com/ascribe/image-match.git@0.2.1

And we're at version 1+ now. My colleague @vrde has actually ported match to use the latest version of image-match (which now uses Python 3); you can find that fork here: https://github.com/vrde/match. I'll ask him to make a PR against the original repository.

Sorry about the confusion, but see if using this version helps. If so, I probably won't make any changes because 0.09 is well below the threshold of what should be considered a match.
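
To make the threshold reasoning concrete, here is a minimal sketch of the comparison. The cutoff value and the `is_match` helper are assumptions for illustration only: this thread does not state the actual threshold, so check the image-match documentation for the real value before relying on it.

```python
# ASSUMPTION: 0.4 is a placeholder cutoff; the thread does not give the real one.
ASSUMED_THRESHOLD = 0.4


def is_match(distance, threshold=ASSUMED_THRESHOLD):
    """Treat two signatures as the same image when the distance is below the cutoff."""
    return distance < threshold


print(is_match(0.094653959692538772))  # distance measured with the current image-match
print(is_match(0.70823708184882128))   # distance reported from the outdated build
```

Under that assumed cutoff, the 0.09 result counts as a match and the 0.71 result does not.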

@chekun
Author

chekun commented Sep 6, 2016

@rhsimplex Good news, I will use @vrde's fork and see if it works out. Thank you very much. 👍

@rhsimplex
Owner

You're welcome. Please let me know if it fixes your problem.

@chekun
Author

chekun commented Sep 6, 2016

@rhsimplex vrde's fork fixed my problem, issue solved 😃

@chekun chekun closed this as completed Sep 6, 2016
@rhsimplex
Owner

There is now a PR, dsys/match#8, on Pavlov's match. When they merge it, you can pull from there directly =)

@chekun
Author

chekun commented Sep 8, 2016

@rhsimplex Nice, love you guys.
