# [TLDR] convert images to greyscale and/or blur them, then compare them with the `psnr` metric from `image-similarity-measures`

# [LONGER VERSION]

- in this notebook I test the image distance metrics available in `image_similarity_measures`, considering
  - how well each one finds similar images (judged by eyeballing the results)
  - how fast each one is to compute (judged by timing large requests)

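- re timing (seconds are taken from the runs recorded in the timing cells further down), some metrics
  - take far too long to compute (`fsim` at about 1092 s; `uiq` has no recorded time)
  - are much faster (`issm` at about 44 s and `ssim` at about 38 s)
  - are faster still (`psnr`, `rmse`, `sam` and `sre`, each between about 11 s and 17 s)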
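
The timing cells in this notebook repeat the same `st=time.time()` / `et=time.time()` pattern for every measure. As a minimal sketch, the pattern could be wrapped in a small context manager. The names `timed` and `timings` are hypothetical and not part of `ImageSearch`, and `time.perf_counter` is swapped in for `time.time` because it is meant for measuring intervals. The summed squares are only a stand-in for a similarity call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent inside the `with` block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("sum_squares", timings):
    # Stand-in workload; in the notebook this would be a similarity call.
    total = sum(i * i for i in range(100_000))

# Print the recorded timings, fastest first.
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{label}: {seconds:.3f}s")
```

Collecting the results in one dictionary makes it easy to sort the measures by cost instead of reading the numbers off separate `print` calls.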

- looking at the results of these metrics, I noticed that
    1. some products are listed multiple times (potential solution: drop duplicate listings)
    2. some images differ only in colour (potential solution: compare the images in greyscale)
    3. some images differ only in image markups (potential solution: compare blurred images)
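
To make the greyscale and blur ideas concrete, here is a minimal NumPy sketch of both preprocessing steps. It is an illustration only: `to_grayscale` and `box_blur` are hypothetical helpers, and the way `ImageSearch` actually implements its `grayscale` and `blur` options is not shown in this notebook and may differ (for example by using OpenCV or a Gaussian kernel).

```python
import numpy as np


def to_grayscale(img):
    """Collapse an H x W x 3 RGB array to H x W using ITU-R BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return img[..., :3].astype(np.float64) @ weights


def box_blur(gray, k=3):
    """Average each pixel with its k x k neighbourhood (k odd, edges replicated)."""
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    out = np.zeros_like(gray, dtype=np.float64)
    # Sum the k * k shifted copies of the image, then divide to get the mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return out / (k * k)


# A one-pixel "markup" on an otherwise black image.
rgb = np.zeros((5, 5, 3), dtype=np.uint8)
rgb[2, 2] = (255, 255, 255)

gray = to_grayscale(rgb)
blurred = box_blur(gray)
print(gray.max(), blurred.max())
```

The blur spreads the single bright pixel over its neighbourhood, so a small markup contributes much less to a pixel-wise difference than it does on the unblurred image.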

- no metric seems to be significantly better, so I settle on `psnr` because
  - it is among the fastest of the choices available
  - its results are similar to those of the other measures
  - it is the package default
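
For reference, the textbook definition of PSNR is short enough to sketch in NumPy. This is not the package's implementation (which may handle channels and the peak value differently); the function below and its `max_value=255.0` default, which assumes 8-bit images, are illustrative assumptions. Under this definition PSNR equals `20 * log10(max_value / rmse)`, so it ranks images in exactly the reverse order of RMSE, which is consistent with the two measures giving similar results.

```python
import numpy as np


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means the images are closer."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 20 * np.log10(max_value) - 10 * np.log10(mse)


reference = np.full((4, 4), 100, dtype=np.uint8)
near = reference + 5    # small uniform error
far = reference + 50    # larger uniform error

print(psnr(reference, near), psnr(reference, far))
```

The image with the smaller error gets the higher score, which is why the top-10 cell for `psnr` sorts in descending order while the one for `rmse` sorts in ascending order.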
  


# IMPORTS

In [None]:
%run ipynb_setup.ipynb

In [None]:
%run class_ImageSearch.ipynb

In [None]:
import time

# time each type of similarity measure

In [None]:
###############################################################################
# initialize
###############################################################################
im_search=ImageSearch(dataset=Dataset())

In [None]:
'''
###############################################################################
# time each measure individually
###############################################################################
st=time.time()
sim_raw_do_fsim = im_search.img_similarity_tgt_locs(loc=3,do_fsim=True) # 1091.9723105430603
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_issm = im_search.img_similarity_tgt_locs(loc=3,do_issm=True) # 43.62554740905762
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_psnr = im_search.img_similarity_tgt_locs(loc=3,do_psnr=True) # 10.773506164550781
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_rmse = im_search.img_similarity_tgt_locs(loc=3,do_rmse=True) # 14.554333209991455
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_sam = im_search.img_similarity_tgt_locs(loc=3,do_sam=True) # 16.528496980667114
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_sre = im_search.img_similarity_tgt_locs(loc=3,do_sre=True) # 12.952892541885376
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_ssim = im_search.img_similarity_tgt_locs(loc=3,do_ssim=True) # 38.041083097457886
et=time.time()
print(et-st)

st=time.time()
sim_raw_do_uiq = im_search.img_similarity_tgt_locs(loc=3,do_uiq=True)
et=time.time()
print(et-st)
'''
None

# time the faster measures on raw, blurred and greyscale images

In [None]:
###############################################################################
# time the faster measures on raw, blurred and greyscale images
###############################################################################
st=time.time()
sim_raw = im_search.img_similarity_tgt_locs(src_loc=3) # 83.10340213775635
et=time.time()
print(et-st)

st=time.time()
sim_blur = im_search.img_similarity_tgt_locs(src_loc=3,blur=True) # 80.66669917106628
et=time.time()
print(et-st)

st=time.time()
sim_grayscale = im_search.img_similarity_tgt_locs(src_loc=3,grayscale=True,plot_tgt=True) # 113.59069466590881
et=time.time()
print(et-st)

st=time.time()
sim_grayscale_blur = im_search.img_similarity_tgt_locs(src_loc=3,grayscale=True,blur=True) # 74.6272337436676
et=time.time()
print(et-st)

# plot `IMG_SIMILARITY_DF`s

In [None]:
im_search.img_similarity_plot(sim_raw)

In [None]:
im_search.img_similarity_plot(sim_blur)

In [None]:
im_search.img_similarity_plot(sim_grayscale)

In [None]:
im_search.img_similarity_plot(sim_grayscale_blur)

# look at top 10 chosen by each measure

In [None]:
[im_search.dataset.get_product_picture(loc=x) for x in sim_grayscale_blur['psnr'].sort_values(ascending=False)[:10].index];

In [None]:
[im_search.dataset.get_product_picture(loc=x) for x in sim_grayscale_blur['rmse'].sort_values(ascending=True)[:10].index];

In [None]:
[im_search.dataset.get_product_picture(loc=x) for x in sim_grayscale_blur['sam'].sort_values(ascending=False)[:10].index];

In [None]:
[im_search.dataset.get_product_picture(loc=x) for x in sim_grayscale_blur['sre'].sort_values(ascending=False)[:10].index];

In [None]:
[im_search.dataset.get_product_picture(loc=x) for x in sim_grayscale_blur['ssim'].sort_values(ascending=False)[:10].index];

# try using all measures at the same time
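
The cells in this section rank every image under each measure and keep, per image, the best (lowest) rank it achieves under any measure. The toy example below sketches that idea with made-up scores for four images and only two measures. The data and the names `scores`, `ranks`, `best_rank` and `top` are illustrative assumptions.

```python
import pandas as pd

# Made-up scores for four images; the index plays the role of the image location.
scores = pd.DataFrame(
    {
        "psnr": [30.0, 25.0, 40.0, 20.0],  # higher is more similar
        "rmse": [0.10, 0.30, 0.20, 0.40],  # lower is more similar
    },
    index=[10, 11, 12, 13],
)

# Rank 1 is the most similar image under each measure.
ranks = pd.DataFrame(
    {
        "psnr": scores["psnr"].rank(ascending=False),
        "rmse": scores["rmse"].rank(ascending=True),
    }
)

# Best rank per image across the measures, then the two best images overall.
best_rank = ranks.min(axis=1)
top = best_rank.sort_values().index[:2].tolist()
print(best_rank)
print(top)
```

Taking the minimum rank selects an image as soon as any single measure favours it. Taking the mean rank instead would favour images that all measures agree on, which may be worth trying if the merged results look noisy.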

In [None]:
all_merged=pd.DataFrame(
    [
        sim_raw['psnr'].rank(ascending=False),
        sim_raw['rmse'].rank(ascending=True),
        sim_raw['sam'].rank(ascending=False),
        sim_raw['sre'].rank(ascending=False),
        sim_raw['ssim'].rank(ascending=False),
    ],
).transpose().min(axis=1)

In [None]:
(lambda x:[im_search.dataset.get_product_picture(loc=idx) for idx in x.sort_values(ascending=True)[:10].index])(sim_grayscale_blur['psnr'].rank(ascending=False));

In [None]:
(lambda x:[im_search.dataset.get_product_picture(loc=idx) for idx in x.sort_values(ascending=True)[:10].index])(all_merged);