## Compare the differences between two PNG images

### Base idea of the image comparison.
https://www.pyimagesearch.com/2017/06/19/image-difference-with-opencv-and-python/

### Requirements
+ scikit-image and imutils are required.
+ Both scikit-image and imutils should be up to date.

+ Note 1.
If you use Anaconda, add the "conda-forge" channel and then install imutils.

+ Note 2.
If you encounter the "procedure entry point OPENSSL_sk_new_reserve could not be located" error when installing imutils, see the following page for a fix.
    - https://github.com/conda/conda/issues/9003#issuecomment-516499958
    - i.e. copy libssl-1_1-x64.dll from Anaconda/DLLs and use it to replace the copy in Anaconda/Library/bin.

+ Other requirements
    - Put the base image file(s) into the "img/1.first" folder.
    - Put the comparison file(s) into the "img/2.second" folder.
    - Image files to be compared must have the same file name in the "1.first" and "2.second" folders.
    - Do not use non-ASCII characters (for example double-byte characters) or spaces in file names or anywhere in the path, because cv2.imread() and cv2.imwrite() do not support paths containing non-ASCII characters.
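The file name rule above can be checked before a run. The following is a minimal sketch using only the standard library; `is_cv2_safe` and `find_unsafe_paths` are hypothetical helper names and are not part of this program.

```python
from pathlib import Path


def is_cv2_safe(path_text):
    """Hypothetical check: True if the path has only ASCII characters and no spaces."""
    return path_text.isascii() and ' ' not in path_text


def find_unsafe_paths(root):
    """Return the files under root whose full path breaks the naming rule.

    The whole path is checked, not just the file name, because the rule
    applies to every folder on the way to the image as well.
    """
    unsafe = []
    for path in sorted(Path(root).rglob('*')):
        if path.is_file() and not is_cv2_safe(str(path)):
            unsafe.append(path)
    return unsafe
```

Calling `find_unsafe_paths('img')` before starting a comparison would list the files that should be renamed first.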

### How it works.
+ For each file in the "1.first" folder, the program looks for a file with the same name in the "2.second" folder and compares the two.
+ If a file exists in the "1.first" folder but not in the "2.second" folder, an error message is output.
+ Files that exist in the "2.second" folder but not in the "1.first" folder are not detected.
+ The program assumes that the compared images have the same dimensions; images with different dimensions are reported as "there are differences".
+ By default, the program runs in multiple processes, launching one process per CPU core of the PC on which it runs.
+ To use a different number of processes, change "num_processors" in the source.
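The file matching described above can be sketched as follows. This is an illustration under assumptions, not the code in `workers`: `pair_images` is a hypothetical helper, and it matches files by their path relative to each folder, which is the same as matching by file name when there are no subfolders.

```python
from pathlib import Path


def pair_images(first_dir, second_dir):
    """Pair each file under first_dir with its counterpart under second_dir.

    Returns (pairs, missing). pairs is a list of (first, second) Path tuples;
    missing lists files that exist only under first_dir. Files that exist
    only under second_dir are never visited, so they go unreported.
    """
    first_dir, second_dir = Path(first_dir), Path(second_dir)
    pairs, missing = [], []
    for first in sorted(p for p in first_dir.glob('**/*') if p.is_file()):
        second = second_dir / first.relative_to(first_dir)
        if second.is_file():
            pairs.append((first, second))
        else:
            missing.append(first)
    return pairs, missing
```

Driving the loop from the first folder only is what makes extra files in the second folder invisible; detecting them would need a second pass in the other direction.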

## Program body

In [1]:
#####
# * note
# When you re-run the program after editing module files,
# you should restart the Jupyter kernel, especially after editing settings.py.
#####

import multiprocessing as mp
import time
import datetime

from pathlib import Path
from mymodules import settings
from mymodules import workers

import requests

# If the folder(s) to save screenshots do not exist, create them.
settings.init()

### main (calls worker)

In [2]:
if __name__ == '__main__':
    
    skipUntil = 0 # 1 -> skip the 1st file. 200 -> skip the first 200 files.
    stopAt = -1 # 2 -> stop after the 2nd file. 200 -> stop after the 200th file. -1 -> process all files.
    
    try:
        num_processors = mp.cpu_count()
        #num_processors = 1
        
        p = Path(settings.ORIG_IMAGE_DIRNAME).glob('**/*')
        files = sorted([x for x in p if x.is_file()])  # sort by file name.
        
        arr_orig_image_lists = []
        for index, orig_image_with_path in enumerate(files, start=1):
            if index <= skipUntil:
                continue  # skip the first skipUntil files
            print(f'image file in writer() : {index} - {orig_image_with_path}')
            arr_orig_image_lists.append(orig_image_with_path)
            if stopAt > 0 and index == stopAt:
                break  # file number stopAt is the last one processed
        
        time_start = time.time()
        
        str_start_datetime = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
        print(f'Start at {str_start_datetime}')

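        # Each worker process is initialized once with the start timestamp of this run.
        # The with block shuts the pool down after map() has returned all results.
        with mp.Pool(processes=num_processors, initializer=workers.compimgdiff_init, initargs=[str_start_datetime]) as pool:
            arr_result = pool.map(workers.compare_image_diff, arr_orig_image_lists)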
        
        time_end = time.time()
        time_diff = time_end - time_start
        str_end_datetime = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
        print(f'time : {time_diff}')
        #dt_end = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
        
        with open('results-' + str_start_datetime + '.csv', 'a', encoding="utf-8_sig") as result_file:
            for ar in arr_result:
                # ar's shape is [ssim_score, image_filename, messages]
                result_file.write(f'{ar[0]},{ar[1]},"{ar[2]}"\n')
        print("Finished")
        
        # slack_notify(msg = '[Image Difference] finished at {}. It took {} seconds.'.format(str_end_datetime, str(time_diff)))
        
    except Exception as ex:
        print(f'{type(ex).__name__}: {ex}')
        raise
    finally:
        #workers.quit_web_driver()
        pass

image file in writer() : 1 - img\01.first\1.png
image file in writer() : 2 - img\01.first\10.png
image file in writer() : 3 - img\01.first\100.png
image file in writer() : 4 - img\01.first\1000.png
image file in writer() : 5 - img\01.first\1001.png
image file in writer() : 6 - img\01.first\1002.png
image file in writer() : 7 - img\01.first\1003.png
image file in writer() : 8 - img\01.first\1004.png
image file in writer() : 9 - img\01.first\1005.png
image file in writer() : 10 - img\01.first\1006.png
image file in writer() : 11 - img\01.first\1007.png
image file in writer() : 12 - img\01.first\1008.png
image file in writer() : 13 - img\01.first\1009.png
image file in writer() : 14 - img\01.first\101.png
image file in writer() : 15 - img\01.first\1010.png
image file in writer() : 16 - img\01.first\1011.png
image file in writer() : 17 - img\01.first\1012.png
image file in writer() : 18 - img\01.first\1013.png
image file in writer() : 19 - img\01.first\1014.png
image file in writer() : 20 

time : 763.1893255710602
Finished
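
The loop that writes the results file formats each row by hand, so a message that itself contains a double quote would produce a malformed row. A minimal sketch of the same output using the standard `csv` module is shown here; `write_results` is a hypothetical helper, and the row shape `[ssim_score, image_filename, messages]` is taken from the comment in the cell above.

```python
import csv


def write_results(rows, out):
    """Write [ssim_score, image_filename, messages] rows to a text stream.

    csv.writer quotes a field only when it contains a comma, a quote, or a
    newline, and doubles any embedded quotes, so the file stays parseable.
    """
    writer = csv.writer(out, lineterminator='\n')
    for score, filename, messages in rows:
        writer.writerow([score, filename, messages])
```

With this helper the file would be opened as before, with `newline=''` added to the `open()` call as the `csv` module recommends, and passed in as `out`. Unlike the hand-written format, messages without special characters are written without surrounding quotes.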
