## Image comparison and sorting

This Jupyter notebook (.ipynb) sorts images based on their similarity to a reference image. It can serve other purposes, but it is designed for detecting microbeads in a series of microscopy images.

### Instructions:

In the Settings section (section 1):
1. Put a reference image into a folder and set ```folder_reference``` to that folder's path. The first image in the reference folder will be used.
2. Put all the images to compare in one folder and set ```folder_input``` to its path.
3. Set ```folder_same``` to the folder for images similar to the reference image. Likewise, set ```folder_diff``` to the folder for images that differ from the reference image. The sorting process moves each image in ```folder_input``` into one of these two folders, depending on whether its maximum pixel difference from the reference (within the selected region) exceeds the threshold value ```thres_value```.
4. Run all cells. VS Code has a "Run All" command.
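
The decision rule in step 3 can be sketched on synthetic data. This is a minimal illustration using NumPy only, not the notebook's own code: the helper name ```classify``` and the 4x4 arrays are assumptions made for the example.

```python
import numpy as np

def classify(img, ref, thres_value):
    """Return ('Diff' or 'Same', max_diff) for two equally sized grayscale images."""
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(img.astype(np.int16) - ref.astype(np.int16))
    max_diff = int(diff.max())
    label = "Diff" if max_diff > thres_value else "Same"
    return label, max_diff

# Synthetic 4x4 grayscale images (hypothetical data, not real micrographs).
ref = np.full((4, 4), 100, dtype=np.uint8)
noisy = ref.copy()
noisy[0, 0] = 110   # small fluctuation, below the threshold
bead = ref.copy()
bead[2, 2] = 180    # one bright pixel, as a bead might produce

print(classify(noisy, ref, 30))  # ('Same', 10)
print(classify(bead, ref, 30))   # ('Diff', 80)
```

Because the rule looks at the single largest pixel difference, one noisy pixel above the threshold is enough to mark an image as different, so the threshold should sit above the noise level of blank images.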

A log file will also be created in the reference folder after sorting. Open it with a text editor to see the details of the sorted files.
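
To help pick ```thres_value```, the ```maxdiff``` values can be pulled out of the log. The sketch below assumes each log line contains ```filename:...```, ```maxdiff:...``` and ```result:...``` fields as written by the sorting cell. The helper ```parse_log``` and the sample lines are hypothetical.

```python
import re

# Assumed field layout: "filename:<name>, maxdiff:<int>, result:<Same|Diff>"
LINE_RE = re.compile(
    r"filename:(?P<name>.+?), maxdiff:(?P<maxdiff>\d+), result:(?P<result>Same|Diff)"
)

def parse_log(lines):
    """Return a list of (filename, maxdiff, result) tuples from log lines."""
    rows = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:  # ignore lines that do not describe a sorted file
            rows.append((m["name"], int(m["maxdiff"]), m["result"]))
    return rows

# Hypothetical sample lines; a real run would read them from the .log file.
sample = [
    "INFO:root:Processing:0 of 2, filename:a.jpg, maxdiff:12, result:Same",
    "INFO:root:Processing:1 of 2, filename:b.jpg, maxdiff:87, result:Diff",
]
rows = parse_log(sample)
print(sorted(r[1] for r in rows))  # [12, 87]
```

A clear gap between the low and high ```maxdiff``` values suggests a threshold somewhere inside that gap.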

In [None]:
#Import prerequisite packages (these packages must be installed first)
import pandas as pd, numpy as np, cv2, os, logging

### 1. Settings

In [None]:
#Image file type
filetype = ".jpg"

#Threshold on the maximum pixel difference; tune it using the maxdiff values written to the log
thres_value = 30

#Reference Folder
folder_reference = "/Users/aucheukyan/Library/CloudStorage/OneDrive-NationalUniversityofSingapore/_CTLimNishanth_BeadDetect/Reference"

#Input folder
folder_input = "/Users/aucheukyan/Library/CloudStorage/OneDrive-NationalUniversityofSingapore/_CTLimNishanth_BeadDetect/10umbeads" 

#Output folder
folder_same = "/Users/aucheukyan/Library/CloudStorage/OneDrive-NationalUniversityofSingapore/_CTLimNishanth_BeadDetect/Blank" #image same as reference i.e. empty channels
folder_diff = "/Users/aucheukyan/Library/CloudStorage/OneDrive-NationalUniversityofSingapore/_CTLimNishanth_BeadDetect/NotBlank" #image different from reference i.e. channels with microspheres

### 2. Reference Image (Blank Channel)
Ensure that there is a reference image before proceeding to 3.
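
The cell in this section crops the reference image to a region of interest (ROI). ```cv2.selectROI``` returns the rectangle as ```(x, y, width, height)```, while NumPy images are indexed as ```[row, column]```, so the crop slices rows by ```y``` and columns by ```x```. A minimal sketch on a synthetic array, where the helper name ```crop_roi``` is an assumption made for illustration:

```python
import numpy as np

def crop_roi(img, roi):
    """Crop a 2-D image to roi = (x, y, width, height)."""
    x, y, w, h = (int(v) for v in roi)
    # Rows correspond to y (vertical), columns to x (horizontal).
    return img[y:y + h, x:x + w]

img = np.arange(30, dtype=np.uint8).reshape(5, 6)  # 5 rows, 6 columns
roi = (1, 2, 3, 2)  # x=1, y=2, width=3, height=2
patch = crop_roi(img, roi)
print(patch.shape)  # (2, 3)
```

The same ROI is applied to every input image, so all images are assumed to share the reference image's size and framing.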

In [None]:
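reffiles = sorted(f for f in os.listdir(folder_reference) if f.endswith(filetype)) #file type is set in Settings; sorted so the "first" image is deterministic
if not reffiles:
    print("No reference image found!!!")
else:
    #Log file sits in the reference folder and shares the reference image's base name
    logfile = os.path.join(folder_reference, os.path.splitext(reffiles[0])[0] + '.log')
    logging.basicConfig(filename=logfile, filemode='a')
    logging.getLogger().setLevel(logging.INFO)
    img_ref = cv2.imread(os.path.join(folder_reference, reffiles[0]), cv2.IMREAD_GRAYSCALE)
    if img_ref is None:
        raise IOError("Could not read reference image: " + reffiles[0])
    #Draw a rectangle, then press ENTER or SPACE to confirm; returns (x, y, width, height)
    img_sub = cv2.selectROI("select the area", img_ref)
    cv2.destroyAllWindows() #close the selection window
    img_refroi = img_ref[int(img_sub[1]):int(img_sub[1]+img_sub[3]),
                         int(img_sub[0]):int(img_sub[0]+img_sub[2])]
    print("Ready to compare! Ref file: " + str(reffiles[0]))
    # print(img_sub) #Print ROI coordinates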

### 3. Images to compare to Reference Image for sorting.

In [None]:
files = [f for f in os.listdir(folder_input) if f.endswith(filetype)] #file type is set in Settings
print("Number of files to compare against Reference image: " + str(len(files)))

In [None]:
# Sort the files according to threshold with reference image
os.makedirs(folder_same, exist_ok=True) #make sure both output folders exist
os.makedirs(folder_diff, exist_ok=True)
for i in range(len(files)):
    #Import file
    file = os.path.join(folder_input, files[i])
    img = cv2.imread(file,cv2.IMREAD_GRAYSCALE)
    if img is None: #unreadable or corrupt file: leave it in the input folder
        logging.warning("Skipping unreadable file:" + files[i])
        continue
    img_roi = img[int(img_sub[1]):int(img_sub[1]+img_sub[3]),
                  int(img_sub[0]):int(img_sub[0]+img_sub[2])]

    #Comparison (images must have the same size as the reference image)
    # img_diff = cv2.absdiff(img, img_ref) #compare full image
    img_diff = cv2.absdiff(img_roi,img_refroi) #compare ROI

    #Optionally save the difference image
    # outfile = os.path.splitext(files[i])[0] + "_subtract.jpg"
    # cv2.imwrite(outfile, img_diff)

    #Find max difference
    total_diff = img_diff.max()

    #Move the file into place based on the threshold value
    if total_diff > thres_value:
        os.rename(file, os.path.join(folder_diff, files[i]))
        logging.info("Processing:" + str(i+1) + " of " + str(len(files)) + ", filename:" + files[i] + ", maxdiff:" + str(total_diff) + ", result:Diff")
    else:
        os.rename(file, os.path.join(folder_same,files[i]))
        logging.info("Processing:" + str(i+1) + " of " + str(len(files)) + ", filename:" + files[i] + ", maxdiff:" + str(total_diff)+ ", result:Same")