# Saliency Mapper

> A tool to generate the saliency maps of images using a variety of techniques

Code was written by Nicholas M. Synovic, Oscar Yanek, and Rohan Sethi

## Optimal Performance

For optimal performance, in the *Runtime* tab of Google Collab, click *Change runtime type*, then choose **GPU** from the *Hardware Accelerator* dropdown.

## Upgrade Python `pip` Tool

Upgrade the Python `pip` tool to the latest version

In [14]:
%pip install --upgrade pip

Collecting pip
  Using cached pip-22.2.2-py3-none-any.whl (2.0 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 22.0.4
    Uninstalling pip-22.0.4:
      Successfully uninstalled pip-22.0.4
  Rolling back uninstall of pip
  Moving to c:\users\oyane_5ab4y9h\appdata\roaming\python\python39\scripts\
   from C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\~cripts
  Moving to c:\users\oyane_5ab4y9h\appdata\roaming\python\python39\site-packages\pip-22.0.4.dist-info\
   from C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\~ip-22.0.4.dist-info
  Moving to c:\users\oyane_5ab4y9h\appdata\roaming\python\python39\site-packages\pip\
   from C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\~ip
Note: you may need to restart the kernel to use updated packages.


ERROR: Exception:
Traceback (most recent call last):
  File "C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\pip\_internal\cli\base_command.py", line 167, in exc_logging_wrapper
    status = run_func(*args)
  File "C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\pip\_internal\cli\req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\pip\_internal\commands\install.py", line 405, in run
    installed = install_given_reqs(
  File "C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\pip\_internal\req\__init__.py", line 73, in install_given_reqs
    requirement.install(
  File "C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\pip\_internal\req\req_install.py", line 769, in install
    install_wheel(
  File "C:\Users\oyane_5ab4y9h\AppData\Roaming\Python\Python39\site-packages\pip\_internal\operations\install\wheel.py"

## Install Python libaries via `pip`

Installed libraries are:

- opencv-contrib-python
- torch
- torchvision
- pandas
- progress
- timm
- ipywidgets

In [15]:
%pip install opencv-contrib-python torch torchvision torchaudio pandas progress timm ipywidgets timeit



ERROR: Could not find a version that satisfies the requirement timeit (from versions: none)
ERROR: No matching distribution found for timeit
You should consider upgrading via the 'c:\Users\oyane_5ab4y9h\anaconda3\python.exe -m pip install --upgrade pip' command.





## Import Dependencies 

In [16]:
from os import listdir
from os.path import join
from pathlib import PurePath

import cv2
import torch
#from google.colab import drive
from numpy import ndarray
from progress.bar import Bar

## Allow Data to be Loaded From Google Drive



In [17]:
#drive.mount('/content/gdrive')

## Main Application

Initialize variables for program scope 

In [18]:
spectralSaliency = cv2.saliency.StaticSaliencySpectralResidual_create()
fineGrainSaliency = cv2.saliency.StaticSaliencyFineGrained_create()
depth_DPTLarge: str = "DPT_Large"
depth_DPTHybrid: str = "DPT_Hybrid"
depth_MiDaSsmall: str = "MiDaS_small"

## Simple Directory Reader

when giving the program a dataset instead of singular image

In [19]:
def readDirectory(dir: str) -> list:
    files: list = listdir(dir)
    filepaths: list = [join(dir, f) for f in files]
    return filepaths

## EstimateDepth Main Logic

Determines desired training model from the arguments. Uses loading bar to parse through imagePaths List, performs transformation on each image and outputs the result to a new folder. 

In [20]:
def estimateDepth(imagePaths: list, modelType: str, outputFolder: str = "data") -> None:
    midas = torch.hub.load("intel-isl/MiDaS", modelType)
    midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    midas.to(device)
    if modelType == "DPT_Large" or modelType == "DPT_Hybrid":
        transform = midas_transforms.dpt_transform
    else:
        transform = midas_transforms.small_transform

    with Bar(f"Estimating depth with {modelType}...", max=(len(imagePaths))) as bar:
        imagePath: str
        for imagePath in imagePaths:
            imageName: str = (
                PurePath(imagePath).with_suffix("").name
                + f'_{modelType.replace("_", "-")}.jpg'
            )
            outputPath: str = join(outputFolder, imageName)

            image: ndarray = cv2.imread(imagePath)
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            input_batch = transform(image).to(device)

            with torch.no_grad():
                prediction = midas(input_batch)

                prediction = torch.nn.functional.interpolate(
                    prediction.unsqueeze(1),
                    size=image.shape[:2],
                    mode="bicubic",
                    align_corners=False,
                ).squeeze()

            output = prediction.cpu().numpy()
            cv2.imwrite(outputPath, output)
            bar.next()

## ComputeSpectralSaliency Main Logic

Creates output folder and path, takes input imagePath and creates a saliency map for the image using the spectral residual approach. Starting from the principle of natural image statistics, this method simulate the behavior of pre-attentive visual search. The algorithm analyze the log spectrum of each image and obtain the spectral residual. Then transform the spectral residual to spatial domain to obtain the saliency map, which suggests the positions of proto-objects.

In [21]:
def computeSpectralSaliency(imagePath: str, outputFolder: str = "data") -> None:
    imageName: str = PurePath(imagePath).with_suffix("").name + "_spectralResidual.jpg"
    outputPath: str = join(outputFolder, imageName)
    image: ndarray = cv2.imread(imagePath)
    (success, saliencyMap) = spectralSaliency.computeSaliency(image)
    saliencyMap: ndarray = (saliencyMap * 255).astype("uint8")
    cv2.imwrite(outputPath, saliencyMap)

## ComputeFineGrainSaliency Main Logic

Creates output folder and path, takes input imagePath and creates a saliency map for the image using the fine grained approach. This method calculates saliency based on center-surround differences. High resolution saliency maps are generated in real time by using integral images.

In [22]:
def computeFineGrainSaliency(imagePath: str, outputFolder: str = "data") -> None:
    imageName: str = PurePath(imagePath).with_suffix("").name + "_fineGrain.jpg"
    outputPath: str = join(outputFolder, imageName)
    image: ndarray = cv2.imread(imagePath)
    (success, saliencyMap) = fineGrainSaliency.computeSaliency(image)
    saliencyMap: ndarray = (saliencyMap * 255).astype("uint8")
    cv2.imwrite(outputPath, saliencyMap)

## Basic writeImage function

Writes an image to a desired path.

In [23]:
def writeImage(image: ndarray, imagePath: str) -> None:
    cv2.imwrite(imagePath, image)

## Otsu Threshold Background Removal

Blurs image to reduce noise. Bins the pixels between 1-5. Creates a single channel greyscale for thresholding. Performs Otsu Thresholding and extracts the background. Uses binary thershold to create an all white background. Converts black and white back into 3 channel greyscale. Performs Otsu thresholding and extracts the foreground. Uses TOZERO_INV to keep some detail of the foreground. Combines the background and foreground to make final image. returns final image.

In [24]:
def bgremove1(myimage):
 
    myimage = cv2.GaussianBlur(myimage,(5,5), 0)
 
    bins=numpy.array([0,51,102,153,204,255])
    myimage[:,:,:] = numpy.digitize(myimage[:,:,:],bins,right=True)*51
 
    myimage_grey = cv2.cvtColor(myimage, cv2.COLOR_BGR2GRAY)
 
    ret,background = cv2.threshold(myimage_grey,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
 
    background = cv2.cvtColor(background, cv2.COLOR_GRAY2BGR)
 
    ret,foreground = cv2.threshold(myimage_grey,0,255,cv2.THRESH_TOZERO_INV+cv2.THRESH_OTSU)  #Currently foreground is only a mask
    foreground = cv2.bitwise_and(myimage,myimage, mask=foreground)  # Update foreground with bitwise_and to extract real foreground
 
    finalimage = background+foreground
 
    return finalimage

## Basic Thresholding Background Removal

First converts to greyscale. Performs truncate threshold to get baseline. Extracts binary threshold using the baseline for the background and foreground. Updates foreground with bitwise_and to extract real foreground. Converts black and white back into 3 channel greyscale. Combines the background and foreground to obtain our final image.

In [25]:
def bgremove2(myimage):

    myimage_grey = cv2.cvtColor(myimage, cv2.COLOR_BGR2GRAY)
 
    ret,baseline = cv2.threshold(myimage_grey,127,255,cv2.THRESH_TRUNC)
 
    ret,background = cv2.threshold(baseline,126,255,cv2.THRESH_BINARY)
 
    ret,foreground = cv2.threshold(baseline,126,255,cv2.THRESH_BINARY_INV)
 
    foreground = cv2.bitwise_and(myimage,myimage, mask=foreground)  

    background = cv2.cvtColor(background, cv2.COLOR_GRAY2BGR)
 
    finalimage = background+foreground
    return finalimage

## Hue Saturation Value

Converts image from BGR to HSV. Takes saturation and removes any values that are less than half creating the saturation mask. Increases the brightness of the image and then mods by 255 . Extracts any value above 127 to be a part of the value mask. Combines the two masks into unified foreground. Casts back into 8-bit integer. Inverts foreground to get background in uint8. Converts background back into BGR. Applies foreground map to original image. Combines foreground and background.

Documentation: https://docs.opencv.org/4.x/df/d9d/tutorial_py_colorspaces.html

In [26]:
def bgremove3(myimage):

    myimage_hsv = cv2.cvtColor(myimage, cv2.COLOR_BGR2HSV)
     
    s = myimage_hsv[:,:,1]
    s = numpy.where(s < 127, 0, 1) 
 
    v = (myimage_hsv[:,:,2] + 127) % 255
    v = numpy.where(v > 127, 1, 0)  

    foreground = numpy.where(s+v > 0, 1, 0).astype(numpy.uint8)  

    background = numpy.where(foreground==0,255,0).astype(numpy.uint8) 
    background = cv2.cvtColor(background, cv2.COLOR_GRAY2BGR)  
    foreground=cv2.bitwise_and(myimage,myimage,mask=foreground) 
    finalimage = background+foreground 

    return finalimage


## Combined Main Method

In [27]:

def main() -> None:
    # dirImage = input("image dir path is: ")
    # dirOut = input("output image dir path is: ")
    # imagePaths: list = readDirectory(dir=dirImage)
    imagePaths: list = ["test.jpg"]
    dirOut = "data"
    with Bar(
        "Creating saliency maps of PascalVOC images...", max=len(imagePaths)
    ) as bar:
        imagePath: str
        for imagePath in imagePaths:
            computeSpectralSaliency(imagePath)
            computeFineGrainSaliency(imagePath)
            bar.next()

    imagePath: str = input("Enter path: ")
    image: ndarray = cv2.imread(imagePath)
    showimage(bgremove1(image))
    showimage(bgremove2(image))
    showimage(bgremove3(image))

    estimateDepth(imagePaths, depth_DPTHybrid)
    estimateDepth(imagePaths, depth_DPTLarge)
    estimateDepth(imagePaths, depth_MiDaSsmall)


if __name__ == "__main__":
    main()


TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'

## Benchmark Code

Benchmark function that outputs to console

In [None]:
from timeit import timeit

accHW = input("Enter Accelerator type and test number: ")

f=open(accHW+".txt")
f.write(
    'SpectralSaliency time: '+timeit.timeit("computeSpectralSaliency(imageTestPath)", globals=globals())+"/n"+
    'FineGrainSaliency time: '+timeit.timeit("computeFineGrainSaliency(imageTestPath)", globals=globals())+"/n"+
    'DPTHybrid time: '+timeit.timeit("computeSpectralSaliency(imageTestPath)", globals=globals())+"/n"+
    'DPTLarge time: '+timeit.timeit("estimateDepth(imageTestPaths, depth_DPTLarge)", globals=globals())+"/n"+
    'MiDaSsmall time: '+timeit.timeit("estimateDepth(imageTestPaths, depth_MiDaSsmall)", globals=globals())+"/n"+
    'DPTHybrid time: '+timeit.timeit("estimateDepth(imageTestPaths, depth_DPTHybrid)", globals=globals())+"/n"+
    'bgremove1 time: '+timeit.timeit("bgremove1(imageTestPath)",globals=globals())+"/n"+
    'bgremove2 time: '+timeit.timeit("bgremove2(imageTestPath)",globals=globals())+"/n"+
    'bgremove3 time: '+timeit.timeit("bgremove3(imageTestPath)",globals=globals())
)
f.close()

imageTestPath: str = input("Enter path: ")
imageTestPaths: list = [imageTestPath]
print('SpectralSaliency time: '+timeit.timeit("computeSpectralSaliency(imageTestPath)", globals=globals()))
print('FineGrainSaliency time: '+timeit.timeit("computeFineGrainSaliency(imageTestPath)", globals=globals()))
print('DPTHybrid time: '+timeit.timeit("computeSpectralSaliency(imageTestPath)", globals=globals()))
print('DPTLarge time: '+timeit.timeit("estimateDepth(imageTestPaths, depth_DPTLarge)", globals=globals()))
print('MiDaSsmall time: '+timeit.timeit("estimateDepth(imageTestPaths, depth_MiDaSsmall)", globals=globals()))
print('DPTHybrid time: '+timeit.timeit("estimateDepth(imageTestPaths, depth_DPTHybrid)", globals=globals()))
print('bgremove1 time: '+timeit.timeit("bgremove1(imageTestPath)",globals=globals()))
print('bgremove2 time: '+timeit.timeit("bgremove2(imageTestPath)",globals=globals()))
print('bgremove3 time: '+timeit.timeit("bgremove3(imageTestPath)",globals=globals()))