# Depth Estimation

> A tool that generates area-of-interest maps of images using depth estimation techniques.

Code was written by Nicholas M. Synovic, Oscar Yanek, and Rohan Sethi

## Setup

### Upgrade Python `pip` Tool

Upgrade the Python `pip` tool to the latest version.

In [None]:
%pip install --upgrade pip

### Install Python libraries via `pip`

The following libraries are installed:

- opencv-contrib-python
- torch
- torchvision
- progress
- timm

In [None]:
%pip install opencv-contrib-python torch torchvision progress timm

### Import Dependencies

In [None]:
from os import listdir
from os.path import join
from pathlib import PurePath

import cv2
import numpy
import torch
from numpy import ndarray
from progress.bar import Bar

### Allow Data to be Loaded From Google Drive

If you wish to load data from Google Drive, uncomment the following lines.

In [None]:
#from google.colab import drive
#drive.mount('/content/gdrive')

## Application

### Read Directory

Function to read a directory and return a list of filepaths from that directory.

In [None]:
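def readDirectory(directory: str) -> list:
    # List every entry in the directory (files and subfolders alike)
    files: list = listdir(directory)
    # Prefix each entry with the directory so the paths can be opened directly
    filepaths: list = [join(directory, f) for f in files]
    return filepaths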
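`readDirectory` returns every entry in the folder, including subfolders and non-image files. The following is a minimal sketch of one way to keep only image files before passing them on. The helper name `readImageDirectory` and the extension set `IMAGE_EXTENSIONS` are assumptions made for illustration and are not part of the original notebook.

```python
from os import listdir
from os.path import isfile, join, splitext

# Hypothetical set of extensions to accept; adjust to the formats you use
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def readImageDirectory(directory: str) -> list:
    # Collect only regular files whose extension is in IMAGE_EXTENSIONS
    filepaths: list = []
    for name in sorted(listdir(directory)):
        path: str = join(directory, name)
        extension: str = splitext(name)[1].lower()
        if isfile(path) and extension in IMAGE_EXTENSIONS:
            filepaths.append(path)
    return filepaths
```

Sorting the entries makes the processing order reproducible, since `listdir` returns names in arbitrary order.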

### Estimate Depth with MiDaS

Takes a list of image file paths (`imagePaths`), a model type (`modelType`), and an optional output folder path (`outputFolder`; default is `./data`) as input.

**NOTE**: `modelType` must be a compatible MiDaS model type. See the [MiDaS page on PyTorch Hub](https://pytorch.org/hub/intelisl_midas_v2/) for supported model types.

It then uses the approach outlined in [1, 2](#citations) to estimate the depth of an image.

Area of interest maps are saved in `.jpg` format in the `outputFolder` with the following scheme:

- `outputFolder`/FILENAME_MODELTYPE`.jpg`

Where FILENAME is the original name of the file without the extension and MODELTYPE is the model type that was used for estimation, with underscores replaced by hyphens (for example, `DPT_Large` becomes `DPT-Large`).
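As an illustration of the naming scheme, the sketch below builds an output path the same way the estimation function does. The helper name `buildOutputPath` is hypothetical and exists only for this example.

```python
from os.path import join
from pathlib import PurePath


def buildOutputPath(imagePath: str, modelType: str, outputFolder: str = "data") -> str:
    # Drop the directory and the extension, keeping only the base file name
    fileName: str = PurePath(imagePath).with_suffix("").name
    # Underscores in the model type become hyphens in the output name
    modelLabel: str = modelType.replace("_", "-")
    return join(outputFolder, f"{fileName}_{modelLabel}.jpg")


print(buildOutputPath("images/cat.png", "DPT_Large"))
```

Note that only the last extension is removed, so a file named `scan.tar.png` would keep the `.tar` part in its output name.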

In [None]:
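def estimateDepth(imagePaths: list, modelType: str, outputFolder: str = "data") -> None:
    # Load the requested MiDaS model and the matching input transforms from PyTorch Hub
    midas = torch.hub.load("intel-isl/MiDaS", modelType)
    midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

    # Run on the GPU when one is available, otherwise fall back to the CPU
    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    midas.to(device)
    midas.eval()  # put the model in inference mode

    # The DPT models and the small model expect different preprocessing
    if modelType in ("DPT_Large", "DPT_Hybrid"):
        transform = midas_transforms.dpt_transform
    else:
        transform = midas_transforms.small_transform

    with Bar(f"Estimating depth with {modelType}...", max=len(imagePaths)) as bar:
        imagePath: str
        for imagePath in imagePaths:
            # Output name: FILENAME_MODELTYPE.jpg, with "_" in the model type replaced by "-"
            imageName: str = (
                PurePath(imagePath).with_suffix("").name
                + f'_{modelType.replace("_", "-")}.jpg'
            )
            outputPath: str = join(outputFolder, imageName)

            image: ndarray = cv2.imread(imagePath)
            if image is None:
                # cv2.imread returns None for files it cannot decode; skip them
                bar.next()
                continue

            # OpenCV loads images as BGR, but the MiDaS transforms expect RGB
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            input_batch = transform(image).to(device)

            with torch.no_grad():
                prediction = midas(input_batch)

                # Resize the prediction back to the original image resolution
                prediction = torch.nn.functional.interpolate(
                    prediction.unsqueeze(1),
                    size=image.shape[:2],
                    mode="bicubic",
                    align_corners=False,
                ).squeeze()

            depthMap: ndarray = prediction.cpu().numpy()

            # cv2.imwrite returns False when the file cannot be written,
            # for example when outputFolder does not exist
            if not cv2.imwrite(outputPath, depthMap):
                raise OSError(f"Could not write {outputPath}")
            bar.next()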
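The model's prediction is a floating-point array of relative depth values, while JPEG stores 8-bit data. One possible way to make use of the full 0 to 255 range before saving is min-max scaling, sketched below. The helper `normalizeDepth` is hypothetical and not part of the original notebook, and min-max scaling is an assumption about what output is wanted; each image is scaled independently, so brightness is not comparable across images.

```python
import numpy as np


def normalizeDepth(prediction: np.ndarray) -> np.ndarray:
    # Scale a float depth map linearly so its minimum maps to 0 and its maximum to 255
    lowest = float(prediction.min())
    highest = float(prediction.max())
    if highest - lowest < 1e-12:
        # A constant map has no contrast to stretch; return an all-black image
        return np.zeros(prediction.shape, dtype=np.uint8)
    scaled = (prediction - lowest) / (highest - lowest) * 255.0
    return scaled.round().astype(np.uint8)


# Small synthetic example standing in for a model prediction
example = np.array([[0.0, 5.0], [10.0, 20.0]], dtype=np.float32)
print(normalizeDepth(example))
```

The returned `uint8` array could then be passed to `cv2.imwrite` in place of the raw prediction.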

### Main Method

In [None]:
def main() -> None:
    depth_DPTLarge: str = "DPT_Large"
    depth_DPTHybrid: str = "DPT_Hybrid"
    depth_MiDaSsmall: str = "MiDaS_small"

    # Ask for the folder to process; named imageDir to avoid shadowing the built-in dir()
    imageDir: str = input("Image directory to analyze: ")
    imagePaths: list = readDirectory(imageDir)

    estimateDepth(imagePaths, depth_DPTHybrid)
    estimateDepth(imagePaths, depth_DPTLarge)
    estimateDepth(imagePaths, depth_MiDaSsmall)

if __name__ == "__main__":
    main()

## Citations

1. R. Ranftl, A. Bochkovskiy, and V. Koltun, “Vision Transformers for Dense Prediction.” arXiv, Mar. 24, 2021. doi: 10.48550/arXiv.2103.13413.
2. R. Ranftl, K. Lasinger, D. Hafner, K. Schindler, and V. Koltun, “Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer.” arXiv, Aug. 25, 2020. doi: 10.48550/arXiv.1907.01341.
