# Regression Model
This is the regression model that is trained to predict accurate depth maps (e.g 128,128) from single input RGB images (e.g 512. 512, 3). A set of experiments takes place, in this notebook. In summary:

*   Define X (Input Images) and Y (GT-depth maps) data
*   Select Depth Regression Network Architecture
*   Train Model for a number of initial samples
*   Test on random sample/example




## Dataset
Define the path to the dataset for the input data and the ground truth depth maps. Load everything to the dataset.

In [1]:
input_data_path = '/content/drive/MyDrive/datasets/eg3d/images/' #@param
gt_data_path = '/content/drive/MyDrive/datasets/eg3d/depth128x128/' #@param

## Model Architecture (Option 1)
The initial selection is the MiDaS pretrained model on monocular depth map estimation. The official repository can be found [here](https://github.com/isl-org/MiDaS) 

In [2]:
!git clone https://github.com/isl-org/MiDaS
!pip install timm

Cloning into 'MiDaS'...
remote: Enumerating objects: 501, done.[K
remote: Counting objects: 100% (93/93), done.[K
remote: Compressing objects: 100% (40/40), done.[K
remote: Total 501 (delta 70), reused 53 (delta 53), pack-reused 408[K
Receiving objects: 100% (501/501), 414.26 KiB | 9.21 MiB/s, done.
Resolving deltas: 100% (168/168), done.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting timm
  Downloading timm-0.6.5-py3-none-any.whl (512 kB)
[K     |████████████████████████████████| 512 kB 5.0 MB/s 
Installing collected packages: timm
Successfully installed timm-0.6.5


In [3]:
# Download model for HQ depth maps
model_path = '/content/MiDaS/weights/dpt_large-midas-2f21e586.pt'
!wget https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt -O {model_path}

--2022-07-13 15:54:36--  https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://github.com/isl-org/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt [following]
--2022-07-13 15:54:36--  https://github.com/isl-org/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/350409920/3568d880-8b45-11eb-8c45-12766a421e43?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220713%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220713T155436Z&X-Amz-Expires=300&X-Amz-Signature=527438c7d45cfce0b409262dcd3934977d0c0ae57c1d4e83868043389b446b23&X-Amz-SignedHeaders=host&acto

## Model Architecture (Option 2)
Another option for a model architecture is the PIFU model architecture. 

### PIFUHD
It is a bit more complex, but it can be found [here](https://github.com/facebookresearch/pifuhd)

In [8]:
%cd /content/
!git clone https://github.com/facebookresearch/pifuhd
%cd /content/pifuhd/

/content
fatal: destination path 'pifuhd' already exists and is not an empty directory.
/content/pifuhd


In [9]:
# Download pretrained weights for model
model_path = '/content/pifuhd/pifuhd.pt'
!wget "https://dl.fbaipublicfiles.com/pifuhd/checkpoints/pifuhd.pt" -O {model_path}

--2022-07-13 16:20:58--  https://dl.fbaipublicfiles.com/pifuhd/checkpoints/pifuhd.pt
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 104.22.74.142, 104.22.75.142, 172.67.9.4, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.74.142|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1548375177 (1.4G) [application/octet-stream]
Saving to: ‘/content/pifuhd/pifuhd.pt’


2022-07-13 16:21:36 (39.7 MB/s) - ‘/content/pifuhd/pifuhd.pt’ saved [1548375177/1548375177]



In [None]:
def recon(opt, use_rect=False):
    # load checkpoints
    state_dict_path = None
    if opt.load_netMR_checkpoint_path is not None:
        state_dict_path = opt.load_netMR_checkpoint_path

    print('test data size: ', len(test_dataset))
    projection_mode = test_dataset.projection_mode

    opt_netG = state_dict['opt_netG']
    netG = HGPIFuNetwNML(opt_netG, projection_mode).to(device=cuda)
    netMR = HGPIFuMRNet(opt, netG, projection_mode).to(device=cuda)

    def set_eval():
        netG.eval()

    # load checkpoints
    netMR.load_state_dict(state_dict['model_state_dict'])

    ## test
    with torch.no_grad():
        set_eval()

        print('generate mesh (test) ...')
        for i in tqdm(range(start_id, end_id)):
            if i >= len(test_dataset):
                break
            
            # for multi-person processing, set it to False
            if True:
                test_data = test_dataset[i]

                save_path = '%s/%s/recon/result_%s_%d.obj' % (opt.results_path, opt.name, test_data['name'], opt.resolution)

                print(save_path)
                gen_mesh(opt.resolution, netMR, cuda, test_data, save_path, components=opt.use_compose)
            else:
                for j in range(test_dataset.get_n_person(i)):
                    test_dataset.person_id = j
                    test_data = test_dataset[i]
                    save_path = '%s/%s/recon/result_%s_%d.obj' % (opt.results_path, opt.name, test_data['name'], j)
                    gen_mesh(opt.resolution, netMR, cuda, test_data, save_path, components=opt.use_compose)

### PIFU
The model architecture can be found [here](https://github.com/shunsukesaito/PIFu)

In [None]:
!git clone https://github.com/shunsukesaito/PIFu

## Test Model

In [4]:
%cd /content/MiDaS/
!cp /content/drive/MyDrive/datasets/eg3d/images/seed0001.png /content/MiDaS/input/
!python run.py --model_type dpt_large # --model_weights={model_weights_path}

/content/MiDaS
initialize
device: cuda
start processing
  processing input/seed0001.png (1/1)
finished


## Train Model

In [None]:
from torchvision.transforms import Compose
from midas.dpt_depth import DPTDepthModel
from midas.transforms import Resize, NormalizeImage, PrepareForNet
import torch.optim as optim
import torch.nn as nn
import torch
import os
import cv2
import utils
import numpy as np
import PIL
from midas.transforms import Resize, NormalizeImage, PrepareForNet

l1_loss = nn.L1Loss() # l1-loss
model = DPTDepthModel(
            path=model_path,
            backbone="vitl16_384",
            non_negative=False,) # define model architecture with pretrained weights

# Input Preprocessing
net_w, net_h = 128, 128
resize_mode="minimal"
normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
transform = Compose(
    [ Resize(net_w, net_h, resize_target=None, keep_aspect_ratio=True, ensure_multiple_of=32,
            resize_method=resize_mode, image_interpolation_method=cv2.INTER_CUBIC,), normalization,PrepareForNet(), ])

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model.to(device)
model.train() # set in train mode

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9) # optimizer
lr_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, 
                                            patience=10, verbose=True)

# trained_model = model.load_state_dict(torch.load(final_weights_path)) # load the final training weights

In [None]:
### Create list with GT and input data paths
batch_size = 5
img_train = []
gt_train = []
for img in os.listdir(input_data_path):
    filepath = os.path.join(input_data_path, img)
    gt_path = os.path.join(gt_data_path, img)
    img_train.append(filepath)
    gt_train.append(gt_path)


img_train = [img_train[i:i + batch_size] for i in range(0, len(img_train), )]
gt_train = [gt_train[i:i+batch_size] for i in range(0, len(gt_train), batch_size)]

In [None]:
### Function to normalize depth
def normalize_depth(array : np.array):
  maxx = np.amax(array)
  minx = np.amin(array)
  x = 2 * ((array - minx) / (maxx - minx) ) - 1
  return np.array(x, dtype=np.float32)


def normalize_depth_torch(tensor : torch):
  maxx = torch.amax(tensor)
  minx = torch.amin(tensor)
  x = 2 * ((tensor - minx) / (maxx - minx) ) - 1
  return x




In [None]:
### Check Model parameters
p =  []
for name, param in model.named_parameters():
  if name == 'pretrained.model.patch_embed.proj.weight':
    a_param = param.detach().cpu().numpy()  

In [None]:
### Training Skeleton
trainX = img_train[0:200] # first 5 batches
trainY = gt_train[0:200]
n_epochs = 10

for epoch in range(1, n_epochs+1):
    total_loss = 0.0
    for i, batch in enumerate(trainX):
      inputs = [] # batch inputs
      gts = [] # batch labels/gt data
      for filepath in batch:
        gt_np = np.array(PIL.Image.open(gt_path))
        img = utils.read_image(filepath)
        img_input = transform({"image": img})["image"]
        input = torch.from_numpy(img_input).to(device).unsqueeze(0)
        gt_np = normalize_depth(gt_np)
        gt = torch.tensor(gt_np, requires_grad=True).to(device).unsqueeze(0)
        gts.append(gt)
        inputs.append(input)

      # Convert list to batch tensor
      gts = torch.cat(gts)
      inputs = torch.cat(inputs)

      # Per batch loss and update
      predictions = model(inputs)
      predictions = normalize_depth_torch(predictions)
      loss = l1_loss(gts, predictions)
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()
      total_loss += float(loss)

    print("e", str(epoch),"- Total L1 loss: %f" %total_loss)
    lr_scheduler.step(total_loss)

e 1 - Total L1 loss: 1.819955
e 2 - Total L1 loss: 1.786839
Epoch 00015: reducing learning rate of group 0 to 1.0000e-05.
e 3 - Total L1 loss: 1.766926
e 4 - Total L1 loss: 1.764382
e 5 - Total L1 loss: 1.761886
e 6 - Total L1 loss: 1.759425
e 7 - Total L1 loss: 1.756995
e 8 - Total L1 loss: 1.754597
Epoch 00021: reducing learning rate of group 0 to 1.0000e-06.
e 9 - Total L1 loss: 1.752976
e 10 - Total L1 loss: 1.752809
e 11 - Total L1 loss: 1.752643
e 12 - Total L1 loss: 1.752477
e 13 - Total L1 loss: 1.752311
e 14 - Total L1 loss: 1.752145
Epoch 00027: reducing learning rate of group 0 to 1.0000e-07.
e 15 - Total L1 loss: 1.752034
e 16 - Total L1 loss: 1.752028
e 17 - Total L1 loss: 1.752023
e 18 - Total L1 loss: 1.752017
e 19 - Total L1 loss: 1.752012
e 20 - Total L1 loss: 1.752006
Epoch 00033: reducing learning rate of group 0 to 1.0000e-08.
e 21 - Total L1 loss: 1.752003
e 22 - Total L1 loss: 1.752002
e 23 - Total L1 loss: 1.752002
e 24 - Total L1 loss: 1.752002
e 25 - Total L1 l

In [None]:
weights_savepath = '/content/drive/MyDrive/eg3d/weights.pt'
torch.save(model.state_dict(),weights_savepath)