# Part 0: Dataloader and Visualizations

In [1]:
import numpy as np
from PIL import Image, ImageDraw
import scipy.io
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models
from torchvision import transforms, datasets
from torch.utils.data import DataLoader
import wandb

from voc_dataset import VOCDataset
from utils import *

USE_WANDB = False
%load_ext autoreload
%autoreload 2

## Q0.1: Editing the Dataloader
The first part of the assignment involves editing the dataloader so that it returns bounding-box proposals as well as the ground-truth bounding boxes. The ground-truth boxes come from the VOC dataset annotations themselves, and the starter code already handles this part for you.

Unsupervised bounding-box proposals are obtained through methods such as [Selective Search](https://ivi.fnwi.uva.nl/isis/publications/2013/UijlingsIJCV2013/UijlingsIJCV2013.pdf). Since Selective Search is slow to run on each image, we have pre-computed the bounding-box proposals for you (you downloaded them in the data preparation step).

Your task is to change the dataloader so that it also returns the proposed bounding boxes for each image. Feel free to experiment with the data in the files to figure out the number of proposals per image, their scores, etc. Returning a dictionary would be convenient here. For the bounding boxes, relative positions (coordinates expressed as fractions of the image width and height) are usually the better choice, since they are invariant to changes in the size of the image.
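As a rough illustration of the two ideas above (keeping only the best-scoring proposals and storing them in relative coordinates), here is a minimal sketch. It assumes boxes are given as absolute `[xmin, ymin, xmax, ymax]` pixel values; the pre-computed proposal file may use a different layout, so check it before reusing this. The helper names `normalize_boxes` and `top_n_proposals`, the `roi_scores` key, and the sample numbers are all hypothetical.

```python
def normalize_boxes(boxes, width, height):
    """Convert absolute [xmin, ymin, xmax, ymax] pixel boxes to fractions of the image size."""
    return [
        [xmin / width, ymin / height, xmax / width, ymax / height]
        for xmin, ymin, xmax, ymax in boxes
    ]


def top_n_proposals(boxes, scores, top_n):
    """Keep the top_n boxes with the highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = order[:top_n]
    return [boxes[i] for i in keep], [scores[i] for i in keep]


# Made-up proposals for an image that is 200 pixels wide and 100 pixels high.
boxes = [[0, 0, 100, 50], [50, 25, 200, 100], [20, 10, 60, 40]]
scores = [0.2, 0.9, 0.5]

best_boxes, best_scores = top_n_proposals(boxes, scores, top_n=2)
rois = normalize_boxes(best_boxes, width=200, height=100)

# Returning a dictionary keeps related fields together.
sample = {"rois": rois, "roi_scores": best_scores}
print(sample)
```

Because the boxes are stored as fractions, they stay valid if the image is later resized, for example to a fixed network input size.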

In [30]:
dataset = VOCDataset('trainval', top_n=10,
                    data_dir='data/VOCdevkit/VOC2007/')

Path:C:\Users\Mo\OneDrive\Notes\CMU\16824-Visual Learning\object-localization\data\VOCdevkit\VOC2007


**Q0.1**: Load the image corresponding to index 2020 and print the GT labels associated with it.

**Hint**: items at a particular index can be accessed with the usual indexing notation (`dataset[idx]`).

In [32]:
# TODO: get the image information from index 2020
idx = 2020
data = dataset[idx]
print(data.keys())
print(data['rois'].shape)

dict_keys(['image', 'label', 'wgt', 'rois', 'gt_boxes', 'gt_classes'])
(10, 4)
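The cell above prints the available keys and the shape of the proposals. To print the GT labels by name, something along the following lines may work. This is only a sketch: it assumes `gt_classes` holds 0-based integer class ids and that `dataset.CLASS_NAMES` follows the conventional alphabetical VOC ordering reproduced below. The `sample` dictionary is a made-up stand-in for `dataset[idx]`, not the real content of index 2020.

```python
# The 20 PASCAL VOC object classes in their conventional alphabetical order
# (assumed here to match dataset.CLASS_NAMES).
CLASS_NAMES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# Hypothetical stand-in for dataset[idx]; the values are invented for illustration.
sample = {
    "gt_classes": [14, 11],
    "gt_boxes": [[0.1, 0.2, 0.5, 0.9], [0.4, 0.5, 0.8, 0.95]],
}

# Map each integer class id to its human-readable name.
gt_names = [CLASS_NAMES[int(class_id)] for class_id in sample["gt_classes"]]
print(gt_names)
```

With the real dataset, replacing `sample` by `data` and `CLASS_NAMES` by `dataset.CLASS_NAMES` should give the labels for index 2020, provided the assumptions above hold.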


## Q0.2 and Q0.3: Wandb Logging
First, let's initialize a Weights and Biases project. 

In [2]:
USE_WANDB = True
if USE_WANDB:
    wandb.init(project="vlr-hw1", reinit=False)

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: 3m-m. Use `wandb login --relogin` to force relogin


**Q0.2**: Complete this block for overlaying the ground truth box on an image.

**Hint**: convert the image tensor to a PIL image and plot it (check `utils.py` for helper functions). You can use [this](https://docs.wandb.ai/library/log) as a reference for logging syntax.

In [11]:
class_id_to_label = dict(enumerate(dataset.CLASS_NAMES))

# TODO: load the GT information corresponding to index 2020.
original_image = tensor_to_PIL(data['image'])
gt_labels = data['gt_classes']
gt_boxes = data['gt_boxes']

# TODO: log the GT bounding box
img = wandb.Image(original_image, boxes={
    "ground_truth": {
        "box_data": get_box_data(gt_labels, gt_boxes),
        "class_labels": class_id_to_label,
    },
})
wandb.log({"Image": img}, step=1)

**Q0.3**: Visualize the top 10 bounding-box proposals corresponding to index 2020.

**Hint**: Check the `get_box_data` function in `utils.py` and understand how it is being used. 
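For orientation, the `box_data` entry passed to `wandb.Image` is a list of dictionaries, one per box, each with a `position` (keys `minX`, `maxX`, `minY`, `maxY`), an integer `class_id`, and an optional `box_caption`. By default W&B interprets the coordinates as fractions of the image size. The sketch below builds such a list in plain Python; `make_box_data` is a hypothetical helper written for illustration and is not the `get_box_data` from `utils.py`, whose exact behavior may differ. It assumes boxes are relative `[xmin, ymin, xmax, ymax]` values.

```python
def make_box_data(class_ids, boxes, class_names):
    """Build a W&B-style box list from class ids and relative [xmin, ymin, xmax, ymax] boxes."""
    box_data = []
    for class_id, (xmin, ymin, xmax, ymax) in zip(class_ids, boxes):
        box_data.append({
            "position": {
                "minX": float(xmin),
                "minY": float(ymin),
                "maxX": float(xmax),
                "maxY": float(ymax),
            },
            "class_id": int(class_id),
            "box_caption": class_names[int(class_id)],
        })
    return box_data


# Made-up example: two boxes on an image, with a tiny hypothetical label map.
names = {0: "aeroplane", 1: "bicycle"}
example = make_box_data([1, 0], [[0.1, 0.2, 0.5, 0.9], [0.3, 0.3, 0.6, 0.7]], names)
print(example[0])
```

Note that `zip` stops at the shorter input, so passing more labels than boxes silently drops the extras, which is one reason to keep the two lists the same length.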

In [None]:
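# Proposals have no class labels of their own, so the GT class list is repeated
# to supply at least one label per ROI (assumes gt_classes is a non-empty list).
proposal_labels = data['gt_classes'] * len(data['rois'])
roi_img = wandb.Image(original_image, boxes={
    "proposals": {
        "box_data": get_box_data(proposal_labels, data['rois']),
        "class_labels": class_id_to_label,
    },
})
wandb.log({"Proposals": roi_img})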