This notebook demystifies BBoxBlock.

### Key Questions
1. After my pipeline applies BBoxBlock, what format is the bbox in?
    - It depends on the bbox format you feed into BBoxBlock.
    - ```get_annotations``` outputs bboxes in [x0, y0, x1, y1] format.
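
For context, COCO annotation files store each box as [x, y, width, height], so a reader has to convert to corner format to get the [x0, y0, x1, y1] layout mentioned above. A minimal sketch of that conversion in plain Python; ```coco_to_corners``` is a hypothetical helper name used for illustration, not a fastai function, and the input box is made up:

```python
def coco_to_corners(box):
    """Convert one COCO-style [x, y, width, height] box to [x0, y0, x1, y1]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

# Illustrative input only: a 56 x 36 box whose top-left corner is (834, 222)
corners = coco_to_corners([834.0, 222.0, 56.0, 36.0])
print(corners)  # [834.0, 222.0, 890.0, 258.0]
```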


### Notes
1. When ```Transform``` wraps a plain function, the function is passed as is, with no instantiation. A subclass of ```Transform``` must be instantiated before it can be applied.
2. ```PointScaler.__init__``` takes an argument ```y_first``` that indicates the input bbox format.
3. ```PointScaler.encodes``` applies ```_scale_pnts```; when ```y_first = True```, it swaps the input from [y0, x0, y1, x1] to [x0, y0, x1, y1].
4. ```bb_pad``` takes a list of tuples (each tuple holds an image, its bboxes, and their labels), finds the maximum number of bboxes across the samples, and pads the bboxes and labels of every other sample up to that count.

In [1]:
import os
import sys
path = os.path.join(os.getcwd(), '..')
sys.path.append(path)

from pathlib import Path
from pdb import set_trace
import warnings
warnings.filterwarnings('ignore')

from fastai2.vision.all import *

from src.data.dblock import build_dblock
from src.data.dls import build_dataloaders

%load_ext autoreload
%autoreload 2

### 1. Wrap a Function into ```Transform```

In [2]:
def double(x):
    return 2*x

In [3]:
dbl = Transform(double)
dbl(3)

6

### 2. BBoxBlock
```
BBoxBlock = TransformBlock(  
    type_tfms=TensorBBox.create,   
    item_tfms=PointScaler,   
    dls_kwargs = {'before_batch': bb_pad}  
    )
```

### 2a. type_tfms: TensorBBox.create (function)

In [4]:
a_bbox = [10., 20., 30., 40.]
b_bbox = TensorBBox.create([10., 20., 30., 40.])
b_bbox.shape

torch.Size([1, 4])

In [7]:
b_bbox

TensorBBox([[10., 20., 30., 40.]])
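
The leading dimension of size 1 appears because the flat list of four numbers is reshaped into rows of four, one row per box. A rough plain-Python analogue of that reshape, using a hypothetical helper ```to_rows_of_four``` (fastai does this on tensors, not lists):

```python
def to_rows_of_four(flat):
    """Group a flat coordinate list into rows of four, one row per box."""
    if len(flat) % 4 != 0:
        raise ValueError("expected a multiple of 4 coordinates")
    return [flat[i:i + 4] for i in range(0, len(flat), 4)]

one_box = to_rows_of_four([10., 20., 30., 40.])
two_boxes = to_rows_of_four([10., 20., 30., 40., 1., 2., 3., 4.])
print(len(one_box), len(one_box[0]))      # rows, columns for one box
print(len(two_boxes), len(two_boxes[0]))  # rows, columns for two boxes
```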

### 2b. item_tfms: PointScaler (Transform class)
```
class PointScaler(Transform):
    "Scale a tensor representing points"
    order = 1
    def __init__(self, do_scale=True, y_first=False): 
        self.do_scale,self.y_first = do_scale,y_first
    def _grab_sz(self, x):
        # Remember the image size as [width, height] for scaling points later
        if isinstance(x, Tensor):
            self.sz = [x.shape[-1], x.shape[-2]]
        else:
            self.sz = x.size
        return x
```

#### Notes
Essentially, PointScaler does two things when transforming ```TensorBBox```:
1. It swaps the x and y coordinates of each point if ```y_first = True```.
2. It rescales the bbox coordinates to the range [-1, +1], with the image center at (0, 0).

In [10]:
type(b_bbox), isinstance(b_bbox, TensorPoint)

(fastai2.vision.core.TensorBBox, True)

In [12]:
b_bbox.sz = 100
b_bbox

TensorBBox([[10., 20., 30., 40.]])

In [13]:
b_bbox.get_meta('img_size')

In [19]:
PointScaler.sz = 256
PointScaler(do_scale = False)(b_bbox), PointScaler(do_scale = False, y_first = True)(b_bbox)

(TensorBBox([[10., 20., 30., 40.]]), TensorBBox([[20., 10., 40., 30.]]))
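
The second result shows the effect of ```y_first = True```: the two coordinates of each point trade places. A plain-Python sketch of that swap, assuming the box is read as two consecutive points; ```swap_xy``` is a hypothetical name used for illustration:

```python
def swap_xy(box):
    """Swap the two coordinates of every consecutive pair in a flat box."""
    swapped = []
    for i in range(0, len(box), 2):
        first, second = box[i], box[i + 1]
        swapped.extend([second, first])
    return swapped

print(swap_xy([10., 20., 30., 40.]))  # [20.0, 10.0, 40.0, 30.0]
```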

With its default arguments, PointScaler() does the following:
1. It does not swap the (x, y) coordinates.
2. It scales the coordinates so that the image center maps to [0, 0].

In [29]:
c_bbox = PointScaler()(b_bbox)
c_bbox

TensorBBox([[-0.9219, -0.8438, -0.7656, -0.6875]])
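
The values above follow from a linear map of pixel coordinates onto [-1, +1]. A minimal sketch of the arithmetic in plain Python, assuming a square image with side 256 as set on ```PointScaler.sz``` in the earlier cell; ```scale_coord``` and ```unscale_coord``` are hypothetical names, not fastai functions:

```python
def scale_coord(value, size):
    """Map a pixel coordinate in [0, size] onto [-1, +1]."""
    return value * 2 / size - 1

def unscale_coord(value, size):
    """Inverse of scale_coord: map [-1, +1] back to pixels."""
    return (value + 1) * size / 2

size = 256  # assumed image side length, matching PointScaler.sz above
box = [10., 20., 30., 40.]

scaled = [scale_coord(v, size) for v in box]
print(scaled)    # [-0.921875, -0.84375, -0.765625, -0.6875]

restored = [unscale_coord(v, size) for v in scaled]
print(restored)  # [10.0, 20.0, 30.0, 40.0]
```

For a non-square image, x coordinates would be scaled by the width and y coordinates by the height.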

### 2c. batch transform: bb_pad

In [53]:
img = TensorImage(torch.randn((2, 2, 3)))
bbox = c_bbox
lbl = Tensor([1])

ccc_bbox = torch.stack((bbox, bbox, bbox)).squeeze()
ccc_lbl = Tensor([1, 1, 1])

sample1 = (img, c_bbox, lbl)
sample2 = (img, ccc_bbox, ccc_lbl)
samples = [sample1, sample2]

samples

[(TensorImage([[[ 2.1758, -0.0153,  0.6612],
           [ 0.2474, -0.0908,  0.7472]],
  
          [[ 0.6527,  0.2712,  1.7468],
           [ 0.2420, -1.5194, -1.2021]]]),
  TensorBBox([[-0.9219, -0.8438, -0.7656, -0.6875]]),
  tensor([1.])),
 (TensorImage([[[ 2.1758, -0.0153,  0.6612],
           [ 0.2474, -0.0908,  0.7472]],
  
          [[ 0.6527,  0.2712,  1.7468],
           [ 0.2420, -1.5194, -1.2021]]]),
  tensor([[-0.9219, -0.8438, -0.7656, -0.6875],
          [-0.9219, -0.8438, -0.7656, -0.6875],
          [-0.9219, -0.8438, -0.7656, -0.6875]]),
  tensor([1., 1., 1.]))]

In [55]:
bb_pad(samples)

[(TensorImage([[[ 2.1758, -0.0153,  0.6612],
           [ 0.2474, -0.0908,  0.7472]],
  
          [[ 0.6527,  0.2712,  1.7468],
           [ 0.2420, -1.5194, -1.2021]]]),
  tensor([[-0.9219, -0.8438, -0.7656, -0.6875],
          [ 0.0000,  0.0000,  0.0000,  0.0000],
          [ 0.0000,  0.0000,  0.0000,  0.0000]]),
  tensor([1., 0., 0.])),
 (TensorImage([[[ 2.1758, -0.0153,  0.6612],
           [ 0.2474, -0.0908,  0.7472]],
  
          [[ 0.6527,  0.2712,  1.7468],
           [ 0.2420, -1.5194, -1.2021]]]),
  tensor([[-0.9219, -0.8438, -0.7656, -0.6875],
          [-0.9219, -0.8438, -0.7656, -0.6875],
          [-0.9219, -0.8438, -0.7656, -0.6875]]),
  tensor([1., 1., 1.]))]
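
The padding seen above can be sketched in plain Python. This is an illustration of the idea only, using lists instead of tensors; ```pad_boxes``` is a hypothetical name, and the sketch leaves out any clipping or removal of empty boxes that the real ```bb_pad``` may perform:

```python
def pad_boxes(samples, pad_idx=0):
    """Pad every sample's boxes and labels to the longest sample in the batch.

    Each sample is (image, boxes, labels); boxes is a list of 4-number lists.
    """
    max_len = max(len(labels) for _, _, labels in samples)
    padded = []
    for img, boxes, labels in samples:
        missing = max_len - len(boxes)
        # Missing boxes become all-zero rows; missing labels become pad_idx
        new_boxes = boxes + [[0.0, 0.0, 0.0, 0.0] for _ in range(missing)]
        new_labels = labels + [pad_idx] * missing
        padded.append((img, new_boxes, new_labels))
    return padded

box = [-0.9219, -0.8438, -0.7656, -0.6875]
batch = [("img", [box], [1]), ("img", [box, box, box], [1, 1, 1])]
for _, boxes, labels in pad_boxes(batch):
    print(len(boxes), labels)
```

After padding, every sample has the same number of boxes, so the batch can be stacked into a single tensor.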

### 3. How Do I Read BBoxes Before Feeding Them into BBoxBlock?

In [58]:
from src.data.utils import decode_coco_json

In [57]:
data_path = Path('/userhome/34/h3509807/wheat-data')
data_path.ls()

(#7) [Path('/userhome/34/h3509807/wheat-data/train.json'),Path('/userhome/34/h3509807/wheat-data/bkup'),Path('/userhome/34/h3509807/wheat-data/train_mini.json'),Path('/userhome/34/h3509807/wheat-data/train'),Path('/userhome/34/h3509807/wheat-data/train.csv'),Path('/userhome/34/h3509807/wheat-data/sample_submission.csv'),Path('/userhome/34/h3509807/wheat-data/test')]

In [60]:
img_ids, lbl_bbox, img2bbox = decode_coco_json(data_path / 'train_mini.json')

In [62]:
img2bbox[img_ids[0]]

([[834.0, 222.0, 890.0, 258.0],
  [226.0, 548.0, 356.0, 606.0],
  [377.0, 504.0, 451.0, 664.0],
  [834.0, 95.0, 943.0, 202.0],
  [26.0, 144.0, 150.0, 261.0],
  [569.0, 382.0, 688.0, 493.0],
  [52.0, 602.0, 134.0, 647.0],
  [627.0, 302.0, 749.0, 377.0],
  [412.0, 367.0, 480.0, 449.0],
  [953.0, 220.0, 1009.0, 323.0],
  [30.0, 70.0, 156.0, 203.0],
  [35.0, 541.0, 81.0, 587.0],
  [103.0, 60.0, 220.0, 143.0],
  [417.0, 4.0, 527.0, 95.0],
  [764.0, 299.0, 883.0, 392.0],
  [539.0, 58.0, 597.0, 188.0],
  [139.0, 274.0, 260.0, 350.0],
  [461.0, 634.0, 579.0, 698.0],
  [215.0, 634.0, 328.0, 709.0],
  [134.0, 903.0, 261.0, 952.0],
  [737.0, 545.0, 824.0, 593.0],
  [292.0, 930.0, 335.0, 976.0],
  [0.0, 827.0, 86.0, 885.0],
  [324.0, 44.0, 381.0, 114.0],
  [663.0, 794.0, 779.0, 858.0],
  [325.0, 730.0, 401.0, 802.0],
  [155.0, 554.0, 229.0, 624.0],
  [783.0, 833.0, 853.0, 924.0],
  [534.0, 46.0, 607.0, 270.0],
  [155.0, 281.0, 261.0, 419.0],
  [101.0, 240.0, 183.0, 315.0],
  [583.0, 329.0, 663.0, 