
In [None]:
!pip install d2l==1.0.0-alpha1.post0 --quiet


## 14.4 Anchor Boxes 

* `torch.set_printoptions(precision=None)`: Sets the options used when printing tensors. The options are borrowed from NumPy.
* **precision**: Number of digits of precision for floating-point output (default: 4).

In [1]:
import torch
torch.set_printoptions(2)

torch.tensor([1.123456])

tensor([1.12])

### 14.4.1 Generating Multiple Anchor Boxes

* `torch.meshgrid(*tensors, indexing=None)`: Creates grids of coordinates specified by the 1D inputs in `tensors`.
* **tensors (list of tensors)**: List of scalars or 1-dimensional tensors. Scalars are automatically treated as tensors of size (1,).
* **indexing (optional [str])**: The indexing mode, either "xy" or "ij"; defaults to "ij".
* If "xy" is selected, the first output dimension corresponds to the cardinality of the second input, and the second output dimension corresponds to the cardinality of the first input.
* If "ij" is selected, the output dimensions follow the same order as the cardinalities of the inputs.
* **Returns**: If the input has $N$ tensors of sizes $S_0, ..., S_{N-1}$, the output also has $N$ tensors, each of shape ($S_0, ..., S_{N-1}$).

In [None]:
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])

grid_x, grid_y = torch.meshgrid(x, y, indexing='ij')
grid_x, grid_y, grid_x.shape

(tensor([[1, 1, 1],
         [2, 2, 2],
         [3, 3, 3]]),
 tensor([[4, 5, 6],
         [4, 5, 6],
         [4, 5, 6]]),
 torch.Size([3, 3]))
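The square example above cannot show how the two indexing modes differ in shape, so here is a small sketch with inputs of different lengths. The variable names (`ij_x`, `xy_x`, and so on) are chosen only for illustration.

```python
import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5])

# "ij": output shape is (len(x), len(y)).
ij_x, ij_y = torch.meshgrid(x, y, indexing='ij')
# "xy": output shape is (len(y), len(x)).
xy_x, xy_y = torch.meshgrid(x, y, indexing='xy')

print(ij_x.shape, xy_x.shape)
# For two inputs, the "xy" grids are the transposes of the "ij" grids.
print(torch.equal(xy_x, ij_x.T), torch.equal(xy_y, ij_y.T))
```

The anchor-box code in this section uses `indexing='ij'` with `(center_h, center_w)`, so the first grid dimension runs over rows (height) and the second over columns (width).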

In [None]:
grid_x = grid_x.reshape(-1)
grid_x

tensor([1, 1, 1, 2, 2, 2, 3, 3, 3])

* `torch.cat(tensors, dim)`: Concatenates the given sequence of tensors along the given dimension.
* All tensors must either have the same shape (except in the concatenating dimension) or be empty.

In [None]:
x = torch.tensor([[1, 2],
                  [3, 4]])

Y1 = torch.cat((x, x), dim=0)
Y2 = torch.cat((x, x), dim=1)
Y1, Y2

(tensor([[1, 2],
         [3, 4],
         [1, 2],
         [3, 4]]),
 tensor([[1, 2, 1, 2],
         [3, 4, 3, 4]]))

* `torch.stack(tensors, dim=0)`: Concatenates a sequence of tensors along a new dimension.
* All tensors need to be of the same size.

In [None]:
x = torch.tensor([[1, 2],
                  [3, 4]])

Y1 = torch.stack((x, x), dim=0)
Y2 = torch.stack((x, x), dim=1)
Y1, Y2

(tensor([[[1, 2],
          [3, 4]],
 
         [[1, 2],
          [3, 4]]]),
 tensor([[[1, 2],
          [1, 2]],
 
         [[3, 4],
          [3, 4]]]))

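* `torch.t(input)`: Expects `input` to be a tensor with at most 2 dimensions and transposes dimensions 0 and 1. The cell below uses the `x.T` attribute, which gives the same result for a 2D tensor.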

In [None]:
x.T

tensor([[1, 3],
        [2, 4]])

* `torch.Tensor.repeat(*sizes)`: Repeats this tensor along the specified dimensions; `sizes` gives the number of repetitions along each dimension.

In [None]:
x.repeat(2, 1)

tensor([[1, 2],
        [3, 4],
        [1, 2],
        [3, 4]])

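* `torch.repeat_interleave(input, repeats, dim=None)`: Repeats each element of a tensor `repeats` times. With the default `dim=None`, the input is flattened first and a flat tensor is returned.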

In [None]:
x.repeat_interleave(2)

tensor([1, 1, 2, 2, 3, 3, 4, 4])
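`repeat` and `repeat_interleave` are easy to confuse, so the following sketch puts them side by side on the same 2D tensor. The names `tiled` and `interleaved` are illustrative only.

```python
import torch

x = torch.tensor([[1, 2],
                  [3, 4]])

# repeat tiles the whole tensor: rows appear in the order 0, 1, 0, 1.
tiled = x.repeat(2, 1)
# repeat_interleave with dim=0 repeats each row in place: 0, 0, 1, 1.
interleaved = x.repeat_interleave(2, dim=0)

print(tiled.tolist())        # [[1, 2], [3, 4], [1, 2], [3, 4]]
print(interleaved.tolist())  # [[1, 2], [1, 2], [3, 4], [3, 4]]
```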

* `torch.unsqueeze(input, dim)`: Returns a new tensor with a dimension of size one inserted at the specified position.

In [None]:
x.unsqueeze(0).shape

torch.Size([1, 2, 2])

In [None]:
in_height, in_width = 561, 728
sizes, ratios = [0.75, 0.5, 0.25], [1, 2, 0.5]

num_sizes, num_ratios = len(sizes), len(ratios)
boxes_per_pixel = (num_sizes + num_ratios - 1)
size_tensor = torch.tensor(sizes)
ratio_tensor = torch.tensor(ratios)

In [None]:
offset_h, offset_w = 0.5, 0.5
steps_h = 1.0 / in_height   # 1 / 561
steps_w = 1.0 / in_width    # 1 / 728

center_h = (torch.arange(in_height) + offset_h) * steps_h
center_w = (torch.arange(in_width) + offset_w) * steps_w
center_h.shape, center_w.shape

(torch.Size([561]), torch.Size([728]))

In [None]:
shift_y, shift_x = torch.meshgrid(center_h, center_w, indexing='ij')
shift_y, shift_x = shift_y.reshape(-1), shift_x.reshape(-1)

In [None]:
size_tensor * torch.sqrt(ratio_tensor[0])  # n width factors: every size paired with ratios[0]

tensor([0.75, 0.50, 0.25])

In [None]:
sizes[0] * torch.sqrt(ratio_tensor[1:])  # m - 1 width factors: sizes[0] paired with the remaining ratios

tensor([1.06, 0.53])
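The cells above build the pieces one at a time. The sketch below shows one way they might be combined into a full set of anchor boxes, loosely following the approach in the D2L book. The function name `anchor_boxes`, the height formula, and the `(xmin, ymin, xmax, ymax)` output layout are assumptions made for this illustration, not something defined in these notes.

```python
import torch

def anchor_boxes(in_height, in_width, sizes, ratios):
    """Sketch: one (xmin, ymin, xmax, ymax) box per (pixel, shape) pair."""
    size_tensor = torch.tensor(sizes)
    ratio_tensor = torch.tensor(ratios)
    boxes_per_pixel = len(sizes) + len(ratios) - 1

    # Pixel centers in normalized [0, 1] coordinates.
    center_h = (torch.arange(in_height) + 0.5) / in_height
    center_w = (torch.arange(in_width) + 0.5) / in_width
    shift_y, shift_x = torch.meshgrid(center_h, center_w, indexing='ij')
    shift_y, shift_x = shift_y.reshape(-1), shift_x.reshape(-1)

    # n shapes pair every size with ratios[0]; m - 1 shapes pair sizes[0]
    # with the remaining ratios. Widths are rescaled for non-square inputs.
    w = torch.cat((size_tensor * torch.sqrt(ratio_tensor[0]),
                   sizes[0] * torch.sqrt(ratio_tensor[1:]))) * in_height / in_width
    h = torch.cat((size_tensor / torch.sqrt(ratio_tensor[0]),
                   sizes[0] / torch.sqrt(ratio_tensor[1:])))

    # Half-extents of every shape, tiled once per pixel.
    half = torch.stack((-w, -h, w, h)).T.repeat(in_height * in_width, 1) / 2
    # Each center repeated once per shape, so rows line up with `half`.
    centers = torch.stack((shift_x, shift_y, shift_x, shift_y),
                          dim=1).repeat_interleave(boxes_per_pixel, dim=0)
    return (centers + half).unsqueeze(0)

boxes = anchor_boxes(4, 4, sizes=[0.75, 0.5, 0.25], ratios=[1, 2, 0.5])
print(boxes.shape)  # torch.Size([1, 80, 4])
```

Note how `repeat` tiles the shape table across pixels while `repeat_interleave` repeats each center, which keeps the two tensors aligned row by row.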

### 14.4.2 Intersection over Union (IoU)

* `torch.clamp(input, min=None, max=None)`: Clamps all elements in `input` into the range [`min`, `max`].

In [2]:
A = torch.randn(4)
A

tensor([ 2.14, -2.83,  0.39,  1.29])

In [3]:
A.clamp(min=-0.5, max=0.5)

tensor([ 0.50, -0.50,  0.39,  0.50])
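In an IoU computation, `clamp(min=0)` is typically what turns the negative width or height of a non-overlapping pair into a zero intersection. The following is a minimal sketch of pairwise IoU, assuming boxes in `(xmin, ymin, xmax, ymax)` format; the function name `box_iou` and the sample boxes are illustrative.

```python
import torch

def box_iou(boxes1, boxes2):
    """Pairwise IoU between two sets of (xmin, ymin, xmax, ymax) boxes."""
    area = lambda b: (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    # Broadcasting gives tensors of shape (len(boxes1), len(boxes2), 2).
    upper_left = torch.max(boxes1[:, None, :2], boxes2[:, :2])
    lower_right = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])
    # Non-overlapping pairs have negative extents; clamp them to zero.
    inter_wh = (lower_right - upper_left).clamp(min=0)
    inter = inter_wh[:, :, 0] * inter_wh[:, :, 1]
    union = area(boxes1)[:, None] + area(boxes2) - inter
    return inter / union

a = torch.tensor([[0.0, 0.0, 2.0, 2.0]])
b = torch.tensor([[1.0, 1.0, 3.0, 3.0],   # overlaps `a` in a 1 x 1 square
                  [5.0, 5.0, 6.0, 6.0]])  # disjoint from `a`
iou = box_iou(a, b)
print(iou)  # first entry is 1/7, second is 0
```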

### 14.4.3 Labeling Anchor Boxes in Training Data

* `torch.full(size, fill_value)`: Creates a tensor of size `size` filled with `fill_value`. The tensor's `dtype` is inferred from `fill_value`.

In [5]:
torch.full((2, 2), -1)

tensor([[-1, -1],
        [-1, -1]])

* `torch.nonzero(input)`: Returns a tensor containing the indices of all non-zero elements of the input.

In [6]:
a = torch.tensor([1, 1, 1, 0, 1])
torch.nonzero(a)

tensor([[0],
        [1],
        [2],
        [4]])

* `self.long()` is equivalent to `self.to(torch.int64)`
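These three utilities tend to appear together when anchors are assigned to ground-truth boxes. The sketch below covers only a simple threshold-based step and omits the refinement that guarantees every ground-truth box receives an anchor. The IoU matrix, the threshold, and the variable names are hypothetical values chosen for illustration.

```python
import torch

# Hypothetical IoU matrix: 4 anchors (rows) x 2 ground-truth boxes (columns).
iou = torch.tensor([[0.1, 0.7],
                    [0.6, 0.2],
                    [0.3, 0.4],
                    [0.0, 0.9]])
iou_threshold = 0.5

# Start with every anchor unassigned; -1 marks background.
anchor_to_gt = torch.full((iou.shape[0],), -1, dtype=torch.long)

# Best-matching ground-truth box for each anchor.
max_iou, best_gt = torch.max(iou, dim=1)
# Indices of anchors whose best IoU clears the threshold.
keep = torch.nonzero(max_iou >= iou_threshold).reshape(-1)
anchor_to_gt[keep] = best_gt[keep]

print(anchor_to_gt.tolist())  # [1, 0, -1, 1]
```

Anchor 2 stays at -1 because its best IoU (0.4) is below the threshold.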

### 14.4.4 Predicting Bounding Boxes with Non-Maximum Suppression

* `torch.argsort(input, dim=-1, descending=False)`: Returns the indices that sort a tensor along a given dimension by value, in ascending order by default or in descending order when `descending=True`.

In [8]:
A = torch.randn(4, 4)
A

tensor([[ 0.20,  0.04, -0.52, -0.92],
        [ 0.52, -0.85,  1.39,  1.90],
        [-0.23,  0.09,  0.76, -0.87],
        [ 0.95, -1.16, -0.58, -2.17]])

In [9]:
torch.argsort(A, dim=1)

tensor([[3, 2, 1, 0],
        [1, 0, 2, 3],
        [3, 0, 1, 2],
        [3, 1, 2, 0]])

In [10]:
torch.argsort(A, dim=1, descending=True)

tensor([[0, 1, 2, 3],
        [3, 2, 0, 1],
        [2, 1, 0, 3],
        [0, 2, 1, 3]])

* `torch.numel(input)`: Returns the total number of elements in the input tensor.

In [11]:
A = torch.randn(4, 4)
A.numel()

16
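To show where `argsort` and `numel` fit into non-maximum suppression, here is a minimal greedy sketch. It assumes boxes in `(xmin, ymin, xmax, ymax)` format; the helper names `iou_one_to_many` and `nms`, the sample boxes, and the scores are illustrative, not a reference implementation.

```python
import torch

def iou_one_to_many(box, boxes):
    """IoU of one (xmin, ymin, xmax, ymax) box against a set of boxes."""
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    upper_left = torch.max(box[:2], boxes[:, :2])
    lower_right = torch.min(box[2:], boxes[:, 2:])
    wh = (lower_right - upper_left).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, iou_threshold):
    """Return the indices of the kept boxes, highest score first."""
    order = torch.argsort(scores, descending=True)
    keep = []
    while order.numel() > 0:
        best = order[0]
        keep.append(best.item())
        if order.numel() == 1:
            break
        iou = iou_one_to_many(boxes[best], boxes[order[1:]])
        # Drop every remaining box that overlaps the current best too much.
        order = order[1:][iou <= iou_threshold]
    return keep

boxes = torch.tensor([[0.10, 0.08, 0.52, 0.92],
                      [0.08, 0.20, 0.56, 0.95],
                      [0.15, 0.30, 0.62, 0.91],
                      [0.55, 0.20, 0.90, 0.88]])
scores = torch.tensor([0.9, 0.8, 0.7, 0.6])
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)  # [0, 3]
```

Boxes 1 and 2 overlap box 0 with IoU above 0.5 and are suppressed, while box 3 does not overlap box 0 at all and survives.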

* `torch.unique(input, return_counts=False)`: Returns the unique elements of the input tensor.
* **return_counts (bool)**: whether to also return the counts for each unique element.

In [12]:
A = torch.tensor([1, 3, 2, 3])
A.unique(return_counts=True)

(tensor([1, 2, 3]), tensor([1, 1, 2]))
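One use of `return_counts=True` in this setting is to recover the indices that were not kept after suppression: concatenate the kept indices with all indices, and the ones that occur exactly once are the suppressed ones. The sketch below assumes a hypothetical `keep` tensor and four anchors.

```python
import torch

num_anchors = 4
keep = torch.tensor([0, 3])          # hypothetical indices kept by NMS
all_idx = torch.arange(num_anchors)  # tensor([0, 1, 2, 3])

# Kept indices appear twice in the concatenation, the rest only once.
combined = torch.cat((keep, all_idx))
uniques, counts = combined.unique(return_counts=True)
non_keep = uniques[counts == 1]

print(non_keep.tolist())  # [1, 2]
```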

In [16]:
label = ('dog=', 'cat=')[0] 
label

'dog='

In [18]:
label = ('dog=', 'cat=')[1] 
label

'cat='