# Named Tensor

The dimensions (or axes) of our tensors usually index something like pixel locations or color channels. This means that when we want to index into a tensor, we need to remember the ordering of the dimensions and write our indexing accordingly. As data is transformed through multiple tensors, keeping track of which dimension contains what data can be error-prone. To make things concrete, imagine that we have a 3D tensor like `img_t` from the previous exercise, and we want to convert it to grayscale. We looked up typical weights for the colors to derive a single brightness value:

In [2]:
import torch

img_t = torch.randn(3, 5, 5) # shape [channels, rows, columns]
weights = torch.tensor([0.2126, 0.7152, 0.0722])

In [5]:
img_t

tensor([[[-0.5589, -1.0746, -0.6100, -0.8683, -1.6213],
         [ 0.5417, -0.1866,  0.4269, -0.7971, -0.3946],
         [-0.6690, -0.3845, -1.7485,  0.1154,  0.9909],
         [-0.0973, -0.1809, -1.3386,  0.2153, -0.8393],
         [-2.1066, -0.5012,  0.1529, -0.9267,  0.0737]],

        [[-1.1205, -0.9946,  0.4220,  0.6887,  0.0620],
         [-0.3219,  0.9497, -0.2740, -1.4052, -0.3920],
         [-0.0355, -1.2609, -0.3999, -0.3529, -1.7494],
         [-1.5558,  0.3576, -0.1922, -0.6631,  0.1494],
         [-0.1353, -0.4318,  1.6777, -1.0682, -0.1450]],

        [[ 0.0434,  1.8187,  0.3129,  0.6049, -0.0367],
         [ 1.0821, -1.0859, -0.0346, -0.4126, -2.1605],
         [-0.2485,  0.0835,  0.1817, -0.6669,  2.2044],
         [-0.0636,  0.5213, -0.0782,  0.1915, -0.7568],
         [-0.0136,  1.1733, -0.5161, -1.7136, -1.2537]]])

In [8]:
weights

tensor([0.2126, 0.7152, 0.0722])

We also often want our code to generalize: for example, from grayscale images represented as 2D tensors with height and width dimensions to color images with a third channel dimension (as in RGB), or from a single image to a batch of images. In the previous exercise (`notebook 2a`), we introduced an additional batch dimension in `batch_t`; here we pretend to have a batch of 2:

In [10]:
batch_t = torch.randn(2, 3, 5, 5) # shape [batch, channels, rows, columns]
batch_t

tensor([[[[-0.8129, -0.1675, -0.8787, -0.3952,  1.9291],
          [-0.8607,  0.3021,  1.7827,  0.0149,  0.4058],
          [-0.1087,  0.3600,  0.2970,  1.6537,  0.0348],
          [-1.8040, -2.7514,  0.5948, -1.2703,  1.0410],
          [-0.4613, -0.8717, -0.4430, -1.0025, -1.1186]],

         [[ 0.6857, -0.8334,  0.1545, -0.7820, -0.6724],
          [-1.9802, -0.9720,  0.7111,  0.6594,  0.4160],
          [-0.2974,  1.5287, -0.4158,  1.2487, -1.2285],
          [ 0.3075,  0.6630, -2.3555, -0.7474,  0.0067],
          [ 0.7398,  0.3044,  0.3735, -1.1308,  0.9371]],

         [[ 0.6127, -0.7907, -2.1932,  0.1061, -0.3831],
          [-0.3182,  1.6606, -0.0240,  1.2927,  0.1177],
          [ 0.8799, -0.2720,  0.6009, -0.0716, -1.0625],
          [-0.3228,  0.6162,  0.0630,  1.8843, -0.0269],
          [ 0.2552,  0.8008,  1.6354,  0.1947,  0.1643]]],


        [[[ 1.7715, -0.1015,  1.0379, -1.6435,  0.7330],
          [ 1.1886,  0.3734, -1.0309, -1.1605,  0.4632],
          [ 0.9654, -0.

In [14]:
img_gray_naive = img_t.mean(-3) # unweighted mean over the channel dimension (third from the end)
img_gray_naive

tensor([[-0.5453, -0.0835,  0.0416,  0.1417, -0.5320],
        [ 0.4340, -0.1076,  0.0394, -0.8716, -0.9824],
        [-0.3177, -0.5206, -0.6556, -0.3015,  0.4820],
        [-0.5722,  0.2327, -0.5363, -0.0854, -0.4822],
        [-0.7518,  0.0801,  0.4382, -1.2362, -0.4417]])

In [17]:
batch_gray_naive = batch_t.mean(-3) # same call on the batch: averages the 3 channels of each of the 2 images, leaving shape [2, 5, 5]
batch_gray_naive

tensor([[[ 0.1619, -0.5972, -0.9725, -0.3570,  0.2912],
         [-1.0530,  0.3302,  0.8233,  0.6557,  0.3132],
         [ 0.1579,  0.5389,  0.1607,  0.9436, -0.7521],
         [-0.6064, -0.4907, -0.5659, -0.0445,  0.3403],
         [ 0.1779,  0.0778,  0.5220, -0.6462, -0.0058]],

        [[ 1.0293,  0.2124,  0.5916, -0.5614,  0.6824],
         [-0.1450,  0.0072, -0.9007,  0.6172,  0.4515],
         [ 0.0479,  0.7052,  0.1393, -0.3158, -0.8363],
         [-0.3273,  0.0604, -0.2565, -0.2618,  0.2394],
         [-0.3183, -0.9241, -0.2589,  0.2064,  1.0050]]])

In [19]:
img_gray_naive.shape, batch_gray_naive.shape

(torch.Size([5, 5]), torch.Size([2, 5, 5]))

But now we have the weights, too. PyTorch allows us to multiply tensors that have the same shape, as well as tensors where one operand has size 1 in a given dimension. It also prepends leading dimensions of size 1 automatically. This feature is called broadcasting. `batch_t` of shape (2, 3, 5, 5) is multiplied by `unsqueezed_weights` of shape (3, 1, 1), resulting in a tensor of shape (2, 3, 5, 5), from which we can then sum over the third dimension from the end (the three channels):

In [22]:
unsqueezed_weights = weights.unsqueeze(-1).unsqueeze(-1) # shape [3] -> [3, 1, 1]
img_weights = (img_t * unsqueezed_weights)
batch_weights = (batch_t * unsqueezed_weights)
img_gray_weighted = img_weights.sum(-3)
batch_gray_weighted = batch_weights.sum(-3)

img_weights.shape, batch_weights.shape, unsqueezed_weights.shape

(torch.Size([3, 5, 5]), torch.Size([2, 3, 5, 5]), torch.Size([3, 1, 1]))
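The shape rule described above can be sketched in plain Python. This is only an illustration of the rule, not PyTorch's implementation, and the helper name `broadcast_shape` is hypothetical. Shapes are compared from the trailing dimension backwards, a missing leading dimension is treated as size 1, and two sizes are compatible when they are equal or when one of them is 1.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the shape that results from broadcasting two shapes together.

    Hypothetical helper for illustration only; it mirrors the rule from the
    text, not PyTorch's internal code.
    """
    result = []
    # Walk both shapes from the last dimension to the first.
    # fillvalue=1 plays the role of the automatically added leading dimensions.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))


print(broadcast_shape((2, 3, 5, 5), (3, 1, 1)))  # (2, 3, 5, 5)
print(broadcast_shape((3, 5, 5), (3, 1, 1)))     # (3, 5, 5)

# Without the two unsqueeze calls, weights has shape (3,), whose only
# dimension would be lined up with the columns (size 5) and fail to match.
try:
    broadcast_shape((3, 5, 5), (3,))
except ValueError as err:
    print(err)
```

The last case shows why `weights` is unsqueezed twice before the multiplication: the channel dimension has to sit third from the end so that it lines up with the channels of `img_t` and `batch_t`.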

# Managing Tensor dtype

To allocate a tensor of the right numeric type, we can pass the desired dtype as an argument to the constructor:

In [27]:
double_points = torch.ones(10, 2, dtype=torch.double)
double_points.dtype

torch.float64

In [28]:
short_points = torch.tensor([[1, 2], [3, 4]], dtype=torch.short)
short_points.dtype

torch.int16

We can also cast the output of a tensor creation function to the right type using the corresponding casting method:

In [29]:
double_points = torch.ones(10, 2).double()
double_points.dtype

torch.float64

In [31]:
short_points = torch.ones(10, 2).short()
short_points.dtype

torch.int16

In [32]:
# a more convenient alternative: the to method
double_points = torch.zeros(10, 2).to(torch.double)
short_points = torch.ones(10, 2).to(dtype=torch.short)

When an operation mixes input types, PyTorch automatically promotes the inputs to a common type, typically the larger one. Thus, if we want 32-bit computation, we need to make sure all our inputs are (at most) 32-bit:

In [33]:
points_64 = torch.randn(5, dtype=torch.double)
points_64

tensor([0.3175, 0.5838, 1.2843, 1.3425, 0.5230], dtype=torch.float64)

In [34]:
points_short = points_64.to(torch.short)
points_short

tensor([0, 0, 1, 1, 0], dtype=torch.int16)

In [35]:
points_64 * points_short

tensor([0.0000, 0.0000, 1.2843, 1.3425, 0.0000], dtype=torch.float64)

The result shows that the `int16` operand was promoted to the larger type, `float64`.
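As a small sketch of how to check promotion without inspecting a result by hand, `torch.promote_types` and `torch.result_type` report the dtype an operation would produce. The variable name `points_32` is introduced here for illustration and is not part of the notebook; the example assumes default PyTorch promotion behavior.

```python
import torch

# Promotion between two dtypes, independent of any tensor values.
print(torch.promote_types(torch.int16, torch.float64))  # torch.float64

points_64 = torch.randn(5, dtype=torch.double)
points_short = points_64.to(torch.short)

# The dtype that an operation on these two tensors would produce.
print(torch.result_type(points_64, points_short))  # torch.float64

# To stay in 32-bit, cast the 64-bit input down before mixing it in.
points_32 = points_64.to(torch.float32)
print((points_32 * points_short).dtype)  # torch.float32
```

Checking `result_type` before a computation is one way to catch an unintended 64-bit input early, since a single `float64` operand is enough to make the whole result `float64`.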