# PyTorch tutorials

Tensors are fundamental data structures for deep learning.

Indices required | Computer Science | Mathematics
--- | --- | ---
0 | number | scalar
1 | array | vector
2 | 2D array | matrix

Any array with more than 2 dimensions is called an n-dimensional tensor (nd-tensor). For example, a 3D array is a 3D-tensor, and so on.

In [1]:
import torch
import numpy as np

### Accessing tensors with indices

In [2]:
a = [1,2,3,4]

In [3]:
a[2]

3

In [4]:
dd = [[1,2,3],
     [4,5,6],
     [7,8,9]]

In [5]:
dd[0][1]

2

## Rank, Axes and Shape

The concept of a tensor builds on three attributes.
1. The rank of a tensor is the number of dimensions it has. In short, it tells how many indices are required to access a single element of the tensor.

2. Rank = # of axes in the tensor

3. The shape of the tensor is given by the length of each axis, e.g. (#rows, #columns) for a matrix or (#rows, #columns, #channels/depth) for a rank 3 tensor.
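
The three attributes above can be read directly off a tensor. The snippet below is a minimal sketch; the matrix `m` is a hypothetical example that does not appear elsewhere in this notebook.

```python
import torch

# Hypothetical 2 x 3 matrix used only for illustration.
m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

rank = m.dim()           # number of axes
shape = tuple(m.shape)   # length of each axis

print(rank)    # 2
print(shape)   # (2, 3)

# A rank 2 tensor needs two indices (one per axis) to reach a single element.
print(m[1][2].item())  # 6
```

The rank always equals the length of the shape, which is why `len(m.shape)` is often used in place of `m.dim()`.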

In [6]:
a = [[1,2,3],
    [4,5,6],
    [7,8,9]]

In [7]:
a[0]

[1, 2, 3]

In [8]:
a[1]

[4, 5, 6]

In [9]:
a[2]

[7, 8, 9]

In [10]:
a[0][0]

1

In [11]:
a[1][0]

4

In [12]:
a[2][0]

7

In [13]:
dd = [[[111, 112, 113], [121, 122, 123]],
     [[211, 212, 213], [221, 222, 223]],
     [[311, 312, 313], [321, 322, 323]]]
dd

[[[111, 112, 113], [121, 122, 123]],
 [[211, 212, 213], [221, 222, 223]],
 [[311, 312, 313], [321, 322, 323]]]

In [14]:
t = torch.tensor(dd)
t

tensor([[[111, 112, 113],
         [121, 122, 123]],

        [[211, 212, 213],
         [221, 222, 223]],

        [[311, 312, 313],
         [321, 322, 323]]])

In [15]:
t.shape  # size and shape of tensor is the same

torch.Size([3, 2, 3])

In [16]:
type(t)

torch.Tensor

In [17]:
t[0][0][2]

tensor(113)

In [18]:
t[1][1][0]

tensor(221)

In [20]:
t[2][0][1]

tensor(312)

### Tensor Reshaping

Changes the shape but not the underlying data.

In [21]:
a = torch.tensor(a)
a

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [22]:
a.shape

torch.Size([3, 3])

In [23]:
a.reshape(1,9)

tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [24]:
a.reshape(1,9).shape

torch.Size([1, 9])
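
The claim that reshaping changes the shape but not the underlying data can be checked by comparing storage addresses. This is a sketch that assumes a contiguous tensor, for which `reshape` returns a view; the names `x` and `y` are illustrative.

```python
import torch

x = torch.tensor([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
y = x.reshape(1, 9)

# For a contiguous tensor, reshape returns a view on the same storage.
same_storage = x.data_ptr() == y.data_ptr()
print(same_storage)  # True

# Writing through the view is therefore visible in the original tensor.
y[0, 0] = 100
print(x[0, 0].item())  # 100
```

For non-contiguous tensors `reshape` may have to copy, so code that relies on the sharing should not assume it in general.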

## CNN Tensors

The shape of a CNN input tensor typically has length 4, i.e. we have a rank 4 tensor with 4 axes. The individual data values are reached by indexing the last axis.

a = [a0,a1,a2,a3]

Image data is usually represented as:
[# of data samples, # channels, height, width], hence a 4D-tensor.

1. First axis: total samples / batch size of samples.
2. Second axis: color channels or depth of image
3. Third axis: height / rows in an image
4. Fourth axis: width, columns in an image

As an image passes through the network, its shape changes. The number of channels will increase or decrease based on the number of filters present in the layer, while the height and width will change too, based on the convolution operations being performed in each layer.


When the convolution operation is performed, the output of each convolutional layer is a set of modified images. The number of these images corresponds directly to the number of filters present in the layer. These are called **Feature Maps**. They are created by the convolution between the input channels and the convolution filters.
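
The shape change described above can be observed with a single convolutional layer. This is a minimal sketch: the batch size, image size and layer settings (6 filters of size 5 x 5, no padding, stride 1) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 2 grayscale images of 28 x 28 pixels.
images = torch.rand(2, 1, 28, 28)

# Hypothetical layer: 6 filters of size 5 x 5, no padding, stride 1.
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

feature_maps = conv(images)
print(images.shape)        # torch.Size([2, 1, 28, 28])
print(feature_maps.shape)  # torch.Size([2, 6, 24, 24])
```

The channel axis grows from 1 to 6 (one feature map per filter), while height and width shrink from 28 to 28 - 5 + 1 = 24.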

### PyTorch Tensors

Apart from rank, axes, and shape, which are fundamental attributes of tensors in every deep learning framework, PyTorch tensors have three attributes of their own: datatype, device and layout.

In [25]:
t = torch.Tensor()
t  # we can create an empty tensor

tensor([])

In [26]:
print(t.dtype)
print(t.device)
print(t.layout)

torch.float32
cpu
torch.strided


PyTorch attributes are:

1. DataType (dtype): specifies the type of data contained in a tensor. Each dtype has a CPU and a GPU version. Older PyTorch releases required the operands of a tensor operation to have the same dtype; recent releases apply type promotion to many mixed-dtype operations.

2. Device: indicates where the tensor computations will be performed, i.e. on the CPU or on a GPU.

3. Layout: strided (describes how the data is laid out in memory). It is the default layout and is not usually changed.


Tensors contain data of a uniform type. Tensor computations depend on the dtype and the device.

In [27]:
t1 = torch.tensor([1,2,3])
t2 = torch.tensor([1.,2.,3.,4.])

In [28]:
t1.dtype

torch.int64

In [29]:
t2.dtype

torch.float32

In [30]:
t1 + t2   # fails: t1 has 3 elements and t2 has 4, so the sizes (not the dtypes) are incompatible

RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 0

In [31]:
t1.device

device(type='cpu')

In [32]:
t2 = t1.cuda()

In [33]:
t2.device

device(type='cuda', index=0)

In [34]:
t2

tensor([1, 2, 3], device='cuda:0')

In [35]:
t1

tensor([1, 2, 3])

In [36]:
t1 + t2   # tensors on different devices (CPU and GPU) cannot be added

RuntimeError: expected device cpu but got device cuda:0
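
The dtype and device rules above can be handled explicitly in code. The sketch below assumes a recent PyTorch release in which mixed int64/float32 arithmetic is resolved by type promotion; the variable names are illustrative, and the device selection falls back to the CPU when no GPU is available.

```python
import torch

ints = torch.tensor([1, 2, 3])        # int64
floats = torch.tensor([1., 2., 3.])   # float32

# dtype that a mixed operation produces under type promotion.
promoted = torch.result_type(ints, floats)
total = ints + floats
print(promoted, total.dtype)

# Pick one device and move both operands to it before combining them.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
total_on_device = ints.to(device) + floats.to(device)
print(total_on_device.device.type == device.type)
```

Moving every operand with `.to(device)` avoids the cross-device error shown above without hard-coding `.cuda()`.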

### Creating Tensors from data

There are 4 primary ways of creating tensor objects from data in PyTorch:

1. torch.Tensor(data) - the odd one out, since it is the class constructor.
2. torch.tensor(data)
3. torch.as_tensor(data)
4. torch.from_numpy(data)

The last three may look similar in output but work differently internally.

In [37]:
data = np.array([1,2,3])
type(data)

numpy.ndarray

In [38]:
torch.Tensor(data)    # creates float value array

tensor([1., 2., 3.])

The lowercase torch.tensor is a factory function that builds tensor objects, while the capitalized torch.Tensor is the class constructor.

In [39]:
torch.tensor(data) 

tensor([1, 2, 3])

In [40]:
torch.as_tensor(data)

tensor([1, 2, 3])

In [41]:
torch.from_numpy(data)

tensor([1, 2, 3])

In [42]:
torch.eye(2)

tensor([[1., 0.],
        [0., 1.]])

In [43]:
torch.zeros(2,2)

tensor([[0., 0.],
        [0., 0.]])

In [44]:
torch.ones(2,2)

tensor([[1., 1.],
        [1., 1.]])

In [45]:
torch.rand(2,2)

tensor([[0.8844, 0.0195],
        [0.3193, 0.4995]])

## Creating PyTorch Tensors for Deep Learning

PyTorch tensors are instances of the torch.Tensor class.
As discussed above, torch.Tensor is the class constructor, while the other 3 functions are factory functions. A factory function is an object-oriented programming pattern: it takes in parameters and returns an instance of a class.



In [46]:
data = np.array([1,2,3])
type(data)

numpy.ndarray

In [47]:
t1 = torch.Tensor(data)  # constructor method
t2 = torch.tensor(data)   # factory method
t3 = torch.as_tensor(data)  # factory method
t4 = torch.from_numpy(data)  # factory method

In [48]:
print(t1)
print(t2)
print(t3)
print(t4)

tensor([1., 2., 3.])
tensor([1, 2, 3])
tensor([1, 2, 3])
tensor([1, 2, 3])


In [49]:
print(t1.dtype)
print(t2.dtype)
print(t3.dtype)
print(t4.dtype)

torch.float32
torch.int64
torch.int64
torch.int64


In [50]:
torch.get_default_dtype()

torch.float32

As can be seen above, torch.Tensor produces a float32 tensor irrespective of the dtype of the data passed to it. This is because the constructor uses the global default dtype, which the line above shows is **float32**.

Factory functions, on the other hand, choose the dtype based on the inputs provided. The 2 cells below show this automatic type inference.

In [51]:
torch.tensor(np.array([1,2,3]))

tensor([1, 2, 3])

In [52]:
torch.tensor(np.array([1.,2.,3.]))  

tensor([1., 2., 3.], dtype=torch.float64)

Data types can also be set explicitly, as shown below.

In [53]:
torch.tensor(np.array([1,2,3]), dtype=torch.float64)  

tensor([1., 2., 3.], dtype=torch.float64)

The constructor, however, performs no type inference and accepts no dtype argument, as shown below.

In [54]:
t = np.array([1,2,3], dtype=np.float64)

In [55]:
t.dtype

dtype('float64')

In [56]:
t

array([1., 2., 3.])

In [57]:
t5 = torch.Tensor(t)

In [58]:
t5.dtype

torch.float32

In [59]:
t5 = torch.Tensor(t, dtype=torch.float64)  # fails: the constructor does not accept a dtype argument

TypeError: new() received an invalid combination of arguments - got (numpy.ndarray, dtype=torch.dtype), but expected one of:
 * (torch.device device)
 * (torch.Storage storage)
 * (Tensor other)
 * (tuple of ints size, torch.device device)
      didn't match because some of the keywords were incorrect: dtype
 * (object data, torch.device device)
      didn't match because some of the keywords were incorrect: dtype


## Memory sharing vs copying

In [60]:
data = np.array([1,2,3])
data

array([1, 2, 3])

In [61]:
t1 = torch.Tensor(data)  # constructor method
t2 = torch.tensor(data)   # factory method
t3 = torch.as_tensor(data)  # factory method
t4 = torch.from_numpy(data)  # factory method

In [62]:
data[0] = 0
data[1] = 0
data[2] = 0

In [63]:
# contain the original data values as given in the array
print(t1)  
print(t2)

tensor([1., 2., 3.])
tensor([1, 2, 3])


In [64]:
print(t3)
print(t4)

tensor([0, 0, 0])
tensor([0, 0, 0])


From above, t1 and t2 still show the original values, while t3 and t4 contain the data after the change, i.e. they mirror the array. This has to do with how memory is allocated for the tensors.

t1 and t2 hold their own copy of the input data in memory, while t3 and t4 share memory with the numpy array. Sharing avoids a copy, which is better for performance.

i.e.
torch.as_tensor() and torch.from_numpy() share memory with the numpy array, while
torch.tensor() and torch.Tensor() copy the data into new memory.

The best option usually is to use torch.tensor() for converting numpy data to tensor objects. The second option, if we would like a performance boost, is torch.as_tensor().

Things to remember:
1. torch.as_tensor() does not share memory with built-in Python data structures such as lists; it copies them.
2. Since a numpy.ndarray allocates memory on the CPU, as_tensor() must copy the data from CPU to GPU when the target device is a GPU.
3. The as_tensor() performance improvement will be greater if there are lots of back and forth operations between numpy.ndarray and tensor objects.
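
The sharing and copying behaviour summarised above can be verified in a few lines. This is a minimal sketch with illustrative variable names; it runs on the CPU only.

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.as_tensor(arr)   # shares memory with arr
copied = torch.tensor(arr)      # owns a separate copy

arr[0] = 100
print(shared[0].item(), copied[0].item())  # 100 1

# A Python list has no buffer to share, so as_tensor copies it.
values = [1, 2, 3]
from_list = torch.as_tensor(values)
values[0] = 100
print(from_list[0].item())  # 1

# The sharing also holds in the other direction for CPU tensors.
back = shared.numpy()
print(np.shares_memory(arr, back))  # True
```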

## Flatten, Reshape and Squeeze

Tensor operations usually fall into 4 higher level operations:
1. Reshaping operations
2. Element-wise operations
3. Reduction operations
4. Access operations

Reshaping operations are among the most important types of tensor operations.

In [65]:
t = torch.tensor([[1,1,1,1],
                 [2,2,2,2],
                 [3,3,3,3]], dtype=torch.float32)

For the above tensor t, elements in the first axis are vectors, while those in the second axis are numbers.

Axis 1 = rows / height of image / x <br>
Axis 2 = columns / width of image / y <br>
Axis 3 = depth / # of channels in an image / z <br>

In [66]:
# rank 2 tensor with 2 axes with shape (3,4), 2 ways to access it as below
t.shape

torch.Size([3, 4])

In [67]:
t.size()

torch.Size([3, 4])

In [68]:
len(t.shape)   # gives rank of the tensor

2

In [69]:
torch.tensor(t.shape).prod()   # total 12 elements in this tensor

tensor(12)

In [70]:
t.numel() # returns number of elements in the tensor.

12

### Reshaping a Tensor

In [71]:
t1 = t.reshape(1,12)

In [72]:
print(t1.shape)
print(len(t1))

torch.Size([1, 12])
1


In [73]:
t2 = t.reshape(2,6)

In [74]:
print(t2.shape)
print(len(t2))

torch.Size([2, 6])
2


In [75]:
t3 = t.reshape(4,3)

In [76]:
print(t3.shape)
print(len(t3))

torch.Size([4, 3])
4


In [77]:
t4 = t.reshape(12,1)

In [78]:
print(t4.shape)
print(len(t4))

torch.Size([12, 1])
12


A reshape does not change the rank of this tensor if the new shape also has 2 factors whose product equals the total number of elements.

In the above example, the tensor had 12 elements in total. The 2-factor shapes for 12 are therefore:
1. (1,12) = 1 x 12
2. (2,6) = 2 x 6
3. (3,4) = 3 x 4
4. (4,3) = 4 x 3
5. (6,2) = 6 x 2
6. (12,1) = 12 x 1

These are all the possible 2-axis shapes for a tensor with 12 elements.
However, we can use 3 factors as well, in which case the rank of the tensor changes: <br>

One such example is: <br>
2 x 2 x 3 = (2,2,3) = 12 elements in total. There can be more combinations as well.

In [79]:
t5 = t.reshape(2,2,3)

In [80]:
t5

tensor([[[1., 1., 1.],
         [1., 2., 2.]],

        [[2., 2., 3.],
         [3., 3., 3.]]])

In [81]:
t5.shape

torch.Size([2, 2, 3])

In [82]:
len(t5.shape)

3
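
The factor rule above can be checked programmatically. The sketch below enumerates every 2-axis shape for 12 elements and shows that a shape whose product is not 12 is rejected; `torch.arange(12)` is a stand-in tensor chosen for illustration.

```python
import torch

t = torch.arange(12)
n = t.numel()

# Every (rows, columns) pair whose product equals the element count.
two_factor_shapes = [(r, n // r) for r in range(1, n + 1) if n % r == 0]
print(two_factor_shapes)
# [(1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1)]

for shape in two_factor_shapes:
    assert t.reshape(shape).numel() == n

# A shape whose product differs from 12 is rejected.
try:
    t.reshape(5, 3)
    invalid_rejected = False
except RuntimeError:
    invalid_rejected = True
print(invalid_rejected)  # True
```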

### Squeezing a Tensor

Squeezing drops every axis of length 1, while unsqueezing adds an axis of length 1 to the tensor. Below, the tensor initially has a shape of (1,12); squeezing drops the axis with length 1. Unsqueezing then adds an axis at the dimension we specify, in this case dimension 0.

In [83]:
print(t.reshape(1,12))
print(t.reshape(1,12).shape)

tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])
torch.Size([1, 12])


In [84]:
print(t.reshape(1,12).squeeze())
print(t.reshape(1,12).squeeze().shape)

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])
torch.Size([12])


In [85]:
print(t.reshape(1,12).squeeze().unsqueeze(dim=0))
print(t.reshape(1,12).squeeze().unsqueeze(dim=0).shape)

tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])
torch.Size([1, 12])
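
Both operations also accept a specific dimension. The sketch below uses a hypothetical tensor of shape (1, 3, 1) to show how the result depends on the dimension passed.

```python
import torch

x = torch.zeros(1, 3, 1)

print(x.squeeze().shape)      # torch.Size([3])           all length-1 axes dropped
print(x.squeeze(0).shape)     # torch.Size([3, 1])        only axis 0 dropped
print(x.squeeze(1).shape)     # torch.Size([1, 3, 1])     axis 1 has length 3, unchanged
print(x.unsqueeze(-1).shape)  # torch.Size([1, 3, 1, 1])  new trailing axis
```

Passing a dimension to `squeeze` is the safer choice when a batch axis of length 1 must be preserved.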


### Flattening a Tensor

Squeezing and unsqueezing let us shrink or expand the rank of a tensor. A common use case for squeezing is a flatten function. Flattening a tensor means creating a 1D array from all the elements of the tensor: every axis is removed except one, which holds all the elements. This reduces the rank of the tensor to 1. In a CNN, flattening converts the output of the 2D convolutional layers into the 1D input of the fully connected layers at the end of the pipeline. For example, a 28 x 28 image is flattened into a (784,) array before passing it to the fully connected layer.

In [86]:
def flatten(t):
    t = t.reshape(1,-1)
    t = t.squeeze()
    return t

In [87]:
flatten(t)

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

In [88]:
t.reshape(1,12)

tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])

From above, we can observe that reshape retains the additional axis, while flattening removes it (through the squeeze). Notice that in flatten we reshape to (1,-1). This tells PyTorch that the first dimension of the tensor has length 1 while the second dimension should be inferred from the number of elements present in the tensor; -1 requests this inference. Here we get a (1,12) tensor. If we want the number of rows to be inferred and just have 1 column, we do the opposite, as shown below.

In [89]:
t.reshape(-1,1)

tensor([[1.],
        [1.],
        [1.],
        [1.],
        [2.],
        [2.],
        [2.],
        [2.],
        [3.],
        [3.],
        [3.],
        [3.]])

However, we can also flatten without squeezing by exploiting the -1 inference. The flatten function above gives us a tensor whose shape is its number of elements, which is 12. Reshaping to (-1,) produces the same result in one step, as shown below.

In [90]:
flatten(t)

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

In [91]:
flatten(t).shape

torch.Size([12])

In [92]:
t.reshape(-1,)

tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])

In [93]:
t.reshape(-1,).shape

torch.Size([12])

### Concatenating Tensors

Concatenating tensors is possible only when they match along every axis except the one being concatenated (height, width and, if involved, depth). For instance, two 2x2 tensors can be concatenated into a (4,2) or a (2,4) tensor, since the height and width of both tensors are the same. However, if we have a (2,4) and a (2,6) tensor, we can only concatenate along the columns (dim=1). Concatenating along an axis simply increases the length of that axis. Here concatenation increases the number of columns. <br>

If we try to concatenate along the rows (dim=0), it throws an error, as shown below, due to the mismatching number of columns.

In [94]:
a = torch.rand(2,4)
a

tensor([[0.6527, 0.6749, 0.8400, 0.3468],
        [0.5510, 0.7167, 0.1529, 0.9549]])

In [95]:
b = torch.rand(2,6)
b

tensor([[0.0461, 0.9802, 0.2830, 0.1092, 0.8832, 0.0994],
        [0.8176, 0.8110, 0.7208, 0.7664, 0.9285, 0.4371]])

In [96]:
c = torch.cat((a,b), dim=1)
c

tensor([[0.6527, 0.6749, 0.8400, 0.3468, 0.0461, 0.9802, 0.2830, 0.1092, 0.8832,
         0.0994],
        [0.5510, 0.7167, 0.1529, 0.9549, 0.8176, 0.8110, 0.7208, 0.7664, 0.9285,
         0.4371]])

In [97]:
c.shape

torch.Size([2, 10])

In [98]:
c = torch.cat((a,b), dim=0)
c

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 4 and 6 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:612

In [99]:
a = torch.rand(2,2)
a

tensor([[0.6916, 0.2646],
        [0.0282, 0.0037]])

In [100]:
b = torch.rand(2,2)
b

tensor([[0.3574, 0.6882],
        [0.9785, 0.2609]])

In [101]:
# since both are 2x2 concatenating in either dimension will yield a new tensor.
c = torch.cat((a,b), dim=0)
d = torch.cat((a,b), dim=1) 
print(c)
print(d)

tensor([[0.6916, 0.2646],
        [0.0282, 0.0037],
        [0.3574, 0.6882],
        [0.9785, 0.2609]])
tensor([[0.6916, 0.2646, 0.3574, 0.6882],
        [0.0282, 0.0037, 0.9785, 0.2609]])


In [102]:
print(c.shape)
print(d.shape)

torch.Size([4, 2])
torch.Size([2, 4])
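
The same matching rule extends to tensors with a depth axis, and it helps to contrast `torch.cat` with `torch.stack`, which joins tensors along a new axis instead of an existing one. The shapes below are hypothetical and chosen only for illustration.

```python
import torch

# Rank 3 tensors that match on every axis except axis 1.
a = torch.zeros(2, 3, 4)
b = torch.ones(2, 5, 4)
joined = torch.cat((a, b), dim=1)
print(joined.shape)   # torch.Size([2, 8, 4])

# cat extends an existing axis; stack creates a new one.
p = torch.zeros(2, 2)
q = torch.ones(2, 2)
print(torch.cat((p, q), dim=0).shape)    # torch.Size([4, 2])
stacked = torch.stack((p, q), dim=0)
print(stacked.shape)                     # torch.Size([2, 2, 2])
```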


## CNN Tensor Flatten operation Explained

In a CNN we need to flatten the output of the convolutional layers. But when we do so, we don't flatten a single image or tensor as explained above; we flatten a batch of images. Images have 3 channels if color or 1 channel if grayscale. When we flatten a batch of images, we want a 1D tensor for each image while keeping the batch axis, i.e. the number of samples in the batch. <br>

Consider an example. Say we have a batch of 3 images. Each image is grayscale and of shape (4,4). Stacked together, the batch tensor has shape (3,4,4). With an explicit single channel axis it becomes (3,1,4,4), and for color images it would be (3,3,4,4). In either case, the batch tensor is in (batch size, num_channels, height, width) format. On flattening a CNN tensor, we want the (batch_size, # of elements in flattened grayscale / color image) format.


In [103]:
t1 = torch.tensor([
    [1,1,1,1],
    [1,1,1,1],
    [1,1,1,1],
    [1,1,1,1]
])

t2 = torch.tensor([
    [2,2,2,2],
    [2,2,2,2],
    [2,2,2,2],
    [2,2,2,2]
])

t3 = torch.tensor([
    [3,3,3,3],
    [3,3,3,3],
    [3,3,3,3],
    [3,3,3,3]
])

We need to combine the above tensors into a batch. torch.stack joins them along a new first axis.

In [104]:
t = torch.stack((t1,t2,t3), dim=0)
t

tensor([[[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]],

        [[2, 2, 2, 2],
         [2, 2, 2, 2],
         [2, 2, 2, 2],
         [2, 2, 2, 2]],

        [[3, 3, 3, 3],
         [3, 3, 3, 3],
         [3, 3, 3, 3],
         [3, 3, 3, 3]]])

In [105]:
t.shape

torch.Size([3, 4, 4])

## **NOTE: Numpy arrays or images are in (height, width, channels) format while PyTorch represents images in (channels, height, width) format.**
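
Converting between the two layouts mentioned in the note is a matter of reordering axes. This is a minimal sketch using a hypothetical blank 4 x 5 RGB image; `permute` is one common way to do the reordering.

```python
import numpy as np
import torch

# Hypothetical image in (height, width, channels) order.
image_hwc = np.zeros((4, 5, 3), dtype=np.float32)

# Move the channel axis to the front: (channels, height, width).
image_chw = torch.from_numpy(image_hwc).permute(2, 0, 1)
print(image_chw.shape)   # torch.Size([3, 4, 5])

# Add a batch axis to obtain the 4D layout used by CNN layers.
batch = image_chw.unsqueeze(0)
print(batch.shape)       # torch.Size([1, 3, 4, 5])
```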

Adding an additional axis as shown below does not change the total element count in the tensor. 

In [106]:
t = t.reshape(3,1,4,4)
t

tensor([[[[1, 1, 1, 1],
          [1, 1, 1, 1],
          [1, 1, 1, 1],
          [1, 1, 1, 1]]],


        [[[2, 2, 2, 2],
          [2, 2, 2, 2],
          [2, 2, 2, 2],
          [2, 2, 2, 2]]],


        [[[3, 3, 3, 3],
          [3, 3, 3, 3],
          [3, 3, 3, 3],
          [3, 3, 3, 3]]]])

Let's index the above tensor. We do 4 things.
1. We first access one image
2. We then access the channel of that image
3. We then access a row of the image in that channel
4. We lastly access a pixel value in the row.

In [107]:
t[0]   # first image

tensor([[[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]])

In [108]:
t[0][0]  # first (and only) channel of the first image

tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]])

In [109]:
t[0][0][0]   # first row

tensor([1, 1, 1, 1])

In [110]:
t[0][0][0][0]  # first pixel value

tensor(1)

There are different ways to flatten an image tensor. A few are shown below:

In [111]:
t.reshape(1,-1)[0]

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

In [112]:
t.reshape(-1)

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

In [113]:
t.view(t.numel())

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

In [114]:
t.flatten()   # provided by PyTorch

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

We do not want to flatten all the images into a single long vector, since that loses the batch axis. Instead we need to retain the batch size and flatten only the individual images. We want the (batch_size, # of elements in flattened grayscale / color image) format discussed earlier.

In [115]:
t.flatten(start_dim=1)  # built in flatten method specifying from which dimension to flatten the tensor

tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])

In [116]:
t.flatten(start_dim=1).shape

torch.Size([3, 16])

We can observe that we retain axis 0, i.e. the batch_size, and flatten everything beyond it (channels, height and width) into a single axis. For our example, each of the 3 images is flattened into a 16 element 1D-tensor.

In [117]:
t.reshape(t.shape[0],-1)    # same operation performed using reshape operation

tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])
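
The same idea carries over to color images and to models, where the flattening step is often expressed as a layer. The sketch below assumes a hypothetical batch of 3 color images of size 4 x 4 and uses `nn.Flatten`, which by default flattens everything after the batch axis.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 3 color images, 3 channels, 4 x 4 pixels.
batch = torch.rand(3, 3, 4, 4)

flat = batch.flatten(start_dim=1)
print(flat.shape)   # torch.Size([3, 48])

# Layer form of the same operation, convenient inside a model definition.
flatten_layer = nn.Flatten()
flat_from_layer = flatten_layer(batch)
print(torch.equal(flat, flat_from_layer))   # True
```

Each image contributes 3 x 4 x 4 = 48 values, so the batch becomes a (3, 48) tensor.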

## Broadcasting and Element-wise operations

Element-wise operations act on elements of two tensors that share the same index location.

In [118]:
t1 = torch.tensor([
    [1,2],
    [3,4]
], dtype=torch.float32)

t2 = torch.tensor([
    [9,8],
    [7,6]
], dtype=torch.float32)

In [119]:
t1

tensor([[1., 2.],
        [3., 4.]])

In [120]:
# elements in the same indices, i.e corresponding elements in both t1 and t2
print(t1[0][0])
print(t2[0][0])

tensor(1.)
tensor(9.)


Two tensors must have the same shape (or shapes that can be broadcast to a common shape, as discussed later in this section) in order to perform element-wise operations. Having the same shape means having the same length along every axis.

In [121]:
# element wise addition
t1 + t2

tensor([[10., 10.],
        [10., 10.]])

Scalar values are rank 0 tensors, so they have no axes and an empty shape. As a result, we can combine them with any tensor without the same-shape criterion being satisfied. Tensors of rank greater than 0 need compatible shapes for the element-wise operation to succeed; scalar operations are a special case, as shown below.

In [122]:
t1 + 2

tensor([[3., 4.],
        [5., 6.]])

In [123]:
t1.add(2)

tensor([[3., 4.],
        [5., 6.]])

In [124]:
t1 - 2

tensor([[-1.,  0.],
        [ 1.,  2.]])

In [126]:
t1.sub(2)

tensor([[-1.,  0.],
        [ 1.,  2.]])

In [127]:
t2 * 2

tensor([[18., 16.],
        [14., 12.]])

In [128]:
t2.mul(2)

tensor([[18., 16.],
        [14., 12.]])

In [129]:
t2 / 2

tensor([[4.5000, 4.0000],
        [3.5000, 3.0000]])

In [130]:
t2.div(2)

tensor([[4.5000, 4.0000],
        [3.5000, 3.0000]])

One might think that the rank 0 scalar is applied iteratively to every element of the other tensor to achieve the above results. Although that sounds plausible, the operation is better understood through the concept of **Broadcasting**. Broadcasting is the way in which tensors of different shapes are combined in element-wise operations. Conceptually, the scalar value is broadcast to the same shape as the other tensor and then the element-wise operation is performed.

In [131]:
np.broadcast_to(2, t1.shape)

array([[2, 2],
       [2, 2]])

In [132]:
t1 + 2

tensor([[3., 4.],
        [5., 6.]])

In [134]:
t1 + torch.tensor(np.broadcast_to(2, t1.shape), dtype=torch.float32)   # result same as above

tensor([[3., 4.],
        [5., 6.]])

Let's take another example. Here, t2 is broadcast to match the shape of the higher rank tensor t1 before the operation is performed.

In [135]:
t1 = torch.tensor([
    [1,1],
    [1,1]
], dtype=torch.float32)

t2 = torch.tensor([2,4], dtype=torch.float32)

In [136]:
t1.shape

torch.Size([2, 2])

In [137]:
t2.shape

torch.Size([2])

In [138]:
np.broadcast_to(t2.numpy(), t1.shape)

array([[2., 4.],
       [2., 4.]], dtype=float32)

In [139]:
t1 + t2

tensor([[3., 5.],
        [3., 5.]])
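
Broadcasting also applies when both operands need to be stretched, and it fails when the shapes cannot be aligned. The sketch below uses hypothetical `col` and `row` tensors; `expand` is used here only to make the implicit broadcast explicit.

```python
import torch

col = torch.tensor([[1.], [2.], [3.]])        # shape (3, 1)
row = torch.tensor([[10., 20., 30., 40.]])    # shape (1, 4)

grid = col + row                              # both are broadcast to (3, 4)
print(grid.shape)   # torch.Size([3, 4])

# Making the broadcast explicit gives the same result.
explicit = col.expand(3, 4) + row.expand(3, 4)
print(torch.equal(grid, explicit))   # True

# Shapes (2, 2) and (3,) cannot be aligned, so the addition is rejected.
try:
    torch.ones(2, 2) + torch.ones(3)
    incompatible_rejected = False
except RuntimeError:
    incompatible_rejected = True
print(incompatible_rejected)   # True
```

Two axes are compatible when their lengths are equal or one of them is 1, compared from the last axis backwards.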

## Comparison operation

This too happens element-wise and outputs a boolean (True/False) tensor of the same shape, based on the element-wise comparison.

In [153]:
t = torch.tensor([
    [0,5,7],
    [6,0,7],
    [0,8,0]
], dtype=torch.float32)

In [154]:
t.eq(0)   # elements equal to 0

tensor([[ True, False, False],
        [False,  True, False],
        [ True, False,  True]])

In [155]:
t.ge(0)  # elements greater than or equal to 0

tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])

In [156]:
t.gt(0)  # elements greater than 0

tensor([[False,  True,  True],
        [ True, False,  True],
        [False,  True, False]])

In [157]:
t.lt(0)  # elements less than 0

tensor([[False, False, False],
        [False, False, False],
        [False, False, False]])

In [158]:
t.le(7) # elements less than or equal to 7

tensor([[ True,  True,  True],
        [ True,  True,  True],
        [ True, False,  True]])

The above operation is essentially equivalent to the following:

In [160]:
t <= torch.tensor(np.broadcast_to(7, t.shape), dtype=torch.float32)

tensor([[ True,  True,  True],
        [ True,  True,  True],
        [ True, False,  True]])
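
The boolean tensors returned by comparisons are commonly used for counting and for selecting elements. This is a small sketch that rebuilds the tensor from this section so that it runs on its own.

```python
import torch

t = torch.tensor([
    [0, 5, 7],
    [6, 0, 7],
    [0, 8, 0]
], dtype=torch.float32)

mask = t.gt(0)                 # True where the element is positive

count = mask.sum().item()      # True counts as 1 when summed
print(count)                   # 5

positives = t[mask]            # boolean indexing returns a 1D tensor
print(positives)               # tensor([5., 7., 6., 7., 8.])
```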

Other element wise operations also have similar functionality. This includes built-in functions which operate on each element

In [161]:
t.abs()

tensor([[0., 5., 7.],
        [6., 0., 7.],
        [0., 8., 0.]])

In [162]:
t.sqrt()

tensor([[0.0000, 2.2361, 2.6458],
        [2.4495, 0.0000, 2.6458],
        [0.0000, 2.8284, 0.0000]])

In [163]:
t.neg()

tensor([[-0., -5., -7.],
        [-6., -0., -7.],
        [-0., -8., -0.]])

In [164]:
t.neg().abs()

tensor([[0., 5., 7.],
        [6., 0., 7.],
        [0., 8., 0.]])

## ArgMax and Reduction Operations

A reduction operation on a tensor is one that reduces the number of elements present in the tensor.

In [165]:
t = torch.tensor([
    [0,1,0],
    [2,0,2],
    [0,3,0]
], dtype=torch.float32)

First reduction operation: **Summation**

In [167]:
t.sum()   # reduced tensor to rank 0 scalar value

tensor(8.)

In [168]:
t.numel()   # original tensor has 9 elements

9

In [169]:
t.sum().numel()   # finally only 1 element left

1

In [170]:
t.sum().numel() < t.numel()

True

Some other reduction operations include:

In [171]:
t.mean()

tensor(0.8889)

In [172]:
t.prod()

tensor(0.)

In [173]:
t.std()

tensor(1.1667)

Do reduction operations always lead to a scalar value?  - **NO**
If we specify the axis along which the reduction is to be performed, the reduction yields a vector or a higher dimensional tensor.

In [174]:
t = torch.tensor([
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3]
], dtype=torch.float32)

In [175]:
t.shape

torch.Size([3, 4])

In [176]:
t.sum(dim=0)

tensor([6., 6., 6., 6.])

In [177]:
t.sum(dim=1)

tensor([ 4.,  8., 12.])

What just happened? What does element-wise addition along dim=0 mean? Simply put, it adds all the rows together element by element. Let's look at the rows individually.

In [178]:
t[0]

tensor([1., 1., 1., 1.])

In [179]:
t[1]

tensor([2., 2., 2., 2.])

In [180]:
t[2]

tensor([3., 3., 3., 3.])

In [181]:
t[0] + t[1] + t[2] 

tensor([6., 6., 6., 6.])

This result is the same as summing with dim=0. The dimension we specify is the one that gets reduced: its length collapses to 1 and the axis is dropped from the result. With dim=0 the 3 rows are added together, leaving one value per column. With dim=1 the 4 columns are added together, leaving one value per row.
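
The effect of the `dim` argument on the result shape can be made explicit, including the `keepdim` option that keeps the reduced axis with length 1. This sketch rebuilds the (3,4) tensor so that it is self-contained.

```python
import torch

t = torch.tensor([
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3]
], dtype=torch.float32)

print(t.sum(dim=0).shape)                 # torch.Size([4])     axis 0 removed
print(t.sum(dim=1).shape)                 # torch.Size([3])     axis 1 removed
print(t.sum(dim=1, keepdim=True).shape)   # torch.Size([3, 1])  axis 1 kept with length 1

# Other reductions follow the same rule.
row_means = t.mean(dim=1)
print(row_means)                          # tensor([1., 2., 3.])
```

Keeping the reduced axis is useful when the result has to be broadcast back against the original tensor.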

## ArgMax

ArgMax is a mathematical function that returns the index of the maximum element in an array. Given an array of many numbers, which number is the maximum and what is its index? We get the index as the answer.

In [182]:
t = torch.tensor([
    [1,0,0,2],
    [0,3,3,0],
    [4,0,0,5]
], dtype=torch.float32)

In [183]:
t.max()

tensor(5.)

In [184]:
t.argmax()

tensor(11)

In [185]:
t.flatten()

tensor([1., 0., 0., 2., 0., 3., 3., 0., 4., 0., 0., 5.])

As can be seen above, the maximum value in the tensor is 5 but argmax returned the index 11. This is because, if no axis is specified, argmax returns the index of the maximum value in the flattened tensor.

In [189]:
t.max(dim=0)   # returns maximum elements as well as indices

torch.return_types.max(
values=tensor([4., 3., 3., 5.]),
indices=tensor([2, 1, 1, 2]))

This may seem confusing, so let's see what just happened. The code above asks: for each column, what is the maximum value when the rows are compared element-wise? We have 3 rows:

1. row1 = 1,0,0,2
2. row2 = 0,3,3,0
3. row3 = 4,0,0,5

Comparing the first elements of all the rows, we see 0 < 1 < 4. The maximum is 4 and it belongs to row3 (index 2). Next we compare the second elements of all the rows, 0,3,0, and get 0 < 3, so row2 (index 1) holds the maximum. Continuing likewise, we get (4,3,3,5) as the maximum elements. The accompanying indices (2,1,1,2) are the row indices at which each maximum occurs.

In [188]:
t.argmax(dim=0)   # just returns indices

tensor([2, 1, 1, 2])

In [190]:
t.max(dim=1)

torch.return_types.max(
values=tensor([2., 3., 5.]),
indices=tensor([3, 2, 3]))

In [191]:
t.argmax(dim=1)

tensor([3, 2, 3])
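
The flat index returned by `argmax()` without a dimension can be converted back to a (row, column) position with integer division. This is a minimal sketch that rebuilds the tensor from this section; using `divmod` is one simple way to do the conversion for a rank 2 tensor.

```python
import torch

t = torch.tensor([
    [1, 0, 0, 2],
    [0, 3, 3, 0],
    [4, 0, 0, 5]
], dtype=torch.float32)

flat_index = t.argmax().item()              # index into the flattened tensor
num_columns = t.shape[1]

# Quotient is the row, remainder is the column.
row, col = divmod(flat_index, num_columns)
print(flat_index, row, col)   # 11 2 3
print(t[row, col].item())     # 5.0
```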