# About

In this notebook, you will take a detailed look at what a PyTorch Tensor is, the different ways to initialize one, and how to use it to perform mathematical operations, store model parameters, etc. 

Tensors are the primary data structure in PyTorch and handle almost every numeric task required in a deep learning workflow. Generally speaking, PyTorch Tensors are multi-dimensional arrays (just like NumPy's ndarray) with some additional features, such as storing gradient values (if required) and device interoperability (they can be moved between CPU and GPU). 

# 0. Setup

In [1]:
import torch
import matplotlib.pyplot as plt
%matplotlib inline

# 1. What are Tensors?

In a deep learning context, any data structure whose elements all share the **same** data type (numeric/boolean) is a tensor. Thus, a tensor can be a scalar value (rank 0 tensor), a list/vector (rank 1 tensor), a matrix (rank 2 tensor) or any higher-order array.
For example, if you have 1000 color images of size 128*128, you effectively have a rank 4 tensor described as (N, C, H, W);  
where, N is the number of samples (1000 here),   
       C is the channel size (3 since color),   
       H is the height of each image (=128) and    
       W is the width of each image (=128).  

Tensors in PyTorch are implemented by the `torch.Tensor` class, whose documentation lists its data types, attributes, arguments and methods.  
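
To make the notion of rank concrete, here is a minimal sketch that builds one tensor of each kind mentioned above and prints its number of dimensions and its shape. The variable names are illustrative, and the image batch uses N=10 instead of 1000 only to keep memory use small.

```python
import torch

scalar = torch.tensor(3.14)             # rank 0: a single value
vector = torch.tensor([1.0, 2.0, 3.0])  # rank 1: a list of values
matrix = torch.zeros(2, 3)              # rank 2: rows x columns
images = torch.zeros(10, 3, 128, 128)   # rank 4: (N, C, H, W), with N=10 here

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("images", images)]:
    # .ndim is the rank, .shape is the size along each dimension
    print(name, t.ndim, tuple(t.shape))
```

The rank is simply the length of the shape, so the scalar reports an empty shape `()`.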

# 2. Creating Tensors

There are four ways of creating tensors in PyTorch:

1. From GIVEN size description 
2. Creating Tensor from other Tensor of SAME size
3. Creating a Sequence Tensor
4. From Numpy Arrays

But before creating a tensor, there are a couple of things we need to keep in mind.   
First, which `dtype` we want our tensor to have. The default is `torch.float32` (tensor type `torch.FloatTensor`).
Second, which `device` we want our tensor to be created on. The default is 'cpu'. 

Each torch.Tensor has a torch.dtype and a torch.device attribute, and almost every tensor creation function gives us the option to specify them.

### Data Types in PyTorch

PyTorch has specified 12 data types with different variants for 'cpu' and 'gpu'. You probably would never use specific variants for creating tensors, just knowing the dtype would suffice. More information is available here: https://pytorch.org/docs/stable/tensors.html

The default tensor type in PyTorch is `torch.FloatTensor`, which is the CPU variant of the `torch.float32` dtype. (It can be changed using the `torch.set_default_tensor_type()` function; recent PyTorch releases deprecate it in favor of `torch.set_default_dtype()`.)  

Some of the common data types that we may use are:

1. torch.float32 / torch.float for 32-bit floating point
2. torch.float16 / torch.half for 16-bit floating point
3. torch.uint8 for unsigned 8-bit integer (its range is 0-255, generally used in computer vision)
4. torch.bool for boolean values

To specify the device, use `device = 'cpu'`, `'cuda'` or `'cuda:0'`; the index (0 here) selects which CUDA device to put your tensors on (in multi-GPU setups).
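
As a small sketch of how `dtype` and `device` are passed at creation time, the snippet below falls back to the CPU when no CUDA device is visible, so it should run on either kind of machine. The names `pixels` and `weights` are illustrative only.

```python
import torch

# Fall back to the CPU when no CUDA device is visible.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Explicit dtype: unsigned 8-bit integers, as often used for image pixels.
pixels = torch.zeros(2, 3, dtype=torch.uint8, device=device)

# No dtype given, so the current default floating-point dtype is used.
weights = torch.ones(2, 3, device=device)

print(pixels.dtype, pixels.device)
print(weights.dtype, torch.get_default_dtype())
```

Every tensor carries both attributes, so they can be inspected at any point with `.dtype` and `.device`.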

## 1a. Fixed-Size Tensor - Non Random

In [2]:
# Creating empty tensor
empty_tnsr = torch.empty(3,4)

# Zero-filled tensor
zero_tnsr = torch.zeros(3,4)

# One-filled tensor
one_tnsr = torch.ones(3,4)

# Constant-filled tensor
const_tnsr = torch.full((3,4), 12345)

print(f'Empty Tensor: \n {empty_tnsr} \n\nZero Filled Tensor: \n {zero_tnsr} \n\nOne Filled Tensor: \n {one_tnsr} \n\nConstant Filled Tensor: \n {const_tnsr} \n')

# All of them take dtype and device as arguments. For eg:
# torch.ones((3,4), device='cuda:0', dtype=torch.half)

Empty Tensor: 
 tensor([[-6.5921e+31,  4.5719e-41, -6.6178e+31,  4.5719e-41],
        [-3.3651e+03, -1.3531e-34, -6.5924e+31,  4.5719e-41],
        [-6.6281e+31,  4.5719e-41, -1.2111e+03,  1.9404e-26]]) 

Zero Filled Tensor: 
 tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]) 

One Filled Tensor: 
 tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]) 

Constant Filled Tensor: 
 tensor([[12345, 12345, 12345, 12345],
        [12345, 12345, 12345, 12345],
        [12345, 12345, 12345, 12345]]) 



## 1b. Fixed-Size Tensor - Random

In [28]:
# Random numbers from uniform distribution on the interval [0,1)
unif_tnsr = torch.rand(2,3)

# Random integers from uniform distribution on the interval [low,high)
low = 0; high = 10
unif_int_tnsr = torch.randint(low, high, (2,3))

# Random numbers from normal distribution with 0 mean and 1 variance
normal_tnsr = torch.randn(2,3)

print(f'Uniformly Distributed Random Tensor: \n {unif_tnsr} \n\nUniformly Distributed Random Integer-Filled Tensor: \n {unif_int_tnsr} \n\nNormally Distributed Random Tensor: \n {normal_tnsr}')

# All of them take dtype and device as arguments. For eg:
# torch.rand((3,4), device='cuda:0', dtype=torch.half)

Uniformly Distributed Random Tensor: 
 tensor([[0.0028, 0.5947, 0.1142],
        [0.7142, 0.8491, 0.5064]]) 

Uniformly Distributed Random Integer-Filled Tensor: 
 tensor([[2, 2, 3],
        [5, 6, 0]]) 

Normally Distributed Random Tensor: 
 tensor([[-0.1196, -1.8458,  1.6861],
        [-0.9417, -1.0843,  0.4500]])


### Manual Seeding of Random Tensors

In [24]:
# For the sake of reproducibility, we fix a seed value at the beginning, so that all generated random tensors in a program are same no matter when and where it is executed.
torch.manual_seed(331)
a = torch.randint(2,30,(2,5))
b = torch.randint(2,30,(2,5))
c = torch.rand(2,3)

# Let's rerun the same calls using the same seed
torch.manual_seed(331)
d = torch.randint(2,30,(2,5))
e = torch.randint(2,30,(2,5))
f = torch.rand(2,3)

# Check for yourself if a and d are same, b and e are same, c and f are same.
print(c==f)
print(b==e) 

print(a==e) # This should not be same

tensor([[True, True, True],
        [True, True, True]])
tensor([[True, True, True, True, True],
        [True, True, True, True, True]])
tensor([[False, False, False, False, False],
        [False, False, False, False, False]])


## 2. Same-Size Tensor

In [4]:
x = torch.rand(2,5)

# empty_like
empty_like_tnsr = torch.empty_like(x) 

# zeros_like
zero_like_tnsr = torch.zeros_like(x) 

# ones_like
one_like_tnsr = torch.ones_like(x) 

# rand_like
rand_like_tnsr = torch.rand_like(x) 

print(f'Empty Like Tensor: \n {empty_like_tnsr} \n\nZero Filled Like Tensor: \n {zero_like_tnsr} \n\nOne Filled Like Tensor: \n {one_like_tnsr} \n\nRandomly Filled Like Tensor: \n {rand_like_tnsr}')

# All of them take dtype and device as arguments. For eg:
# torch.zeros_like(x, device='cuda:0', dtype=torch.half)

Empty Like Tensor: 
 tensor([[5.2600e+22, 2.4286e-18, 1.8788e+31, 7.9303e+34, 6.1949e-04],
        [7.3313e+22, 7.2151e+22, 2.8404e+29, 2.3089e-12, 7.1856e+22]]) 

Zero Filled Like Tensor: 
 tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]) 

One Filled Like Tensor: 
 tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]]) 

Randomly Filled Like Tensor: 
 tensor([[0.1769, 0.4177, 0.8316, 0.4345, 0.6412],
        [0.9399, 0.9448, 0.9567, 0.0889, 0.9263]])


## 3. Sequence Tensor

In [5]:
# arange: return tensor with values between [start (default = 0), end) and step size = 'step' (default 1) 
arange_tnsr_0 = torch.arange(5)
arange_tnsr_1 = torch.arange(3,10)
arange_tnsr_2 = torch.arange(1,5,0.5)

# linspace: return tensor with 'steps' evenly spaced values on the interval [start, end] (both ends included)
lrange_tnsr = torch.linspace(4, 40, 10)

print(f'arange tensors: \n {arange_tnsr_0}\n {arange_tnsr_1}\n {arange_tnsr_2}\n\nlinspace tensor: \n {lrange_tnsr}')

# All of them take dtype and device as arguments. For eg:
# torch.arange(5, device='cuda', dtype=torch.half)

arange tensors: 
 tensor([0, 1, 2, 3, 4])
 tensor([3, 4, 5, 6, 7, 8, 9])
 tensor([1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000, 4.0000, 4.5000])

linspace tensor: 
 tensor([ 4.,  8., 12., 16., 20., 24., 28., 32., 36., 40.])


## 4. From Numpy arrays/ Python lists

In [27]:
import numpy as np
array = np.array([[1.,3.,5.]])

# torch.from_numpy: Converts numpy arrays (only) to torch tensors. No copy is performed. Thus, changes to the tensor are reflected in the numpy array and vice versa
from_np_tnsr = torch.from_numpy(array)

# Note: from_numpy doesn't take any argument other than the array. Thus, adding device/dtype will throw ERROR:
# torch.from_numpy(array, device='cpu') # ! will throw ERROR

# torch.as_tensor: Converts python lists, numpy arrays, scalars, etc. to torch tensors. It copies the data only when it has to, e.g. when the requested dtype/device differs from the source. 
as_tnsr0 = torch.as_tensor([1,2])
as_tnsr1 = torch.as_tensor(array)

# Note: torch.as_tensor takes in dtype and device as arguments. Eg:
# torch.as_tensor([1,2], device='cpu', dtype=torch.int8)

# Note: Using torch.tensor does same thing as torch.as_tensor except that it will ALWAYS copy data.

tensor([[1., 3., 5.]], dtype=torch.float64)
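
The copy/no-copy behaviour described in the comments above can be checked directly. This is a minimal sketch: it modifies the NumPy array after conversion and looks at which tensors see the change.

```python
import numpy as np
import torch

array = np.array([[1., 3., 5.]])

shared = torch.from_numpy(array)   # shares memory with `array`
aliased = torch.as_tensor(array)   # same dtype, CPU: no copy is needed
copied = torch.tensor(array)       # always copies the data

array[0, 0] = 100.                 # modify the NumPy array in place

print(shared[0, 0].item())    # 100.0
print(aliased[0, 0].item())   # 100.0
print(copied[0, 0].item())    # 1.0
```

If the tensor must be independent of the source array, `torch.tensor` is the safer choice.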

# 3. torch.Tensor Methods: Part 1

There are several functions and operations defined on PyTorch Tensors, and we will look into some of them here in Part 1. Other tensor methods, mostly related to gradient operations, will be discussed in the next module as Part 2 of torch.Tensor Methods.

## 3a. Access and change Tensor Attributes  (`dtype` and `device`)

In [97]:
tnsr = torch.rand(3,4)
# access datatype and device type
print(tnsr.dtype, tnsr.device)

# Remember python inbuilt type() function is different from .dtype attribute
print(type(tnsr))

torch.float32 cpu
<class 'torch.Tensor'>


In [62]:
# change datatype and device type

# .to: you can change both datatype and device type using this method
tnsr.to(device='cuda')
# Note it returns a NEW tensor hence needs to be (re)assigned.
new_tnsr1 = tnsr.to(device='cuda')

# You can also use the same notation to change dtype
new_tnsr2 = tnsr.to(dtype=torch.float16)
# tnsr.to(device='cpu',dtype=torch.float64)

print(f'Original\n{tnsr}\n\nModified\n{new_tnsr1}\n{new_tnsr2}')

Original
tensor([[0.3905, 0.1861, 0.8887, 0.0073],
        [0.4402, 0.8681, 0.5956, 0.6009],
        [0.3325, 0.0368, 0.7867, 0.6241]])

Modified
tensor([[0.3905, 0.1861, 0.8887, 0.0073],
        [0.4402, 0.8681, 0.5956, 0.6009],
        [0.3325, 0.0368, 0.7867, 0.6241]], device='cuda:0')
tensor([[0.3904, 0.1860, 0.8887, 0.0073],
        [0.4402, 0.8682, 0.5957, 0.6011],
        [0.3325, 0.0368, 0.7866, 0.6240]], dtype=torch.float16)


In [68]:
# Other ways of changing datatype or device # Not Recommended
new_tnsr3 = tnsr.cuda()
new_tnsr4 = tnsr.int()
new_tnsr5 = tnsr.half().cpu()

## 3b. Create a copy of tensor : `x.clone()`

In [109]:
# copies the device type, data type and gradient properties of the tensor. We will look into gradient properties in the next module.
tnsr_copy = tnsr.clone()

## 3c. Convert to Numpy array format: `x.numpy()`

In [110]:
np_array = tnsr.cpu().numpy() 
print(type(np_array))
# If the tensor has gradient enabled or is not on cpu, .numpy() will throw an ERROR. We will look into gradient properties in the next module.
# np_array = tnsr.cuda().numpy() # ! will throw ERROR

## 3d. Change Tensor Shape/Size

In [98]:
# Access tensor shape/size
tnsr = torch.rand(3,4)
tnsr.shape
# tnsr.size()

torch.Size([3, 4])

In [99]:
# reshape: Returns a tensor with the same elements arranged in a different shape (a view of the same data when possible, otherwise a copy)
# If one of the dimensions is given as -1, it is inferred automatically from the number of elements and the other dimensions
new_tnsr = tnsr.reshape(4,-1) # similar to torch.reshape(tnsr, (4,-1))
print(new_tnsr)
# tnsr.view() also does the same thing, but it never copies data and so only works when the requested shape is compatible with the tensor's memory layout. Not Recommended.

tensor([[0.4163, 0.4046, 0.3389],
        [0.8736, 0.1720, 0.5574],
        [0.6029, 0.6549, 0.6424],
        [0.2368, 0.4377, 0.3623]])
tensor([[0.4163, 0.4046, 0.3389],
        [0.8736, 0.1720, 0.5574],
        [0.6029, 0.6549, 0.6424],
        [0.2368, 0.4377, 0.3623]])
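
The difference between `reshape` and `view` hinted at above shows up on non-contiguous tensors. The following is a minimal sketch, using a transposed tensor as the non-contiguous example; the variable names are illustrative.

```python
import torch

t = torch.arange(12).reshape(3, 4)
transposed = t.T                     # same data, but no longer contiguous in memory

print(transposed.is_contiguous())    # False
flat = transposed.reshape(-1)        # works: reshape copies when it has to

view_failed = False
try:
    transposed.view(-1)              # view never copies, so this raises
except RuntimeError:
    view_failed = True
print("view failed:", view_failed)

# On a contiguous tensor, view returns an alias of the same storage.
alias = t.view(4, 3)
alias[0, 0] = 99
print(t[0, 0].item())                # 99
```

Because `view` aliases the original storage, writing through `alias` also changes `t`; `reshape` gives the same aliasing whenever it can avoid a copy.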


In [91]:
# unsqueeze() = Add a dimension of size one inserted at given position. Useful in case of stretching the single input tensor to include batch dimension (at 0th position).
tnsr = torch.as_tensor([3.13,12.45])
new_tnsr = tnsr.unsqueeze(0) # similar to torch.unsqueeze(tnsr, 0)
print(tnsr, new_tnsr)

# unsqueeze_() = Inplace version of unsqueeze()
tnsr.unsqueeze_(0)

tensor([ 3.1300, 12.4500]) tensor([[ 3.1300, 12.4500]])


tensor([[ 3.1300, 12.4500]])

In [95]:
# squeeze() = Removes all dimensions of size 1 and returns the tensor. 
new_tnsr = tnsr.squeeze() # similar to torch.squeeze(tnsr)
print(tnsr.shape, new_tnsr.shape)

# squeeze_() = Inplace version of squeeze()
tnsr.squeeze_()

torch.Size([1, 2]) torch.Size([2])


tensor([ 3.1300, 12.4500])

## 3e. Perform Mathematical Operations

There are in general two ways to perform math ops in PyTorch. One way is to invoke the math function directly on the tensor object itself, e.g. x.sin(), x.greater(y), etc. The advantage here is that MANY of these math functions also have an associated in-place variant, like x.sin_() or x.greater_(y). The other way is to use torch functions explicitly, like torch.sin(x) or torch.greater(x, y). This method is discussed in section 4.

Note: Only a few examples are shown here. A more comprehensive list of math ops in torch is given in section 4. For the complete list of math operations on the tensor object, refer to the official documentation: https://pytorch.org/docs/stable/tensors.html  

### In-place operations:

In-place operations directly change the content of a given Tensor without making a copy. 
PyTorch provides in-place operators as direct variants of the non-in-place ones, distinguished by a trailing underscore. For example, the sin() function has an in-place variant, sin_(). But note that some operations do not have an in-place variant.

The obvious advantage is lower memory consumption, since no new tensor has to be allocated. But there is a major caveat: in-place operations can interfere with gradient calculation, because they may overwrite values that autograd needs for the backward pass, in which case PyTorch raises an error. Thus, if an operation is within the scope of gradient calculation (or is part of your model architecture), in-place variants are best avoided.
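
Both points above can be observed in a short sketch: an in-place op reuses the tensor's memory, and autograd rejects an in-place edit of a tensor it is tracking. The leaf tensor `w` with `requires_grad=True` is an assumption made purely for illustration; gradients are covered in the next module.

```python
import torch

x = torch.tensor([0.0, 1.0, 2.0])
ptr_before = x.data_ptr()            # address of x's underlying memory

y = x.sin()                          # out-of-place: allocates a new tensor
x.sin_()                             # in-place: overwrites x's own memory

print(x.data_ptr() == ptr_before)    # True: no new memory was allocated for x
print(torch.allclose(x, y))          # True: both variants compute the same values

# Assumption for illustration: a leaf tensor that requires gradients.
w = torch.ones(3, requires_grad=True)
rejected = False
try:
    w.mul_(2)                        # in-place edit of a tracked leaf tensor
except RuntimeError as err:
    rejected = True
    print("in-place op rejected:", err)
```

Not every in-place op inside a gradient computation fails like this, but when one does, switching to the out-of-place variant is usually the simplest fix.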

In [None]:
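# Fresh inputs so that every op below is defined:
# a positive square matrix, and a second tensor to compare against
tnsr = torch.rand(2,2) + 0.5
y = torch.rand(2,2)

# trigonometric ops (sin, cos, tan etc.)
new_tnsr = tnsr.sin()
tnsr.sin_() # inplace variant

# logarithmic and exponential ops
new_tnsr = tnsr.log()
tnsr.log_()

# comparison ops (element-wise)
new_tnsr = tnsr.eq(y)
tnsr.eq_(y) # inplace variant: overwrites tnsr with 1.0 (equal) / 0.0 (not equal)
# Note: tnsr.equal(y) returns a single Python bool and has no inplace variant

# matrix ops
new_tnsr = torch.rand(2,2).inverse()
# no inplace variant for inverse

# reduction ops
new_tnsr = tnsr.mean()
# no inplace variant for reduction.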

# 4. Common Mathematical Operations

## 4a. Element-wise addition, subtraction, multiplication and division

In [5]:
A = torch.tensor([[1.,2.],[3.,5.],[3.,1.]])
B = torch.zeros(3,2)
C = torch.ones(3,2)
# Using overloaded operators (+,-,*,/)
print(A+B)
print(A-B)
print(A*B)
print(A/C)

tensor([[1., 2.],
        [3., 5.],
        [3., 1.]])
tensor([[1., 2.],
        [3., 5.],
        [3., 1.]])
tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
tensor([[1., 2.],
        [3., 5.],
        [3., 1.]])


In [6]:
# Using pytorch functions
print( torch.add(A,B) ) 
print( torch.sub(A,B) ) # same as torch.subtract()
print( torch.mul(A,B) ) # same as torch.multiply()
print( torch.div(A,C) ) # same as torch.divide()

tensor([[1., 2.],
        [3., 5.],
        [3., 1.]])
tensor([[1., 2.],
        [3., 5.],
        [3., 1.]])
tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])
tensor([[1., 2.],
        [3., 5.],
        [3., 1.]])


## 4b. Element-wise Transformation (trigonometric, exponential, etc.)

In [19]:
torch.abs(A) # same as torch.absolute()
torch.ceil(A) # calculates ceiling value
torch.floor(A) # calculates floor value
torch.reciprocal(A) # returns reciprocal of all elements in a tensor

# clips minimum and maximum value of a tensor to min and max respectively
torch.clamp(A, min=0.4,max=1.5) # same as torch.clip 

# trigonometric ops
torch.sin(A)
torch.cos(A)
torch.tan(A)
torch.sinh(A)
torch.cosh(A)
torch.tanh(A)

# logarithmic and exponential ops
torch.log(A)
torch.log2(A)
torch.log10(A)
torch.exp(A)
torch.exp2(A)
torch.pow(A, 3)
torch.sqrt(A);

## 4c. Element-wise comparison ops (greater, maximum, or, and etc.)

In [20]:
# performs logical operations (like python and, or , not etc.)
torch.logical_or(A,B)
torch.logical_not(A)
torch.logical_and(A,B)
torch.logical_xor(A,B)

a = torch.zeros((3,2), dtype=torch.bool)
b = torch.ones((3,2), dtype=torch.bool)
# performs bitwise operations
torch.bitwise_or(a,b)
torch.bitwise_not(a)
torch.bitwise_and(a,b)
torch.bitwise_xor(a,b)

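# comparison operators (element-wise, return a boolean tensor)
torch.eq(A,B) # note: torch.equal(A,B) instead returns a single bool for the whole tensors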
torch.not_equal(A,B)
torch.greater(A,B)
torch.greater_equal(A,B)
torch.less(A,B)
torch.less_equal(A,B)
torch.maximum(A,B)
torch.minimum(A,B);

## 4d. Matrix and Linear Algebra related ops 

In [27]:
A = torch.rand(2,2)
B = torch.rand(2,2)
a = torch.arange(1,6)
b = torch.arange(4,9)

# matrix multiplication
print(A @ B)
print(torch.matmul(A,B)) # same as above

# matrix transpose
print(A.T)
print(torch.transpose(A, dim0=0, dim1=1)) # same as above

# returns inverse of a square matrix
torch.inverse(A)

# returns determinant of a square matrix
torch.det(A)

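# eigenvalues and eigenvectors of square matrix
# (torch.eig was deprecated and is removed in recent PyTorch releases; torch.linalg.eig replaces it)
torch.linalg.eig(A)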

# dot product of two 1d tensors
torch.dot(a,b)

# trace of a matrix
torch.trace(A)

# calculates inner product (dot product in case of 1d tensors)
torch.inner(A,B)

# calculates outer product of input tensor with a vector
torch.outer(a,b);

tensor([[0.2464, 0.2189],
        [0.1544, 0.1686]])
tensor([[0.2464, 0.2189],
        [0.1544, 0.1686]])
tensor([[0.3076, 0.1327],
        [0.1311, 0.1741]])
tensor([[0.3076, 0.1327],
        [0.1311, 0.1741]])


## 4e. Reduction ops

In [35]:
# Reduction Operations
torch.max(A) # Returns the max value of all elements in A
torch.argmax(A) # Returns the index of the max value in the flattened A
torch.min(A) # Returns the min value of all elements in A
torch.argmin(A) # Returns the index of the min value in the flattened A

torch.amax(A, dim=1) # Returns the maximum value in the given dimension
torch.amin(A, dim=0) # Returns the minimum value in the given dimension

# Perform statistical operations 
torch.mean(A) 
torch.median(A)
torch.mode(A)

torch.std(A) # Calculates standard deviation
torch.var(A) # Calculates variance
torch.norm(A) # Calculates matrix norm/vector norm

torch.count_nonzero(A); # Count non zeros in the matrix

## 4f. Others

In [34]:
# sorting
torch.sort(A, dim=1) # sorts elements along a given dimension in ascending order 
torch.msort(A) # sorts elements along its first dimension in ascending order

torch.topk(A, k=2, dim=0); # Returns the k largest elements along a given dimension

# 5. Broadcasting of Tensors

The goal of broadcasting in tensors is to make them of the same shape so that element wise operation can be performed on them. For example, multiplying `[2]` with `[3,4]` will yield `[6,8]` as a result of broadcasting, even when their dimensions are not equal. 

If the tensors satisfy the rules of broadcasting, then it is automatically applied to the respective tensor. Rules for broadcasting are as follows:

1. Each tensor must have at least one dimension.   
2. Tensor dimensions must be compatible with each other, i.e., traversing from the Last to the First dimension, for EACH dim 'i', one of the following must hold:   
    a. dim 'i' of BOTH tensors is equal, OR  
    b. dim 'i' of one of the tensors is 1, OR   
    c. dim 'i' doesn't exist in one of the tensors.   
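
The rules above can be tried out on shapes alone, without creating any data. The sketch below uses `torch.broadcast_shapes`, which returns the shape that broadcasting would produce and raises an error when the shapes are incompatible; the variable names are illustrative.

```python
import torch

# Shapes are aligned from the last dimension backwards.
s1 = torch.broadcast_shapes((3, 4), (1, 4))        # 2a on the last dim, 2b on the first
s2 = torch.broadcast_shapes((3, 2), (2,))          # 2a on the last dim, 2c on the first
s3 = torch.broadcast_shapes((3, 2, 4, 1), (1, 4))  # 2b twice, then 2c twice
print(s1, s2, s3)

compatible = True
try:
    # Last dims are 4 and 3: not equal, neither is 1, and both exist.
    torch.broadcast_shapes((3, 4), (3,))
except RuntimeError:
    compatible = False
print("broadcastable:", compatible)
```

Checking shapes this way can help when debugging a shape mismatch before running the actual operation.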

In [143]:
# Example 1
A = torch.rand(3,4)
B = torch.rand(1,4)

print( A * B) # Here dim=2 is same (2a satisfied) and dim=1 is 1 in B (2b satisfied)

tensor([[0.5760, 0.3414, 0.1601, 0.9438],
        [0.3104, 0.2757, 0.2393, 0.5958],
        [0.8138, 0.3213, 0.0454, 0.8992]])


In [144]:
# Example 2
A = torch.rand(3,2)
B = torch.rand(  2)

print( A * B) # Here dim=2 is same (2a satisfied) and dim=1 doesn't exist in B (2c satisfied)

tensor([[0.0947, 0.2014],
        [0.1436, 0.0672],
        [0.1100, 0.2458]])


In [147]:
# Example 3
A = torch.rand(3,2,4,1)
B = torch.rand(    1,4)

print( A * B) # Aligned from the last dim: dim=4 is 1 in A and dim=3 is 1 in B (2b satisfied twice); dim=2 and dim=1 don't exist in B (2c satisfied). Result shape: (3,2,4,4)

tensor([[[[0.0351, 0.1868, 0.0954, 0.0715],
          [0.1261, 0.6710, 0.3429, 0.2569],
          [0.0653, 0.3474, 0.1775, 0.1330],
          [0.0456, 0.2428, 0.1240, 0.0929]],

         [[0.0118, 0.0630, 0.0322, 0.0241],
          [0.0831, 0.4421, 0.2259, 0.1692],
          [0.0386, 0.2055, 0.1050, 0.0786],
          [0.0377, 0.2007, 0.1025, 0.0768]]],


        [[[0.1192, 0.6344, 0.3241, 0.2428],
          [0.0313, 0.1667, 0.0852, 0.0638],
          [0.1501, 0.7987, 0.4081, 0.3057],
          [0.0239, 0.1271, 0.0649, 0.0486]],

         [[0.1508, 0.8026, 0.4101, 0.3072],
          [0.0341, 0.1813, 0.0927, 0.0694],
          [0.0175, 0.0934, 0.0477, 0.0357],
          [0.0660, 0.3512, 0.1795, 0.1344]]],


        [[[0.0680, 0.3620, 0.1849, 0.1386],
          [0.1637, 0.8714, 0.4452, 0.3336],
          [0.1423, 0.7572, 0.3869, 0.2899],
          [0.1594, 0.8487, 0.4336, 0.3249]],

         [[0.0545, 0.2900, 0.1482, 0.1110],
          [0.0615, 0.3273, 0.1673, 0.1253],
          [0.1669,