## Basics of Tensors

A tensor is a generalization of vectors and matrices and is most easily understood as a multidimensional array. The term, and the techniques built around it, are central to machine learning: the training and operation of deep learning models can be described in terms of tensors. In many cases, tensors are used in place of NumPy arrays to take advantage of GPUs.

Tensors are a data structure used in linear algebra, and, as with vectors and matrices, you can perform arithmetic operations on them.
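To make the "multidimensional array" idea concrete, here is a minimal sketch that builds tensors with 0, 1, 2 and 3 dimensions and prints their shapes. The variable names (`scalar`, `vector`, `matrix`, `cube`) are illustrative only.

```python
import torch

scalar = torch.tensor(3.0)                       # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])           # 1 dimension
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # 2 dimensions
cube = torch.zeros(2, 3, 4)                      # 3 dimensions

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.ndim, tuple(t.shape))

# Arithmetic is applied element by element, as with NumPy arrays
doubled = vector * 2
print(doubled)
```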

## You remember NumPy, right?

In [1]:
# Creating an array using numpy library
import numpy as np
arr = np.array([1,2,3,4,5,6,7])

In [2]:
# data type of array
print(arr.dtype)

# shape of array
print(arr.shape)
print(arr.size)

int32
(7,)
7


## Here comes... tensors!

In [3]:
# loading torch library 
import torch 

# checking the version of torch library
torch.__version__

'1.8.1'

In [4]:
# to set the device to cuda if available otherwise set it to cpu
if torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"
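The same device selection can also be written with a `torch.device` object, which is then passed to `.to(...)` or to any `device=` argument. This is only an alternative sketch; the tensor `t` is a placeholder.

```python
import torch

# Pick the GPU when one is visible, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 2).to(device)   # move (or keep) the tensor on that device
print(t.device)
```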

In [5]:
# converting the numpy array arr to tensor
tensor = torch.from_numpy(arr)
tensor

tensor([1, 2, 3, 4, 5, 6, 7], dtype=torch.int32)

In [6]:
# converting a tensor to array
array_form = tensor.numpy()
print(array_form)
print(array_form.dtype)

[1 2 3 4 5 6 7]
int32


In [7]:
# converting the numpy array arr to a tensor with dtype float32 on the selected device (cuda if available)
tensor = torch.tensor(arr, dtype=torch.float32, device=device)
tensor

tensor([1., 2., 3., 4., 5., 6., 7.], device='cuda:0')

In [8]:
# checking the shape or size of tensor
print(tensor.shape)
print(tensor.size())

torch.Size([7])
torch.Size([7])


In [9]:
# accessing tensor using indexing like arrays
print(tensor[4])
print(tensor[:4])
print(tensor[4:])

tensor(5., device='cuda:0')
tensor([1., 2., 3., 4.], device='cuda:0')
tensor([5., 6., 7.], device='cuda:0')


In [10]:
# changing the value of tensor[6] that is 7th element
tensor[6] = 1000
print(tensor)

tensor([   1.,    2.,    3.,    4.,    5.,    6., 1000.], device='cuda:0')


In [11]:
# torch.tensor copied the data in In [7], so arr does not share memory with tensor
# and should be unaffected by the assignment above
if np.array_equal(arr, tensor.cpu().numpy()):
    print("Yes! arr has been affected too!")
else:
    print("Nope! arr and tensor are different now!")

Nope! arr and tensor are different now!


In [12]:
# torch.tensor always makes a separate copy of the array
tensor = torch.tensor(arr)
print(tensor)
tensor[0] = 101

# let's check again whether arr and tensor are still the same
if np.array_equal(arr, tensor.numpy()):
    print("Yes! arr has been affected too!")
else:
    print("Nope! arr and tensor are different now!")

tensor([1, 2, 3, 4, 5, 6, 7], dtype=torch.int32)
Nope! arr and tensor are different now!
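The cells above touch on when a tensor and a NumPy array share memory. As a minimal sketch, assuming CPU tensors: `torch.from_numpy` reuses the array's buffer, while `torch.tensor` copies it. The names `source`, `shared` and `copied` are illustrative.

```python
import numpy as np
import torch

source = np.array([1, 2, 3, 4, 5, 6, 7])

shared = torch.from_numpy(source)   # shares the NumPy buffer (CPU only)
copied = torch.tensor(source)       # allocates its own storage

shared[6] = 1000                    # writes through to `source`
copied[0] = 101                     # leaves `source` untouched

print(source)
print(np.array_equal(source, shared.numpy()))   # True: same memory
print(np.array_equal(source, copied.numpy()))   # False: independent copy
```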


## Wanna play with some built-in methods?

In [13]:
# creating a tensor using empty method (it will give uninitialized values)
tensor = torch.empty(size=(4,4), device=device, dtype=torch.float32)
tensor

tensor([[   1.,    2.,    3.,    4.],
        [   5.,    6., 1000.,    0.],
        [   0.,    0.,    0.,    0.],
        [   0.,    0.,    0.,    0.]], device='cuda:0')

In [14]:
# creating a tensor using zeros method
tensor = torch.zeros(size=(4,3),device=device, dtype=torch.float32)
tensor

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], device='cuda:0')

In [15]:
# creating a tensor using ones method
tensor = torch.ones(size=(3,2),device=device, dtype=torch.float32)
tensor

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]], device='cuda:0')

In [16]:
# creating a tensor using eye method
tensor = torch.eye(n=5,device=device, dtype=torch.float32)
tensor

tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]], device='cuda:0')

In [17]:
# extracting the diagonal of a 5x5 tensor of ones
tensor = torch.diag(torch.ones(size=(5,5),device=device, dtype=torch.float32))
tensor

tensor([1., 1., 1., 1., 1.], device='cuda:0')

In [18]:
# extracting the diagonal of a 5x5 random tensor
tensor = torch.rand(size=(5,5),device=device, dtype=torch.float32)
print(tensor)
tensor = torch.diag(tensor)
tensor

tensor([[0.3969, 0.2061, 0.5271, 0.8853, 0.8499],
        [0.7224, 0.2511, 0.3693, 0.8641, 0.1821],
        [0.7108, 0.6561, 0.5387, 0.9183, 0.0274],
        [0.7633, 0.4002, 0.4925, 0.0281, 0.5771],
        [0.4156, 0.8315, 0.5032, 0.8055, 0.0225]], device='cuda:0')


tensor([0.3969, 0.2511, 0.5387, 0.0281, 0.0225], device='cuda:0')
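`torch.diag` behaves differently depending on the input's dimensionality, which the cells above only show in one direction. A short sketch with a small integer matrix (values chosen only for readability):

```python
import torch

matrix = torch.arange(1, 10).reshape(3, 3)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

main_diag = torch.diag(matrix)               # 2D input -> 1D diagonal
rebuilt = torch.diag(main_diag)              # 1D input -> 2D matrix, zeros off the diagonal
upper = torch.diag(matrix, diagonal=1)       # first diagonal above the main one

print(main_diag)
print(rebuilt)
print(upper)
```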

In [19]:
# creating a tensor using rand method
tensor = torch.rand(size=(3,2),device=device, dtype=torch.float32)
tensor

tensor([[0.0146, 0.9221],
        [0.4816, 0.0253],
        [0.8029, 0.8999]], device='cuda:0')

In [20]:
# creating a tensor of 6x2 of random values
tensor = torch.rand(size=(6,2), device=device, dtype=torch.float32)
tensor

tensor([[0.7238, 0.8447],
        [0.8070, 0.7114],
        [0.7940, 0.7067],
        [0.3377, 0.6768],
        [0.7892, 0.9637],
        [0.4420, 0.2635]], device='cuda:0')

#### Do you know the difference between the **arange** method and **linspace** method?

In [21]:
# creating a sequence from 10 up to (but not including) 60 in steps of 5
tensor = torch.arange(start=10, end=60, step=5)
tensor

tensor([10, 15, 20, 25, 30, 35, 40, 45, 50, 55])

In [22]:
# creating a sequence of 5 equally spaced values from 10 to 60, both ends included
tensor = torch.linspace(start=10, end=60, steps=5)
tensor

tensor([10.0000, 22.5000, 35.0000, 47.5000, 60.0000])
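To answer the question in the heading: `arange` takes a step size and excludes the end value, while `linspace` takes a number of points and includes it. A minimal side-by-side sketch on a smaller range:

```python
import torch

by_step = torch.arange(start=0, end=10, step=2)       # step size given, end excluded
by_count = torch.linspace(start=0, end=10, steps=6)   # point count given, end included

print(by_step)    # values 0, 2, 4, 6, 8 (int64)
print(by_count)   # values 0, 2, 4, 6, 8, 10 (float32)
```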

In [23]:
# creating a 3x4 tensor from the sequence 10 to 120 in steps of 10
tensor = torch.tensor(np.arange(10, 121, 10).reshape(3,4), device=device, dtype=torch.float32)
tensor

tensor([[ 10.,  20.,  30.,  40.],
        [ 50.,  60.,  70.,  80.],
        [ 90., 100., 110., 120.]], device='cuda:0')

#### How do you get a tensor of normally distributed or uniformly distributed values?

In [24]:
# creating an uninitialized 3x4 tensor and filling it in place with normally distributed values
tensor = torch.empty(size=(3,4)).normal_(mean=0, std=1)
tensor

tensor([[ 0.4347,  0.4417,  1.3457,  0.3893],
        [ 0.5008,  0.7331,  0.4060, -0.0068],
        [ 1.4895, -1.7004,  0.5748,  2.1318]])

In [25]:
# creating an uninitialized 3x4 tensor and filling it in place with values uniformly distributed between 0 and 2
tensor = torch.empty(size=(3,4)).uniform_(0, 2)
tensor

tensor([[1.3345, 1.4067, 0.8894, 1.8556],
        [0.2864, 1.9816, 1.6658, 1.4109],
        [1.6114, 1.5105, 0.7166, 0.8750]])
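The in-place fills above have direct counterparts in `torch.randn` and `torch.rand`. The sketch below shows both forms; the seed value is arbitrary and only makes the run repeatable.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, assumed here only for repeatability

normal_a = torch.empty(3, 4).normal_(mean=0, std=1)   # fill in place
normal_b = torch.randn(3, 4)                          # standard normal directly

uniform_a = torch.empty(3, 4).uniform_(0, 2)          # fill in place on [0, 2)
uniform_b = torch.rand(3, 4) * 2                      # rand samples [0, 1), so scale by 2

print(normal_a.shape, normal_b.shape)
print(uniform_a.min().item(), uniform_a.max().item())
```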

## Can we convert our tensors to different types? Yeah!

In [26]:
tensor = torch.arange(start=0, end=15, step=3)
tensor.dtype

torch.int64

In [27]:
# converting the above tensor to int16
tensor.short()

tensor([ 0,  3,  6,  9, 12], dtype=torch.int16)

In [28]:
# converting the above tensor back to int64
tensor.long()

tensor([ 0,  3,  6,  9, 12])

In [29]:
# converting the above tensor to boolean
tensor.bool()

tensor([False,  True,  True,  True,  True])

In [30]:
# converting the above tensor to float16
tensor.half()

tensor([ 0.,  3.,  6.,  9., 12.], dtype=torch.float16)

In [31]:
# converting the above tensor to float32
tensor.float()

tensor([ 0.,  3.,  6.,  9., 12.])

In [32]:
# converting the above tensor to float64
tensor.double()

tensor([ 0.,  3.,  6.,  9., 12.], dtype=torch.float64)
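Each of the conversion methods above returns a new tensor and leaves the original unchanged. The general-purpose form is `.to(dtype)`, sketched here on the same sequence (variable names are illustrative):

```python
import torch

ints = torch.arange(start=0, end=15, step=3)   # int64 by default

as_float = ints.to(torch.float32)              # same result as ints.float()
as_short = ints.to(dtype=torch.int16)          # same result as ints.short()

print(ints.dtype)       # the original keeps its dtype
print(as_float.dtype)
print(as_short.dtype)
```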

In [33]:
# just get the 1st and 2nd columns of the tensor 
# tensor[:,0:2]

## How about some mathematical operations?

In [34]:
tensor = torch.tensor(np.arange(1,100,7).reshape(3,5), device=device, dtype=torch.float32)
tensor

tensor([[ 1.,  8., 15., 22., 29.],
        [36., 43., 50., 57., 64.],
        [71., 78., 85., 92., 99.]], device='cuda:0')

In [35]:
# add a scalar value to tensor made above using 2 different methods

# method 1
print(tensor + 10)

# method 2
print(torch.add(tensor,10))


tensor([[ 11.,  18.,  25.,  32.,  39.],
        [ 46.,  53.,  60.,  67.,  74.],
        [ 81.,  88.,  95., 102., 109.]], device='cuda:0')
tensor([[ 11.,  18.,  25.,  32.,  39.],
        [ 46.,  53.,  60.,  67.,  74.],
        [ 81.,  88.,  95., 102., 109.]], device='cuda:0')


In [36]:
# add a tensor to the tensor made above using 4 different methods 

# method 1
# the shapes must match (or be broadcastable) and both tensors must be on the same device, although the dtypes can differ
print(tensor.shape)
print(tensor.device)
print(tensor.dtype)
c = tensor + torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device)
print(c)

# method 2
c = torch.add(tensor,torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device))
print(c)

# method 3
# the same operation can write into the out argument of add, but the output tensor must be created first
d = torch.empty(size=(3,5), device=device, dtype=torch.float64)
torch.add(tensor,torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device), out=d)
print(d)

# method 4
# the same operation can be done in place, which avoids allocating a new tensor; tensor keeps its float32 dtype
tensor.add_(torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device))


torch.Size([3, 5])
cuda:0
torch.float32
tensor([[ 11.,  23.,  35.,  47.,  59.],
        [ 71.,  83.,  95., 107., 119.],
        [131., 143., 155., 167., 179.]], device='cuda:0', dtype=torch.float64)
tensor([[ 11.,  23.,  35.,  47.,  59.],
        [ 71.,  83.,  95., 107., 119.],
        [131., 143., 155., 167., 179.]], device='cuda:0', dtype=torch.float64)
tensor([[ 11.,  23.,  35.,  47.,  59.],
        [ 71.,  83.,  95., 107., 119.],
        [131., 143., 155., 167., 179.]], device='cuda:0', dtype=torch.float64)


tensor([[ 11.,  23.,  35.,  47.,  59.],
        [ 71.,  83.,  95., 107., 119.],
        [131., 143., 155., 167., 179.]], device='cuda:0')
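The outputs above differ in dtype: the out-of-place results are float64, while the in-place result stays float32. A small sketch of that behaviour, using made-up 2x2 tensors `a` and `b`:

```python
import torch

a = torch.ones(2, 2, dtype=torch.float32)
b = torch.full((2, 2), 2.0, dtype=torch.float64)

out_of_place = a + b            # result is promoted to float64
ptr_before = a.data_ptr()

a.add_(b)                       # in place: `a` keeps float32 and its storage
ptr_after = a.data_ptr()

print(out_of_place.dtype)
print(a.dtype, ptr_before == ptr_after)
```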

In [37]:
# get a total of all the values in tensor c and d
print(c.sum())
print(d.sum())

tensor(1425., device='cuda:0', dtype=torch.float64)
tensor(1425., device='cuda:0', dtype=torch.float64)


In [38]:
# subtract a tensor from the tensor made above using 4 different methods

# method 1
# the shapes must match (or be broadcastable) and both tensors must be on the same device, although the dtypes can differ
print(tensor.shape)
print(tensor.device)
print(tensor.dtype)
c = tensor - torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device)
print(c)

# method 2
c = torch.subtract(tensor,torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device))
print(c)

# method 3
# the same operation can write into the out argument of subtract, but the output tensor must be created first
d = torch.empty(size=(3,5), device=device, dtype=torch.float64)
torch.subtract(tensor,torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device), out=d)
print(d)

# method 4
# the same operation can be done in place, which avoids allocating a new tensor; tensor keeps its float32 dtype
tensor.subtract_(torch.tensor(np.arange(10,81,5).reshape(3,5), dtype=torch.float64, device=device))


torch.Size([3, 5])
cuda:0
torch.float32
tensor([[ 1.,  8., 15., 22., 29.],
        [36., 43., 50., 57., 64.],
        [71., 78., 85., 92., 99.]], device='cuda:0', dtype=torch.float64)
tensor([[ 1.,  8., 15., 22., 29.],
        [36., 43., 50., 57., 64.],
        [71., 78., 85., 92., 99.]], device='cuda:0', dtype=torch.float64)
tensor([[ 1.,  8., 15., 22., 29.],
        [36., 43., 50., 57., 64.],
        [71., 78., 85., 92., 99.]], device='cuda:0', dtype=torch.float64)


tensor([[ 1.,  8., 15., 22., 29.],
        [36., 43., 50., 57., 64.],
        [71., 78., 85., 92., 99.]], device='cuda:0')

#### We have two ways to exponentiate...

In [39]:
tensor = torch.tensor(np.arange(2,20,2).reshape(3,3), dtype=torch.int64, device=device)
tensor

tensor([[ 2,  4,  6],
        [ 8, 10, 12],
        [14, 16, 18]], device='cuda:0')

In [40]:
# raising values to a power with the pow_ method
print(tensor)
print(tensor.pow_(2)) # the trailing underscore makes it in place

# raising values to a power with the double-asterisk (**) operator
print(tensor)
print(tensor ** 3)

tensor([[ 2,  4,  6],
        [ 8, 10, 12],
        [14, 16, 18]], device='cuda:0')
tensor([[  4,  16,  36],
        [ 64, 100, 144],
        [196, 256, 324]], device='cuda:0')
tensor([[  4,  16,  36],
        [ 64, 100, 144],
        [196, 256, 324]], device='cuda:0')
tensor([[      64,     4096,    46656],
        [  262144,  1000000,  2985984],
        [ 7529536, 16777216, 34012224]], device='cuda:0')
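Because `pow_` modifies the tensor in place, the cube printed above is the cube of the already squared values. The sketch below separates the out-of-place forms from the in-place one on a small hypothetical tensor `base`:

```python
import torch

base = torch.tensor([2, 4, 6])

squared = torch.pow(base, 2)   # new tensor, `base` unchanged
cubed = base ** 3              # operator form, also out of place

base.pow_(2)                   # in place: `base` itself now holds the squares

print(squared)
print(cubed)
print(base)
```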


## Are you afraid of multiplication and dot products of tensors? Don't be.

In [41]:
# create two 1D tensors x and y
x = torch.tensor(np.arange(1,5,1), dtype=torch.float64)
y = torch.tensor(np.arange(5,9,1), dtype=torch.float64)
print(x)
print(y)

tensor([1., 2., 3., 4.], dtype=torch.float64)
tensor([5., 6., 7., 8.], dtype=torch.float64)


In [42]:
# using mul method to multiply x and y
z = torch.ones(4,dtype=torch.float64)
torch.mul(x, y, out=z)

tensor([ 5., 12., 21., 32.], dtype=torch.float64)

In [43]:
# using dot method to get the dot product of tensors x and y
# (1*5) + (2*6)+ (3*7) + (4*8)
answer = torch.tensor(0, dtype=torch.float64)
torch.dot(x,y, out=answer)

tensor(70., dtype=torch.float64)

In [44]:
# create two 2D tensors x and y
x = torch.tensor(np.repeat([1,2,3],3).reshape(3,3), dtype=torch.float64)
y = torch.tensor(np.arange(1,10,1), dtype=torch.float64)
print(x)
print(y)

# Reshape tensor y to 3x3
y = y.view(3,3)
print(y)

tensor([[1., 1., 1.],
        [2., 2., 2.],
        [3., 3., 3.]], dtype=torch.float64)
tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=torch.float64)
tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]], dtype=torch.float64)


In [45]:
# using mul method to multiply x and y
z1 = torch.ones(3,3, dtype=torch.float64)
torch.mul(x, y, out = z1)
print(z1)

# using matmul method to perform matrix multiplication on tensors x and y
z2 = torch.ones(3,3, dtype=torch.float64)
torch.matmul(x, y, out = z2)
print(z2)

# using x@y to perform matmul operation
if torch.all(torch.eq(x@y, z2)):
    print("Yes! matmul function works the same way as x@y.")
else:
    print("No! matmul function does not work the same way as x@y.")

tensor([[ 1.,  2.,  3.],
        [ 8., 10., 12.],
        [21., 24., 27.]], dtype=torch.float64)
tensor([[12., 15., 18.],
        [24., 30., 36.],
        [36., 45., 54.]], dtype=torch.float64)
Yes! matmul function works the same way as x@y.


### Difference between mul and matmul methods
The **mul** method performs element-wise multiplication: each value of one tensor is multiplied by the value at the same position in the other. In contrast, **matmul** (or **mm** for 2D tensors) performs true matrix multiplication, where each output entry is the dot product of a row of the first matrix and a column of the second.
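A small sketch of the difference, using two hypothetical 2x2 matrices `a` and `b`: `torch.mul` multiplies matching positions, while `torch.matmul` takes row-by-column dot products.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

elementwise = torch.mul(a, b)         # same as a * b
matrix_product = torch.matmul(a, b)   # same as a @ b, or torch.mm(a, b) for 2D inputs

print(elementwise)      # [[ 5., 12.], [21., 32.]]
print(matrix_product)   # [[19., 22.], [43., 50.]]
```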

## Ever heard of Broadcasting?

In [46]:
tensor_1 = torch.rand(size=(3,6), dtype=torch.float32, device=device)
tensor_2 = torch.rand(size=(1,6), dtype=torch.float32, device=device)

print(tensor_1)
print(tensor_2)

tensor([[0.6782, 0.7768, 0.0088, 0.6625, 0.6776, 0.7875],
        [0.1564, 0.0812, 0.3240, 0.8309, 0.1952, 0.4457],
        [0.6701, 0.5125, 0.3807, 0.7906, 0.8422, 0.9730]], device='cuda:0')
tensor([[0.3471, 0.8846, 0.7467, 0.2393, 0.3272, 0.9386]], device='cuda:0')


#### Can you add or subtract two tensors of different shapes?

Yes. tensor_2 has a single row, so it is treated as if that row were repeated three times to match the shape of tensor_1 (without actually copying the data), and the addition or subtraction is then performed element-wise. Let's see...

In [47]:
tensor_1.add_(tensor_2)

tensor([[1.0252, 1.6614, 0.7554, 0.9017, 1.0048, 1.7261],
        [0.5035, 0.9658, 1.0706, 1.0702, 0.5224, 1.3843],
        [1.0171, 1.3971, 1.1273, 1.0299, 1.1695, 1.9117]], device='cuda:0')

In [48]:
tensor_1.subtract_(tensor_2)

tensor([[0.6782, 0.7768, 0.0088, 0.6625, 0.6776, 0.7875],
        [0.1564, 0.0812, 0.3240, 0.8309, 0.1952, 0.4457],
        [0.6701, 0.5125, 0.3807, 0.7906, 0.8422, 0.9730]], device='cuda:0')
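To check what broadcasting does, the sketch below compares the implicit form with an explicit `expand`, which produces a view of the required shape without copying data. The tensors `rows`, `single` and `column` are small stand-ins chosen for readability.

```python
import torch

rows = torch.arange(12, dtype=torch.float32).reshape(3, 4)
single = torch.tensor([[10., 20., 30., 40.]])      # shape (1, 4)

broadcast_sum = rows + single                      # (3, 4) + (1, 4) -> (3, 4)
explicit_sum = rows + single.expand(3, 4)          # same result with an explicit view

column = torch.tensor([[1.], [2.], [3.]])          # shape (3, 1) broadcasts across columns
column_sum = rows + column

print(torch.equal(broadcast_sum, explicit_sum))
print(column_sum)
```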

## Wait, there are useful mathematical methods still left...

In [49]:
tensor = torch.tensor([[2,-1,5,0],[-1,3,3,-2]], dtype=torch.float32, device=device)
tensor

tensor([[ 2., -1.,  5.,  0.],
        [-1.,  3.,  3., -2.]], device='cuda:0')

In [50]:
# what is the maximum value of the tensor overall?
value = torch.max(tensor)
print(value)

tensor(5., device='cuda:0')


In [51]:
# what is the maximum value in each column, and at which row index does it occur?
value, index = torch.max(tensor, dim=0) # reducing over dimension 0 gives one result per column
print(value)
print(index)

tensor([2., 3., 5., 0.], device='cuda:0')
tensor([0, 1, 0, 0], device='cuda:0')


In [52]:
# what is the maximum value in each row, and at which column index does it occur?
value, index = torch.max(tensor, dim=1) # reducing over dimension 1 gives one result per row
print(value)
print(index)

tensor([5., 3.], device='cuda:0')
tensor([2, 1], device='cuda:0')


In [53]:
# what is the minimum value of the tensor overall?
value = torch.min(tensor)
print(value)

tensor(-2., device='cuda:0')


In [54]:
# what is the minimum value in each column, and at which row index does it occur?
value, index = torch.min(tensor, dim=0) # reducing over dimension 0 gives one result per column
print(value)
print(index)

tensor([-1., -1.,  3., -2.], device='cuda:0')
tensor([1, 0, 1, 1], device='cuda:0')


In [55]:
# what is the minimum value in each row, and at which column index does it occur?
value, index = torch.min(tensor, dim=1) # reducing over dimension 1 gives one result per row
print(value)
print(index)

tensor([-1., -2.], device='cuda:0')
tensor([1, 3], device='cuda:0')
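When a `dim` is given, `torch.max` and `torch.min` return a pair in the order (values, indices), which can also be read by field name. A minimal sketch on the same data, kept on the CPU for simplicity:

```python
import torch

t = torch.tensor([[2., -1., 5., 0.], [-1., 3., 3., -2.]])

result = torch.max(t, dim=1)
values, indices = result          # unpacking order is (values, indices)

print(result.values)              # largest entry of each row
print(result.indices)             # column where it occurs
```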


In [56]:
# convert tensor to absolute values
value = torch.abs(tensor)
print(value)

tensor([[2., 1., 5., 0.],
        [1., 3., 3., 2.]], device='cuda:0')


In [57]:
# index of the maximum value, counted over the flattened tensor
value = torch.argmax(tensor)
print(value)

tensor(2, device='cuda:0')


In [58]:
# index of the minimum value, counted over the flattened tensor
value = torch.argmin(tensor)
print(value)

tensor(7, device='cuda:0')
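Without a `dim` argument, `argmax` and `argmin` index into the flattened tensor. One way to recover a (row, column) position, sketched here with plain `divmod` on the same data:

```python
import torch

t = torch.tensor([[2., -1., 5., 0.], [-1., 3., 3., -2.]])

flat_index = torch.argmin(t).item()           # position in the flattened tensor
row, col = divmod(flat_index, t.shape[1])     # convert to (row, column)

per_row = torch.argmax(t, dim=1)              # column of the maximum in each row

print(flat_index, (row, col), t[row, col].item())
print(per_row)
```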


In [59]:
# what is the mean value of the tensor overall?
mean_tensor  = torch.mean(tensor)
mean_tensor

tensor(1.1250, device='cuda:0')

In [62]:
# sorting the tensor row wise
print(tensor)
print(torch.sort(tensor, descending=False, dim=1))

tensor([[ 2., -1.,  5.,  0.],
        [-1.,  3.,  3., -2.]], device='cuda:0')
torch.return_types.sort(
values=tensor([[-1.,  0.,  2.,  5.],
        [-2., -1.,  3.,  3.]], device='cuda:0'),
indices=tensor([[1, 3, 0, 2],
        [3, 0, 2, 1]], device='cuda:0'))


In [63]:
# sorting the tensor column wise
print(tensor)
print(torch.sort(tensor, descending=False, dim=0))

tensor([[ 2., -1.,  5.,  0.],
        [-1.,  3.,  3., -2.]], device='cuda:0')
torch.return_types.sort(
values=tensor([[-1., -1.,  3., -2.],
        [ 2.,  3.,  5.,  0.]], device='cuda:0'),
indices=tensor([[1, 0, 1, 1],
        [0, 1, 0, 0]], device='cuda:0'))
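The `dim` argument decides which axis is sorted: `dim=1` orders the entries within each row, `dim=0` within each column. The sketch below shows both on the same data (on the CPU) and uses `torch.gather` to confirm that the returned indices reproduce the sorted values.

```python
import torch

t = torch.tensor([[2., -1., 5., 0.], [-1., 3., 3., -2.]])

by_row = torch.sort(t, dim=1)     # each row sorted left to right
by_col = torch.sort(t, dim=0)     # each column sorted top to bottom

# The indices say where each sorted value came from along the sorted axis
regathered = torch.gather(t, 1, by_row.indices)

print(by_row.values)
print(by_col.values)
print(torch.equal(regathered, by_row.values))
```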
