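torch (PyTorch) is a deep learning library for manipulating tensors (multi-dimensional arrays). \
It supports 13 core element types (dtypes): floating point (float16, bfloat16, float32, float64; bfloat16 has more exponent bits than float16, so it covers a wider range at lower precision), \
complex (32, 64 and 128 bits), integer (int8, uint8, int16, int32, int64) and bool.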

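Tensors of different types are represented by different classes in the legacy type system: torch.FloatTensor (float32), torch.LongTensor (int64), torch.ByteTensor (uint8). In current code the type is usually selected with the dtype argument instead, e.g. torch.zeros(3, 2, dtype=torch.float32).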
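
As a small illustration of how dtypes and the legacy class names relate, the sketch below creates a few tensors and inspects them. It assumes the default floating-point dtype has not been changed from float32; the variable names are only for illustration.

```python
import torch

x = torch.tensor([1.5, 2.5])   # floating-point data defaults to float32
i = torch.tensor([1, 2, 3])    # integer data defaults to int64
h = x.to(torch.bfloat16)       # explicit conversion to another dtype

# .dtype is the modern description, .type() returns the legacy class name
print(x.dtype, x.type())
print(i.dtype, i.type())
print(h.dtype)

# bfloat16 trades precision for range: its largest finite value is far
# bigger than float16's
print(torch.finfo(torch.float16).max, torch.finfo(torch.bfloat16).max)
```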

In [2]:
import torch
import numpy as np
a = torch.FloatTensor(3,2) # calling the legacy constructor; the memory is allocated but not initialized, so the zeros below are not guaranteed
a

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

In [3]:
a = torch.zeros(3,2) # guaranteed zeros; torch.FloatTensor(3,2) only allocates memory and can hold arbitrary values, even though it often prints zeros

In [4]:
#alternative approach
a = torch.FloatTensor(3,2)
a.zero_()
a

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

Two types of operations on tensors: in-place and functional. \
In-place operations have an underscore appended to their name (e.g. zero_()) and modify the tensor's content directly. The functional equivalent leaves the original untouched and returns a new tensor. \
In-place operations are more memory-efficient because no new tensor is allocated, but they can cause hidden bugs when other code still expects the old values.
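
The difference can be seen with add and add_, used here purely as an example pair. The sketch also shows the kind of hidden bug mentioned above: a second name for the same tensor sees the in-place change.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
alias = a                 # a second name for the same tensor object
original = a.clone()      # independent copy kept for comparison

b = a.add(1)              # functional: returns a new tensor, a is unchanged
unchanged = torch.equal(a, original)

a.add_(1)                 # in-place: overwrites the contents of a

print(unchanged)          # a was still intact after the functional call
print(b)                  # result of the functional call
print(alias)              # alias reflects the in-place update as well
```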

In [5]:
# tensor from python iterable like list, tuple

a = torch.FloatTensor([[1,2],[3,4],[5,6]])
a

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])

In [6]:
n = np.zeros(shape = (3,2))
n.shape , n.dtype

((3, 2), dtype('float64'))

In [7]:
b = torch.tensor(n)

In [8]:
b.shape, b.dtype

(torch.Size([3, 2]), torch.float64)

In deep learning, float64 is usually unnecessary memory overhead; float32 or float16 is typically enough.

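To convert a NumPy array to a torch tensor, either torch.from_numpy() or torch.tensor() can be used. torch.from_numpy() is not deprecated: it returns a tensor that shares memory with the array and keeps its dtype. torch.tensor() always copies the data and accepts a dtype argument, so the conversion to a torch dtype can happen in the same call.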

In [9]:
n = np.zeros(shape=(3,2))
print(n.shape, n.dtype)

(3, 2) float64


In [10]:
t = torch.tensor(n, dtype=torch.float32)
print(t.shape, t.dtype)

torch.Size([3, 2]) torch.float32
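
One practical difference between the two conversion routes is memory sharing. The sketch below converts the same array with torch.from_numpy() and torch.tensor() and then modifies the array; the names shared and copied are only illustrative.

```python
import numpy as np
import torch

n = np.zeros((3, 2))

shared = torch.from_numpy(n)                    # shares memory with n, keeps float64
copied = torch.tensor(n, dtype=torch.float32)   # independent float32 copy

n[0, 0] = 7.0                                   # modify the NumPy array afterwards

print(shared.dtype, shared[0, 0].item())        # sees the change
print(copied.dtype, copied[0, 0].item())        # still holds the old value
```

If the array must not be affected by later tensor operations (or the other way round), the copying route is the safer choice.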


In [11]:
# Scalar tensors: zero-dimensional tensors are natively supported and returned by reductions such as sum()
a = torch.tensor([1,2,3])
s = a.sum()
print(s.shape)


torch.Size([])


In [12]:
print(s.item())

6


GPU tensors: \
PyTorch supports CUDA GPUs. Operations have two versions, CPU and GPU, and which one runs depends on where the tensor is placed. In the legacy type system, GPU tensor types live in the torch.cuda module instead of the torch package, so a float32 tensor on the GPU is a torch.cuda.FloatTensor instead of a torch.FloatTensor. \
Under the hood, PyTorch does not hard-code CPU versus GPU. It works with a backend, an abstract computation device with its own memory. The backend can be CUDA, CPU, or Apple's Metal Performance Shaders (mps).

In [13]:
a = torch.Tensor([1,2,3,4])
print(a.shape, a.dtype)

torch.Size([4]) torch.float32


In [14]:
c = a.to('mps') # tensor copied to Apple's MPS 
c

tensor([1., 2., 3., 4.], device='mps:0')

device='mps:0' means that tensor c lives on the mps backend and that device index 0 is in use. The index matters on machines with several devices of the same type, e.g. cuda:0 and cuda:1 on a multi-GPU CUDA system.

In [15]:
a+1

tensor([2., 3., 4., 5.])

In [16]:
c+1

tensor([2., 3., 4., 5.], device='mps:0')

In [17]:
c.device

device(type='mps', index=0)
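
The cells above assume a machine where the mps backend is available. A portable variant could pick the backend at runtime, as in the sketch below. The helper name pick_device is hypothetical (it is not part of PyTorch), and the preference order CUDA, then MPS, then CPU is an assumption.

```python
import torch

def pick_device() -> torch.device:
    """Hypothetical helper: prefer CUDA, then MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is missing in older builds, so guard the lookup
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
a = torch.tensor([1.0, 2.0, 3.0, 4.0])
c = a.to(device)           # moves the data to the chosen device (returns a itself if it is already there)
result = (c + 1).cpu()     # compute on the device, bring the result back to the CPU

print(c.device, result)
```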

Gradient calculation methods: \
Static graph method: the computation is defined in advance and cannot be changed later. The DL library optimizes the graph before running it. This was the approach of TensorFlow (before version 2), Theano, and many other DL toolkits. \
Dynamic graph method: as you apply transformations to the data, the library records the operations. When requested, it computes the gradients and accumulates them in the network parameters. This is the approach PyTorch uses.
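
A minimal sketch of the dynamic approach in PyTorch: operations on a tensor created with requires_grad=True are recorded as they run, backward() computes the gradients, and repeated backward passes accumulate into .grad until it is reset. The tensors w and x are toy values chosen for illustration.

```python
import torch

w = torch.tensor([2.0, 3.0], requires_grad=True)  # stands in for a parameter
x = torch.tensor([1.0, 4.0])                      # input data, no gradient tracked

loss = (w * x).sum()        # the graph is recorded while this line executes
loss.backward()             # d(loss)/dw equals x
first = w.grad.clone()

loss = (w * x).sum()        # a fresh graph is recorded for the second pass
loss.backward()             # the new gradient is added to the existing one
second = w.grad.clone()

w.grad.zero_()              # reset the accumulated gradient in place

print(first, second, w.grad)
```

Because gradients accumulate, training loops normally reset them before each backward pass.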

From version 2.0, PyTorch provides torch.compile(), a function that speeds up PyTorch code by JIT (just-in-time) compiling it into optimized kernels.