In [392]:
import torch
import numpy as np

## Create tensors

A tensor is a mathematical object that represents a multi-dimensional array of numerical values. 
In machine learning and deep learning, tensors are used to represent data inputs and outputs, as well as the weights and biases of neural networks. Tensors enable efficient computation of complex mathematical operations on large datasets, making them an essential tool in modern data science.

In [393]:
# scalar
scalar = torch.tensor(7)
scalar.ndim, scalar.item()

(0, 7)

In [394]:
# vector
vector = torch.tensor([7,7])
vector.ndim, vector.shape

(1, torch.Size([2]))

In [395]:
# MATRIX
MATRIX = torch.tensor([[7,8],[9,10]])
MATRIX.ndim, MATRIX.shape, MATRIX[0]

(2, torch.Size([2, 2]), tensor([7, 8]))

In [396]:
# tensor
TENSOR = torch.tensor([[[1,2,3],[3,6,9],[2,5,4]]])
TENSOR.ndim, TENSOR.shape, TENSOR[0], TENSOR[0][0], TENSOR[0][0][0]

(3,
 torch.Size([1, 3, 3]),
 tensor([[1, 2, 3],
         [3, 6, 9],
         [2, 5, 4]]),
 tensor([1, 2, 3]),
 tensor(1))

In [397]:
# Random tensors 
random_tensor = torch.rand(3,4)
random_tensor, random_tensor.ndim

(tensor([[0.8694, 0.5677, 0.7411, 0.4294],
         [0.8854, 0.5739, 0.2666, 0.6274],
         [0.2696, 0.4414, 0.2969, 0.8317]]),
 2)

In [398]:
# random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size = (224,224,3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

In [399]:
# tensors with zeros
zeros = torch.zeros(size=(3,4))
zeros, zeros*random_tensor, zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)

In [400]:
# tensors with ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [401]:
# create a tensor from a range of values
torch_arange = torch.arange(start=1, end=1000, step=85)
torch_arange

tensor([  1,  86, 171, 256, 341, 426, 511, 596, 681, 766, 851, 936])

In [402]:
# create a tensor of zeros with the same shape and dtype as another tensor
tensor_like = torch.zeros_like(input=torch_arange)
tensor_like

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor datatypes

In [403]:
float_32_tensor = torch.tensor([3.0,8.9,6.7], dtype = None, device=None, requires_grad=False)
float_32_tensor, float_32_tensor.dtype

(tensor([3.0000, 8.9000, 6.7000]), torch.float32)

Note:
- dtype: specifies the data type of the tensor or variable. PyTorch supports various data types, such as float, integer, and boolean, each represented by a different dtype.
- device: specifies the device on which the tensor or variable resides. PyTorch supports different devices, such as CPU or GPU, and the device argument allows you to specify which device to use. If device is not specified, PyTorch will use the default device, which is typically the CPU.
- requires_grad: specifies whether PyTorch should track operations on the tensor so that its gradient can be computed during backpropagation. If requires_grad is set to True, calling backward() on a result computed from the tensor accumulates the gradient in the tensor's .grad attribute. If requires_grad is set to False (the default), operations on the tensor are not tracked and its .grad attribute stays None.
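
As a minimal sketch of what requires_grad changes in practice, the toy example below computes a gradient for one tensor and not for the other. The names w, x and loss are illustrative only and do not appear elsewhere in this notebook.

```python
import torch

# Hypothetical toy "weights": gradients are tracked for w.
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
# Plain data: requires_grad defaults to False.
x = torch.tensor([1.0, 2.0, 3.0])

loss = (w * x).sum()  # loss = w0*x0 + w1*x1 + w2*x2
loss.backward()       # fills w.grad with d(loss)/dw, which equals x here

print(w.grad)  # gradient of loss with respect to w
print(x.grad)  # None, because x is not tracked
```

Since the loss is linear in w, the gradient is simply x, which makes the result easy to check by hand.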

In [404]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor, float_16_tensor.dtype

(tensor([3.0000, 8.8984, 6.6992], dtype=torch.float16), torch.float16)
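
The float16 output above shows 8.9 stored as 8.8984. The sketch below, which assumes nothing beyond torch.finfo and a dtype conversion, compares the rounding error of the two types for that same value.

```python
import torch

value = 8.9
as_f32 = torch.tensor(value, dtype=torch.float32)
as_f16 = as_f32.to(torch.float16)

# Machine epsilon: the gap between 1.0 and the next representable number.
eps_f32 = torch.finfo(torch.float32).eps
eps_f16 = torch.finfo(torch.float16).eps

# Absolute rounding error introduced by each dtype.
err_f32 = abs(as_f32.item() - value)
err_f16 = abs(as_f16.item() - value)

print(eps_f32, eps_f16)
print(err_f32, err_f16)
```

The half-precision type saves memory, but its larger epsilon means each stored value can be further from the number you asked for.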

## Tensor attributes

**3 big errors you will run into with PyTorch and deep learning:** 
- tensors not of the right datatype
- tensors not of the right shape
- tensors not on the right device
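
A minimal sketch of the first two errors, using torch.matmul as the example operation; the tensors a and b are made up for illustration. The device check at the end only compares attributes, so it runs without a GPU.

```python
import torch

a = torch.ones(2, 3)                     # float32, shape (2, 3)
b = torch.ones(3, 2, dtype=torch.int64)  # int64, shape (3, 2)

# 1. Wrong dtype: matmul expects both operands to share a dtype.
try:
    torch.matmul(a, b)
    dtype_error = None
except RuntimeError as err:
    dtype_error = err

# 2. Wrong shape: (2, 3) @ (2, 3) has mismatched inner dimensions.
try:
    torch.matmul(a, a)
    shape_error = None
except RuntimeError as err:
    shape_error = err

# Fix: cast b to a's dtype; the shapes (2, 3) @ (3, 2) already line up.
result = torch.matmul(a, b.to(a.dtype))

# 3. Wrong device: check before combining tensors.
same_device = a.device == b.device

print(dtype_error)
print(shape_error)
print(result.shape, same_device)
```

Printing dtype, shape and device for the tensors involved is usually the quickest way to find which of the three is wrong.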

In [405]:
some_tensor = torch.rand(3,4)
some_tensor.dtype, some_tensor.shape, some_tensor.device

(torch.float32, torch.Size([3, 4]), device(type='cpu'))

## Tensor Operations

Tensor operations include:
- Addition and subtraction: This involves adding or subtracting two tensors element-wise. The tensors must have the same shape or shapes that can be broadcast together (for example, a tensor and a scalar).
- Multiplication: This involves multiplying two tensors, which can be done element-wise or using matrix multiplication.
- Division: This involves element-wise division of two input tensors.
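
The operations listed above also work between two tensors, not only between a tensor and a scalar. The sketch below uses two small illustrative tensors; b has shape (3,) and is broadcast across the rows of a.

```python
import torch

a = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])      # shape (2, 3)
b = torch.tensor([10, 20, 30])     # shape (3,), broadcast over each row

summed = a + b      # element-wise addition
product = a * b     # element-wise multiplication
quotient = a / b    # true division returns a floating-point tensor

print(summed)
print(product)
print(quotient, quotient.dtype)
```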

In [406]:
tensor = torch.tensor([1,2,3])

In [407]:
tensor + 10

tensor([11, 12, 13])

In [408]:
# PyTorch built-in function
torch.add(tensor, 10)

tensor([11, 12, 13])

In [409]:
tensor - 10

tensor([-9, -8, -7])

In [410]:
# PyTorch built-in function
torch.sub(tensor, 10)

tensor([-9, -8, -7])

In [411]:
tensor * 10

tensor([10, 20, 30])

In [412]:
# PyTorch built-in function
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [413]:
tensor/10

tensor([0.1000, 0.2000, 0.3000])

In [414]:
# PyTorch built-in function
torch.div(tensor, 10)

tensor([0.1000, 0.2000, 0.3000])

In [415]:
# Matrix multiplication

# element wise
tw = tensor*tensor

# dot product
tm = torch.matmul(tensor, tensor)

tw, tm

(tensor([1, 4, 9]), tensor(14))

In [416]:
%%time
value = 0
for i in range (len(tensor)):
    value += tensor[i]*tensor[i]
print(value)

tensor(14)
CPU times: user 1.08 ms, sys: 76 µs, total: 1.15 ms
Wall time: 587 µs


In [417]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 0 ns, sys: 828 µs, total: 828 µs
Wall time: 431 µs


tensor(14)

In [418]:
# inner dimension must match
# torch.matmul(torch.rand(3,2), torch.rand(3,2)) #RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)
torch.matmul(torch.rand(3,2), torch.rand(2,4)).shape # shape = outer dimension

torch.Size([3, 4])

In [419]:
# shapes for matrix multiplication
tensor_A = torch.tensor([[1,2],[3,4],[5,6]])
tensor_B = torch.tensor([[7,8],[9,10],[11,12]])
torch.matmul(tensor_A,tensor_B.T) #manipulate the shape of tensor_B by transpose

tensor([[ 23,  29,  35],
        [ 53,  67,  81],
        [ 83, 105, 127]])
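
To see where the numbers in the result come from, the sketch below recomputes one entry by hand and shows the `@` operator, which is shorthand for torch.matmul. It reuses the same tensor_A and tensor_B values as the cell above.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]])     # shape (3, 2)
tensor_B = torch.tensor([[7, 8], [9, 10], [11, 12]])  # shape (3, 2)

# (3, 2) @ (2, 3): inner dimensions match, result is (3, 3).
product = tensor_A @ tensor_B.T

# Entry [0, 0] is the dot product of row 0 of A with row 0 of B:
# 1*7 + 2*8 = 23
entry_00 = (tensor_A[0] * tensor_B[0]).sum()

# Transposing the other operand instead gives (2, 3) @ (3, 2) -> (2, 2).
other = tensor_A.T @ tensor_B

print(product.shape, entry_00, other.shape)
```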

## Tensor aggregation

In [420]:
x = torch.arange(1,100,10)

In [421]:
# find the min
torch.min(x), x.min()

(tensor(1), tensor(1))

In [422]:
# find the max
torch.max(x), x.max()

(tensor(91), tensor(91))

In [423]:
# finding the mean requires a floating-point (or complex) tensor, so cast the int64 tensor first
torch.mean(x.type(torch.float32)) , x.type(torch.float32).mean()

(tensor(46.), tensor(46.))
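
A short sketch of what happens without the cast: calling torch.mean on the integer tensor raises a RuntimeError, and converting to a floating-point dtype fixes it. The variable names mean_error and mean_value are illustrative.

```python
import torch

x = torch.arange(1, 100, 10)  # dtype is int64

# mean() is not defined for integer dtypes.
try:
    torch.mean(x)
    mean_error = None
except RuntimeError as err:
    mean_error = err

# Cast to float32 first; .float() is shorthand for .type(torch.float32).
mean_value = x.float().mean()

print(mean_error)
print(mean_value)
```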

In [424]:
# find the sum
torch.sum(x), x.sum()

(tensor(460), tensor(460))

In [425]:
# finding the positional min and max
x.argmin(), x.argmax()

(tensor(0), tensor(9))

## Reshaping, stacking, squeezing and unsqueezing, permuting tensors

Reshaping, stacking, squeezing, and unsqueezing are common operations performed on tensors in deep learning and machine learning.

Reshaping: Reshaping a tensor means changing its shape while keeping the number of elements the same. Flattening (or raveling) is the special case of reshaping to a single dimension. To reshape a tensor, you specify the new shape using the reshape() method. For example, a tensor a of shape (2,3,4) holds 2*3*4 = 24 elements, so you can reshape it to a tensor of shape (3,8) using a.reshape(3,8).

Stacking: Stacking two or more tensors along a new dimension is called stacking. This operation is useful for combining tensors of the same shape or stacking batches of data. To stack tensors, you can use the stack() method. For example, if you have two tensors a and b of shape (2,3) and you want to stack them along a new dimension, you can use torch.stack([a,b], dim=0).

Squeezing: Squeezing a tensor means removing any dimensions or axes that have a length of 1. This operation is useful for removing unnecessary dimensions from a tensor. To squeeze a tensor, you can use the squeeze() method. For example, if you have a tensor a of shape (1,2,1,3,1), you can squeeze it to a tensor of shape (2,3) using a.squeeze().

Unsqueezing: Unsqueezing is the opposite of squeezing. It means adding a new dimension or axis with a length of 1 to a tensor. This operation is useful for broadcasting tensors of different shapes. To unsqueeze a tensor, you can use the unsqueeze() method. For example, if you have a tensor a of shape (2,3) and you want to unsqueeze it along a new dimension, you can use a.unsqueeze(dim=0) to get a tensor of shape (1,2,3).

Permuting: The permute() method takes a tuple of integers that represents the new order of dimensions. Each integer in the tuple corresponds to the index of the current dimension that should be moved to the new position. For example, if you have a tensor a of shape (2,3,4) and you want to swap the first and second dimensions, you can use a.permute(1,0,2).
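
The shapes quoted in the paragraphs above can be checked directly. The sketch below is a minimal illustration using zero-filled tensors; only the shapes matter, and the variable names are made up for this example.

```python
import torch

a = torch.zeros(2, 3, 4)

reshaped = a.reshape(3, 8)    # 2*3*4 = 24 = 3*8 elements
inferred = a.reshape(-1, 4)   # -1 lets PyTorch infer the size: (6, 4)

m = torch.zeros(2, 3)
n = torch.ones(2, 3)
stacked = torch.stack([m, n], dim=0)   # new leading dimension: (2, 2, 3)

s = torch.zeros(1, 2, 1, 3, 1)
squeezed = s.squeeze()                 # all size-1 dimensions removed: (2, 3)

unsqueezed = m.unsqueeze(dim=0)        # size-1 dimension added at front: (1, 2, 3)

permuted = a.permute(1, 0, 2)          # first two dimensions swapped: (3, 2, 4)

for name, t in [("reshaped", reshaped), ("inferred", inferred),
                ("stacked", stacked), ("squeezed", squeezed),
                ("unsqueezed", unsqueezed), ("permuted", permuted)]:
    print(name, tuple(t.shape))
```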

In [426]:
x = torch.arange(1.,11.)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]), torch.Size([10]))

In [427]:
# reshape: add an extra dimension
x_reshaped = x.reshape(2,5) #working because 2*5 = 10
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]),
 torch.Size([2, 5]))

In [428]:
# change the view (changing z changes x)
z = x.view(1,10)
z[:,0] = 5
x, z, z.shape

(tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]),
 tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))
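
Because a view shares memory with its source, it helps to know how to get an independent copy and when view() is not available at all. The sketch below assumes a small made-up tensor: clone() breaks the sharing, and a transposed (non-contiguous) tensor rejects view() while reshape() still works by copying.

```python
import torch

base = torch.arange(4.0)               # tensor([0., 1., 2., 3.])
shared = base.view(2, 2)               # same memory as base
independent = base.clone().view(2, 2)  # separate memory

shared[0, 0] = 99.0                    # also changes base[0]

# A transposed tensor is not contiguous in memory.
transposed = torch.arange(6).reshape(2, 3).T  # shape (3, 2)
try:
    transposed.view(6)
    view_error = None
except RuntimeError as err:
    view_error = err

flat = transposed.reshape(6)           # reshape falls back to a copy

print(base, independent)
print(transposed.is_contiguous(), view_error is not None, flat)
```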

In [429]:
# stack tensors along a new dimension (dim=0 stacks them as rows, dim=1 as columns)
dim=1
x_stacked = torch.stack([x,x,x,x], dim=dim)
x_stacked, x_stacked.shape

(tensor([[ 5.,  5.,  5.,  5.],
         [ 2.,  2.,  2.,  2.],
         [ 3.,  3.,  3.,  3.],
         [ 4.,  4.,  4.,  4.],
         [ 5.,  5.,  5.,  5.],
         [ 6.,  6.,  6.,  6.],
         [ 7.,  7.,  7.,  7.],
         [ 8.,  8.,  8.,  8.],
         [ 9.,  9.,  9.,  9.],
         [10., 10., 10., 10.]]),
 torch.Size([10, 4]))

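Note on "dim" values for torch.stack:
- dim is the position of the new dimension in the output, so for inputs with n dimensions it can be any integer from 0 to n (negative values count from the end).
- For 1-dimensional inputs (i.e., vectors), dim=0 stacks them as rows and dim=1 stacks them as columns, as in the cell above.
- For functions that work along an existing dimension, such as torch.cat or torch.sum, dim can be any integer from 0 to n-1, where n is the number of dimensions in the tensor.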
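
As a small sketch of how dim affects the result, the example below stacks the same illustrative 1-dimensional tensor along dim=0 and dim=1, and contrasts that with torch.cat, which joins along an existing dimension instead of creating a new one.

```python
import torch

v = torch.tensor([1, 2, 3])            # shape (3,)

rows = torch.stack([v, v], dim=0)      # new dimension first: shape (2, 3)
cols = torch.stack([v, v], dim=1)      # new dimension last: shape (3, 2)
joined = torch.cat([v, v], dim=0)      # no new dimension: shape (6,)

print(rows)
print(cols)
print(joined)
```

Note that dim=1 is valid for torch.stack here even though v has only one dimension, because the output has two.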


In [430]:
# torch squeeze - remove dimensions of size 1

# Create a tensor with a singleton dimension
a = torch.rand(3, 1, 4)

# Squeeze the tensor to remove the singleton dimension
b = torch.squeeze(a)

# Print the shapes of the tensors
print(a.shape)  # Output: torch.Size([3, 1, 4])
print(b.shape)  # Output: torch.Size([3, 4])
print(a)
print(b)

torch.Size([3, 1, 4])
torch.Size([3, 4])
tensor([[[0.5194, 0.5337, 0.7050, 0.3362]],

        [[0.7891, 0.1694, 0.1800, 0.7177]],

        [[0.6988, 0.5510, 0.2485, 0.8518]]])
tensor([[0.5194, 0.5337, 0.7050, 0.3362],
        [0.7891, 0.1694, 0.1800, 0.7177],
        [0.6988, 0.5510, 0.2485, 0.8518]])


In [431]:
# torch unsqueeze - add a new dimension or axis with a length of 1
c = torch.unsqueeze(b, dim=1)
c, c.shape

(tensor([[[0.5194, 0.5337, 0.7050, 0.3362]],
 
         [[0.7891, 0.1694, 0.1800, 0.7177]],
 
         [[0.6988, 0.5510, 0.2485, 0.8518]]]),
 torch.Size([3, 1, 4]))

In [432]:
# torch permute dimension
x = torch.randn(2, 3, 5)
x.size(), torch.permute(x, (2, 0, 1)).size()

(torch.Size([2, 3, 5]), torch.Size([5, 2, 3]))

## Tensor indexing

In [433]:
# basic indexing - access individual elements of a tensor
x = torch.tensor([1, 2, 3])
print(x[0])   # output: tensor(1)

tensor(1)


In [434]:
# slicing: slice a tensor to extract a subset of its elements
x = torch.tensor([1, 2, 3, 4, 5])
print(x[1:4])   # output: tensor([2, 3, 4])

tensor([2, 3, 4])


In [435]:
# multi-dimensional indexing: combine integer indices and slices to select elements along each dimension
x = torch.tensor([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]])
(x[0]), x[0,1], x[0,2,0], x[:, :, 1], x[:,1,2], x[:,:,2]

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 tensor([4, 5, 6]),
 tensor(7),
 tensor([[2, 5, 8]]),
 tensor([6]),
 tensor([[3, 6, 9]]))
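
Beyond integers and slices, a tensor can also be indexed with a boolean mask or with tensors of indices. The sketch below uses a made-up 3x3 tensor to show both.

```python
import torch

x = torch.tensor([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Boolean mask: keeps the elements where the condition is True (result is 1-D).
mask = x > 5
above_five = x[mask]

# Index tensor: picks whole rows 0 and 2.
rows = x[torch.tensor([0, 2])]

# Paired index tensors: picks the elements at (0, 1) and (2, 1).
picked = x[torch.tensor([0, 2]), torch.tensor([1, 1])]

print(above_five)
print(rows)
print(picked)
```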

## PyTorch tensors and NumPy

PyTorch tensors and NumPy arrays are both used for numerical computing and data analysis, but they have some differences in terms of their features and functionalities.

PyTorch is a popular deep learning framework that uses tensors as the primary data structure for numerical computations. Tensors in PyTorch are similar to multi-dimensional arrays in NumPy, but they are designed to work efficiently with deep learning models and GPU acceleration. PyTorch tensors have many similarities with NumPy arrays in terms of indexing, slicing, and broadcasting, making it easy to integrate PyTorch with other Python libraries such as NumPy, SciPy, and Pandas.

One of the key differences between PyTorch tensors and NumPy arrays is that PyTorch tensors can be used with GPU acceleration, which can significantly speed up computations for deep learning models. In contrast, NumPy arrays are typically limited to CPU computations, although there are some libraries such as CuPy that can provide GPU acceleration for NumPy.

PyTorch tensors can also be converted to and from NumPy arrays using the torch.from_numpy() and torch.Tensor.numpy() functions. For CPU tensors, both conversions share the underlying memory rather than copying it, so an in-place change to one object is visible in the other unless you make a copy (for example by changing the dtype). This makes it easy to integrate PyTorch with existing NumPy-based workflows and codebases.

Finally, PyTorch tensors have some additional features compared to NumPy arrays that make them well-suited for deep learning tasks. For example, PyTorch provides a built-in autograd system for automatic differentiation, which allows for efficient computation of gradients and backpropagation in deep learning models. PyTorch also provides a rich set of functions and modules for building and training deep learning models, such as convolutional and recurrent neural networks, as well as support for distributed computing and deployment on different platforms.


In [436]:
# NumPy array to tensor
array = np.arange(1.0,8.0) # default dtype is float64
tensor = torch.from_numpy(array).type(torch.float32)
array, tensor, tensor.dtype

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.]),
 torch.float32)

In [437]:
# array + 1 creates a new array (and .type() above already made a copy), so the tensor is unchanged
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [438]:
# tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy().astype(np.float64) #if not specified, the type will be tensor's type float32
tensor, numpy_tensor, numpy_tensor.dtype

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.]),
 dtype('float64'))

In [439]:
# tensor + 1 creates a new tensor (and .astype() above already made a copy), so the array is unchanged
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]), array([1., 1., 1., 1., 1., 1., 1.]))
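
In the cells above the tensor and the array stay independent because the dtype conversion copies the data and `array + 1` creates a new object. Without a copy, torch.from_numpy() and Tensor.numpy() share memory on the CPU. The sketch below uses illustrative names and in-place updates to make that visible.

```python
import numpy as np
import torch

array = np.ones(3)                 # float64 by default
shared = torch.from_numpy(array)   # shares memory with array
copied = torch.tensor(array)       # torch.tensor() always copies

array += 1                         # in-place update of the NumPy array

t = torch.zeros(3)
as_numpy = t.numpy()               # shares memory with t
t.add_(5)                          # in-place update of the tensor

print(shared, copied)
print(as_numpy)
```

If independence is what you want, torch.tensor(array) or tensor.clone() makes it explicit.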

## PyTorch reproducibility

Reproducibility is an important aspect of deep learning research, as it allows other researchers to verify and build upon previously published results. 
Setting random seeds for the random number generators used in the code ensures that the same random numbers are generated every time the code is run. 

In [440]:
# create two random tensors
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)
random_tensor_A, random_tensor_B, random_tensor_A == random_tensor_B

(tensor([[0.7900, 0.2483, 0.8160, 0.1168],
         [0.9158, 0.9107, 0.8897, 0.5155],
         [0.5067, 0.7787, 0.9570, 0.2658]]),
 tensor([[0.6693, 0.7904, 0.1429, 0.2596],
         [0.3507, 0.6683, 0.4601, 0.9545],
         [0.1936, 0.8199, 0.3752, 0.6261]]),
 tensor([[False, False, False, False],
         [False, False, False, False],
         [False, False, False, False]]))

In [441]:
# create random but reproducible tensors

RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

random_tensor_C, random_tensor_D, random_tensor_C == random_tensor_D

(tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[True, True, True, True],
         [True, True, True, True],
         [True, True, True, True]]))
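
torch.manual_seed() resets the global random number generator. An alternative, sketched below, is to pass an explicit torch.Generator so that the seeding is local to the calls that use it. The generator names are illustrative.

```python
import torch

RANDOM_SEED = 42

# Two separate generators seeded identically produce identical streams.
gen_a = torch.Generator().manual_seed(RANDOM_SEED)
gen_b = torch.Generator().manual_seed(RANDOM_SEED)

first_a = torch.rand(3, 4, generator=gen_a)
first_b = torch.rand(3, 4, generator=gen_b)

# Drawing again from the same generator continues its stream,
# so the second draw differs from the first.
second_a = torch.rand(3, 4, generator=gen_a)

print(torch.equal(first_a, first_b))
print(torch.equal(first_a, second_a))
```

This also shows why the cell above calls torch.manual_seed() before each torch.rand(): the seed fixes the whole sequence of draws, not each individual draw.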

## Running on GPUs

A GPU (Graphics Processing Unit) is a specialized processor designed for handling complex graphical computations, such as those required for 3D graphics, image and video processing, and scientific simulations. GPUs are highly parallelized, meaning they can handle multiple calculations at the same time, making them particularly useful for tasks that involve large amounts of data.

In general, GPUs are faster than CPUs for workloads that can be split into many independent, identical calculations over large amounts of data, such as the matrix multiplications in neural networks. CPUs are better suited for tasks that are mostly sequential, require a high degree of flexibility, or involve many different types of calculations.

When it comes to choosing between a GPU and a CPU, it ultimately depends on the specific task or application. For tasks that involve heavy graphical computations, such as gaming or video editing, a GPU is likely to be the better choice. For more general computing tasks, such as browsing the web or word processing, a CPU is likely to be sufficient.

Where can you get access to a GPU?
1. Use Google Colab
2. Use your own GPU (see this post https://timdettmers.com/2018/12/16/deep-learning-hardware-guide/)
3. Use cloud computing - GCP, AWS, Azure

In [442]:
# check the GPU access
torch.cuda.is_available()

False

In [443]:
# setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [444]:
# count the number of devices
torch.cuda.device_count()

0

In [445]:
# putting tensors (and models) on the GPU

# create a tensor (default on CPU)
tensor = torch.tensor([1,2,2])

# move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu, tensor_on_gpu.device

(tensor([1, 2, 2]), device(type='cpu'))
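
The same device string can be used to create tensors directly on the target device and to move a model there. The sketch below assumes a tiny nn.Linear layer as a stand-in for a real model; it runs on the CPU when no GPU is available.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the target device instead of moving it later.
inputs = torch.rand(4, 3, device=device)

# Hypothetical one-layer model; .to(device) moves its parameters.
model = nn.Linear(in_features=3, out_features=1).to(device)

# Inputs and parameters live on the same device, so the forward pass works.
outputs = model(inputs)

print(outputs.shape, outputs.device)
```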

In [446]:
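# moving tensors back to CPU

# a tensor on the GPU can't be converted to NumPy directly:
# tensor_on_gpu.numpy() raises a TypeError when the tensor lives on a CUDA device

# copy it to the CPU first (a no-op if it is already there), then convert
tensor_back_on_cpu = tensor_on_gpu.cpu()
tensor_back_on_cpu.numpy()
tensor_back_on_cpu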

tensor([1, 2, 2])
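
A common way to wrap this round trip is a small helper. The function to_numpy below is hypothetical (it is not part of PyTorch) and is only a sketch: detach() drops gradient tracking and cpu() copies to host memory when needed, so it works regardless of where the tensor lives.

```python
import numpy as np
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"


def to_numpy(t: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: convert a tensor on any device to a NumPy array."""
    # detach() removes the tensor from the autograd graph;
    # cpu() is a no-op if the tensor is already on the CPU.
    return t.detach().cpu().numpy()


tracked = torch.tensor([1.0, 2.0], requires_grad=True, device=device)
result = to_numpy(tracked)

print(result, type(result))
```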