Tensors are the fundamental building block of ML & DL.

In [1]:
import torch
torch.__version__

'2.5.0'

A tensor's job is to represent data numerically.

### Creating Tensors

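As mentioned in the PyTorch docs, torch.tensor() is the recommended way of creating a tensor from existing data, and its dtype argument sets the data type. (torch.empty() only allocates uninitialised memory of a given size.)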

**torch.Tensor class**

In [2]:
# A scalar is a zero-dimensional tensor i.e., a single number.
scaler = torch.tensor(3)
scaler

tensor(3)

In [3]:
# Dimensions check
scaler.ndim

0

In [4]:
# To access the number within the 0-dim tensor
scaler.item()

3

In [5]:
# A vector is a one-dimensional tensor holding several numbers.
vector = torch.Tensor([3, 3])
vector

tensor([3., 3.])
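
The vector above prints as floats because `torch.Tensor(...)` and `torch.tensor(...)` handle dtypes differently. A minimal sketch of the difference, assuming the default dtype has not been changed (the variable names are only illustrative):

```python
import torch

# torch.tensor() infers the dtype from the data: Python ints become int64.
inferred = torch.tensor([3, 3])

# torch.Tensor() constructs the default tensor type, so the same data is
# stored as float32 (assumption: the default dtype is still float32).
constructed = torch.Tensor([3, 3])

print(inferred.dtype, constructed.dtype)
```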

In [6]:
# Dimensions (a quick way to find them is to count the opening square brackets at the start of the tensor)
vector.ndim

1

In [7]:
# The shape attribute tells how the elements inside tensors are arranged.
vector.shape

torch.Size([2])

In [8]:
# Matrix has an extra dim and is as flexible as vectors.
MATRIX = torch.tensor([[1, 2],
                       [3, 4]])
MATRIX

tensor([[1, 2],
        [3, 4]])

In [9]:
MATRIX.ndim

2

In [10]:
MATRIX.shape

torch.Size([2, 2])

In [11]:
# Tensor (n-dimensional array of numbers)
TENSOR = torch.Tensor([[[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]])
TENSOR

tensor([[[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]])

In [12]:
TENSOR.ndim

3

Dimensions are listed from outer to inner i.e., TENSOR holds 1 matrix of 3 x 3, which gives the shape [1, 3, 3].

In [13]:
TENSOR.shape

torch.Size([1, 3, 3])

By convention, scalars & vectors are named in lowercase letters while matrices & tensors are named in uppercase letters.

Even though the names matrix & tensor are often used interchangeably, the shape and dimensions of what's inside dictate what it actually is.

A 0-dim tensor is a scalar and a 1-dim tensor is a vector.

### Random Tensors

In [14]:
# Create a random tensor with the shape given by the size parameter
random_tensor = torch.rand(size=(3, 4))
random_tensor, random_tensor.dtype

(tensor([[0.7973, 0.0797, 0.3333, 0.2791],
         [0.8921, 0.4333, 0.1706, 0.1832],
         [0.9034, 0.8427, 0.8575, 0.4944]]),
 torch.float32)

In [15]:
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and Ones (Used a lot for masking)

In [16]:
# A tensor of all zeros
zeros = torch.zeros(size = (3, 4))
zeros, zeros.dtype

# A tensor of all ones.
ones = torch.ones(size = (3, 4))
ones, ones.dtype

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.float32)

### Creating a range and tensors like another tensor

In [17]:
# Range of values from 0 up to (but not including) 10
zero_to_ten = torch.arange(start=0, end=10, step=1) # torch.range(start, end) is deprecated.
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [18]:
# torch.zeros_like(input) and torch.ones_like(input)
ten_zeros = torch.zeros_like(input = zero_to_ten)
ten_zeros

ten_ones = torch.ones_like(input = zero_to_ten)
ten_ones

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

### Tensor datatypes

Tensor types under torch.cuda (e.g. torch.cuda.FloatTensor) mean the tensor lives on a GPU. The default data type for floating-point tensors is torch.float32.

The variety of tensor datatypes exists because of **precision in computing.**

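Precision is the amount of detail used to describe a number. Lower precision (e.g. float16) computes faster and uses less memory, while higher precision (e.g. float32) describes numbers more accurately, so there is a trade-off between faster computation & performance on evaluation metrics.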
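
To make the precision trade-off concrete, here is a small sketch that stores the same values in float32 and float16 and compares memory use and rounding error. The sample values are arbitrary and chosen only for illustration.

```python
import torch

values = [0.1, 1000.1]  # arbitrary sample values
as_float32 = torch.tensor(values, dtype=torch.float32)
as_float16 = torch.tensor(values, dtype=torch.float16)

# Bytes used per element: float16 needs half the memory of float32.
print(as_float32.element_size(), as_float16.element_size())

# Machine epsilon: the gap between 1.0 and the next representable number.
print(torch.finfo(torch.float32).eps, torch.finfo(torch.float16).eps)

# Rounding error introduced by storing the same values in float16.
error = (as_float32 - as_float16.to(torch.float32)).abs()
print(error)
```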

The dtype parameter is used to create a tensor of a specific datatype.

In [19]:
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype = None, device = None, requires_grad = False)
# If True, requires_grad makes autograd record the operations performed on the tensor.

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

Aside from shape mismatches, datatype and device mismatches are the other most common issues in PyTorch.
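
A datatype issue can look like the following sketch, where an int64 matrix is multiplied with a float32 one. On the PyTorch versions I am aware of this raises a RuntimeError, and casting one operand resolves it. The matrices are made up for illustration.

```python
import torch

int_matrix = torch.tensor([[1, 2], [3, 4]])            # int64
float_matrix = torch.tensor([[0.5, 0.5], [0.5, 0.5]])  # float32

# matmul does not promote dtypes, so mixed operands are rejected.
try:
    torch.matmul(int_matrix, float_matrix)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"dtype mismatch: {err}")

# Casting one operand so that both share a dtype resolves the error.
result = torch.matmul(int_matrix.to(float_matrix.dtype), float_matrix)
print(result)
```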

In [20]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0], dtype = torch.float16)
float_16_tensor.dtype

torch.float16

In [21]:
# GETTING INFORMATION FROM TENSORS

some_tensor = torch.rand(3, 4)

# Details about the tensor are as follows: (3 attributes are shape, dtype and device)
print(some_tensor)
print(some_tensor.shape)
print(some_tensor.dtype)
print(some_tensor.device)

tensor([[0.4628, 0.7803, 0.8970, 0.2979],
        [0.3584, 0.9926, 0.5619, 0.2524],
        [0.5067, 0.8404, 0.0892, 0.9977]])
torch.Size([3, 4])
torch.float32
cpu


### Manipulating Tensors (tensor operations)

Addition, Subtraction, Multiplication (element-wise), Division and Matrix Multiplication -- Basic building blocks of NNs.

The most sophisticated of NNs can be created by stacking these building blocks in the right way.

In [22]:
tensor = torch.tensor([1, 2, 3])
tensor + 10
tensor * 10
tensor

tensor = tensor - 10
tensor
tensor = tensor + 10
tensor

# Some built-in functions for these ops: torch.mul(), torch.add()
torch.multiply(tensor, 10)
tensor

# element-wise multiplication
tensor * tensor

# Matrix Multiplication (denoted by @ in Python) -- One of the most common ops in ML & DL algos.
# Two rules to follow: 1) Inner dimensions must match. 2) Resulting matrix has the shape of the outer dimensions
tensor.shape

# Matmul differs from element-wise mul in that it also sums the element-wise products.
tensor * tensor
torch.matmul(tensor, tensor)
tensor @ tensor # Operator form of torch.matmul; gives the same result

tensor(14)

Avoid doing these operations with Python for loops wherever possible; the built-in vectorised ops are far less computationally expensive.
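
As a rough illustration, the sketch below computes the same dot product once with a Python loop and once with a single vectorised call. On a 3-element tensor the speed difference is negligible; the point is only that both give the same value.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Manual dot product: multiply element by element and accumulate in Python.
total = 0
for i in range(len(tensor)):
    total += tensor[i] * tensor[i]

# The vectorised call computes the same sum of products in a single op.
vectorised = torch.matmul(tensor, tensor)

print(total, vectorised)  # both are tensor(14)
```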

**One of the most common errors in DL is shape mismatches.** For matrix multiplication, this can be prevented by making the tensors' inner dimensions match.

One of the ways to do so is to take the **transpose**.

The transpose can be taken with torch.transpose(input, dim0, dim1) or tensor.T

In [23]:
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype = torch.float32)
print(tensor_A)
print(tensor_A.T)

# torch.mm can also be used here -- it is matmul restricted to 2-D matrices (no broadcasting)
torch.mm(tensor_A, tensor_A.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[1., 3., 5.],
        [2., 4., 6.]])


tensor([[ 5., 11., 17.],
        [11., 25., 39.],
        [17., 39., 61.]])
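
One difference worth noting between torch.mm and torch.matmul is that torch.mm accepts only 2-D matrices, while torch.matmul also broadcasts over batch dimensions. A small sketch with made-up shapes:

```python
import torch

batch = torch.rand(10, 3, 2)   # a batch of ten 3x2 matrices
weights = torch.rand(2, 4)

# torch.matmul broadcasts the 2-D operand across the batch dimension.
batched = torch.matmul(batch, weights)
print(batched.shape)  # torch.Size([10, 3, 4])

# torch.mm accepts only 2-D inputs, so the same call is rejected.
try:
    torch.mm(batch, weights)
    mm_rejected = False
except RuntimeError:
    mm_rejected = True
print(mm_rejected)
```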

NNs are full of matrix multiplications and dot products.

### Aggregation operations on Tensors

In [24]:
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [25]:
print(x.min())
print(x.max())
print(x.type(torch.float32).mean()) # x.mean() raises an error on an int64 tensor, so cast to float first.
print(x.sum())

# The same ops can be done using the torch functions
print(f"{torch.max(x)}, {torch.min(x)}, {torch.mean(x.type(torch.float32))}, {torch.sum(x)}")

# torch.mean() requires a floating-point (or complex) dtype; on other dtypes the operation fails.

# Positional min, max (index of min, max)
print(f"{x.argmax()}, {x.argmin()}")
torch.argmax(x), torch.argmin(x)

tensor(0)
tensor(90)
tensor(45.)
tensor(450)
90, 0, 45.0, 450
9, 0


(tensor(9), tensor(0))
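
The dtype requirement of mean can be checked directly. The sketch below shows the failure on an int64 tensor and two ways around it: casting first, or passing `dtype` to `mean()`.

```python
import torch

x = torch.arange(0, 100, 10)  # int64

# mean() needs a floating-point (or complex) input.
try:
    x.mean()
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(f"mean on int64 failed: {err}")

# Two equivalent fixes: cast first, or let mean() cast via its dtype argument.
mean_cast = x.to(torch.float32).mean()
mean_dtype = x.mean(dtype=torch.float32)
print(mean_cast, mean_dtype)
```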

### Change tensor datatype

The datatypes can be changed using **torch.Tensor.type(dtype=None)** where the dtype parameter is the datatype one would like to use.

In [26]:
tensor = torch.arange(10., 100., 10.)
tensor.dtype

torch.float32

In [27]:
tensor_float16 = tensor.type(torch.float16)
tensor_float16

tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)

In [28]:
tensor_int8 = tensor.type(torch.int8)
tensor_int8

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)
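
Besides `Tensor.type(dtype)`, the conversion can also be written with `Tensor.to(dtype)` or with shorthand methods such as `.half()` and `.long()`. A brief sketch:

```python
import torch

tensor = torch.arange(10., 100., 10.)

# Tensor.to(dtype) is an alternative to Tensor.type(dtype).
via_to = tensor.to(torch.float16)

# Shorthand methods exist for common dtypes.
via_half = tensor.half()   # float16
via_long = tensor.long()   # int64

print(via_to.dtype, via_half.dtype, via_long.dtype)

# Each conversion returns a new tensor; the original keeps its dtype.
print(tensor.dtype)
```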

### Reshaping, stacking, squeezing and unsqueezing

These are used to avoid shape mismatch errors and to keep tensors compatible with the manipulations done in DL models i.e., NNs.

In [29]:
x = torch.arange(1., 8.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [30]:
# Add extra dimension
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [31]:
# Change view of the original tensor into a different size tensor with the data kept intact
# This creates a new view of the same tensor.
z = x.view(1, 7)
print(f"{z}, {z.shape}")

# Changing the view changes the original tensor too i.e., changing z changes x
z[:, 0] = 5
z, x

tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7])


(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))
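
The difference between view and reshape shows up on non-contiguous tensors. In the sketch below (names are illustrative), `view()` fails on a transposed tensor because it never copies, while `reshape()` falls back to a copy, so writing to the result no longer touches the original.

```python
import torch

base = torch.arange(6.).reshape(2, 3)
transposed = base.T   # a non-contiguous view of base

# view() never copies, so it fails when the memory layout does not fit.
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() falls back to a copy in that case.
flat = transposed.reshape(6)
flat[0] = 100.

# base is untouched because flat is an independent copy here.
print(view_failed, base[0, 0])
```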

In [32]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim = 0) # dim is where the new axis goes: 0 stacks the tensors as rows, 1 as columns.
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7.],
        [5., 2., 3., 4., 5., 6., 7.],
        [5., 2., 3., 4., 5., 6., 7.],
        [5., 2., 3., 4., 5., 6., 7.]])
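
To see what the dim argument does, the sketch below stacks the same vector along dim 0 and dim 1 and, for contrast, concatenates it with torch.cat, which joins along an existing dimension instead of adding a new one.

```python
import torch

x = torch.arange(1., 8.)

rows = torch.stack([x, x, x, x], dim=0)   # new dimension goes first
cols = torch.stack([x, x, x, x], dim=1)   # new dimension goes last
joined = torch.cat([x, x, x, x], dim=0)   # no new dimension is created

print(rows.shape, cols.shape, joined.shape)
# torch.Size([4, 7]) torch.Size([7, 4]) torch.Size([28])
```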

In [33]:
# Removing all size-1 dimensions from a tensor -- squeezing the tensor (torch.squeeze()).
print(f"{x_reshaped}, {x_reshaped.shape}")

x_squeezed = x_reshaped.squeeze()
print(f"{x_squeezed}, {x_squeezed.shape}")

# To reverse the effect of squeezing -- use torch.unsqueeze()
x_unsqueezed = x_squeezed.unsqueeze(dim = 0)
print(f"{x_unsqueezed}, {x_unsqueezed.shape}")

tensor([[5., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7])
tensor([5., 2., 3., 4., 5., 6., 7.]), torch.Size([7])
tensor([[5., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7])
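
Both squeeze and unsqueeze also accept a dim argument to target one axis. A short sketch on a tensor with several size-1 dimensions:

```python
import torch

t = torch.zeros(1, 1, 1, 10)

all_squeezed = t.squeeze()        # drops every size-1 dimension
one_squeezed = t.squeeze(dim=0)   # drops only dimension 0
untouched = t.squeeze(dim=3)      # dimension 3 has size 10, so nothing changes

# unsqueeze inserts a size-1 dimension at the given position.
column = all_squeezed.unsqueeze(dim=-1)

print(all_squeezed.shape, one_squeezed.shape, untouched.shape, column.shape)
```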


In [34]:
# To rearrange the order of the axes, where the input gets turned into a view with the new dims
# Use torch.permute(input, dims)
x_original = torch.rand(size = (224, 224, 3))

# Permuting the orig tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # It is like transposing using the dims value of the axes

print(f"orig shape: {x_original.shape}, permuted/new shape: {x_permuted.shape}")
# The permuted tensor is a view, so it shares data with the original: writing to one changes the other, but permuting itself leaves the values unchanged.

orig shape: torch.Size([224, 224, 3]), permuted/new shape: torch.Size([3, 224, 224])
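
The shared memory between a tensor and its permuted view can be checked as in the sketch below; `.contiguous()` is one way to get an independent copy in the new layout.

```python
import torch

x_original = torch.rand(size=(224, 224, 3))
x_permuted = x_original.permute(2, 0, 1)

# Write through the original; the permuted view sees the new value.
x_original[0, 0, 0] = 123.0
print(x_permuted[0, 0, 0])

# Both tensors point at the same underlying memory.
same_storage = x_original.data_ptr() == x_permuted.data_ptr()

# .contiguous() materialises an independent copy in the permuted layout.
x_copy = x_permuted.contiguous()
print(same_storage, x_copy.is_contiguous())
```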


### Indexing (selecting data from tensors) -- akin to accessing values in lists or arrays.

In [35]:
x = torch.arange(1, 10).reshape(1, 3, 3)
print(f"{x}, {x.shape}")

# Indexing values goes from outer dim to inner dim -- indexing done as follows, bracket by bracket:
print(x[0])
print(x[0][0])
print(x[0][0][0])
print("\n")

# Alternatives -- using : to specify all vals in this dimension and using a comma(,)
print(f"{x[:, 0]}")
print(f"{x[:, :, 1]}")
print(f"{x[:, 1, 1]}")
print(f"{x[0, 0, :]}")

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]]), torch.Size([1, 3, 3])
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([1, 2, 3])
tensor(1)


tensor([[1, 2, 3]])
tensor([[2, 5, 8]])
tensor([5])
tensor([1, 2, 3])
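
A few more index combinations on the same tensor, as a quick sketch: an integer index drops that dimension, while `:` keeps it.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

last_element = x[0, 2, 2]   # all integer indices: tensor(9)
last_column = x[0, :, 2]    # every row, last column: tensor([3, 6, 9])
keep_outer = x[:, 2, 2]     # ':' keeps the outer dimension: tensor([9])

print(last_element, last_column, keep_outer)
```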


### PyTorch tensors & NumPy

Mainly deals with conversion from NumPy arrays to PyTorch tensors and vice-versa.

Two main methods for this are:

torch.from_numpy(ndarray) & torch.Tensor.numpy()

In [36]:
# NumPy array to PyTorch tensor
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
print(f"{array}, {tensor}")
array = array + 1
array, tensor

[1. 2. 3. 4. 5. 6. 7.], tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64)


(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [37]:
# PyTorch tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

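One thing to note: the converted tensor/array shares memory with the original, so in-place changes (e.g. array += 1) show up in both. The cells above leave the converted object unchanged only because array = array + 1 and tensor = tensor + 1 create new objects instead of modifying the originals.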
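
The cells above reassign the names (`array = array + 1`), which creates new objects. The sketch below contrasts that with in-place updates, where `torch.from_numpy()` and `Tensor.numpy()` (on CPU) share memory with their source.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

# In-place update: the tensor shares memory, so it sees the change.
array += 1
after_inplace = tensor.clone()

# Reassignment creates a new array; the tensor still points at the old one.
array = array + 1
after_rebind = tensor.clone()
print(after_inplace, after_rebind)

# The same holds in the other direction for CPU tensors.
ones = torch.ones(7)
ones_np = ones.numpy()
ones.add_(1)   # in-place op on the tensor
print(ones_np)
```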

### Reproducibility

NNs in short: 
start with random numbers -> tensor operations -> try to make better (again and again and again)

In [38]:
# Creating 2 random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"{random_tensor_A}")
print(f"{random_tensor_B}")
# Equality Check
random_tensor_A == random_tensor_B

tensor([[0.6852, 0.2281, 0.8645, 0.3337],
        [0.3616, 0.5672, 0.7066, 0.9459],
        [0.7997, 0.0789, 0.9352, 0.8612]])
tensor([[0.1847, 0.1068, 0.8875, 0.9612],
        [0.6770, 0.9360, 0.8160, 0.4056],
        [0.8611, 0.7475, 0.5178, 0.4935]])


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [39]:
# What if we want to create random tensors again with the same flavour, i.e., the same random values each time?
# This is where torch.manual_seed(seed) comes in.

import random
RANDOM_SEED = 42
torch.manual_seed(seed = RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

# The seed needs to be reset every time before rand() is called; otherwise the generator moves on and the tensors differ.
torch.random.manual_seed(seed = RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(f"{random_tensor_C}")
print(f"{random_tensor_D}")
# Equality Check
random_tensor_C == random_tensor_D

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
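
An alternative to resetting the global seed is to pass a dedicated `torch.Generator` to `torch.rand()`. This is only a sketch of that option; the generator names are illustrative.

```python
import torch

# Two independent generators seeded with the same value.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

a = torch.rand(3, 4, generator=gen_a)
b = torch.rand(3, 4, generator=gen_b)

# Drawing again from gen_a continues its sequence, so the values differ.
c = torch.rand(3, 4, generator=gen_a)

print(torch.equal(a, b), torch.equal(a, c))
```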

### Running tensors on GPUs (for faster computations)

GPUs are much faster than CPUs at performing the specific types of operations NNs need (matmuls).

Once a GPU is detected, the next step is to get PyTorch to run on the GPU i.e., for storing data (tensors) & computing on data (performing operations on tensors).

**torch.cuda package** is employed to do so.

In [40]:
# To check if pytorch has access to the GPU
torch.cuda.is_available()

True

In [41]:
# Set device type depending on whatever is available
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [42]:
# Count number of devices PyTorch has access to.
torch.cuda.device_count() # Useful when multiple GPUs are needed for even faster computations.

1

#### Putting tensors (& models) on the GPU

Done by calling **to(device)** on the tensor (or model) that is to be put on GPU, where device is the target device.

In [43]:
# Tensor default on CPU
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')
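
Tensors can also be created directly on the target device through the `device` argument, which avoids a separate copy. A device-agnostic sketch that falls back to the CPU when no GPU is present:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create one tensor directly on the device and move another one there.
on_device = torch.tensor([1, 2, 3], device=device)
moved = torch.ones(3, dtype=torch.int64).to(device)

# Operands must live on the same device for the op to work.
total = on_device + moved
print(total, total.device)
```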

In [44]:
# Moving tensors back to the CPU -- .cpu() is used to do so.
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy() # NumPy only works on CPU memory, so .cpu() first copies the GPU tensor to the CPU.
tensor_back_on_cpu

array([1, 2, 3])

In [45]:
tensor_on_gpu # However, the orig tensor is still on the GPU.

tensor([1, 2, 3], device='cuda:0')

## Exercises

In [46]:
rt = torch.rand(size=(7,7))
rt

tensor([[0.8694, 0.5677, 0.7411, 0.4294, 0.8854, 0.5739, 0.2666],
        [0.6274, 0.2696, 0.4414, 0.2969, 0.8317, 0.1053, 0.2695],
        [0.3588, 0.1994, 0.5472, 0.0062, 0.9516, 0.0753, 0.8860],
        [0.5832, 0.3376, 0.8090, 0.5779, 0.9040, 0.5547, 0.3423],
        [0.6343, 0.3644, 0.7104, 0.9464, 0.7890, 0.2814, 0.7886],
        [0.5895, 0.7539, 0.1952, 0.0050, 0.3068, 0.1165, 0.9103],
        [0.6440, 0.7071, 0.6581, 0.4913, 0.8913, 0.1447, 0.5315]])

In [47]:
rt2 = torch.rand(size=(1,7))
print(rt2)
rslt = torch.matmul(rt, rt2.T)
rslt

tensor([[0.1587, 0.6542, 0.3278, 0.6532, 0.3958, 0.9147, 0.2036]])


tensor([[1.9625],
        [1.0950],
        [0.9967],
        [1.8910],
        [1.9205],
        [1.0674],
        [1.6949]])

In [48]:
torch.manual_seed(0)
rt = torch.rand(size=(7,7))
print(rt)
rt2 = torch.rand(size=(1,7))
print(rt2)
rslt_seed = torch.matmul(rt, rt2.T)
print(rslt_seed)

tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074, 0.6341, 0.4901],
        [0.8964, 0.4556, 0.6323, 0.3489, 0.4017, 0.0223, 0.1689],
        [0.2939, 0.5185, 0.6977, 0.8000, 0.1610, 0.2823, 0.6816],
        [0.9152, 0.3971, 0.8742, 0.4194, 0.5529, 0.9527, 0.0362],
        [0.1852, 0.3734, 0.3051, 0.9320, 0.1759, 0.2698, 0.1507],
        [0.0317, 0.2081, 0.9298, 0.7231, 0.7423, 0.5263, 0.2437],
        [0.5846, 0.0332, 0.1387, 0.2422, 0.8155, 0.7932, 0.2783]])
tensor([[0.4820, 0.8198, 0.9971, 0.6984, 0.5675, 0.8352, 0.2056]])
tensor([[1.8542],
        [1.9611],
        [2.2884],
        [3.0481],
        [1.7067],
        [2.5290],
        [1.7989]])


In [49]:
torch.cuda.manual_seed(1234)

In [50]:
torch.manual_seed(1234)

print(torch.cuda.is_available())

rt = torch.rand(size=(2,3))
rt_gpu = rt.to(device)
print(rt_gpu)

rt2 = torch.rand(size=(2,3))
rt_gpu2 = rt2.to(device)
rt_gpu2

True
tensor([[0.0290, 0.4019, 0.2598],
        [0.3666, 0.0583, 0.7006]], device='cuda:0')


tensor([[0.0518, 0.4681, 0.6738],
        [0.3315, 0.7837, 0.5631]], device='cuda:0')

In [51]:
rslt_gpu = torch.matmul(rt_gpu, rt_gpu2.T)
rslt_gpu

tensor([[0.3647, 0.4709],
        [0.5184, 0.5617]], device='cuda:0')

In [52]:
torch.max(rslt_gpu), torch.min(rslt_gpu)

(tensor(0.5617, device='cuda:0'), tensor(0.3647, device='cuda:0'))

In [53]:
torch.argmax(rslt_gpu), torch.argmin(rslt_gpu)

(tensor(3, device='cuda:0'), tensor(0, device='cuda:0'))

In [54]:
torch.manual_seed(7)

rand_tensor = torch.rand(size=(1, 1, 1, 10))
print(rand_tensor, rand_tensor.shape)

rand_tensor_squeezed = rand_tensor.squeeze()
print(rand_tensor_squeezed, rand_tensor_squeezed.shape)

tensor([[[[0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297,
           0.3653, 0.8513]]]]) torch.Size([1, 1, 1, 10])
tensor([0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297, 0.3653,
        0.8513]) torch.Size([10])


**In-place operations are denoted by an _ suffix.**
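
A minimal sketch of the difference between an out-of-place op and its in-place counterpart, using `add()` and `add_()`:

```python
import torch

t = torch.tensor([1, 2, 3])

# add() returns a new tensor and leaves t unchanged.
out = t.add(10)
unchanged = t.clone()

# add_() modifies t itself.
t.add_(10)

print(unchanged, out, t)
```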