
In [1]:
import torch

A.2 Understanding tensors

Tensors represent a mathematical concept that generalises vectors and matrices to potentially higher dimensions. In other words, tensors are mathematical objects that can be characterized by their order (or rank), which provides the number of dimensions. For example, a scalar (just a number) is a tensor of rank 0, a vector is a tensor of rank 1, and a matrix is a tensor of rank 2.

From a computational perspective, tensors serve as data containers. For instance, they hold multidimensional data, where each dimension represents a different feature. Tensor libraries like PyTorch can create, manipulate, and compute with these arrays efficiently. In this context, a tensor library functions as an array library.

PyTorch tensors are similar to NumPy arrays but have several additional features that are important for deep learning. For example, PyTorch adds an automatic differentiation engine, simplifying computing gradients (see section A.4). PyTorch tensors also support GPU computations to speed up deep neural network training (see section A.8).
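As a quick preview of those two features, here is a minimal sketch. The function being differentiated (`x ** 2`) and the variable names are chosen only for illustration, and the device selection falls back to the CPU when no CUDA GPU is available; sections A.4 and A.8 cover both topics properly.

```python
import torch

# Automatic differentiation: track operations on x so gradients can be computed.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2          # illustrative function: y = x^2
y.backward()        # computes dy/dx and stores it in x.grad
print(x.grad)       # dy/dx = 2 * x, evaluated at x = 3

# GPU support: move a tensor to a GPU if one is available, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([1.0, 2.0, 3.0]).to(device)
print(t.device)
```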

PyTorch adopts most of the NumPy array API and syntax for its tensor operations.
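To make the similarity concrete, the sketch below builds the same 2x2 array in both libraries and calls a few attributes and methods that share names. It assumes NumPy is installed alongside PyTorch; the variable names are illustrative only.

```python
import numpy as np
import torch

np_array = np.array([[1, 2], [3, 4]])
pt_tensor = torch.tensor([[1, 2], [3, 4]])

# The same attribute and method names work on both objects.
print(np_array.shape, pt_tensor.shape)
print(np_array.T)        # transpose of the NumPy array
print(pt_tensor.T)       # transpose of the PyTorch tensor
print(np_array.sum(), pt_tensor.sum())

# Converting between the two libraries.
from_numpy = torch.from_numpy(np_array)   # NumPy array -> tensor
back_to_numpy = pt_tensor.numpy()         # CPU tensor -> NumPy array
print(type(from_numpy), type(back_to_numpy))
```

Note that `torch.from_numpy` and `.numpy()` share memory with the original object for CPU tensors, so an in-place change to one is visible in the other.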

[Scientific Computing in Python: Introduction to NumPy and Matplotlib](https://colab.research.google.com/drive/10xCthdT-_r5AX7XyImcx6dfBNdIys_32#scrollTo=SGmwe13TqUtu&line=9&uniqifier=1)

A.2.1 Scalars, vectors, matrices, and tensors

As mentioned earlier, PyTorch tensors are data containers for array-like structures.

A tensor can be:

1.   scalar --> a zero-dimensional tensor (for example, the integer 5)
2.   vector --> a one-dimensional tensor
3.   matrix --> a two-dimensional tensor
4.   higher-dimensional tensor (3D, 4D, ...) --> no specific name


We can create objects of PyTorch’s `Tensor` class using the `torch.tensor` function as shown in the following listing.



In [5]:
# Scalar (0D): A single number (e.g., 5). Creates a zero-dimensional tensor (scalar) from a Python integer.
# Tensor equivalent: Shape is (), no dimensions.
scaler = torch.tensor(5)
print(scaler)
print(scaler.shape)

tensor(5)
torch.Size([])


In [12]:
# Vector (1D): A 1D array of numbers (e.g., [1, 2, 3]).
# Tensor equivalent: Shape (n,).
# Creates a one-dimensional tensor (vector) from a Python list
vector = torch.tensor([1,2,3])
print(vector)
print(vector.shape)

tensor([1, 2, 3])
torch.Size([3])


In [8]:
# Matrix (2D): A 2D array of numbers (e.g., a table or grid).
# Tensor equivalent: Shape (n, m).
# Creates a two-dimensional tensor from a nested Python list
matrix = torch.tensor([[1,2], [3,4]])
print(matrix)
print(matrix.shape)

tensor([[1, 2],
        [3, 4]])
torch.Size([2, 2])


In [9]:
# Higher-dimensional Tensors (nD): Generalised arrays. For example, a 3D tensor can represent a stack of matrices (e.g., for RGB images).
# Tensor equivalent: Shape (d1, d2, ..., dn).
# Creates a three-dimensional tensor from a nested Python list
tensor_3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(tensor_3d)
print(tensor_3d.shape)

tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])
torch.Size([2, 2, 2])
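The rank described in section A.2 can be read directly from a tensor. The following sketch recreates the four tensors above in a dictionary (the dictionary and its keys are just for illustration) and prints the number of dimensions via `.ndim`, the shape, and the total element count via `.numel()`.

```python
import torch

examples = {
    "scalar": torch.tensor(5),
    "vector": torch.tensor([1, 2, 3]),
    "matrix": torch.tensor([[1, 2], [3, 4]]),
    "3D tensor": torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]),
}

for name, t in examples.items():
    # .ndim is the number of dimensions (rank); it equals len(t.shape).
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}, numel={t.numel()}")
```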


A.2.2 Tensor data types

When a tensor is created from Python integers, PyTorch uses the 64-bit integer data type (`torch.int64`) by default. We can access the data type of a tensor via its `.dtype` attribute:

In [13]:
print(scaler.dtype)
print(vector.dtype)
print(matrix.dtype)
print(tensor_3d.dtype)

torch.int64
torch.int64
torch.int64
torch.int64


If we create tensors from Python floats, PyTorch creates tensors with 32-bit precision by default:

In [14]:
float_vector = torch.tensor([1.2, 3.5, 6.7])
print(float_vector.dtype)

torch.float32


This choice is primarily due to the balance between precision and computational efficiency. A 32-bit floating-point number offers sufficient precision for most deep learning tasks while consuming less memory and computational resources than a 64-bit floating-point number. Moreover, GPU architectures are optimized for 32-bit computations, and using this data type can significantly speed up model training and inference.
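The memory side of this trade-off is easy to check. The sketch below stores the same values once as `float32` and once as `float64` and compares the bytes used per element with `.element_size()`. It assumes the default floating-point dtype has not been changed (for example with `torch.set_default_dtype`).

```python
import torch

values = [1.2, 3.5, 6.7]
f32 = torch.tensor(values)                       # default: torch.float32
f64 = torch.tensor(values, dtype=torch.float64)  # explicitly request 64-bit floats

# element_size() returns the number of bytes used by a single element.
print(f32.dtype, f32.element_size(), "bytes per element")
print(f64.dtype, f64.element_size(), "bytes per element")

# Total bytes used by the data of each tensor.
print(f32.element_size() * f32.numel(), f64.element_size() * f64.numel())
```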

It is also possible to change a tensor's data type using its `.to` method. The following code demonstrates this by converting a 64-bit integer tensor into a 32-bit float tensor:

In [15]:
floatvector = vector.to(torch.float32)
print(floatvector.dtype)

torch.float32
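There is more than one way to end up with a 32-bit float tensor. As a small sketch (the variable names `a`, `b`, and `c` are illustrative), the code below compares `.to(torch.float32)`, the `.float()` shorthand, and passing `dtype` directly to `torch.tensor`, and confirms that the source tensor keeps its original data type.

```python
import torch

vector = torch.tensor([1, 2, 3])

a = vector.to(torch.float32)                       # convert an existing tensor
b = vector.float()                                 # shorthand for the same conversion
c = torch.tensor([1, 2, 3], dtype=torch.float32)   # set the dtype at creation time

print(a.dtype, b.dtype, c.dtype)
print(vector.dtype)  # the conversion returned new tensors; vector is still int64
```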


For more information about different tensor data types available in PyTorch, check the official documentation at https://pytorch.org/docs/stable/tensors.html.