# Introduction to PyTorch tensors

This notebook contains an introduction to PyTorch tensors.

- [Creating tensors](#creating-tensors)
  - [Scalars](#scalars)
  - [Vectors](#vectors)
  - [Matrices](#matrices)
  - [Tensors (3D+)](#tensors-3d)
  - [Creating random tensors](#creating-random-tensors)
  - [Ones and Zeros](#ones-and-zeros)
  - [Range of tensors and tensors-like](#range-of-tensors-and-tensors-like)
- [Tensor's data types](#tensors-data-types)
- [Requires Grad](#requires-grad)
- [Tensors and NumPy](#tensors-and-numpy)
- [Reproducibility](#reproducibility)


In [34]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [35]:
# Setting default device
torch.set_default_device('mps')

## Creating tensors


### Scalars


In [36]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7, device='mps:0')

In [37]:
# Get tensor back as a Python int
scalar.item()

7

In [38]:
# Get dimension of scalar
scalar.ndim

0

### Vectors


In [39]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7], device='mps:0')

In [40]:
# Get dimension of vector
vector.ndim

1

In [41]:
# Checking shape of vector
vector.shape

torch.Size([2])

### Matrices


In [42]:
# MATRIX
MATRIX = torch.tensor(
    [
        [7, 8],
        [9, 10],
    ],
)
MATRIX

tensor([[ 7,  8],
        [ 9, 10]], device='mps:0')

In [43]:
# Checking dimensions of matrix
MATRIX.ndim

2

In [44]:
# Checking shape of matrix
MATRIX.shape

torch.Size([2, 2])

### Tensors (3D+)


In [45]:
# TENSOR
TENSOR = torch.tensor(
    [
        [
            [1, 2, 3],
            [3, 6, 9],
            [2, 4, 5],
        ]
    ]
)

In [46]:
# Checking dimensions of tensor
TENSOR.ndim

3

In [47]:
# Checking shape of tensor
TENSOR.shape

torch.Size([1, 3, 3])

### Creating random tensors


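As a naming convention, variables holding scalars and vectors use lowercase letters (`scalar`, `vector`), while variables holding matrices or higher-dimensional tensors use uppercase (`MATRIX`, `TENSOR`).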


In [48]:
# Random tensors
random_tensor = torch.rand(1, 3, 4)
random_tensor

tensor([[[0.8523, 0.5161, 0.1860, 0.1054],
         [0.8480, 0.4638, 0.8012, 0.4781],
         [0.0817, 0.4037, 0.8348, 0.7221]]], device='mps:0')

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.


In [49]:
# Checking dimensions of tensor
random_tensor.ndim

3

In [50]:
# Creating tensors of similar shape to an image
random_image_tensor = torch.rand(size=(224, 224, 3))
random_image_tensor.shape, random_image_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Ones and Zeros


In [51]:
# Tensors of ones and zeros
ones = torch.ones(size=(3, 3))
zeros = torch.zeros(size=(2, 3))
ones, zeros

(tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]], device='mps:0'),
 tensor([[0., 0., 0.],
         [0., 0., 0.]], device='mps:0'))

Tensors filled with zeros or ones (mostly zeros) are useful, for instance, when creating a mask for an object. A mask is a filter that determines which information is useful and which should be ignored by the model. This is just one of the many applications of these tensors.
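
As a rough illustration of the masking idea, the sketch below builds a mask from `torch.zeros`, switches on a region, and applies it by element-wise multiplication. The 4x4 "image" and the choice of region are made up for the example.

```python
import torch

# A small 4x4 "image" holding the values 1..16 (illustrative data only)
image = torch.arange(1.0, 17.0).reshape(4, 4)

# Start from a mask of zeros, then switch on the central 2x2 region
mask = torch.zeros(size=(4, 4))
mask[1:3, 1:3] = 1.0

# Element-wise multiplication keeps the values under the ones and zeroes out the rest
masked_image = image * mask
print(masked_image)
```

Only the four central values (6, 7, 10 and 11) survive; everything else becomes zero.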


### Range of tensors and tensors-like


In [52]:
# Tensor defined in range [a,b)
one_to_ten = torch.arange(1, 11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], device='mps:0')

`torch.arange` works much like Python's built-in `range`: the end value is excluded, and an optional third argument sets the step.
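
A short sketch of the step argument, using values chosen only for illustration:

```python
import torch

# start=0, end=10 (excluded), step=2
evens = torch.arange(0, 10, 2)
print(evens.tolist())  # [0, 2, 4, 6, 8]

# Unlike range, the step may be a float, which yields a floating-point tensor
halves = torch.arange(0, 2, 0.5)
print(halves.tolist())  # [0.0, 0.5, 1.0, 1.5]
```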


In [53]:
# Zeros tensor same shape as one_to_ten
zeros_1_to_10 = torch.zeros_like(one_to_ten)
zeros_1_to_10

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='mps:0')

The `torch.*_like()` naming denotes functions that create a tensor with a specific fill (such as ones or zeros) and the same shape as the tensor passed as the argument. By default, the new tensor also inherits that tensor's data type and device.
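
A few other members of the `_like` family, sketched with a hypothetical `reference` tensor:

```python
import torch

reference = torch.arange(1, 11)  # shape (10,), dtype int64

ones = torch.ones_like(reference)
sevens = torch.full_like(reference, 7)

# rand_like needs a floating-point dtype, so the inherited int64 is overridden
noise = torch.rand_like(reference, dtype=torch.float32)

print(ones.shape, ones.dtype)
print(sevens.tolist())
print(noise.shape, noise.dtype)
```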


In [54]:
# Checking if shapes are equal
zeros_1_to_10.shape == one_to_ten.shape

True

## Tensor's data types

**Note:** Tensor datatypes are one of the three big issues you'll run into with PyTorch and deep learning:

1. Tensors not the right datatype (some functions require a specific datatype, usually `float32`)
2. Tensors not the right shape
3. Tensors not on the right device
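
As one example of the first issue, `torch.mean` rejects integer tensors. The sketch below triggers the error, fixes it with a cast, and prints the three attributes worth checking whenever an operation fails.

```python
import torch

int_tensor = torch.arange(1, 11)  # int64 by default

# torch.mean only accepts floating-point (or complex) input
try:
    torch.mean(int_tensor)
except RuntimeError as error:
    print(f"dtype problem: {error}")

# Casting to float32 first resolves it
mean_value = torch.mean(int_tensor.type(torch.float32))
print(mean_value.item())  # 5.5

# Datatype, shape and device: the three things to inspect when debugging
print(int_tensor.dtype, int_tensor.shape, int_tensor.device)
```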


In [55]:
# Float 32 tensor
float_32_tensor = torch.tensor(
    [3.0, 6.0, 9.0],
    dtype=None,  # The datatype of the tensor (None infers it from the data: float32 for Python floats)
    device=None,  # The device where the tensor is located (None uses the default device; e.g. cpu, cuda, mps)
    requires_grad=False,  # If True, records operations for automatic differentiation
)

In computer science, the precision of a numerical quantity is a measure of the detail in which the quantity is expressed. It is usually measured in bits, but sometimes in decimal digits. It is related to precision in mathematics, which describes the number of digits used to express a value.

- `float16` represents a 16-bit floating point (half precision)
- `float32` represents a 32-bit floating point (single precision)
- `float64` represents a 64-bit floating point (double precision)

Half-precision tensors are useful when you can sacrifice some detail in the numbers in exchange for values that take up less memory.
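
The trade-off can be seen directly. This sketch, with arbitrarily chosen values, compares the bytes used per element and the rounding error introduced by converting to `float16`.

```python
import torch

values = torch.tensor([0.1, 1000.1, 3.0], dtype=torch.float32)
half_values = values.type(torch.float16)

# Bytes per element: 4 for float32, 2 for float16
print(values.element_size(), half_values.element_size())

# Convert back to float32 to expose the rounding error introduced by float16
round_trip_error = (half_values.type(torch.float32) - values).abs()
print(round_trip_error)

# Gap between 1.0 and the next representable number in each datatype
print(torch.finfo(torch.float16).eps, torch.finfo(torch.float32).eps)
```

Small integers such as 3.0 survive the conversion exactly, while 1000.1 is rounded noticeably because `float16` has far fewer bits for the fractional part at that magnitude.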


In [56]:
# Checking datatype
float_32_tensor.dtype

torch.float32

In [57]:
# Converting tensor type
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], device='mps:0', dtype=torch.float16)

## Requires Grad


The `requires_grad` parameter needs a somewhat deeper explanation. Deep learning models use gradient descent to update their weights and biases during training. These weights and biases are nothing more than tensors, and the gradient "tells them" how they should be adjusted for the model to be better optimized.

When this parameter is set to `True`, PyTorch can compute the gradient for that tensor. If the parameter is `False`, either you don't need gradients for that tensor (for inference/testing), or the tensor is input data and doesn't need to be learned.
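
A minimal sketch of the difference, using hypothetical `weight` and `inputs` tensors and `torch.no_grad()` to switch tracking off for inference:

```python
import torch

weight = torch.tensor(2.0, requires_grad=True)  # learnable parameter
inputs = torch.tensor(5.0)                      # plain data, requires_grad=False by default

# An operation involving a tracked tensor produces a tracked result
prediction = weight * inputs
print(prediction.requires_grad)  # True

# Inside torch.no_grad() nothing is recorded, which is typical for inference/testing
with torch.no_grad():
    inference_prediction = weight * inputs
print(inference_prediction.requires_grad)  # False
```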

PyTorch records all operations on tensors with `requires_grad=True`, building a computation graph. This graph enables automatic differentiation (backpropagation). It links all the operations involving the recorded tensor, allowing for the computation of its gradient.

```python
x = torch.tensor(3.0, requires_grad=True)  # must be floating point: integer tensors cannot require gradients

y = x ** 2
z = y + 4
w = z * 2
```

In this graph, `x` is a leaf node (this can be checked with `x.is_leaf`), and the final output tensor (in our case `w`) is the root. Backpropagation starts at the root and flows backward until it reaches the input tensor.
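
One way to peek at the recorded graph is the `grad_fn` attribute, which points at the operation that produced each tensor. The sketch below only prints the class names of those nodes; the exact names may vary between PyTorch versions.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x**2
z = y + 4
w = z * 2

# A leaf tensor created by the user was not produced by any operation
print(x.grad_fn)  # None

# Each intermediate result remembers the operation that produced it
for name, tensor in [("y", y), ("z", z), ("w", w)]:
    print(name, type(tensor.grad_fn).__name__)
```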


In [58]:
x = torch.tensor(3.0, requires_grad=True)

y = x**2
z = y + 4
w = z * 2

In [59]:
# Checking if tensor is leaf
x.is_leaf, y.is_leaf, z.is_leaf, w.is_leaf

(True, False, False, False)

In [60]:
# Doing backpropagation from w
w.backward()

In [61]:
# Checking grad for x
x.grad

tensor(12., device='mps:0')

This means that a small change in $x$ ($\delta x$) changes $w$ by roughly 12 times that amount: since $w = 2(x^2 + 4)$, we get $\frac{\partial w}{\partial x} = 4x = 12$ at $x = 3$. The gradient is positive, so if $x$ increased, so would $w$. Treating $w$ as the quantity to minimize, the gradient descent update used when training a neural network would go something like:

$$ x_{new} = x_{old} - \eta \cdot \frac{\partial w}{\partial x}$$

- For $\eta = 0.1$ ($\eta$ is the learning rate):

$$ x_{new} = 3 - 0.1 \cdot 12$$

$$ x_{new} = 3 - 1.2$$

$$ x_{new} = 1.8$$

So the value of $x$ should be updated to 1.8, which moves $w$ closer to its minimum.


## Tensors and NumPy

NumPy is a popular Python library for scientific numerical computing, so PyTorch has functionality to interact with it. For instance, if you need a NumPy array converted into a tensor, you can use `torch.from_numpy(ndarray)`; for the opposite, a tensor converted to a NumPy array, use `<tensor>.numpy()`.


In [62]:
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array).type(torch.float32)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

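NumPy's default floating-point data type is `float64`, whereas PyTorch's is `float32`. `torch.from_numpy` keeps the array's data type, which is why the tensor above is explicitly cast to `float32`.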
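
A quick check of the two defaults, as a sketch:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
print(array.dtype)  # float64

# from_numpy keeps the NumPy data type instead of switching to PyTorch's default
tensor_64 = torch.from_numpy(array)
print(tensor_64.dtype)  # torch.float64

# A tensor built from Python floats uses PyTorch's default
print(torch.tensor([1.0, 2.0]).dtype)  # torch.float32
```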


In [63]:
# Adding 1 to each value in the array
array += 1

# Checking array and tensor
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

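Here the array and the tensor don't share memory, so changes in one are not reflected in the other. This is a consequence of the `.type(torch.float32)` call, which copies the data into a new tensor; a tensor returned by `torch.from_numpy` alone does share memory with the source array. The same holds in the other direction: calling `<tensor>.numpy()` on a CPU tensor returns an array that shares memory with that tensor.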
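
Whether memory is shared depends on whether a copy is made along the way. The sketch below compares a direct `torch.from_numpy` conversion with one followed by a dtype cast, and then checks the tensor-to-array direction on a CPU tensor (the device is passed explicitly so the example does not depend on the default device).

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)

shared = torch.from_numpy(array)                      # same memory as `array`
copied = torch.from_numpy(array).type(torch.float32)  # the dtype cast allocates new memory

array += 1  # in-place change to the NumPy array
print(shared.tolist())  # [2.0, 3.0, 4.0]
print(copied.tolist())  # [1.0, 2.0, 3.0]

# Tensor to array: a CPU tensor and the array from .numpy() also share memory
cpu_tensor = torch.ones(3, device="cpu")
view = cpu_tensor.numpy()
cpu_tensor += 1  # in-place change to the tensor
print(view.tolist())  # [2.0, 2.0, 2.0]
```

If an independent copy is wanted, `torch.tensor(array)` or `tensor.clone()` can be used instead.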


In [64]:
# Tensor to NumPy array
tensor = torch.ones(7).cpu()  # Need to transfer from gpu (mps) to cpu because of numpy
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

The data type is preserved in this direction as well: PyTorch's default data type is `float32`, so when a tensor in that format is converted, the resulting NumPy array is also `float32`.


## Reproducibility

The nature of neural networks is intertwined with the idea of "randomness": the network for the most part starts with random numbers and updates them over and over again to make better representations of the data. The thing is, in computing there isn't really any true randomness, and true randomness is not even that desirable at times.

This is a key concept in programming in general. There is no real way of computing a random number; instead, we employ techniques that make the data appear random (pseudorandomness). There is always a trace, always a way of reproducing the "random" values exactly.

This applies to PyTorch too, so we'll discuss how to control it. The main reason we don't want the data to be truly random in this case is **_reproducibility_**: we need a way of reproducing the results of an experiment.

This is done through setting the value of the "**_random seed_**".
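
To make the idea of a seed concrete, here is a toy pseudorandom generator (a linear congruential generator) in plain Python. It is not the algorithm PyTorch uses, and the function name `lcg` and its constants are chosen purely for illustration; the point is only that the whole sequence is determined by the seed.

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` pseudorandom floats in [0, 1) from a linear congruential generator."""
    state = seed
    values = []
    for _ in range(count):
        # The next state depends only on the previous one, so the seed fixes everything
        state = (a * state + c) % m
        values.append(state / m)
    return values


first_run = lcg(seed=42, count=3)
second_run = lcg(seed=42, count=3)
other_seed = lcg(seed=7, count=3)

print(first_run == second_run)  # True
print(first_run == other_seed)  # False
```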


In [65]:
# Creating two random tensors and checking whether they are equal
random_tensor_a = torch.rand((3, 2, 5))
random_tensor_b = torch.rand((3, 2, 5))

# Checking if they are the same
random_tensor_a == random_tensor_b

tensor([[[False, False, False, False, False],
         [False, False, False, False, False]],

        [[False, False, False, False, False],
         [False, False, False, False, False]],

        [[False, False, False, False, False],
         [False, False, False, False, False]]], device='mps:0')

In [66]:
# Creating random seed
RANDOM_SEED = 42

In [67]:
# Creating new tensors and comparing them
torch.manual_seed(RANDOM_SEED)
random_tensor_c = torch.rand((3, 2, 5))

torch.manual_seed(RANDOM_SEED)
random_tensor_d = torch.rand((3, 2, 5))

# Checking if they are the same
random_tensor_c == random_tensor_d

tensor([[[True, True, True, True, True],
         [True, True, True, True, True]],

        [[True, True, True, True, True],
         [True, True, True, True, True]],

        [[True, True, True, True, True],
         [True, True, True, True, True]]], device='mps:0')

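As seen above, the seed has to be set again before each call whose output you want to reproduce. `torch.manual_seed` resets the random number generator, and every call to `torch.rand` advances its state, so setting the seed once and then calling `torch.rand` twice would produce two different tensors. In our case there are two assignments that should match, so the seed is set manually twice, once before each.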
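
The sketch below shows this behaviour: after a single call to `torch.manual_seed`, consecutive draws differ, but reseeding replays the same sequence in the same order. `torch.equal` is used to compare whole tensors with a single boolean.

```python
import torch

RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
first_draw = torch.rand(3)
second_draw = torch.rand(3)  # the generator state has advanced, so this differs

torch.manual_seed(RANDOM_SEED)
replayed_first = torch.rand(3)
replayed_second = torch.rand(3)

print(torch.equal(first_draw, second_draw))       # False
print(torch.equal(first_draw, replayed_first))    # True
print(torch.equal(second_draw, replayed_second))  # True
```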
