## 0.0 : Tensors...the basic building blocks of ML

In [140]:
!pip install torch

Defaulting to user installation because normal site-packages is not writeable





In [141]:
import numpy as np
import torch

print(torch.__version__)

2.1.1+cpu


Tensors are the fundamental building block of machine learning.

Their job is to represent data in a numerical way.

For example, you could represent an image as a tensor with shape [3, 224, 224] which would mean [colour_channels, height, width], as in the image has 3 colour channels (red, green, blue), a height of 224 pixels and a width of 224 pixels.
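As a small sketch of the image example above, the following creates a tensor with that shape and reads its dimensions back. The values are random noise rather than a real image, and the name `image` is only illustrative.

```python
import torch

# A stand-in "image": 3 colour channels, 224 pixels high, 224 pixels wide.
# The values are random noise, used here only to illustrate the shape.
image = torch.rand(3, 224, 224)

colour_channels, height, width = image.shape
print(image.ndim)                      # 3
print(colour_channels, height, width)  # 3 224 224
```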

In [142]:
scalar = torch.tensor(5)
scalar

tensor(5)

In [143]:
print(scalar.ndim)

0


In [144]:
# turn the scalar into a Python int; this only works for single-element tensors
scalar.item()

5

In [145]:
vector = torch.tensor([8, 9, 4])
vector

tensor([8, 9, 4])

In [146]:
print(vector.ndim)

1


In [147]:
matrix = torch.tensor([[1, 2, 3], [8, 5, 3]])
matrix

tensor([[1, 2, 3],
        [8, 5, 3]])

In [148]:
print(matrix.shape)

torch.Size([2, 3])


In [149]:
print(matrix.ndim)

2


In [150]:
matrix2 = torch.tensor([[[3, 5, 6], [5, 9, 1], [0, 2, 6]]])
print(matrix2.ndim)

3


torch.tensor() always copies data. If you already have a tensor and just want to change its requires_grad flag, use requires_grad_() or detach() to avoid a copy. If you have a NumPy array and want to avoid a copy, use torch.as_tensor().

In [151]:
tensor = torch.tensor([4, 7, 8], requires_grad=False)
print(tensor.is_leaf)

True


In [152]:
tensor = torch.tensor([0, 6, 3, 4]).detach()
print(tensor)

tensor([0, 6, 3, 4])


In [153]:
np_arr = np.array([4, 6, 8, 7])
arr_to_tens = torch.as_tensor(np_arr)
print(arr_to_tens)

tensor([4, 6, 8, 7], dtype=torch.int32)
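To make the copy-versus-share distinction concrete, here is a minimal sketch (variable names are illustrative) that modifies a NumPy array after converting it both ways. It assumes the array's dtype and device already match what the tensor needs, which is when torch.as_tensor() can reuse the memory.

```python
import numpy as np
import torch

np_arr = np.array([1, 2, 3])

copied = torch.tensor(np_arr)     # always copies the data
shared = torch.as_tensor(np_arr)  # reuses the NumPy buffer when it can

np_arr[0] = 100  # modify the original array in place

print(copied[0].item())  # 1   (the copy is unaffected)
print(shared[0].item())  # 100 (the shared tensor sees the change)
```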


In [154]:
cpu = torch.device("cpu")
tens = torch.tensor([0, 5, 6, 7, 4], dtype=torch.float32, device=cpu)
tens

tensor([0., 5., 6., 7., 4.])

In [155]:
ones = torch.ones(5, dtype=torch.int32)
ones

tensor([1, 1, 1, 1, 1], dtype=torch.int32)

In [156]:
zeros = torch.zeros(6)
zeros

tensor([0., 0., 0., 0., 0., 0.])

In [157]:
test_tens = torch.tensor([[5, 7, 8, 3, 2], [4, 6, 7, 8, 7], [6, 8, 2, 1, 3]])
print(test_tens)
print(test_tens[0][2])
print(test_tens[:, 0])

tensor([[5, 7, 8, 3, 2],
        [4, 6, 7, 8, 7],
        [6, 8, 2, 1, 3]])
tensor(8)
tensor([5, 4, 6])


A tensor can be created with requires_grad=True so that torch.autograd records operations on it for automatic differentiation.

In [158]:
grad_tens = torch.tensor([9, 5, 4, 3, 0], dtype=torch.float64, requires_grad=True)
print(grad_tens)
opt = grad_tens.pow(2).sum()
opt.backward()
print(grad_tens.grad)

tensor([9., 5., 4., 3., 0.], dtype=torch.float64, requires_grad=True)
tensor([18., 10.,  8.,  6.,  0.], dtype=torch.float64)


Methods which mutate a tensor are marked with an underscore suffix. For example, torch.FloatTensor.abs_() computes the absolute value in-place and returns the modified tensor, while torch.FloatTensor.abs() computes the result in a new tensor.
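A short sketch of the difference, using the generic tensor methods abs() and abs_() on a float tensor (the variable names are illustrative):

```python
import torch

t = torch.tensor([-1.0, 2.0, -3.0])

result = t.abs()  # out-of-place: returns a new tensor, t is untouched
print(t)          # tensor([-1.,  2., -3.])
print(result)     # tensor([1., 2., 3.])

returned = t.abs_()  # in-place: modifies t and returns it
print(t)             # tensor([1., 2., 3.])
print(returned.data_ptr() == t.data_ptr())  # True: same underlying memory
```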

To change an existing tensor's torch.device and/or torch.dtype, consider using the to() method on the tensor.

In [159]:
tenss = torch.tensor([5, 6, 4, 3])
print(tenss.dtype)
# tenss = tenss.to(torch.float64)
# print(tenss.dtype)
print(tenss.float().dtype)

torch.int64
torch.float32


In [160]:
print(tenss.is_cuda)
print(tenss.device)

False
cpu


### Random tensors

But when building machine learning models with PyTorch, it's rare that you'll create tensors by hand (like we've been doing).

Instead, a machine learning model often starts out with large random tensors of numbers and adjusts these random numbers as it works through data to better represent it.

In essence:

Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...
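The loop above can be sketched on a toy problem. This is only an illustration under simple assumptions (a single random weight `w`, made-up data following y = 2x, a hand-picked learning rate of 0.05), not how a full model is trained:

```python
import torch

torch.manual_seed(0)

# Toy data following y = 2 * x; the "pattern" to discover is the factor 2.
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2 * x

# Start with a random number...
w = torch.rand(1, requires_grad=True)

for _ in range(100):
    loss = ((w * x - y) ** 2).mean()  # ...look at the data...
    loss.backward()
    with torch.no_grad():
        w -= 0.05 * w.grad            # ...update the random number...
        w.grad.zero_()                # ...and repeat.

print(w.item())  # close to 2.0
```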

In [161]:
random_tensor = torch.rand((5, 7))
print(random_tensor)

tensor([[0.2666, 0.6274, 0.2696, 0.4414, 0.2969, 0.8317, 0.1053],
        [0.2695, 0.3588, 0.1994, 0.5472, 0.0062, 0.9516, 0.0753],
        [0.8860, 0.5832, 0.3376, 0.8090, 0.5779, 0.9040, 0.5547],
        [0.3423, 0.6343, 0.3644, 0.7104, 0.9464, 0.7890, 0.2814],
        [0.7886, 0.5895, 0.7539, 0.1952, 0.0050, 0.3068, 0.1165]])


In [162]:
# torch.arange(start, end, step)
rang_tens = torch.arange(1, 100, 2)
print(rang_tens)

tensor([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35,
        37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71,
        73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99])


### Common issues 
Aside from shape issues (tensor shapes don't match up), two of the other most common issues you'll come across in PyTorch are datatype and device issues.

For example, one of your tensors is torch.float32 and the other is torch.float16 (PyTorch often likes tensors to be in the same format).

Or one of your tensors is on the CPU and the other is on the GPU (PyTorch likes calculations between tensors to be on the same device).
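Shape, datatype and device can all be read straight off a tensor. The helper `describe` below is a hypothetical convenience function, not part of PyTorch:

```python
import torch

def describe(t):
    """Return the shape, dtype and device of a tensor (hypothetical helper)."""
    return t.shape, t.dtype, t.device

a = torch.rand(2, 3)                       # float32 by default
b = torch.rand(2, 3, dtype=torch.float64)  # explicitly float64

print(describe(a))
print(describe(b))
print(a.dtype == b.dtype)  # False: a mismatch worth knowing about
```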

Note: When you run into issues in PyTorch, it's very often to do with one of the three attributes above. So when the error messages show up, sing yourself a little song called "what, what, where":

"what shape are my tensors? what datatype are they and where are they stored? what shape, what datatype, where where where"

### Manipulating tensors (tensor operations)
In deep learning, data (images, text, video, audio, protein structures, etc) gets represented as tensors.

A model learns by investigating those tensors and performing a series of operations (could be 1,000,000s+) on tensors to create a representation of the patterns in the input data.

These operations are often a wonderful dance between:

- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication
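A quick sketch of these operations on a small tensor (the values are arbitrary), including the difference between element-wise and matrix multiplication:

```python
import torch

t = torch.tensor([1, 2, 3])

print(t + 10)  # tensor([11, 12, 13])
print(t - 10)  # tensor([-9, -8, -7])
print(t * 10)  # tensor([10, 20, 30])
print(t / 10)  # tensor([0.1000, 0.2000, 0.3000])

print(t * t)   # element-wise: tensor([1, 4, 9])
print(t @ t)   # matrix multiplication (a dot product for 1-D tensors): tensor(14)
```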

In [163]:
# matrix multiplication aka dot product

tensor1 = torch.rand((5, 3))
tensor2 = torch.rand((3, 4))


# the @ operator is shorthand for torch.matmul; torch.mm only accepts 2-D matrices
print(f"USING @ OPERATOR : \n{tensor1 @ tensor2}")
print(f"\nUSING MATMUL : \n {torch.matmul(tensor1, tensor2)}")
print(f"\nUSING MM : \n {torch.mm(tensor1, tensor2)}")

USING @ OPERATOR : 
tensor([[0.9313, 0.7367, 1.3473, 1.3242],
        [0.7607, 0.8019, 1.2323, 1.3081],
        [0.5766, 0.1871, 0.2474, 0.2996],
        [0.5605, 0.6195, 1.0662, 1.0697],
        [1.0106, 0.3028, 0.5179, 0.5498]])

USING MATMUL : 
 tensor([[0.9313, 0.7367, 1.3473, 1.3242],
        [0.7607, 0.8019, 1.2323, 1.3081],
        [0.5766, 0.1871, 0.2474, 0.2996],
        [0.5605, 0.6195, 1.0662, 1.0697],
        [1.0106, 0.3028, 0.5179, 0.5498]])

USING MM : 
 tensor([[0.9313, 0.7367, 1.3473, 1.3242],
        [0.7607, 0.8019, 1.2323, 1.3081],
        [0.5766, 0.1871, 0.2474, 0.2996],
        [0.5605, 0.6195, 1.0662, 1.0697],
        [1.0106, 0.3028, 0.5179, 0.5498]])


The torch.nn.Linear() module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input x and a weights matrix W:


$$ y = x \cdot W^T + b $$

- x is the input to the layer (deep learning is a stack of layers like torch.nn.Linear() and others on top of each other).
- W is the weights matrix created by the layer, this starts out as random numbers that get adjusted as a neural network learns to better represent patterns in the data (notice the "T", that's because the weights matrix gets transposed).

- b is the bias term used to slightly offset the weights and inputs.
- y is the output (a manipulation of the input in the hopes to discover patterns in it).

In [164]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible
torch.manual_seed(42)
# This uses matrix multiplication
linear = torch.nn.Linear(
    in_features=2,  # must match the last (inner) dimension of the input
    out_features=6,  # size of the last dimension of the output
)

tensor_A = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# Shapes: input (3, 2) @ transposed weights (2, 6) -> output (3, 6)
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
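The formula can be checked by hand against the layer's own output. This sketch uses the layer's `weight` and `bias` parameters; the name `manual` is illustrative.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# linear.weight has shape (out_features, in_features) = (6, 2),
# so it is transposed to (2, 6) before multiplying with x of shape (3, 2).
manual = x @ linear.weight.T + linear.bias

print(linear.weight.shape)                           # torch.Size([6, 2])
print(torch.allclose(manual, linear(x), atol=1e-6))  # True
```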


### Aggregating tensors

Finding the min, max, mean, sum, etc (aggregation)

In [165]:
x = torch.arange(0, 5, 0.1)
x

tensor([0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000,
        0.9000, 1.0000, 1.1000, 1.2000, 1.3000, 1.4000, 1.5000, 1.6000, 1.7000,
        1.8000, 1.9000, 2.0000, 2.1000, 2.2000, 2.3000, 2.4000, 2.5000, 2.6000,
        2.7000, 2.8000, 2.9000, 3.0000, 3.1000, 3.2000, 3.3000, 3.4000, 3.5000,
        3.6000, 3.7000, 3.8000, 3.9000, 4.0000, 4.1000, 4.2000, 4.3000, 4.4000,
        4.5000, 4.6000, 4.7000, 4.8000, 4.9000])

In [166]:
print(f"THE MAX : {torch.max(x)}")  # or x.max(); the same pattern works for the others
print(f"THE INDEX OF THE MAX : {torch.argmax(x)}")
print(f"THE MIN : {torch.min(x)}")
print(f"THE INDEX OF THE MIN : {torch.argmin(x)}")
print(f"THE SUM : {torch.sum(x)}")
print(f"THE MEAN : {torch.mean(x)}")
print(f"THE STD : {torch.std(x)}")

THE MAX : 4.900000095367432
THE INDEX OF THE MAX : 49
THE MIN : 0.0
THE INDEX OF THE MIN : 0
THE SUM : 122.49999237060547
THE MEAN : 2.4499998092651367
THE STD : 1.457737922668457
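Two details that often come up with aggregation, sketched with small illustrative tensors: the `dim` argument aggregates along one axis only, and torch.mean() needs a floating point dtype.

```python
import torch

m = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(m.sum())       # over all elements: tensor(21.)
print(m.sum(dim=0))  # collapse the rows: tensor([5., 7., 9.])
print(m.sum(dim=1))  # collapse the columns: tensor([ 6., 15.])

# torch.mean needs a floating point (or complex) dtype, so integer
# tensors have to be converted first.
x_int = torch.arange(0, 100, 10)  # dtype torch.int64
print(x_int.float().mean())       # tensor(45.)
```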



### Reshaping, stacking, squeezing and unsqueezing


Deep learning models (neural networks) are all about manipulating tensors in some way, and because of the rules of matrix multiplication, shape mismatches lead to errors. The methods below help you make sure the right elements of your tensors are mixing with the right elements of other tensors.

In [167]:
tensor = torch.arange(1, 10)
tensor.shape

torch.Size([9])

In [169]:
reshaped_tensor = tensor.reshape(1, 9)  # 9 because the original tensor has 9 elements
print(reshaped_tensor)
print(reshaped_tensor.shape)

tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
torch.Size([1, 9])


In [175]:
z = tensor.view(1, 9)  # changing the view changes the original tensor too.
z, tensor

(tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]]), tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]))
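A view shares memory with the tensor it came from, so writing through one shows up in the other. The sketch below uses separate names (`base`, `viewed`) so it doesn't disturb the notebook's `tensor`:

```python
import torch

base = torch.arange(1, 10)
viewed = base.view(1, 9)

viewed[:, 0] = 5  # write through the view

print(viewed)  # tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9]])
print(base)    # tensor([5, 2, 3, 4, 5, 6, 7, 8, 9])
print(viewed.data_ptr() == base.data_ptr())  # True: same underlying memory
```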

In [180]:
stacked_tens = torch.stack([tensor, tensor, tensor], dim=0)
stacked_tens, stacked_tens.shape

(tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9],
         [1, 2, 3, 4, 5, 6, 7, 8, 9],
         [1, 2, 3, 4, 5, 6, 7, 8, 9]]),
 torch.Size([3, 9]))

In [181]:
stacked_tens = torch.stack([tensor, tensor, tensor], dim=1)
stacked_tens, stacked_tens.shape

(tensor([[1, 1, 1],
         [2, 2, 2],
         [3, 3, 3],
         [4, 4, 4],
         [5, 5, 5],
         [6, 6, 6],
         [7, 7, 7],
         [8, 8, 8],
         [9, 9, 9]]),
 torch.Size([9, 3]))

In [193]:
print(f"Previous tensor: {reshaped_tensor}")
print(f"Previous shape: {reshaped_tensor.shape}")

# Remove extra dimension from reshaped_tensor
x_squeezed = reshaped_tensor.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous tensor: tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
Previous shape: torch.Size([1, 9])

New tensor: tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
New shape: torch.Size([9])

New tensor: tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
New shape: torch.Size([1, 9])


In [194]:
# Create tensor with specific shape
x_original = torch.rand(size=(224, 224, 3))

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1)  # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])



### Indexing and slicing 

Indexing values goes outer dimension -> inner dimension (check out the square brackets).

In [196]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [197]:
# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}")
print(f"Second square bracket: {x[0][0]}")
print(f"Third square bracket: {x[0][0][0]}")

First square bracket:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Second square bracket: tensor([1, 2, 3])
Third square bracket: 1
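Indices can also be combined in a single pair of brackets, with `:` meaning "everything along this dimension". A short sketch on the same 1 x 3 x 3 tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[:, 0])     # tensor([[1, 2, 3]])
print(x[:, :, 1])  # tensor([[2, 5, 8]])
print(x[:, 1, 1])  # tensor([5])
print(x[0, 0, :])  # tensor([1, 2, 3])  (same as x[0][0])
```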


### PyTorch tensors & NumPy

The two main methods you'll want to use for NumPy to PyTorch (and back again) are:

torch.from_numpy(ndarray) - NumPy array -> PyTorch tensor.

torch.Tensor.numpy() - PyTorch tensor -> NumPy array.

In [202]:
arr = np.array([[3, 6, 7], [0, 7, 1]])
tens = torch.from_numpy(arr)
tens, type(tens)

(tensor([[3, 6, 7],
         [0, 7, 1]], dtype=torch.int32),
 torch.Tensor)

In [203]:
tensor = torch.tensor([[5, 7, 1], [0, 5, 3], [4, 5, 3]])
nd_array = tensor.numpy()
nd_array, type(nd_array)

(array([[5, 7, 1],
        [0, 5, 3],
        [4, 5, 3]], dtype=int64),
 numpy.ndarray)
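One thing to watch when converting is that the dtype is carried over. NumPy's default float type is float64 while PyTorch's is float32, so an explicit conversion is sometimes needed. A minimal sketch with illustrative names:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)  # NumPy defaults to float64
tensor_from_np = torch.from_numpy(array)
print(tensor_from_np.dtype)  # torch.float64

# Convert explicitly if the rest of the code expects float32
tensor_f32 = tensor_from_np.type(torch.float32)
print(tensor_f32.dtype)      # torch.float32

# Going the other way, the tensor's dtype is kept as well
print(torch.ones(7).numpy().dtype)  # float32
```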

### Reproducibility (trying to take the random out of random)

In [206]:
import random

#  Set the random seed
RANDOM_SEED = 42
torch.manual_seed(seed=RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.random.manual_seed(
    seed=RANDOM_SEED
)  # try commenting this line out and seeing what happens
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.4540, 0.1965, 0.9210, 0.3462],
        [0.1481, 0.0858, 0.5909, 0.0659],
        [0.7476, 0.6253, 0.9392, 0.1338]])

Tensor D:
tensor([[0.4540, 0.1965, 0.9210, 0.3462],
        [0.1481, 0.0858, 0.5909, 0.0659],
        [0.7476, 0.6253, 0.9392, 0.1338]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
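As an alternative sketch, a dedicated torch.Generator can carry the seed instead of the global random state, and torch.equal() collapses the element-wise comparison into a single boolean. The names `gen_a`, `gen_b`, `sample_a` and `sample_b` are illustrative.

```python
import torch

# Two independent generators seeded identically
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

sample_a = torch.rand(3, 4, generator=gen_a)
sample_b = torch.rand(3, 4, generator=gen_b)

print(torch.equal(sample_a, sample_b))  # True
```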

### Running tensors on GPUs (and making faster computations)

Note: When I reference "GPU" throughout this course, I'm referencing an NVIDIA GPU with CUDA enabled (CUDA is a computing platform and API that allows GPUs to be used for general-purpose computing and not just graphics) unless otherwise specified.

In [207]:
!nvidia-smi

'nvidia-smi' is not recognized as an internal or external command,
operable program or batch file.


In [213]:
# check whether CUDA is available, i.e. whether PyTorch has access to a GPU
torch.cuda.is_available()

False

In [214]:
# check how many GPUs are available
torch.cuda.device_count()

0

#### NOTE : 
Make sure to write <b>device-agnostic code</b>, which means code that'll run on CPU (always available) or GPU (if available).

In [209]:
# Set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

#### NOTE :
Putting a tensor on GPU using to(device) (e.g. some_tensor.to(device)) returns a copy of that tensor, i.e. the same tensor will be on CPU and GPU. To overwrite tensors, reassign them:

some_tensor = some_tensor.to(device)
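Put together, a device-agnostic sketch might look like the following. It runs unchanged whether or not a GPU is present; the tensor names are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

some_tensor = torch.tensor([1, 2, 3])  # created on the CPU by default
some_tensor = some_tensor.to(device)   # reassign to keep the moved tensor

print(some_tensor.device)

# Tensors created directly on the target device skip the extra transfer
other = torch.ones(3, dtype=torch.int64, device=device)
print(some_tensor + other)  # both operands live on the same device
```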

### Moving tensors back to the CPU 

For example, you'll want to do this if you want to interact with your tensors with NumPy (NumPy does not leverage the GPU).

This copies the tensor to CPU memory so it's usable with CPUs.

In [216]:
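# Calling .numpy() directly on a GPU tensor raises an error, so copy it to the CPU first
tensor_on_gpu = torch.rand(5).to(device)  # only lands on the GPU if one is available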
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu, type(tensor_back_on_cpu)

(array([0.27611536, 0.83956444, 0.15627259, 0.10720509, 0.72614056],
       dtype=float32),
 numpy.ndarray)