# 00. PyTorch Fundamentals

Course video - [Source](https://www.youtube.com/watch?v=Z_ikDlimN6A)

GitHub Repo - [Source](https://github.com/Yer-Marti/PyTorch-Course)

PyTorch Doc - [Source](https://pytorch.org/docs/stable/index.html)

## Contents

* [Contents](#scrollTo=7br1tYvaXhg_&line=1&uniqifier=1)
* [Import libraries](#scrollTo=4BDE9GO3Xcvg&line=1&uniqifier=1)
* [Random tensors](#scrollTo=pdcrYYcCggBG&line=1&uniqifier=1)
* [Zeros and ones](#scrollTo=2iiW866xYhgG)
* [Creating a range of tensors](#scrollTo=bS9KEV7uZU1y)
* [Tensor datatypes](#scrollTo=Cv39Y3TXai0Z)
* [Retrieving information of the tensors](#scrollTo=jxo-D0RufSRn)
* [Manipulating tensors (tensor operations)](#scrollTo=U_xhnFhQha36)
* [Errors with shape](#scrollTo=DQQTLx6KkgG7)
* [Finding the min, max, mean, sum, etc (tensor aggregation)](#scrollTo=AnmV7qA6qt1H)
* [Finding the positional min and max](#scrollTo=mXtS-f2zzk1U)
* [Reshaping, stacking, squeezing and unsqueezing tensors](#scrollTo=mceJZ58a0-6B)
* [Indexing (selecting data from tensors)](#scrollTo=KpaJ_2lMZZWv)
* [PyTorch tensors and NumPy](#scrollTo=SQW-KQ2gsu-v)
* [Reproducibility](#scrollTo=J0je9OiWxKVX)
* [Running tensors and PyTorch objects on the GPUs](#scrollTo=ww8xVd6Tz3Hd)
* [Exercises](#scrollTo=6svxlX516Dip)

## Import libraries

Import the libraries we are going to use:

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Introduction to Tensors

### Creating tensors

Tensors are created with `torch.tensor()`:

In [None]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# Retrieving the tensor as a Python int
scalar.item()

7

In [None]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

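> **Note:** for a tensor written as nested lists, the number of dimensions equals the nesting depth, i.e. the number of opening square brackets at the start of its printed form.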

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX
MATRIX = torch.tensor([[7, 8],
                      [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[0]

tensor([7, 8])

`MATRIX.shape` returns that the matrix is 2x2.

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])

TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

`TENSOR.ndim` returns the number of **dimensions** in the tensor. In other words, `TENSOR.shape` will return 3 values.

In [None]:
TENSOR.ndim

3

In this case, the first dimension is `1` because the first square bracket has a single element. The two remaining dimensions state that said element is a 3x3 matrix.

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

Indexing the first element (`TENSOR[0]`) returns the 3x3 matrix stored in the first position, so the output has one pair of square brackets fewer.

In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

Example to test tensors:

In [None]:
TENSOR_TEST = torch.tensor([[[1, 2, 3, 7]],
                            [[4, 5, 6, 8]]])

TENSOR_TEST

tensor([[[1, 2, 3, 7]],

        [[4, 5, 6, 8]]])

In [None]:
TENSOR_TEST.ndim

3

In [None]:
TENSOR_TEST.shape

torch.Size([2, 1, 4])

In [None]:
TENSOR_TEST[1]

tensor([[4, 5, 6, 8]])
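As a quick check of how `ndim` and `shape` relate, here is a minimal sketch (the variable name `t` is only illustrative). It shows that `ndim` is the number of entries in `shape`, and that indexing the first dimension removes it.

```python
import torch

# Illustrative tensor: 2 blocks, each holding 1 row of 4 values
t = torch.tensor([[[1, 2, 3, 7]],
                  [[4, 5, 6, 8]]])

print(t.ndim)        # number of dimensions
print(t.shape)       # size of each dimension
print(t[0].shape)    # indexing dim 0 drops that dimension

# ndim is always the number of entries in shape
assert t.ndim == len(t.shape)
```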

### Random tensors

Why?

They are important because most neural networks learn by starting with tensors full of random values and then adjusting those values so that they represent the data better.

`Start with random values -> observe data -> manipulate values -> observe data -> manipulate values`

In [None]:
# Creating a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.8549, 0.5509, 0.2868, 0.2063],
        [0.4451, 0.3593, 0.7204, 0.0731],
        [0.9699, 0.1078, 0.8829, 0.4132]])

In [None]:
random_tensor.ndim

2

In [None]:
# Creating a tensor with the shape of a typical image
random_image_size_tensor = torch.rand(size=(224, 224, 3)) # height, width, color channels (R, G, B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and ones

In [None]:
# Creating an all zeros tensor
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Creating an all ones tensor
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.dtype

torch.float32

In [None]:
random_tensor.dtype

torch.float32

### Creating a range of tensors

In [None]:
# Use torch.arange() (torch.range() is deprecated)
one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# Creating tensors from other tensors
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [None]:
one_to_ten.shape, ten_zeros.shape

(torch.Size([10]), torch.Size([10]))

### Tensor datatypes

Tensor datatypes are behind one of the 3 most common errors we are going to run into in PyTorch and Deep Learning:

1. Tensors don't have the correct `datatype`
2. Tensors don't have the correct `shape`
3. Tensors aren't on the correct `device`

The datatype determines the numerical precision: how many bits are used to store each value (for example 16, 32 or 64).

In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float32, # Datatype of the tensor
                               device=None,         # None uses the default device (CPU); can also be "cuda"
                               requires_grad=False) # Whether PyTorch should track gradients for this tensor
float_32_tensor

tensor([3., 6., 9.])

In [None]:
float_32_tensor.dtype

torch.float32

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

Operations between tensors of different datatypes:

In [None]:
res_tensor = float_16_tensor * float_32_tensor
res_tensor, res_tensor.shape, res_tensor.ndim, res_tensor.dtype

(tensor([ 9., 36., 81.]), torch.Size([3]), 1, torch.float32)

In [None]:
long_tensor = torch.tensor([3, 6, 9], dtype=torch.long)
long_tensor * float_16_tensor

tensor([ 9., 36., 81.], dtype=torch.float16)
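The results above follow PyTorch's type-promotion rules. As a hedged sketch (the variable names `f16`, `f32`, `i64` and `as_f32` are illustrative), `torch.result_type` can be used to ask which datatype an operation between two tensors would produce, without running the operation itself:

```python
import torch

f16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
f32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
i64 = torch.tensor([3, 6, 9], dtype=torch.long)

# Ask PyTorch which dtype an operation between two tensors would produce
print(torch.result_type(f16, f32))  # float16 combined with float32
print(torch.result_type(i64, f16))  # int64 combined with float16

# Convert explicitly when a specific dtype is required
as_f32 = i64.to(torch.float32)
print(as_f32.dtype)
```

When a model expects a particular datatype, converting explicitly with `.to()` or `.type()` is usually clearer than relying on promotion.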

### Retrieving information of the tensors

1. To get the **datatype** of a tensor, we use `tensor.dtype`
1. To get the **shape** of a tensor, we use `tensor.shape`
1. To get the **device** of a tensor, we use `tensor.device`

In [None]:
# Creating the tensor
some_tensor = torch.rand(size=(3, 4)) # the `size=` keyword is optional: torch.rand(3, 4) is equivalent

# Retrieving its attributes
print(some_tensor)
print(f"Datatype: {some_tensor.dtype}\nShape: {some_tensor.shape}\nDevice: {some_tensor.device}")

tensor([[0.3430, 0.6513, 0.9604, 0.8022],
        [0.7766, 0.4986, 0.7709, 0.9642],
        [0.0956, 0.8524, 0.5605, 0.5408]])
Datatype: torch.float32
Shape: torch.Size([3, 4])
Device: cpu


### Manipulating tensors (tensor operations)

Operations include:

* Addition
* Subtraction
* Multiplication (of elements)
* Division
* Matrix multiplication

In [None]:
# Creating the tensor
tensor = torch.tensor([1, 2, 3])
tensor

tensor([1, 2, 3])

In [None]:
# Sum
tensor + 10

tensor([11, 12, 13])

In [None]:
# Subtract
tensor - 10

tensor([-9, -8, -7])

In [None]:
# Multiply
tensor * 10

tensor([10, 20, 30])

In [None]:
# Divide
tensor / 2

tensor([0.5000, 1.0000, 1.5000])

In [None]:
# Testing functions of PyTorch
torch.mul(tensor, 10)

tensor([10, 20, 30])

#### Matrix multiplication

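Two main ways of multiplying tensors:

1. Element-wise multiplication (each element is multiplied by the element in the same position, or by a scalar)
2. Matrix multiplication (rows of the first tensor are combined with columns of the second through dot products)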

In [None]:
# Element-wise
print(f"{tensor} * {tensor} = {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3]) = tensor([1, 4, 9])


In [None]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

By hand:

In [None]:
tensor[0] * tensor[0] + tensor[1] * tensor[1] + tensor[2] * tensor[2]

tensor(14)

We can compare the time necessary for each method:

In [None]:
%%time
# By hand
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 1.34 ms, sys: 0 ns, total: 1.34 ms
Wall time: 1.23 ms


In [None]:
%%time
# PyTorch
torch.matmul(tensor, tensor)

CPU times: user 280 µs, sys: 1 µs, total: 281 µs
Wall time: 199 µs


tensor(14)
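The difference between the two kinds of multiplication is easier to see on a 2-D tensor. The following is a small sketch with an illustrative matrix `A` (not taken from the course material):

```python
import torch

A = torch.tensor([[1, 2],
                  [3, 4]])

elementwise = A * A   # multiplies matching positions
matrix_prod = A @ A   # rows of the left operand dotted with columns of the right

print(elementwise)
print(matrix_prod)
```

The two results have the same shape here but different values, which is why mixing up `*` and `@` can go unnoticed with square matrices.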

### Errors with shape: one of the most common errors in Deep Learning

1. The **inner dimensions** must match:
* `(3, 2) @ (3, 2)` won't work
* `(2, 3) @ (3, 2)` will work
* `(3, 2) @ (2, 3)` will work

2. The resulting matrix has the `shape` of the **outer dimensions**:
* `(2, 3) @ (3, 2) -> (2, 2)`
* `(3, 2) @ (2, 3) -> (3, 3)`
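The two rules above can be sketched in plain Python. The helper `matmul_shape` is hypothetical (it is not part of PyTorch) and only handles 2-D shapes:

```python
def matmul_shape(shape_a, shape_b):
    """Return the shape of A @ B for two 2-D shapes, or raise ValueError."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Inner dimensions (columns of A, rows of B) must match
    if cols_a != rows_b:
        raise ValueError(f"inner dimensions differ: {shape_a} @ {shape_b}")
    # The result takes the outer dimensions
    return (rows_a, cols_b)


print(matmul_shape((2, 3), (3, 2)))
print(matmul_shape((3, 2), (2, 3)))
```

Calling `matmul_shape((3, 2), (3, 2))` raises `ValueError`, mirroring the first rule.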

In [None]:
tensor @ tensor # Same as torch.matmul()

tensor(14)

In [None]:
torch.matmul(torch.rand(2, 3), torch.rand(3, 2))

tensor([[1.0062, 0.6560],
        [1.0866, 0.6202]])

In [None]:
# Shape for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

torch.mm(tensor_A, tensor_B) # torch.mm multiplies 2-D matrices only (no broadcasting); here it behaves like torch.matmul

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [None]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix problems with the `shape` of tensors, we can manipulate the `shape` of one tensor using **transpose**.

A **transpose** switches axes or dimensions of a given tensor.

In [None]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [None]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [None]:
# Now we can multiply the tensors

print(f"Original: A={tensor_A.shape}, B={tensor_B.shape}")
print(f"New: A={tensor_A.shape} (not changed), B={tensor_B.T.shape}")

print(f"\nResult: {torch.mm(tensor_A, tensor_B.T)}")

Original: A=torch.Size([3, 2]), B=torch.Size([3, 2])
New: A=torch.Size([3, 2]) (not changed), B=torch.Size([2, 3])

Result: tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])


Better visualization for [Matrix Multiplication](http://matrixmultiplication.xyz).

### Finding the min, max, mean, sum, etc (tensor aggregation)

In [None]:
# Create a tensor
x = torch.arange(0, 100, 10)
x, x.dtype

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), torch.int64)

In [None]:
# Find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
# Find the max
torch.max(x), x.max()

(tensor(90), tensor(90))

> **Note:** tensor `x` has datatype `torch.int64` (`long`), as shown above. `mean()` only accepts floating-point or complex tensors, so calling it on `x` directly raises an error; the tensor is converted to `float32` first. Datatype errors of this kind are common in practice.

In [None]:
# Find the mean
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
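To see the datatype error itself, here is a minimal sketch. It assumes a recent PyTorch version, where `mean()` on an integer tensor raises a `RuntimeError`; the names `mean_failed` and `mean_value` are illustrative.

```python
import torch

x = torch.arange(0, 100, 10)  # int64 by default

try:
    x.mean()
    mean_failed = False
except RuntimeError as err:
    # Integer tensors are rejected by mean()
    mean_failed = True
    print(f"mean() on {x.dtype} failed: {err}")

# Converting to a floating-point dtype first makes the call valid
mean_value = x.float().mean()
print(mean_value)
```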

In [None]:
# Find the sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

### Finding the positional min and max

In [None]:
# Create a tensor
x = torch.arange(1, 101, 10)
x, x.dtype

(tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91]), torch.int64)

In [None]:
# Find the position in tensor that has the minimum value
torch.argmin(x), x.argmin()

(tensor(0), tensor(0))

In [None]:
x[0]

tensor(1)

In [None]:
# Find the position in tensor that has the maximum value
torch.argmax(x), x.argmax()

(tensor(9), tensor(9))

In [None]:
x[-1]

tensor(91)
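On tensors with more than one dimension, `argmax()` and `argmin()` without arguments return a position in the flattened tensor. The sketch below uses an illustrative matrix `m` to show how to recover the row and column, and how the `dim` argument changes the result:

```python
import torch

m = torch.tensor([[1, 9, 3],
                  [4, 2, 8]])

flat_index = m.argmax().item()               # position in the flattened tensor
row, col = divmod(flat_index, m.shape[1])    # convert back to (row, column)
print(flat_index, (row, col))

print(m.argmax(dim=0))  # row index of the max in each column
print(m.argmax(dim=1))  # column index of the max in each row
```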

### Reshaping, stacking, squeezing and unsqueezing tensors

* Reshape - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape, sharing the same memory as the original tensor
* Stack - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way

In [None]:
# Creating a tensor
x = torch.arange(1., 10.)
x, x.shape, x.ndim

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]), 1)

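> **Note:** in the examples below, each reshape turns the 1-D tensor into a 2-D one, so `ndim` goes from 1 to 2. In general, `reshape` can produce any number of dimensions, as long as the product of the new `shape` equals the number of elements in the original tensor (here 9).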

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1, 9)
print(f"Tensor: {x_reshaped}, Shape: {x_reshaped.shape}, reshape factor 1 * 9 = 9, Dim={x_reshaped.ndim}")

x_reshaped = x.reshape(9, 1)
print(f"\nTensor: {x_reshaped}, Shape: {x_reshaped.shape}, reshape factor 9 * 1 = 9, Dim={x_reshaped.ndim}")

x_reshaped = x.reshape(3, 3)
print(f"\nTensor: {x_reshaped}, Shape: {x_reshaped.shape}, reshape factor 3 * 3 = 9, Dim={x_reshaped.ndim}")

Tensor: tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), Shape: torch.Size([1, 9]), reshape factor 1 * 9 = 9, Dim=2

Tensor: tensor([[1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]]), Shape: torch.Size([9, 1]), reshape factor 9 * 1 = 9, Dim=2

Tensor: tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]]), Shape: torch.Size([3, 3]), reshape factor 3 * 3 = 9, Dim=2


In [None]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

If we change `z`, `x` changes because a view of a tensor shares the same memory as the original.

In [None]:
# Changing 'z'
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
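One practical difference between `view` and `reshape` shows up with non-contiguous tensors. The following is a sketch with illustrative names (`a`, `a_t`, `flat`, `v`): `view` refuses to work when the memory layout is incompatible, while `reshape` falls back to making a copy.

```python
import torch

a = torch.arange(6).reshape(2, 3)
a_t = a.T                      # transposed view, no longer contiguous in memory
print(a_t.is_contiguous())

try:
    a_t.view(6)                # view() needs a compatible memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = a_t.reshape(6)          # reshape() copies the data when it has to
print(view_failed, flat)

# A plain view of a contiguous tensor shares storage with it
v = a.view(3, 2)
print(v.data_ptr() == a.data_ptr())
```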

In [None]:
# Stack tensors along a new dimension (dim=1 puts each copy of x in its own column)
x_stacked = torch.stack([x, x, x, x], dim=1)
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])
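The effect of `dim` is easiest to see by comparing shapes. Here is a small sketch with an illustrative 3-element tensor, which also contrasts `torch.stack` with `torch.cat`:

```python
import torch

x = torch.tensor([1., 2., 3.])

rows = torch.stack([x, x], dim=0)   # new leading dimension: each copy is a row
cols = torch.stack([x, x], dim=1)   # new trailing dimension: each copy is a column
joined = torch.cat([x, x], dim=0)   # no new dimension: tensors are joined end to end

print(rows.shape, cols.shape, joined.shape)
```

`torch.stack` always adds a dimension, whereas `torch.cat` extends an existing one.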

Squeezing removes all dimensions of size `1` from a given tensor.

Documentation [here](https://pytorch.org/docs/stable/generated/torch.squeeze.html).

In [None]:
# Squeezing tensors
y = torch.rand(3, 1, 3, 2, 1)
print(f"Initial tensor: {y.shape}")
y_squeezed = torch.squeeze(y)
print(f"Squeezed tensor: {y_squeezed.shape}")

Initial tensor: torch.Size([3, 1, 3, 2, 1])
Squeezed tensor: torch.Size([3, 3, 2])


Passing a `dim` value squeezes only that dimension, and only if its size is `1`:

In [None]:
# Setting a dim value
y_squeezed = y.squeeze(dim=1)
print(f"With only 1 dimension squeezed: {y_squeezed.shape}")

With only 1 dimension squeezed: torch.Size([3, 3, 2, 1])
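As a short sketch of how `dim` behaves (using an illustrative tensor of zeros with the same shape as above), squeezing a dimension whose size is not `1` leaves the tensor unchanged:

```python
import torch

y = torch.zeros(3, 1, 3, 2, 1)

print(y.squeeze(dim=1).shape)    # dim 1 has size 1, so it is removed
print(y.squeeze(dim=-1).shape)   # the last dim has size 1, so it is removed
print(y.squeeze(dim=0).shape)    # dim 0 has size 3, so nothing changes
```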


Unsqueeze does the opposite: it adds a new dimension of size `1` at the specified position.

Documentation [here](https://pytorch.org/docs/stable/generated/torch.unsqueeze.html).

In [None]:
# Unsqueezing tensors
y = torch.rand(9)
print(f"Initial tensor: {y}")
print(f"Initial shape: {y.shape}")

y_unsqueezed = torch.unsqueeze(y, dim=1)
print(f"\nUnsqueezed tensor: {y_unsqueezed}")
print(f"Unsqueezed shape: {y_unsqueezed.shape}")

Initial tensor: tensor([0.1387, 0.4443, 0.6465, 0.7125, 0.5824, 0.3138, 0.7448, 0.3783, 0.9414])
Initial shape: torch.Size([9])

Unsqueezed tensor: tensor([[0.1387],
        [0.4443],
        [0.6465],
        [0.7125],
        [0.5824],
        [0.3138],
        [0.7448],
        [0.3783],
        [0.9414]])
Unsqueezed shape: torch.Size([9, 1])


A negative `dim` counts from the end: `-1` inserts the new dimension in the last position, `-2` in the second-to-last, and so on.

> **Note:** the input value range is as follows => `[-input.dim() - 1, input.dim() + 1)`

In [None]:
y_negative_unsqueezed = torch.unsqueeze(y, dim=-2)
print(f"\nUnsqueezed tensor: {y_negative_unsqueezed}")
print(f"Unsqueezed shape: {y_negative_unsqueezed.shape}")


Unsqueezed tensor: tensor([[0.1387, 0.4443, 0.6465, 0.7125, 0.5824, 0.3138, 0.7448, 0.3783, 0.9414]])
Unsqueezed shape: torch.Size([1, 9])
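Indexing with `None` is a common shorthand for the same operation. A minimal sketch with an illustrative 1-D tensor:

```python
import torch

y = torch.arange(9)

print(y.unsqueeze(dim=0).shape)    # new dimension in front
print(y.unsqueeze(dim=-1).shape)   # new dimension at the end
print(y[None, :].shape)            # indexing with None also inserts a dimension
print(y[:, None].shape)
```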


Permuting rearranges a tensor's dimensions in a specified order. It returns a view of the original tensor, so the two share memory.

The `dims` argument is a tuple with the new order of the dimension indexes.

Documentation [here](https://pytorch.org/docs/stable/generated/torch.permute.html).

In [None]:
# Permuting tensors
y = torch.rand(3, 4, 2)
print(f"Initial tensor: {y}")
print(f"Initial shape: {y.shape}")

y_permuted = torch.permute(y, (2, 0, 1))
print(f"\nPermuted tensor: {y_permuted}")
print(f"Permuted shape: {y_permuted.shape}")

Initial tensor: tensor([[[0.6474, 0.6234],
         [0.5790, 0.2778],
         [0.1297, 0.1720],
         [0.0913, 0.3035]],

        [[0.6541, 0.0325],
         [0.2897, 0.3572],
         [0.1267, 0.3263],
         [0.4092, 0.5813]],

        [[0.3729, 0.7374],
         [0.2611, 0.3989],
         [0.8837, 0.8601],
         [0.9623, 0.1004]]])
Initial shape: torch.Size([3, 4, 2])

Permuted tensor: tensor([[[0.6474, 0.5790, 0.1297, 0.0913],
         [0.6541, 0.2897, 0.1267, 0.4092],
         [0.3729, 0.2611, 0.8837, 0.9623]],

        [[0.6234, 0.2778, 0.1720, 0.3035],
         [0.0325, 0.3572, 0.3263, 0.5813],
         [0.7374, 0.3989, 0.8601, 0.1004]]])
Permuted shape: torch.Size([2, 3, 4])


> **Note:** permutation is often used for images, for example to move the color channels from the last dimension to the first.

> Example:

In [None]:
image_tensor = torch.rand(224, 224, 3) # height, width, color_channels
print(f"Original image shape: {image_tensor.shape}")

# We want to set the color_channels to be the first dimension
permuted_image_tensor = image_tensor.permute(2, 0, 1)
print(f"\nPermuted image tensor shape: {permuted_image_tensor.shape}")

Original image shape: torch.Size([224, 224, 3])

Permuted image tensor shape: torch.Size([3, 224, 224])


Permuted tensor shares memory with the original:

In [None]:
permuted_image_tensor[0, 0, 0] = 16666

image_tensor[0, 0, 0], permuted_image_tensor[0, 0, 0]

(tensor(16666.), tensor(16666.))

### Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# Creating a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Indexing the first dimension (dim=0)
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# Indexing the second dimension (dim=1)
x[0][0], x[0, 0]

(tensor([1, 2, 3]), tensor([1, 2, 3]))

In [None]:
# Indexing the third dimension, in this example, getting a value (dim=2)
x[0][0][0], x[0, 0, 0]

(tensor(1), tensor(1))

In [None]:
# Indexing all with ':'
x[:, :, 0]

tensor([[1, 4, 7]])

In [None]:
index_test = torch.arange(1, 28).reshape(3, 3, 3) # values 1 to 27

print(f"Tensor: {index_test}, Shape: {index_test.shape}")

index_test[1, :, 1]

Tensor: tensor([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[10, 11, 12],
         [13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24],
         [25, 26, 27]]]), Shape: torch.Size([3, 3, 3])


tensor([11, 14, 17])

### PyTorch tensors and NumPy

NumPy is a popular Python library for scientific and numerical computing, and PyTorch is commonly used alongside it.

Because of this, PyTorch has functionality to interact with it.

* Change data in NumPy to PyTorch tensor -> `torch.from_numpy(ndarray)`
* Change data in PyTorch tensor to NumPy -> `torch.Tensor.numpy()`

In [None]:
# NumPy array to tensor
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

> **Note:** NumPy's default datatype is `float64` whereas PyTorch's is `float32`.

In [None]:
array.dtype

dtype('float64')

In [None]:
torch.arange(1.0, 8.0).dtype

torch.float32

Nonetheless, we can use the `type()` method if we need to change the tensor's datatype:

In [None]:
tensor.type(torch.float32).dtype

torch.float32

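A tensor created with `from_numpy()` shares memory with the source array. The tensor below stays unchanged only because `array = array + 1` creates a new array and rebinds the name; an in-place change such as `array += 1` would also show up in the tensor.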

In [None]:
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In the opposite direction the datatype is also preserved: a `float32` tensor becomes a `float32` array.

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

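The same goes for `numpy()`: for a CPU tensor, the array shares the tensor's memory. The array below stays unchanged only because `tensor = tensor + 1` creates a new tensor; an in-place operation such as `tensor.add_(1)` would also change the array.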

In [None]:
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
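Whether data is shared depends on whether a change happens in place. The sketch below uses illustrative names (`shared`, `copied`, `t`, `n`) and in-place operations to show that `torch.from_numpy()` and `Tensor.numpy()` share memory on the CPU, while `torch.tensor()` makes an independent copy:

```python
import numpy as np
import torch

array = np.ones(3)
shared = torch.from_numpy(array)   # shares the array's memory
copied = torch.tensor(array)       # always copies the data

array += 1                         # in-place change to the NumPy array
print(shared)                      # follows the in-place change
print(copied)                      # keeps the original values

t = torch.ones(3)
n = t.numpy()                      # CPU tensor and array share memory
t.add_(1)                          # in-place change to the tensor
print(n)
```

If an independent copy is needed, `torch.tensor(array)` or `tensor.clone().numpy()` avoids the shared memory.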

### Reproducibility (trying to take random out of random)

How a neural network learns:

`start with random values -> tensor operations -> update the values so they better represent the data -> again -> again -> ...`

The **random seed** is what makes this randomness reproducible in neural networks and PyTorch: setting it fixes the sequence of "random" numbers that gets generated.

In [None]:
# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A == random_tensor_B)

tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Making reproducible random tensors

# Set a random seed
RANDOM_SEED = 666

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C == random_tensor_D)

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
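`torch.manual_seed()` sets the global seed. As an alternative sketch (the variable names are illustrative), a `torch.Generator` carries its own seed and can be passed to `torch.rand` without touching the global state:

```python
import torch

# A local generator keeps its own seed, separate from the global one
gen_a = torch.Generator().manual_seed(666)
gen_b = torch.Generator().manual_seed(666)

sample_a = torch.rand(3, 4, generator=gen_a)
sample_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(sample_a, sample_b))

# The next draw from the same generator continues the sequence
sample_c = torch.rand(3, 4, generator=gen_a)
print(torch.equal(sample_a, sample_c))
```

This also illustrates why the seed is reset before each `torch.rand` call above: a seed fixes the whole sequence of draws, not every individual draw.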


### Running tensors and PyTorch objects on the GPUs

GPUs offer faster numerical computation, thanks to CUDA, NVIDIA hardware and PyTorch working together behind the scenes.

#### 1. Getting a GPU

* Use Google Colab for a free GPU (can upgrade as well)
* Use your own GPU (investment for a good one and requires setup)
* Use cloud computing (GCP, AWS, Azure... are services that allow renting computers in the cloud)

In [None]:
!nvidia-smi

Sat Feb 24 12:29:38 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   68C    P0              29W /  70W |    121MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

#### 2. Check for GPU access with PyTorch

In [None]:
import torch
torch.cuda.is_available()

True

Since PyTorch can run computations on either the CPU or the GPU, it's best practice to set up device-agnostic code.

Documentation for [best practices](https://pytorch.org/docs/stable/notes/cuda.html#best-practices).

This basically tells PyTorch to use a GPU if one is available, since it's faster that way, and to fall back to the CPU otherwise.

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count number of devices
torch.cuda.device_count()

1
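With a `device` variable in place, tensors can also be created directly on that device. Here is a minimal sketch (the tensor names `a`, `b` and `result` are illustrative) that runs unchanged with or without a GPU:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create tensors directly on the chosen device instead of moving them later
a = torch.ones(2, 3, device=device)
b = torch.rand(3, 2, device=device)

result = a @ b   # both operands live on the same device
print(result.device)
```

Operations between tensors on different devices raise an error, so keeping everything tied to a single `device` variable avoids that class of bug.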

#### 3. Putting tensors and models on the GPU

Computation is generally faster on the GPU, so we want our tensors and models to live there.

In [None]:
# Create a tensor (default is on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
tensor.device

device(type='cpu')

In [None]:
# Move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

#### 4. Moving tensors back to the CPU

In [None]:
# If tensor is on GPU, can't transform it to NumPy
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [None]:
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])
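The conversion can be wrapped in a small helper. The function `to_numpy` below is hypothetical (it is not part of PyTorch) and is only a sketch; it also calls `detach()` so that it works for tensors that track gradients.

```python
import torch

def to_numpy(t):
    """Hypothetical helper: convert a tensor on any device to a NumPy array."""
    # detach() drops gradient tracking; cpu() is a no-op for CPU tensors
    return t.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([1, 2, 3], device=device)
print(to_numpy(t))
```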

### Exercises

[Source](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/00_pytorch_fundamentals.ipynb)

In [None]:
import torch
import numpy as np

2. Create a random tensor with shape `(7, 7)`.

In [None]:
ex2_random_tensor = torch.rand(7, 7)
ex2_random_tensor, ex2_random_tensor.shape

(tensor([[0.8549, 0.5509, 0.2868, 0.2063, 0.4451, 0.3593, 0.7204],
         [0.0731, 0.9699, 0.1078, 0.8829, 0.4132, 0.7572, 0.6948],
         [0.5209, 0.5932, 0.8797, 0.6286, 0.7653, 0.1132, 0.8559],
         [0.6721, 0.6267, 0.5691, 0.7437, 0.9592, 0.3887, 0.2214],
         [0.3742, 0.1953, 0.7405, 0.2529, 0.2332, 0.9314, 0.9575],
         [0.5575, 0.4134, 0.4355, 0.7369, 0.0331, 0.0914, 0.8994],
         [0.9936, 0.4703, 0.1049, 0.5137, 0.2674, 0.4990, 0.7447]]),
 torch.Size([7, 7]))

3. Perform a matrix multiplication on the tensor from 2 with another random tensor with shape `(1, 7)` (hint: you may have to transpose the second tensor).

In [None]:
ex3_random_tensor = torch.rand(1, 7)

# Inner dimensions must match, so we transpose the new tensor
ex3_transposed = torch.transpose(ex3_random_tensor, 1, 0)

ex3_result = torch.mm(ex2_random_tensor, ex3_transposed)

ex3_result, ex3_result.shape

(tensor([[1.6914],
         [1.7581],
         [2.0906],
         [1.8975],
         [1.7585],
         [1.7961],
         [1.8900]]),
 torch.Size([7, 1]))

4. Set the random seed to `0` and do exercises 2 & 3 over again.

In [None]:
EX4_RANDOM_SEED = 0
torch.manual_seed(EX4_RANDOM_SEED)

<torch._C.Generator at 0x7aadebea9cb0>

In [None]:
# EX 2

ex2_random_tensor = torch.rand(7, 7)
ex2_random_tensor, ex2_random_tensor.shape

(tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074, 0.6341, 0.4901],
         [0.8964, 0.4556, 0.6323, 0.3489, 0.4017, 0.0223, 0.1689],
         [0.2939, 0.5185, 0.6977, 0.8000, 0.1610, 0.2823, 0.6816],
         [0.9152, 0.3971, 0.8742, 0.4194, 0.5529, 0.9527, 0.0362],
         [0.1852, 0.3734, 0.3051, 0.9320, 0.1759, 0.2698, 0.1507],
         [0.0317, 0.2081, 0.9298, 0.7231, 0.7423, 0.5263, 0.2437],
         [0.5846, 0.0332, 0.1387, 0.2422, 0.8155, 0.7932, 0.2783]]),
 torch.Size([7, 7]))

In [None]:
# EX 3

ex3_random_tensor = torch.rand(1, 7)

# Inner dimensions must match, so we transpose the new tensor
ex3_transposed = torch.transpose(ex3_random_tensor, 1, 0)

ex3_result = torch.mm(ex2_random_tensor, ex3_transposed)

ex3_result, ex3_result.shape

(tensor([[1.8542],
         [1.9611],
         [2.2884],
         [3.0481],
         [1.7067],
         [2.5290],
         [1.7989]]),
 torch.Size([7, 1]))

5. Speaking of random seeds, we saw how to set it with `torch.manual_seed()` but is there a GPU equivalent? (hint: you'll need to look into the documentation for `torch.cuda` for this one). If there is, set the GPU random seed to `1234`.

In [None]:
EX5_RANDOM_SEED = 1234

# torch.cuda.seed() would instead seed the current GPU with a random number

torch.cuda.manual_seed(EX5_RANDOM_SEED)

6. Create two random tensors of shape `(2, 3)` and send them both to the GPU (you'll need access to a GPU for this). Set `torch.manual_seed(1234)` when creating the tensors (this doesn't have to be the GPU random seed).

In [None]:
EX6_RANDOM_SEED = 1234

torch.manual_seed(EX6_RANDOM_SEED)
ex6_random_tensor1 = torch.rand(2, 3)

torch.manual_seed(EX6_RANDOM_SEED)
ex6_random_tensor2 = torch.rand(2, 3)

device = "cuda"
# .to() returns a copy on the target device; the original tensors stay on the CPU
ex6_random_tensor1.to(device), ex6_random_tensor2.to(device)

(tensor([[0.0290, 0.4019, 0.2598],
         [0.3666, 0.0583, 0.7006]], device='cuda:0'),
 tensor([[0.0290, 0.4019, 0.2598],
         [0.3666, 0.0583, 0.7006]], device='cuda:0'))

7. Perform a matrix multiplication on the tensors you created in 6 (again, you may have to adjust the shapes of one of the tensors).

In [None]:
ex6_tensor2_transposed = torch.transpose(ex6_random_tensor2, 1, 0)

ex7_res = torch.mm(ex6_random_tensor1, ex6_tensor2_transposed)
ex7_res

tensor([[0.2299, 0.2161],
        [0.2161, 0.6287]])

8. Find the maximum and minimum values of the output of 7.

In [None]:
ex7_res.max(), ex7_res.min()

(tensor(0.6287), tensor(0.2161))

9. Find the maximum and minimum index values of the output of 7.

In [None]:
ex7_res.argmax(), ex7_res.argmin()

(tensor(3), tensor(1))

10. Make a random tensor with shape `(1, 1, 1, 10)` and then create a new tensor with all the 1 dimensions removed to be left with a tensor of shape `(10)`. Set the seed to `7` when you create it and print out the first tensor and its shape as well as the second tensor and its shape.

In [None]:
torch.manual_seed(7)

ex10_random_tensor = torch.rand(1, 1, 1, 10)
print(f"Original tensor: {ex10_random_tensor}, Shape: {ex10_random_tensor.shape}")

ex10_tensor_squeezed = ex10_random_tensor.squeeze()
print(f"Squeezed tensor: {ex10_tensor_squeezed}, Shape: {ex10_tensor_squeezed.shape}")

Original tensor: tensor([[[[0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297,
           0.3653, 0.8513]]]]), Shape: torch.Size([1, 1, 1, 10])
Squeezed tensor: tensor([0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297, 0.3653,
        0.8513]), Shape: torch.Size([10])
