In [1]:
import torch
torch.__version__

'2.0.1+cpu'

## Introduction to tensors

Now we've got PyTorch imported, it's time to learn about tensors.

Tensors are the fundamental building block of machine learning.

Their job is to represent data in a numerical way.

For example, you could represent an image as a tensor with shape `[3, 224, 224]` which would mean `[colour_channels, height, width]`, as in the image has `3` colour channels (red, green, blue), a height of `224` pixels and a width of `224` pixels.

![example of going from an input image to a tensor representation of the image, image gets broken down into 3 colour channels as well as numbers to represent the height and width](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-tensor-shape-example-of-image.png)

In tensor-speak (the language used to describe tensors), the tensor would have three dimensions, one for `colour_channels`, `height` and `width`.
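As a quick illustration of that shape, here is a minimal sketch that uses random numbers as a stand-in for a real image (the variable name `image` and the use of `torch.rand` are assumptions for the example, not part of the original walkthrough).

```python
import torch

# A random stand-in for an RGB image: [colour_channels, height, width]
image = torch.rand(3, 224, 224)

print(image.ndim)      # number of dimensions
print(image.shape)     # size of each dimension
print(image[0].shape)  # the first colour channel on its own
```

Indexing the first dimension picks out a single colour channel, which is why the remaining shape only has a height and a width.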

But we're getting ahead of ourselves.

Let's learn more about tensors by coding them.


### Creating tensors


In [2]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [3]:
scalar.shape

torch.Size([])

In [4]:
# Vector
vector = torch.tensor([7, 7, 4])
vector

tensor([7, 7, 4])

In [5]:
# Check shape of vector
vector.shape

torch.Size([3])

In [6]:
# Matrix
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [7]:
MATRIX.shape

torch.Size([2, 2])

In [8]:
# Tensor
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]],
                       [[-1, -2, -3],
                        [-3, -6, -9],
                        [-2, -4, -5]]])
TENSOR

tensor([[[ 1,  2,  3],
         [ 3,  6,  9],
         [ 2,  4,  5]],

        [[-1, -2, -3],
         [-3, -6, -9],
         [-2, -4, -5]]])

And what about its shape?

In [9]:
# Check shape of TENSOR
TENSOR.shape

torch.Size([2, 3, 3])

In [10]:
# Tensor
TENSOR = torch.tensor([[[[1.0, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]],
                       [[-1, -2, -3],
                        [-3, -6, -9],
                        [-2, -4, -5]]],
                      [[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]],
                       [[-1, -2, -3],
                        [-3, -6, -9],
                        [-2, -4, -5]]]])
TENSOR

tensor([[[[ 1.,  2.,  3.],
          [ 3.,  6.,  9.],
          [ 2.,  4.,  5.]],

         [[-1., -2., -3.],
          [-3., -6., -9.],
          [-2., -4., -5.]]],


        [[[ 1.,  2.,  3.],
          [ 3.,  6.,  9.],
          [ 2.,  4.,  5.]],

         [[-1., -2., -3.],
          [-3., -6., -9.],
          [-2., -4., -5.]]]])

In [11]:
TENSOR.shape

torch.Size([2, 2, 3, 3])
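A handy way to compare the tensors created so far is the `ndim` attribute, which counts dimensions (roughly, the number of nested square brackets). The sketch below recreates the smaller tensors so it runs on its own; `ndim` and `item()` are not used elsewhere in this section, so treat them as an aside.

```python
import torch

scalar = torch.tensor(7)
vector = torch.tensor([7, 7, 4])
matrix = torch.tensor([[7, 8],
                       [9, 10]])

# Print the number of dimensions and the shape of each tensor
for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")

# A single-element tensor can be turned back into a plain Python number
value = scalar.item()
print(value)
```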

In [12]:
# create tensor from np.array

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

pt_arr = torch.tensor(arr)
pt_arr

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)
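The tensor takes its datatype from the NumPy array, so whether you see `torch.int32` or `torch.int64` here depends on your platform and NumPy version. It's also worth knowing that `torch.tensor()` copies the data, while `torch.from_numpy()` shares memory with the array. A small sketch of the difference (the names `copied` and `shared` are just for illustration):

```python
import numpy as np
import torch

arr = np.array([[1, 2, 3], [4, 5, 6]])

copied = torch.tensor(arr)      # makes a copy of the data
shared = torch.from_numpy(arr)  # shares memory with arr

arr[0, 0] = 100  # modify the NumPy array in place

print(copied[0, 0])  # still the original value
print(shared[0, 0])  # reflects the change made to arr
```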

In [13]:
# tensor with zeros

torch.zeros(size=(2, 3, 4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [14]:
# tensor with ones

torch.ones(size=(2, 3, 4))

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

### Tensor indexing

In [15]:
TENSOR

tensor([[[[ 1.,  2.,  3.],
          [ 3.,  6.,  9.],
          [ 2.,  4.,  5.]],

         [[-1., -2., -3.],
          [-3., -6., -9.],
          [-2., -4., -5.]]],


        [[[ 1.,  2.,  3.],
          [ 3.,  6.,  9.],
          [ 2.,  4.,  5.]],

         [[-1., -2., -3.],
          [-3., -6., -9.],
          [-2., -4., -5.]]]])

In [16]:
TENSOR[1][1][0]

tensor([-1., -2., -3.])

In [17]:
TENSOR[1, 1, 0]

tensor([-1., -2., -3.])

### Tensor slices

In [18]:
TENSOR[1, 1, 0:2]

tensor([[-1., -2., -3.],
        [-3., -6., -9.]])
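Slices can be used on any dimension, not just the last one you index. Here's a small sketch on a fresh tensor (the tensor `t` and the variable names are made up for this example) showing `:` to keep a whole dimension and `...` as shorthand for "all the remaining leading dimensions".

```python
import torch

# Values 0..23 arranged as 2 blocks of 3 rows and 4 columns
t = torch.arange(24).reshape(2, 3, 4)

first_rows = t[:, 0]           # first row of every block -> shape [2, 4]
last_column = t[:, :, -1]      # last column of every row -> shape [2, 3]
same_last_column = t[..., -1]  # "..." stands in for the leading ":" slices
corner = t[0, :2, :2]          # top-left 2x2 corner of the first block

print(first_rows.shape, last_column.shape, corner)
```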

### Tensor datatypes

There are many different [tensor datatypes available in PyTorch](https://pytorch.org/docs/stable/tensors.html#data-types).

Some are specific for CPU and some are better for GPU.

Getting to know which is which can take some time.

Generally if you see `torch.cuda` anywhere, the tensor is being used for GPU (since Nvidia GPUs use a computing toolkit called CUDA).

The most common type (and generally the default) is `torch.float32` or `torch.float`.

This is referred to as "32-bit floating point".

But there's also 16-bit floating point (`torch.float16` or `torch.half`) and 64-bit floating point (`torch.float64` or `torch.double`).

And to confuse things even more there's also 8-bit, 16-bit, 32-bit and 64-bit integers.

Let's see how to create some tensors with specific datatypes. We can do so using the `dtype` parameter.

In [19]:
# Default datatype for floating point tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, which infers the datatype from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the current default device (CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded for autograd

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

Aside from shape issues (tensor shapes don't match up), two of the other most common issues you'll come across in PyTorch are datatype and device issues.

For example, one of your tensors is `torch.float32` and the other is `torch.float16` (PyTorch often likes tensors to be the same format).

Or one of your tensors is on the CPU and the other is on the GPU (PyTorch likes calculations between tensors to be on the same device).

We'll see more of this device talk later on.

For now let's create a tensor with `dtype=torch.float16`.

In [20]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16

In [21]:
new_tensor = float_16_tensor.type(torch.float32)
new_tensor.dtype

torch.float32
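To get a feel for what happens when datatypes are mixed, here's a minimal sketch. For simple element-wise maths PyTorch promotes the result to the "larger" type, though other operations can be stricter and raise an error instead. The variable names are assumptions for the example, and `Tensor.to()` is shown as an alternative to `Tensor.type()` for converting.

```python
import torch

half = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
single = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
integers = torch.tensor([3, 6, 9])  # Python ints are inferred as int64

product = half * single      # float16 * float32 is promoted to float32
mixed = integers * single    # int64 * float32 is promoted to float32
converted = half.to(torch.float32)  # explicit conversion, like .type()

print(product.dtype, mixed.dtype, converted.dtype)
```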

## Getting information from tensors

Once you've created tensors (or someone else or a PyTorch module has created them for you), you might want to get some information from them.

We've seen these before but three of the most common attributes you'll want to find out about tensors are:
* `shape` - what shape is the tensor? (some operations require specific shape rules)
* `dtype` - what datatype are the elements within the tensor stored in?
* `device` - what device is the tensor stored on? (usually GPU or CPU)

Let's create a random tensor and find out details about it.

In [22]:
# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.2502, 0.3559, 0.0365, 0.5169],
        [0.9431, 0.2103, 0.2814, 0.3172],
        [0.7069, 0.4229, 0.7224, 0.2116]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


## Manipulating tensors (tensor operations)

In deep learning, data (images, text, video, audio, protein structures, etc) gets represented as tensors.

A model learns by investigating those tensors and performing a series of operations (could be 1,000,000s+) on tensors to create a representation of the patterns in the input data.

These operations are often a wonderful dance between:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

And that's it. Sure there are a few more here and there but these are the basic building blocks of neural networks.

Stacking these building blocks in the right way, you can create the most sophisticated of neural networks (just like lego!).

### Basic operations

Let's start with a few of the fundamental operations, addition (`+`), subtraction (`-`), multiplication (`*`).

They work just as you think they would.

In [23]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [24]:
tensor.shape

torch.Size([3])

In [25]:
# Multiply it by 10
tensor * 10

tensor([10, 20, 30])

Notice how the tensor values above didn't end up being `tensor([110, 120, 130])`. This is because the values inside the tensor don't change unless they're reassigned.

In [26]:
# Tensors don't change unless reassigned
tensor

tensor([1, 2, 3])

Let's subtract a number and this time we'll reassign the `tensor` variable.

In [27]:
# Subtract and reassign
tensor = tensor - 10
tensor

tensor([-9, -8, -7])

In [28]:
# Add and reassign
tensor = tensor + 10
tensor

tensor([1, 2, 3])

In [29]:
# Element-wise multiplication (each element multiplies its equivalent, index 0->0, 1->1, 2->2)
tensor * torch.tensor([4, 5, 6])

tensor([ 4, 10, 18])

In [30]:
tensor

tensor([1, 2, 3])

In [31]:
tensor_1 = torch.tensor([[5, 6, 7],
                        [8, 9, 10]])
tensor_1

tensor([[ 5,  6,  7],
        [ 8,  9, 10]])

In [32]:
tensor_1 + tensor

tensor([[ 6,  8, 10],
        [ 9, 11, 13]])

In [33]:
tensor / tensor

tensor([1., 1., 1.])
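Two things happened quietly in the last couple of cells. Adding a tensor of shape `[3]` to one of shape `[2, 3]` worked because PyTorch *broadcasts* the smaller tensor across the rows of the larger one, and dividing integer tensors with `/` returned floats. A small self-contained sketch (the names `row` and `grid` are just for this example):

```python
import torch

row = torch.tensor([1, 2, 3])          # shape [3]
grid = torch.tensor([[5, 6, 7],
                     [8, 9, 10]])      # shape [2, 3]

summed = grid + row   # row is broadcast across both rows of grid
ratio = row / row     # "/" is true division, so the result is float
floored = row // 2    # "//" is floor division and keeps the integer datatype

print(summed)
print(ratio.dtype, floored.dtype)
```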

### Matrix multiplication (is all you need)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is [matrix multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html).

PyTorch implements matrix multiplication functionality in the [`torch.matmul()`](https://pytorch.org/docs/stable/generated/torch.matmul.html) method.

The main two rules for matrix multiplication to remember are:
1. The **inner dimensions** must match:
  * `(3, 2) @ (3, 2)` won't work
  * `(2, 3) @ (3, 2)` will work
  * `(3, 2) @ (2, 3)` will work
2. The resulting matrix has the shape of the **outer dimensions**:
 * `(2, 3) @ (3, 2)` -> `(2, 2)`
 * `(3, 2) @ (2, 3)` -> `(3, 3)`

> **Note:** "`@`" in Python is the symbol for matrix multiplication.

> **Resource:** You can see all of the rules for matrix multiplication using `torch.matmul()` [in the PyTorch documentation](https://pytorch.org/docs/stable/generated/torch.matmul.html).
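The two rules can be checked quickly with random tensors. This is only a sketch (the names `a`, `b` and the flag `mismatch_failed` are made up for the example), but it shows the inner dimensions deciding whether the operation works and the outer dimensions deciding the output shape.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

ab = a @ b  # (2, 3) @ (3, 2) -> inner dimensions match -> shape (2, 2)
ba = b @ a  # (3, 2) @ (2, 3) -> inner dimensions match -> shape (3, 3)
print(ab.shape, ba.shape)

# (3, 2) @ (3, 2) -> inner dimensions (2 and 3) don't match
mismatch_failed = False
try:
    b @ b
except RuntimeError as error:
    mismatch_failed = True
    print(f"Shape error: {error}")
```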

Let's create a tensor and perform element-wise multiplication and matrix multiplication on it.



In [None]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])

The difference between element-wise multiplication and matrix multiplication is the addition of values.

For our `tensor` variable with values `[1, 2, 3]`:

| Operation | Calculation | Code |
| ----- | ----- | ----- |
| **Element-wise multiplication** | `[1*1, 2*2, 3*3]` = `[1, 4, 9]` | `tensor * tensor` |
| **Matrix multiplication** | `[1*1 + 2*2 + 3*3]` = `[14]` | `tensor.matmul(tensor)` |


In [34]:
# Element-wise multiplication
tensor * tensor

tensor([1, 4, 9])

In [35]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [36]:
# Can also use the "@" symbol for matrix multiplication, though not recommended
tensor @ tensor

tensor(14)

You can do matrix multiplication by hand but it's not recommended.

The in-built `torch.matmul()` method is faster.

In [37]:
%%time
# Matrix multiplication by hand
# (avoid doing operations with for loops at all cost, they are computationally expensive)
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
value

CPU times: total: 0 ns
Wall time: 5.99 ms


tensor(14)

In [38]:
%%time
torch.matmul(tensor, tensor)

CPU times: total: 0 ns
Wall time: 960 µs


tensor(14)

## One of the most common errors in deep learning (shape errors)

Because much of deep learning is multiplying and performing operations on matrices and matrices have a strict rule about what shapes and sizes can be combined, one of the most common errors you'll run into in deep learning is shape mismatches.

In [40]:
# Shapes need to be compatible (the inner dimensions must match)
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B) # (this will error)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

We can make matrix multiplication work between `tensor_A` and `tensor_B` by making their inner dimensions match.

One of the ways to do this is with a **transpose** (switch the dimensions of a given tensor).

You can perform transposes in PyTorch using either:
* `torch.transpose(input, dim0, dim1)` - where `input` is the desired tensor to transpose and `dim0` and `dim1` are the dimensions to be swapped.
* `tensor.T` - where `tensor` is the desired tensor to transpose.

Let's try the latter.

In [41]:
# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [42]:
# View tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [None]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])
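For a 2-D tensor, `torch.transpose(input, 0, 1)` gives the same result as `.T`. One thing to keep in mind is that a transpose is a view of the original data rather than a copy. A minimal sketch, recreating `tensor_B` so it runs on its own (the name `swapped` is an assumption for the example):

```python
import torch

tensor_B = torch.tensor([[7., 10.],
                         [8., 11.],
                         [9., 12.]])

swapped = torch.transpose(tensor_B, 0, 1)  # swap dimension 0 and dimension 1
matches_T = torch.equal(swapped, tensor_B.T)
print(swapped.shape, matches_T)

# The transpose shares memory with the original tensor,
# so writing to it also changes tensor_B
swapped[0, 0] = -1.
print(tensor_B[0, 0])
```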


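You can also use [`torch.mm()`](https://pytorch.org/docs/stable/generated/torch.mm.html), which performs the same matrix multiplication as `torch.matmul()` for 2-D tensors (unlike `torch.matmul()`, it does not broadcast or accept other dimensionalities).

In [43]:
# torch.mm gives the same result as torch.matmul for 2-D matrices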
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
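One caveat worth checking for yourself: `torch.mm()` only accepts 2-D matrices, whereas `torch.matmul()` also handles batches of matrices by broadcasting. The sketch below uses made-up shapes (`batch`, `weights`) to show the difference.

```python
import torch

batch = torch.rand(10, 3, 4)  # a batch of 10 matrices, each 3x4
weights = torch.rand(4, 5)    # a single 4x5 matrix

# matmul broadcasts weights across the batch dimension
batched = torch.matmul(batch, weights)
print(batched.shape)

# mm expects exactly two 2-D tensors, so the 3-D input is rejected
try:
    torch.mm(batch, weights)
    mm_accepts_batch = True
except RuntimeError as error:
    mm_accepts_batch = False
    print(f"torch.mm error: {error}")
```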

### Finding the min, max, mean, sum, etc (aggregation)

Now we've seen a few ways to manipulate tensors, let's run through a few ways to aggregate them (go from more values to fewer values).

First we'll create a tensor and then find the max, min, mean and sum of it.





In [44]:
# Create a tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

Now let's perform some aggregation.

In [45]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error
print(f"Mean: {x.type(torch.float32).mean()}") # won't work without float datatype
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


> **Note:** Some methods such as `torch.mean()` require a floating point (or complex) datatype such as `torch.float32` (the most common); on an integer tensor the operation will fail.

You can also do the same as above with `torch` functions.

In [46]:
torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)

(tensor(90), tensor(0), tensor(45.), tensor(450))
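The aggregations so far collapse the whole tensor into one value. Most of them also take a `dim` argument to aggregate along a single dimension instead. A small sketch, reshaping the same values into 2 rows of 5 (the names `grid`, `column_sums`, `row_max` and `row_means` are assumptions for the example):

```python
import torch

x = torch.arange(0, 100, 10)
grid = x.reshape(2, 5)  # [[0, 10, 20, 30, 40], [50, 60, 70, 80, 90]]

column_sums = grid.sum(dim=0)     # collapse the rows, one sum per column
row_max = grid.max(dim=1).values  # collapse the columns, one max per row
row_means = grid.to(torch.float32).mean(dim=1)  # mean still needs a float datatype

print(column_sums)
print(row_max)
print(row_means)
```

Note that `max()` with a `dim` returns both the values and their indices, which is why `.values` is used above.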

### Positional min/max

You can also find the index of a tensor where the max or minimum occurs with [`torch.argmax()`](https://pytorch.org/docs/stable/generated/torch.argmax.html) and [`torch.argmin()`](https://pytorch.org/docs/stable/generated/torch.argmin.html) respectively.

This is helpful in case you just want the position where the highest (or lowest) value is and not the actual value itself (we'll see this in a later section when using the [softmax activation function](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html)).

In [47]:
# Create a tensor
tensor = torch.arange(10, 100, 10)
print(f"Tensor: {tensor}")

# Returns index of max and min values
print(f"Index where max value occurs: {tensor.argmax()}")
print(f"Index where min value occurs: {tensor.argmin()}")

Tensor: tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])
Index where max value occurs: 8
Index where min value occurs: 0
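The index that comes back can be used to look the value up again, and with a `dim` argument you get one index per row. The sketch below uses made-up scores; `batch_scores` imitates the kind of per-class outputs you might see later, but the numbers are invented for the example.

```python
import torch

scores = torch.tensor([10, 50, 30, 90, 20])
best_index = scores.argmax()      # position of the largest value
best_value = scores[best_index]   # use the index to fetch the value itself
print(best_index, best_value)

# One row per sample, one column per class (hypothetical values)
batch_scores = torch.tensor([[0.1, 0.7, 0.2],
                             [0.8, 0.1, 0.1]])
predicted = batch_scores.argmax(dim=1)  # index of the largest value in each row
print(predicted)
```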


### Reshaping, stacking, squeezing and unsqueezing

Often you'll want to reshape or change the dimensions of your tensors without actually changing the values inside them.

To do so, some popular methods are:

| Method | One-line description |
| ----- | ----- |
| [`torch.reshape(input, shape)`](https://pytorch.org/docs/stable/generated/torch.reshape.html#torch.reshape) | Reshapes `input` to `shape` (if compatible), can also use `torch.Tensor.reshape()`. |
| [`torch.Tensor.view(shape)`](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html) | Returns a view of the original tensor in a different `shape` but shares the same data as the original tensor. |
| [`torch.stack(tensors, dim=0)`](https://pytorch.org/docs/1.9.1/generated/torch.stack.html) | Concatenates a sequence of `tensors` along a new dimension (`dim`), all `tensors` must be the same size. |
| [`torch.squeeze(input)`](https://pytorch.org/docs/stable/generated/torch.squeeze.html) | Squeezes `input` to remove all the dimensions of size `1`. |
| [`torch.unsqueeze(input, dim)`](https://pytorch.org/docs/1.9.1/generated/torch.unsqueeze.html) | Returns `input` with a dimension value of `1` added at `dim`. |
| [`torch.permute(input, dims)`](https://pytorch.org/docs/stable/generated/torch.permute.html) | Returns a *view* of the original `input` with its dimensions permuted (rearranged) to `dims`. |

Why do any of these?

Because deep learning models (neural networks) are all about manipulating tensors in some way. And because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. These methods help you make sure the right elements of your tensors are mixing with the right elements of other tensors.

Let's try them out.

First, we'll create a tensor.

In [48]:
# Create a tensor
import torch
x = torch.arange(0., 12.)
x, x.shape

(tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.]),
 torch.Size([12]))

Now let's add an extra dimension with `torch.reshape()`.

In [49]:
# Add an extra dimension
x_reshaped = x.reshape(1, 12)
x_reshaped, x_reshaped.shape

(tensor([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.]]),
 torch.Size([1, 12]))

In [50]:
# Reshape into 3 rows and 4 columns
x_reshaped = x.reshape(3, 4)
x_reshaped, x_reshaped.shape

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 torch.Size([3, 4]))

In [51]:
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

We can also change the view with `torch.Tensor.view()`.

In [52]:
# Change view (keeps same data as original but changes view)
# See more: https://stackoverflow.com/a/54507446/7900723
z = x.view(3, 4)
z, z.shape

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 torch.Size([3, 4]))

Remember though, changing the view of a tensor with `torch.Tensor.view()` really only creates a new view of the *same* tensor.

So changing the view changes the original tensor too.

In [53]:
z[:, 0]

tensor([0., 4., 8.])

In [54]:
# Changing z changes x
z[:, 0] = 5
z, x

(tensor([[ 5.,  1.,  2.,  3.],
         [ 5.,  5.,  6.,  7.],
         [ 5.,  9., 10., 11.]]),
 tensor([ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.]))
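If you want a reshaped tensor that *doesn't* write back to the original, one option is to copy the data first with `clone()`. This is a minimal sketch on a fresh tensor; `clone()` and `data_ptr()` aren't used elsewhere in this section, and the variable names are made up for the example.

```python
import torch

x = torch.arange(0., 12.)

view = x.view(3, 4)          # shares memory with x
copy = x.clone().view(3, 4)  # clone() copies the data first

view[:, 0] = 5   # writes through to x
copy[:, 1] = -1  # only changes the copy

print(x)
# data_ptr() gives the memory address of the first element
shares_memory = view.data_ptr() == x.data_ptr()
copy_shares_memory = copy.data_ptr() == x.data_ptr()
print(shares_memory, copy_shares_memory)
```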

If we wanted to stack our new tensor on top of itself four times, we could do so with `torch.stack()`.

In [55]:
x.shape

torch.Size([12])

In [56]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim=0) # try changing dim to dim=1 and see what happens
x_stacked

tensor([[ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.],
        [ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.],
        [ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.],
        [ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.]])

In [None]:
x_stacked.shape

torch.Size([4, 12])

In [None]:
x = x.unsqueeze(0)

In [None]:
x

tensor([[ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.]])

In [None]:
x.shape

torch.Size([1, 12])

In [None]:
# concatenate tensors
x_stacked = torch.cat((x, x), dim=1) # try changing dim to dim=0 and see what happens
x_stacked

tensor([[ 5.,  1.,  2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.,  5.,  1.,
          2.,  3.,  5.,  5.,  6.,  7.,  5.,  9., 10., 11.]])

In [None]:
x_stacked.shape

torch.Size([1, 24])
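The difference between the two is easiest to see from the shapes: `torch.stack()` creates a *new* dimension, while `torch.cat()` joins tensors along a dimension that already exists. A small sketch on a fresh 1-D tensor (variable names are assumptions for the example):

```python
import torch

x = torch.arange(0., 12.)  # shape [12]

stacked_rows = torch.stack([x, x, x, x], dim=0)  # new dimension at the front
stacked_cols = torch.stack([x, x, x, x], dim=1)  # new dimension at the end
joined = torch.cat([x, x], dim=0)                # same number of dimensions, just longer

print(stacked_rows.shape)
print(stacked_cols.shape)
print(joined.shape)
```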

How about removing all single dimensions from a tensor?

To do so you can use `torch.squeeze()` (I remember this as *squeezing* the tensor to only have dimensions over 1).

In [None]:
# Add an extra leading dimension (the output below reflects this cell having been run three times)
x_reshaped = x_reshaped.unsqueeze(dim=0)
x_reshaped

tensor([[[[[ 5.,  1.,  2.,  3.],
           [ 5.,  5.,  6.,  7.],
           [ 5.,  9., 10., 11.]]]]])

In [None]:
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimension from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[[[[ 5.,  1.,  2.,  3.],
           [ 5.,  5.,  6.,  7.],
           [ 5.,  9., 10., 11.]]]]])
Previous shape: torch.Size([1, 1, 1, 3, 4])

New tensor: tensor([[ 5.,  1.,  2.,  3.],
        [ 5.,  5.,  6.,  7.],
        [ 5.,  9., 10., 11.]])
New shape: torch.Size([3, 4])


In [None]:
x_squeezed = torch.ones(size=(2, 3))
x_squeezed

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [None]:
x_squeezed + 3

tensor([[4., 4., 4.],
        [4., 4., 4.]])

In [None]:
x_squeezed = x_squeezed.unsqueeze(dim=-1)
x_squeezed

tensor([[[1.],
         [1.],
         [1.]],

        [[1.],
         [1.],
         [1.]]])

In [None]:
x_squeezed.shape

torch.Size([2, 3, 1])

And to do the reverse of `torch.squeeze()` you can use `torch.unsqueeze()` to add a dimension value of 1 at a specific index.

In [None]:
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous tensor: tensor([[1., 1., 1.],
        [1., 1., 1.]])
Previous shape: torch.Size([2, 3])

New tensor: tensor([[[1., 1., 1.],
         [1., 1., 1.]]])
New shape: torch.Size([1, 2, 3])
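Two of the methods from the table deserve a quick sketch of their own. `squeeze()` can be limited to a single dimension with `dim`, and `permute()` rearranges the order of dimensions (and returns a view). The example below uses a random stand-in for an image stored as `[height, width, colour_channels]`; all names are made up for illustration.

```python
import torch

# Squeeze everything vs. squeeze only one dimension
t = torch.zeros(1, 3, 1)
all_squeezed = t.squeeze()        # removes every dimension of size 1
first_squeezed = t.squeeze(dim=0) # removes only dimension 0
print(all_squeezed.shape, first_squeezed.shape)

# Rearrange [height, width, colour_channels] -> [colour_channels, height, width]
image_hwc = torch.rand(224, 224, 3)
image_chw = image_hwc.permute(2, 0, 1)  # old dim 2 becomes dim 0, and so on
print(image_hwc.shape, image_chw.shape)

# permute returns a view, so the two tensors share the same data
image_hwc[0, 0, 0] = 0.5
print(image_chw[0, 0, 0])
```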


## Reproducibility (trying to take the random out of random)


In [57]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B

Tensor A:
tensor([[0.6176, 0.7505, 0.6697, 0.1798],
        [0.0336, 0.3256, 0.5275, 0.0771],
        [0.9089, 0.5082, 0.2306, 0.4266]])

Tensor B:
tensor([[0.2141, 0.8408, 0.7223, 0.1738],
        [0.2143, 0.3822, 0.1793, 0.8764],
        [0.5221, 0.5552, 0.1807, 0.5422]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

Just as you might've expected, the tensors come out with different values.

But what if you wanted to create two random tensors with the *same* values?

As in, the tensors would still contain random values but they would be of the same flavour.

That's where [`torch.manual_seed(seed)`](https://pytorch.org/docs/stable/generated/torch.manual_seed.html) comes in, where `seed` is an integer (like `42` but it could be anything) that flavours the randomness.

Let's try it out by creating some more *flavoured* random tensors.

In [58]:
import torch

# Set the random seed
RANDOM_SEED=42 # try changing this to different values and see what happens to the numbers below
torch.manual_seed(seed=RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

# Have to reset the seed every time a new rand() is called
# Without this, tensor_D would be different to tensor_C
torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and seeing what happens
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
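Another way to get the same effect, without resetting the global seed each time, is to pass a dedicated `torch.Generator` to `torch.rand()`. This is offered as an alternative sketch rather than the approach used above; the names `gen_a`, `gen_b` and the tensors are assumptions for the example.

```python
import torch

RANDOM_SEED = 42

# Two separate generators seeded with the same value
gen_a = torch.Generator().manual_seed(RANDOM_SEED)
gen_b = torch.Generator().manual_seed(RANDOM_SEED)

first_from_a = torch.rand(3, 4, generator=gen_a)
first_from_b = torch.rand(3, 4, generator=gen_b)
second_from_a = torch.rand(3, 4, generator=gen_a)  # continues gen_a's sequence

print(torch.equal(first_from_a, first_from_b))   # same seed, same position in the sequence
print(torch.equal(first_from_a, second_from_a))  # same generator, but further along
```

The second comparison shows the same idea as the comment in the cell above: a seed fixes the *sequence* of random numbers, so a second call carries on from where the first one stopped.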

## Running tensors on GPUs (and making faster computations)

Deep learning algorithms require a lot of numerical operations.

And by default these operations are often done on a CPU (central processing unit).

However, there's another common piece of hardware called a GPU (graphics processing unit), which is often much faster at performing the specific types of operations neural networks need (matrix multiplications) than CPUs.




In [59]:
!nvidia-smi

"nvidia-smi" is not recognized as an internal or external command,
operable program or batch file.


In [60]:
# Check for GPU
# import torch
torch.cuda.is_available()

False

In [61]:
# Set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [62]:
# Count number of devices
torch.cuda.device_count()

0

Knowing the number of GPUs PyTorch has access to is helpful in case you want to run a specific process on one GPU and another process on another (PyTorch also has features to let you run a process across *all* GPUs).

### Moving tensors to the GPU

In [63]:
# Create tensor (default on CPU)
tensor_1 = torch.tensor([1, 2, 3])
tensor_2 = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor_1, tensor_1.device)

# Move tensor to GPU (raises an error on a machine without CUDA support, as in the output below)
tensor_on_gpu = tensor_1.to('cuda:0')
tensor_on_gpu

tensor([1, 2, 3]) cpu


AssertionError: Torch not compiled with CUDA enabled

In [None]:
tensor_2.device

device(type='cpu')

In [None]:
tensor_on_gpu + tensor_2.to('cuda')

tensor([2, 4, 6], device='cuda:0')
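Hard-coding `'cuda'` fails on a machine without a GPU, so a common pattern is to reuse a `device` variable like the one set up earlier and let the code run wherever it can. A minimal device-agnostic sketch (the names `on_device` and `result` are made up for the example):

```python
import torch

# Use the GPU if PyTorch can see one, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor_1 = torch.tensor([1, 2, 3])
tensor_2 = torch.tensor([1, 2, 3])

# .to(device) is a no-op when the tensor is already on that device
on_device = tensor_1.to(device)
result = on_device + tensor_2.to(device)  # both operands on the same device

print(result, result.device)
```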

### Moving tensors back to the CPU

What if we wanted to move the tensor back to CPU?

For example, you'll want to do this if you want to interact with your tensors with NumPy (NumPy does not leverage the GPU).

Let's try using the [`torch.Tensor.numpy()`](https://pytorch.org/docs/stable/generated/torch.Tensor.numpy.html) method on our `tensor_on_gpu`.

In [None]:
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()

TypeError: ignored

Instead, to get a tensor back to CPU and usable with NumPy we can use [`Tensor.cpu()`](https://pytorch.org/docs/stable/generated/torch.Tensor.cpu.html).

This copies the tensor to CPU memory so it's usable with CPUs.

In [None]:
# Instead, copy the tensor back to cpu
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

The above returns a copy of the GPU tensor in CPU memory so the original tensor is still on GPU.

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

In [None]:
tensor_1_cpu = torch.rand(1000, 2000)
tensor_2_cpu = torch.rand(2000, 1000)

In [None]:
%%timeit
tensor_3_cpu = tensor_1_cpu @ tensor_2_cpu

95.9 ms ± 37 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
tensor_1_gpu = torch.rand(1000, 2000, device=torch.device('cuda'))
tensor_2_gpu = torch.rand(2000, 1000, device=torch.device('cuda'))

In [None]:
%%timeit
tensor_3_gpu = tensor_1_gpu @ tensor_2_gpu

47.3 µs ± 29.6 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
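A word of caution when reading GPU timings like the one above: CUDA operations are launched asynchronously, so a timer can stop before the GPU has actually finished the work. Calling `torch.cuda.synchronize()` before reading the clock gives a fairer comparison. The helper below is a rough sketch, not a proper benchmark; `time_matmul` and its `repeats` parameter are hypothetical names for this example, and it falls back to the CPU when no GPU is available.

```python
import time
import torch


def time_matmul(device, repeats=5):
    """Return the average seconds per matrix multiplication on `device`."""
    a = torch.rand(1000, 2000, device=device)
    b = torch.rand(2000, 1000, device=device)

    a @ b  # warm-up run so one-off setup costs aren't timed
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the GPU to finish the warm-up

    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for all queued GPU work to finish
    return (time.perf_counter() - start) / repeats


device = "cuda" if torch.cuda.is_available() else "cpu"
seconds = time_matmul(device)
print(f"{device}: {seconds * 1000:.3f} ms per matmul")
```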
