import torch
import pandas

# Introduction to tensors

### Creating scalars



In [2]:
#scalar

scalar= torch.tensor(7)
scalar

tensor(7)

In [3]:
# vector

vector= torch.tensor([7, 7])
vector

tensor([7, 7])

In [4]:
# matrix

matrix= torch.tensor([[7, 1], 
                     [5, 3]])

matrix

tensor([[7, 1],
        [5, 3]])

In [5]:
TENSOR= torch.tensor([[[[3, 5], 
                       [4, 6],
                       [1, 9]]]])

TENSOR

tensor([[[[3, 5],
          [4, 6],
          [1, 9]]]])

In [6]:
# To check the number of dimensions of a tensor we use its `ndim` attribute

print(f'The dimension of the variable "scalar" is {scalar.ndim}.\nThe dimension of the variable "vector" is {vector.ndim}.\nThe dimension of the variable "matrix" is {matrix.ndim}.\
\nThe dimension of the variable "TENSOR" is {TENSOR.ndim}.')

The dimension of the variable "scalar" is 0.
The dimension of the variable "vector" is 1.
The dimension of the variable "matrix" is 2.
The dimension of the variable "TENSOR" is 4.


In [7]:
# to check the shapes of the above

print(f'The shape of the variable "scalar" is {scalar.shape}.\nThe shape of the variable "vector" is {vector.shape}.\nThe shape of the variable "matrix" is {matrix.shape}.\
\nThe shape of the variable "TENSOR" is {TENSOR.shape}.')

The shape of the variable "scalar" is torch.Size([]).
The shape of the variable "vector" is torch.Size([2]).
The shape of the variable "matrix" is torch.Size([2, 2]).
The shape of the variable "TENSOR" is torch.Size([1, 1, 3, 2]).


<h3>Summary of types of tensor objects</h3>

| Name   | What is it?                                     | Number of dimensions | Variable nomenclature |
| ------: | -----------------------------------------------: | --------------------: | ---------------------: |
| Scalar | A single number, representing only magnitude | 0 | Lowercase (`a`) |
| Vector | A one-dimensional array of numbers, often read as a magnitude with a direction | 1 | Lowercase (`y`) |
| Matrix | A two-dimensional array of numbers | 2 | Uppercase (`Q`) |
| Tensor | An n-dimensional array of numbers | Can be any number: 0 = scalar, 1 = vector, 2 = matrix | Uppercase (`X`) |

### Torch object characteristics

In [8]:
TENSOR= torch.tensor([[[1,4], [5, 6]], 
              [[2, 7], [3, 5]],
              [[7, 9], [2, 9]]])

In [9]:
TENSOR.shape

torch.Size([3, 2, 2])

In [10]:
TENSOR.ndim

3

In [11]:
TENSOR.dtype

torch.int64

### Random Tensors

Why random tensors?

Machine learning models usually start training from a large set of random numbers (the weights) and keep adjusting them at each epoch until the model reaches its goal or completes the set number of epochs.
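As a rough illustration of that idea, the sketch below starts from a single random weight and nudges it toward the value that fits `y = 2 * x`. The toy data, the learning rate and the number of steps are all made up for this example; real models adjust many weights at once and normally rely on an optimizer.

```python
import torch

torch.manual_seed(0)

# Toy data following y = 2 * x (made up for this illustration)
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2 * x

# Start from a random weight, as a model would
w = torch.rand(1)
initial_loss = ((w * x - y) ** 2).mean()

# Repeatedly adjust the weight to reduce the mean squared error
learning_rate = 0.01
for step in range(200):
    error = w * x - y
    gradient = (2 * error * x).mean()  # d(loss)/dw for mean squared error
    w = w - learning_rate * gradient

final_loss = ((w * x - y) ** 2).mean()
print(f"initial loss: {initial_loss.item():.4f}")
print(f"final loss:   {final_loss.item():.6f}")
print(f"learned w:    {w.item():.4f}")
```

The weight ends up very close to 2, which is the "goal" in this tiny setting.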

In [12]:
# Create a random tensor of shape (3, 4)

r_tensor= torch.rand(3, 4)
r_tensor

tensor([[0.7272, 0.1547, 0.7471, 0.4858],
        [0.6741, 0.5403, 0.0758, 0.9862],
        [0.2669, 0.8116, 0.2864, 0.3869]])

In [13]:
# Create a random tensor with a shape similar to an image tensor, given here as (height, width, channels).
# For example, a 224-pixel square color image has shape (224, 224, 3), as there are 3 color channels (RGB).

rand_image= torch.rand(224, 224, 3)
rand_image.shape, rand_image.ndim

(torch.Size([224, 224, 3]), 3)

### Tensors of Zeros and Ones



In [14]:
# Create a tensor of zeros
zero_tensor= torch.zeros(3, 4)
zero_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [15]:
# create a tensor of ones

one_tensor= torch.ones(3, 4)
one_tensor

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

### Creating a range of tensors and tensors-like

`torch.arange(start, end, step)` includes `start` and excludes `end`. The older `torch.range(start, end, step)` includes `end` and is deprecated, so prefer `torch.arange`.

In [16]:
# create a torch range
one_to_ten= torch.arange(1, 11, 1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [17]:
# To create a tensor with the same shape as another, use `torch.rand_like`, `torch.zeros_like` or `torch.ones_like` on the tensor we are trying to match

torch.rand_like(one_tensor) # created a random tensor like one_tensor created above

tensor([[0.3922, 0.0056, 0.9104, 0.5717],
        [0.0539, 0.8941, 0.3167, 0.7245],
        [0.4682, 0.5532, 0.8813, 0.7928]])
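To make the `start`/`end`/`step` behaviour and the `*_like` functions concrete, here is a small sketch. The variable names are only for illustration.

```python
import torch

# start is inclusive, end is exclusive, step sets the spacing
evens = torch.arange(0, 10, 2)
print(evens)  # tensor([0, 2, 4, 6, 8])

# The *_like functions copy the shape and dtype of the input tensor
zeros = torch.zeros_like(evens)
ones = torch.ones_like(evens)
print(zeros, zeros.dtype)
print(ones.shape)
```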

### Tensor Datatypes

When creating a tensor, there are 3 important parameters:

| Tensor parameter | Action | 
| :---------------- | :-------|
| dtype | Defaults to `None` (the datatype is then inferred from the data). Sets the datatype of the tensor's elements, e.g. `torch.float32`, `torch.float64`, `torch.float16`/`torch.half`, `torch.bfloat16`. See https://pytorch.org/docs/stable/tensor_attributes.html |
| device | Defaults to `None` (the current default device, normally the CPU). Sets the device on which the tensor is created; pass a GPU device such as `"cuda"` to create it there. |
| requires_grad | defaults to `False`. This defines whether gradients should be tracked on operations with this tensor. |
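A short sketch of passing all three parameters at once. It uses `device="cpu"` so it runs anywhere; swapping in a GPU device assumes that hardware is present.

```python
import torch

t = torch.tensor([3.0, 6.0, 9.0],
                 dtype=torch.float64,   # override the float32 default for float data
                 device="cpu",          # "cuda" or "mps" would need that hardware
                 requires_grad=True)    # track operations on this tensor for autograd

print(t.dtype, t.device, t.requires_grad)
```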

In [18]:
# float32 tensor (integer data cast to float32 through the dtype argument)

float_32= torch.tensor([[1,2,3],
                        [4,5,6]],
                       dtype= torch.float32)
print(float_32, float_32.dtype, '\n')

# to change the data type of a tensor, we use the type method

float_16= float_32.type(torch.float16)
print(float_16)

tensor([[1., 2., 3.],
        [4., 5., 6.]]) torch.float32 

tensor([[1., 2., 3.],
        [4., 5., 6.]], dtype=torch.float16)


## Note
There are 3 main errors we will run into with PyTorch and deep learning. These are:
<ol>
    <li>Tensor not right datatype</li>
    <li>Tensor not right shape</li>
    <li>Tensor not on right device</li>
</ol>
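The first two errors are easy to trigger on purpose. The sketch below uses a hypothetical helper, `try_matmul`, that returns either the result shape or the error message. The device error is not shown because it needs a GPU to reproduce.

```python
import torch

def try_matmul(a, b):
    """Hypothetical helper: return the result shape, or the error text if matmul fails."""
    try:
        return tuple(torch.matmul(a, b).shape)
    except RuntimeError as err:
        return f"RuntimeError: {err}"

float_matrix = torch.rand(2, 3)                    # float32, shape (2, 3)
int_matrix = torch.ones(3, 2, dtype=torch.int64)   # int64, shape (3, 2)

# 1. Wrong datatype: matmul expects both operands to share a dtype
print(try_matmul(float_matrix, int_matrix))

# Casting one operand resolves it
print(try_matmul(float_matrix, int_matrix.type(torch.float32)))

# 2. Wrong shape: (2, 3) @ (2, 3) has mismatched inner dimensions
print(try_matmul(float_matrix, float_matrix))
```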

## Getting information from Tensors

In [19]:
# create a random tensor

tensor= torch.rand([3, 5])

In [20]:
# to check the shape of a tensor, we use tensor.shape or .size()

print(tensor.shape, '\n', tensor.size())

torch.Size([3, 5]) 
 torch.Size([3, 5])


In [21]:
# to check the data type of a tensor we use its dtype attribute

tensor.dtype

torch.float32

In [22]:
# to check the device a tensor is on, we use its .device attribute

tensor.device

device(type='cpu')

## Manipulating Tensors

Tensor operations include:
1. addition
2. subtraction
3. multiplication (element-wise)
4. division
5. matrix multiplication

In [23]:
# addition

tensor_a= torch.tensor([1, 2, 3])

print(tensor_a + 10)

# subtraction

print(tensor_a - 10)

# multiplication

print(tensor_a * 10)

# division

print(tensor_a / 10)


tensor([11, 12, 13])
tensor([-9, -8, -7])
tensor([10, 20, 30])
tensor([0.1000, 0.2000, 0.3000])


### Matrix multiplication

There are 2 types of multiplication in torch:

1. Element-wise multiplication
2. Dot product multiplication (matrix multiplication): the shapes must be (m, n) and (n, k), so the inner dimensions match, and the resulting matrix has the shape of the outer dimensions, (m, k).
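The two kinds of multiplication and the shape rule can be checked with a small sketch. The tensors `a` and `b` are made up for this example.

```python
import torch

a = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])      # shape (2, 3)
b = torch.tensor([[1, 0],
                  [0, 1],
                  [2, 2]])         # shape (3, 2)

elementwise = a * a                # same-shape operands, multiplied value by value
product = a @ b                    # `@` is shorthand for torch.matmul

print(elementwise.shape)  # torch.Size([2, 3])
print(product.shape)      # torch.Size([2, 2]): (2, 3) @ (3, 2) -> (2, 2)
print(product)
```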

In [24]:
tensor= torch.tensor([1, 2, 3])

In [25]:
%%time

value= 0
for i in range(len(tensor)):
    value+= tensor[i] * tensor[i]  # sum of element-wise products (the dot product)

print(value)

tensor(14)
CPU times: user 706 µs, sys: 754 µs, total: 1.46 ms
Wall time: 1.55 ms


In [26]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 705 µs, sys: 1.13 ms, total: 1.83 ms
Wall time: 1.52 ms


tensor(14)

In [27]:
tensor_a= torch.tensor([[3, 2], 
               [4, 5],
               [6, 7]])

tensor_b= torch.tensor([[1, 2],
                        [3, 4], 
                        [5, 6]])

In [28]:
torch.matmul(tensor_a, tensor_b)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [29]:
# To fix shape mismatches, we transpose one of the matrices so the inner dimensions match
 
torch.matmul(tensor_a, tensor_b.T)

tensor([[ 7, 17, 27],
        [14, 32, 50],
        [20, 46, 72]])

### Finding min, max, mean, sum, etc (tensor aggregation)



In [30]:
# find the min, max, mean and sum
print(torch.min(tensor_a))

print(torch.max(tensor_a))

print(torch.mean(tensor_a.type(torch.float32))) # torch.mean needs a floating-point (or complex) dtype, so cast the int64 tensor first

print(torch.sum(tensor_a))

tensor(2)
tensor(7)
tensor(4.5000)
tensor(27)


In [31]:
# argmin and argmax return the index position of the min or max value in the flattened tensor

tensor_a.argmax()

tensor(5)

In [32]:
tensor_a.max()

tensor(7)

In [33]:
tensor_a

tensor([[3, 2],
        [4, 5],
        [6, 7]])
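The index `5` above is a position in the flattened tensor. A minimal sketch of turning it back into a (row, column) pair, and of asking for per-column or per-row indices through the `dim` argument:

```python
import torch

t = torch.tensor([[3, 2],
                  [4, 5],
                  [6, 7]])

flat_index = t.argmax().item()               # position in the flattened tensor
row, col = divmod(flat_index, t.shape[1])    # convert to (row, column)
print(flat_index, (row, col), t[row, col])

# Passing dim returns one index per column (dim=0) or per row (dim=1)
print(t.argmax(dim=0))  # tensor([2, 2])
print(t.argmin(dim=1))  # tensor([1, 0, 0])
```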

In [34]:
tb, tc= torch.rand(4, 2), torch.rand(2, 6)

torch.mm(tb, tc)

tensor([[0.8114, 0.6421, 0.9782, 0.4819, 0.1513, 0.5620],
        [0.5014, 0.3416, 0.4647, 0.2435, 0.0661, 0.3608],
        [1.2606, 0.9213, 1.3266, 0.6738, 0.1972, 0.8919],
        [0.6253, 0.4719, 0.6956, 0.3488, 0.1052, 0.4388]])

## Reshaping, stacking, squeezing, and unsqueezing tensors

* Reshaping: returns a tensor with the same data in a new shape (the total number of elements must stay the same).
* View: returns a tensor with a different shape that shares memory with the original. This means that changing the view will also change the original tensor.
* Stacking: combines tensors, one on top of the other (`vstack`) or side by side (`hstack`). `stack` does this along a new dimension, passed in as the `dim` argument.
* Squeeze: removes all dimensions of size `1` from a tensor.
* Unsqueeze: adds a dimension of size `1` to a tensor.
* Permute: Return a view of the input with dimensions permuted (swapped) in a certain way.
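Before going through these one by one, here is a quick sketch of how the stacking and squeezing operations change shapes. The tensors are made up for this example.

```python
import torch

x = torch.arange(1, 4)                      # tensor([1, 2, 3]), shape (3,)

print(torch.vstack([x, x]).shape)           # stacked as rows: torch.Size([2, 3])
print(torch.hstack([x, x]).shape)           # joined end to end: torch.Size([6])
print(torch.stack([x, x], dim=1).shape)     # new dimension at position 1: torch.Size([3, 2])

z = torch.zeros(1, 3, 1)
print(z.squeeze().shape)                    # all size-1 dims removed: torch.Size([3])
print(z.squeeze(dim=0).shape)               # only dim 0 removed: torch.Size([3, 1])
print(x.unsqueeze(dim=1).shape)             # size-1 dim added: torch.Size([3, 1])
```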

In [47]:
import torch # so we can run the notebook from this point

# to create a new tensor
x= torch.arange(1, 10)
x, x.shape

(tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]), torch.Size([9]))

In [85]:
# reshaping. The primary rule is that the new shape must contain the same total number of elements as the original tensor
x_reshaped= x.reshape(1, 9)
x, x.reshape(3, 3)

(tensor([5, 2, 3, 4, 5, 6, 7, 8, 9]),
 tensor([[5, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]))

In [49]:
x.shape, x.reshape(3, 3).shape

(torch.Size([9]), torch.Size([3, 3]))

In [52]:
# view. A view gives a tensor a new shape without copying it: the view and the original share the same memory, like two names pointing to the same data.
# This means that changing a value in the view will also change the original tensor itself

y= x.view(1, 9)

print(f"x is: {x}\ny is: {y}")

x is: tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])
y is: tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])


In [56]:
y[0][:1] = 5 # Here we assign the first element in the view y of the original tensor x

print(f"x is: {x}\ny is: {y}") # lets print the tensors of x and y

x is: tensor([5, 2, 3, 4, 5, 6, 7, 8, 9])
y is: tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9]])


In [61]:
# stack tensors on top of each other
x_stacked= torch.stack([x, x, x, x, x], dim=0) # `dim=0` stacks the tensors as rows, one on top of the other, while `dim=1` stacks them as columns, side by side

x_stacked

tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 2, 3, 4, 5, 6, 7, 8, 9],
        [5, 2, 3, 4, 5, 6, 7, 8, 9]])

In [89]:
# squeeze

print(f"This is the x_reshaped tensor: {x_reshaped}\nThis is its shape: {x_reshaped.shape}.\n\n")

x_reshaped_squeezed= x_reshaped.squeeze() # this removes every dimension of size 1 from the tensor
print(f"This is the x_reshaped tensor after squeezing: {x_reshaped_squeezed}\nThis is its shape: {x_reshaped_squeezed.shape}.\n\n")

This is the x_reshaped tensor: tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9]])
This is its shape: torch.Size([1, 9]).


This is the x_reshaped tensor after squeezing: tensor([5, 2, 3, 4, 5, 6, 7, 8, 9])
This is its shape: torch.Size([9]).




In [94]:
# using tensor.unsqueeze()

print(f"The x_reshaped_squeezed tensor is: {x_reshaped_squeezed}\nIts shape is {x_reshaped_squeezed.shape}.\n\n")

x_reshaped_unsqueezed= x_reshaped_squeezed.unsqueeze(dim= 0)

print(f"The x_reshaped_unsqueezed tensor is: {x_reshaped_unsqueezed}\nIts shape is {x_reshaped_unsqueezed.shape}.\n\n")

The x_reshaped_squeezed tensor is: tensor([5, 2, 3, 4, 5, 6, 7, 8, 9])
Its shape is torch.Size([9]).


The x_reshaped_unsqueezed tensor is: tensor([[5, 2, 3, 4, 5, 6, 7, 8, 9]])
Its shape is torch.Size([1, 9]).




In [95]:
# `torch.permute()` rearranges the dimensions of a target tensor in a specified order
# Create tensor with specific shape
x_original = torch.rand(size=(224, 224, 3))

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
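Because `permute` returns a view, the permuted tensor shares its data with the original. A small sketch with a made-up, smaller "image" shows a write to the original appearing in the permuted view:

```python
import torch

image = torch.zeros(4, 5, 3)               # (height, width, channels)
channels_first = image.permute(2, 0, 1)    # (channels, height, width)

image[0, 0, 0] = 1.0                       # write through the original
print(channels_first[0, 0, 0])             # the permuted view sees the change
print(channels_first.shape)
print(channels_first.is_contiguous())      # False: same data, different strides
```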


## Indexing (selecting data from tensors)

Sometimes you'll want to select specific data from tensors (for example, only the first column or second row).

To do so, you can use indexing.

If you've ever done indexing on Python lists or NumPy arrays, indexing in PyTorch with tensors is very similar.



In [1]:
# Create a tensor 
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

**Indexing values goes outer dimension -> inner dimension (check out the square brackets).**

In [2]:

# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}") 
print(f"Second square bracket: {x[0][0]}") 
print(f"Third square bracket: {x[0][0][0]}")

First square bracket:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Second square bracket: tensor([1, 2, 3])
Third square bracket: 1
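You can also use `:` to select everything in a dimension and combine it with a specific index in another. A short sketch on the same kind of tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[:, 0])      # first row of every matrix: tensor([[1, 2, 3]])
print(x[:, :, 1])   # second column of every matrix: tensor([[2, 5, 8]])
print(x[0, 1, 1])   # a single element: tensor(5)
print(x[0, :, 2])   # third column of the first matrix: tensor([3, 6, 9])
```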


## PyTorch tensors & NumPy
Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.

The two main methods you'll want to use for NumPy to PyTorch (and back again) are:

* torch.from_numpy(ndarray) - NumPy array -> PyTorch tensor.
* torch.Tensor.numpy() - PyTorch tensor -> NumPy array.

Let's try them out.

In [3]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))


> **Note**: By default, NumPy arrays are created with the datatype float64 and if you convert it to a PyTorch tensor, it'll keep the same datatype (as above).
>
> However, many PyTorch calculations default to using float32.
>
> So if you want to convert your NumPy array (float64) -> PyTorch tensor (float64) -> PyTorch tensor (float32), you can use tensor = torch.from_numpy(array).type(torch.float32).

In [4]:
# Change the array, keep the tensor
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

And if you want to go from PyTorch tensor to NumPy array, you can call `tensor.numpy()`.

In [5]:
# Tensor to NumPy array
tensor = torch.ones(7) # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

The same rule applies as above: because `tensor = tensor + 1` creates a new tensor rather than modifying the original in place, the `numpy_tensor` stays the same.



In [6]:
# Change the tensor, keep the array the same
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
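One caveat worth knowing: `torch.from_numpy()` and `Tensor.numpy()` share memory with their source on the CPU, so an in-place change (such as `+=`) does show up on both sides. Only reassignment, as in the cells above, leaves the other object untouched. A minimal sketch, with made-up variable names:

```python
import numpy as np
import torch

array = np.ones(3)
tensor = torch.from_numpy(array)    # shares memory with `array`

array += 1                          # in-place: modifies the shared buffer
print(tensor)                       # tensor([2., 2., 2.], dtype=torch.float64)

array = array + 1                   # creates a new array: `tensor` is unaffected
print(array, tensor)

independent = torch.from_numpy(array).clone()   # clone() makes a real copy
array += 10
print(independent)                  # tensor([3., 3., 3.], dtype=torch.float64)
```

If you need the two to be independent, `clone()` on the tensor side (or `copy()` on the NumPy side) is one way to get a separate copy.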

## Reproducibility (trying to take the random out of random)

As you learn more about neural networks and machine learning, you'll start to discover how much randomness plays a part.

Well, pseudorandomness that is. After all, a computer as designed is fundamentally deterministic (each step is predictable), so the randomness it creates is simulated randomness (though there is debate on this too, but since I'm not a computer scientist, I'll let you find out more yourself).

How does this relate to neural networks and deep learning then?

We've discussed neural networks start with random numbers to describe patterns in data (these numbers are poor descriptions) and try to improve those random numbers using tensor operations (and a few other things we haven't discussed yet) to better describe patterns in data.

In short:

`start with random numbers -> tensor operations -> try to make better (again and again and again)`

Although randomness is nice and powerful, sometimes you'd like there to be a little less randomness.

Why?

So you can perform repeatable experiments.

For example, you create an algorithm capable of achieving X performance.

And then your friend tries it out to verify you're not crazy.

How could they do such a thing?

That's where **reproducibility** comes in.

In other words, can you get the same (or very similar) results on your computer running the same code as I get on mine?

Let's see a brief example of reproducibility in PyTorch.

We'll start by creating two random tensors, since they're random, you'd expect them to be different right?



In [7]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B

Tensor A:
tensor([[0.8475, 0.3644, 0.7671, 0.4612],
        [0.6031, 0.4138, 0.9056, 0.9290],
        [0.6019, 0.7683, 0.9867, 0.5866]])

Tensor B:
tensor([[0.9580, 0.8821, 0.2590, 0.8117],
        [0.9322, 0.4008, 0.9352, 0.7091],
        [0.2863, 0.7263, 0.2207, 0.5912]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

Just as you might've expected, the tensors come out with different values.

But what if you wanted to create two random tensors with the same values?

As in, the tensors would still contain random values but they would be of the same flavour.

That's where `torch.manual_seed(seed)` comes in, where seed is an integer (like 42 but it could be anything) that flavours the randomness.

Let's try it out by creating some more flavoured random tensors.

In [8]:
import torch
import random

# Set the random seed
RANDOM_SEED=42 # try changing this to different values and see what happens to the numbers below
torch.manual_seed(seed=RANDOM_SEED) 
random_tensor_C = torch.rand(3, 4)

# Have to reset the seed every time a new rand() is called 
# Without this, tensor_D would be different to tensor_C 
torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and seeing what happens
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])


Nice!

It looks like setting the seed worked.
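If you'd rather not reset the global seed before every call, one alternative is to give each call its own `torch.Generator`. This is a sketch of that option, not something the cells above rely on, and the names `gen_a` and `gen_b` are only for illustration.

```python
import torch

# Two independent generators seeded with the same value
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

a = torch.rand(3, 4, generator=gen_a)
b = torch.rand(3, 4, generator=gen_b)
c = torch.rand(3, 4, generator=gen_a)   # next draw from gen_a, so new values

print(torch.equal(a, b))  # same seed, same first draw
print(torch.equal(a, c))  # second draw from the same generator differs
```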

>Resource: What we've just covered only scratches the surface of reproducibility in PyTorch. For more on reproducibility in general and random seeds, I'd check out:
>
> * The <a href="https://pytorch.org/docs/stable/notes/randomness.html">PyTorch reproducibility documentation</a> (a good exercise would be to read through this for 10 minutes; even if you don't understand it now, being aware of it is important).
> * The <a href="https://en.wikipedia.org/wiki/Random_seed">Wikipedia random seed page</a> (this'll give a good overview of random seeds and pseudorandomness in general).

## Running tensors on GPUs (and making faster computations)

Deep learning algorithms require a lot of numerical operations.

And by default these operations are often done on a CPU (central processing unit).

However, there's another common piece of hardware called a GPU (graphics processing unit), which is often much faster at performing the specific types of operations neural networks need (matrix multiplications) than CPUs.

Your computer might have one.

If so, you should look to use it whenever you can to train neural networks because chances are it'll speed up the training time dramatically.

There are a few ways to first get access to a GPU and secondly get PyTorch to use the GPU.

> **Note**: When I reference "GPU" throughout this course, I'm referencing an NVIDIA GPU with CUDA enabled (CUDA is a computing platform and API that allows GPUs to be used for general-purpose computing, not just graphics) unless otherwise specified.



### 1. Getting a GPU
You may already know what's going on when I say GPU. But if not, there are a few ways to get access to one.


| Method | Difficulty to setup | Pros | Cons | How to setup |
| ------------:| -------------: | -----------------------------: | -----------------------------: | -----------------------------: |
| Google Colab | Easy | Free to use, almost zero setup required, can share work with others as easy as a link | Doesn't save your data outputs, limited compute, subject to timeouts | Follow the Google Colab Guide |
| Use your own | Medium | Run everything locally on your own machine | GPUs aren't free, require upfront cost | Follow the PyTorch installation guidelines |
| Cloud computing (AWS, GCP, Azure) | Medium-Hard | Small upfront cost, access to almost infinite compute | Can get expensive if running continually, takes some time to setup right | Follow the PyTorch installation guidelines | 

There are more options for using GPUs but the above three will suffice for now.

Personally, I use a combination of Google Colab and my own personal computer for small scale experiments (and creating this course) and go to cloud resources when I need more compute power.

> **Resource:** If you're looking to purchase a GPU of your own but not sure what to get, <a href="https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/">Tim Dettmers</a> has an excellent guide.

To check if you've got access to an NVIDIA GPU, you can run `!nvidia-smi` where the `!` (also called bang) means "run this on the command line".



In [9]:
!nvidia-smi

zsh:1: command not found: nvidia-smi


### 2. Getting PyTorch to run on the GPU
Once you've got a GPU ready to access, the next step is getting PyTorch to use it for storing data (tensors) and computing on data (performing operations on tensors).

To do so, you can use the torch.cuda package.

Rather than talk about it, let's try it out.

You can test if PyTorch has access to a GPU using <a href="https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available">`torch.cuda.is_available()`</a>.



In [10]:
# Check for GPU
import torch
torch.cuda.is_available()

False


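If the above outputs `True`, PyTorch can see and use the GPU. If it outputs `False`, it can't see a CUDA GPU, and in that case you'll have to go back through the installation steps (on an Apple-silicon Mac, `False` is expected, since the GPU is exposed through MPS rather than CUDA).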

Now, let's say you wanted to set up your code so it ran on the GPU if one was available and on the CPU otherwise.

That way, if you or someone decides to run your code, it'll work regardless of the computing device they're using.

Let's create a `device` variable to store what kind of device is available.

In [12]:
# Set device type
device = 'mps' if torch.backends.mps.is_available() else 'cpu'
device

'mps'

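If the above outputs "mps", our PyTorch code can use the Apple-silicon GPU through the Metal Performance Shaders (MPS) backend, and if it outputs "cpu", our PyTorch code will stick with the CPU. On a machine with an NVIDIA GPU, the equivalent check is `torch.cuda.is_available()` and the device name is "cuda".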

> **Note:** In PyTorch, it's best practice to write <a href="https://pytorch.org/docs/master/notes/cuda.html#device-agnostic-code">**device agnostic code**</a>. This means code that'll run on CPU (always available) or GPU (if available).
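One way to write device-agnostic code that covers CUDA, MPS and CPU is a small helper like the one sketched below. `pick_device` is a hypothetical name, not a PyTorch function, and the order of preference (CUDA, then MPS, then CPU) is an assumption you may want to change.

```python
import torch

def pick_device() -> str:
    """Hypothetical helper: return the name of the best available device."""
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists in newer PyTorch builds, so guard the lookup
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
x = torch.rand(2, 3, device=device)   # create the tensor directly on that device
print(device, x.device)
```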

If you want to do faster computing you can use a GPU but if you want to do much faster computing, you can use multiple GPUs.

You can count the number of GPUs PyTorch has access to using torch.cuda.device_count().



In [13]:
# Count number of devices
torch.cuda.device_count()

0


Knowing the number of GPUs PyTorch has access to is helpful in case you want to run a specific process on one GPU and another process on another (PyTorch also has features to let you run a process across all GPUs).



### 3. Putting tensors (and models) on the GPU
You can put tensors (and models, we'll see this later) on a specific device by calling `to(device)` on them, where `device` is the target device you'd like the tensor (or model) to go to.

Why do this?

GPUs offer far faster numerical computing than CPUs do and if a GPU isn't available, because of our device agnostic code (see above), it'll run on the CPU.

> **Note:** Putting a tensor on GPU using `to(device)` (e.g. `some_tensor.to(device)`) returns a copy of that tensor, i.e. the same tensor will be on CPU and GPU. To overwrite tensors, reassign them:
>
> `some_tensor = some_tensor.to(device)`

Let's try creating a tensor and putting it on the GPU (if it's available).



In [14]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='mps:0')


If you have an NVIDIA GPU available and `device` is set to "cuda", the above code will output something like:


`tensor([1, 2, 3]) cpu`

`tensor([1, 2, 3], device='cuda:0')`

Notice the second tensor has device='cuda:0', this means it's stored on the 0th GPU available (GPUs are 0 indexed, if two GPUs were available, they'd be 'cuda:0' and 'cuda:1' respectively, up to 'cuda:n').



### 4. Moving tensors back to the CPU
What if we wanted to move the tensor back to CPU?

For example, you'll want to do this if you want to interact with your tensors with NumPy (NumPy does not leverage the GPU).

Let's try using the <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.numpy.html">`torch.Tensor.numpy()`</a> method on our tensor_on_gpu.



In [15]:
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()

TypeError: can't convert mps:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.


Instead, to get a tensor back to CPU and usable with NumPy we can use Tensor.cpu().

This copies the tensor to CPU memory so it's usable with CPUs.



In [16]:
# Instead, copy the tensor back to cpu
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

The above returns a copy of the GPU tensor in CPU memory so the original tensor is still on GPU.

In [17]:
tensor_on_gpu

tensor([1, 2, 3], device='mps:0')
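Since this conversion comes up often, a small wrapper can be handy. The sketch below uses a hypothetical helper, `to_numpy`, which also calls `detach()` so that it works on tensors that track gradients (calling `.numpy()` directly on those raises an error).

```python
import torch

def to_numpy(t: torch.Tensor):
    """Hypothetical helper: return a NumPy array regardless of device or autograd state."""
    return t.detach().cpu().numpy()

weights = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
doubled = to_numpy(weights * 2)
print(doubled, doubled.dtype)
```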