<a href="https://colab.research.google.com/github/M-Jahanzaib6062/PyTorch-Beginner-to-Advanced/blob/main/00_PyTorch_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **00. PyTorch Fundamentals**
Q. What is PyTorch?

A. PyTorch is an open-source Machine Learning and Deep Learning framework.

Link to website: https://docs.pytorch.org/docs/stable/index.html

Q. What can PyTorch be used for?

A. It allows you to manipulate and process data and write machine learning algorithms using Python code.

Q. Who uses PyTorch?

A. Tech giants like Meta, Tesla, Microsoft and many more use PyTorch for their AI development.

It is also one of the most widely used frameworks in AI research papers.


# ***Importing PyTorch***

In [None]:
import torch
print(torch.__version__)

2.6.0+cu124


# ***PyTorch Tensors***

## Creating Tensors

There is a whole documentation page dedicated to the [torch.Tensor](https://docs.pytorch.org/docs/stable/tensors.html#torch-tensor) class.

Let's code.

The first thing we are going to create is a **scalar**.

A scalar is a single number, and in tensor-speak it is a zero-dimensional tensor.

In [None]:
scalar = torch.tensor(3)
print(scalar)

tensor(3)


**tensor(3)** represents a scalar value of type **torch.Tensor**.

Let's check the dimensions of this tensor.

In [None]:
print(scalar.ndim)

0


To retrieve the value from this tensor as a plain Python number, we can use the "item()" method.

In [None]:
tensor_value = scalar.item()
print(tensor_value)

3


## Vectors

Okay, now let's see a vector.

A vector is a one-dimensional tensor, but it can contain many numbers.

As in, you could have a vector [3, 2] to describe [bedrooms, bathrooms] in your house. Or you could have [3, 2, 2] to describe [bedrooms, bathrooms, car_parks] in your house.

The important point here is that a vector is flexible in what it can represent (the same is true of tensors).

In [None]:
vector = torch.tensor([3, 7])
print(vector)

tensor([3, 7])


An important question is: how many dimensions does a vector have?

To find out, we can use the "ndim" property.

In [None]:
print(vector.ndim)

1


So, a vector has 1 dimension.



An easy way to understand this is to think about how many indices you need to access a single value in the tensor.

Since you only need one index to pick a value out of a vector, it has one dimension.

**Trick 1:**

One useful trick to find the dimensions of any tensor is to count the number of closing brackets at the end of the tensor.

e.g.

my_tensor = torch.tensor(

  [[1, 2, 3],
  
   [4, 5, 6]] )

Since there are two closing brackets at the end, 'my_tensor' is 2-D.

=> Such a tensor is called a "Matrix".

The next important thing to discuss about a tensor is its 'shape'.

The "shape" property is used to get the shape of a tensor.

In [None]:
print(vector.shape)

torch.Size([2])


The above returns torch.Size([2]) which means our vector has a shape of [2]. This is because of the two elements we placed inside the square brackets ([3, 7]).



**Trick 2:**

The shape of a tensor can be determined by counting how many elements there are along each dimension, from the outermost brackets to the innermost.
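Both tricks can be checked directly in code. The sketch below is only an illustration (it reuses the 'my_tensor' example from Trick 1) and shows that "ndim" is simply the number of entries in "shape".

```python
import torch

# The 2-D example from Trick 1: two closing brackets at the end -> 2 dimensions.
my_tensor = torch.tensor([[1, 2, 3],
                          [4, 5, 6]])

print(my_tensor.ndim)   # number of dimensions
print(my_tensor.shape)  # number of elements along each dimension

# ndim is the number of entries in shape.
same_count = my_tensor.ndim == len(my_tensor.shape)
print(same_count)
```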

## Matrix

Now let's see a matrix.


In [None]:
Matrix = torch.tensor([
    [1, 2, 3],
    [4, 5, 6]
])
print(Matrix)

tensor([[1, 2, 3],
        [4, 5, 6]])


In [None]:
print(f'Shape of "Matrix" is {Matrix.shape}')
print(f'"Matrix" is {Matrix.ndim} dimensional')

Shape of "Matrix" is torch.Size([2, 3])
"Matrix" is 2 dimensional


In the same way, we can create tensors of any 'shape' and with any number of 'dimensions'.

e.g.


In [None]:
tensor = torch.tensor(
    [[[1, 2, 3],
      [4, 5, 6],
      [7, 8, 9]]]
)
print(tensor)
print(f'Shape of "tensor" is {tensor.shape}')
print(f'"tensor" is {tensor.ndim} dimensional')

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])
Shape of "tensor" is torch.Size([1, 3, 3])
"tensor" is 3 dimensional


## Random Tensors

We can create a random tensor using the 'torch.rand()' function with the 'size' parameter. The values are drawn uniformly from the interval [0, 1).

In [None]:
random_tensor = torch.rand(size = (3, 4))
print(random_tensor)
print(f'The DataType of tensor is {random_tensor.dtype}')

tensor([[0.4518, 0.5870, 0.5703, 0.6243],
        [0.4053, 0.9303, 0.9399, 0.9811],
        [0.0250, 0.1591, 0.8895, 0.5989]])
The DataType of tensor is torch.float32


## Zeros and Ones

In [None]:
zeros_tensor = torch.zeros(size = (2, 2))
print(zeros_tensor)

tensor([[0., 0.],
        [0., 0.]])


In [None]:
ones_tensor = torch.ones(size = (3, 3))
print(ones_tensor)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


## Creating a range and tensors like

Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

You can use torch.arange(start, end, step) to do so.

Where:

* start = start of range (e.g. 0)
* end = end of range (e.g. 10)
* step = the gap between consecutive values (e.g. 1)

In [None]:
one_to_ten = torch.arange(start = 1, end = 11, step = 1)
print(one_to_ten)
# Notice that the end is not included.
# So to go from 1 to 100, where 100 should be included you would use end = 101.

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])


Sometimes you might want one tensor of a certain type with the same shape as another tensor.

For example, a tensor of all zeros with the same shape as a previous tensor.

To do so you can use torch.zeros_like(input) or torch.ones_like(input), which return a tensor filled with zeros or ones, respectively, with the same shape and dtype as the input.

In [None]:
tensorA = torch.tensor([[1, 2, 3],
                        [4, 5, 6]], dtype = torch.float16)
print(tensorA)
print(f'Shape is {tensorA.shape}')
print(f'dtype is {tensorA.dtype}')

tensor([[1., 2., 3.],
        [4., 5., 6.]], dtype=torch.float16)
Shape is torch.Size([2, 3])
dtype is torch.float16


In [None]:
zeros_like_tensorA = torch.zeros_like(tensorA)
print(zeros_like_tensorA)
print(f'Shape is {zeros_like_tensorA.shape}')
print(f'dtype is {zeros_like_tensorA.dtype}')

tensor([[0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float16)
Shape is torch.Size([2, 3])
dtype is torch.float16


In [None]:
ones_like_tensorA = torch.ones_like(tensorA)
print(ones_like_tensorA)
print(f'Shape is {ones_like_tensorA.shape}')
print(f'dtype is {ones_like_tensorA.dtype}')

tensor([[1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float16)
Shape is torch.Size([2, 3])
dtype is torch.float16


# **Tensor Datatypes**

There are many different tensor datatypes available in PyTorch.

Some are specific for CPU and some are better for GPU.

Getting to know which one can take some time.

Generally if you see torch.cuda anywhere, the tensor is being used for GPU (since Nvidia GPUs use a computing toolkit called CUDA).

The most common type (and generally the default) is torch.float32 or torch.float.

This is referred to as "32-bit floating point".

But there's also 16-bit floating point (torch.float16 or torch.half) and 64-bit floating point (torch.float64 or torch.double).

And to confuse things even more there's also 8-bit, 16-bit, 32-bit and 64-bit integers.

Plus more!

Note: An integer is a whole number like 7, whereas a float has a decimal point, like 7.0.

The reason for all of these is to do with precision in computing.

Precision is the amount of detail used to describe a number.

The higher the precision value (8, 16, 32), the more detail and hence data used to express a number.

This matters in deep learning and numerical computing because you're performing so many operations: the more detail each number carries, the more compute and memory you have to use.

So lower precision datatypes are generally faster to compute on but sacrifice some performance on evaluation metrics like accuracy (faster to compute but less accurate).
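The trade-off between precision and storage can be made concrete with a small sketch. It stores the same number (0.1, chosen here only as an example of a value that binary floating point cannot represent exactly) in three float dtypes and compares the bytes used and the rounding error.

```python
import torch

value = 0.1  # example value; not exactly representable in binary floating point

sizes = {}
errors = {}
for dtype in (torch.float16, torch.float32, torch.float64):
    t = torch.tensor(value, dtype=dtype)
    # element_size() is the number of bytes used to store one element.
    sizes[dtype] = t.element_size()
    # item() converts back to a Python float so the stored value can be compared.
    errors[dtype] = abs(t.item() - value)
    print(dtype, sizes[dtype], "bytes, stored as", f"{t.item():.12f}")
```

Fewer bytes per element means less memory and usually faster computation, at the cost of a larger rounding error per number.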

Resources:

See the PyTorch documentation for a list of all available tensor datatypes.

Link: https://docs.pytorch.org/docs/stable/tensors.html#torch-tensor

Read the Wikipedia page for an overview of what precision in computing is.

Link: https://en.wikipedia.org/wiki/Precision_(computer_science)

In [None]:
float32_tensor = torch.tensor([1, 2, 3], dtype = torch.float32)
print(float32_tensor.dtype)

torch.float32


In [None]:
float16_tensor = torch.tensor([1, 2, 3], dtype = torch.float16)
print(float16_tensor.dtype)

torch.float16


# **Information about tensors**


Some of the most important properties of a tensor are:
* datatype
* shape
* device

e.g.

In [None]:
tensor = torch.tensor(
    [[1, 2, 3],
     [4, 5, 6]], dtype = torch.float32
)
print(tensor)
print(f'Its shape is {tensor.shape}')
print(f'Its dtype is {tensor.dtype}')
print(f'It\'s stored on {tensor.device}')

tensor([[1., 2., 3.],
        [4., 5., 6.]])
Its shape is torch.Size([2, 3])
Its dtype is torch.float32
It's stored on cpu


**Note:** When you run into issues in PyTorch, the cause is very often one of the three attributes above: a wrong datatype, a wrong shape, or a wrong device.
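As a rough illustration of such an issue, the sketch below uses a hypothetical helper, 'describe()' (not part of PyTorch), to print the three attributes, and then multiplies a float tensor by an integer tensor. Mixing dtypes in torch.matmul() typically raises a RuntimeError, so the code falls back to converting one operand first.

```python
import torch

def describe(t):
    """Hypothetical helper: return the three attributes worth checking first."""
    return t.dtype, tuple(t.shape), t.device.type

a = torch.ones(2, 3)                      # float32 by default
b = torch.ones(3, 2, dtype=torch.int64)   # integer tensor

print(describe(a))
print(describe(b))

try:
    result = torch.matmul(a, b)           # operands have different dtypes
except RuntimeError as err:
    print("dtype mismatch:", err)
    # Convert b to a's dtype and try again.
    result = torch.matmul(a, b.to(a.dtype))

print(result)
```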

# **Manipulating Tensors (Tensor Operations)**

In deep learning, data (images, text, video, audio, DNA structures, etc) gets represented as tensors.

A model learns by investigating those tensors and performing a series of operations (could be millions and billions) on tensors to create a representation of the patterns in the input data.

These operations are often a wonderful dance between:

* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

## Basic Operations

In [None]:
tensor = torch.tensor([1, 2, 3, 4])
print(tensor)

tensor([1, 2, 3, 4])


In [None]:
print(tensor + 10) # Element-wise Addition

tensor([11, 12, 13, 14])


In [None]:
print(tensor - 100) # Element-wise Subtraction

tensor([-99, -98, -97, -96])


In [None]:
print(tensor * 10) # Element-wise Multiplication

tensor([10, 20, 30, 40])


We can also use PyTorch's built-in functions.

In [None]:
print(torch.add(tensor, 10))

tensor([11, 12, 13, 14])


In [None]:
print(torch.subtract(tensor, 100))

tensor([-99, -98, -97, -96])


In [None]:
print(torch.mul(tensor, 10)) # torch.mul is the short form of torch.multiply

tensor([10, 20, 30, 40])


## Matrix Multiplication

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication in the 'torch.matmul()' function.

The main two rules for matrix multiplication to remember are:

The inner dimensions must match:
* (3, 2) @ (3, 2) won't work
* (2, 3) @ (3, 2) will work
* (3, 2) @ (2, 3) will work

The resulting matrix has the shape of the outer dimensions:
* (2, 3) @ (3, 2) -> (2, 2)
* (3, 2) @ (2, 3) -> (3, 3)

*Note:* The '@' symbol represents matrix multiplication in Python.

Link to 'torch.matmul()': https://docs.pytorch.org/docs/stable/generated/torch.matmul.html#torch-matmul
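The two rules can be verified with a short sketch. The tensor names 'a' and 'b' are illustrative only.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print((a @ b).shape)  # inner dimensions 3 and 3 match -> outer dimensions (2, 2)
print((b @ a).shape)  # inner dimensions 2 and 2 match -> outer dimensions (3, 3)

# '@' and torch.matmul() perform the same operation.
print(torch.allclose(a @ b, torch.matmul(a, b)))

try:
    a @ a             # (2, 3) @ (2, 3): inner dimensions 3 and 2 do not match
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("Shape error:", err)
```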

In [None]:
tensor = torch.tensor([1, 2, 3])
print(tensor)

tensor([1, 2, 3])


In [None]:
tensor * tensor

tensor([1, 4, 9])

This is element-wise multiplication: [1 * 1, 2 * 2, 3 * 3] = [1, 4, 9]

In [None]:
torch.matmul(tensor, tensor)

tensor(14)

For two 1-D tensors, torch.matmul() returns the dot product: 1 * 1 + 2 * 2 + 3 * 3 = 14

In [None]:
%%time
# Matrix multiplication (here, a dot product) using a Python loop.
# Note: the %%time cell magic must be the first line of the cell.
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 1.52 ms, sys: 0 ns, total: 1.52 ms
Wall time: 1.38 ms


In [None]:
%%time
# Matrix multiplication using torch.matmul()
torch.matmul(tensor, tensor)

CPU times: user 585 µs, sys: 18 µs, total: 603 µs
Wall time: 470 µs


tensor(14)

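Even on this tiny tensor, torch.matmul() is faster than the Python loop. The loop performs one small operation per element in the Python interpreter, while torch.matmul() runs as a single call into optimized, vectorized native code. A single timing like this is noisy, but the gap usually grows as the tensors get larger.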
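To see the gap more clearly, the comparison can be repeated on a larger vector. This is only a sketch: the vector length of 10,000 and the use of 'time.perf_counter()' are choices made for this illustration, and the measured times will vary from machine to machine.

```python
import time
import torch

torch.manual_seed(0)
x = torch.rand(10_000)

# Dot product with an explicit Python loop: one small operation per element.
start = time.perf_counter()
loop_result = torch.tensor(0.0)
for i in range(len(x)):
    loop_result += x[i] * x[i]
loop_seconds = time.perf_counter() - start

# The same dot product as a single vectorized call.
start = time.perf_counter()
matmul_result = torch.matmul(x, x)
matmul_seconds = time.perf_counter() - start

print(f"loop:   {loop_seconds:.6f} s")
print(f"matmul: {matmul_seconds:.6f} s")

# Both approaches agree up to floating point rounding.
print(torch.isclose(loop_result, matmul_result, rtol=1e-3))
```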


In [None]:
# using transpose()
tensor1 = torch.rand(2, 3)
print(tensor1, tensor1.shape)

tensor([[0.4461, 0.9350, 0.1114],
        [0.9643, 0.7778, 0.8134]]) torch.Size([2, 3])


In [None]:
# Multiplying tensor1 by itself raises an error because the inner dimensions (3 and 2) do not match
torch.matmul(tensor1, tensor1)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (2x3 and 2x3)

In [None]:
# So we can take the transpose of the first or the second tensor in this case.
torch.matmul(tensor1, tensor1.T)

tensor([[1.0855, 1.2480],
        [1.2480, 2.1966]])

You can understand matrix multiplication visually at
http://matrixmultiplication.xyz/.

# **Aggregation**

Finding the min, max, mean, sum, etc.

In [None]:
tensor = torch.arange(0, 100, 10)
print(tensor, tensor.dtype)

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]) torch.int64


In [None]:
print(tensor.mean())

RuntimeError: mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Long

The error says that '.mean()' requires the input tensor to have a floating point (or complex) dtype, but our tensor is torch.int64 (Long).

In [None]:
# Solution: First convert the tensor dtype to float32 and then call mean()
print(tensor.type(torch.float32).mean())

tensor(45.)
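There are several equivalent ways to do this conversion. The sketch below compares three of them; which one to use is mostly a matter of taste.

```python
import torch

tensor = torch.arange(0, 100, 10)            # dtype is torch.int64

mean_a = tensor.type(torch.float32).mean()   # the approach used above
mean_b = tensor.float().mean()               # .float() converts to torch.float32
mean_c = tensor.to(torch.float32).mean()     # .to() also accepts a dtype

print(mean_a, mean_b, mean_c)
```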


In [None]:
print(tensor.min())

tensor(0)


In [None]:
print(tensor.max())

tensor(90)


In [None]:
print(tensor.sum())

tensor(450)


You can also use the built-in functions.

In [None]:
torch.mean(tensor.type(torch.float32))

tensor(45.)

In [None]:
torch.min(tensor)

tensor(0)

In [None]:
torch.max(tensor)

tensor(90)

In [None]:
torch.sum(tensor)

tensor(450)

## Positional min/max

You can also find the index at which the maximum or minimum value of a tensor occurs with 'torch.argmax()' and 'torch.argmin()' respectively.

This is helpful in case you just want the position where the highest (or lowest) value is and not the actual value itself (we'll see this in a later section when using the softmax activation function).

In [None]:
tensor = torch.tensor([10, 20, 40])
print(tensor, tensor.dtype)

tensor([10, 20, 40]) torch.int64


In [None]:
print(f'The minimum number is on index {tensor.argmin()}')

The minimum number is on index 0


In [None]:
print(f'The maximum number is on index {tensor.argmax()}')

The maximum number is on index 2
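The returned index can be used to look the value back up, which ties the positional functions to 'max()' and 'min()'. A minimal sketch:

```python
import torch

tensor = torch.tensor([10, 20, 40])

max_index = tensor.argmax()   # position of the largest value
min_index = tensor.argmin()   # position of the smallest value

# Indexing with the position gives back the same value as max() / min().
print(tensor[max_index] == tensor.max())
print(tensor[min_index] == tensor.min())
```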


# Reshaping, stacking, squeezing and unsqueezing

Often you'll want to reshape or change the dimensions of your tensors without actually changing the values inside them.

To do so, some popular methods are:

## Reshaping

In [None]:
tensor = torch.arange(1, 11)
print(tensor, tensor.shape)

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]) torch.Size([10])


In [None]:
reshaped_tensor = tensor.reshape(shape = (2, 5))
print(reshaped_tensor)
print(f'Shape is {reshaped_tensor.shape}')

tensor([[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10]])
Shape is torch.Size([2, 5])


The new shape must contain the same total number of elements as the original tensor (here, 2 * 5 = 10).
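The element-count rule can be checked with a short sketch. It also uses -1 as one of the sizes, which asks PyTorch to infer that dimension from the number of elements.

```python
import torch

tensor = torch.arange(1, 11)            # 10 elements

five_by_two = tensor.reshape(5, 2)      # 5 * 2 = 10 -> works
inferred = tensor.reshape(2, -1)        # -1 is inferred as 5
print(five_by_two.shape, inferred.shape)

try:
    tensor.reshape(3, 4)                # 3 * 4 = 12, which is not 10
    reshape_failed = False
except RuntimeError as err:
    reshape_failed = True
    print("Reshape error:", err)
```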

## View

In [None]:
tensor_view = tensor.view(1, 10)
print(tensor_view)
print(f'Shape is {tensor_view.shape}')

tensor([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]])
Shape is torch.Size([1, 10])
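A view does not copy data: it shares memory with the original tensor, so changing one changes the other. A minimal sketch of this behaviour:

```python
import torch

tensor = torch.arange(1, 11)
tensor_view = tensor.view(1, 10)

# Both tensors point at the same underlying memory.
shares_memory = tensor.data_ptr() == tensor_view.data_ptr()
print(shares_memory)

# Editing the view also edits the original tensor.
tensor_view[0, 0] = 100
print(tensor[0])
```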


## Stack

In [None]:
vertical_stacked_tensors = torch.stack([tensor, tensor], dim = 0)
print(vertical_stacked_tensors)
print(vertical_stacked_tensors.shape)

tensor([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
        [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]])
torch.Size([2, 10])


In [None]:
horizontal_stacked_tensors = torch.stack([tensor, tensor], dim = 1)
print(horizontal_stacked_tensors)
print(horizontal_stacked_tensors.shape)

tensor([[ 1,  1],
        [ 2,  2],
        [ 3,  3],
        [ 4,  4],
        [ 5,  5],
        [ 6,  6],
        [ 7,  7],
        [ 8,  8],
        [ 9,  9],
        [10, 10]])
torch.Size([10, 2])


## Squeeze / Unsqueeze

torch.squeeze() removes dimensions of size 1 from a tensor: all of them by default, or only the one given by 'dim'.

e.g.

In [None]:
tensor = torch.rand(size = (1, 10))
print(tensor)
print(tensor.shape)

tensor([[0.1293, 0.3254, 0.0968, 0.7273, 0.0883, 0.7129, 0.5872, 0.0516, 0.1394,
         0.8539]])
torch.Size([1, 10])


In [None]:
squeezed_tensor = torch.squeeze(tensor, dim = 0)
print(squeezed_tensor)
print(squeezed_tensor.shape)

tensor([0.1293, 0.3254, 0.0968, 0.7273, 0.0883, 0.7129, 0.5872, 0.0516, 0.1394,
        0.8539])
torch.Size([10])


torch.unsqueeze() does the opposite: it adds a dimension of size 1 at the position given by 'dim'.

In [None]:
unsqueezed_tensor = torch.unsqueeze(squeezed_tensor, dim = 0)
print(unsqueezed_tensor)
print(unsqueezed_tensor.shape)

tensor([[0.1293, 0.3254, 0.0968, 0.7273, 0.0883, 0.7129, 0.5872, 0.0516, 0.1394,
         0.8539]])
torch.Size([1, 10])


In [None]:
unsqueezed_tensor = torch.unsqueeze(squeezed_tensor, dim = 1)
print(unsqueezed_tensor)
print(unsqueezed_tensor.shape)

tensor([[0.1293],
        [0.3254],
        [0.0968],
        [0.7273],
        [0.0883],
        [0.7129],
        [0.5872],
        [0.0516],
        [0.1394],
        [0.8539]])
torch.Size([10, 1])
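The effect of the 'dim' argument is easier to see on a tensor with more than one size-1 dimension. The shape (1, 3, 1) below is chosen only for illustration.

```python
import torch

x = torch.zeros(1, 3, 1)

all_squeezed = x.squeeze()            # removes every dimension of size 1
first_squeezed = x.squeeze(dim=0)     # removes only dimension 0
unchanged = x.squeeze(dim=1)          # dimension 1 has size 3, so nothing is removed
restored = all_squeezed.unsqueeze(dim=-1)  # adds a new last dimension of size 1

print(all_squeezed.shape, first_squeezed.shape, unchanged.shape, restored.shape)
```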


## Permute

You can also rearrange the order of a tensor's dimensions with 'torch.permute(input, dims)', which returns a view of the input with its dimensions reordered as given by dims.

In [None]:
tensor_org = torch.rand(size = (224, 224, 3))
tensor_perm = tensor_org.permute(2, 0, 1)
print(tensor_perm.shape)

torch.Size([3, 224, 224])
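Because permute returns a view, the permuted tensor shares its data with the original. The sketch below assumes the three dimensions stand for (height, width, colour channels), which is a common image layout but an assumption of this example.

```python
import torch

tensor_org = torch.rand(size=(224, 224, 3))   # assumed layout: (height, width, channels)
tensor_perm = tensor_org.permute(2, 0, 1)     # now: (channels, height, width)

# Element [5, 7, 2] of the original is element [2, 5, 7] of the permuted view.
tensor_org[5, 7, 2] = 0.5
print(tensor_perm[2, 5, 7])
```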


# **Indexing**

In [None]:
import torch

To retrieve specific data items from any tensor, indexing can be used.

It is quite similar to how indexing works with Python lists and NumPy arrays.

Indexing goes from outer-dimension -> inner-dimension.

e.g.

In [None]:
tensor = torch.arange(1, 10).reshape(shape = (1, 3, 3))
print(tensor, tensor.shape)

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]]) torch.Size([1, 3, 3])


In [None]:
print(tensor[0]) # outer bracket
print(tensor[0][0]) # inner bracket, i.e. row 0
print(tensor[0][0][0]) # single element: column 0 of row 0 of the outer bracket

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([1, 2, 3])
tensor(1)


You can also use " : " to specify "all values in this dimension" and then use a comma (,) to add another dimension.

In [None]:
print(tensor[:, 0]) # row 0 of every matrix along the first dimension
print(tensor[0, 1, 1]) # to retrieve 5
print(tensor[0, :, 2]) # to retrieve last column values from each row

tensor([[1, 2, 3]])
tensor(5)
tensor([3, 6, 9])
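A few more indexing patterns on the same tensor, as a small sketch. It also shows that chained indexing and comma-separated indexing pick out the same element.

```python
import torch

tensor = torch.arange(1, 10).reshape(1, 3, 3)

last_row = tensor[0, 2, :]       # all columns of row 2
first_column = tensor[:, :, 0]   # column 0 of every row, keeping the outer dimension
print(last_row)
print(first_column)

# tensor[0][1][1] and tensor[0, 1, 1] select the same element.
print(tensor[0][1][1] == tensor[0, 1, 1])
```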


# **PyTorch tensors and NumPy**

Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.

The two main methods you'll want to use for NumPy to PyTorch (and back again) are:

* torch.from_numpy(ndarray) - NumPy array -> PyTorch tensor.
* torch.Tensor.numpy() - PyTorch tensor -> NumPy array.

In [None]:
import numpy as np

In [None]:
array = np.arange(1., 11.)
tensor = torch.from_numpy(array)
array, tensor

(array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]),
 tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.], dtype=torch.float64))

In [None]:
tensor = torch.arange(1, 11).reshape(shape = (2, 5))
np_tensor = tensor.numpy()
tensor, np_tensor

(tensor([[ 1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10]]),
 array([[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10]]))
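Two details are worth knowing when converting: torch.from_numpy() keeps NumPy's dtype (float64 by default) and shares memory with the array, while converting to another dtype makes a separate copy. A minimal sketch:

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)             # NumPy's default float dtype is float64
tensor = torch.from_numpy(array)        # keeps float64 and shares memory with array

array[0] = 100.0                        # an in-place change to the array...
print(tensor)                           # ...is visible through the tensor

# Converting to float32 creates a new tensor with its own memory.
tensor32 = torch.from_numpy(array).type(torch.float32)
array[1] = -1.0
print(tensor32)                         # not affected by the later change
```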

# **Reproducibility**

Although randomness is nice and powerful, sometimes you'd like there to be a little less randomness.

Why?

So you can perform repeatable experiments.

For example, you create an algorithm capable of achieving X performance.

And then your friend tries it out to verify you're not crazy.

How could they do such a thing?

That's where reproducibility comes in.

In other words, can you get the same (or very similar) results on your computer running the same code as I get on mine?

Let's see a brief example of reproducibility in PyTorch.

We'll start by creating two random tensors. Since they're random, you'd expect them to be different, right?

In [None]:
tensor1 = torch.rand(3, 4)
tensor2 = torch.rand(3, 4)
print(f'Tensor 1: \n{tensor1}\n')
print(f'Tensor 2: \n{tensor2}\n')
print(f'Do they have same values:\n{tensor1 == tensor2}')

Tensor 1: 
tensor([[0.7493, 0.9975, 0.5042, 0.3771],
        [0.6768, 0.4907, 0.6441, 0.9924],
        [0.9901, 0.4614, 0.8121, 0.2624]])

Tensor 2: 
tensor([[0.2810, 0.6617, 0.9495, 0.1346],
        [0.7697, 0.7947, 0.7077, 0.4072],
        [0.6257, 0.9457, 0.6070, 0.7115]])

Do they have same values:
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


Just as you might've expected, the tensors come out with different values.

But what if you wanted to create two random tensors with the same values?

That's where "torch.manual_seed(seed)" comes in, where seed is an integer (like 42, but it could be anything) that affects the randomness.

Let's try it out by creating some more random tensors.

In [None]:
torch.manual_seed(42)
tensor1 = torch.rand(3, 4)

torch.manual_seed(42)
tensor2 = torch.rand(3, 4)

print(f'Tensor 1: \n{tensor1}\n')
print(f'Tensor 2: \n{tensor2}\n')
print(f'Do they have same values:\n{tensor1 == tensor2}')

Tensor 1: 
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor 2: 
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Do they have same values:
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])


In [None]:
torch.manual_seed(42)
print(torch.rand(3))
torch.manual_seed(42)
print(torch.rand(3, 5))

tensor([0.8823, 0.9150, 0.3829])
tensor([[0.8823, 0.9150, 0.3829, 0.9593, 0.3904],
        [0.6009, 0.2566, 0.7936, 0.9408, 0.1332],
        [0.9346, 0.5936, 0.8694, 0.5677, 0.7411]])


# **Running tensors on GPUs**

Deep learning algorithms require a lot of numerical operations.

And by default these operations are often done on a CPU (central processing unit).

However, there's another common piece of hardware called a GPU (graphics processing unit), which is often much faster at performing the specific types of operations neural networks need (matrix multiplications) than CPUs.

Your computer might have one.

If so, you should look to use it whenever you can to train neural networks because chances are it'll speed up the training time dramatically.

## GPU on Google Colab


Go to "Runtime", select "Change runtime type", then set "Hardware accelerator" to "T4 GPU".

In [None]:
!nvidia-smi

Sun Jul 27 05:56:51 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   50C    P8             11W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

## Getting PyTorch to run on the GPU

The next step is to check whether PyTorch has access to a GPU.

This can be done using "torch.cuda.is_available()".

In [None]:
import torch
torch.cuda.is_available()

True

If the above outputs True, PyTorch can see and use the GPU. If it outputs False, it can't see the GPU, and you'll have to go back through the installation steps (or, on Google Colab, check that a GPU runtime is selected).

Now, let's say you want to set up your code so that it runs on the GPU if one is available and on the CPU otherwise.

That way, if you or someone decides to run your code, it'll work regardless of the computing device they're using.

Let's create a device variable to store what kind of device is available.

In [None]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

If the above outputs "cuda", we can set all of our PyTorch code to use the available CUDA device (a GPU); if it outputs "cpu", our PyTorch code will stick with the CPU.

**Note:** In PyTorch, it's best practice to write device agnostic code. This means code that'll run on CPU (always available) or GPU (if available).
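A minimal sketch of device-agnostic code: the same lines run unchanged whether or not a GPU is present, because every tensor is created on (or moved to) whatever 'device' turns out to be.

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

x = torch.rand(2, 3, device=device)   # create the tensor directly on the device
y = torch.ones(3, 2).to(device)       # or create it on the CPU and move it

z = x @ y                             # both operands live on the same device
print(z.device)
```

Operations between tensors on different devices raise an error, so keeping everything on one 'device' variable avoids that class of bug.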

If you want to do faster computing you can use a GPU but if you want to do much faster computing, you can use multiple GPUs.

You can count the number of GPUs PyTorch has access to using torch.cuda.device_count().

In [None]:
torch.cuda.device_count()

1

## Putting models or tensors on GPU

In [None]:
tensor = torch.tensor([1,2,3])
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
tensor_on_gpu = tensor.to(device)
print(tensor_on_gpu, tensor_on_gpu.device)

tensor([1, 2, 3], device='cuda:0') cuda:0


## Moving tensor back to CPU

In [None]:
# Can't convert a tensor on GPU to numpy
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

To get a tensor back to CPU and usable with NumPy we can use "Tensor.cpu()".

In [None]:
tensor_on_cpu = tensor_on_gpu.cpu()
print(tensor_on_cpu, tensor_on_cpu.device)
print(tensor_on_gpu, tensor_on_gpu.device) # while the original tensor copy stays on the gpu.

tensor([1, 2, 3]) cpu
tensor([1, 2, 3], device='cuda:0') cuda:0
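The two steps are commonly chained as ".cpu().numpy()". The sketch below works on any machine, since calling ".cpu()" on a tensor that is already on the CPU is harmless.

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
tensor_on_device = torch.tensor([1, 2, 3], device=device)

# Copy to host memory (if needed), then convert to a NumPy array.
array = tensor_on_device.cpu().numpy()
print(array, type(array))
```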
