
In [None]:
#Checking the GPU from Google Colab
!nvidia-smi

Sat Oct 28 02:17:36 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   60C    P8    10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from tqdm.autonotebook import tqdm
import pandas as pd



To use PyTorch, we need to import it as the torch package. With it, we can immediately start creating tensors. Every time you nest a list within another list, you create a new dimension of the tensor that PyTorch will produce:

In [None]:
import torch
torch_scalar = torch.tensor(3.14)
torch_vector = torch.tensor([1, 2, 3, 4])
torch_matrix = torch.tensor([[1, 2,],
                             [3, 4,],
                             [5, 6,],
                             [7, 8,]])
torch_tensor3d = torch.tensor([
                            [
                            [ 1,  2,  3],
                            [ 4,  5,  6],
                            ],
                            [
                            [ 7,  8,  9],
                            [10, 11, 12],
                            ],
                            [
                            [13, 14, 15],
                            [16, 17, 18],
                            ],
                            [
                            [19, 20, 21],
                            [22, 23, 24],
                            ]
])

If we print the shapes of these tensors, you should see the same shapes shown earlier. Again, while scalars, vectors, and matrices are different things, they are unified under the larger umbrella of tensors. We care about this because we use tensors of different shapes to represent different types of data. We get to those details later; for now, we focus on the mechanics PyTorch provides to work with tensors:

In [None]:
print(torch_scalar.shape)
print(torch_vector.shape)
print(torch_matrix.shape)
print(torch_tensor3d.shape)

torch.Size([])
torch.Size([4])
torch.Size([4, 2])
torch.Size([4, 2, 3])
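
As a small illustration of the nesting rule described above, the following sketch builds three tensors and reports how many dimensions each one has. The variable names are illustrative and not part of the notebook; `ndim` is the tensor attribute that counts dimensions.

```python
import torch

# Each extra level of list nesting adds one dimension.
scalar = torch.tensor(3.14)              # no nesting  -> 0 dimensions
vector = torch.tensor([1, 2, 3, 4])      # one level   -> 1 dimension
matrix = torch.tensor([[1, 2], [3, 4]])  # two levels  -> 2 dimensions

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # ndim is the number of dimensions; it always equals len(t.shape)
    print(name, t.ndim, tuple(t.shape))
```

The number of entries in `shape` always matches `ndim`, which is why the scalar prints an empty shape.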


If you have done any ML or scientific computing in Python, you have probably used the NumPy library. As you would expect, PyTorch supports converting NumPy objects into their PyTorch counterparts. Since both of them represent data as tensors, this is a painless process. The following code block shows how we can create a random matrix in NumPy and then convert it into a PyTorch Tensor object:


In [None]:
x_np = np.random.random((4,4))
print(x_np)

x_pt = torch.tensor(x_np)
print(x_pt)


[[0.87052378 0.34696855 0.21667932 0.57079882]
 [0.02136561 0.85205241 0.84077682 0.71733416]
 [0.03388641 0.72535439 0.79655854 0.83792532]
 [0.15123618 0.99473453 0.9699564  0.06084524]]
tensor([[0.8705, 0.3470, 0.2167, 0.5708],
        [0.0214, 0.8521, 0.8408, 0.7173],
        [0.0339, 0.7254, 0.7966, 0.8379],
        [0.1512, 0.9947, 0.9700, 0.0608]], dtype=torch.float64)


Both NumPy and torch support multiple different data types. By default, NumPy uses 64-bit floats, and PyTorch defaults to 32-bit floats. However, if you create a PyTorch tensor from a NumPy tensor, it uses the same type as the given NumPy tensor. You can see that in the previous output, where PyTorch let us know that dtype=torch.float64 since it is not the default choice.
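
The conversion can also be done with `torch.from_numpy`. The sketch below, using a made-up array `a`, suggests one practical difference worth knowing: `torch.tensor` copies the data, while `torch.from_numpy` shares memory with the original array. Both keep the NumPy dtype.

```python
import numpy as np
import torch

a = np.zeros(3)                # float64 by default in NumPy
copied = torch.tensor(a)       # makes an independent copy of the data
shared = torch.from_numpy(a)   # shares memory with the NumPy array

a[0] = 1.0                     # modify the NumPy array in place

print(copied.dtype, shared.dtype)  # both torch.float64
print(copied)                      # unchanged: still all zeros
print(shared)                      # sees the change: first element is 1
```

Copying is the safer default; sharing avoids an extra allocation when the array is large and will not be modified elsewhere.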

The most common types we care about for deep learning are 32-bit floats, 64-bit integers (Longs), and booleans (i.e., binary True/False). Most operations leave the tensor type unchanged unless we explicitly create or cast it to a new type. To avoid issues with types, you can specify explicitly what type of tensor you want to create when calling a function. The following code checks what type of data is contained in our tensor using the dtype attribute:

In [None]:
print(x_np.dtype, x_pt.dtype)
x_np = np.asarray(x_np, dtype=np.float32)
x_pt = torch.tensor(x_np, dtype=torch.float32)
print(x_np.dtype, x_pt.dtype)

float64 torch.float64
float32 torch.float32
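
The paragraph above mentions specifying the type explicitly when creating a tensor. A minimal sketch of both options, requesting a dtype at creation time and casting an existing tensor with `.to()`, might look like this (all variable names are illustrative):

```python
import numpy as np
import torch

# Request a dtype at creation time instead of relying on defaults.
zeros_f32 = torch.zeros(2, 3, dtype=torch.float32)
ids_i64 = torch.tensor([0, 1, 2], dtype=torch.int64)

# Cast an existing tensor; .to() returns a new tensor and leaves x64 unchanged.
x64 = torch.tensor(np.random.random((2, 2)))  # float64, inherited from NumPy
x32 = x64.to(torch.float32)

print(zeros_f32.dtype, ids_i64.dtype)
print(x64.dtype, x32.dtype)
```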


The main exception to using 32-bit floats or 64-bit integers as the dtype is when we need to perform logic operations (like Boolean AND, OR, NOT), which we can use to quickly create binary masks.
A mask is a tensor that tells us which portions of another tensor are valid to use. We use masks in some of our more complex neural networks. For example, let’s say we want to find every value greater than 0.5 in a tensor. Both PyTorch and NumPy let us use the standard logic operators to check for things like this:

In [None]:
b_np = (x_np > 0.5)
print(b_np)
print(b_np.dtype)

b_pt = (x_pt > 0.5)
print(b_pt)
print(b_pt.dtype)


[[ True False False  True]
 [False  True  True  True]
 [False  True  True  True]
 [False  True  True False]]
bool
tensor([[ True, False, False,  True],
        [False,  True,  True,  True],
        [False,  True,  True,  True],
        [False,  True,  True, False]])
torch.bool
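
Once a mask exists, it can be combined with other masks and used to pick out values. The following sketch uses a small hand-written tensor, `values`, that is not part of the notebook, so the results are easy to check by eye:

```python
import torch

values = torch.tensor([[0.9, 0.1],
                       [0.4, 0.7]])

mask = values > 0.5            # boolean tensor with the same shape as values
selected = values[mask]        # 1-D tensor holding the entries where mask is True
both = mask & (values < 0.8)   # element-wise AND of two masks
inverted = ~mask               # element-wise NOT

print(selected)   # the two entries greater than 0.5
print(both)       # True only for the entry 0.7
print(inverted)   # True where the value is not greater than 0.5
```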


While the NumPy and PyTorch APIs are not identical, they share many functions with the same names, behaviors, and characteristics:


In [None]:
np.sum(x_np),torch.sum(x_pt)

(9.006996, tensor(9.0070))
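
One small difference that shows up even in shared functions is argument naming. In the sketch below (with an illustrative array `a`), NumPy selects the dimension to reduce with `axis`, while PyTorch conventionally calls it `dim`:

```python
import numpy as np
import torch

a = np.arange(6, dtype=np.float32).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
t = torch.tensor(a)

col_sums_np = np.sum(a, axis=0)      # NumPy names the argument `axis`
col_sums_pt = torch.sum(t, dim=0)    # PyTorch names it `dim`

print(col_sums_np, col_sums_pt)      # column sums: 3, 5, 7 in both cases
print(np.mean(a), torch.mean(t))     # 2.5 in both cases
```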

While many functions are the same, some are not quite identical. There may be slight differences in behavior or in the arguments required. These discrepancies usually arise because the PyTorch version has made changes specific to how these methods are used for neural network design and execution. Following is an example of the transpose function, where PyTorch requires us to specify which two dimensions to transpose. NumPy, given a 2-D array and no axes, simply swaps its two dimensions without complaint:

In [None]:
np.transpose(x_np)

array([[0.87052375, 0.02136561, 0.03388641, 0.15123619],
       [0.34696856, 0.8520524 , 0.7253544 , 0.9947345 ],
       [0.21667932, 0.8407768 , 0.79655856, 0.9699564 ],
       [0.5707988 , 0.71733415, 0.8379253 , 0.06084524]], dtype=float32)

In [None]:
torch.transpose(x_pt, 0, 1)

tensor([[0.8705, 0.0214, 0.0339, 0.1512],
        [0.3470, 0.8521, 0.7254, 0.9947],
        [0.2167, 0.8408, 0.7966, 0.9700],
        [0.5708, 0.7173, 0.8379, 0.0608]])

PyTorch does this because we often want to transpose dimensions of a tensor for deep learning applications, whereas NumPy tries to stay with more general expectations. As shown next, we can transpose two of the dimensions in our torch_tensor3d from the start of the chapter. Originally it had a shape of (4, 2, 3). If we transpose the first and third dimensions, we get a shape of (3, 2, 4):

In [None]:
torch_tensor3d


tensor([[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]],

        [[13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24]]])

In [None]:
torch.transpose(torch_tensor3d, 0, 2)

tensor([[[ 1,  7, 13, 19],
         [ 4, 10, 16, 22]],

        [[ 2,  8, 14, 20],
         [ 5, 11, 17, 23]],

        [[ 3,  9, 15, 21],
         [ 6, 12, 18, 24]]])

In [None]:
print(torch.transpose(torch_tensor3d, 0, 2).shape)


torch.Size([3, 2, 4])
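
To see what the swap does to individual elements, the following sketch rebuilds a tensor with the same values and shape as torch_tensor3d (here called `t`, an illustrative name) and checks that the element at index (i, j, k) moves to (k, j, i). It also shows `permute`, which reorders all dimensions in one call:

```python
import torch

t = torch.arange(1, 25).reshape(4, 2, 3)   # values 1..24, shape (4, 2, 3)

swapped = torch.transpose(t, 0, 2)         # swap the first and third dimensions
print(swapped.shape)                       # torch.Size([3, 2, 4])

# Element (i, j, k) of the original appears at (k, j, i) after the swap.
i, j, k = 3, 1, 2
print(t[i, j, k].item(), swapped[k, j, i].item())   # 24 24

# permute generalizes transpose by giving the new order of every dimension.
same = torch.equal(swapped, t.permute(2, 1, 0))
print(same)                                # True
```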


Because such differences exist, you should always double-check the PyTorch documentation at https://pytorch.org/docs/stable/index.html if you attempt to use a function you are familiar with but suddenly find it does not behave as expected. It is also a good tool to have open when using PyTorch. There are many different functions that can help you in PyTorch, and we cannot review them all.

PyTorch GPU acceleration

We can use the timeit library: it lets us run code multiple times and tells us how long it took to run. We make a larger matrix X, compute the matrix product XX (written x@x in code) 100 times, and see how long that takes to run:

In [None]:
import timeit
x = torch.rand(2**11, 2**11)
time_cpu = timeit.timeit('x@x', globals=globals(), number=100)
time_cpu

23.691677253999984

It takes a bit of time to run that code, but not too long. In the run shown above, it took about 23.7 seconds, which is stored in the time_cpu variable. Now, how do we get PyTorch to use our GPU? First we need to create a device reference. We can ask PyTorch to give us one using the torch.device function. If you have an NVIDIA GPU, and the CUDA drivers are installed properly, you should be able to pass in "cuda" as a string and get back an object representing that device:

In [None]:
print('Is CUDA available? :', torch.cuda.is_available())
device = torch.device('cuda')

Is CUDA available? : True


In [None]:
x = x.to(device)
time_gpu = timeit.timeit('x@x', globals=globals(), number=100)
time_gpu

2.807074027999988

When I run this code, the time to perform 100 multiplications is 2.8 seconds, roughly an 8.4× speedup over the 23.7 seconds measured on the CPU. This was a pretty ideal case, as matrix multiplications are super-efficient on GPUs, and we created a big matrix. You should try making the matrix smaller and see how that impacts the speedup you get.
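
One caveat when timing GPU code: CUDA operations are queued asynchronously, so a timer can stop before the GPU has finished its work. The sketch below shows one way to account for this with `torch.cuda.synchronize()`. The helper `time_matmul` is a hypothetical name introduced here, the matrix is kept small so the example runs quickly, and the code falls back to the CPU when no GPU is present.

```python
import timeit
import torch

def time_matmul(x, number=10):
    """Time `number` products x @ x, waiting for queued GPU work to finish."""
    def run():
        for _ in range(number):
            x @ x
        if x.is_cuda:
            # CUDA kernels are launched asynchronously; block until they complete
            torch.cuda.synchronize()
    run()  # warm-up so one-time setup cost is not included in the measurement
    return timeit.timeit(run, number=1)

# Assumption: use the GPU when available, otherwise time on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(2**8, 2**8, device=device)
elapsed = time_matmul(x)
print(f"{device.type}: {elapsed:.4f} s for 10 multiplications")
```

Changing the exponent in `2**8` is an easy way to explore how matrix size affects the measured time on each device.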

Be aware that this only works if every object involved is on the same device. Say you run the following code, where the variable x has been moved onto the GPU and y has not (so it is on the CPU by default):

In [None]:
x = torch.rand(128, 128).to(device)
y = torch.rand(128, 128)
x*y

RuntimeError: ignored

You will end up getting an error message that says:
RuntimeError: expected device cuda:0 but got device cpu
The error tells you which device the first variable is on (cuda:0) but that the second variable was on a different device (cpu). If we instead wrote y*x, you would see the error change to expected device cpu but got device cuda:0. Whenever you see an error like this, you have a bug that kept you from moving everything to the same compute device.
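
A minimal sketch of the usual fix is to move the stray tensor onto the other tensor's device before combining them. The device selection line is an assumption added here so the example also runs on a machine without a GPU:

```python
import torch

# Assumption: fall back to the CPU when CUDA is unavailable.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(128, 128).to(device)
y = torch.rand(128, 128)          # created on the CPU by default

# Move y to wherever x lives before combining them.
y = y.to(x.device)
z = x * y
print(z.device)
```

Reading the target from `x.device` rather than hard-coding a device name keeps the two tensors together even if `x` is later moved somewhere else.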

The other thing to be aware of is how to convert PyTorch data back to the CPU. For example, we may want to convert a tensor back to a NumPy array so that we can pass it to Matplotlib or save it to disk. The PyTorch tensor object has a .numpy() method that will do this, but if you call x.numpy(), you will get this error:

In [None]:
x.numpy()

TypeError: ignored

Instead, you can use the handy shortcut function .cpu() to move an object back to the CPU, where you can interact with it normally. So you will often see code that looks like x.cpu().numpy() when you want to access the results of your work.
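
As a short sketch of that pattern, the following code converts a tensor to a NumPy array regardless of where it currently lives. The device selection line is an assumption added so the example also runs without a GPU:

```python
import numpy as np
import torch

# Assumption: fall back to the CPU when CUDA is unavailable.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(4, 4, device=device)

# .cpu() copies the data to host memory if needed (a CPU tensor is returned as-is);
# .numpy() then exposes the data as a NumPy array.
x_np = x.cpu().numpy()
print(type(x_np), x_np.shape, x_np.dtype)
```

Because `.cpu()` is harmless on a tensor that is already on the CPU, the same line works in both GPU and CPU-only environments.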