<a href="https://colab.research.google.com/github/raagzz/tutorials/blob/main/PyTorch/00_PyTorch_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Importing PyTorch
---

In [1]:
import torch
torch.__version__

'2.5.1+cu121'

# Introduction to Tensors
---

- Tensors are the fundamental building blocks of Machine Learning.
- Their job is to represent data in a numerical way.

## Creating tensors

- A scalar is a single number, i.e. a zero-dimensional tensor.


In [2]:
scalar = torch.tensor(7)
scalar  # Although scalar is a single number, it's of type 'torch.Tensor'

tensor(7)

In [3]:
scalar.ndim # Check the dimensions of a tensor using the 'ndim' attribute

0

In [4]:
scalar.item()   # Retrieve the number from the tensor as a Python int (only works for one-element tensors)

7

- A vector is a one-dimensional tensor that can contain many numbers.

In [5]:
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

> You can tell how many dimensions a PyTorch tensor has by counting the square brackets (`[`) at the start of its printed output. You only need to count one side.

In [6]:
vector.ndim # Check the number of dimensions of vector

1

In [7]:
vector.shape    # Tells you how the elements inside them are arranged

torch.Size([2])

In [8]:
MATRIX = torch.tensor([[7, 8],
                       [9, 0]])
MATRIX

tensor([[7, 8],
        [9, 0]])

- Matrices are as flexible as vectors, except they've got an extra dimension.

In [9]:
MATRIX.ndim

2

In [10]:
MATRIX.shape

torch.Size([2, 2])

We get the output `torch.Size([2, 2])` because `MATRIX` is two elements deep and two elements wide.

In [11]:
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [12]:
TENSOR.ndim

3

In [13]:
TENSOR.shape

torch.Size([1, 3, 3])
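
As a quick check of the bracket-counting rule, here is a minimal sketch showing that `ndim` always equals the number of entries in `shape`. The name `tensor_3d` and the loop are illustrative only and are not part of the cells above.

```python
import torch

# ndim is the number of dimensions; shape lists the size of each dimension.
scalar = torch.tensor(7)
vector = torch.tensor([7, 7])
matrix = torch.tensor([[7, 8], [9, 0]])
tensor_3d = torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 5]]])

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor_3d", tensor_3d)]:
    # len(t.shape) matches t.ndim for every tensor
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```

Reading the shape from left to right follows the brackets from the outside in: one outer block, containing three rows, each with three values.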

![image.png](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-pytorch-different-tensor-dimensions.png)

> In practice, you'll often see scalars and vectors denoted as lowercase letters such as `y` or `a`. And matrices and tensors denoted as uppercase letters such as `X` or `W`.

![link text](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/00-scalar-vector-matrix-tensor.png)

## Random tensors

- When building a machine learning model with PyTorch, we start out with large tensors of random numbers, and the model adjusts these numbers as it works through data to better represent it.

- As a data scientist, you can define how the machine learning model starts *(initialization)*, looks at data *(representation)* and updates *(optimization)* its random numbers.

In [14]:
random_tensor = torch.rand(size=(224, 224, 3)) # create a tensor of random numbers
random_tensor.shape, random_tensor.dtype

(torch.Size([224, 224, 3]), torch.float32)

## Zeros and ones

- Sometimes we just want a tensor filled with zeros or ones, for example to use as a mask over another tensor's values.

In [15]:
zeros = torch.zeros(size=(3, 4))    # Create a tensor of all zeros
zeros, zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)

In [16]:
ones = torch.ones(size=(3, 4))
ones, ones.dtype

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.float32)
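
To illustrate the masking use case, here is a small sketch under the assumption that "masking" means zeroing out selected positions by element-wise multiplication. The names `values`, `mask` and `masked` are hypothetical.

```python
import torch

values = torch.tensor([[1., 2., 3.],
                       [4., 5., 6.]])

# Start from all ones (keep everything), then zero the column we want to hide.
mask = torch.ones(size=(2, 3))
mask[:, 1] = 0

masked = values * mask  # element-wise multiply: masked positions become 0
print(masked)
```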

## Creating a range and tensors like

- Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

In [17]:
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten, zero_to_ten.shape

(tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), torch.Size([10]))

- Sometimes you might want one tensor of a certain type with the same shape as another tensor.

In [18]:
ten_zeros = torch.zeros_like(input=zero_to_ten)
ten_ones = torch.ones_like(input=zero_to_ten)
ten_zeros, ten_ones

(tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]))
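
Note that the `*_like` functions copy the datatype as well as the shape, which is why the outputs above are integers. A minimal sketch, assuming you want floating-point zeros with the same shape instead (the variable names are illustrative):

```python
import torch

zero_to_ten = torch.arange(start=0, end=10, step=1)  # integer tensor (int64)

same_dtype = torch.zeros_like(zero_to_ten)            # inherits int64 from the input
float_zeros = torch.zeros_like(zero_to_ten, dtype=torch.float32)  # dtype overridden

print(same_dtype.dtype, float_zeros.dtype)
```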

## Tensor datatypes

- The most common type is `torch.float32` (alias `torch.float`), referred to as *32-bit floating point*. It is the default for floating-point tensors.

- There are also 16-bit (`torch.float16` or `torch.half`) and 64-bit (`torch.float64` or `torch.double`) floating-point types.

- There are also 8-bit, 16-bit, 32-bit and 64-bit integers.

> The reason for all of these is to do with **precision** in computing.

In [19]:
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, in which case the dtype is inferred from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the current default device (CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

> Aside from **shape** issues (tensor shapes don't match up), two of the other most common issues you'll come across in PyTorch are **datatype** and **device** issues.

In [20]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16)

float_16_tensor.dtype

torch.float16
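
To make the precision trade-off concrete, the sketch below prints the machine epsilon of each floating-point type using `torch.finfo` and stores the same value at two precisions. It is only an illustration; the names `third_16` and `third_32` are not from the cells above.

```python
import torch

# Machine epsilon: the gap between 1.0 and the next representable number.
for dtype in (torch.float16, torch.float32, torch.float64):
    print(dtype, torch.finfo(dtype).eps)

# The same value stored at two different precisions
third_16 = torch.tensor(1 / 3, dtype=torch.float16)
third_32 = torch.tensor(1 / 3, dtype=torch.float32)
print(third_16.item(), third_32.item())
```

Fewer bits means less memory and usually faster computation, at the cost of a larger rounding error.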

# Getting information from tensors
---

- `shape` - what shape is the tensor? (some operations require specific shape rules)
- `dtype` - what datatype are the elements within the tensor stored in?
- `device` - what device is the tensor stored on? (usually GPU or CPU)

In [21]:
some_tensor = torch.rand(3, 4)

print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}")

tensor([[0.3755, 0.8068, 0.9972, 0.5519],
        [0.7255, 0.0275, 0.6378, 0.7342],
        [0.1702, 0.0445, 0.5495, 0.6003]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


# Manipulating tensors (tensor operations)
---

- A machine learning model learns by investigating data represented as tensors, and performing a series of operations on them to create a representation of the patterns in the input data.
- These operations include addition, subtraction, element-wise multiplication, division and matrix multiplication.

## Basic operations

In [22]:
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [23]:
tensor * 10

tensor([10, 20, 30])

In [24]:
tensor = tensor - 10 # Subtract and reassign
tensor

tensor([-9, -8, -7])

In [25]:
tensor = tensor + 10 # Add and reassign
tensor

tensor([1, 2, 3])

- You can also use PyTorch's built-in functions, such as `torch.multiply()`.

In [26]:
torch.multiply(tensor, 10)

tensor([10, 20, 30])

In [27]:
tensor * tensor # Element-wise multiplication

tensor([1, 4, 9])

## Matrix Multiplication

The two main rules to remember for matrix multiplication are:

1. The **inner dimensions** must match.
2. The resulting matrix has the shape of **outer dimensions**.
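
The two rules can be sketched with a pair of random matrices. The shapes `(3, 2)` and `(2, 4)` are arbitrary choices for illustration.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(2, 4)

# Inner dimensions: a.shape[1] (2) and b.shape[0] (2) match, so this works.
out = a @ b
print(out.shape)  # torch.Size([3, 4]), i.e. the outer dimensions

# Swapping the order gives inner dimensions 4 and 3, which do not match.
try:
    b @ a
except RuntimeError as err:
    print(f"Shape error: {err}")
```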


In [28]:
tensor = torch.tensor([1, 2, 3])
torch.matmul(tensor, tensor)

tensor(14)

> You can do matrix multiplication by hand, but it's not recommended: the built-in `torch.matmul()` is faster, as the timings below show.

In [29]:
%%time
# Matrix multiplication by hand
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
value

CPU times: user 833 µs, sys: 8 µs, total: 841 µs
Wall time: 790 µs


tensor(14)

In [30]:
%%time
tensor @ tensor # Can also use the "@" symbol for matrix multiplication

CPU times: user 79 µs, sys: 14 µs, total: 93 µs
Wall time: 95.8 µs


tensor(14)

# Most common errors in deep learning (shape errors)
---

In [31]:
# Shapes need to line up: the inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B) # (this will error)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

- We can make matrix multiplication work between tensor_A and tensor_B by making their inner dimensions match.

- One of the ways to do this is with a transpose.

In [32]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


In [33]:
torch.mm(tensor_A, tensor_B.T) # You can also use torch.mm()

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

### Linear() module

The `torch.nn.Linear()` module, also known as a **feed-forward layer** or **fully connected layer**, implements a matrix multiplication between an input `x` and a weights matrix `A`.

$$ y = x\cdot{A^T} + b $$

Where:

- $x$ is the input to the layer.
- $A$ is the weights matrix created by the layer. It starts out as random numbers that get adjusted as a neural network learns to better represent patterns in the data.
- $b$ is the bias term used to slightly offset the weights and inputs.
- $y$ is the output (manipulating input to discover patterns in it).

In [34]:
torch.manual_seed(42) # to make it reproducible

# in_features must match the inner dimension of the input (the size of x's last dimension)
# out_features sets the outer dimension of the output (the size of its last dimension)
linear = torch.nn.Linear(in_features=2, out_features=6)
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
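
The formula above can be checked against the layer's own parameters. In this sketch, `linear.weight` plays the role of $A$ and `linear.bias` the role of $b$; the name `manual` and the tolerance passed to `torch.allclose` are illustrative choices.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)
x = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])

# weight has shape [out_features, in_features] = [6, 2]; bias has shape [6].
manual = x @ linear.weight.T + linear.bias  # y = x . A^T + b written out by hand
layer_out = linear(x)

print(linear.weight.shape, linear.bias.shape)
print(torch.allclose(manual, layer_out, atol=1e-6))
```

The transpose is needed because the layer stores its weights as `[out_features, in_features]`, so the inner dimensions only match after `.T`.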


## Finding min, max, mean, sum (aggregation)

In [35]:
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

- Some methods, such as `torch.mean()`, require a floating-point (or complex) datatype such as `torch.float32`. On an integer tensor like `x`, the operation fails.

In [36]:
x.min(), x.max(), x.type(torch.float32).mean(), x.sum()

(tensor(0), tensor(90), tensor(45.), tensor(450))

In [37]:
# You can also do the same as above with torch methods.
torch.min(x), torch.max(x), torch.mean(x.type(torch.float32)), torch.sum(x)

(tensor(0), tensor(90), tensor(45.), tensor(450))
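
For reference, this is roughly what the failure looks like and two ways around it. The sketch assumes a recent PyTorch release in which `mean()` on an integer tensor raises a `RuntimeError`; the names `mean_converted` and `mean_dtype_arg` are illustrative.

```python
import torch

x = torch.arange(0, 100, 10)  # int64 tensor

try:
    x.mean()  # integer input: no floating-point output dtype can be inferred
except RuntimeError as err:
    print(f"mean() failed: {err}")

# Two ways around it: convert first, or ask mean() for a float dtype.
mean_converted = x.type(torch.float32).mean()
mean_dtype_arg = x.mean(dtype=torch.float32)
print(mean_converted, mean_dtype_arg)
```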

## Positional min/max


In [38]:
print(f"Index where max value occurs: {x.argmax()}")
print(f"Index where min value occurs: {x.argmin()}")

Index where max value occurs: 9
Index where min value occurs: 0
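
The returned positions can be used directly as indices. A minimal sketch (the name `max_index` is illustrative):

```python
import torch

x = torch.arange(0, 100, 10)

max_index = x.argmax()
# Indexing with the position returned by argmax() gives back the max value.
print(x[max_index], x.max())
```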


## Change tensor datatype


In [39]:
tensor = torch.arange(10., 100., 10.)
tensor.dtype

torch.float32

In [40]:
tensor_float16 = tensor.type(torch.float16)
tensor_float16.dtype

torch.float16

In [41]:
tensor_int8 = tensor.type(torch.int8)
tensor_int8

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)

## Reshaping, stacking, squeezing and unsqueezing

In [42]:
x = torch.arange(1., 8.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

- `torch.reshape()` reshapes `input` to `shape` (if the new shape is compatible with the number of elements).

In [43]:
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

- `torch.Tensor.view()` returns a view of the original tensor in a different `shape`. The view shares the same data as the original tensor.

In [44]:
z = x.view(1, 7)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [45]:
# changing the view changes the original tensor too
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))
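
One way to confirm that a view shares memory is to compare data pointers. If an independent copy is needed, `clone()` is one option. This is a sketch; `z_copy` is a hypothetical name.

```python
import torch

x = torch.arange(1., 8.)
z = x.view(1, 7)

# A view points at the same underlying storage as the original tensor.
print(z.data_ptr() == x.data_ptr())

# clone() makes an independent copy, so edits no longer propagate to x.
z_copy = x.view(1, 7).clone()
z_copy[:, 0] = 5
print(x[0], z_copy[0, 0])
```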

> **In-place operations:** Operations that store the result into the operand are called *in-place*. They are denoted by a _ suffix. For example: `x.copy_(y)`, `x.t_()`, will change x.
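
A small sketch of the difference, using `add` and its in-place counterpart `add_` as the example pair:

```python
import torch

x = torch.tensor([1., 2., 3.])

y = x.add(10)   # out-of-place: returns a new tensor, x is unchanged
print(x, y)

x.add_(10)      # in-place (note the trailing underscore): x itself is modified
print(x)
```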

- `torch.stack()` concatenates a sequence of `tensors` along a new dimension (`dim`). All `tensors` must be the same size.

In [49]:
x_stacked_0 = torch.stack([x, x, x, x], dim=0)
x_stacked_1 = torch.stack([x, x, x, x], dim=1)

x_stacked_0, x_stacked_1

(tensor([[5., 2., 3., 4., 5., 6., 7.],
         [5., 2., 3., 4., 5., 6., 7.],
         [5., 2., 3., 4., 5., 6., 7.],
         [5., 2., 3., 4., 5., 6., 7.]]),
 tensor([[5., 5., 5., 5.],
         [2., 2., 2., 2.],
         [3., 3., 3., 3.],
         [4., 4., 4., 4.],
         [5., 5., 5., 5.],
         [6., 6., 6., 6.],
         [7., 7., 7., 7.]]))

> - `torch.cat()` concatenates the given sequence in the given dimension.
> - `torch.stack()` concatenates the given sequence along a new dimension.

In [51]:
x_concatenated = torch.cat([x, x, x, x], dim=0)

# This will error: x is 1-dimensional, so it has no dim=1 to concatenate along
# x_concatenated_1 = torch.cat([x, x, x, x], dim=1)

x_concatenated

tensor([5., 2., 3., 4., 5., 6., 7., 5., 2., 3., 4., 5., 6., 7., 5., 2., 3., 4.,
        5., 6., 7., 5., 2., 3., 4., 5., 6., 7.])
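
Comparing the resulting shapes side by side makes the difference clearer. In this sketch, `x_2d` is a hypothetical 2-D version of `x`, added only to show `torch.cat()` working along `dim=1`.

```python
import torch

x = torch.arange(1., 8.)                    # shape [7]

stacked = torch.stack([x, x], dim=0)        # new leading dimension -> shape [2, 7]
joined = torch.cat([x, x], dim=0)           # existing dimension grows -> shape [14]

# cat along dim=1 needs tensors that actually have a dim 1.
x_2d = x.unsqueeze(dim=0)                   # shape [1, 7]
joined_2d = torch.cat([x_2d, x_2d], dim=1)  # shape [1, 14]

print(stacked.shape, joined.shape, joined_2d.shape)
```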

- `torch.squeeze()` removes all dimensions of size `1` from `input`.

In [52]:
x_squeezed = x_reshaped.squeeze()
x_squeezed, x_squeezed.shape

(tensor([5., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

- `torch.unsqueeze()` returns `input` with a dimension of size `1` added at the index given by `dim`.

In [53]:
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
x_unsqueezed, x_unsqueezed.shape

(tensor([[5., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

- `torch.permute()` returns a **view** of the original `input` with its dimensions permuted (rearranged) to `dims`.

In [54]:
x_original = torch.rand(size=(224, 224, 3))
x_permuted = x_original.permute(2, 0, 1)

x_original.shape, x_permuted.shape

(torch.Size([224, 224, 3]), torch.Size([3, 224, 224]))
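
Because the permuted tensor is a view, writing to the original is visible through it. A minimal sketch (the value `123.0` is arbitrary):

```python
import torch

x_original = torch.rand(size=(224, 224, 3))
x_permuted = x_original.permute(2, 0, 1)

# Element [0, 0, 0] is the same memory location in both tensors.
x_original[0, 0, 0] = 123.0
print(x_original[0, 0, 0], x_permuted[0, 0, 0])
```

A common use of this permutation is converting image data from height-width-channels order to channels-height-width order.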

# Indexing (selecting data from tensors)
---

In [55]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

- Indexing values goes *outer dimension -> inner dimension*.

In [56]:
x[0], x[0][0], x[0][0][0]

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 tensor([1, 2, 3]),
 tensor(1))

- You can also use `:` to specify "all values in this dimension" and then use a comma (`,`) to add another dimension.

In [57]:
# Get all values of dim 0, but only index 0 of dim 1
print(x[:, 0])
# Get all values of dims 0 and 1, but only index 1 of dim 2
print(x[:, :, 1])
# Get all values of dim 0, but only index 1 of dims 1 and 2
print(x[:, 1, 1])
# Get index 0 of dims 0 and 1, and all values of dim 2
print(x[0, 0, :])

tensor([[1, 2, 3]])
tensor([[2, 5, 8]])
tensor([5])
tensor([1, 2, 3])
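
A few more selections on the same tensor, as a sketch of how integer indices, `:` and slices combine:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[0, 2, 2])   # single element: last row, last column
print(x[:, :, 2])   # last column, keeping the outer dimension
print(x[0, 1, :2])  # slices work too: first two values of the middle row
```

Each integer index removes a dimension from the result, while `:` and slices keep it.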


# PyTorch tensors and NumPy
---

In [58]:
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

> By default, NumPy creates floating-point arrays with the datatype `float64`, and a tensor made with `torch.from_numpy()` keeps that datatype (as above). However, many PyTorch calculations default to using `float32`.

In [59]:
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
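
Two details are worth a quick sketch: a tensor created with `torch.from_numpy()` shares memory with the source array, and converting it to `float32` produces a separate copy. The names `shared` and `as_float32` are illustrative.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)                          # float64, shares memory with array
as_float32 = torch.from_numpy(array).type(torch.float32)  # converted copy

array[0] = 100.0  # in-place edit of the NumPy array
print(shared[0], as_float32[0])
```

The in-place edit shows up in `shared` but not in `as_float32`, so convert (or clone) when the tensor should be independent of the array.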

# Reproducibility
---

In [60]:
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B

Tensor A:
tensor([[0.8016, 0.3649, 0.6286, 0.9663],
        [0.7687, 0.4566, 0.5745, 0.9200],
        [0.3230, 0.8613, 0.0919, 0.3102]])

Tensor B:
tensor([[0.9536, 0.6002, 0.0351, 0.6826],
        [0.3743, 0.5220, 0.1336, 0.9666],
        [0.9754, 0.8474, 0.8988, 0.1105]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [61]:
RANDOM_SEED = 42
torch.manual_seed(seed=RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

# Have to reset the seed every time a new rand() is called
# Without this, tensor_D would be different to tensor_C
torch.random.manual_seed(seed=RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
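
As an alternative to resetting the global seed before every call, a dedicated `torch.Generator` can be passed to `torch.rand()`. This is a different technique from the one in the cell above, shown here only as a sketch; the names `gen_a`, `gen_b` and `second_draw` are hypothetical.

```python
import torch

# Two generators seeded identically produce identical random streams.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

a = torch.rand(3, 4, generator=gen_a)
b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(a, b))

# A second draw from the same generator continues the stream, so it differs.
second_draw = torch.rand(3, 4, generator=gen_a)
print(torch.equal(a, second_draw))
```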

# Running tensors on GPUs
---

- GPUs perform matrix multiplications much faster than CPUs.

In [62]:
!nvidia-smi # To check access to an NVIDIA GPU

Sat Dec  7 14:56:17 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   41C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Getting PyTorch to run on GPU

In [63]:
import torch
torch.cuda.is_available()

True

- To write **device-agnostic code**, pick the device once as shown below, so the rest of the PyTorch code runs on the GPU when one is available and falls back to the CPU otherwise.

In [64]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [65]:
torch.cuda.device_count() # count number of GPUs PyTorch has access to

1
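
With the device chosen this way, tensors can also be created directly on it through the `device` argument. A minimal sketch that runs with or without a GPU:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the chosen device instead of moving it later.
t = torch.rand(2, 3, device=device)
print(t.device)
```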

## Putting tensors (and models) on the GPU

- Putting a CPU tensor on the GPU using `to(device)` returns a copy of that tensor, i.e. the same values will then exist on both CPU and GPU. To overwrite the original tensor, reassign it.

In [66]:
tensor = torch.tensor([1, 2, 3]) # on CPU

print(tensor, tensor.device)

tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')

## Moving tensors back to CPU

- We have to do this if we want to use the tensors with NumPy, since NumPy only works with data in CPU memory.

In [67]:
tensor_on_gpu.numpy() # If on GPU, can't transform to NumPy (error)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [68]:
tensor_on_cpu = tensor_on_gpu.cpu().numpy() # copy the tensor back to cpu
tensor_on_cpu

array([1, 2, 3])

The above returns a copy of the GPU tensor in CPU memory so the original tensor is still on GPU.

In [69]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

# Exercises
---

1. Documentation reading - See the documentation on [torch.Tensor](https://pytorch.org/docs/stable/tensors.html#torch-tensor) and for [torch.cuda](https://pytorch.org/docs/master/notes/cuda.html#cuda-semantics).

- Spend 1 hour going through the [PyTorch basics tutorial](https://pytorch.org/tutorials/beginner/basics/intro.html).
- To learn more on how a tensor can represent data, see this video: [What's a tensor?](https://youtu.be/f5liqUk0ZTw)

2. Create a random tensor with shape (7, 7).

In [70]:
import torch
tensor = torch.rand(7, 7)
tensor, tensor.shape

(tensor([[0.8694, 0.5677, 0.7411, 0.4294, 0.8854, 0.5739, 0.2666],
         [0.6274, 0.2696, 0.4414, 0.2969, 0.8317, 0.1053, 0.2695],
         [0.3588, 0.1994, 0.5472, 0.0062, 0.9516, 0.0753, 0.8860],
         [0.5832, 0.3376, 0.8090, 0.5779, 0.9040, 0.5547, 0.3423],
         [0.6343, 0.3644, 0.7104, 0.9464, 0.7890, 0.2814, 0.7886],
         [0.5895, 0.7539, 0.1952, 0.0050, 0.3068, 0.1165, 0.9103],
         [0.6440, 0.7071, 0.6581, 0.4913, 0.8913, 0.1447, 0.5315]]),
 torch.Size([7, 7]))

3. Perform a matrix multiplication on the tensor from 2 with another random tensor with shape (1, 7).

In [71]:
tensor_2 = torch.rand(1, 7)
torch.matmul(tensor, tensor_2.T)

tensor([[1.9625],
        [1.0950],
        [0.9967],
        [1.8910],
        [1.9205],
        [1.0674],
        [1.6949]])

4. Set the random seed to 0 and do exercises 2 & 3 over again.

In [72]:
torch.manual_seed(0)

tensor = torch.rand(7, 7)
tensor_2 = torch.rand(1, 7)
torch.matmul(tensor, tensor_2.T)

tensor([[1.8542],
        [1.9611],
        [2.2884],
        [3.0481],
        [1.7067],
        [2.5290],
        [1.7989]])

5. Speaking of random seeds, we saw how to set it with torch.manual_seed() but is there a GPU equivalent? (hint: you'll need to look into the documentation for torch.cuda for this one). If there is, set the GPU random seed to 1234.

In [73]:
torch.cuda.manual_seed(1234)

6. Create two random tensors of shape (2, 3) and send them both to the GPU (you'll need access to a GPU for this). Set torch.manual_seed(1234) when creating the tensors (this doesn't have to be the GPU random seed).

In [74]:
torch.manual_seed(1234)

device = 'cuda' if torch.cuda.is_available() else 'cpu'

tensor_A = torch.rand(2, 3).to(device)
tensor_B = torch.rand(2, 3).to(device)

tensor_A, tensor_B

(tensor([[0.0290, 0.4019, 0.2598],
         [0.3666, 0.0583, 0.7006]], device='cuda:0'),
 tensor([[0.0518, 0.4681, 0.6738],
         [0.3315, 0.7837, 0.5631]], device='cuda:0'))

7. Perform a matrix multiplication on the tensors you created in 6 (again, you may have to adjust the shapes of one of the tensors).

In [75]:
tensor_out = torch.matmul(tensor_A, tensor_B.T)
tensor_out

tensor([[0.3647, 0.4709],
        [0.5184, 0.5617]], device='cuda:0')

8. Find the maximum and minimum values of the output of 7.


In [76]:
torch.max(tensor_out), torch.min(tensor_out)

(tensor(0.5617, device='cuda:0'), tensor(0.3647, device='cuda:0'))

9. Find the maximum and minimum index values of the output of 7.


In [77]:
torch.argmax(tensor_out), torch.argmin(tensor_out)

(tensor(3, device='cuda:0'), tensor(0, device='cuda:0'))
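
Note that `argmax()` and `argmin()` without a `dim` argument return positions in the flattened tensor. A sketch of converting such an index back to a (row, column) pair, using the values printed in exercise 7 on the CPU (the names `flat_index`, `n_cols`, `row` and `col` are illustrative):

```python
import torch

tensor_out = torch.tensor([[0.3647, 0.4709],
                           [0.5184, 0.5617]])

flat_index = torch.argmax(tensor_out).item()  # position in the flattened tensor
n_cols = tensor_out.shape[1]
row, col = divmod(flat_index, n_cols)         # convert back to (row, column)

print(flat_index, (row, col), tensor_out[row, col])
```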

10. Make a random tensor with shape (1, 1, 1, 10) and then create a new tensor with all the 1 dimensions removed to be left with a tensor of shape (10). Set the seed to 7 when you create it and print out the first tensor and its shape as well as the second tensor and its shape.

In [78]:
torch.manual_seed(7)

tensor_rand = torch.rand(size=(1, 1, 1, 10))
tensor_rand_2 = tensor_rand.squeeze()

tensor_rand, tensor_rand.shape, tensor_rand_2, tensor_rand_2.shape

(tensor([[[[0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297,
            0.3653, 0.8513]]]]),
 torch.Size([1, 1, 1, 10]),
 tensor([0.5349, 0.1988, 0.6592, 0.6569, 0.2328, 0.4251, 0.2071, 0.6297, 0.3653,
         0.8513]),
 torch.Size([10]))