In [1]:
import torch

# 1. Introduction to tensors

- Tensors are the fundamental building block of machine learning.
- Their job is to represent data in a numerical way.
- For example, you could represent an image as a tensor with shape [3, 224, 224] which would mean [colour_channels, height, width], as in the image has 3 colour channels (red, green, blue), a height of 224 pixels and a width of 224 pixels.

![00-tensor-shape-example-of-image.png](attachment:00-tensor-shape-example-of-image.png)

- In tensor-speak (the language used to describe tensors), the tensor would have three dimensions, one each for colour_channels, height and width.

## Creating tensors

- PyTorch loves tensors. So much so there's a whole documentation page dedicated to the torch.Tensor class.

- Your first piece of homework is to read through the documentation on torch.Tensor for 10 minutes. But you can get to that later.

## Let's code.

- The first thing we're going to create is a scalar.

- A scalar is a single number, and in tensor-speak it's a zero-dimensional tensor.

In [2]:
scalar = torch.tensor(7)
scalar

tensor(7)

In [3]:
scalar.ndim

0

In [4]:
# return number from the tensor
scalar.item()
# works only for tensors that hold exactly one element


7

Now let's see a <b>vector</b>.

- A vector is a one-dimensional tensor, but it can contain many numbers.

- As in, you could have a vector [3, 2] to describe [bedrooms, bathrooms] in your house. Or you could have [3, 2, 2] to describe [bedrooms, bathrooms, car_parks] in your house.

- The important trend here is that a vector is flexible in what it can represent (the same with tensors).

In [5]:
vector = torch.tensor([1,2,3])
vector

tensor([1, 2, 3])

In [6]:
vector.ndim

1

- Hmm, that's strange, vector contains three numbers but only has a single dimension.

- You can tell the number of dimensions a tensor in PyTorch has by the number of square brackets on the outside ([) and you only need to count one side.

- How many square brackets does vector have?

- Another important concept for tensors is their shape attribute. The shape tells you how the elements inside them are arranged.

In [7]:
vector.shape

torch.Size([3])

In [8]:
MATRIX = torch.tensor([[1,2,3],[1,2,3]])
MATRIX

tensor([[1, 2, 3],
        [1, 2, 3]])

In [9]:
MATRIX.ndim

2

In [10]:
MATRIX.shape

torch.Size([2, 3])

## How about we create a tensor?

In [11]:
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

- I want to stress that tensors can represent almost anything.

- The one we just created could be the sales numbers for a steak and almond butter store (two of my favourite foods).

![00_simple_tensor.png](attachment:00_simple_tensor.png)

In [12]:
TENSOR.ndim

3

In [13]:
TENSOR.shape

torch.Size([1, 3, 3])
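
One way to read the shape `[1, 3, 3]` is to peel off one pair of square brackets at a time by indexing. The snippet below is a small sketch that rebuilds the same `TENSOR` so it can run on its own.

```python
import torch

TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])

# Each index removes the outermost dimension
print(TENSOR.shape)        # torch.Size([1, 3, 3])
print(TENSOR[0].shape)     # torch.Size([3, 3]) -> the single 3x3 block
print(TENSOR[0][0])        # tensor([1, 2, 3]) -> first row of that block
print(TENSOR[0][0][0])     # tensor(1) -> a single element, zero dimensions left
```

Three indices are needed to reach a single number, which matches `TENSOR.ndim` being 3.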

![00-pytorch-different-tensor-dimensions.png](attachment:00-pytorch-different-tensor-dimensions.png)



- Note: You might've noticed me using lowercase letters for scalar and vector and uppercase letters for MATRIX and TENSOR. This was on purpose. In practice, you'll often see scalars and vectors denoted as lowercase letters such as y or a. And matrices and tensors denoted as uppercase letters such as X or W.

- You might also notice the names matrix and tensor used interchangeably. This is common. In PyTorch you're usually dealing with a torch.Tensor (hence the tensor name), and the shape and number of dimensions of what's inside dictate what it actually is.
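
To see that the scalar, vector, matrix and tensor are all the same kind of object, here is a short sketch that recreates the four values from above and prints their type alongside `ndim` and `shape`.

```python
import torch

scalar = torch.tensor(7)
vector = torch.tensor([1, 2, 3])
MATRIX = torch.tensor([[1, 2, 3], [1, 2, 3]])
TENSOR = torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 5]]])

examples = [("scalar", scalar), ("vector", vector), ("MATRIX", MATRIX), ("TENSOR", TENSOR)]
for name, t in examples:
    # Every object is a torch.Tensor; only ndim and shape differ
    print(f"{name}: type={type(t).__name__}, ndim={t.ndim}, shape={tuple(t.shape)}")
```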


![image.png](attachment:image.png)

![00-scalar-vector-matrix-tensor.png](attachment:00-scalar-vector-matrix-tensor.png)

## Random Tensor Initialization in PyTorch

- Machine learning models built with PyTorch find patterns in data by manipulating tensors. 
- Instead of creating tensors manually, models typically start with large, random tensors. 
- These random numbers are then iteratively updated as the model processes data to improve its representation. 
- As a data scientist, you control this initialization, data representation, and optimization process.

In [14]:
random_tensor = torch.rand(size=(3,4))
random_tensor

tensor([[7.9278e-01, 1.3157e-01, 7.8126e-01, 3.4613e-02],
        [2.5057e-01, 4.3869e-04, 9.0147e-01, 1.2556e-01],
        [5.1334e-01, 7.1166e-01, 8.7213e-01, 7.3238e-01]])

In [15]:
random_tensor.dtype

torch.float32

In [16]:
random_image_size_tensor = torch.rand(size=(3, 224, 224))
random_image_size_tensor

tensor([[[0.4427, 0.2044, 0.0084,  ..., 0.8519, 0.4487, 0.5742],
         [0.0587, 0.8413, 0.0543,  ..., 0.0533, 0.2291, 0.3869],
         [0.4316, 0.0738, 0.7806,  ..., 0.0668, 0.5151, 0.9198],
         ...,
         [0.2171, 0.3212, 0.7369,  ..., 0.9017, 0.6220, 0.0183],
         [0.7323, 0.4787, 0.0739,  ..., 0.1931, 0.9352, 0.6236],
         [0.3924, 0.5150, 0.8497,  ..., 0.5996, 0.3632, 0.5061]],

        [[0.3579, 0.7575, 0.3381,  ..., 0.1408, 0.7490, 0.9259],
         [0.1628, 0.3548, 0.6639,  ..., 0.7362, 0.9601, 0.2596],
         [0.3155, 0.1594, 0.3396,  ..., 0.9492, 0.3321, 0.5081],
         ...,
         [0.3348, 0.7518, 0.9888,  ..., 0.4567, 0.6945, 0.2467],
         [0.7667, 0.5677, 0.4748,  ..., 0.8730, 0.0708, 0.9665],
         [0.6416, 0.9365, 0.1834,  ..., 0.7631, 0.3555, 0.1600]],

        [[0.6047, 0.9147, 0.8492,  ..., 0.5861, 0.1932, 0.5933],
         [0.7727, 0.5990, 0.0354,  ..., 0.7194, 0.9847, 0.7531],
         [0.3977, 0.1775, 0.8723,  ..., 0.4897, 0.7896, 0.

In [17]:
random_image_size_tensor.shape

torch.Size([3, 224, 224])

In [18]:
random_image_size_tensor.dtype

torch.float32

In [19]:
random_image_size_tensor.ndim

3
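
The idea of starting from random numbers and then updating them can be sketched with a single update step. Everything here is an assumption made for illustration: the toy data (`y = 2 * x`), the seed, the learning rate of 0.05 and the `mse` helper are not part of the notebook, and real training loops are covered elsewhere.

```python
import torch

torch.manual_seed(0)  # arbitrary seed so the sketch is repeatable

# Toy data following y = 2 * x (made up for this sketch)
x = torch.tensor([1.0, 2.0, 3.0])
y = 2 * x

# Start from a random weight, as a model would
weight = torch.rand(1, requires_grad=True)

def mse(w):
    """Mean squared error of the prediction w * x against y (hypothetical helper)."""
    return ((w * x - y) ** 2).mean()

loss_before = mse(weight)
loss_before.backward()  # compute the gradient of the loss with respect to weight

with torch.no_grad():
    weight -= 0.05 * weight.grad  # one small step against the gradient

loss_after = mse(weight)
print(f"loss before: {loss_before.item():.4f}, loss after: {loss_after.item():.4f}")
```

After the step the random weight has moved towards 2, so the loss is lower than it was at the random starting point.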

## Zeros and ones

- Sometimes you'll just want to fill tensors with zeros or ones.

- This happens a lot with masking (like masking some of the values in one tensor with zeros to let a model know not to learn them).

- Let's create a tensor full of zeros and ones with `torch.zeros()`, `torch.ones()`

In [20]:
torch_zeros = torch.zeros(size=(3,4))
torch_zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [21]:
torch_zeros.ndim

2

In [22]:
torch_zeros.shape

torch.Size([3, 4])

In [23]:
torch_ones = torch.ones(size=(3,4))
torch_ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [24]:
torch_ones.ndim

2

In [25]:
torch_ones.shape

torch.Size([3, 4])
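
The masking use case mentioned at the start of this section can be sketched as follows. The values and the choice of which column to hide are hypothetical, picked only to make the effect visible.

```python
import torch

values = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

# Start with a mask of ones (keep everything), then zero out the positions to hide
mask = torch.ones(size=(2, 3))
mask[:, 1] = 0  # hypothetical choice: hide the middle column

# Element-wise multiplication zeroes the hidden positions and keeps the rest
masked = values * mask
print(masked)
# tensor([[1., 0., 3.],
#         [4., 0., 6.]])
```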

## Creating a range and tensors like

- Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

- You can use torch.arange(start, end, step) to do so.

- Where:

    - `start` = start of range, included in the result (e.g. 0)
    - `end` = end of range, not included in the result (e.g. 10)
    - `step` = the gap between consecutive values (e.g. 1)


In [26]:
zero_to_ten = torch.arange(start = 0, end = 10, step = 1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [27]:
zero_to_five = torch.arange(start = 0, end = 10, step = 2)
zero_to_five

tensor([0, 2, 4, 6, 8])
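
Both outputs above stop before 10 because `end` is exclusive. A short sketch of how to get 1 to 10 inclusive, and of how `step` sets the gap between values (the variable names are made up for this example):

```python
import torch

# end is exclusive, so use end=11 to include 10
one_to_ten = torch.arange(start=1, end=11, step=1)
print(one_to_ten)  # tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

# step is the difference between neighbouring values
quarters = torch.arange(start=0, end=100, step=25)
print(quarters)  # tensor([ 0, 25, 50, 75])
```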

In [28]:
#a tensor of all zeros with the same shape as a previous tensor
ten_zeros = torch.zeros_like(input = zero_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [29]:
#a tensor of all ones with the same shape as a previous tensor
ten_ones = torch.ones_like(input=zero_to_ten)
ten_ones

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
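
Notice that the zeros and ones above are printed without a decimal point. `torch.zeros_like()` and `torch.ones_like()` copy the datatype of the input as well as its shape, and `zero_to_ten` holds integers. The sketch below checks this and shows the optional `dtype` argument for when floats are wanted instead.

```python
import torch

zero_to_ten = torch.arange(start=0, end=10, step=1)

# Same shape and same dtype as the input
ten_zeros = torch.zeros_like(input=zero_to_ten)
print(ten_zeros.dtype == zero_to_ten.dtype)  # True

# Same shape, but explicitly ask for floats
float_zeros = torch.zeros_like(zero_to_ten, dtype=torch.float32)
print(float_zeros.dtype)  # torch.float32
```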

In [30]:
# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.9603, 0.0211, 0.2367, 0.1771],
        [0.9520, 0.6292, 0.6736, 0.0075],
        [0.5381, 0.7433, 0.3837, 0.9257]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


## Manipulating tensors (tensor operations)

- In deep learning, data (images, text, video, audio, protein structures, etc) gets represented as tensors.

- A model learns by investigating those tensors and performing a series of operations (could be 1,000,000s+) on tensors to create a representation of the patterns in the input data.

- These operations are often a wonderful dance between:

    - Addition
    - Subtraction
    - Multiplication (element-wise)
    - Division
    - Matrix multiplication


In [31]:
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [32]:
tensor - 1

tensor([0, 1, 2])

In [33]:
tensor * 10

tensor([10, 20, 30])

In [34]:
tensor / 10

tensor([0.1000, 0.2000, 0.3000])

In [35]:
tensor -= 10

In [36]:
tensor

tensor([-9, -8, -7])

In [37]:
tensor += 10
tensor

tensor([1, 2, 3])

PyTorch also has a bunch of built-in functions like `torch.mul()` (short for multiplication; `torch.multiply()` is an alias for it) and `torch.add()` to perform basic operations.

In [38]:
torch.multiply(tensor, 10)

tensor([10, 20, 30])

In [39]:
torch.add(tensor, 10)

tensor([11, 12, 13])

In [40]:
torch.sub(tensor, 10)

tensor([-9, -8, -7])

In [41]:
torch.divide(tensor, 10)

tensor([0.1000, 0.2000, 0.3000])

In [42]:
torch.multiply(tensor, tensor)

tensor([1, 4, 9])

In [43]:
torch.add(tensor, tensor)

tensor([2, 4, 6])

In [44]:
torch.sub(tensor, tensor)

tensor([0, 0, 0])

In [45]:
torch.divide(tensor, tensor)

tensor([1., 1., 1.])

In [46]:
#matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [47]:
tensor @ tensor

tensor(14)

In [48]:
# Element-wise multiplication (not matrix multiplication)
tensor * tensor

tensor([1, 4, 9])
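
The two results are related: for one-dimensional tensors, matrix multiplication is the dot product, which is the sum of the element-wise products. A small sketch that computes it both ways (the variable names are made up for this example):

```python
import torch

tensor = torch.tensor([1, 2, 3])

elementwise = tensor * tensor              # tensor([1, 4, 9])
dot_by_hand = elementwise.sum()            # 1 + 4 + 9
dot_matmul = torch.matmul(tensor, tensor)  # same value via matmul

print(dot_by_hand, dot_matmul)  # tensor(14) tensor(14)
```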

In [49]:
# The inner dimensions must match for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B) # (this will error)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [50]:
# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [51]:
# View tensor_A and the transpose of tensor_B
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [54]:
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

In [55]:
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
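
The shape rule behind the error and the fix can be written down directly: for two 2-D tensors, the number of columns of the first must equal the number of rows of the second, and the result takes the outer dimensions. The `can_matmul` helper below is hypothetical, written only for this sketch, and is not a PyTorch function.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]], dtype=torch.float32)

def can_matmul(a, b):
    """Hypothetical helper: True if 2-D tensors a and b can be matrix-multiplied."""
    return a.shape[1] == b.shape[0]

print(can_matmul(tensor_A, tensor_B))    # False: (3, 2) @ (3, 2), inner dims 2 and 3
print(can_matmul(tensor_A, tensor_B.T))  # True:  (3, 2) @ (2, 3), inner dims 2 and 2

# The result takes the outer dimensions: (3, 2) @ (2, 3) -> (3, 3)
result = torch.matmul(tensor_A, tensor_B.T)
print(result.shape)  # torch.Size([3, 3])
```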

![00-matrix-multiply-crop.gif](attachment:00-matrix-multiply-crop.gif)

1.  **Neural networks fundamentally rely on linear transformations, primarily expressed through matrix multiplications and dot products.** The `torch.nn.Linear()` module, often called a feed-forward or fully connected layer, is a core component that performs such a transformation between an input $x$ and a learnable weights matrix $A$.

2.  **The operation of a linear layer can be summarized by the equation $y = xA^T + b$.** Here, $x$ represents the input data to the layer, and $y$ is the output. $A$ is the layer's internal weights matrix, which is initialized randomly and refined through the learning process to identify meaningful patterns. The "T" denotes transposition, a common operation in neural network mathematics.

3.  **The variable $b$ represents the bias term, an essential component that allows the linear function to be shifted or offset.** This bias provides additional flexibility, enabling the model to fit a wider range of data patterns beyond what a simple multiplication could achieve.

4.  **This linear function, akin to the familiar $y = mx + b$ from algebra, forms the basis of many neural network operations and is capable of modeling linear relationships within data.** By stacking multiple linear layers and incorporating non-linear activation functions (which are not discussed here), neural networks can approximate highly complex, non-linear patterns.

In [56]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)
# This uses matrix multiplication
linear = torch.nn.Linear(in_features=2, # in_features = size of the input's last dimension (tensor_A has 2 columns)
                         out_features=6) # out_features = size of the output's last dimension
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
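
The equation $y = xA^T + b$ can be checked against the layer itself. In `torch.nn.Linear`, the weights matrix $A$ is stored as `linear.weight` and the bias $b$ as `linear.bias`. The sketch below recreates the layer and the input so it can run on its own, then compares the layer's output with the equation computed by hand.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)

# A has shape (out_features, in_features) = (6, 2), so A^T has shape (2, 6)
print(linear.weight.shape)  # torch.Size([6, 2])
print(linear.bias.shape)    # torch.Size([6])

# y = x A^T + b, computed by hand: (3, 2) @ (2, 6) + (6,) -> (3, 6)
manual_output = torch.matmul(x, linear.weight.T) + linear.bias
layer_output = linear(x)

print(torch.allclose(manual_output, layer_output))  # True
```

The transpose is what makes the inner dimensions line up, which is the same fix used for `tensor_B` earlier.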


## Reshaping, stacking, squeezing and unsqueezing

- Often you'll want to reshape or change the dimensions of your tensors without actually changing the values inside them.

- To do so, some popular methods are:
    ![image.png](attachment:image.png)

- Why do any of these?

    - Because deep learning models (neural networks) are all about manipulating tensors in some way. And because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. These methods help you make sure the right elements of your tensors are mixing with the right elements of other tensors.
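
As a rough sketch of the four operations named in the heading, the snippet below applies `reshape`, `torch.stack`, `squeeze` and `unsqueeze` to one small tensor. The tensor and variable names are chosen only for illustration.

```python
import torch

x = torch.arange(1., 8.)  # tensor([1., 2., 3., 4., 5., 6., 7.]), shape [7]

# reshape: same 7 values, arranged as 1 row of 7
reshaped = x.reshape(1, 7)
print(reshaped.shape)  # torch.Size([1, 7])

# stack: join copies of x along a new dimension
stacked = torch.stack([x, x], dim=0)
print(stacked.shape)  # torch.Size([2, 7])

# squeeze: remove all dimensions of size 1
squeezed = reshaped.squeeze()
print(squeezed.shape)  # torch.Size([7])

# unsqueeze: add a dimension of size 1 at the given position
unsqueezed = x.unsqueeze(dim=0)
print(unsqueezed.shape)  # torch.Size([1, 7])
```

In every case the values stay the same and only their arrangement changes, which is what lets you line shapes up for operations such as matrix multiplication.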