In [64]:
import torch

print(torch.__version__)


2.2.2


# Tensor creation

https://pytorch.org/docs/stable/tensors.html

A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

In [65]:
# Scalar
scalar = torch.tensor(7)
print(f"""
The Scalar: {scalar}
The tensor dimension scalar.ndim: {scalar.ndim}

We can get the python number within a tensor with the item() method: {scalar.item()},
although this only works with one-element tensors
""")




The Scalar: 7
The tensor dimension scalar.ndim: 0

We can get the python number within a tensor with the item() method: 7,
although this only works with one-element tensors
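
To illustrate the one-element restriction, here is a minimal sketch that calls `item()` on a two-element tensor and then falls back to `tolist()`. The variable names are made up for this example, and both `RuntimeError` and `ValueError` are caught on the assumption that the exact exception type may differ between PyTorch versions.

```python
import torch

one_element = torch.tensor([3.5])      # one element, even though ndim is 1
two_elements = torch.tensor([1, 2])

value = one_element.item()             # works: the tensor holds exactly one element

try:
    two_elements.item()                # fails: more than one element
    item_failed = False
except (RuntimeError, ValueError):
    item_failed = True

as_list = two_elements.tolist()        # tolist() converts a tensor of any size
print(value, item_failed, as_list)
```

For tensors with more than one element, `tolist()` is usually the simpler way to get plain Python values.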



In [66]:
# Vectors
vector = torch.tensor([7, 7])
print(f"""
The Vector: {vector}
The vector dimension: {vector.ndim}
The shape tells you how the elements inside the tensor are arranged.
For {vector}, the shape is {vector.shape}
""")


The Vector: tensor([7, 7])
The vector dimension: 1
The shape tells you how the elements inside the tensor are arranged.
For tensor([7, 7]), the shape is torch.Size([2])



In [67]:
# Matrix
MATRIX = torch.tensor(
    [[7, 8],
     [9, 10]]
)
print(f"""
The Matrix: 
{MATRIX}

You notice we have two `[`, so the dimension should be 2.
MATRIX.ndim = {MATRIX.ndim},
We have 2 rows and 2 columns, so the shape is MATRIX.shape = {MATRIX.shape}
""")


The Matrix: 
tensor([[ 7,  8],
        [ 9, 10]])

You notice we have two `[`, so the dimension should be 2.
MATRIX.ndim = 2,
We have 2 rows and 2 columns, so the shape is MATRIX.shape = torch.Size([2, 2])



In [68]:
# Tensor
TENSOR = torch.tensor(
    [[[1,2,3],
      [3,6,9],
      [2,4,5]]]
)
print(f"""
The Tensor: {TENSOR}
The tensor dimension: {TENSOR.ndim}
The shape of the tensor is {TENSOR.shape}
""")


The Tensor: tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])
The tensor dimension: 3
The shape of the tensor is torch.Size([1, 3, 3])



In [69]:
torch.backends.mps.is_available()

True

# Matrix multiplication

In [70]:

A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])

result = torch.matmul(A, B)  # You can also use A @ B
print(result)

tensor([[19, 22],
        [43, 50]])
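
Matrix multiplication is easy to confuse with element-wise multiplication, so here is a small sketch that contrasts the two on the same `A` and `B`. The names `elementwise` and `matrix_product` are chosen only for this illustration.

```python
import torch

A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])

elementwise = A * B       # multiplies values at matching positions
matrix_product = A @ B    # rows of A dotted with columns of B (same as torch.matmul)

print(elementwise)
print(matrix_product)
```

The `*` operator requires broadcastable shapes, while `@` requires matching inner dimensions.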


## Dimensionality
Let's create matrices that are compatible for multiplication. Given matrices $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{p \times q}$, the product $AB$ is defined only when the number of columns of $A$ equals the number of rows of $B$ ($m = p$). In other words, the inner dimensions must match, and the result takes the outer dimensions, $n \times q$.
For example:

$$
A \in \mathbb{R}^{2 \times 3}, \quad B \in \mathbb{R}^{3 \times 4}
$$

Multiplying these matrices produces a matrix $C \in \mathbb{R}^{2 \times 4}$.

In [71]:
A  = torch.rand(2,3)
B = torch.rand(3,4)

C = torch.matmul(A, B)
print(f"""
C=
{C}      
C has a shape of {C.shape}
C has a dimension of {C.ndim}

""")


C=
tensor([[0.6103, 0.2330, 0.7837, 0.3185],
        [0.9458, 0.6058, 0.7217, 0.7832]])      
C has a shape of torch.Size([2, 4])
C has a dimension of 2




In [72]:
# 3D Tensors and Batch Dimensions
# When both tensors have more than 2 dimensions (for example 3D), torch.matmul treats the last two dimensions as matrices and all leading dimensions as batch dimensions, and it applies the matrix multiplication rule to each pair of matrices in the batch.

# For example, the next matrix can be thought of as 2 matrices of size  (3, 4)  stacked along the first dimension.
A = torch.rand(2, 3, 4)
# And the next can be thought of as 2 matrices of size  (4, 5)  stacked along the first dimension.
B = torch.rand(2, 4, 5)
# PyTorch performs the matrix multiplication independently for each pair of matrices in the batch (i.e., for each slice along the first dimension).
result = torch.matmul(A, B)
# Since there are 2 matrices in the batch (from the first dimension), the final result has shape  (2, 3, 5) .
print(result.shape)
# The resulting tensor shape will be (batch_size, m, p)

torch.Size([2, 3, 5])
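
To make the "independently for each pair" claim concrete, this sketch multiplies each slice separately and compares the stacked result with the batched call. The seed and the name `per_slice` are assumptions made for the example.

```python
import torch

torch.manual_seed(0)
A = torch.rand(2, 3, 4)
B = torch.rand(2, 4, 5)

batched = torch.matmul(A, B)

# Multiply each pair of matrices on its own, then stack along a new first dimension
per_slice = torch.stack([A[i] @ B[i] for i in range(A.shape[0])], dim=0)

print(batched.shape, per_slice.shape)
print(torch.allclose(batched, per_slice))
```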


## Understanding Broadcasting
```
A = torch.rand(1, 3, 4, 5)
B = torch.rand(2, 1, 5, 6)
```

Suppose we want to multiply A and B. For tensors with more than two dimensions, PyTorch treats the leading (leftmost) dimensions as batch dimensions and applies the matrix multiplication rule to the last two dimensions.
In this case, the matrix multiplication is performed on the shapes (4, 5) from A and (5, 6) from B, so each product has shape (4, 6).
The batch dimensions, (1, 3) from A and (2, 1) from B, are not equal.

Here PyTorch broadcasts the batch dimensions, that is, it "expands" or "stretches" them until they match. Two dimensions are broadcastable when they are equal or when one of them is 1, which is the case in our exercise.

A has its first dimension expanded from 1 to 2, so its batch dimensions become (2, 3). B has its second dimension expanded from 1 to 3, so its batch dimensions also become (2, 3).
With compatible batch shapes, the matrices can be multiplied, giving a result of shape (2, 3, 4, 6). PyTorch takes care of this inside the matmul operation.
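
The expansion described above can be reproduced by hand. The sketch below uses `torch.broadcast_shapes` to compute the common batch shape and `Tensor.expand` to stretch each tensor explicitly. The names `batch_shape`, `A_expanded` and `B_expanded` are illustrative only.

```python
import torch

A = torch.rand(1, 3, 4, 5)
B = torch.rand(2, 1, 5, 6)

# Batch dimensions are everything except the last two
batch_shape = torch.broadcast_shapes(A.shape[:-2], B.shape[:-2])
print(batch_shape)

# Stretch the size-1 dimensions explicitly (expand does not copy data)
A_expanded = A.expand(*batch_shape, 4, 5)
B_expanded = B.expand(*batch_shape, 5, 6)

# matmul on the original tensors gives the same result as on the expanded ones
same = torch.allclose(torch.matmul(A, B), torch.matmul(A_expanded, B_expanded))
print(A_expanded.shape, B_expanded.shape, same)
```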





In [73]:
A = torch.rand(1, 3, 4, 5)
B = torch.rand(2, 1, 5, 6)

# Perform matrix multiplication
result = torch.matmul(A, B)

print("Shape of A:", A.shape)
print("Shape of B:", B.shape)
print("Shape of result:", result.shape)  

Shape of A: torch.Size([1, 3, 4, 5])
Shape of B: torch.Size([2, 1, 5, 6])
Shape of result: torch.Size([2, 3, 4, 6])


To see how this expansion works, change the first dimension of A from 1 to 2:
``` 
A = torch.rand(2, 3, 4, 5)
B = torch.rand(2, 1, 5, 6)
```

This still works because the first dimensions now match (2 and 2) and B has size 1 in its second dimension. Only B needs to be expanded, so its first two dimensions become (2, 3).

In [74]:
A = torch.rand(2, 3, 4, 5)
B = torch.rand(2, 1, 5, 6)

# Perform matrix multiplication
result = torch.matmul(A, B)

print("Shape of A:", A.shape)
print("Shape of B:", B.shape)
print("Shape of result:", result.shape)  

Shape of A: torch.Size([2, 3, 4, 5])
Shape of B: torch.Size([2, 1, 5, 6])
Shape of result: torch.Size([2, 3, 4, 6])


But if we change to something like:
```  
A = torch.rand(2, 3, 4, 5)
B = torch.rand(2, 2, 5, 6)

```

Now dimension 1 has size 3 in A and size 2 in B. Neither is 1, so the dimensions cannot be broadcast and the operation fails:

In [75]:
A = torch.rand(2, 3, 4, 5)
B = torch.rand(2, 2, 5, 6)

# Perform matrix multiplication
result = torch.matmul(A, B)

print("Shape of A:", A.shape)
print("Shape of B:", B.shape)
print("Shape of result:", result.shape)  


RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 1
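
One way to catch this kind of error early is to check the shapes before multiplying. The helper below, `matmul_shape`, is hypothetical (it is not part of PyTorch) and only handles tensors with at least two dimensions. It is a sketch of the rule, not a replacement for `torch.matmul`'s own checks.

```python
import torch

def matmul_shape(a, b):
    """Hypothetical helper: predict the matmul result shape for tensors with ndim >= 2."""
    if a.shape[-1] != b.shape[-2]:
        raise ValueError(f"inner dimensions differ: {a.shape[-1]} vs {b.shape[-2]}")
    # Raises an error if the batch dimensions cannot be broadcast together
    batch = torch.broadcast_shapes(a.shape[:-2], b.shape[:-2])
    return tuple(batch) + (a.shape[-2], b.shape[-1])

A = torch.rand(2, 3, 4, 5)
print(matmul_shape(A, torch.rand(2, 1, 5, 6)))   # batch dims (2, 3) and (2, 1) broadcast

try:
    matmul_shape(A, torch.rand(2, 2, 5, 6))      # batch dims (2, 3) and (2, 2) do not
    compatible = True
except (RuntimeError, ValueError) as err:
    compatible = False
    print("Not broadcastable:", err)
```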

# Common errors in deep learning (shape errors)

Now that we understand the basic shape requirements for matrix multiplication, let's look at the following error examples.

In [None]:
# The inner dimensions must match for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]], dtype=torch.float32)

print(f"shape of A: {tensor_A.shape}")
print(f"shape of B: {tensor_B.shape}")
print(f"We notice that the internal dimensions of these tensors are 2 and 3, which means that we CAN'T perform matrix multiplication between these two tensors.")
torch.matmul(tensor_A, tensor_B) # (this will error)

shape of A: torch.Size([3, 2])
shape of B: torch.Size([3, 2])
We notice that the internal dimensions of these tensors are 2 and 3, which means that we CAN'T perform matrix multiplication between these two tensors.


RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

This is a good opportunity to try a tensor shape manipulation technique such as `transpose`.
The function is `torch.transpose(input, dim0, dim1)`, which swaps dimensions `dim0` and `dim1` of `input`.


In [None]:
# Create a 2D tensor
tensor = torch.tensor([[1, 2, 3],
                       [4, 5, 6]])
# Original shape
print("Original Tensor:")
print(tensor)
print(f"Shape: {tensor.shape}")

# Transpose the dimensions (swap rows and columns)
transposed_tensor = torch.transpose(
    tensor,  # The input tensor has shape  (2, 3) , with 2 rows and 3 columns
    0,       # Swaps dimension 0 (rows)
    1        # with dimension 1 (columns)
)

# This will result in a tensor of shape (3,2)

print("\nTransposed Tensor:")
print(transposed_tensor)
print(f"Shape: {transposed_tensor.shape}")

Original Tensor:
tensor([[1, 2, 3],
        [4, 5, 6]])
Shape: torch.Size([2, 3])

Transposed Tensor:
tensor([[1, 4],
        [2, 5],
        [3, 6]])
Shape: torch.Size([3, 2])


In [None]:
# We can also do this with the .T attribute. Let's try it on our previous tensors: tensor_A and tensor_B
# View tensor_A and tensor_B.T
print(tensor_A)
print(f"shape of tensor_A: {tensor_A.shape}")
print(tensor_B.T)
print(f"shape of tensor_B.T: {tensor_B.T.shape}")

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
shape of tensor_A: torch.Size([3, 2])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
shape of tensor_B.T: torch.Size([2, 3])


In [None]:
# We now get the same internal dimensions, so we can perform matrix multiplication
torch.matmul(tensor_A, tensor_B.T)
# torch.mm(tensor_A, tensor_B.T) gives the same result here; torch.mm only accepts 2D matrices and does not broadcast
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
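
Which tensor gets transposed matters: both choices make the inner dimensions match, but they produce different output shapes. The sketch below rebuilds the two tensors so it can run on its own. The names `out_1` and `out_2` are illustrative.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]], dtype=torch.float32)

out_1 = tensor_A @ tensor_B.T   # (3, 2) @ (2, 3) -> (3, 3)
out_2 = tensor_A.T @ tensor_B   # (2, 3) @ (3, 2) -> (2, 2)

print(out_1.shape, out_2.shape)
```

Choosing between the two depends on which dimension should survive in the result, for example samples versus features.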

# A practical example of matrix multiplication

Let's review the `torch.nn.Linear()` module, also known as a feed-forward layer or fully connected layer. It implements a matrix multiplication between an input `x` and a weights matrix `W`.

You may recognize this as the function used to compute the output of a neural network layer:

$$ y = x\cdot{W^T} + b $$

Where:

- `x` is the input to the layer (deep learning models stack layers such as `torch.nn.Linear()` on top of each other).
- `W` is the weights matrix created by the layer. It starts out as random numbers that are adjusted as the neural network learns to better represent patterns in the data (the "`T`" indicates that the weights matrix is transposed).
- `b` is the bias term used to slightly offset the weights and inputs.
- `y` is the output (a manipulation of the input in the hope of discovering patterns in it).

In [None]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)

# Let's begin by defining our input tensor `x`
# Our input tensor has a shape of (3, 2).
x = tensor_A
print(f"x: {x}\nx_shape: {x.shape} this means that we have 3 samples with 2 features each\n\n")

# `torch.nn.Linear` creates:

# The weight matrix `W` with shape (out_features=6, in_features=2)
#   Let's pause to analyze this initialization shape: (6, 2)
#   The first dimension (6) is the number of rows of W; each row corresponds to one neuron (output feature) of the layer
#   The second dimension (2) is the number of columns of W; each column corresponds to one input feature, so the last dimension of the input must match this value


# A bias vector b of shape (out_features=6) is also initialized
linear = torch.nn.Linear(
                out_features=6, # out_features = size of each output sample (last dimension of the result)
                in_features=2, # in_features = size of each input sample (must match the last dimension of the input)
                bias=True
) 

# Weight matrix
W = linear.weight.data
print(f"Weight matrix W:\n{W}\n\nWeight matrix shape: {W.shape}")
print(f"When transposed the weight matrix will be:\n{W.T} \n with shape: {W.T.shape}")

# Get bias vector 
bias = linear.bias.data 
print(f"Bias vector b:\n{bias}\n\nBias vector shape: {bias.shape}")




x: tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
x_shape: torch.Size([3, 2]) this means that we have 3 samples with 2 features each


Weight matrix W:
tensor([[ 0.5406,  0.5869],
        [-0.1657,  0.6496],
        [-0.1549,  0.1427],
        [-0.3443,  0.4153],
        [ 0.6233, -0.5188],
        [ 0.6146,  0.1323]])

Weight matrix shape: torch.Size([6, 2])
When transposed the weight matrix will be:
tensor([[ 0.5406, -0.1657, -0.1549, -0.3443,  0.6233,  0.6146],
        [ 0.5869,  0.6496,  0.1427,  0.4153, -0.5188,  0.1323]]) 
 with shape: torch.Size([2, 6])
Bias vector b:
tensor([ 0.5224,  0.0958,  0.3410, -0.0998,  0.5451,  0.1045])

Bias vector shape: torch.Size([6])


In [None]:
# Now we will perform the operation x·W^T + b
# Shapes: (3,2)·(6,2)^T + (6) = (3,2)·(2,6) + (6) = (3,6)
output = linear(x)
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")


# We can interpret the output as follows:
print(f"Output shape: {output.shape}")
print(f"First dimension of shape:\n{output.shape[0]} = Number of samples in the batch.\nEach row is computed independently by applying the same weights matrix W and bias to each input sample")
print(f"Second dimension of shape:\n{output.shape[1]} = Number of output features/neurons in the layer\nEach column corresponds to the activation of a specific neuron for a given input sample")

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
Output shape: torch.Size([3, 6])
First dimension of shape:
3 = Number of samples in the batch.
Each row is computed independently by applying the same weights matrix W and bias to each input sample
Second dimension of shape:
6 = Number of output features/neurons in the layer
Each column corresponds to the activation of a specific neuron for a given input sample
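
As a check on the formula $y = x\cdot{W^T} + b$, the sketch below computes the same output by hand from the layer's `weight` and `bias` and compares it with the layer's own result. It rebuilds `x` and the layer so it is self-contained. The names `manual` and `matches` are illustrative.

```python
import torch

torch.manual_seed(42)
x = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)
linear = torch.nn.Linear(in_features=2, out_features=6)

# The same computation written out by hand: y = x @ W^T + b
# The bias of shape (6,) is broadcast across the 3 rows
manual = x @ linear.weight.T + linear.bias

matches = torch.allclose(linear(x), manual, atol=1e-6)
print(manual.shape, matches)
```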


# Reshaping, stacking, squeezing and unsqueezing

We will review some common functions for reshaping

## Reshape: `torch.reshape(input, shape)`
Preserves Total Number of Elements:
- The new shape must maintain the same total number of elements as the original tensor.
- For example:
    - A tensor of shape  `(2, 3)`  has  $2 \times 3 = 6$  elements.
    - You can reshape it to  (1, 6) ,  (6, 1) , or any shape that results in  6  elements.


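Returns a View When Possible:
- If the requested shape is compatible with the tensor's memory layout (for example, when the tensor is contiguous), the reshaped tensor shares the same underlying data as the original tensor and no new memory is allocated.
- In that case, any change to the reshaped tensor is reflected in the original tensor, and vice versa. Otherwise `torch.reshape` returns a copy, so code should not rely on either behavior.

A simple way of understanding this: as long as the product of the new shape equals the number of elements, the tensor can take that shape. A tensor with 6 elements can be (2, 3), (3, 2), and so on.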
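
Whether `reshape` shares memory with its input depends on the memory layout. The following sketch compares data pointers for a contiguous tensor and for its transpose. The variable names are made up for the example.

```python
import torch

base = torch.arange(6)                 # contiguous, shape (6,)
as_view = base.reshape(2, 3)           # no copy needed
shares_memory = as_view.data_ptr() == base.data_ptr()

transposed = as_view.T                 # shape (3, 2), not contiguous
as_copy = transposed.reshape(6)        # elements must be reordered, so a copy is made
copied = as_copy.data_ptr() != base.data_ptr()

print(shares_memory, transposed.is_contiguous(), copied)
print(as_copy)
```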

In [85]:
# Lets create a tensor with 6 elements
x = torch.arange(1., 7.)  # Shape: (6,)
print("Original Tensor:", x)
print("Original Shape:", x.shape)

# we can reshape it to a 2x3 tensor
reshaped1 = x.reshape(2, 3)
# or a 3x2 tensor
reshaped2 = x.reshape(3, 2)

print(f"\nReshaped (2x3):\n{reshaped1}")
print(f"\nReshaped (3x2):\n{reshaped2}")

Original Tensor: tensor([1., 2., 3., 4., 5., 6.])
Original Shape: torch.Size([6])

Reshaped (2x3):
tensor([[1., 2., 3.],
        [4., 5., 6.]])

Reshaped (3x2):
tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])


In [81]:
# In this example we will add an extra dimension

x = torch.arange(1., 8.)
print(f"x: \n{x}\nShape: {x.shape}\nsize: {x.size()}")

# Add an extra dimension
x_reshaped = x.reshape(1, 7)
print(f"x_reshaped: \n{x_reshaped}\nShape: {x_reshaped.shape}\nsize: {x_reshaped.size()}")


x: 
tensor([1., 2., 3., 4., 5., 6., 7.])
Shape: torch.Size([7])
size: torch.Size([7])
x_reshaped: 
tensor([[1., 2., 3., 4., 5., 6., 7.]])
Shape: torch.Size([1, 7])
size: torch.Size([1, 7])


In [86]:
# You can specify one dimension as -1, and PyTorch will infer its size so that the total number of elements remains constant

x = torch.arange(12)  # Shape: (12,)
reshaped = x.reshape(3, -1)  # Shape: (3, 4)
print(f"x: \n{x}\nShape: {x.shape}")
print(f"reshaped:\n{reshaped}\nShape: {reshaped.shape}")

x: 
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Shape: torch.Size([12])
reshaped:
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
Shape: torch.Size([3, 4])


## `Tensor.view(shape)`
Returns a view of the original tensor in a different shape but shares the same data as the original tensor.

In [93]:
x = torch.arange(1., 8.)  # Shape: (7,)

print(f"x: \n{x}\nShape: {x.shape}")
z = x.view(1,7)
print(f"z: \n{z}\nShape: {z.shape}")

x: 
tensor([1., 2., 3., 4., 5., 6., 7.])
Shape: torch.Size([7])
z: 
tensor([[1., 2., 3., 4., 5., 6., 7.]])
Shape: torch.Size([1, 7])


In [94]:
# Remember though, Tensor.view() only creates a new view of the same underlying data.
# So changing the view changes the original tensor too
z[:, 0] = 5 
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))
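
Because `view` never copies, it only works when the requested shape fits the existing memory layout. The sketch below shows it failing on a transposed (non-contiguous) tensor and one common workaround. The names `t`, `view_worked` and `flat` are illustrative.

```python
import torch

t = torch.arange(6).reshape(2, 3).T    # shape (3, 2), non-contiguous

try:
    t.view(6)                          # cannot be expressed without copying
    view_worked = True
except RuntimeError:
    view_worked = False

flat = t.contiguous().view(6)          # copy into contiguous memory first
print(view_worked, flat)
```

Calling `t.reshape(6)` gives the same values as `flat`, since `reshape` falls back to a copy when a view is not possible.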

## Stack: `torch.stack(tensors, dim=0)`
Joins a sequence of tensors along a new dimension (`dim`). All tensors must have the same shape.

In [101]:
# Stacking in dimension 0 will place the stacked tensors as rows
x_stacked_dim0 = torch.stack([x, x, x], dim=0)
print(f"original x: \n{x}\nShape: {x.shape}\n")
print(f"x_stacked: \n{x_stacked_dim0}\nShape: {x_stacked_dim0.shape}")

original x: 
tensor([5., 2., 3., 4., 5., 6., 7.])
Shape: torch.Size([7])

x_stacked: 
tensor([[5., 2., 3., 4., 5., 6., 7.],
        [5., 2., 3., 4., 5., 6., 7.],
        [5., 2., 3., 4., 5., 6., 7.]])
Shape: torch.Size([3, 7])


In [102]:
# Stacking in dimension 1 will place the stacked tensors as columns
x_stacked_dim1 = torch.stack([x, x, x], dim=1)
print(f"original x: \n{x}\nShape: {x.shape}\n")
print(f"x_stacked: \n{x_stacked_dim1}\nShape: {x_stacked_dim1.shape}")

original x: 
tensor([5., 2., 3., 4., 5., 6., 7.])
Shape: torch.Size([7])

x_stacked: 
tensor([[5., 5., 5.],
        [2., 2., 2.],
        [3., 3., 3.],
        [4., 4., 4.],
        [5., 5., 5.],
        [6., 6., 6.],
        [7., 7., 7.]])
Shape: torch.Size([7, 3])
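
`torch.stack` is often confused with `torch.cat`. As a brief sketch of the difference: `stack` adds a new dimension, while `cat` extends an existing one. The tensor `x` is rebuilt here so the example runs on its own.

```python
import torch

x = torch.arange(1., 8.)                       # shape (7,)

stacked = torch.stack([x, x, x], dim=0)        # new dimension: (3, 7)
concatenated = torch.cat([x, x, x], dim=0)     # existing dimension grows: (21,)

print(stacked.shape, concatenated.shape)
```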


## torch.squeeze(input)
Removes all dimensions of size 1 from `input`.

In [106]:
x = torch.rand(1, 7)
print(f"x: \n{x}\nShape: {x.shape}")

x_squeezed = torch.squeeze(x)
print(f"x_squeezed: \n{x_squeezed}\nShape: {x_squeezed.shape}")

x: 
tensor([[0.6135, 0.0086, 0.7622, 0.6847, 0.5212, 0.7146, 0.5006]])
Shape: torch.Size([1, 7])
x_squeezed: 
tensor([0.6135, 0.0086, 0.7622, 0.6847, 0.5212, 0.7146, 0.5006])
Shape: torch.Size([7])
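
`torch.squeeze` also accepts a `dim` argument to remove a single dimension. A short sketch on a tensor with two size-1 dimensions (the shape is chosen only for illustration):

```python
import torch

x = torch.rand(1, 7, 1)

all_removed = torch.squeeze(x)           # every size-1 dimension is removed
first_removed = torch.squeeze(x, dim=0)  # only dimension 0 is removed
unchanged = torch.squeeze(x, dim=1)      # dimension 1 has size 7, so nothing changes

print(all_removed.shape, first_removed.shape, unchanged.shape)
```

Passing `dim` is safer when a dimension such as a batch of size 1 should be kept.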


## torch.unsqueeze(input, dim)

Returns `input` with a new dimension of size 1 inserted at position `dim`.

In [112]:
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"x_unsqueezed: \n{x_unsqueezed}\nShape: {x_unsqueezed.shape}")

x_unsqueezed: 
tensor([[0.6135, 0.0086, 0.7622, 0.6847, 0.5212, 0.7146, 0.5006]])
Shape: torch.Size([1, 7])
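
The position given by `dim` decides where the new dimension lands. This sketch shows a few positions, plus the equivalent indexing with `None`. The names are illustrative.

```python
import torch

v = torch.arange(7)                    # shape (7,)

row = v.unsqueeze(dim=0)               # (1, 7)
col = v.unsqueeze(dim=1)               # (7, 1)
col_neg = v.unsqueeze(dim=-1)          # (7, 1), counting from the end
row_idx = v[None, :]                   # indexing with None also inserts a dimension

print(row.shape, col.shape, col_neg.shape, row_idx.shape)
```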


## torch.permute(input, dims)

Returns a view of the original input with its dimensions permuted (rearranged) to dims.

In [117]:
# Create tensor with specific shape
x_original = torch.rand(size=(224, 225, 3))

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")


Previous shape: torch.Size([224, 225, 3])
New shape: torch.Size([3, 224, 225])
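
Since `permute` returns a view, the permuted tensor shares data with the original. The sketch below uses a small tensor (sizes chosen only for illustration) to show a write through the original appearing in the permuted view.

```python
import torch

x_original = torch.zeros(4, 5, 3)
x_permuted = x_original.permute(2, 0, 1)   # shape (3, 4, 5)

x_original[0, 0, 0] = 42.0                 # write through the original...
print(x_permuted[0, 0, 0])                 # ...and the permuted view sees it

same_storage = x_permuted.data_ptr() == x_original.data_ptr()
print(same_storage, x_permuted.is_contiguous())
```

The permuted tensor is not contiguous, so call `.contiguous()` before using `view` on it.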


# Selecting data from tensors

In [118]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [121]:
# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}") 
print(f"Second square bracket: {x[0][0]}") 
print(f"Third square bracket: {x[0][0][0]}")

First square bracket:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Second square bracket: tensor([1, 2, 3])
Third square bracket: 1


In [122]:
# Get all values of 0th dimension and the 0 index of 1st dimension
x[:, 0]

tensor([[1, 2, 3]])

In [123]:
# Get all values of 0th & 1st dimensions but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [124]:
# Get all values of the 0 dimension but only the 1 index value of the 1st and 2nd dimension
x[:, 1, 1]

tensor([5])

In [125]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension 
x[0, 0, :] # same as x[0][0]

tensor([1, 2, 3])
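
A few more selections on the same tensor, as a sketch: an integer index removes a dimension, while a slice keeps it. The variable names are made up for this example.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

last_column = x[0, :, 2]      # index 2 of the last dimension, for every row
bottom_right = x[0, 2, 2]     # a single element, returned as a 0-dimensional tensor
kept_dims = x[:, :, 2:3]      # slicing keeps the dimension: shape (1, 3, 1)

print(last_column, bottom_right, kept_dims.shape)
```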