There are two main ways of performing multiplication in neural networks and deep learning:
- Element-wise multiplication
- Matrix multiplication (dot product)
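
To make the difference between the two concrete, here is a minimal sketch on a pair of small 2-D tensors. The tensors `a` and `b` are hypothetical examples chosen for illustration; they are not part of the original notebook.

```python
import torch

# Hypothetical 2x2 example tensors (not from the original notebook)
a = torch.tensor([[1, 2],
                  [3, 4]])
b = torch.tensor([[5, 6],
                  [7, 8]])

# Element-wise: each entry is multiplied by the entry in the same position
elementwise = a * b

# Matrix multiplication: each row of a is dotted with each column of b
# (a @ b is shorthand for torch.matmul(a, b))
matrix_product = a @ b

print(elementwise)
print(matrix_product)
```

Both results have shape 2x2 here, but the values differ: the element-wise product only pairs up matching positions, while the matrix product sums over a whole row and column for every entry.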

In [1]:
import torch

In [2]:
tensor = torch.tensor([1, 2, 3])

In [3]:
# Element-wise multiplication
print(tensor, "*", tensor)
print(f"Result: {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Result: tensor([1, 4, 9])


In [4]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [5]:
tensor

tensor([1, 2, 3])

In [9]:
# Matrix multiplication by hand
print(tensor[0] * tensor[0] + tensor[1] * tensor[1] + tensor[2] * tensor[2])

tensor(14)


In [10]:
1 * 1 + 2 * 2 + 3 * 3

14

In [13]:
%%time
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: total: 0 ns
Wall time: 1 ms


In [14]:
%%time
torch.matmul(tensor, tensor)

CPU times: total: 0 ns
Wall time: 1 ms


tensor(14)
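
With only three elements, both timings above round to the same value, so they do not separate the two approaches. The sketch below repeats the comparison on a larger tensor, using `time.perf_counter` instead of `%%time` so it also runs outside a notebook. The size of 10,000 elements is an arbitrary assumption, and the measured times will vary by machine.

```python
import time
import torch

# Hypothetical larger input; the size is an arbitrary choice for illustration
big = torch.rand(10_000)

# Manual dot product with a Python loop
start = time.perf_counter()
loop_value = torch.tensor(0.0)
for i in range(len(big)):
    loop_value += big[i] * big[i]
loop_seconds = time.perf_counter() - start

# The same dot product with torch.matmul
start = time.perf_counter()
matmul_value = torch.matmul(big, big)
matmul_seconds = time.perf_counter() - start

print(f"Loop:   {loop_value.item():.2f} in {loop_seconds:.6f} s")
print(f"matmul: {matmul_value.item():.2f} in {matmul_seconds:.6f} s")
```

The two values should agree up to floating-point rounding. On typical hardware the loop is expected to be much slower, because every iteration goes through the Python interpreter.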

Shape errors are among the most common errors in deep learning.

Matrix multiplication must satisfy two main rules:
1. The inner dimensions must match: `(3, 2) @ (2, 3)` works, but `(3, 2) @ (3, 2)` does not.
2. The resulting matrix has the shape of the outer dimensions: `(3, 2) @ (2, 3)` gives `(3, 3)`.
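
The two rules can be sketched as a small shape check for 2-D tensors. The helper `matmul_output_shape` is a hypothetical name introduced here for illustration; it is not a PyTorch function.

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Hypothetical helper: result shape of a 2-D matmul, or None if it would fail."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Rule 1: the inner dimensions (cols of A, rows of B) must match
    if cols_a != rows_b:
        return None
    # Rule 2: the result takes the outer dimensions (rows of A, cols of B)
    return (rows_a, cols_b)

print(matmul_output_shape((3, 2), (2, 3)))  # inner dimensions 2 and 2 match
print(matmul_output_shape((3, 2), (3, 2)))  # inner dimensions 2 and 3 do not match

# Check the prediction against an actual multiplication
result = torch.matmul(torch.rand(3, 2), torch.rand(2, 3))
print(result.shape)
```

This only covers the 2-D case used in this section; `torch.matmul` also accepts 1-D and batched inputs, which follow additional rules.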

In [15]:
torch.rand(3,2)

tensor([[0.3478, 0.1102],
        [0.7716, 0.5761],
        [0.2107, 0.4775]])

In [16]:
torch.rand(3,2).shape

torch.Size([3, 2])

In [17]:
torch.matmul(torch.rand(3,2), torch.rand(2,3))

tensor([[0.3444, 0.8336, 0.3816],
        [0.3141, 0.2131, 0.2324],
        [0.0465, 0.5476, 0.1435]])

In [18]:
torch.matmul(torch.rand(3,2), torch.rand(3,2))

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

Dealing with Shape Errors

In [20]:
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

torch.mm(tensor_A, tensor_B) # torch.mm only accepts 2-D tensors (no broadcasting); for these inputs it behaves like torch.matmul

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [21]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a **transpose**.

A **transpose** switches the axes or dimensions of a given tensor.
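
As a minimal sketch of what that means for a 2-D tensor: the element at row `i`, column `j` moves to row `j`, column `i`. The tensor `m` below is a hypothetical example, not one of the notebook's tensors.

```python
import torch

# Hypothetical example tensor with shape (2, 3)
m = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

m_t = m.T  # shape (3, 2)

# For a 2-D tensor, .T gives the same result as swapping dimensions 0 and 1
same_as_transpose = torch.equal(m_t, torch.transpose(m, 0, 1))

# Every element m[i, j] ends up at m_t[j, i]
elements_swapped = all(
    m[i, j].item() == m_t[j, i].item()
    for i in range(m.shape[0])
    for j in range(m.shape[1])
)

print(m.shape, m_t.shape)
print(same_as_transpose, elements_swapped)
```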

In [23]:
tensor_B.T, tensor_B

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]))

In [24]:
tensor_B.shape, tensor_B.T.shape

(torch.Size([3, 2]), torch.Size([2, 3]))

In [1]:
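# The matrix multiplication operation works when tensor_B is transposed.
# torch, tensor_A and tensor_B are defined again here so the cell runs on a fresh kernel.
import torch

tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

print(f"Original shapes: tensor_A: {tensor_A.shape}, tensor_B: {tensor_B.shape}")
print(f"New shapes: tensor_A: {tensor_A.shape}, tensor_B: {tensor_B.T.shape}")
print(f"Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match")
print("Output:")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"Output shape: {output.shape}")

Original shapes: tensor_A: torch.Size([3, 2]), tensor_B: torch.Size([3, 2])
New shapes: tensor_A: torch.Size([3, 2]), tensor_B: torch.Size([2, 3])
Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match
Output:
tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])
Output shape: torch.Size([3, 3])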