<a href="https://colab.research.google.com/github/Kavitesh/projects/blob/main/pytorch%5Cmatrix_multiplication.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


### Matrix multiplication

For a visual explanation, see http://matrixmultiplication.xyz/.

PyTorch implements matrix multiplication in the [`torch.matmul()`](https://pytorch.org/docs/stable/generated/torch.matmul.html) function.

The two main rules to remember for matrix multiplication are:

1. The **inner dimensions** must match:
   * `(3, 2) @ (3, 2)` won't work
   * `(2, 3) @ (3, 2)` will work
   * `(3, 2) @ (2, 3)` will work
2. The resulting matrix has the shape of the **outer dimensions**:
   * `(2, 3) @ (3, 2)` -> `(2, 2)`
   * `(3, 2) @ (2, 3)` -> `(3, 3)`
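
The two rules can be checked directly on tensors of the shapes listed above. This is a minimal sketch: the names `a` and `b` are illustrative, and the values are random because only the shapes matter here.

```python
import torch

# Illustrative tensors; only their shapes matter here
a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match (3 and 3), so the result takes the outer dimensions
print((a @ b).shape)  # torch.Size([2, 2])
print((b @ a).shape)  # torch.Size([3, 3])

# (3, 2) @ (3, 2): inner dimensions differ (2 and 3), so PyTorch raises a RuntimeError
try:
    b @ b
except RuntimeError as error:
    print(f"Shape mismatch: {error}")
```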


> **Note:** `@` is Python's matrix multiplication operator.
> For two vectors, this operation is the **dot product**; the term is sometimes used loosely for matrix multiplication in general.


The difference between element-wise multiplication and matrix multiplication is that matrix multiplication also adds the products together.

For our `tensor` variable with values `[1, 2, 3]`:

| Operation | Calculation | Code |
| ----- | ----- | ----- |
| **Element-wise multiplication** | `[1*1, 2*2, 3*3]` = `[1, 4, 9]` | `tensor * tensor` |
| **Matrix multiplication** | `1*1 + 2*2 + 3*3` = `14` | `tensor.matmul(tensor)` |
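
Both rows of the table can be reproduced in a few lines. The sketch below uses the same `[1, 2, 3]` values; the name `elementwise` is introduced only for illustration.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Element-wise: multiply the values at matching positions
elementwise = tensor * tensor
print(elementwise)            # tensor([1, 4, 9])

# Summing the element-wise products gives the matrix multiplication result
print(elementwise.sum())      # tensor(14)
print(tensor.matmul(tensor))  # tensor(14)
```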

In [8]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])

In [16]:
%%time
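# Matrix multiplication by hand: sum the element-wise products
# (prefer built-in tensor operations over Python for loops; loops get slow as tensors grow,
# although on a 3-element tensor like this one the timings are dominated by fixed overhead)
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]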
value

CPU times: user 278 µs, sys: 0 ns, total: 278 µs
Wall time: 284 µs


tensor(14)

In [13]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 426 µs, sys: 0 ns, total: 426 µs
Wall time: 333 µs


tensor(14)

In [15]:
%%time
tensor @ tensor

CPU times: user 768 µs, sys: 0 ns, total: 768 µs
Wall time: 597 µs


tensor(14)
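
With only three elements, the timings above mostly measure fixed overhead. A rough way to compare the two approaches on a larger input is sketched below; the tensor `big` and its size of 1000 are assumptions chosen for illustration, and the measured times will vary by machine.

```python
import time
import torch

# Hypothetical larger input: the integers 0..999
big = torch.arange(1000)

# Sum of element-wise products with a Python for loop
start = time.perf_counter()
loop_value = 0
for i in range(len(big)):
    loop_value += big[i] * big[i]
loop_seconds = time.perf_counter() - start

# The same computation as a single matrix multiplication
start = time.perf_counter()
matmul_value = big @ big  # same operation as torch.matmul(big, big)
matmul_seconds = time.perf_counter() - start

print(f"for loop: {loop_seconds:.6f} s")
print(f"matmul:   {matmul_seconds:.6f} s")
print(torch.equal(loop_value, matmul_value))  # both approaches give the same value
```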

### Transpose

We can make matrix multiplication work between two matrices of the same shape, such as `(3, 2)` and `(3, 2)`, by making their inner dimensions match.

One way to do this is with a **transpose**, which switches the dimensions of a given tensor.

* `torch.transpose(input, dim0, dim1)` - where `input` is the desired tensor to transpose and `dim0` and `dim1` are the dimensions to be swapped.
* `tensor.T` - where `tensor` is the desired tensor to transpose.
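
For a 2-D tensor, the two options above give the same result. A minimal sketch, using an illustrative tensor `x`:

```python
import torch

x = torch.tensor([[1, 2],
                  [3, 4],
                  [5, 6]])  # shape (3, 2)

# Swap dimension 0 with dimension 1
swapped = torch.transpose(x, 0, 1)
print(swapped.shape)              # torch.Size([2, 3])

# For a 2-D tensor, .T gives the same transpose
print(torch.equal(swapped, x.T))  # True
```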

In [18]:
# The shapes must be compatible: the inner dimensions need to match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B) # (this will error)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)
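
One way to avoid this error is to compare the inner dimensions before multiplying. The helper `can_matmul` below is hypothetical (it is not part of PyTorch) and only handles the 2-D case used in this section.

```python
import torch

def can_matmul(a: torch.Tensor, b: torch.Tensor) -> bool:
    """Hypothetical helper: True if two 2-D tensors have matching inner dimensions."""
    return a.ndim == 2 and b.ndim == 2 and a.shape[1] == b.shape[0]

# Illustrative tensors with the same shapes as tensor_A and tensor_B
left = torch.zeros(3, 2)
right = torch.zeros(3, 2)

print(can_matmul(left, right))    # False: (3, 2) @ (3, 2)
print(can_matmul(left, right.T))  # True:  (3, 2) @ (2, 3)
```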

In [19]:
# View tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [26]:
# The operation works when tensor_B is transposed
print(f"Shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape} , tensor_B.T = {tensor_B.T.shape}\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)

Shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2]) , tensor_B.T = torch.Size([2, 3])

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
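
Each element of this output is the dot product of one row of `tensor_A` with one row of `tensor_B` (which becomes a column of `tensor_B.T`). The sketch below checks the top-left element and shows that transposing `tensor_A` instead also gives a valid multiplication, with a different output shape.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]], dtype=torch.float32)

output = tensor_A @ tensor_B.T

# Top-left element: row 0 of tensor_A dotted with row 0 of tensor_B (1*7 + 2*10)
print(torch.dot(tensor_A[0], tensor_B[0]))  # tensor(27.)
print(output[0, 0])                         # tensor(27.)

# Transposing tensor_A instead gives (2, 3) @ (3, 2) -> (2, 2)
print(tensor_A.T @ tensor_B)
# tensor([[ 76., 103.],
#         [100., 136.]])
```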
