# Torch matrix multiplication

In [8]:
import torch

"""
torch.mm - performs a matrix multiplication without broadcasting.
It expects two 2D tensors: (n × m) @ (m × p) -> (n × p),
i.e. it works only on matrices, not on higher-dimensional tensors.

https://pytorch.org/docs/stable/generated/torch.mm.html
"""

a = torch.randn(2,5)
b = torch.randn(3,5)

torch.mm(a,b.T)

tensor([[ 2.6104,  0.5499,  0.3142],
        [-8.3252, -0.6339, -1.1291]])
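The shape rule and the "no broadcasting" claim above can be checked directly. This is a minimal sketch; the variable names `same_as_operator` and `mm_rejected_3d` are illustrative and not part of the original notebook.

```python
import torch

a = torch.randn(2, 5)
b = torch.randn(3, 5)

# (2, 5) @ (5, 3) -> (2, 3)
out = torch.mm(a, b.T)

# For 2D inputs, the @ operator gives the same result as torch.mm
same_as_operator = torch.allclose(out, a @ b.T)

# torch.mm does not broadcast: a 3D first argument is rejected
try:
    torch.mm(torch.randn(4, 2, 5), b.T)
    mm_rejected_3d = False
except RuntimeError:
    mm_rejected_3d = True
```

If batched inputs are needed, torch.matmul or torch.bmm is the usual choice instead of torch.mm.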

In [9]:
"""
torch.mul - performs an elementwise multiplication with broadcasting - (Tensor) by (Tensor or Number)
torch.mul does not perform a matrix multiplication. It broadcasts the two tensors to a common
shape and multiplies them element by element.

https://pytorch.org/docs/stable/generated/torch.mul.html
"""

torch.mul(a, b)

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 0

In [10]:
a = torch.randn(2,5)
b = torch.randn(2,5)

torch.mul(a, b)

tensor([[ 0.0039,  0.9935,  0.0681,  0.0039,  0.8029],
        [-1.0538, -0.4971, -2.3967,  1.0259,  0.3220]])
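The cells above show one shape pair that fails to broadcast and one that matches exactly. As a small sketch of the cases in between, the following shows torch.mul broadcasting a row vector, a column vector, and a plain number against a (2, 5) tensor. The tensor values are chosen here only for illustration.

```python
import torch

a = torch.ones(2, 5)
row = torch.arange(5, dtype=torch.float32)   # shape (5,)
col = torch.tensor([[10.0], [20.0]])         # shape (2, 1)

# (2, 5) * (5,): the row is repeated for each of the 2 rows
by_row = torch.mul(a, row)

# (2, 5) * (2, 1): the column is repeated across the 5 columns
by_col = torch.mul(a, col)

# (2, 5) * number: every element is scaled
by_scalar = torch.mul(a, 3)
```

In each case the result has shape (2, 5). The earlier (2, 5) by (3, 5) call fails because dimension 0 differs and neither size is 1.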

In [None]:

    torch.matmul

torch.matmul behaves differently depending on the dimensionality of its input tensors, so it is worth reading the official documentation: https://pytorch.org/docs/stable/generated/torch.matmul.html. It may perform a dot product, a matrix-vector product, a matrix-matrix product, or a batched matrix product with broadcasting.
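As a rough illustration of those modes, the sketch below calls torch.matmul with inputs of different dimensionality and records the result shapes. The tensor names and sizes are arbitrary choices for this example.

```python
import torch

v = torch.randn(4)
m = torch.randn(3, 4)
batch = torch.randn(10, 3, 4)

dot = torch.matmul(v, v)             # 1D x 1D: dot product, 0-dim tensor
mat_vec = torch.matmul(m, v)         # (3, 4) x (4,)   -> (3,)
vec_mat = torch.matmul(v, m.T)       # (4,)   x (4, 3) -> (3,)
mat_mat = torch.matmul(m, m.T)       # (3, 4) x (4, 3) -> (3, 3)
batched = torch.matmul(batch, m.T)   # (10, 3, 4) x (4, 3) -> (10, 3, 3)
```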

Consider the product of:

tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)

This is a batched product: tensor2 is multiplied with each of the 10 matrices in tensor1, giving a result of shape (10, 3). The following simple example illustrates the idea:

import torch

# shape: 3x1x3
a = torch.tensor([[[1., 2., 3.]], [[3., 4., 5.]], [[6., 7., 8.]]])
# shape: 3
b = torch.tensor([1., 10., 100.])
r1 = torch.matmul(a, b)

r2 = torch.stack((
    torch.matmul(a[0], b),
    torch.matmul(a[1], b),
    torch.matmul(a[2], b),
))
assert torch.allclose(r1, r2)

So it can be seen as multiple operations stacked together along the batch dimension.

It may also be useful to read about broadcasting semantics:

https://pytorch.org/docs/stable/notes/broadcasting.html#broadcasting-semantics
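To connect the broadcasting rules to torch.matmul: only the leading (batch) dimensions are broadcast, while the last two dimensions follow the usual matrix product rule. The sketch below uses arbitrary shapes picked for illustration and checks one slice of the result against a plain torch.mm call.

```python
import torch

x = torch.randn(2, 1, 3, 4)   # batch dims (2, 1), matrices of shape 3x4
y = torch.randn(5, 4, 6)      # batch dims (5,),   matrices of shape 4x6

# Batch dims (2, 1) and (5,) broadcast to (2, 5); matrix dims give 3x6
out = torch.matmul(x, y)

# The broadcast batch shape can be computed on its own
batch_shape = torch.broadcast_shapes(x.shape[:-2], y.shape[:-2])

# Each output slice is an ordinary 2D product of the matching inputs
slice_matches = torch.allclose(out[1, 2], torch.mm(x[1, 0], y[2]), atol=1e-5)
```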

In [None]:
torch.bmm is the batch matrix-matrix product.

torch.bmm(input, mat2, *, out=None) → Tensor

shape: (b×n×m), (b×m×p) -> (b×n×p)

Performs a batch matrix-matrix product of matrices stored in input and mat2. input and mat2 must be 3-D tensors each containing the same number of matrices.

This function does not broadcast.

Example

input = torch.randn(10, 3, 4)
mat2 = torch.randn(10, 4, 5)
res = torch.bmm(input, mat2)
res.size()  # torch.Size([10, 3, 5])
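The sketch below compares torch.bmm with torch.matmul on the same shapes and illustrates the "does not broadcast" statement. The tensor `single` and the flag names are hypothetical additions for this example.

```python
import torch

inp = torch.randn(10, 3, 4)
mat2 = torch.randn(10, 4, 5)

# With matching 3D inputs, bmm and matmul agree
res = torch.bmm(inp, mat2)
same_as_matmul = torch.allclose(res, torch.matmul(inp, mat2), atol=1e-6)

# A batch of size 1 is not broadcast by bmm ...
single = torch.randn(1, 4, 5)
try:
    torch.bmm(inp, single)
    bmm_broadcasts = True
except RuntimeError:
    bmm_broadcasts = False

# ... but matmul broadcasts it across the 10 matrices in inp
matmul_shape = torch.matmul(inp, single).shape
```

The stricter shape check in torch.bmm can be useful when a batch-size mismatch should be treated as an error rather than silently broadcast.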
