In [1]:
import torch

# addcmul(input, tensor1, tensor2, *, value=1, out=None) → Tensor

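Performs the element-wise multiplication of tensor1 by tensor2, multiplies the result by the scalar value, and adds it to input.

The shapes of input, tensor1, and tensor2 must be broadcastable.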

For inputs of type FloatTensor or DoubleTensor, value must be a real number; otherwise it must be an integer.

$$
out_i = input_i + value \times tensor1_i \times tensor2_i
$$
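As a small sketch of the broadcasting rule, the cell below uses hand-picked values (an assumption made for illustration, not taken from the cells in this notebook) so the result can be checked by hand. `torch.broadcast_shapes` is used only to show the common shape that the three operands broadcast to.

```python
import torch

# Explicit values so the result can be verified by hand.
input = torch.zeros(1, 3)                      # shape (1, 3)
tensor1 = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1)
tensor2 = torch.tensor([[1.0, 10.0, 100.0]])   # shape (1, 3)

# The three shapes broadcast to one common shape, which is the output shape.
out_shape = torch.broadcast_shapes(input.shape, tensor1.shape, tensor2.shape)

# out[i, j] = input[0, j] + 0.5 * tensor1[i, 0] * tensor2[0, j]
out = torch.addcmul(input, tensor1, tensor2, value=0.5)

print(out_shape)  # torch.Size([3, 3])
print(out)
```

Each output row is tensor2 scaled by `0.5 * tensor1[i]`, so the rows are `[0.5, 5, 50]`, `[1, 10, 100]`, and `[1.5, 15, 150]`.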

In [65]:
input = torch.tensor(3)
tensor1 = torch.tensor(2)
tensor2 = torch.tensor(4)
value = 1

In [66]:
torch.addcmul(input, tensor1, tensor2, value=value)

tensor(11)

In [67]:
input + tensor1 * tensor2 * value

tensor(11)

In [68]:
input = torch.randn(1, 3)
tensor1 = torch.randn(3, 1)
tensor2 = torch.randn(1, 3)
value = 0.1

In [69]:
res1 = torch.addcmul(input, tensor1, tensor2, value=value)
res1

tensor([[-0.5691,  0.6914,  0.0034],
        [-0.5687,  0.6914,  0.0025],
        [-0.5466,  0.6917, -0.0457]])

In [70]:
res2 = input + tensor1 * tensor2 * value
res2

tensor([[-0.5691,  0.6914,  0.0034],
        [-0.5687,  0.6914,  0.0025],
        [-0.5466,  0.6917, -0.0457]])

In [72]:
torch.allclose(res1, res2)

True

# addmm(input, mat1, mat2, *, beta=1, alpha=1, out=None) → Tensor

Performs a matrix multiplication of the matrices mat1 and mat2. The matrix input is added to the final result.

If mat1 is a (n×m) tensor, mat2 is a (m×p) tensor, then input must be broadcastable with a (n×p) tensor and out will be a (n×p) tensor.

alpha and beta are scaling factors on the matrix product of mat1 and mat2 and on the added matrix input, respectively.

- input: [n×p]
- mat1: [n×m]
- mat2: [m×p]
- beta: float
- alpha: float
- out: [n×p]

$$
out = beta \times input + alpha \times (mat1 @ mat2)
$$
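Because input only needs to be broadcastable to (n×p), a 1-D tensor of size p can act as a per-column bias. The sketch below assumes hand-picked values and the hypothetical names `x`, `w`, and `bias`; it uses the default `beta=1` and `alpha=1`.

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])   # mat1, shape (2, 3)
w = torch.tensor([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])        # mat2, shape (3, 2)
bias = torch.tensor([10.0, 20.0])     # input, shape (2,), broadcast to (2, 2)

fused = torch.addmm(bias, x, w)       # beta=1 and alpha=1 by default
manual = bias + x @ w                 # same computation written out

print(fused)
print(torch.allclose(fused, manual))
```

Here `x @ w` is `[[4, 5], [10, 11]]`, and the bias is added to every row, giving `[[14, 25], [20, 31]]`.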

In [73]:
input = torch.randn(2, 2)
mat1 = torch.randn(2, 3)
mat2 = torch.randn(3, 2)
beta = 2.0
alpha = 3.0

In [74]:
res1 = torch.addmm(input, mat1, mat2, beta=beta, alpha=alpha)
res1

tensor([[  8.3969,   8.5018],
        [  3.2703, -10.6507]])

In [75]:
# Hand-written equivalent; written this way, it also supports multi-dimensional (batched) inputs
res2 = beta * input + alpha * (mat1 @ mat2)
res2

tensor([[  8.3969,   8.5018],
        [  3.2703, -10.6507]])

In [76]:
torch.all(res1 == res2)

tensor(True)

# addbmm(input, batch1, batch2, *, beta=1, alpha=1, out=None) → Tensor

Performs a batch matrix-matrix product of matrices stored in batch1 and batch2, with a reduced add step (all matrix multiplications get accumulated along the first dimension). input is added to the final result.

batch1 and batch2 must be 3-D tensors each containing the same number of matrices.

If batch1 is a (b×n×m) tensor, batch2 is a (b×m×p) tensor, input must be broadcastable with a (n×p) tensor and out will be a (n×p) tensor.

- input: [n×p]
- batch1: [b×n×m]
- batch2: [b×m×p]
- beta: float
- alpha: float
- out: [n×p]

$$
out = beta \times input + alpha \times (\sum_{i=0}^{b-1}(batch1_i @ batch2_i))
$$
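To make the reduction over the batch dimension explicit, the sum in the formula can be written as a plain loop. This is a minimal sketch with deterministic inputs built from `torch.arange` (an assumption for reproducibility); `acc` and `manual` are illustrative names.

```python
import torch

b, n, m, p = 2, 3, 4, 3
input = torch.ones(n, p)
batch1 = torch.arange(b * n * m, dtype=torch.float32).reshape(b, n, m)
batch2 = torch.arange(b * m * p, dtype=torch.float32).reshape(b, m, p)
beta, alpha = 2.0, 3.0

# Accumulate the b matrix products into a single (n, p) matrix.
acc = torch.zeros(n, p)
for i in range(b):
    acc += batch1[i] @ batch2[i]
manual = beta * input + alpha * acc

fused = torch.addbmm(input, batch1, batch2, beta=beta, alpha=alpha)

print(fused.shape)                    # torch.Size([3, 3])
print(torch.allclose(fused, manual))
```

Note that the batch dimension disappears from the output: the b products are summed, not stacked.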

In [81]:
input = torch.randn(3, 3)
batch1 = torch.randn(2, 3, 4)
batch2 = torch.randn(2, 4, 3)
beta = 2.0
alpha = 3.0

In [83]:
res1 = torch.addbmm(input, batch1, batch2, beta=beta, alpha=alpha)
res1

tensor([[  1.6790,   7.2108,  19.9065],
        [ -0.0518,  -5.6930, -14.3090],
        [ -3.0053,   9.9459,   4.5554]])

In [84]:
res2 = beta * input + alpha * torch.sum(batch1 @ batch2, dim=0)
res2

tensor([[  1.6790,   7.2108,  19.9065],
        [ -0.0518,  -5.6930, -14.3090],
        [ -3.0053,   9.9459,   4.5554]])

In [85]:
torch.allclose(res1, res2)

True

# baddbmm(input, batch1, batch2, *, beta=1, alpha=1, out=None) → Tensor

Performs a batch matrix-matrix product of matrices in batch1 and batch2. input is added to the final result.

batch1 and batch2 must be 3-D tensors each containing the same number of matrices.

If batch1 is a (b×n×m) tensor, batch2 is a (b×m×p) tensor, then input must be broadcastable with a (b×n×p) tensor and out will be a (b×n×p) tensor. Both alpha and beta mean the same as the scaling factors used in torch.addbmm().

- input: [b×n×p]
- batch1: [b×n×m]
- batch2: [b×m×p]
- beta: float
- alpha: float
- out: [b×n×p]

$$
out_i = beta \times input_i + alpha \times (batch1_i @ batch2_i)
$$
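Unlike addbmm, baddbmm keeps the batch dimension, and input only has to be broadcastable to (b×n×p). The sketch below assumes a single (n×p) matrix, here called `shared`, that is reused for every batch element, and compares the fused call with a per-batch loop.

```python
import torch

b, n, m, p = 2, 2, 3, 2
batch1 = torch.arange(b * n * m, dtype=torch.float32).reshape(b, n, m)
batch2 = torch.ones(b, m, p)
shared = torch.tensor([[1.0, 2.0],
                       [3.0, 4.0]])   # shape (n, p), broadcast over the batch
beta, alpha = 2.0, 3.0

fused = torch.baddbmm(shared, batch1, batch2, beta=beta, alpha=alpha)

# Apply the formula to each batch element separately, then stack the results.
manual = torch.stack(
    [beta * shared + alpha * (batch1[i] @ batch2[i]) for i in range(b)]
)

print(fused.shape)                    # torch.Size([2, 2, 2])
print(torch.allclose(fused, manual))
```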

In [86]:
input = torch.randn(2, 3, 3)
batch1 = torch.randn(2, 3, 4)
batch2 = torch.randn(2, 4, 3)
beta = 2.0
alpha = 3.0

In [87]:
res1 = torch.baddbmm(input, batch1, batch2, beta=beta, alpha=alpha)
print(res1.shape)
res1

torch.Size([2, 3, 3])


tensor([[[-0.2387, -2.5367, -2.8231],
         [ 3.3495, -0.2242,  0.5831],
         [-6.3266, -7.5401, -6.9091]],

        [[-2.0132, -6.4656,  7.6230],
         [11.8475,  2.2841, -1.0839],
         [-1.4977,  0.4509,  4.3374]]])

In [88]:
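# `*` and `@` share the same precedence, so this evaluates as (alpha * batch1) @ batch2,
# which is mathematically equal to alpha * (batch1 @ batch2).
res2 = beta * input + alpha * batch1 @ batch2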
res2

tensor([[[-0.2387, -2.5367, -2.8231],
         [ 3.3495, -0.2242,  0.5831],
         [-6.3266, -7.5401, -6.9091]],

        [[-2.0132, -6.4656,  7.6230],
         [11.8475,  2.2841, -1.0839],
         [-1.4977,  0.4509,  4.3374]]])

In [63]:
torch.allclose(res1, res2)

True

# addmv(input, mat, vec, *, beta=1, alpha=1, out=None) → Tensor

Performs a matrix-vector product of the matrix mat and the vector vec. The vector input is added to the final result.

If mat is a (n×m) tensor and vec is a 1-D tensor of size m, then input must be broadcastable with a 1-D tensor of size n and out will be a 1-D tensor of size n.

alpha and beta are scaling factors on the matrix-vector product of mat and vec and on the added tensor input, respectively.

- input: [n]
- mat: [n×m]
- vec: [m]
- beta: float
- alpha: float
- out: [n]

$$
out = beta \times input + alpha \times (mat @ vec)
$$
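A small worked example with hand-picked values (chosen here for illustration) shows the formula term by term.

```python
import torch

mat = torch.tensor([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])  # shape (2, 3)
vec = torch.tensor([1.0, 0.0, -1.0])   # shape (3,)
input = torch.tensor([10.0, 20.0])     # shape (2,)
beta, alpha = 2.0, 3.0

# mat @ vec = [1 - 3, 4 - 6] = [-2, -2]
# out = 2 * [10, 20] + 3 * [-2, -2] = [14, 34]
out = torch.addmv(input, mat, vec, beta=beta, alpha=alpha)

print(out)  # tensor([14., 34.])
```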

In [112]:
input = torch.randn(2)
mat = torch.randn(2, 3)
vec = torch.randn(3)
# beta = 2.0 and alpha = 3.0 carry over from the earlier cells

In [113]:
res1 = torch.addmv(input, mat, vec, beta=beta, alpha=alpha)
res1

tensor([-7.5539, -1.0529])

In [114]:
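# `*` and `@` share the same precedence, so this evaluates as (alpha * mat) @ vec;
# the last-digit difference from res1 below is floating-point rounding.
res2 = beta * input + alpha * mat @ vec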
res2

tensor([-7.5540, -1.0529])

In [116]:
torch.allclose(res1, res2)

True