In [26]:
import torch

Scalars are implemented as tensors that contain only one element.

In [27]:
x = torch.tensor(3.0)
y = torch.tensor(2.0)

x + y, x * y, x / y, x**y

(tensor(5.), tensor(6.), tensor(1.5000), tensor(9.))

#### Basic Properties of Tensor Arithmetic

In [28]:
A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = A.clone()  # Assign a copy of A to B by allocating new memory
A, A + B

(tensor([[0., 1., 2.],
         [3., 4., 5.]]),
 tensor([[ 0.,  2.,  4.],
         [ 6.,  8., 10.]]))

The elementwise product of two matrices is called their Hadamard product.

In [29]:
A * B

tensor([[ 0.,  1.,  4.],
        [ 9., 16., 25.]])
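
As a quick sanity check on the definition, the sketch below confirms that each entry of the Hadamard product is the product of the matching entries of the two inputs. The names `P`, `Q`, `H`, and `matches` are illustrative and not part of the original notebook.

```python
import torch

# Rebuild the 2x3 matrix from above so this block runs on its own.
P = torch.arange(6, dtype=torch.float32).reshape(2, 3)
Q = P.clone()
H = P * Q  # Hadamard (elementwise) product

# Every entry of H should equal the product of the matching entries of P and Q.
matches = all(
    H[i, j] == P[i, j] * Q[i, j]
    for i in range(P.shape[0])
    for j in range(P.shape[1])
)
print(matches)
```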

In [30]:
a = 2
X = torch.arange(24).reshape(2, 3, 4)
a + X, (a * X).shape

(tensor([[[ 2,  3,  4,  5],
          [ 6,  7,  8,  9],
          [10, 11, 12, 13]],
 
         [[14, 15, 16, 17],
          [18, 19, 20, 21],
          [22, 23, 24, 25]]]),
 torch.Size([2, 3, 4]))

Often, we wish to calculate the sum of a tensor’s elements.

In [31]:
a = torch.arange(3, dtype=torch.float32)
a, a.sum()

(tensor([0., 1., 2.]), tensor(3.))

In [32]:
A, A.shape, A.sum()

(tensor([[0., 1., 2.],
         [3., 4., 5.]]),
 torch.Size([2, 3]),
 tensor(15.))

In [33]:
A.shape, A.sum(axis=0)

(torch.Size([2, 3]), tensor([3., 5., 7.]))
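
To make the role of the `axis` argument concrete, here is a minimal sketch that sums the same 2x3 matrix along each axis in turn. The names `M`, `col_sums`, and `row_sums` are illustrative.

```python
import torch

# Rebuild the 2x3 matrix from above so this block runs on its own.
M = torch.arange(6, dtype=torch.float32).reshape(2, 3)

col_sums = M.sum(axis=0)  # collapses axis 0 (the rows): one sum per column
row_sums = M.sum(axis=1)  # collapses axis 1 (the columns): one sum per row

print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)
```

In both cases the axis that was summed over disappears from the shape of the result.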

Reducing a matrix along both rows and columns via summation is equivalent to summing up all the elements of the matrix.

In [34]:
A.sum(axis=[0, 1]) == A.sum()  # Same as A.sum()

tensor(True)

#### Non-Reduction Sum

Sometimes it is useful to keep the number of axes unchanged when computing a sum or mean. This matters when we want to rely on broadcasting, because the reduced result can then be combined elementwise with the original tensor.

In [35]:
A

tensor([[0., 1., 2.],
        [3., 4., 5.]])

In [36]:
sum_A = A.sum(axis=1, keepdims=True)
sum_A, sum_A.shape

(tensor([[ 3.],
         [12.]]),
 torch.Size([2, 1]))
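
The following sketch contrasts the two variants, assuming the same 2x3 matrix as above. Without `keepdims=True`, the row sums have shape `(2,)`, which cannot be broadcast against a `(2, 3)` matrix. The names `M`, `flat`, `kept`, and `broadcast_ok` are illustrative.

```python
import torch

M = torch.arange(6, dtype=torch.float32).reshape(2, 3)

flat = M.sum(axis=1)                 # shape (2,): the summed axis is dropped
kept = M.sum(axis=1, keepdims=True)  # shape (2, 1): the summed axis has length 1

try:
    M / flat  # trailing dimensions 3 and 2 do not match, so this fails
    broadcast_ok = True
except RuntimeError:
    broadcast_ok = False

print(flat.shape, kept.shape, broadcast_ok)
```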

For instance, since sum_A keeps its two axes after summing each row, we can divide A by sum_A with broadcasting to create a matrix where each row sums up to 1.

In [42]:
A / sum_A

tensor([[0.0000, 0.3333, 0.6667],
        [0.2500, 0.3333, 0.4167]])
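
One way to confirm the claim is to sum each row of the normalized matrix and compare against 1. This is a small sketch with illustrative names (`M`, `normalized`, `row_totals`); the comparison uses `torch.allclose` because floating-point division is not exact.

```python
import torch

M = torch.arange(6, dtype=torch.float32).reshape(2, 3)
normalized = M / M.sum(axis=1, keepdims=True)

# Each row of the normalized matrix should sum to 1, up to float rounding.
row_totals = normalized.sum(axis=1)
print(torch.allclose(row_totals, torch.ones(2)))
```

Note that a row whose elements sum to zero would produce NaN entries under this normalization, since it divides by zero.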

If we want to calculate the cumulative sum of elements of A along some axis, say axis=0 (row by row), we can call the cumsum function. By design, this function does not reduce the input tensor along any axis.

In [49]:
A = torch.arange(1, 13).reshape(3, 4)
A, A.cumsum(axis=0), A.cumsum(axis=1)

(tensor([[ 1,  2,  3,  4],
         [ 5,  6,  7,  8],
         [ 9, 10, 11, 12]]),
 tensor([[ 1,  2,  3,  4],
         [ 6,  8, 10, 12],
         [15, 18, 21, 24]]),
 tensor([[ 1,  3,  6, 10],
         [ 5, 11, 18, 26],
         [ 9, 19, 30, 42]]))
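
A cumulative sum keeps the input's shape, and its final slice along the chosen axis matches the ordinary sum along that axis. The sketch below checks this for the 3x4 matrix above; the names `M`, `last_row`, and `last_col` are illustrative.

```python
import torch

M = torch.arange(1, 13).reshape(3, 4)

# cumsum does not reduce: the output has the same shape as the input.
same_shape = M.cumsum(axis=0).shape == M.shape

# The last slice of the running total equals the full sum along that axis.
last_row = M.cumsum(axis=0)[-1]     # running totals down the rows
last_col = M.cumsum(axis=1)[:, -1]  # running totals across the columns

print(same_shape)
print(torch.equal(last_row, M.sum(axis=0)))
print(torch.equal(last_col, M.sum(axis=1)))
```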

#### Dot Products