# Scalars

A scalar is represented by a tensor with just one element. In the next snippet, we instantiate two scalars and perform some familiar arithmetic operations with them, namely addition, multiplication, division, and exponentiation.

In [1]:
import tensorflow as tf

x = tf.constant([3.0])
y = tf.constant([2.0])

x + y, x * y, x / y, x**y

(<tf.Tensor: shape=(1,), dtype=float32, numpy=array([5.], dtype=float32)>,
 <tf.Tensor: shape=(1,), dtype=float32, numpy=array([6.], dtype=float32)>,
 <tf.Tensor: shape=(1,), dtype=float32, numpy=array([1.5], dtype=float32)>,
 <tf.Tensor: shape=(1,), dtype=float32, numpy=array([9.], dtype=float32)>)
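
The cell above stores each scalar in a one-element tensor of shape `(1,)`. As a small aside, TensorFlow can also hold a scalar in a tensor with no axes at all. The sketch below contrasts the two; the names `s` and `v` are chosen here purely for illustration.

```python
import tensorflow as tf

# A tensor with no axes: its shape is empty.
s = tf.constant(3.0)
# A one-axis tensor that happens to hold a single element.
v = tf.constant([3.0])

print(s.shape.rank, v.shape.rank)

# Both hold the same value; reshaping to an empty shape drops the length-1 axis.
print(float(s), float(tf.reshape(v, [])))
```

Either form works for the arithmetic shown above; the difference only shows up in the reported shape.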

# Vectors

You can think of a vector as simply a list of scalar values. We call these values the elements (entries or components) of the vector. When our vectors represent examples from our dataset, their values hold some real-world significance. For example, if we were training a model to predict the risk that a loan defaults, we might associate each applicant with a vector whose components correspond to their income, length of employment, number of previous defaults, and other factors. If we were studying the risk of heart attacks that hospital patients potentially face, we might represent each patient by a vector whose components capture their most recent vital signs, cholesterol levels, minutes of exercise per day, etc. In math notation, we will usually denote vectors as bold-faced, lowercase letters (e.g., **x**, **y**, and **z**).

We work with vectors via one-dimensional tensors. In general, tensors can have arbitrary lengths, subject to the memory limits of your machine.

In [2]:
x = tf.range(4)
x

<tf.Tensor: shape=(4,), dtype=int32, numpy=array([0, 1, 2, 3])>

In [3]:
x[3]

<tf.Tensor: shape=(), dtype=int32, numpy=3>

# Length, Dimensionality and Shape

As with an ordinary Python list, we can access the length of a tensor by calling Python’s built-in len() function.

In [4]:
len(x)

4

When a tensor represents a vector (with precisely one axis), we can also access its length via the .shape attribute. The shape is a tuple that lists the length (dimensionality) along each axis of the tensor. For tensors with just one axis, the shape has just one element.

In [6]:
x.shape

TensorShape([4])

Note that the word “dimension” is overloaded in these contexts, which often causes confusion. To clarify, we use the dimensionality of a vector or an axis to refer to its length, i.e., the number of elements of a vector or an axis. However, we use the dimensionality of a tensor to refer to the number of axes that a tensor has. In this sense, the dimensionality of some axis of a tensor is the length of that axis.
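
To make the two meanings concrete, here is a minimal sketch that reads both quantities off the shape. It assumes eager TensorFlow 2, and the tensors are rebuilt locally so the snippet runs on its own.

```python
import tensorflow as tf

x = tf.range(4)                           # a vector: one axis of length 4
X = tf.reshape(tf.range(24), (2, 3, 4))   # three axes of lengths 2, 3, and 4

# Dimensionality of the tensor: how many axes it has.
print(x.shape.rank, X.shape.rank)

# Dimensionality of a single axis: the length of that axis.
print(x.shape[0], X.shape[2])
```

Here `x` and the last axis of `X` have the same length, yet the two tensors differ in their number of axes.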

## Matrices

In [7]:
A = tf.reshape(tf.range(20), (5, 4))
A

<tf.Tensor: shape=(5, 4), dtype=int32, numpy=
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])>

In [8]:
tf.transpose(A)

<tf.Tensor: shape=(4, 5), dtype=int32, numpy=
array([[ 0,  4,  8, 12, 16],
       [ 1,  5,  9, 13, 17],
       [ 2,  6, 10, 14, 18],
       [ 3,  7, 11, 15, 19]])>

In [9]:
# Symmetric matrix
# B=transpose(B)
B = tf.constant([[1, 2, 3], [2, 0, 4], [3, 4, 5]])
B

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [2, 0, 4],
       [3, 4, 5]])>

In [10]:
B == tf.transpose(B)

<tf.Tensor: shape=(3, 3), dtype=bool, numpy=
array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])>
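
The elementwise comparison above returns a full matrix of booleans. If a single yes/no answer is preferred, one option is to collapse that matrix with `tf.reduce_all`. The helper name `is_symmetric` below is hypothetical, and the sketch assumes a square integer matrix.

```python
import tensorflow as tf

def is_symmetric(M):
    """Return True if the square matrix M equals its transpose (illustrative helper)."""
    return bool(tf.reduce_all(M == tf.transpose(M)))

B = tf.constant([[1, 2, 3], [2, 0, 4], [3, 4, 5]])   # symmetric
C = tf.constant([[1, 2], [3, 4]])                    # not symmetric: 2 != 3

print(is_symmetric(B), is_symmetric(C))
```

For floating-point matrices, an exact equality test may be too strict, so a tolerance-based comparison would be the safer choice.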

## Tensors

In [11]:
X = tf.reshape(tf.range(24), (2, 3, 4))
X

<tf.Tensor: shape=(2, 3, 4), dtype=int32, numpy=
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])>

# Tensor Arithmetic

In [12]:
A = tf.reshape(tf.range(20, dtype=tf.float32), (5, 4))
B = A  # No cloning of `A` to `B` by allocating new memory
A, A + B

(<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.],
        [16., 17., 18., 19.]], dtype=float32)>,
 <tf.Tensor: shape=(5, 4), dtype=float32, numpy=
 array([[ 0.,  2.,  4.,  6.],
        [ 8., 10., 12., 14.],
        [16., 18., 20., 22.],
        [24., 26., 28., 30.],
        [32., 34., 36., 38.]], dtype=float32)>)

In [13]:
A*B

<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
array([[  0.,   1.,   4.,   9.],
       [ 16.,  25.,  36.,  49.],
       [ 64.,  81., 100., 121.],
       [144., 169., 196., 225.],
       [256., 289., 324., 361.]], dtype=float32)>

In [14]:
a = 2
X = tf.reshape(tf.range(24), (2, 3, 4))
a + X

<tf.Tensor: shape=(2, 3, 4), dtype=int32, numpy=
array([[[ 2,  3,  4,  5],
        [ 6,  7,  8,  9],
        [10, 11, 12, 13]],

       [[14, 15, 16, 17],
        [18, 19, 20, 21],
        [22, 23, 24, 25]]])>

In [15]:
a*X

<tf.Tensor: shape=(2, 3, 4), dtype=int32, numpy=
array([[[ 0,  2,  4,  6],
        [ 8, 10, 12, 14],
        [16, 18, 20, 22]],

       [[24, 26, 28, 30],
        [32, 34, 36, 38],
        [40, 42, 44, 46]]])>

# Reduction

One useful operation that we can perform with arbitrary tensors is to calculate the sum of their elements. In mathematical notation, we express sums using the ∑ symbol.

In [16]:
x = tf.range(4, dtype=tf.float32)
x, tf.reduce_sum(x)

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0., 1., 2., 3.], dtype=float32)>,
 <tf.Tensor: shape=(), dtype=float32, numpy=6.0>)

In [17]:
A.shape, tf.reduce_sum(A)

(TensorShape([5, 4]), <tf.Tensor: shape=(), dtype=float32, numpy=190.0>)

By default, invoking the function for calculating the sum reduces a tensor along all its axes to a scalar. We can also specify the axes along which the tensor is reduced via summation. Take matrices as an example. To reduce the row dimension (axis 0) by summing up elements of all the rows, we specify axis=0 when invoking the function. Since the input matrix reduces along axis 0 to generate the output vector, the dimension of axis 0 of the input is lost in the output shape.

In [18]:
A_sum_axis0 = tf.reduce_sum(A, axis=0)
A_sum_axis0, A_sum_axis0.shape

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([40., 45., 50., 55.], dtype=float32)>,
 TensorShape([4]))

Specifying axis=1 will reduce the column dimension (axis 1) by summing up elements of all the columns. Thus, the dimension of axis 1 of the input is lost in the output shape.

In [19]:
A_sum_axis1 = tf.reduce_sum(A, axis=1)
A_sum_axis1, A_sum_axis1.shape

(<tf.Tensor: shape=(5,), dtype=float32, numpy=array([ 6., 22., 38., 54., 70.], dtype=float32)>,
 TensorShape([5]))

Reducing a matrix along both rows and columns via summation is equivalent to summing up all the elements of the matrix.

In [20]:
tf.reduce_sum(A, axis=[0, 1])  # Same as `tf.reduce_sum(A)`

<tf.Tensor: shape=(), dtype=float32, numpy=190.0>

A related quantity is the mean, which is also called the average. We calculate the mean by dividing the sum by the total number of elements. In code, we could just call the function for calculating the mean on tensors of arbitrary shape.

In [21]:
tf.reduce_mean(A), tf.reduce_sum(A) / tf.size(A).numpy()

(<tf.Tensor: shape=(), dtype=float32, numpy=9.5>,
 <tf.Tensor: shape=(), dtype=float32, numpy=9.5>)

Likewise, the function for calculating the mean can also reduce a tensor along the specified axes.

In [22]:
tf.reduce_mean(A, axis=0), tf.reduce_sum(A, axis=0) / A.shape[0]

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8.,  9., 10., 11.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8.,  9., 10., 11.], dtype=float32)>)
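
The same axis argument carries over to tensors with more than two axes. As a rough sketch, the following reduces a three-axis tensor (rebuilt here with a float dtype, which is an assumption made so that the mean is not truncated) along different axes and inspects the resulting shapes.

```python
import tensorflow as tf

X = tf.reshape(tf.range(24, dtype=tf.float32), (2, 3, 4))

# Summing over axes 1 and 2 leaves only axis 0, which has length 2.
s = tf.reduce_sum(X, axis=[1, 2])

# Averaging over axis 0 leaves axes 1 and 2, giving shape (3, 4).
m = tf.reduce_mean(X, axis=0)

print(s.numpy(), m.shape)
```

Each axis listed in `axis` disappears from the output shape, just as in the matrix case.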

# Non-Reduction Sum

However, sometimes it can be useful to keep the number of axes unchanged when invoking the function for calculating the sum or mean.

In [23]:
sum_A = tf.reduce_sum(A, axis=1, keepdims=True)
sum_A

<tf.Tensor: shape=(5, 1), dtype=float32, numpy=
array([[ 6.],
       [22.],
       [38.],
       [54.],
       [70.]], dtype=float32)>

For instance, since sum_A still keeps its two axes after summing each row, we can divide A by sum_A with broadcasting.

In [24]:
A / sum_A

<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
array([[0.        , 0.16666667, 0.33333334, 0.5       ],
       [0.18181819, 0.22727273, 0.27272728, 0.3181818 ],
       [0.21052632, 0.23684211, 0.2631579 , 0.28947368],
       [0.22222222, 0.24074075, 0.25925925, 0.2777778 ],
       [0.22857143, 0.24285714, 0.25714287, 0.27142859]], dtype=float32)>
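
One way to sanity-check this normalization is to confirm that every row of the result sums to roughly one. The sketch below rebuilds `A` so it runs standalone; the names `row_sums` and `normalized` are illustrative only.

```python
import tensorflow as tf

A = tf.reshape(tf.range(20, dtype=tf.float32), (5, 4))

row_sums = tf.reduce_sum(A, axis=1, keepdims=True)   # shape (5, 1)
normalized = A / row_sums                            # (5, 4) / (5, 1): broadcast across columns

# Each row of the result should sum to approximately one.
print(tf.reduce_sum(normalized, axis=1).numpy())

# Without keepdims the sums have shape (5,), which cannot be broadcast
# against the (5, 4) matrix, so the division above would not work as written.
print(tf.reduce_sum(A, axis=1).shape)
```

Keeping the reduced axis as a length-1 axis is what lets the division line up row by row.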

If we want to calculate the cumulative sum of elements of A along some axis, say axis=0 (row by row), we can call the cumsum function. This function will not reduce the input tensor along any axis.

In [25]:
tf.cumsum(A, axis=0)

<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  6.,  8., 10.],
       [12., 15., 18., 21.],
       [24., 28., 32., 36.],
       [40., 45., 50., 55.]], dtype=float32)>
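
To see how the cumulative sum relates to an ordinary reduction, the following sketch compares the two along the same axis. It rebuilds `A` locally and uses illustrative names `running` and `total`.

```python
import tensorflow as tf

A = tf.reshape(tf.range(20, dtype=tf.float32), (5, 4))

running = tf.cumsum(A, axis=0)       # same shape as A
total = tf.reduce_sum(A, axis=0)     # axis 0 is removed

print(running.shape, total.shape)

# The final row of the running total matches the full column sums.
print(bool(tf.reduce_all(running[-1] == total)))
```

The cumulative sum keeps every intermediate total, while the reduction keeps only the last one.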

# Dot Products

In [26]:
y = tf.ones(4, dtype=tf.float32)
x

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0., 1., 2., 3.], dtype=float32)>

In [27]:
y

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([1., 1., 1., 1.], dtype=float32)>

In [28]:
tf.tensordot(x, y, axes=1)

<tf.Tensor: shape=(), dtype=float32, numpy=6.0>

Note that we can express the dot product of two vectors equivalently by performing an elementwise multiplication and then a sum:

In [29]:
tf.reduce_sum(x * y)

<tf.Tensor: shape=(), dtype=float32, numpy=6.0>

## Matrix-Vector Products

In [32]:
A, A.shape

(<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [12., 13., 14., 15.],
        [16., 17., 18., 19.]], dtype=float32)>,
 TensorShape([5, 4]))

In [33]:
x, x.shape

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0., 1., 2., 3.], dtype=float32)>,
 TensorShape([4]))

In [35]:
tf.linalg.matvec(A,x)

<tf.Tensor: shape=(5,), dtype=float32, numpy=array([ 14.,  38.,  62.,  86., 110.], dtype=float32)>
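
A matrix-vector product can be read as one dot product per row of the matrix. The sketch below spells that out in two ways and is meant only as an illustration; `tf.linalg.matvec` as used above remains the direct route.

```python
import tensorflow as tf

A = tf.reshape(tf.range(20, dtype=tf.float32), (5, 4))
x = tf.range(4, dtype=tf.float32)

# One dot product per row of A, stacked back into a vector of length 5.
by_rows = tf.stack([tf.tensordot(row, x, axes=1) for row in tf.unstack(A)])

# The same values via broadcasting x across the rows, then summing over the columns.
by_sum = tf.reduce_sum(A * x, axis=1)

print(by_rows.numpy(), by_sum.numpy())
```

Both results should agree with the output of `tf.linalg.matvec(A, x)` shown above.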

# Matrix-Matrix Multiplication

In [36]:
B = tf.ones((4, 3), tf.float32)
tf.matmul(A, B)

<tf.Tensor: shape=(5, 3), dtype=float32, numpy=
array([[ 6.,  6.,  6.],
       [22., 22., 22.],
       [38., 38., 38.],
       [54., 54., 54.],
       [70., 70., 70.]], dtype=float32)>

Matrix-matrix multiplication can be simply called matrix multiplication, and should not be confused with the Hadamard product.
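
A small side-by-side comparison may help keep the two operations apart. The matrices `P` and `Q` below are made up for this sketch.

```python
import tensorflow as tf

P = tf.constant([[1.0, 2.0], [3.0, 4.0]])
Q = tf.constant([[5.0, 6.0], [7.0, 8.0]])

# Hadamard product: multiplies entries at matching positions.
hadamard = P * Q

# Matrix multiplication: each output entry is a row of P dotted with a column of Q.
product = tf.matmul(P, Q)

print(hadamard.numpy())
print(product.numpy())
```

The Hadamard product requires the two shapes to match (or broadcast), whereas matrix multiplication requires the number of columns of the first matrix to equal the number of rows of the second.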

## Norms

In [37]:
# L2 norms

u = tf.constant([3.0, -4.0])
tf.norm(u)

<tf.Tensor: shape=(), dtype=float32, numpy=5.0>

In deep learning, we work more often with the squared L2 norm.
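
As a minimal sketch, the squared L2 norm can be computed as the sum of squared elements, which avoids taking a square root and squaring again. The vector `u` is rebuilt here so the snippet stands alone.

```python
import tensorflow as tf

u = tf.constant([3.0, -4.0])

# Sum of squared elements: 3^2 + (-4)^2.
squared_l2 = tf.reduce_sum(tf.square(u))

# For comparison, the plain L2 norm of the same vector.
l2 = tf.norm(u)

print(float(squared_l2), float(l2))
```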

Compared with the L2 norm, the L1 norm is less influenced by outliers. To calculate the L1 norm, we compose the absolute value function with a sum over the elements.

In [38]:
tf.reduce_sum(tf.abs(u))

<tf.Tensor: shape=(), dtype=float32, numpy=7.0>
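
For reference, `tf.norm` also accepts an `ord` argument, so the same quantity can be obtained without composing the two functions by hand. The sketch below simply checks that both routes agree on the vector used above.

```python
import tensorflow as tf

u = tf.constant([3.0, -4.0])

manual_l1 = tf.reduce_sum(tf.abs(u))   # |3| + |-4|
builtin_l1 = tf.norm(u, ord=1)         # the same norm via the ord argument

print(float(manual_l1), float(builtin_l1))
```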