# Basic Computations

We’ve spent the last sections covering the mathematical definitions of various tensors. It’s now time to cover how to create and manipulate tensors using TensorFlow. For this section, we recommend you follow along using an interactive Python session (with IPython). Many of the basic TensorFlow concepts are easiest to understand after experimenting with them directly.

# Installing TensorFlow and Getting Started

Before continuing this section, you will need to install TensorFlow on your machine. The details of installation will vary depending on your particular hardware, so we refer you to the official TensorFlow documentation for more details.

Although there are frontends to TensorFlow in multiple programming languages, we will exclusively use the TensorFlow Python API in the remainder of this book. We recommend that you install Anaconda Python, which packages many useful numerical libraries along with the base Python executable.

Once you’ve installed TensorFlow, we recommend that you invoke it interactively while you’re learning the basic API (see Example 2-1). When experimenting with TensorFlow interactively, it’s convenient to use tf.InteractiveSession(). Invoking this statement within IPython (an interactive Python shell) will make TensorFlow behave almost imperatively, allowing beginners to play with tensors much more easily. You will learn about imperative versus declarative style in greater depth later in this chapter.

Example 2-1. Initialize an interactive TensorFlow session

In [None]:
import tensorflow as tf
tf.InteractiveSession()

Output:<br>`<tensorflow.python.client.session.InteractiveSession>`

The rest of the code in this section will assume that an interactive session has been loaded.

# Initializing Constant Tensors

Until now, we’ve discussed tensors as abstract mathematical entities. However, a system like TensorFlow must run on a real computer, so any tensors must live in computer memory in order to be useful to computer programmers. TensorFlow provides a number of functions that instantiate basic tensors in memory. The simplest of these are tf.zeros() and tf.ones(). tf.zeros() takes a tensor shape (represented as a Python tuple, or as a single integer for a rank-1 tensor) and returns a tensor of that shape filled with zeros. Let’s try invoking this command in the shell (Example 2-2).

Example 2-2. Create a zeros tensor

In [None]:
tf.zeros(2)

Output:<br>`<tf.Tensor 'zeros:0' shape=(2,) dtype=float32>`

TensorFlow returns a reference to the desired tensor rather than the value of the tensor itself. To force the value of the tensor to be returned, we will use the method tf.Tensor.eval() of tensor objects (Example 2-3). Since we have initialized tf.InteractiveSession(), this method will return the value of the zeros tensor to us.

Example 2-3. Evaluate the value of a tensor

In [None]:
a = tf.zeros(2)
a.eval()

Output:<br>`array([ 0., 0.], dtype=float32)`

Note that the evaluated value of the TensorFlow tensor is itself a Python object. In particular, a.eval() is a numpy.ndarray object. NumPy is a sophisticated numerical system for Python. We won’t attempt an in-depth discussion of NumPy here beyond noting that TensorFlow is designed to be compatible with NumPy conventions to a large degree.

We can call tf.zeros() and tf.ones() to create and display tensors of various sizes (Example 2-4).

Example 2-4. Evaluate and display tensors

In [None]:
a = tf.zeros((2, 3))
a.eval()

Output:<br>`
array([[ 0., 0., 0.],
       [ 0., 0., 0.]], dtype=float32)
    `

In [None]:
b = tf.ones((2,2,2))
b.eval()

Output:<br>`
array([[[ 1., 1.],
        [ 1., 1.]],
       [[ 1., 1.],
        [ 1., 1.]]], dtype=float32)
    `

What if we’d like a tensor filled with some value other than 0 or 1? The tf.fill() function provides a nice shortcut for doing so (Example 2-5).

Example 2-5. Filling tensors with arbitrary values

In [None]:
b = tf.fill((2, 2), value=5.)
b.eval()

Output:<br>`
array([[ 5., 5.],
       [ 5., 5.]], dtype=float32)
    `

tf.constant is another function, similar to tf.fill, which allows for construction of tensors that shouldn’t change during the program execution (Example 2-6).

Example 2-6. Creating constant tensors

In [None]:
a = tf.constant(3)
a.eval()

Output:<br>`3`
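Because TensorFlow follows NumPy conventions closely, it can help to see the NumPy counterparts of these constructors side by side. The sketch below is only an illustrative comparison, not part of the TensorFlow API, and the variable names are made up for this example. One difference worth noticing is that NumPy defaults to float64, while the TensorFlow outputs above show float32.

```python
import numpy as np

# NumPy counterparts of tf.zeros, tf.ones, tf.fill, and tf.constant.
zeros = np.zeros((2, 3))        # like tf.zeros((2, 3))
ones = np.ones((2, 2, 2))       # like tf.ones((2, 2, 2))
fives = np.full((2, 2), 5.0)    # like tf.fill((2, 2), 5.)
three = np.array(3)             # like tf.constant(3): a rank-0 array

print(zeros.shape, ones.shape, fives.shape, three.shape)

# NumPy defaults to float64; request float32 explicitly to match TensorFlow's default.
zeros32 = np.zeros((2, 3), dtype=np.float32)
print(zeros.dtype, zeros32.dtype)
```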

# Sampling Random Tensors

Although working with constant tensors is convenient for testing ideas, it’s much more common to initialize tensors with random values. The most common way to do this is to sample each entry in the tensor from a random distribution. tf.random_normal allows for each entry in a tensor of specified shape to be sampled from a Normal distribution of specified mean and standard deviation (Example 2-7).

> **_Symmetry Breaking_**<br>Many machine learning algorithms learn by performing updates to a set of tensors that hold weights. These update equations usually satisfy the property that weights initialized at the same value will continue to evolve together. Thus, if the initial set of tensors is initialized to a constant value, the model won’t be capable of learning much. Fixing this situation requires symmetry breaking. The easiest way of breaking symmetry is to sample each entry in a tensor randomly.
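To make the symmetry argument concrete, here is a minimal NumPy sketch. Everything in it is an assumption made for illustration: a made-up network with two hidden tanh units, a single made-up training example, a squared-error loss, and hand-derived gradients. The helper name train_hidden_weights is hypothetical. With constant initialization, the two rows of the hidden weight matrix receive identical updates and never diverge; with random initialization they stay distinct.

```python
import numpy as np

def train_hidden_weights(W1, w2, steps=20, lr=0.1):
    """Run plain gradient descent on a tiny two-unit network; return the trained W1."""
    x = np.array([1.0, -2.0])    # single made-up input
    y = 0.5                      # made-up target
    W1, w2 = W1.copy(), w2.copy()
    for _ in range(steps):
        h = np.tanh(W1 @ x)              # hidden activations, shape (2,)
        err = w2 @ h - y                 # prediction error for loss 0.5 * err**2
        grad_w2 = err * h
        grad_W1 = np.outer(err * w2 * (1.0 - h ** 2), x)
        W1 -= lr * grad_W1
        w2 -= lr * grad_w2
    return W1

# Constant initialization: both hidden units start (and stay) identical.
const_W1 = train_hidden_weights(np.full((2, 2), 0.3), np.full(2, 0.3))

# Random initialization breaks the symmetry.
rng = np.random.default_rng(0)
rand_W1 = train_hidden_weights(rng.normal(size=(2, 2)), rng.normal(size=2))

print("constant init, rows equal:", np.allclose(const_W1[0], const_W1[1]))
print("random init, rows equal:  ", np.allclose(rand_W1[0], rand_W1[1]))
```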

Example 2-7. Sampling a tensor with random Normal entries

In [None]:
a = tf.random_normal((2, 2), mean=0, stddev=1)
a.eval()

Output:<br>`
array([[ -0.73437649, -0.77678096],
       [  0.51697761,  1.15063596]], dtype=float32)
    `

One thing to note is that machine learning systems often make use of very large tensors with tens of millions of parameters. When we sample tens of millions of random values from the Normal distribution, it becomes almost certain that some sampled values will be far from the mean. Such extreme values can lead to numerical instability, so it’s common to sample using tf.truncated_normal() instead of tf.random_normal(). This function behaves the same as tf.random_normal() in terms of API, but drops and resamples all values more than two standard deviations from the mean.
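The drop-and-resample behavior described above can be sketched in NumPy as follows. This is only a rough illustration of the idea, not TensorFlow’s actual implementation; the function name truncated_normal and its seed parameter are assumptions made for this example.

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, seed=0):
    """Sample Normal values, redrawing any that fall more than 2 stddev from the mean."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(mean, stddev, size=shape)
    out_of_range = np.abs(samples - mean) > 2 * stddev
    # Keep redrawing only the offending entries until none remain.
    while out_of_range.any():
        samples[out_of_range] = rng.normal(mean, stddev, size=int(out_of_range.sum()))
        out_of_range = np.abs(samples - mean) > 2 * stddev
    return samples

samples = truncated_normal((1000,))
print(samples.min(), samples.max())   # both lie within [-2, 2]
```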

tf.random_uniform() behaves like tf.random_normal() except for the fact that random values are sampled from the Uniform distribution over a specified range (Example 2-8).

Example 2-8. Sampling a tensor with uniformly random entries

In [None]:
a = tf.random_uniform((2, 2), minval=-2, maxval=2)
a.eval()

Output:<br>`
array([[ -1.90391684,  1.4179163 ],
       [  0.67762709,  1.07282352]], dtype=float32)
    `

# Tensor Addition and Scaling

TensorFlow makes use of Python’s operator overloading to make basic tensor arithmetic straightforward with standard Python operators (Example 2-9).

Example 2-9. Adding tensors together

In [None]:
c = tf.ones(( 2, 2 ))
d = tf.ones(( 2, 2 ))
e = c + d
e.eval()

Output:<br>`
array([[ 2., 2.],
       [ 2., 2.]], dtype=float32)
    `

In [None]:
f = 2 * e
f.eval()

Output:<br>`
array([[ 4., 4.],
       [ 4., 4.]], dtype=float32)
    `

Tensors can also be multiplied this way. Note, however, that multiplying two tensors with * performs elementwise multiplication, not matrix multiplication, as Example 2-10 shows.

Example 2-10. Elementwise tensor multiplication

In [None]:
c = tf.fill((2,2), 2.)
d = tf.fill((2,2), 7.)
e = c * d
e.eval()

Output:<br>`
array([[ 14., 14.],
       [ 14., 14.]], dtype=float32)
    `
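The difference between the two kinds of product is easy to see in NumPy, whose * operator is likewise elementwise and whose @ operator performs matrix multiplication. The sketch below is a NumPy analogue of Example 2-10, not TensorFlow code, and the variable names are chosen only for illustration.

```python
import numpy as np

c = np.full((2, 2), 2.0)
d = np.full((2, 2), 7.0)

elementwise = c * d       # each entry is 2 * 7
matrix_product = c @ d    # each entry is 2*7 + 2*7

print(elementwise)
print(matrix_product)
```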

# Matrix Operations

TensorFlow provides a variety of amenities for working with matrices. (Matrices are by far the most common type of tensor used in practice.) In particular, TensorFlow provides shortcuts to make certain types of commonly used matrices. The most widely used of these is likely the identity matrix. Identity matrices are square matrices that are 0 everywhere except on the diagonal, where they are 1. tf.eye() allows for fast construction of identity matrices of desired size (Example 2-11).

Example 2-11. Creating an identity matrix

In [None]:
a = tf.eye(4)
a.eval()

Output:<br>`
array([[ 1., 0., 0., 0. ],
       [ 0., 1., 0., 0. ],
       [ 0., 0., 1., 0. ],
       [ 0., 0., 0., 1. ]], dtype=float32)
    `

Diagonal matrices are another common type of matrix. Like identity matrices, diagonal matrices are only nonzero along the diagonal. Unlike identity matrices, they may take arbitrary values along the diagonal. Let’s construct a diagonal matrix with ascending values along the diagonal (Example 2-12). To start, we’ll need a way to construct a vector of ascending values in TensorFlow. The easiest way to do this is to invoke tf.range(start, limit, delta). Note that limit is excluded from the range and delta is the step size for the traversal. The resulting vector can then be fed to tf.diag(diagonal), which will construct a matrix with the specified diagonal.

Example 2-12. Creating diagonal matrices

In [None]:
r = tf.range( 1, 5, 1 )
r.eval()

Output:<br>`array([ 1, 2, 3, 4 ], dtype=int32)`

In [None]:
d = tf.diag(r)
d.eval()

Output:<br>`
array([[ 1, 0, 0, 0 ],
       [ 0, 2, 0, 0 ],
       [ 0, 0, 3, 0 ],
       [ 0, 0, 0, 4 ]], dtype=int32)
    `
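The half-open range convention (limit excluded, delta as the step) is the same one NumPy uses. As a small NumPy analogue of Example 2-12, offered only as a sketch with illustrative variable names:

```python
import numpy as np

r = np.arange(1, 5, 1)        # start=1, limit=5 (excluded), step=1
d = np.diag(r)                # square matrix with r on the diagonal
evens = np.arange(0, 10, 2)   # a larger step skips values; 10 is still excluded

print(r)
print(d)
print(evens)
```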

Now suppose that we have a specified matrix in TensorFlow. How do we compute the matrix transpose? tf.matrix_transpose() will do the trick nicely (Example 2-13).

Example 2-13. Taking a matrix transpose

In [None]:
a = tf.ones(( 2, 3 ))
a.eval()

Output:<br>`
array([[ 1., 1., 1. ],
       [ 1., 1., 1. ]], dtype=float32)
    `

In [None]:
at = tf.matrix_transpose(a)
at.eval()

Output:<br>`
array([[ 1., 1. ],
       [ 1., 1. ],
       [ 1., 1. ]], dtype=float32)
    `

Now, let’s suppose we have a pair of matrices we’d like to multiply using matrix multiplication. The easiest way to do so is by invoking tf.matmul() (Example 2-14).

Example 2-14. Performing matrix multiplication

In [None]:
a = tf.ones((2, 3))
a.eval()

Output:<br>`
array([[ 1., 1., 1. ],
       [ 1., 1., 1. ]], dtype=float32)
    `

In [None]:
b = tf.ones((3, 4))
b.eval()

Output:<br>`
array([[ 1., 1., 1., 1. ],
       [ 1., 1., 1., 1. ],
       [ 1., 1., 1., 1. ]], dtype=float32)
    `

In [None]:
c = tf.matmul(a, b)
c.eval()

Output:<br>`
array([[ 3., 3., 3., 3. ],
       [ 3., 3., 3., 3. ]], dtype=float32)
    `

You can check that this answer matches the mathematical definition of matrix multiplication we provided earlier.
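One way to carry out that check is to compute the product directly from the standard definition, c[i, j] = sum over k of a[i, k] * b[k, j], with explicit loops. The NumPy sketch below assumes that this is the definition given earlier; the helper name matmul_by_definition is hypothetical.

```python
import numpy as np

def matmul_by_definition(a, b):
    """Compute c[i, j] = sum over k of a[i, k] * b[k, j] with explicit loops."""
    n, m = a.shape
    m2, p = b.shape
    assert m == m2, "inner dimensions must match"
    c = np.zeros((n, p), dtype=a.dtype)
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i, j] += a[i, k] * b[k, j]
    return c

a = np.ones((2, 3))
b = np.ones((3, 4))
c = matmul_by_definition(a, b)
print(c)   # every entry is 1*1 + 1*1 + 1*1 = 3
```

The result has shape (2, 4): the inner dimension of size 3 is summed away, which is why each entry equals 3.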

# Tensor Types

You may have noticed the dtype notation in the preceding examples. Tensors in TensorFlow come in a variety of types such as tf.float32, tf.float64, tf.int32, and tf.int64. It’s possible to create tensors of specified types by setting dtype in tensor construction functions. Furthermore, given a tensor, it’s possible to change its type using casting functions such as tf.to_double(), tf.to_float(), tf.to_int32(), tf.to_int64(), and others (Example 2-15).

Example 2-15. Creating tensors of different types

In [None]:
a = tf.ones((2,2), dtype=tf.int32)
a.eval()

Output:<br>`
array([[1, 1],
       [1, 1]], dtype=int32)
    `

In [None]:
b = tf.to_float(a)
b.eval()

Output:<br>`
array([[ 1., 1.],
       [ 1., 1.]], dtype=float32)
    `

# Tensor Shape Manipulations

Within TensorFlow, tensors are just collections of numbers written in memory. The different shapes are views into the underlying set of numbers that provide different ways of interacting with the set of numbers. At different times, it can be useful to view the same set of numbers as forming tensors with different shapes. tf.reshape() allows tensors to be converted into tensors with different shapes (Example 2-16).

Example 2-16. Manipulating tensor shapes

In [None]:
a = tf.ones(8)
a.eval()

Output:<br>`array([ 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32)`

In [None]:
b = tf.reshape(a, (4, 2))
b.eval()

Output:<br>`
array([[ 1., 1.],
       [ 1., 1.],
       [ 1., 1.],
       [ 1., 1.]], dtype=float32)
    `

In [None]:
c = tf.reshape(a, (2, 2, 2))
c.eval()

Output:<br>`
array([[[ 1., 1.],
        [ 1., 1.]],
       [[ 1., 1.],
        [ 1., 1.]]], dtype=float32)
    `

Notice how we can turn the original rank-1 tensor into a rank-2 tensor and then into a rank-3 tensor with tf.reshape. While all necessary shape manipulations can be performed with tf.reshape(), sometimes it can be convenient to perform simpler shape manipulations using functions such as tf.expand_dims or tf.squeeze. tf.expand_dims adds an extra dimension of size 1 to a tensor. It’s useful for increasing the rank of a tensor by one (for example, when converting a rank-1 vector into a rank-2 row vector or column vector). tf.squeeze, on the other hand, removes all dimensions of size 1 from a tensor. It’s a useful way to convert a row or column vector into a flat vector.

This is also a convenient opportunity to introduce the tf.Tensor.get_shape() method (Example 2-17). This method lets users query the shape of a tensor.

Example 2-17. Getting the shape of a tensor

In [None]:
a = tf.ones(2)
a.get_shape()

Output:<br>`TensorShape([Dimension(2)])`

In [None]:
a.eval()

Output:<br>`array([ 1., 1.], dtype=float32)`

In [None]:
b = tf.expand_dims(a, 0)
b.get_shape()

Output:<br>`TensorShape([Dimension(1), Dimension(2)])`

In [None]:
b.eval()

Output:<br>`array([[ 1., 1.]], dtype=float32)`

In [None]:
c = tf.expand_dims(a, 1)
c.get_shape()

Output:<br>`TensorShape([Dimension(2), Dimension(1)])`

In [None]:
c.eval()

Output:<br>`
array([[ 1.],
       [ 1.]], dtype=float32)
    `

In [None]:
d = tf.squeeze(b)
d.get_shape()

Output:<br>`TensorShape([Dimension(2)])`

In [None]:
d.eval()

Output:<br>`array([ 1., 1.], dtype=float32)`
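Because the tensors in Examples 2-16 and 2-17 are filled with ones, they don’t reveal how individual entries are rearranged. The NumPy sketch below uses distinct values to make the ordering visible. It relies on NumPy’s default row-major order, which tf.reshape also follows; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(8)              # [0 1 2 3 4 5 6 7]
b = a.reshape((4, 2))         # rows are filled in order: [0 1], [2 3], [4 5], [6 7]
c = a.reshape((2, 2, 2))      # the first block holds 0..3, the second holds 4..7

row = np.expand_dims(a, 0)    # shape (1, 8): a row vector
col = np.expand_dims(a, 1)    # shape (8, 1): a column vector
flat = np.squeeze(col)        # size-1 dimension removed: shape (8,)

print(b)
print(c)
print(row.shape, col.shape, flat.shape)
```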

# Introduction to Broadcasting

Broadcasting is a term (introduced by NumPy) for the rules under which a tensor system allows matrices and vectors of different sizes to be added together. These rules allow for conveniences like adding a vector to every row of a matrix. Broadcasting rules can be quite complex, so we will not dive into a formal discussion of the rules. It’s often easier to experiment and see how the broadcasting works (Example 2-18).

Example 2-18. Examples of broadcasting

In [None]:
a = tf.ones((2, 2))
a.eval()

Output:<br>`
array([[ 1., 1.],
       [ 1., 1.]], dtype=float32)
    `

In [None]:
b = tf.range(0, 2, 1, dtype=tf.float32)
b.eval()

Output:<br>`array([ 0., 1.], dtype=float32)`

In [None]:
c = a + b
c.eval()

Output:<br>`
array([[ 1., 2.],
       [ 1., 2.]], dtype=float32)
    `

Notice that the vector b is added to every row of matrix a. Notice another subtlety: we explicitly set the dtype for b. If the dtype isn’t set, tf.range returns an int32 tensor, and TensorFlow will report a type error when we try to add it to the float32 matrix a. Let’s see what would have happened if we hadn’t set the dtype (Example 2-19).

Example 2-19. TensorFlow doesn’t perform implicit type casting

In [None]:
b = tf.range(0, 2, 1)
b.eval()

Output:<br>`array([0, 1], dtype=int32)`

In [None]:
c = a + b

Output:<br>`
ValueError: Tensor conversion requested dtype float32 for Tensor with dtype int32:
'Tensor("range_2:0", shape=(2,), dtype=int32)'
    `

Unlike languages such as C, TensorFlow doesn’t perform implicit type casting under the hood. It’s often necessary to perform explicit type casts when doing arithmetic operations.
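Returning to the shapes in Example 2-18, a rough mental model of NumPy-style broadcasting is that shapes are compared from the last dimension backward, and each pair of dimensions must either match or include a 1. The pure-Python sketch below implements only this shape check. It is a simplification for building intuition, and the helper name broadcast_shape is hypothetical rather than a TensorFlow function.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if the shapes clash."""
    result = []
    # Walk both shapes from the last dimension backward, treating missing dims as 1.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError("cannot broadcast %r with %r" % (shape_a, shape_b))
    return tuple(reversed(result))

print(broadcast_shape((2, 2), (2,)))   # the matrix-plus-vector case from Example 2-18
print(broadcast_shape((4, 1), (3,)))   # a column vector and a row-like vector
```

Under this rule, a shape (2, 3) matrix cannot be added to a length-2 vector, because the trailing dimensions 3 and 2 neither match nor include a 1.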