# 2.1. Data Manipulation

## 2.1.1. Getting Started

To start, we import `tensorflow`. For brevity, practitioners often assign it the alias `tf`.

In [1]:
import tensorflow as tf




In [2]:
x = tf.range(12, dtype = tf.float32)
x

<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.],
      dtype=float32)>

Each of these values is called an element of the tensor. The tensor _x_ contains 12 elements. We can inspect the total number of elements in a tensor by passing it to the ```tf.size``` function.

In [3]:
tf.size(x)

<tf.Tensor: shape=(), dtype=int32, numpy=12>

We can access a tensor’s shape (the length along each axis) by inspecting its ```shape``` attribute. Because we are dealing with a vector here, the shape contains just a single element and is identical to the size.

In [4]:
x.shape

TensorShape([12])

We can change the shape of a tensor without altering its size or values, by invoking ```reshape```. For example, we can transform our vector x whose shape is (12,) to a matrix X with shape (3, 4). This new tensor retains all elements but reconfigures them into a matrix. Notice that the elements of our vector are laid out one row at a time and thus x[3] == X[0, 3].

In [5]:
X = tf.reshape(x, (3, 4))
X

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)>
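As a small aside, `tf.reshape` can also infer one dimension when it is passed as `-1`, which avoids working it out by hand. The sketch below checks this together with the row-major layout described above. The names `x_demo`, `X_demo`, and `X_auto` are illustrative and are not part of the original notebook.

```python
import tensorflow as tf

x_demo = tf.range(12, dtype=tf.float32)
X_demo = tf.reshape(x_demo, (3, 4))

# -1 asks TensorFlow to infer that axis from the total size (12 / 4 = 3).
X_auto = tf.reshape(x_demo, (-1, 4))
print(X_auto.shape)  # (3, 4)

# Row-major layout: element 3 of the vector is row 0, column 3 of the matrix.
print(bool(x_demo[3] == X_demo[0, 3]))  # True

# Reshaping never changes the number of elements.
print(int(tf.size(X_demo)) == int(tf.size(x_demo)))  # True
```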

Practitioners often need to work with tensors initialized to contain all 0s or 1s. We can construct a tensor with all elements set to 0 and a shape of (2, 3, 4) via the ```zeros``` function.

In [6]:
tf.zeros((2, 3, 4))

<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=
array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]], dtype=float32)>

Similarly, we can create a tensor with all 1s by invoking ```ones```.

In [7]:
tf.ones((2, 3, 4))

<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]], dtype=float32)>

We often wish to sample each element randomly (and independently) from a given probability distribution. For example, the parameters of neural networks are often initialized randomly. The following snippet creates a tensor with elements drawn from a standard Gaussian (normal) distribution with mean 0 and standard deviation 1.

In [8]:
tf.random.normal(shape=[3, 4])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 0.90030813, -0.3498181 ,  1.00829   , -0.93116874],
       [-0.3247628 ,  0.2344069 , -0.6093807 ,  0.7520952 ],
       [-1.0945755 ,  0.58003813, -1.5838258 , -0.9817583 ]],
      dtype=float32)>
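Because the values are random, the output above differs from run to run. A minimal sketch of how one might make the draw repeatable and sanity-check the distribution follows. The seed value and the names `sample`, `sample_mean`, and `sample_std` are arbitrary choices made for this illustration.

```python
import tensorflow as tf

tf.random.set_seed(42)  # arbitrary seed, chosen only for this illustration

# mean and stddev are spelled out here; 0.0 and 1.0 are also the defaults.
sample = tf.random.normal(shape=[10000], mean=0.0, stddev=1.0)

sample_mean = float(tf.reduce_mean(sample))
sample_std = float(tf.math.reduce_std(sample))
print(sample_mean, sample_std)  # close to 0 and 1, but not exactly equal
```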

Finally, we can construct tensors with exact values for each element by supplying (possibly nested) Python list(s) containing numerical literals. Here, we construct a matrix with a list of lists, where the outermost list corresponds to axis 0, and the inner list corresponds to axis 1.

In [9]:
tf.constant([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[2, 1, 4, 3],
       [1, 2, 3, 4],
       [4, 3, 2, 1]])>
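Note that the result above has dtype `int32`, because every literal is an integer. The following sketch shows how the inferred dtype depends on the literals and how it can be set explicitly. The variable names are illustrative only.

```python
import tensorflow as tf

ints = tf.constant([[2, 1], [4, 3]])      # integer literals only -> int32
floats = tf.constant([[2.0, 1], [4, 3]])  # one float literal is enough -> float32
forced = tf.constant([[2, 1], [4, 3]], dtype=tf.float32)  # explicit dtype

print(ints.dtype, floats.dtype, forced.dtype)

# tf.cast converts an existing tensor to another dtype.
as_float = tf.cast(ints, tf.float32)
print(as_float.dtype)
```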

## 2.1.2. Indexing and Slicing

In [10]:
X[-1], X[1:3]

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8.,  9., 10., 11.], dtype=float32)>,
 <tf.Tensor: shape=(2, 4), dtype=float32, numpy=
 array([[ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]], dtype=float32)>)
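The cell above relies on Python-style indexing. A negative index counts from the end, and a slice `start:stop` includes `start` but excludes `stop`. The sketch below spells out a few more cases on a freshly built matrix. `X_demo` and the other names are hypothetical and used only here.

```python
import tensorflow as tf

X_demo = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

last_row = X_demo[-1]    # negative indices count from the end
rows_1_2 = X_demo[1:3]   # start is inclusive, stop is exclusive
column_0 = X_demo[:, 0]  # ':' keeps every index along axis 0
element = X_demo[1, 2]   # one index per axis selects a single element

print(last_row.numpy())  # [ 8.  9. 10. 11.]
print(column_0.numpy())  # [0. 4. 8.]
print(float(element))    # 6.0
```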

```Tensors``` in TensorFlow are immutable, and cannot be assigned to. ```Variables``` in TensorFlow are mutable containers of state that support assignments. Keep in mind that gradients in TensorFlow do not flow backwards through ```Variable``` assignments.

Beyond assigning a value to the entire ```Variable```, we can write elements of a ```Variable``` by specifying indices.

In [11]:
X_var = tf.Variable(X)
X_var[1, 2].assign(9)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  9.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)>
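To make the immutability of plain tensors concrete, the following sketch first attempts an item assignment on a tensor, which is expected to fail, and then uses `tf.tensor_scatter_nd_update` as one possible way to obtain an updated copy without a `Variable`. The names `X_demo` and `X_updated` are illustrative.

```python
import tensorflow as tf

X_demo = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

# Plain tensors reject item assignment.
try:
    X_demo[1, 2] = 9
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True

# tf.tensor_scatter_nd_update returns a new tensor; X_demo is left untouched.
X_updated = tf.tensor_scatter_nd_update(X_demo, indices=[[1, 2]], updates=[9.0])
print(float(X_updated[1, 2]), float(X_demo[1, 2]))  # 9.0 6.0
```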

If we want to assign multiple elements the same value, we index the ```Variable``` to select all of them and then call ```assign``` on the resulting slice. For instance, [:2, :] accesses the first and second rows, where : takes all the elements along axis 1 (column). While we discussed indexing for matrices, this also works for vectors and for tensors of more than two dimensions.

In [12]:
X_var = tf.Variable(X)
X_var[:2, :].assign(tf.ones(X_var[:2, :].shape, dtype = tf.float32) * 12)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[12., 12., 12., 12.],
       [12., 12., 12., 12.],
       [ 8.,  9., 10., 11.]], dtype=float32)>

## 2.1.3. Operations

In mathematical notation, we denote _unary_ scalar operators (taking one input) by the signature $f: \mathbb{R} \rightarrow \mathbb{R}$. This just means that the function maps any real number to some other real number. Most standard operators, including unary ones like $e^x$, can be applied elementwise.

In [13]:
tf.exp(x)

<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([1.0000000e+00, 2.7182817e+00, 7.3890562e+00, 2.0085537e+01,
       5.4598148e+01, 1.4841316e+02, 4.0342877e+02, 1.0966332e+03,
       2.9809580e+03, 8.1030840e+03, 2.2026465e+04, 5.9874141e+04],
      dtype=float32)>

Likewise, we denote binary scalar operators, which map pairs of real numbers to a (single) real number via the signature $f: \mathbb{R}, \mathbb{R} \rightarrow \mathbb{R}$. Given any two vectors $\mathbf{u}$ and $\mathbf{v}$ of the same shape, and a binary operator $f$, we can produce a vector $\mathbf{c} = F\left(\mathbf{u}, \mathbf{v}\right)$ by setting $c_i \leftarrow f\left(u_i, v_i \right)$, for all $i$, where $c_i$, $u_i$ and $v_i$ are the $i^\text{th}$ elements of vectors $\mathbf{c}$, $\mathbf{u}$ and $\mathbf{v}$. Here, we produced the vector-valued $F: \mathbb{R}^d, \mathbb{R}^d \rightarrow \mathbb{R}^d$ by _lifting_ the scalar function to an elementwise vector operation. The common standard arithmetic operators for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (**) have all been _lifted_ to elementwise operations for identically-shaped tensors of arbitrary shape.

In [14]:
x = tf.constant([1.0, 2, 4, 8])
y = tf.constant([2.0, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 3.,  4.,  6., 10.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([-1.,  0.,  2.,  6.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 2.,  4.,  8., 16.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.5, 1. , 2. , 4. ], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 1.,  4., 16., 64.], dtype=float32)>)
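The idea of lifting can be sketched in a few lines of plain Python. The helper `lift` below is hypothetical (it is not a TensorFlow function) and is far slower than the built-in operators. It only illustrates that $F$ applies $f$ to each pair $(u_i, v_i)$.

```python
import operator
import tensorflow as tf

def lift(f):
    """Turn a scalar function f(a, b) into an elementwise vector function F(u, v)."""
    def F(u, v):
        pairs = zip(u.numpy().tolist(), v.numpy().tolist())
        return tf.constant([f(ui, vi) for ui, vi in pairs], dtype=u.dtype)
    return F

u = tf.constant([1.0, 2, 4, 8])
v = tf.constant([2.0, 2, 2, 2])

# The hand-lifted operators agree with TensorFlow's built-in elementwise ones.
add_matches = bool(tf.reduce_all(lift(operator.add)(u, v) == u + v))
mul_matches = bool(tf.reduce_all(lift(operator.mul)(u, v) == u * v))
print(add_matches, mul_matches)  # True True
```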

We can also concatenate multiple tensors, stacking them end-to-end to form a larger one. We just need to provide a list of tensors and tell the system along which axis to concatenate. The example below shows what happens when we concatenate two matrices along rows (axis 0) and along columns (axis 1).

In [15]:
X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
tf.concat([X, Y], axis=0), tf.concat([X, Y], axis=1)

(<tf.Tensor: shape=(6, 4), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 2.,  1.,  4.,  3.],
        [ 1.,  2.,  3.,  4.],
        [ 4.,  3.,  2.,  1.]], dtype=float32)>,
 <tf.Tensor: shape=(3, 8), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]], dtype=float32)>)

Sometimes, we want to construct a binary tensor via logical statements. Take X == Y as an example. For each position i, j, if X[i, j] and Y[i, j] are equal, then the corresponding entry in the result is True; otherwise it is False.

In [16]:
X == Y

<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[False,  True, False,  True],
       [False, False, False, False],
       [False, False, False, False]])>
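If numeric 0/1 entries are needed instead of booleans, one option is to cast the mask. The sketch below rebuilds the two matrices under the illustrative names `X_demo` and `Y_demo` and counts how many positions are equal.

```python
import tensorflow as tf

X_demo = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y_demo = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

mask = X_demo == Y_demo              # dtype=bool
mask_01 = tf.cast(mask, tf.float32)  # True -> 1.0, False -> 0.0
num_equal = int(tf.reduce_sum(mask_01))
print(num_equal)  # 2
```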

Summing all the elements in the tensor yields a tensor with only one element.

In [17]:
tf.reduce_sum(X)

<tf.Tensor: shape=(), dtype=float32, numpy=66.0>
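`tf.reduce_sum` also accepts an `axis` argument, which sums along one axis only instead of over all elements. A brief sketch, using the illustrative name `X_demo` for the same 3×4 matrix:

```python
import tensorflow as tf

X_demo = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

col_sums = tf.reduce_sum(X_demo, axis=0)  # collapses axis 0, shape (4,)
row_sums = tf.reduce_sum(X_demo, axis=1)  # collapses axis 1, shape (3,)
print(col_sums.numpy())  # [12. 15. 18. 21.]
print(row_sums.numpy())  # [ 6. 22. 38.]
```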

## 2.1.4. Broadcasting

Under certain conditions, even when shapes differ, we can still perform elementwise binary operations by invoking the broadcasting mechanism. Broadcasting works according to the following two-step procedure: (i) expand one or both arrays by copying elements along axes with length 1 so that after this transformation, the two tensors have the same shape; (ii) perform an elementwise operation on the resulting arrays.

In [18]:
a = tf.reshape(tf.range(3), (3, 1))
b = tf.reshape(tf.range(2), (1, 2))
a, b

(<tf.Tensor: shape=(3, 1), dtype=int32, numpy=
 array([[0],
        [1],
        [2]])>,
 <tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[0, 1]])>)

Since a and b are $3\times 1$ and $1\times 2$ matrices, respectively, their shapes do not match up. Broadcasting produces a larger $3\times 2$ matrix by replicating matrix a along the columns and matrix b along the rows before adding them elementwise.

$$ a = \begin{pmatrix}
0 \\
1 \\
2
\end{pmatrix}, \quad b = \begin{pmatrix}
0 & 1 \end{pmatrix}$$

$$ a + b \rightarrow \begin{pmatrix}
0 & 0 \\
1 & 1 \\
2 & 2
\end{pmatrix} + \begin{pmatrix}
0 & 1 \\
0 & 1 \\
0 & 1
\end{pmatrix} = \begin{pmatrix}
0 & 1 \\
1 & 2 \\
2 & 3
\end{pmatrix}$$

In [19]:
a + b

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[0, 1],
       [1, 2],
       [2, 3]])>
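The two-step procedure can be written out explicitly with `tf.broadcast_to`. This is only a sketch for illustration, since TensorFlow performs the expansion implicitly and need not materialize the copies. The names `a_demo`, `b_demo`, `a_big`, and `b_big` are hypothetical.

```python
import tensorflow as tf

a_demo = tf.reshape(tf.range(3), (3, 1))
b_demo = tf.reshape(tf.range(2), (1, 2))

# Step (i): expand both operands to the common shape (3, 2).
a_big = tf.broadcast_to(a_demo, (3, 2))
b_big = tf.broadcast_to(b_demo, (3, 2))

# Step (ii): ordinary elementwise addition on equal shapes.
manual = a_big + b_big
print(bool(tf.reduce_all(manual == a_demo + b_demo)))  # True
```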

## 2.1.5. Saving Memory

Running operations can cause new memory to be allocated to host results. For example, if we write Y = Y + X, we dereference the tensor that Y used to point to and instead point Y at the newly allocated memory. We can demonstrate this with Python’s id() function, which returns an identifier that is unique to an object for its lifetime (in CPython, its memory address). Note that after we run Y = Y + X, id(Y) returns a different value. That is because Python first evaluates Y + X, allocating new memory for the result, and then points Y to this new location in memory.

In [20]:
id(Y)

2861766828880

In [21]:
before = id(Y)
Y = Y + X
id(Y) == before

False

`Variables` are mutable containers of state in TensorFlow. They provide a way to store your model parameters. We can assign the result of an operation to a `Variable` with `assign`. To illustrate this concept, we overwrite the values of `Variable` Z after initializing it, using `zeros_like`, to have the same shape as Y.

In [22]:
Z = tf.Variable(tf.zeros_like(Y))
print(f"id(Z): {id(Z)}")
Z.assign(X + Y)
print(f"id(Z): {id(Z)}")

id(Z): 2861766667248
id(Z): 2861766667248
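Besides `assign`, a `Variable` also offers `assign_add` and `assign_sub`, which update the stored value while the Python object stays the same. A minimal sketch, with `Z_demo` as an illustrative name:

```python
import tensorflow as tf

Z_demo = tf.Variable(tf.zeros((3, 4)))
before = id(Z_demo)

Z_demo.assign_add(tf.ones((3, 4)))        # Z_demo <- Z_demo + 1
Z_demo.assign_sub(0.5 * tf.ones((3, 4)))  # Z_demo <- Z_demo - 0.5

same_object = id(Z_demo) == before
print(same_object)          # True: still the same Variable object
print(float(Z_demo[0, 0]))  # 0.5
```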


Even once you store state persistently in a Variable, you may want to reduce your memory usage further by avoiding excess allocations for tensors that are not your model parameters. Because TensorFlow Tensors are immutable and gradients do not flow through Variable assignments, TensorFlow does not provide an explicit way to run an individual operation in-place.

However, TensorFlow provides the tf.function decorator to wrap computation inside of a TensorFlow graph that gets compiled and optimized before running. This allows TensorFlow to prune unused values, and to reuse prior allocations that are no longer needed. This minimizes the memory overhead of TensorFlow computations.

In [23]:
@tf.function
def computation(X, Y):
    Z = tf.zeros_like(Y)  # This unused value will be pruned out
    A = X + Y  # Allocations will be reused when no longer needed
    B = A + Y
    C = B + Y
    return C + Y

computation(X, Y)

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 8.,  9., 26., 27.],
       [24., 33., 42., 51.],
       [56., 57., 58., 59.]], dtype=float32)>
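One way to see that `tf.function` builds a graph once and then reuses it is to record when the Python body actually runs. In the sketch below, the list `trace_log` and the function `add_four_times` are hypothetical names. Plain Python side effects such as `append` run only while TensorFlow traces the function, not on later calls with inputs of the same shape and dtype.

```python
import tensorflow as tf

trace_log = []  # filled by plain Python code, which runs only while tracing

@tf.function
def add_four_times(X, Y):
    trace_log.append("traced")
    return X + Y + Y + Y + Y

X_demo = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y_demo = tf.ones((3, 4))

first = add_four_times(X_demo, Y_demo)
second = add_four_times(X_demo, Y_demo)  # same shapes and dtypes: graph is reused

print(len(trace_log))  # 1
print(bool(tf.reduce_all(first == X_demo + 4 * Y_demo)))  # True
```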

## 2.1.6. Conversion to Other Python Objects

Converting to a NumPy tensor (`ndarray`), or vice versa, is easy. The converted result does not share memory. This minor inconvenience is actually quite important: when you perform operations on the CPU or on GPUs, you do not want to halt computation, waiting to see whether the NumPy package of Python might want to be doing something else with the same chunk of memory.

In [24]:
A = X.numpy()
B = tf.constant(A)
type(A), type(B)

(numpy.ndarray, tensorflow.python.framework.ops.EagerTensor)
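The claim that the converted result does not share memory can be checked for the tensor-to-NumPy direction. The sketch below assumes a recent TensorFlow 2 release, where `.numpy()` on an eager tensor returns a copy. `X_demo` and `A_demo` are illustrative names.

```python
import tensorflow as tf

X_demo = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

A_demo = X_demo.numpy()  # NumPy copy of the tensor's contents
A_demo[0, 0] = 100.0     # NumPy arrays are mutable, unlike tensors

# The tensor keeps its original value because the array is a separate copy.
print(float(A_demo[0, 0]), float(X_demo[0, 0]))  # 100.0 0.0
```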

To convert a size-1 tensor to a Python scalar, we can first convert it to a NumPy array and then invoke the array's ```item``` method or Python’s built-in functions.

In [25]:
a = tf.constant([3.5]).numpy()
a, a.item(), float(a[0]), int(a[0])  # index first: recent NumPy deprecates float()/int() on non-0-d arrays

(array([3.5], dtype=float32), 3.5, 3.5, 3)

## 2.1.7. Summary

The tensor class is the main interface for storing and manipulating data in deep learning libraries. Tensors provide a variety of functionalities, including construction routines; indexing and slicing; basic mathematical operations; broadcasting; memory-efficient assignment; and conversion to and from other Python objects.