# 2.1 Data Manipulation

[book link](https://www.d2l.ai/chapter_preliminaries/ndarray.html#getting-started)

## Tensors

- an array (possibly multidimensional) of numerical values
- **Tensor** with one axis => _vector_
- **Tensor** with two axes => _matrix_

**Tensors** in **TensorFlow** are similar to **NumPy**'s `ndarray`, while also supporting:
- GPU computation
- Automatic differentiation

### 2.1.1 TensorFlow & Tensors

**Basic operations for Tensors**

In [3]:
import tensorflow as tf

> **Element** - Each item inside the Tensor

In [4]:
x = tf.range(12)
x

<tf.Tensor: shape=(12,), dtype=int32, numpy=array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11], dtype=int32)>

In [5]:
x.shape  # The shape of x (length along each axis)

TensorShape([12])

In [6]:
tf.size(x)  # The total # of Elements inside x

<tf.Tensor: shape=(), dtype=int32, numpy=12>

In [7]:
X = tf.reshape(x, (3,4))  # Reshape the Tensor, without changing the elements
X

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]], dtype=int32)>

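The above is equivalent to the following, where `-1` tells TensorFlow to infer that dimension from the total number of elements: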

In [8]:
Y = tf.reshape(x, (-1, 4))
Y

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]], dtype=int32)>
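
The inference behind `-1` can be sketched in plain Python. This is only an illustration of the arithmetic, not TensorFlow's implementation; `infer_shape` is a hypothetical helper introduced here.

```python
from math import prod


def infer_shape(num_elements, shape):
    """Resolve a single -1 in `shape` so the dimensions multiply to `num_elements`.

    Hypothetical helper for illustration; TensorFlow does this internally.
    """
    unknown = [i for i, d in enumerate(shape) if d == -1]
    if len(unknown) > 1:
        raise ValueError("at most one dimension may be -1")

    # Product of the dimensions that were given explicitly.
    known = prod(d for d in shape if d != -1)

    if not unknown:
        if known != num_elements:
            raise ValueError("shape does not match the number of elements")
        return tuple(shape)

    if known == 0 or num_elements % known != 0:
        raise ValueError("cannot infer the missing dimension")

    resolved = list(shape)
    resolved[unknown[0]] = num_elements // known
    return tuple(resolved)


print(infer_shape(12, (-1, 4)))  # (3, 4)
print(infer_shape(12, (3, -1)))  # (3, 4)
```

With 12 elements and a known dimension of 4, the missing dimension must be 12 / 4 = 3, which is why `(-1, 4)` and `(3, 4)` give the same result.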

**Initializing w/ Zeros, Ones, Random Numbers (from Gaussian distribution), & User-Defined Constants**

In [9]:
tf.zeros((2,3,4))

<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=
array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]], dtype=float32)>

In [10]:
tf.ones((2,3,4))

<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]], dtype=float32)>

In [11]:
tf.random.normal(shape=[3,4])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 0.547278  , -0.8412348 ,  0.02743153, -0.54868853],
       [ 0.6699073 , -0.1383807 ,  1.2917691 ,  0.9759256 ],
       [ 0.70839417, -0.78390104,  0.4515836 ,  0.6403269 ]],
      dtype=float32)>

In [12]:
tf.constant([[2,1,4,3],[1,2,3,4],[4,3,2,1]])

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[2, 1, 4, 3],
       [1, 2, 3, 4],
       [4, 3, 2, 1]], dtype=int32)>

### 2.1.2 Operations

In [13]:
x = tf.constant([1.0, 2, 4, 8])
y = tf.constant([2.0,2,3,4])

x+y, x-y, x*y, x/y, x**y

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 3.,  4.,  7., 12.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([-1.,  0.,  1.,  4.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 2.,  4., 12., 32.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.5      , 1.       , 1.3333334, 2.       ], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([1.000e+00, 4.000e+00, 6.400e+01, 4.096e+03], dtype=float32)>)

In [14]:
tf.exp(x)  # Exponentiation --> e^x

<tf.Tensor: shape=(4,), dtype=float32, numpy=
array([2.7182817e+00, 7.3890562e+00, 5.4598148e+01, 2.9809580e+03],
      dtype=float32)>

**CONCATENATION**

In [15]:
X = tf.reshape(tf.range(12, dtype=tf.float32), (3,4))
Y = tf.ones(shape=[3,4])
tf.concat([X, Y], axis=0), tf.concat([X, Y], axis=1)

(<tf.Tensor: shape=(6, 4), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]], dtype=float32)>,
 <tf.Tensor: shape=(3, 8), dtype=float32, numpy=
 array([[ 0.,  1.,  2.,  3.,  1.,  1.,  1.,  1.],
        [ 4.,  5.,  6.,  7.,  1.,  1.,  1.,  1.],
        [ 8.,  9., 10., 11.,  1.,  1.,  1.,  1.]], dtype=float32)>)

In [16]:
X == Y

<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[False,  True, False, False],
       [False, False, False, False],
       [False, False, False, False]])>

In [17]:
tf.reduce_sum(X)

<tf.Tensor: shape=(), dtype=float32, numpy=66.0>

**BROADCASTING**

In [18]:
a = tf.reshape(tf.range(3), (3,1))
b = tf.reshape(tf.range(2), (1,2))
a,b

(<tf.Tensor: shape=(3, 1), dtype=int32, numpy=
 array([[0],
        [1],
        [2]], dtype=int32)>,
 <tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[0, 1]], dtype=int32)>)

When tensors in an elementwise operation have different _shapes_, both are broadcast to a common shape by duplicating values along axes of length 1 (e.g. (3,1) and (1,2) --> (3,2): `a` is copied across the columns and `b` across the rows).

In [19]:
a+b

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[0, 1],
       [1, 2],
       [2, 3]], dtype=int32)>
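
The shape rule behind broadcasting can be sketched in plain Python. This assumes the NumPy-style rule (compare axes from the last one backwards; two lengths are compatible if they are equal or one of them is 1). `broadcast_shape` is a hypothetical helper, not a TensorFlow function.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the common shape two shapes broadcast to, or raise ValueError.

    Hypothetical helper illustrating the NumPy-style broadcasting rule.
    """
    result = []
    # Walk both shapes from the last axis backwards; a missing axis counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcast-compatible"
            )
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (1, 2)))  # (3, 2)
```

Under this rule, shapes such as (2, 2, 3) and (12, 1) are rejected, because the middle axes (2 and 12) differ and neither is 1.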

In [20]:
a = tf.reshape(tf.ones(12), (2,2,-1))
b = tf.reshape(tf.ones(12) * 12, (-1,1))

**INDEXING**

In [21]:
X[-1], X[1,3]

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8.,  9., 10., 11.], dtype=float32)>,
 <tf.Tensor: shape=(), dtype=float32, numpy=7.0>)

**Tensors in Tensorflow are IMMUTABLE - Variables are MUTABLE**

In [22]:
X_var = tf.Variable(X)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)>

In [23]:
X_var[1,2].assign(9)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  9.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)>

In [24]:
X_var[0:2, :].assign(tf.ones(X_var[0:2, :].shape, dtype=tf.float32)* 12)

<tf.Variable 'UnreadVariable' shape=(3, 4) dtype=float32, numpy=
array([[12., 12., 12., 12.],
       [12., 12., 12., 12.],
       [ 8.,  9., 10., 11.]], dtype=float32)>

### 2.1.5 Saving Memory
We want to _avoid_ allocating extra memory when assigning variables:

In [25]:
before = id(Y)
Y = Y + X  # Allocates a new tensor and rebinds the name Y to it
id(Y) == before

False

This is _undesirable_ because:
1. Extra memory is allocated unnecessarily,
2. Other references may still point to the old memory location, so code that uses them sees stale parameters

In [26]:
Z = tf.Variable(tf.zeros_like(Y))
print('id(Z):', id(Z))
Z.assign(X+Y)
print('id(Z):', id(Z))

id(Z): 139885748564848
id(Z): 139885748564848
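
The stale-reference problem can be made concrete with a second name pointing at the same object. This is a minimal sketch assuming TensorFlow 2.x in eager mode; the names `alias`, `V`, and `alias_v` are introduced here for illustration.

```python
import tensorflow as tf

X = tf.ones((2, 2))

# Case 1: plain tensors are immutable, so `Y = Y + X` rebinds the name Y.
Y = tf.zeros((2, 2))
alias = Y              # second reference to the original tensor
Y = Y + X              # Y now names a newly allocated tensor
print(alias is Y)      # False
print(alias.numpy())   # still all zeros: the alias is stale

# Case 2: a Variable is updated in place, so every reference sees the change.
V = tf.Variable(tf.zeros((2, 2)))
alias_v = V
V.assign_add(X)        # in-place update of the Variable's value
print(alias_v is V)    # True
print(alias_v.numpy()) # all ones
```

In the first case the alias keeps the old values after the update; in the second, both names refer to the same `tf.Variable`, so the update is visible through either.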


The `@tf.function` decorator compiles the computation into a TensorFlow graph, which lets TF prune unused values and reuse memory allocations:

In [28]:
@tf.function
def computation(X, Y):
    Z = tf.zeros_like(Y)  # Since Z is unused, TF will prune it out
    A = X + Y  # Allocations will be reused when no longer needed
    B = A + Y
    C = B + Y
    return C + Y
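
A short usage sketch for the decorated function, assuming TensorFlow 2.x and inputs shaped like the `X` and `Y` used earlier in these notes. The function is repeated so the block runs on its own; it checks only the numerical result (`X + 4 * Y`), not memory behavior.

```python
import tensorflow as tf


@tf.function
def computation(X, Y):
    Z = tf.zeros_like(Y)  # unused, so it does not affect the result
    A = X + Y
    B = A + Y
    C = B + Y
    return C + Y


X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.ones((3, 4))

result = computation(X, Y)  # the first call traces the function into a graph
expected = X + 4 * Y

print(result.shape)                              # (3, 4)
print(bool(tf.reduce_all(result == expected)))   # True
```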