# Basics


The basic data structure used in TensorFlow is called a tensor.

Tensors are immutable objects. When we assign the result of an operation to an existing name, that name is therefore usually bound to a newly allocated object at a different location, unless we take special care to update the existing object in place.
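As a minimal sketch of what immutability means in practice, the snippet below tries to overwrite one element of a `tf.Tensor`, then rebinds the name to a new tensor, and finally updates a `tf.Variable` instead. The names `t`, `v`, `old_id`, `item_assignment_failed` and `rebound` are illustrative only and do not appear elsewhere in this notebook.

```python
import tensorflow as tf

t = tf.constant([1.0, 2.0, 3.0])

# A tf.Tensor does not support item assignment, so this raises TypeError.
try:
    t[0] = 10.0
    item_assignment_failed = False
except TypeError:
    item_assignment_failed = True

# "Changing" t really builds a new tensor and rebinds the Python name.
old_id = id(t)
t = t + 1.0
rebound = id(t) != old_id

# A tf.Variable, by contrast, can be given a new value without rebinding.
v = tf.Variable([1.0, 2.0, 3.0])
v.assign([10.0, 2.0, 3.0])

print(item_assignment_failed, rebound, v.numpy())
```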

In [4]:
import tensorflow as tf
from IPython.display import display

## Saving Memory

Unless we use specific methods, storing the result of an operation in a variable usually allocates new memory and points the variable at the new location instead of changing the value stored at the existing location.

In the example below, the location of `z` does not change when we use `.assign()`, but it does change if we directly assign the sum of `x` and `y` to `z`.

In [3]:
x = tf.ones((2, 2))
y = tf.ones((2, 1))

In [5]:
display(x)
display(y)

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[1., 1.],
       [1., 1.]], dtype=float32)>

<tf.Tensor: shape=(2, 1), dtype=float32, numpy=
array([[1.],
       [1.]], dtype=float32)>

In [7]:
z = tf.Variable(tf.zeros_like(x))
initial_id = id(z)

In [8]:
z.assign(x + y)

<tf.Variable 'UnreadVariable' shape=(2, 2) dtype=float32, numpy=
array([[2., 2.],
       [2., 2.]], dtype=float32)>

In [10]:
initial_id == id(z)

True

In [11]:
z = x + y
display(z)
initial_id == id(z)

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[2., 2.],
       [2., 2.]], dtype=float32)>

False

In the first case (`.assign()`), the id was preserved. In the second case (`z = x + y`), `z` was rebound to a new object at a new location.
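The same idea applies to accumulating updates. The sketch below uses `tf.Variable.assign_add`, which updates the variable rather than rebinding the Python name. The variable name `acc` and the helper names `before` and `same_object` are illustrative assumptions, not part of the cells above.

```python
import tensorflow as tf

x = tf.ones((2, 2))
y = tf.ones((2, 1))

acc = tf.Variable(tf.zeros_like(x))
before = id(acc)

# x + y broadcasts y across the columns of x, giving a (2, 2) tensor of 2s.
# assign_add adds that result to the variable's current value.
acc.assign_add(x + y)
acc.assign_add(x + y)

same_object = id(acc) == before
print(same_object)
print(acc.numpy())
```

Because `acc` is never rebound, `id(acc)` stays the same across both updates.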

## TensorFlow Decorators


TensorFlow provides the `tf.function` decorator to wrap computation inside of a TensorFlow graph that gets compiled and optimized before running. This allows TensorFlow to prune unused values, and to re-use prior allocations that are no longer needed. This minimizes the memory overhead of TensorFlow computations.

In [12]:
@tf.function
def some_op(a, b, c):
    c = a + b
    return c

In [14]:
initial_id = id(z)
z = some_op(x, y, z)
initial_id == id(z)

False
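One way to observe that a `tf.function` is traced into a graph is to put a Python side effect inside it: such side effects run while the function is being traced, not on every call. The following is a small sketch under that assumption. The function `add_op` and the list `trace_log` are hypothetical names used only for illustration.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: records each time the Python body runs

@tf.function
def add_op(a, b):
    trace_log.append("traced")  # Python side effect: runs only while tracing
    return a + b

x = tf.ones((2, 2))
y = tf.ones((2, 1))

first = add_op(x, y)
second = add_op(x, y)  # same shapes and dtypes: the traced graph is reused

print(len(trace_log))
print(first.numpy())
```

Each call still returns a new `tf.Tensor` object, which is consistent with the `False` shown above: the decorator changes how the computation is executed, not the identity of the Python object that the result is bound to.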

## Creating a matrix

In [16]:
a = tf.reshape(tf.range(12), (3, 4))
display(a)

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]], dtype=int32)>

## Operations on tensors

### Reduction

In [23]:
display(tf.reduce_sum(a, axis=1))

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([ 6, 22, 38], dtype=int32)>

In [25]:
display(tf.reduce_sum(a, axis=1, keepdims=True))

<tf.Tensor: shape=(3, 1), dtype=int32, numpy=
array([[ 6],
       [22],
       [38]], dtype=int32)>
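A common reason to pass `keepdims=True` is that the reduced result can then be broadcast against the original tensor. The sketch below normalizes each row by its sum. It assumes a `float32` version of `a` (the matrix above is `int32`) so that the division stays in floating point, and the names `row_sums`, `col_sums` and `normalized` are illustrative.

```python
import tensorflow as tf

# Same values as the matrix above, but float32 (an assumption for this sketch).
a = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

row_sums = tf.reduce_sum(a, axis=1, keepdims=True)  # shape (3, 1)
col_sums = tf.reduce_sum(a, axis=0)                 # shape (4,)

# (3, 4) / (3, 1) broadcasts across columns, so every row is divided by its own sum.
normalized = a / row_sums

print(col_sums.numpy())
print(tf.reduce_sum(normalized, axis=1).numpy())
```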

### Dot product

In [31]:
x = tf.ones((3,4))
y = tf.ones((4,5))
z = tf.tensordot(x, y, axes=1)
display(z)

<tf.Tensor: shape=(3, 5), dtype=float32, numpy=
array([[4., 4., 4., 4., 4.],
       [4., 4., 4., 4., 4.],
       [4., 4., 4., 4., 4.]], dtype=float32)>

In [39]:
x = tf.ones((3,4))
y = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
z = tf.tensordot(x, y, axes=0)
display(z)

<tf.Tensor: shape=(3, 4, 3, 4), dtype=float32, numpy=
array([[[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]],


       [[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]],


       [[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
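         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],

        [[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]]], dtype=float32)>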

### Matrix Multiplication

In [41]:
x = tf.reshape(tf.range(10, dtype=tf.float32), (2, 5))
y = tf.reshape(tf.range(15, dtype=tf.float32), (5, 3))
z = tf.matmul(x, y)
display(z)

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[ 90., 100., 110.],
       [240., 275., 310.]], dtype=float32)>
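For two matrices, `tf.tensordot` with `axes=1` contracts the last axis of the first argument with the first axis of the second, which gives the same result as `tf.matmul`. With `axes=0` nothing is contracted and the output shape is the two input shapes concatenated. A short sketch comparing the three (the names `via_matmul`, `via_tensordot`, `outer` and `close` are illustrative):

```python
import tensorflow as tf

x = tf.reshape(tf.range(10, dtype=tf.float32), (2, 5))
y = tf.reshape(tf.range(15, dtype=tf.float32), (5, 3))

via_matmul = tf.matmul(x, y)
via_tensordot = tf.tensordot(x, y, axes=1)  # contracts x's last axis with y's first
outer = tf.tensordot(x, y, axes=0)          # no contraction: shapes are concatenated

# Compare with a small tolerance rather than exact float equality.
close = bool(tf.reduce_all(tf.abs(via_matmul - via_tensordot) < 1e-5))

print(close)
print(outer.shape)
```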

### Norm

#### L2

In [43]:
z = tf.norm(x)
display(z)

<tf.Tensor: shape=(), dtype=float32, numpy=16.881943>

#### L1

In [44]:
tf.reduce_sum(tf.abs(x))

<tf.Tensor: shape=(), dtype=float32, numpy=45.0>
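To make the relationship between the two norms explicit, the sketch below computes each one both with `tf.norm` and by hand. With `axis=None`, `tf.norm` treats the tensor as a flat vector, so `ord=1` gives the sum of absolute values and the default `ord='euclidean'` gives the square root of the sum of squares. The names `l1`, `l2`, `l1_manual` and `l2_manual` are illustrative.

```python
import tensorflow as tf

x = tf.reshape(tf.range(10, dtype=tf.float32), (2, 5))

l2 = tf.norm(x)                                   # default ord='euclidean'
l2_manual = tf.sqrt(tf.reduce_sum(tf.square(x)))  # sqrt(0^2 + 1^2 + ... + 9^2)

l1 = tf.norm(x, ord=1)                            # x is treated as a flat vector
l1_manual = tf.reduce_sum(tf.abs(x))              # 0 + 1 + ... + 9

print(l2.numpy(), l2_manual.numpy())
print(l1.numpy(), l1_manual.numpy())
```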

## Automatic Differentiation

In [45]:
x = tf.range(4, dtype=tf.float32)

To find the gradient of a function $y = f(x)$, we need to:
- convert `x` into a variable
- record the function calculation under the `tf.GradientTape()` context
- find the gradient of `y` with respect to `x`

In [46]:
x = tf.Variable(x)

In [52]:
with tf.GradientTape() as gt:
    y = 2 * tf.tensordot(x, x, axes=1)

In [53]:
x_grad = gt.gradient(y, x)
display(x_grad)

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 0.,  4.,  8., 12.], dtype=float32)>
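The result can be checked by hand: since $y = 2\,x^\top x = 2\sum_i x_i^2$, the gradient is $\nabla_x y = 4x$. The sketch below repeats the computation in one self-contained cell, compares it with `4 * x`, and then records a second function, `tf.reduce_sum(x)`, on a fresh tape. The names `tape`, `matches`, `s` and `ones_grad` are illustrative.

```python
import tensorflow as tf

x = tf.Variable(tf.range(4, dtype=tf.float32))

with tf.GradientTape() as tape:
    y = 2 * tf.tensordot(x, x, axes=1)   # y = 2 * sum(x_i^2)

x_grad = tape.gradient(y, x)
matches = bool(tf.reduce_all(x_grad == 4 * x))

# A second function on a fresh tape: the gradient of sum(x) is all ones.
with tf.GradientTape() as tape:
    s = tf.reduce_sum(x)
ones_grad = tape.gradient(s, x)

print(matches)
print(ones_grad.numpy())
```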

### Detaching gradient

Sometimes we do not want gradients to flow through certain intermediate variables and want to treat them as constants instead. In that case we apply `tf.stop_gradient()` to those variables.

In [59]:
with tf.GradientTape(persistent=True) as gt:
    y = x * x
    u = tf.stop_gradient(y)
    z = u * x

In [63]:
d_z = gt.gradient(z, x)
display(d_z)
d_z == u

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0., 1., 4., 9.], dtype=float32)>

<tf.Tensor: shape=(4,), dtype=bool, numpy=array([ True,  True,  True,  True])>
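To see what `tf.stop_gradient` changes, it helps to compute the gradient with and without it. In the sketch below, `z_detached = u * x` treats `u` as a constant, so its gradient is $u = x^2$, whereas `z_full = y * x = x^3` has gradient $3x^2$. The names `z_detached`, `z_full`, `grad_detached` and `grad_full` are illustrative and not part of the cells above.

```python
import tensorflow as tf

x = tf.Variable(tf.range(4, dtype=tf.float32))

# persistent=True lets us call tape.gradient more than once.
with tf.GradientTape(persistent=True) as tape:
    y = x * x
    u = tf.stop_gradient(y)   # same values as y, but treated as a constant
    z_detached = u * x        # gradient w.r.t. x: u = x^2
    z_full = y * x            # gradient w.r.t. x: 3 * x^2

grad_detached = tape.gradient(z_detached, x)
grad_full = tape.gradient(z_full, x)

print(grad_detached.numpy())
print(grad_full.numpy())
```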