
In [0]:
import tensorflow as tf

# Lazy Evaluation

In [0]:
a = tf.constant([5, 6, 2])
b = tf.constant([2, 1, 2])
c = tf.add(a, b)
print(c)

Tensor("Add:0", shape=(3,), dtype=int32)


Here "c" is a 1D tensor, but it does not yet hold the sum of "a" and "b". At this point nothing has been evaluated. This is lazy evaluation: no computation happens until we explicitly ask for it. To evaluate "c", we do this:

In [0]:
with tf.Session() as sess:
    print(sess.run(c))

[7 7 4]


Now we can see the result. We created an instance of Session and then evaluated "c" using its run method.

# Graph and Session

The Directed Acyclic Graph, the DAG in TensorFlow, is like any graph. It consists of edges and nodes. The edges represent data, they represent tensors, which as we now know, are n-dimensional arrays. The nodes represent TensorFlow operations on those tensors. 

A TensorFlow DAG consists of tensors and operations on those tensors. 

Whatever calculation happens in TensorFlow is represented as a Directed Acyclic Graph (DAG), and the results are calculated only after we explicitly tell TensorFlow to run it.
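To make the build-then-run idea concrete, here is a minimal pure-Python sketch of a dataflow graph. It is a toy illustration only and says nothing about how TensorFlow is implemented; the `Node` class and the `constant`, `add`, and `run` helpers are hypothetical names made up for this example.

```python
# Toy illustration only: this is NOT how TensorFlow is implemented.
class Node:
    """A node in a tiny dataflow graph: an operation plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op            # "const" or "add"
        self.inputs = inputs    # edges: the nodes whose outputs feed this op
        self.value = value      # only used by "const" nodes

    def __repr__(self):
        return f"Node(op={self.op!r})"


def constant(values):
    return Node("const", value=list(values))


def add(a, b):
    # Building the node does no arithmetic; it only records the dependency.
    return Node("add", inputs=(a, b))


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.op == "const":
        return node.value
    left, right = (run(n) for n in node.inputs)
    return [l + r for l, r in zip(left, right)]


a = constant([5, 6, 2])
b = constant([2, 1, 2])
c = add(a, b)

print(c)        # Node(op='add'): a description of work, not a result
print(run(c))   # [7, 7, 4]
```

Printing `c` shows only a description of the pending operation, which mirrors the `Tensor("Add:0", ...)` output seen earlier; the numbers appear only once `run` walks the graph.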

**Why does TensorFlow do lazy evaluation?**

It's because lazy evaluation allows for a lot of flexibility and optimization when you're running the graph. 

TensorFlow can now process the graph, compile it, and insert **send** and **receive** nodes in the middle of the DAG so that it can be remotely executed. 

TensorFlow can assign different parts of the DAG to different devices, depending on whether a part is I/O bound or whether it's going to require GPU capabilities. 

While the graph is being processed, TensorFlow can add quantization or change data types, it can add debug nodes, and it can create summaries that write values out so that TensorBoard can read them. Besides computations like add and matmul, constants and variables are ops too, and TensorFlow can work with all of them. 

When the graph is being compiled, TensorFlow can take two ops and fuse them to improve performance. For example, you may have two consecutive add nodes, and TensorFlow can fuse them into a single one.
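As a rough sketch of what fusing two consecutive adds could look like, the snippet below uses a hypothetical mini-representation in which a program is a list of `(op, constant)` pairs applied in order. This is an assumption made purely for illustration; TensorFlow's real graph optimizations work on a much richer graph format.

```python
# Hypothetical mini-IR: each op is a (name, constant) pair applied in order.
def fuse_adds(ops):
    """Merge runs of consecutive ("add", k) ops into a single add."""
    fused = []
    for name, k in ops:
        if name == "add" and fused and fused[-1][0] == "add":
            # (x + a) + b == x + (a + b), so two adds collapse into one.
            fused[-1] = ("add", fused[-1][1] + k)
        else:
            fused.append((name, k))
    return fused


def apply_ops(x, ops):
    """Run the op list on a starting value."""
    for name, k in ops:
        x = x + k if name == "add" else x * k
    return x


program = [("add", 2), ("add", 3), ("mul", 4), ("add", 1)]
optimized = fuse_adds(program)

print(optimized)                  # [('add', 5), ('mul', 4), ('add', 1)]
print(apply_ops(10, program))     # 61
print(apply_ops(10, optimized))   # 61
```

The optimized program has one op fewer but produces the same result, which is only possible because the whole computation is known before anything runs.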

TensorFlow's XLA compiler can use the information in the Directed Acyclic Graph to generate faster code. So, that's one aspect of why you want to use a DAG for optimization. 

But the most exciting part is that the DAG can be remotely executed and assigned to devices. And that's where the benefits of the DAG approach become very evident. 

By using explicit edges to represent dependencies between operations, it's easy for the system to identify operations that can execute in parallel. And by using explicit edges to represent the values that flow between operations, it's possible for TensorFlow to partition your program across multiple devices (CPUs, GPUs, TPUs, etc.) that may even be attached to different machines. 
TensorFlow inserts the necessary communication and coordination between these devices.
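The following sketch shows how explicit edges make parallelism easy to spot. It assumes a hypothetical graph described as a plain dictionary that maps each op name to the ops it consumes, and it groups ops into levels; ops in the same level do not depend on each other, so in principle they could run at the same time. The function name `parallel_levels` is made up for this example, and the input is assumed to be acyclic.

```python
def parallel_levels(deps):
    """Group ops into levels; ops in the same level have no edges between them.

    deps maps each op name to the list of ops whose outputs it consumes.
    The graph is assumed to be acyclic.
    """
    level = {}

    def depth(op):
        # An op sits one level below its deepest input; inputs-free ops are level 0.
        if op not in level:
            level[op] = 1 + max((depth(d) for d in deps[op]), default=-1)
        return level[op]

    for op in deps:
        depth(op)

    grouped = {}
    for op, lvl in level.items():
        grouped.setdefault(lvl, []).append(op)
    return [sorted(grouped[lvl]) for lvl in sorted(grouped)]


# Hypothetical graph: z1 = x + y, z2 = x * y, z3 = z2 - z1
deps = {"x": [], "y": [], "z1": ["x", "y"], "z2": ["x", "y"], "z3": ["z2", "z1"]}
print(parallel_levels(deps))  # [['x', 'y'], ['z1', 'z2'], ['z3']]
```

Here `z1` and `z2` land in the same level, so a runtime would be free to place them on different devices and only synchronize before computing `z3`.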

Several parts of the graph can be on different devices; it doesn't matter whether they are GPUs or different computers. So, one key benefit of this model, the ability to distribute computation across many machines and many types of machines, comes from the DAG. We just write Python code and let the TensorFlow execution system optimize and distribute the graph.

**Session Class**

The session class represents the connection between the Python program that we write, and the C++ runtime. 

The session object provides access to the devices on the local machine, and to remote devices using the distributed TensorFlow runtime. It also caches information about the graph, so the same computation can be run multiple times. 

As we saw, we execute TensorFlow graphs by calling **run** on a **tf.Session**, and when we do that, we specify a tensor that we want to evaluate.

So, in the code example above, we defined two data tensors **a** and **b**. They're constants, and they are 1D tensors. The tensor **c** is the result of invoking **tf.add** on **a** and **b**. When we want to evaluate it, we call **sess.run** on **c**. Here **sess** is an instance of **tf.Session**, and the **with** statement in Python is how we ensure that the session is automatically closed when we are done.
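For readers less familiar with the **with** statement, the sketch below shows the mechanism that closes the session automatically. `ToySession` is a hypothetical stand-in written for this example; it is not the TensorFlow class, and it only demonstrates Python's context-manager protocol.

```python
class ToySession:
    """Hypothetical stand-in for a session that holds a resource."""

    def __init__(self):
        self.closed = False

    def run(self, fn):
        if self.closed:
            raise RuntimeError("session is closed")
        return fn()

    def close(self):
        self.closed = True

    def __enter__(self):
        # The value returned here is what `as sess` binds to.
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called when the with-block ends, even if it raised an exception.
        self.close()
        return False   # do not swallow exceptions


with ToySession() as sess:
    result = sess.run(lambda: [5 + 2, 6 + 1, 2 + 2])

print(result)        # [7, 7, 4]
print(sess.closed)   # True
```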

# Evaluating a Tensor

You can call **sess.run(z)** or you can call **z.eval()** to evaluate **z** in the context of the default session. 

**z.eval()** is just a shortcut, and you will often see it in code. It is the same as calling **run** on the default session. 

In [0]:
x = tf.constant([1, 2, 3])
y = tf.constant([5, 6, 3])
z = tf.add(x, y)

with tf.Session() as sess:
    print(z.eval())

[6 8 6]


While you can call **sess.run** and pass in a single tensor, you can also pass in a list of tensors to evaluate. 

TensorFlow will figure out which parts of the graph it needs to evaluate and carry out the evaluation. 

For each input tensor, there is a corresponding NumPy array in the output. 

Since we pass in **z1** and **z3**, we get back two NumPy arrays that I'm calling **a1** and **a3**. 

Notice that this code also shows that you don't need to write out **tf.add(x, y)**. You can simply say **x + y**, because the common arithmetic operators are overloaded. 


In [0]:
x = tf.constant([1, 2, 3])
y = tf.constant([5, 6, 3])
z1 = tf.add(x, y)
z2 = x + y
z3 = z2 - z1

with tf.Session() as sess:
    
    # For each input tensor, there is a corresponding NumPy array in the output.
    a1, a3 = sess.run([z1, z3])
    
    print(a1)
    print(a3)

[6 8 6]
[0 0 0]
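One way to picture how a runtime can work out which parts of the graph a fetch needs is to walk backwards from the requested tensors along their input edges. The sketch below does that on a hypothetical graph written as a dictionary of dependencies; `needed_ops` is a made-up helper, not a TensorFlow function.

```python
def needed_ops(deps, fetches):
    """Return every op that must run to produce the requested fetches."""
    needed = set()
    stack = list(fetches)
    while stack:
        op = stack.pop()
        if op not in needed:
            needed.add(op)
            stack.extend(deps[op])   # follow edges back to this op's inputs
    return needed


# Hypothetical graph with the same structure as the cell above:
# z1 and z2 both consume x and y, and z3 consumes z2 and z1.
deps = {"x": [], "y": [], "z1": ["x", "y"], "z2": ["x", "y"], "z3": ["z2", "z1"]}

print(sorted(needed_ops(deps, ["z1"])))         # ['x', 'y', 'z1']
print(sorted(needed_ops(deps, ["z1", "z3"])))   # ['x', 'y', 'z1', 'z2', 'z3']
```

Fetching only `z1` never touches `z2` or `z3`, while fetching `z1` and `z3` together pulls in the whole graph.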


I briefly mentioned **tf.eager** earlier. Commonly, TensorFlow programs use lazy evaluation, and this is what I recommend when you're writing production code. 

However, when you're developing or debugging, it can sometimes be convenient to have the code executed immediately instead of lazily. So here, I'm showing how to use **tf.eager**. 

You import **tfe** and enable eager execution. But make sure to do this only once. Typically you do it at the start of your code. 

So here, I'm creating two tensors **x** and **y**, and printing out **x** minus **y**. 

If we were not in eager mode, what would get printed out? Just the debug output of the tensor. This would include a system-assigned unique name for the node in the DAG, plus the shape and the data type of the value that will show up when the DAG is run. 

But because we are in eager mode, we don't have to wait for **session.run** to get the actual result of the subtraction. That's why, when I do **x** minus **y**, you see the list 2, 3, 4.

In [0]:
from tensorflow.contrib.eager.python import tfe

In [0]:
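# Don't run this cell in this notebook: eager execution has to be enabled at
# program startup, before any graph ops have been created. Here it raises
# ValueError: tf.enable_eager_execution must be called at program startup.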
tfe.enable_eager_execution()

In [0]:
x = tf.constant([3, 5, 7])
y = tf.constant([1, 2, 3])
print(x - y)

# Visualizing a Graph

In [0]:
x = tf.constant([3, 5, 7], name="x")
y = tf.constant([1, 2, 3], name="y")

In [0]:
z1 = tf.add(x, y, name="z1")
z2 = x * y
z3 = z2 - z1

In [0]:
with tf.Session() as sess:
    with tf.summary.FileWriter('summaries', sess.graph) as writer:
        a1, a3 = sess.run([z1, z3])

In [0]:
! ls summaries

events.out.tfevents.1582236612.580400288a05


In [0]:
%%capture
! pip install datalab

In [0]:
from google.datalab.ml import TensorBoard

In [0]:
TensorBoard().start('./summaries')

664

# Variables and Tensors


In [0]:
# Shape of Tensors
x = tf.constant(3)
print("0D ", x)

x = tf.constant([1,2,3])
print("1D ", x)

x = tf.constant([[1,2,3], [2,3,4]])
print("2D ", x)


0D  Tensor("Const_7:0", shape=(), dtype=int32)
1D  Tensor("Const_8:0", shape=(3,), dtype=int32)
2D  Tensor("Const_9:0", shape=(2, 3), dtype=int32)


We can create higher-dimensional tensors as above, or we can use **tf.stack**.

In [0]:
x1 = tf.constant([1,2,3])

x2 = tf.stack([x1, x1])
print("Two 1D Tensors of shape (3,) stacked one above other results in 2D Tensor of shape (2,3): ", x2)

x3 = tf.stack([x2, x2, x2, x2])
print("Four 2D Tensors of shape(2,3) stacked one above other results in 3D Tensor of shape (4, 2, 3): ", x3)

x4 = tf.stack([x3, x3])
print("Two 3D Tensors of shape (4, 2, 3) stacked one above other results in 4D Tensor of shape (2, 4, 2, 3)", x4)

Two 1D Tensors of shape (3,) stacked one above other results in 2D Tensor of shape (2,3):  Tensor("stack_3:0", shape=(2, 3), dtype=int32)
Four 2D Tensors of shape(2,3) stacked one above other results in 3D Tensor of shape (4, 2, 3):  Tensor("stack_4:0", shape=(4, 2, 3), dtype=int32)
Two 3D Tensors of shape (4, 2, 3) stacked one above other results in 4D Tensor of shape (2, 4, 2, 3) Tensor("stack_5:0", shape=(2, 4, 2, 3), dtype=int32)
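The shape rule for stacking can be checked without TensorFlow. The sketch below uses nested Python lists as stand-ins for tensors; `shape` and `stack` are hypothetical helpers written for this example, and `stack` mimics only the default behaviour of stacking along a new leading axis.

```python
def shape(t):
    """Shape of a rectangular nested list, e.g. [[1, 2, 3], [4, 5, 6]] -> (2, 3)."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


def stack(tensors):
    """Stack along a new leading axis: k inputs of shape S give shape (k, *S)."""
    return list(tensors)


x1 = [1, 2, 3]
x2 = stack([x1, x1])
x3 = stack([x2, x2, x2, x2])
x4 = stack([x3, x3])

print(shape(x1))  # (3,)
print(shape(x2))  # (2, 3)
print(shape(x3))  # (4, 2, 3)
print(shape(x4))  # (2, 4, 2, 3)
```

Each stack adds exactly one new leading dimension whose size is the number of inputs, so getting a 4D result requires stacking 3D inputs.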


**Slicing on Tensors**

Slicing tensors works much like Python slicing.

In [0]:
x = tf.constant([[3, 5, 7],
                 [4, 8, 7]])
y = x[:, 1]

In [0]:
with tf.Session() as sess:
    print(y.eval())

[5 8]
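For comparison, here is what the same selection looks like on plain nested Python lists, which do not support the comma syntax. This is only an analogy to show which elements `x[:, 1]` picks out.

```python
x = [[3, 5, 7],
     [4, 8, 7]]

# Equivalent of x[:, 1]: take element 1 from every row.
column_1 = [row[1] for row in x]

# Equivalent of x[0, 1:]: take row 0, then everything from index 1 onward.
first_row_tail = x[0][1:]

print(column_1)        # [5, 8]
print(first_row_tail)  # [5, 7]
```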


**Reshaping on Tensors**

In [0]:
x = tf.constant([[3, 5, 7],
                 [4, 8, 7]])
y = tf.reshape(x, [3, 2])

with tf.Session() as sess:
    print(y.eval())

[[3 5]
 [7 4]
 [8 7]]


In the reshape above, the desired shape is [3, 2], that is, 3 rows and 2 columns. The elements are read in row-major order, so the first two elements form the first row, the next two form the second row, and so on.
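The row-major rule can be sketched in a few lines of plain Python: flatten the values in reading order, then cut the flat list into rows of the new width. `flatten` and `reshape_2d` are hypothetical helpers for this illustration and only handle a 2D target shape.

```python
def flatten(t):
    """Read a nested list in row-major (reading) order."""
    if not isinstance(t, list):
        return [t]
    return [v for item in t for v in flatten(item)]


def reshape_2d(t, rows, cols):
    """Regroup the flattened values into `rows` rows of `cols` values each."""
    flat = flatten(t)
    if len(flat) != rows * cols:
        raise ValueError(f"cannot reshape {len(flat)} values into ({rows}, {cols})")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


x = [[3, 5, 7],
     [4, 8, 7]]

print(reshape_2d(x, 3, 2))  # [[3, 5], [7, 4], [8, 7]]
```

The total number of elements must stay the same, which is why a (2, 3) tensor can become (3, 2) or (6, 1) but not (4, 2).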

**Variables**

A variable is a tensor whose value is initialized and then changes as the program runs.

The code below shows how this works.

In [0]:
def forward_pass(w, x):
    return tf.matmul(w, x)


def train_loop(x, niter=5):
    with tf.variable_scope("model", reuse=tf.AUTO_REUSE):
        w = tf.get_variable("weights",
                            shape=(1,2),    # 1 * 2 Matrix
                            initializer = tf.truncated_normal_initializer(),
                            trainable=True
                            )
    preds = []
    for k in range(niter):
        preds.append(forward_pass(w, x))
        w = w + 0.1     # Kind of gradient update
    return preds


with tf.Session() as sess:
    preds = train_loop(tf.constant([[3.2, 5.1, 7.2], [4.3, 6.2, 8.3]]))     # 2 * 3 Matrix
    tf.global_variables_initializer().run()
    for i in range(len(preds)):
        print(f"{i}: {preds[i].eval()}")

0: [[ -5.7171655  -8.716917  -12.032432 ]]
1: [[ -4.967165   -7.5869164 -10.482431 ]]
2: [[-4.2171655 -6.456917  -8.932431 ]]
3: [[-3.4671652 -5.3269167 -7.3824315]]
4: [[-2.7171652 -4.1969166 -5.8324313]]


Let's take a close look at this example. 

I have a function called **forward_pass**, which takes two parameters, **w** and **x**, and multiplies them. Well, it's a matrix multiply because these are **tensors**.

In my train loop function, I basically create the tensor **w**, except that **w** is not a constant like the tensors that we've been looking at so far. **w** is a variable. 

It has a **name**, **weights**. Its shape is **(1,2)**, which means that it has one row and two columns. It's a 1 by 2 matrix. Note that we are not initializing **w** here. Remember, TensorFlow is a lazy evaluation framework, so we are only building the graph. We're not yet running it. 

When **w** is initialized, it will be initialized by a **truncated normal initializer**. 

This is a very common initializer that you will see in TensorFlow neural network programs. It initializes a variable to random numbers, but these random numbers are not uniformly distributed. Instead, they follow a Gaussian (normal) distribution with zero mean and unit variance. But a Gaussian has a very long tail and you might get extreme outliers. It's very unlikely but it could happen. So, what a truncated normal does is truncate things at some multiple of sigma. In TensorFlow, values more than two standard deviations from the mean are discarded and redrawn. 
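A minimal sketch of the truncation idea, using rejection sampling with Python's standard `random` module: draw from a normal distribution and redraw whenever the sample falls more than two standard deviations from the mean. The function name and its parameters are assumptions for this example, and the code illustrates the concept rather than TensorFlow's own sampling routine.

```python
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw from N(mean, stddev^2), redrawing anything beyond 2 stddev."""
    while True:
        sample = rng.gauss(mean, stddev)
        if abs(sample - mean) <= 2.0 * stddev:
            return sample


rng = random.Random(0)   # seeded only so the run is repeatable
weights = [truncated_normal(rng=rng) for _ in range(1000)]

# Every sample stays inside [-2, 2], so no extreme outliers can appear.
print(min(weights) >= -2.0 and max(weights) <= 2.0)  # True
```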

Finally, we say that the variable **w** is **trainable**. A trainable variable is a variable that can be changed during training. The point of a variable, of course, is to be able to change it, so most variables will be trainable. But every once in a while, and we'll talk about this when we talk about model size reduction and when we talk about transfer learning, it can be helpful to freeze a graph so that the variables are not changed. This Boolean flag lets us do that. 

Notice that I'm calling **tf.get_variable** to create **w**. Now, you might see TensorFlow code that directly creates a variable by calling the **tf.Variable** constructor. **Calling the constructor directly is not recommended**. 

**Use tf.get_variable** because, as we'll see in course 9, it can be helpful to be able to reuse variables or create them afresh depending on the situation, and using **tf.get_variable** lets us do so. So, I recommend that you get into the habit of using **tf.get_variable**. 

So, we then run the forward pass five times and store the result of the matrix multiply at each iteration. After we do the product, we change the weight. Here we are adding **0.1** to it. This is like a gradient update. In reality, of course, in a gradient update, we would choose what weights to change and how to change them. But here, just for demo purposes, I'll add 0.1 to the weights each time. 

Now, from the session, we call train loop by passing in **x**. The **x is a 2 by 3 matrix**. So in the forward pass, we multiply **w** by this **x**. **w** is a 1 by 2 matrix. Multiplying a 1 by 2 matrix by a 2 by 3 matrix gives us a 1 by 3 matrix. 

So, at this point, the graph is done but we still need to initialize the variables. That's the run stage. We typically just initialize all the variables in the graph at once by running the **global variables initializer**. 

So, when we now look at the value of the product after each step of the loop, we notice that the 1 by 3 matrix each time is different as you would expect. 
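The shapes and the step-to-step change can be checked by hand with a small pure-Python matrix multiply. The weights below are hypothetical fixed values chosen for this sketch, since the notebook's weights are random; `matmul` is a helper written for the example.

```python
def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix given as nested lists."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


x = [[3.2, 5.1, 7.2],
     [4.3, 6.2, 8.3]]            # 2 x 3, same values as in the notebook
w = [[1.0, -2.0]]                 # 1 x 2, hypothetical weights

before = matmul(w, x)
w_next = [[v + 0.1 for v in row] for row in w]   # the "+ 0.1" update
after = matmul(w_next, x)

print(len(before), len(before[0]))   # 1 3
deltas = [round(a - b, 2) for a, b in zip(after[0], before[0])]
print(deltas)                        # [0.75, 1.13, 1.55]
```

Adding 0.1 to both weights raises each output by 0.1 times the corresponding column sum of **x** (7.5, 11.3 and 15.5), which matches the constant step between consecutive rows in the output above, whatever the initial random weights were.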

**So, let's summarize what we have just learned.** 

**1.** Create a variable by calling **tf.get_variable**. 

Well, I skipped over one line of code when I went through it, the **scope piece**, **tf.variable_scope**. When you create a variable, you can specify the scope. That's where I'm telling TensorFlow to reuse the variable each time instead of creating a new variable each time. I'm calling train loop only once so it doesn't matter here, but if I were to call train loop again, we would not create a new variable. We would reuse the existing one. (Note that **w = w + 0.1** builds new tensors in the graph rather than assigning to the variable, so the stored value of **weights** itself is not modified by the loop.) 

**2.** When you create a variable, you have to decide how to initialize it. In neural network training, random normal with truncation is a typical choice. 

**3.** Use the variable just like any other tensor when building the graph.

**4.** In your session, remember to initialize the variable. Usually, you will initialize all the variables together by calling the **global variables initializer**. 

**5.** After the variables are initialized, you can evaluate any tensor that you want to evaluate. 


**Placeholders**

In this example, we are calling the train loop with the **x**, but the **x** is a **constant**. How realistic is that? Do you hardcode input values into your programs? **NO**

**Placeholders allow you to feed values into the graph.** 

For example, you can read values from a text file into a Python list and then feed that list into the TensorFlow graph. 

In [0]:
a = tf.placeholder("float", None)
b = a * 4
print(a)
with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: [1,2,3]}))

Tensor("Placeholder:0", dtype=float32)
[ 4.  8. 12.]


So, here, **a** is a **placeholder**. Because its shape is given as **None**, it can be fed a value of any shape, from a scalar to a list. **b** is **a** multiplied by **4**. If you print **a**, you will get the debug output of a tensor. You will learn that this particular tensor is a placeholder that expects floating point numbers to be fed into it. 

If you now want to evaluate **b**, you can't just call **session.run(b)**. You have to feed in values for the placeholders that **b** depends upon. So in this case, you have to pass in a list or a NumPy array of numbers for the placeholder **a**, and you do this using a **feed_dict**, a dictionary. The key is a placeholder, in this case, **a**. The value is a list or a NumPy array, and in this case it's **[1,2,3]**. So that's what we feed in, and when **b** is evaluated, you get the value of **a** multiplied by **4**, so we get **[4,8,12]**.
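The feeding mechanism can be imitated in plain Python to show why the value has to be supplied at run time. In this sketch, `Placeholder` and `times_four` are hypothetical names invented for the example: the placeholder is an empty named slot, and the computation is a deferred function that looks its input up in the dictionary it is given.

```python
class Placeholder:
    """Hypothetical stand-in: a named slot with no value of its own."""

    def __init__(self, name):
        self.name = name


def times_four(source):
    """Return a deferred computation; nothing is multiplied yet."""
    def evaluate(feed_dict):
        if source not in feed_dict:
            raise ValueError(f"no value fed for placeholder {source.name!r}")
        return [4.0 * v for v in feed_dict[source]]
    return evaluate


a = Placeholder("a")
b = times_four(a)

print(b({a: [1, 2, 3]}))   # [4.0, 8.0, 12.0]
```

Calling `b({})` raises an error because nothing was fed for `a`, which is the toy counterpart of evaluating **b** without a feed dictionary.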