<a href="https://colab.research.google.com/github/a-forty-two/DataSetsForML/blob/master/16_Eager_Evaluation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# TensorFlow (1.x) executes code through tf.Session objects, which run a Directed Acyclic Graph (DAG) in the backend.
# PyTorch, on the other hand, provided rapid execution WITHOUT statically forming a DAG.
# Instead, it took a DYNAMIC COMPUTATION GRAPH approach where each layer or activation function is
# just an actual FUNCTION CALL. Just like NumPy, most elements are treated as matrices,
# and layers can be formed with a simple matmul and add.

# TF hence provides an easier way to prototype: eager execution. It assumes the developer wants QUICK and DIRTY results,
# so instead of creating a proper DAG it simply performs the numerical calculations, which helps with QUICK PROTOTYPING.

# DAG execution (native to TF, graph is formed)
import tensorflow as tf
import numpy as np
x = tf.constant(np.array([[1],[2]]))
x
# output is the node on the GRAPH-> tf.Tensor 'Const:0' shape=(2,1) dtype=...
# Const:0 -> STATIC CONSTANT VALUE-> once created, this node cannot be edited!!!

<tf.Tensor 'Const:0' shape=(2, 1) dtype=int64>
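
The difference between building a graph and computing a value immediately can be sketched without TensorFlow at all. The following is only a toy illustration: Node, constant, add and run are hypothetical names invented for this example, not TensorFlow APIs, and run merely plays the role that a session plays in graph mode.

```python
# Toy illustration only: Node, constant, add and run are hypothetical names,
# not TensorFlow APIs.

class Node:
    """A graph node that knows how to compute a value but holds no value yet."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = inputs

    def __repr__(self):
        # Like a graph-mode tensor, the repr shows a name, not a value.
        return f"<Node '{self.name}'>"


def constant(value, name="Const"):
    return Node(name, lambda: value)


def add(a, b, name="add"):
    return Node(name, lambda x, y: x + y, inputs=(a, b))


def run(node):
    """Play the role of a session: evaluate the inputs first, then the node."""
    args = [run(i) for i in node.inputs]
    return node.fn(*args)


# Deferred ("graph") style: building the expression computes nothing.
x = constant(1)
y = constant(2)
z = add(x, y)
deferred_repr = repr(z)   # only a node name is visible here
deferred_value = run(z)   # the value appears only after an explicit run

# Immediate ("eager") style: the value exists as soon as the line executes.
eager_value = 1 + 2
```

In the deferred style, a mistake in how z is defined stays hidden until run is called; in the immediate style it shows up on the line that caused it.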

In [1]:
# eager execution (no graph formed)
# the above and this block CANNOT run together! so in order to run this block, restart the RUNTIME
import tensorflow as tf
# enable eager execution
tf.enable_eager_execution()
import numpy as np
x = tf.constant(np.array([[1],[2]]))
x
# Running this block without restarting the runtime will fail: EITHER you build a graph or you don't,
# you can't have both!
# Output: the actual value is displayed, because it no longer waits for a graph to be built and run in a tf.Session.
# id: this is no longer a static name. It is assigned relative to the current runtime, so it is just a dynamic field (like a pointer).
# Within one execution a tensor's id stays constant, but on the next execution it may change,
# depending on the order in which operations run.

<tf.Tensor: id=0, shape=(2, 1), dtype=int64, numpy=
array([[1],
       [2]])>

In [2]:
import tensorflow as tf
# enable eager execution
tf.enable_eager_execution()
import numpy as np
x = tf.constant(np.array([[[1.],[2.]],[[3.],[4.]]]))
x[0] 
# 3 -> 4
# 2 -> 2
# 1 -> 0, float-> 2 blocks, 1 block before decimal, 1 after

<tf.Tensor: id=4, shape=(2, 1), dtype=float64, numpy=
array([[1.],
       [2.]])>

In [6]:
# a lot more nodes were created in between hence id jumped so quickly
# in eager mode
(x[0])
(x[1])
# DYNAMIC MODE: OBSERVE that previously x[0] and x[1] had ids like 10, 11, 22. Now they have
# id's like 30, 40 and so on
# every data type has a different memory allocation
# memory consumed by a node = sizeof(element) X count(elements)
# if you had a 2 KG pack of sugar, and 10 such packs, total sugar = 2 KG X 10 packs = 20 KG!
# if I had to describe an address, I could basically write it as
# Node_NAME    Node_ID  STARTING_ADDRESS  SIZE   (addresses are usually hexadecimal; decimal is used here just as an example)
# X            1        100               int32 -> 32 bits -> 1 block    # (file analogy: blocks 100 to 149 -> 50 blocks for abc.txt)
# Y            2        101               int64 -> 2 blocks              # (file analogy: blocks 150 to 169 -> 20 blocks for bc.txt)
# Z            4        103               int64 -> 2 blocks
# next_var     6        105               int32 -> 1 block
# [a,b]        7        106               [int32,int32] -> int32 X 2 -> 1 block X 2 = 2 blocks
# nexter_var   9        108
# .....
# instead of variables, consider Nodes
# NOT REUSING ANY STATIC GRAPH, it is just creating variables dynamically one after the other
# hence id will keep on increasing 

<tf.Tensor: id=28, shape=(2, 1), dtype=float64, numpy=
array([[3.],
       [4.]])>
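
The size rule from the comments above (size = sizeof(element) X count) and the address table can be reproduced with a few lines of NumPy. This is a rough sketch: the 32-bit block size and the starting address of 100 are taken from the comment table, while blocks_needed, entries and layout are hypothetical names used only for this example. Real tensor memory is managed by the framework and will not follow this simplified layout.

```python
import numpy as np

BLOCK_BITS = 32  # assumption from the comment table: one block = 32 bits


def blocks_needed(dtype, count=1):
    """Blocks used = blocks per element x number of elements."""
    bits_per_element = np.dtype(dtype).itemsize * 8
    return (bits_per_element // BLOCK_BITS) * count


# (name, dtype, element count), mirroring the rows of the comment table.
entries = [
    ("X", "int32", 1),
    ("Y", "int64", 1),
    ("Z", "int64", 1),
    ("next_var", "int32", 1),
    ("[a,b]", "int32", 2),
]

address = 100  # starting address used in the comment table
layout = {}
for name, dtype, count in entries:
    size = blocks_needed(dtype, count)
    layout[name] = (address, size)  # (starting address, size in blocks)
    address += size

next_free = address  # where the following variable would start

# The sugar-pack rule on a real array: 10 elements x 8 bytes each.
total_bytes = np.zeros(10, dtype=np.float64).nbytes
```

Each entry starts where the previous one ended, which is why the starting addresses in the table advance by 1 or 2 depending on the data type.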

In [1]:
#increase dimensions
import tensorflow as tf
import numpy as np
x = tf.constant(np.array([[[1],[2]],[[3],[4]]]))
y = tf.constant(np.array([[[1],[3]],[[5],[4]]]))
z = x + y
z
# In graph mode, if I made a mistake in calculating z, there is no way to find it until the graph is actually run (e.g. when training and scoring the model).
# With dynamic (EAGER) evaluation, I can see my outputs on the go, with better chances of avoiding mistakes.
# GOOD FOR PROTOTYPING

# In eager mode the id is generated dynamically -> proof that no static graph is being created.
# ERRORS can be detected early.
# Here (graph mode), add:0 tells us that the graph maintains a static dictionary of operation names.

<tf.Tensor 'add:0' shape=(2, 2, 1) dtype=int64>

In [2]:
z
# observe that running z again did not increase the add:0 id of the node. 
# reason: this is from a STATIC graph-> node id is not going to change till you actually change the NN structure


<tf.Tensor 'add:0' shape=(2, 2, 1) dtype=int64>
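
The naming behaviour seen in graph mode (add:0 for the first addition, add_1:0 for a second one) can be imitated with a small name registry. This is a simplified sketch, not TensorFlow's actual implementation: ToyGraph, unique_name and add_op are hypothetical names. The ":0" suffix denotes the first output of an operation.

```python
from collections import defaultdict


class ToyGraph:
    """Hypothetical stand-in for a static graph's name bookkeeping."""

    def __init__(self):
        self._name_counts = defaultdict(int)

    def unique_name(self, base):
        # First use keeps the base name; later uses get a numeric suffix.
        count = self._name_counts[base]
        self._name_counts[base] += 1
        return base if count == 0 else f"{base}_{count}"

    def add_op(self, base):
        # ':0' marks the first output of the operation.
        return f"{self.unique_name(base)}:0"


graph = ToyGraph()
z = graph.add_op("add")        # first x + y creates the node 'add'
same_z = z                     # displaying z again creates nothing new
z_again = graph.add_op("add")  # a second x + y adds a second node
```

Displaying z again leaves the registry untouched, whereas executing x + y a second time registers a new operation with a new name.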

In [1]:
import tensorflow as tf
# enable eager execution
tf.enable_eager_execution()
import numpy as np
x = tf.constant(np.array([[[1],[2]],[[3],[4]]])) # constants are not nodes on the graph but inputs to the graph, so declare
# them with tf, not tfe
y = tf.constant(np.array([[[1],[3]],[[5],[4]]]))
z = x + y
z

<tf.Tensor: id=2, shape=(2, 2, 1), dtype=int64, numpy=
array([[[2],
        [5]],

       [[8],
        [8]]])>

In [2]:
z = x+y
z

<tf.Tensor 'add_1:0' shape=(2, 2, 1) dtype=int64>

In [6]:
z=x+y # add operation is the node- not X and Y! hence the size is increasing only by sizeof(+ operation)!
z

<tf.Tensor: id=4, shape=(2, 2, 1), dtype=int64, numpy=
array([[[2],
        [5]],

       [[8],
        [8]]])>

In [7]:
x[0] # x[0] has bulk size 
# id no longer linear

<tf.Tensor: id=8, shape=(2, 1), dtype=int64, numpy=
array([[1],
       [2]])>

In [8]:
z=x+y # add operation is the node- not X and Y! hence the size is increasing only by sizeof(+ operation)!
z

<tf.Tensor: id=9, shape=(2, 2, 1), dtype=int64, numpy=
array([[[2],
        [5]],

       [[8],
        [8]]])>
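
The way ids grow in eager mode can be pictured as a single counter that hands out a fresh number to every tensor created, including helper tensors. The sketch below is a toy model with hypothetical names (new_tensor, add, first_row). It assumes that a slice such as x[0] first creates three helper tensors (begin, end and strides), which would be consistent with the jump from id=4 to id=8 in the outputs above, but this is an assumption about internals rather than documented behaviour.

```python
import itertools

_next_id = itertools.count()  # hypothetical global id counter


def new_tensor(label):
    """Every tensor created, helper or result, consumes the next id."""
    return {"id": next(_next_id), "label": label}


def add(a, b):
    # An addition produces exactly one new tensor.
    return new_tensor("add")


def first_row(t):
    # Assumption: a slice first materialises begin, end and stride helpers.
    for helper in ("begin", "end", "strides"):
        new_tensor(helper)
    return new_tensor("slice")


x = new_tensor("x")
y = new_tensor("y")
z1 = add(x, y)      # one new id
z2 = add(x, y)      # recomputing creates another tensor with another id
row = first_row(x)  # three helper ids are consumed before the result
z3 = add(x, y)      # continues from wherever the counter is now
```

Nothing is reused between calls, so the counter only ever moves forward, and operations that create helper tensors make it jump by more than one.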

When to use eager execution?

1. When debugging your neural network: eager execution is highly compatible with most Python debugging tools (and a few others).

2. Immediate error logging

3. Observing micro-operations or dry-running complex structures such as lambdas, recursion, and loops

Does this mean there will be no learning?

Not really. Learning is the process of improving and making fewer mistakes.

For learning, all you need is gradient descent to update the weights after the loss has been calculated.

Eager execution has built-in automatic differentiation: operations are recorded as they execute (for example with tf.GradientTape), so the gradients needed for backpropagation can be computed without building a static graph.

For a very large network this may not be suitable, so break your code into small files so that you can switch between eager and lazy (graph) execution.
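
The learning loop described above (calculate the loss, then update the weights with gradient descent) can be sketched in plain Python for a single weight w and bias b. This is a minimal illustration under stated assumptions: the data, learning rate and step count are made up for the example, and the gradients are derived by hand here, whereas in TensorFlow they would come from automatic differentiation.

```python
# Made-up data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0       # start from arbitrary weights
learning_rate = 0.05  # assumed value, small enough for stable updates
losses = []

for step in range(1000):
    # Forward pass: predictions and mean squared error.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    loss = sum(e * e for e in errors) / len(xs)
    losses.append(loss)

    # Hand-derived gradients of the mean squared error.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2.0 * sum(errors) / len(xs)

    # Gradient descent update: move against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
```

Because every intermediate value (errors, loss, gradients) is an ordinary Python number, each one can be printed or inspected in a debugger at any step, which is the same convenience eager execution offers.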

In [4]:
import tensorflow as tf
import numpy as np
tf.enable_eager_execution()
tfe = tf.contrib.eager
w = tfe.Variable(tf.random_normal([1,1])) # NODES on graph are EAGER 
b = tfe.Variable(tf.random_normal([1]))
print(w)
print(b)
# variables are nodes on the graph-> hence they can be executed eagerly! hence tfe can be used on variables 

<tf.Variable 'Variable:0' shape=(1, 1) dtype=float32, numpy=array([[0.03777953]], dtype=float32)>
<tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([-0.29082152], dtype=float32)>


In [1]:
import tensorflow as tf
import numpy as np
w = tf.Variable(tf.random_normal([1,1]))
b = tf.Variable(tf.random_normal([1]))
w

<tf.Variable 'Variable:0' shape=(1, 1) dtype=float32_ref>

In [2]:
b

<tf.Variable 'Variable_1:0' shape=(1,) dtype=float32_ref>
