# <span style="color:#0b486b">  FIT3181: Deep Learning (2022)</span>
***
*CE/Lecturer (Clayton):*  **Dr Trung Le** | trunglm@monash.edu <br/>
*Lecturer (Malaysia):*  **Dr Lim Chern Hong** | lim.chernhong@monash.edu <br/>  <br/>
*Tutor:*  **Mr Thanh Nguyen** \[Thanh.Nguyen4@monash.edu \] |**Mr Tuan Nguyen**  \[tuan.ng@monash.edu \] |**Mr Anh Bui** \[tuananh.bui@monash.edu\] | **Dr Binh Nguyen** \[binh.nguyen1@monash.edu \] | **Mr Md Mohaimenuzzaman** \[md.mohaimen@monash.edu \] |**Mr James Tong** \[james.tong1@monash.edu \]
<br/> <br/>
Faculty of Information Technology, Monash University, Australia
***

# <span style="color:#0b486b">Tutorial 1: Deep Learning with TensorFlow</span> #
**The purpose of this tutorial is to demonstrate how to work with TensorFlow (TF), an open-source software library for developing deep neural network applications. In this tutorial, the following topics are covered:**

1. A quick tour of TensorFlow 1.x
2. How to visualize a computational graph and its node values using TensorBoard.
3. Advancements of TensorFlow 2.x in comparison to TensorFlow 1.x.

**Part I of this tutorial introduces TF 1.x. Although it is much more convenient to build and train deep learning models in TF 2.x, TF 1.x serves as a building block for TF 2.x, which can be viewed as a wrapper around TF 1.x. Therefore, understanding how TF 1.x operates is crucial and helps us dive deeper into TF 2.x. In addition, the concept of a computational graph with computational nodes (e.g., constants, variables, and placeholders) is central to both TF 1.x and 2.x, although it is encapsulated in TF 2.x. Furthermore, when you join the job market, you will find that much legacy code has been written in TF 1.x, so it is important to have an idea of how TF 1.x generally operates.**

**Part II introduces TF 2.x, a framework built on top of TF 1.x. It is much simpler to use and deploy than TF 1.x. Most of the advanced deep learning models later in this unit will be implemented using TF 2.x.**

**In addition, to support your study, the tutorial includes additional material showing how TF 1.x code can be ported to TF 2.x.**

**References and additional reading and resources**
- [Installing TensorFlow on Windows](https://www.tensorflow.org/install)
- [TensorFlow API documentation](https://www.tensorflow.org/api_docs)
- [Examples with TensorFlow](https://www.tensorflow.org/tutorials)

**Acknowledgement**: *Some materials used in this tutorial have been adapted from Chapter 3 of the book "Learning TensorFlow: A Guide to Building Deep Learning Systems" by Hope, Resheff and Lieder and Chapter 2 of "Deep Learning with TensorFlow 2 and Keras" by Antonio Gulli, Amita Kapoor and Sujit Pal.*

---

## <span style="color:#0b486b"> I. A quick tour of TensorFlow 1.x </span>  <span style="color:red">***** (highly important)</span>

### Table of Contents
I.1 Computational Graph <br>
I.2 Declare a Computational Graph  <br>
I.3 Fetches Values  <br>
I.4 More about Graph  <br>
I.5 Data Types and Cast  <br>
I.6 Get shape  <br>
I.7 Initialize Tensors <br>
I.8 Matrix Multiplication and Activation  <br>
I.9 Name <br>
I.10 Name Scope <br>
I.11 Variable <br>
I.12 Variable Scope and Reuse Variable <br>
I.13 Name Scope & Variable Scope <br>
I.14 Placeholder <br>
I.15 Visualization with TensorBoard <br>

### <span style="color:#0b486b"> I.1 Computational Graph </span> ###
A computational graph is a basic concept in TensorFlow that represents the functional dependencies between nodes. The following figure presents a typical computational graph in which each node is a tensor. <br/>

<img src="images/ComputationalGraph.png" width="500" align="center"/> 

The above computational graph presents some functional dependencies:

- $c$ is dependent on $a$, $b$.
- $d$ depends on $a$, $e$ depends on $c$.
- $f$ depends on $d$, $e$.
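
The dependencies listed above determine the order in which nodes can be evaluated: a node can only be computed after all of its inputs. The following is a minimal plain-Python sketch of that idea, not TensorFlow code. The dictionary `deps` and the helper `evaluation_order` are hypothetical names introduced for illustration, and treating $a$ and $b$ as input nodes without dependencies is an assumption.

```python
# Dependencies as listed above: node -> list of nodes it depends on.
# Assumption: a and b are input nodes with no dependencies.
deps = {
    "a": [],
    "b": [],
    "c": ["a", "b"],
    "d": ["a"],
    "e": ["c"],
    "f": ["d", "e"],
}

def evaluation_order(deps, target):
    """Return the nodes needed to compute `target`, dependencies first."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent in deps[node]:
            visit(parent)
        order.append(node)  # appended only after all of its inputs

    visit(target)
    return order

print(evaluation_order(deps, "f"))  # ['a', 'd', 'b', 'c', 'e', 'f']
print(evaluation_order(deps, "d"))  # ['a', 'd']
```

Note that querying $d$ only requires $a$: nodes that the target does not depend on never need to be evaluated.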


### <span style="color:#0b486b"> I.2 Declare a Computational Graph </span>
The following code snippet shows how to declare a computational graph in TensorFlow. Once we import TF, a specific `empty default graph` is formed. All the nodes we create are automatically associated with that default graph. It is worth noting that we just `declare a graph` and **nothing** has been executed yet.

Below we import TensorFlow 2.x. To ensure that it is compatible with TF 1.x code, we import the corresponding package `tensorflow.compat.v1` and disable the TF 2.x behavior.




In [2]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

Instructions for updating:
non-resource variables are not supported in the long term


In [6]:
a = tf.constant(5)
b = tf.constant(2)
c = tf.constant(3)
d = tf.multiply(a,b)
e = tf.add(b,c)
f = tf.subtract(d,e)


**<span style="color:red">Exercise 1</span>**: Draw the computational graph of the above computational process.

### <span style="color:#0b486b"> I.3 Fetches Values </span> ###
To query the nodes in a computational graph, we create a `session` and run a query in this session.

In [4]:
with tf.Session() as sess:
    fetches = [a,b,c,d,e,f] # list the nodes whose values we want to fetch
    outs = sess.run(fetches) 
    print("outs={}".format(outs))

outs=[5, 2, 3, 10, 5, 5]
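
To make the split between declaring a graph and running it more concrete, here is a toy plain-Python sketch of deferred evaluation. It is only an analogy: the `Node` class and the helper functions `constant`, `multiply`, `add`, `subtract` and `run` are hypothetical and are not how TensorFlow is implemented.

```python
class Node:
    """A toy graph node: stores an operation and its inputs, computes nothing yet."""
    def __init__(self, op, inputs=(), value=None):
        self.op, self.inputs, self.value = op, inputs, value

def constant(value):
    return Node("const", value=value)

def multiply(x, y):
    return Node("mul", (x, y))

def add(x, y):
    return Node("add", (x, y))

def subtract(x, y):
    return Node("sub", (x, y))

def run(fetches):
    """Evaluate the requested nodes, recursively evaluating their inputs first."""
    ops = {"mul": lambda x, y: x * y,
           "add": lambda x, y: x + y,
           "sub": lambda x, y: x - y}

    def evaluate(node):
        if node.op == "const":
            return node.value
        return ops[node.op](*(evaluate(i) for i in node.inputs))

    return [evaluate(node) for node in fetches]

# Declaration phase: same structure as the TF graph above, nothing is computed.
a, b, c = constant(5), constant(2), constant(3)
d = multiply(a, b)
e = add(b, c)
f = subtract(d, e)

# Execution phase: values exist only once we ask for them.
outs = run([a, b, c, d, e, f])
print("outs={}".format(outs))  # outs=[5, 2, 3, 10, 5, 5]
```

The toy `run` function plays the role of `sess.run(fetches)`: until it is called, `d`, `e` and `f` are just descriptions of computations.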


**<span style="color:red">Exercise 2</span>**: Write code in the cell below to create new nodes $g = a^2 + (b * f)$ and $h= (b+c)*d + e^2*c$, then create a session and run a query to compute and display the values of $g$ and $h$.

In [None]:
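# write your code here
g = tf.add(tf.pow(a, 2), tf.multiply(b,f))
h = tf.add(tf.multiply(tf.add(b, c), d), tf.multiply(tf.pow(e, 2), c))
with tf.Session() as sess:  # create a session and fetch both nodes in one run
    print("g={}, h={}".format(*sess.run([g, h])))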

### <span style="color:#0b486b"> I.4 More about Graph </span> ###
Each TF project is associated with a default computational graph. Besides the default graph, we can create other graphs on demand and work with these graphs.

In [3]:
g1= tf.get_default_graph()
g2 = tf.Graph()
print(g1 is tf.get_default_graph())
with g2.as_default():
    print(g1 is tf.get_default_graph())
    print(g2 is tf.get_default_graph())

True
False
True


### <span style="color:#0b486b"> I.5 Data Types and Cast </span> ###
TensorFlow supports several data types when declaring its nodes. We can cast a node from a data type to another. 

In [None]:
x = tf.constant(name="x", value=[1,2,3], dtype= tf.int32)
print("x type is {}".format(x.dtype))
y = tf.cast(x, tf.float32)
print("y type is {}".format(y.dtype))

Below are the popular data types in TensorFlow.

<img src="images/TF_DataTypes.png" width="600" align="center"/>

### <span style="color:#0b486b">I.6 Get Shape </span> ###
Each node in a computational graph of TensorFlow is a tensor. We can use the method `get_shape()` to get the shape of a tensor. Note that we create an interactive session and evaluate node `x` with `sess.run(x)`; inside an interactive session, `x.eval()` gives the same result.

In [10]:
x= tf.constant([[[1,2,3],[4,5,6]],
                [[-1,-2,-3],[-4,-5,-6]]])
sess = tf.InteractiveSession()
print("x= {}".format(sess.run(x))) 
# In an interactive session, x.eval() is equivalent to sess.run(x)
print("x shape: {}".format(x.get_shape()))
sess.close()

x= [[[ 1  2  3]
  [ 4  5  6]]

 [[-1 -2 -3]
  [-4 -5 -6]]]
x shape: (2, 2, 3)


### <span style="color:#0b486b"> I.7 Initialize Tensors </span> ###
When declaring nodes in TensorFlow, we can initialize them using a variety of methods offered by TensorFlow. After that, we can query their values in an `interactive session`.

In [11]:
a= tf.constant(name='a', value=3)
b= tf.fill(name='b', dims= [2,3], value=-1)
c= tf.zeros(name='c', shape= [2,3])
d= tf.ones(name='d', shape=[2,3])
e= tf.random_normal(name='e', shape=[2,3], mean=0, stddev=1)
f= tf.ones_like(e)
g= tf.zeros_like(e)
h= tf.random_shuffle([5,10,15,20])
k= tf.random_uniform([2,3], minval=-1, maxval=1)

In [12]:
sess= tf.InteractiveSession()
print("a={}".format(a.eval()))
print("b={}".format(b.eval()))
print("c={}".format(c.eval()))
print("d={}".format(d.eval()))
print("e={}".format(e.eval()))
print("f={}".format(f.eval()))
print("g={}".format(g.eval()))
print("h={}".format(h.eval()))
print("k={}".format(k.eval()))
sess.close()

a=3
b=[[-1 -1 -1]
 [-1 -1 -1]]
c=[[0. 0. 0.]
 [0. 0. 0.]]
d=[[1. 1. 1.]
 [1. 1. 1.]]
e=[[ 1.1932441   0.10100866 -0.1715935 ]
 [ 0.24681237  1.1333975  -2.6883318 ]]
f=[[1. 1. 1.]
 [1. 1. 1.]]
g=[[0. 0. 0.]
 [0. 0. 0.]]
h=[15 20  5 10]
k=[[-0.6018903   0.82583356  0.8424437 ]
 [-0.2821567  -0.94123673  0.5967517 ]]


### <span style="color:#0b486b"> I.8 Matrix Multiplication and Activation </span> ###
Matrix multiplication in conjunction with an activation function is a building block of deep learning. In the following code snippet, we declare a matrix $A$ of shape $(2,3)$ and a vector $x$ of shape $(3,)$. To be able to compute $A \times x$, we need to expand $x$ by one dimension via the method `tf.expand_dims()`. Now $x$ has shape $(3,1)$ and we can do the matrix multiplication. Finally, we apply the activation function ReLU to the output of the matrix multiplication: $ReLU(A \times x)$. ReLU is defined as $ReLU(t)= \max\{0,t\}$ and is one of the most popular activation functions in deep learning.

In [13]:
A = tf.constant([[1,2,3],
                 [-4,-5,-6]])
x= tf.constant([1,1,1])
print("A shape: {}".format(A.get_shape()))
print("x shape: {}".format(x.get_shape()))
x= tf.expand_dims(x,1) # expand one dimension to turn x's shape into (3, 1) 
print("x shape: {}".format(x.get_shape()))
sess= tf.InteractiveSession()
y =tf.matmul(A,x)
print("y before activation: {}".format(y.eval()))
y= tf.nn.relu(y) 
print("y after activation: {}".format(y.eval()))
sess.close()

A shape: (2, 3)
x shape: (3,)
x shape: (3, 1)
y before activation: [[  6]
 [-15]]
y after activation: [[6]
 [0]]
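
To check the numbers above by hand, the same computation can be sketched in plain Python with nested lists. The helpers `matmul` and `relu` below are hypothetical stand-ins written for this illustration; they are not the TensorFlow functions.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    inner = len(B)
    assert all(len(row) == inner for row in A), "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def relu(M):
    """Apply ReLU(t) = max(0, t) to every entry."""
    return [[max(0, t) for t in row] for row in M]

A = [[1, 2, 3],
     [-4, -5, -6]]
x = [[1], [1], [1]]          # shape (3, 1), i.e. x after expand_dims

y = matmul(A, x)
print(y)        # [[6], [-15]]
print(relu(y))  # [[6], [0]]
```

The first row gives $1+2+3=6$ and the second gives $-4-5-6=-15$, which ReLU clips to $0$.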


In [None]:
x= tf.expand_dims(x,1)
print("x shape: {}".format(x.get_shape()))

**<span style="color:red">Exercise 3</span>**: Create a tensor $x$ of shape $(3, 1)$ with entry values $x[i, 0]=i$, $\forall i = 0, 1, 2$ and another tensor $y$ of shape $(1, 3)$ with all entry values of $2$. Create another operation $z$ to perform the matrix multiplication of $x$ and $y$. This operation is also called the outer product. Print the shape of $z$. Also, run a query to compute $z$ and print the result.

In [22]:
# your code here
sess = tf.InteractiveSession() 
x = tf.constant([[0], [1], [2]])
y = tf.constant([[2,2,2]])
print(x.get_shape())
print(y.get_shape())
z = tf.matmul(x, y)
print(z.eval())
sess.close()

(3, 1)
(1, 3)
[[0 0 0]
 [2 2 2]
 [4 4 4]]


### <span style="color:#0b486b">I.9 Name </span> ###
Each tensor object also has an identifying name. This name is an *intrinsic* string name. We can use the `object.name` attribute to see the name of the object.
Objects residing within the same graph cannot have the same name: if two objects are accidentally given the same name, TF automatically appends an underscore and a number to distinguish them. Two objects can have the same name only when they are associated with different graphs.

In [26]:
tf.reset_default_graph()
g = tf.get_default_graph()
with g.as_default():
    c1= tf.constant(name='c', value=3)
    c2= tf.constant(name='c', value=5)
print("c1 name: {}".format(c1.name))
print("c2 name: {}".format(c2.name))

c1 name: c:0
c2 name: c_1:0


### <span style="color:#0b486b">I.10 Name Scope </span> ### 
A name scope *hierarchically groups nodes* together by name to divide a graph into subgraphs with some semantic meaning. To declare a name scope, we use the syntax: `tf.name_scope("prefix")`.

Organizing your TensorFlow code using name scopes has several advantages:
- It makes the code easier to follow and manage.
- It improves the visualization of the graph structure.
- It is useful when dealing with a large, complicated graph.

In [27]:
with tf.get_default_graph().as_default():
    c1 = tf.constant(name= "c", value=1.0)
    with tf.name_scope("prefix"):
        c2 = tf.constant(name= "c", value=2.0)
        c3 = tf.constant(name= "c", value=3.0)
print("c1 name: {}".format(c1.name))
print("c2 name: {}".format(c2.name))
print("c3 name: {}".format(c3.name))

c1 name: c_2:0
c2 name: prefix/c:0
c3 name: prefix/c_1:0
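
The output above shows `c1` named `c_2:0` rather than `c:0`. The reason is that the default graph was not reset after Section I.9, so the names `c` and `c_1` were already taken. The toy sketch below mimics this bookkeeping in plain Python. The class `NameRegistry` and its methods are hypothetical, it is not TensorFlow's implementation, and it omits the `:0` suffix that TF adds to tensor names.

```python
class NameRegistry:
    """Toy model of per-graph name bookkeeping (not TensorFlow's implementation)."""
    def __init__(self):
        self._used = {}      # full name -> how many times it has been requested
        self._scopes = []    # stack of active name-scope prefixes

    def push_scope(self, prefix):
        self._scopes.append(prefix)

    def pop_scope(self):
        self._scopes.pop()

    def unique_name(self, name):
        full = "/".join(self._scopes + [name])
        count = self._used.get(full, 0)
        self._used[full] = count + 1
        # First use keeps the name; later uses get _1, _2, ...
        return full if count == 0 else "{}_{}".format(full, count)

graph = NameRegistry()
# Section I.9: two constants named 'c' in the same graph
print(graph.unique_name("c"))   # c
print(graph.unique_name("c"))   # c_1
# Section I.10 runs in the same graph, so a third 'c' becomes c_2
print(graph.unique_name("c"))   # c_2
graph.push_scope("prefix")
print(graph.unique_name("c"))   # prefix/c
print(graph.unique_name("c"))   # prefix/c_1
graph.pop_scope()
```

Inside the scope, counting starts again because `prefix/c` is a different full name from `c`.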


### <span style="color:#0b486b">I.11 Variable </span> ### 
Variable nodes are crucial nodes in a TensorFlow computational graph whose values can be *modified* and *changed*. 

In a deep learning model, `model parameters` are declared as `variable nodes` in the relevant computational graph, and their values are modified during training. Using variables is done in two stages:
- Call the `tf.Variable()` function to create a variable and define what value it will be initialized with.
- Explicitly perform an initialization operation by running the session with the `tf.global_variables_initializer()` method. This will allocate the memory for the variable and set its initial values.



In [28]:
# Model parameters are typically initialized with random values and then updated by backpropagation during training
init_var = tf.random.normal(shape=[2,3], mean=0, stddev=0.1, dtype= tf.float32) 
my_var=  tf.Variable(initial_value= init_var, name='x')
init= tf.global_variables_initializer()
print("Pre-run my_var: {}".format(my_var))
with tf.Session() as sess:
    sess.run(init)
    my_var = sess.run(my_var)
    print("Post-run my_var: {}".format(my_var))

Pre-run my_var: <tf.Variable 'x:0' shape=(2, 3) dtype=float32_ref>
Post-run my_var: [[-0.09475895 -0.11482976  0.00464236]
 [-0.07952299 -0.05104303 -0.10303134]]


Below are some popular ways to *randomly initialize* the initial values of variables.

<img src="images/tf_random.png" width="800" align="center"/>


### <span style="color:#0b486b">I.12 Variable Scope and Reuse Variable </span> ###  

Sometimes we might want to reuse a variable. This can be done by `tf.get_variable()`, which either reuses the variable with a specified name or creates a new variable with the name if it is not created before.

If we want to reuse a variable later, we first need to declare it inside a variable scope with `tf.variable_scope()`; to reuse it, we reopen the same variable scope with the flag `reuse` set to `True` and invoke `tf.get_variable()` with the same name inside it.

In [30]:
tf.reset_default_graph() # start from an empty graph so that re-running this cell does not fail
with tf.variable_scope("layer1"):
    W1= tf.get_variable(name='W', shape=[2,3], dtype= tf.float32, initializer= tf.initializers.random_normal(0,0.1))
    x= tf.Variable(initial_value=tf.random_normal([4,2], 0, 0.1), name="x")
    v1= tf.matmul(x, W1)

with tf.variable_scope("layer1", reuse=True):
    W2= tf.get_variable(name='W', shape=[2,3], dtype= tf.float32, initializer= tf.initializers.random_normal(0,0.2))

diff = tf.subtract(W1,W2)
norm_diff = tf.norm(diff, ord='euclidean')
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    norm_diff= sess.run(norm_diff)

if norm_diff==0:
    print("W1 and W2 are tied")
else:
    print("W1 and W2 are not tied")

Note: without the call to `tf.reset_default_graph()`, running this cell a second time in the same graph fails with `ValueError: Variable layer1/W already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?`, because the first `tf.variable_scope("layer1")` block does not set `reuse`.
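
The create-or-reuse rule described in this section can be summarized with a small dictionary-based model. This is a simplified plain-Python sketch, not TensorFlow's implementation: `VariableStore`, its `get_variable` method and the `AUTO_REUSE` constant are hypothetical names that only imitate the behavior of `tf.get_variable()` under the three settings of `reuse`.

```python
AUTO_REUSE = "auto_reuse"   # stand-in for tf.AUTO_REUSE in this toy model

class VariableStore:
    """Toy model of the create-or-reuse rule (not TensorFlow's implementation)."""
    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, init, reuse=False):
        full = "{}/{}".format(scope, name)
        exists = full in self._vars
        if reuse is True and not exists:
            raise ValueError("Variable {} does not exist".format(full))
        if reuse is False and exists:
            raise ValueError("Variable {} already exists".format(full))
        if not exists:
            self._vars[full] = list(init)   # create with the given initial value
        return self._vars[full]             # otherwise `init` is ignored

store = VariableStore()
W1 = store.get_variable("layer1", "W", init=[0.1, 0.2])              # creates
W2 = store.get_variable("layer1", "W", init=[9.0, 9.0], reuse=True)  # reuses
print(W1 is W2)   # True: the second initializer is ignored

W3 = store.get_variable("layer1", "W", init=[0.0], reuse=AUTO_REUSE)
print(W3 is W1)   # True: AUTO_REUSE reuses an existing variable

try:
    store.get_variable("layer1", "W", init=[0.0])   # reuse=False, already exists
except ValueError as err:
    print(err)    # Variable layer1/W already exists
```

In this model, as in the cell above, the initializer passed on reuse has no effect, which is why `W1` and `W2` hold identical values even though their initializers differ.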


### <span style="color:#0b486b">I.13 Name Scope & Variable Scope </span> ###   
There are two different types of scopes: `name scope` created using `tf.name_scope` and `variable scope` created using `tf.variable_scope`. Both scopes have the same effect on all operations as well as on variables created using `tf.Variable`, while only the variable scope affects `tf.get_variable()`.

The following figure shows the difference of two ways to create variables: `tf.Variable` and `tf.get_variable()`. <br/>

<img src="images/Compare2WaysVariables.png" width="400" align="center"/>

In [31]:
tf.reset_default_graph()
with tf.name_scope("ns"):
    with tf.variable_scope("vs", reuse=tf.AUTO_REUSE):
        c1= tf.Variable(name="c", initial_value= tf.constant(1.0))
        c2= tf.get_variable(name="c", initializer= tf.constant([-1,1]))
print("c1 name: {}".format(c1.name))
print("c2 name: {}".format(c2.name))

c1 name: ns/vs/c:0
c2 name: vs/c:0


**<span style="color:red">Exercise 4</span>**: TensorFlow can also automatically decide to create a new variable if it does not exist and reuse if it is already created by setting `reuse=tf.AUTO_REUSE` as shown in the function `linear()` below. Questions:

(a) What is the value of `diff`? You can run a session to compute its value to answer this question.

(b) Why does `diff` get that value?

(c) What happens if we set `reuse=False` in the declaration of `z` in the code below? Why?

In [None]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

tf.reset_default_graph()

def linear(x, output_dim, reuse=tf.AUTO_REUSE):
    with tf.variable_scope("linear", reuse=reuse):
        W = tf.get_variable('weights', shape=[output_dim, x.get_shape()[0]], dtype=tf.float32)
        output = tf.matmul(W, x)
        return output
    
x = tf.constant(0.0, shape=(3, 1))
y = linear(x, 3, reuse=tf.AUTO_REUSE) 
z = linear(x, 3, reuse=tf.AUTO_REUSE) 
diff = y - z

In [None]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(diff.eval())

**Write your answer here:**

(a)

(b)

(c)

### <span style="color:#0b486b">I.14 Placeholder </span> ###    
Placeholders can be thought of as `empty variables` that will be filled with data later on.

We use them when constructing our graph; when querying the values of nodes at execution time, we need to feed them with the input data.


In [32]:
import numpy as np

tf.reset_default_graph()
x_data= np.random.rand(10,3)

with tf.name_scope("layer1"):
    W = tf.Variable(name= "W", initial_value=tf.random.normal([1,10],0,0.1, dtype= tf.float32))
    b = tf.Variable(name="b", initial_value=tf.random.normal([1,3],0,0.1, dtype=tf.float32))
    x = tf.placeholder(name="x", shape=[10,3], dtype= tf.float32)
    v = tf.matmul(W,x) +b
    s = tf.reduce_mean(v)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    s= sess.run(s, feed_dict={x:x_data}) # feed the placeholder x with x_data via feed_dict

print("s= {}".format(s))

s= -0.14842142164707184


### <span style="color:#0b486b">I.15. Visualization with TensorBoard</span>

##### <span style="color:#0b486b"> Logging and creating summary for visualization </span>

Usually one would use the `print()` function and `matplotlib` to visualize progress during training. There is a better way to do this via TensorBoard. If you feed it training stats, it will display nice interactive visualizations of these stats in your web browser (e.g., learning curves).

You can also provide the graph’s definition, and TensorBoard provides an interface to browse through it. This is very useful for identifying errors in the graph, finding bottlenecks, and so on.

Let's visualize the learning rate and the global step in the following example.

In [33]:
# construction
tf.reset_default_graph()

starter_lr = 1.
decay_rate = 0.9
global_step = tf.Variable(0., trainable=False)
incr = tf.assign(global_step, global_step + 1)

with tf.control_dependencies([incr]):
    learning_rate = starter_lr * tf.pow(decay_rate, global_step)
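
The learning rate defined above follows an exponential decay, $lr = starter\_lr \times decay\_rate^{step}$. As a quick sanity check of the values to expect in TensorBoard, the formula can be evaluated in plain Python. The helper `decayed_lr` is a hypothetical name, and the sketch assumes the step counter has already been incremented when the rate is read.

```python
starter_lr = 1.0
decay_rate = 0.9

def decayed_lr(step):
    """Exponential decay: starter_lr * decay_rate ** step."""
    return starter_lr * decay_rate ** step

for step in (1, 2, 3, 10, 50):
    print("step {:2d}: learning_rate = {:.4f}".format(step, decayed_lr(step)))
```

With `decay_rate = 0.9`, the rate drops to 0.9, 0.81 and 0.729 over the first three steps and falls below 0.01 by step 50.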

Next, we add summary ops to the graph:

In [34]:
tf.summary.scalar('learning_rate', learning_rate)
tf.summary.scalar('global_step', global_step)
merged = tf.summary.merge_all() # Merges all summaries collected in the default graph

The first two lines create two summary ops in the graph that will evaluate the learning_rate and global_step value and write them to a TensorBoard compatible binary log string called a summary.

The third line creates a node that merges all summaries collected in the default graph. In the execution phase, you'll need to evaluate the merged node regularly during training (e.g., every 10 mini-batches). This will output a summary that you can then write to the events file using the *`file_writer`*.

In [35]:
import time
logdir = "tf_logs/example01/model-at-{}".format(time.strftime('%Y-%m-%d_%H.%M.%S'))
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

<img src="images/note.gif" width="40" align="left"> Tip: You need to use a different log directory every time you run your program, or else TensorBoard will merge stats from different runs, which will mess up the visualizations. The simplest solution for this is to include a timestamp in the log directory name.

Now for the execution phase: <br>
\- We initialize `global_step`, then repeatedly evaluate the *`merged`* node and write the resulting summary to the events file using the *`file_writer`*. Because `learning_rate` is created under `tf.control_dependencies([incr])`, each evaluation of *`merged`* also increments `global_step`. Here is the code:

In [36]:
with tf.Session() as sess:
    global_step.initializer.run()
    for i in range(50):
        merged_ = merged.eval()
        file_writer.add_summary(merged_, i+1)

<img src="images/warning.png" width="40" align="left"></img> **Warning**: *In real-world applications, logging training stats at every single training step would significantly slow down training; instead, one should log at regular intervals, such as every 200 iterations*.

Finally, you want to close the FileWriter at the end of the program:

In [37]:
file_writer.close()

Great, now it's time to show TensorBoard. Fortunately, we can run and show TensorBoard within the notebook by running the commands below.

In [38]:
%load_ext tensorboard
%tensorboard --logdir tf_logs

Otherwise, you can open TensorBoard in a separate browser window. You need to activate your virtual environment if you created one, then start the server by running the *`tensorboard`* command, pointing it to the root log directory. This starts the TensorBoard web server, listening on port 6006. <br> <br>
- Open a command line, navigate to the folder of this tutorial and run **> tensorboard --logdir tf_logs**
- Open your browser and go to http://localhost:6006. Welcome to TensorBoard! In the Scalars tab, you'll see *`global_step`* and *`learning_rate`*: <br><br>

<img src='images/learning_rate.png' width=300>

## <span style="color:#0b486b"> II. Understanding TensorFlow 2.x </span> <span style="color:red">*****</span>

### Table of Contents
II.1 Advancements of TensorFlow 2.x <br>
II.2 TensorFlow 2.x's API and architecture. <br>
II.3 Eager Execution and Dynamic Graph. <br> 
II.4 Placeholder Replacement and The Data API in TF 2.x <br> 


### <span style="color:#0b486b"> II.1. Advancements of TensorFlow 2.x </span>

In TF 1.x, we need to declare a computational graph manually and create a session to query node values. This is different from an imperative, high-level programming language such as Python, which is much easier to work with and more dynamic. TF 2.x brings TF closer to an imperative, high-level programming language such as Python, thereby making it more convenient to build deep learning models. This is due to two new features: **eager execution** and **AutoGraph**.
- **Eager execution**: you still have a graph, but you can define, change, and execute nodes on the fly, with no special session interfaces or placeholders. This is what is called eager execution, meaning that the model definitions are dynamic, and the execution is immediate. Graphs and sessions should be considered as implementation details. In TF 2.x, we do not need to work directly with a computational graph and session.
- **AutoGraph**: AutoGraph takes eager-style Python code and automatically converts it to graph-generating code. So, again, transparently TensorFlow 2.x creates a bridge between imperative, dynamic, and eager Python style programming with efficient graph computations, taking the best of both worlds.

In addition, Keras has been incorporated in and become a part of TF 2.x, which allows us to build up deep learning models comfortably and conveniently.


### <span style="color:#0b486b"> II.2. TensorFlow 2.x's API and architecture</span>

The image below shows the API of TF 2.x. At the lowest level, each TensorFlow operation (op for short) is implemented using highly efficient C++ code. Many operations have multiple implementations called kernels: each kernel is dedicated to a specific device type, such as CPUs, GPUs, or even TPUs (tensor processing units).

<img src='images/TF_API.png' align=center width=600>


The architecture of TF 2.x is shown in the following image. Most of the time your code will use the high-level APIs (especially tf.keras and tf.data); but when you need more flexibility, you will use the lower-level Python API, handling tensors directly. 

<img src='images/TF_Architecture.png' align=center width=600>

### <span style="color:#0b486b"> II.3. Eager Execution in TF 2.x</span>

#### Define, Change and Execute on-the-fly 
For example, you can define a variable, then modify it and show its current value without calling any session or the `global_variables_initializer` function as in TF 1.x.

In [None]:
import tensorflow as tf 

v = tf.Variable(tf.ones([2,3], dtype= tf.float32)) 
print('--------')
print('Define a variable which has initial value is a tensor one with shape [2,3]')
print(v)

 
v.assign_add(10. * tf.ones_like(v))
print('--------')
print('In place increasing by 10 ')
print(v)

 
v[0,0].assign(-1)
print('--------')
print('Modify individual cells in the tensor, e.g., cell [0,0] to -1 ')
print(v)


# Modify a specific cell inside a forloop 
print('--------')
for i in range(v.shape[1]): 
#     v[:,i] = i # Raise Error: 'ResourceVariable' object does not support item assignment 
#     v[:,i].assign(i) # does not match r-value shape []. Automatic broadcasting not yet implemented 
    v[0,i].assign(i) # Working 
    print('-- iter = {} --'.format(i))
    print(v)


#### Variable Scope and Reuse Variable in TF 2.x 
In TF 1.x you need to manage the variable scope of each variable, especially when reusing it. In TF 2.x, it is more flexible.    

In [None]:
# Example in TF 1.x 

#import tensorflow as tf 
#tf.compat.v1.disable_v2_behavior()
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Scenario 1: an input x that shared with two weights W1 and W2 
print('-----------------------------')
x = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
W1 = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
W2 = tf.Variable(tf.random.normal(shape=[2,5], mean=0., stddev=0.1))
v1 = tf.matmul(x, W1)
v2 = tf.matmul(x, W2)

print('v1', v1)
print('v2', v2)

# Scenario 2: two input x1 and x2 multiple with the same weight W 
print('-----------------------------')
x1 = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
x2 = tf.Variable(tf.random.normal(shape=[1,2], mean=0., stddev=0.1))
W = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
v1 = tf.matmul(x1, W)
v2 = tf.matmul(x2, W)

print('v1', v1)
print('v2', v2)

# Scenario 3: Define a function that is matrix multiplication with pre-defined weight W 
print('-----------------------------')
W = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
def my_matmul(x): 
    return tf.matmul(x, W)

x1 = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
x2 = tf.Variable(tf.random.normal(shape=[1,2], mean=0., stddev=0.1))
v1 = my_matmul(x1)
v2 = my_matmul(x2)
print('v1', v1)
print('v2', v2)

# Scenario 4: Define a function that is matrix multiplication with weight W is defined inside the function. 
# later in this unit, you will see some common practices, e.g., a convolution layer 
print('-----------------------------')
def my_matmul2(x): 
    with tf.variable_scope("W", reuse=tf.AUTO_REUSE):
        W = tf.get_variable(name='W', shape=[2,3], dtype= tf.float32, initializer= tf.initializers.random_normal(0,0.1))
    return tf.matmul(x, W)
x1 = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
x2 = tf.Variable(tf.random.normal(shape=[1,2], mean=0., stddev=0.1))
v1 = my_matmul2(x1)
v2 = my_matmul2(x2)
print('v1', v1)
print('v2', v2)


You need to restart the kernel if v1 behavior was enabled earlier.

In [None]:
# Reproduce in TF 2.x 
# Need to restart kernel if enabling v1 behavior before 

#import tensorflow as tf 
#tf.compat.v1.enable_v2_behavior()
import tensorflow.compat.v1 as tf
tf.enable_v2_behavior()

# Scenario 1: an input x that shared with two weights W1 and W2 
print('-----------------------------')
x = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
W1 = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
W2 = tf.Variable(tf.random.normal(shape=[2,5], mean=0., stddev=0.1))
v1 = tf.matmul(x, W1)
v2 = tf.matmul(x, W2)

print('v1', v1)
print('v2', v2)

# Scenario 2: two input x1 and x2 multiple with the same weight W 
print('-----------------------------')
x1 = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
x2 = tf.Variable(tf.random.normal(shape=[1,2], mean=0., stddev=0.1))
W = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
v1 = tf.matmul(x1, W)
v2 = tf.matmul(x2, W)

print('v1', v1)
print('v2', v2)

# Scenario 3: Define a function that is matrix multiplication with pre-defined weight W 
print('-----------------------------')
W = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
def my_matmul(x): 
    return tf.matmul(x, W)

x1 = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
x2 = tf.Variable(tf.random.normal(shape=[1,2], mean=0., stddev=0.1))
v1 = my_matmul(x1)
v2 = my_matmul(x2)
print('v1', v1)
print('v2', v2)

# Scenario 4: Define a function that is matrix multiplication with weight W is defined inside the function. 
# later in this unit, you will see some common practices, e.g., a convolution layer 
print('-----------------------------')
def my_matmul2(x): 
    W = tf.Variable(tf.random.normal(shape=[2,3], mean=0., stddev=0.1))
    return tf.matmul(x, W)

x1 = tf.Variable(tf.random.normal(shape=[4,2], mean=0., stddev=0.1))
x2 = tf.Variable(tf.random.normal(shape=[1,2], mean=0., stddev=0.1))
v1 = my_matmul2(x1)
print('v1', v1)
print('W', W)

v2 = my_matmul2(x2)

print('v2', v2)
print('W', W)
print('Note: each call to my_matmul2 creates a new local W; the W printed above is the global W from Scenario 3')

#### <span style="color:#0b486b"> AutoGraph and Tracing</span>
To support the AutoGraph feature, TF 2.x needs to parse and analyze Python functions to capture all the control flow statements, such as *for loops, while loops*, and *if* statements, as well as *break, continue*, and *return* statements.

After analyzing the function's code, AutoGraph outputs an upgraded version of that function in which all the control flow statements are replaced by the appropriate TensorFlow operations, such as `tf.while_loop()` for loops and `tf.cond()` for *if* statements.

In the figure below, the function `sum_squares(n)` is analyzed and transformed into the function `tf__sum_squares(n)`, which is better suited to execution in graph mode in the next step. When you invoke the former function, for example `sum_squares(tf.constant(5))`, the upgraded function `tf__sum_squares()` is called with a symbolic tensor of type int32 and shape [] as its argument. The function runs in graph mode, meaning that each TensorFlow operation adds a node to the graph to represent itself and its output tensor(s) (as opposed to the regular mode, called eager execution, or eager mode). The figure also shows the final graph generated during tracing.

<img src='images/AutoGraph.png' align=center width=700>
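
The idea of tracing with a symbolic tensor can be illustrated with a toy plain-Python sketch: the same function is called once with numbers (computed immediately) and once with symbolic objects that record each operation as a graph node. The `Sym` class and the function `sum_of_two_squares` are hypothetical and written only for this illustration. The sketch covers only the recording of operations; it does not attempt AutoGraph's rewriting of control flow and is not how TensorFlow is implemented.

```python
class Sym:
    """A symbolic value: operations on it are recorded instead of computed."""
    graph = []   # recorded operations, in creation order
    count = 0    # counter used to name new nodes

    def __init__(self, name):
        self.name = name

    @classmethod
    def _record(cls, op, left, right):
        cls.count += 1
        out = Sym("t{}".format(cls.count))
        cls.graph.append("{} = {}({}, {})".format(out.name, op, left.name, right.name))
        return out

    def __add__(self, other):
        return Sym._record("add", self, other)

    def __mul__(self, other):
        return Sym._record("mul", self, other)

def sum_of_two_squares(x, y):
    # Ordinary Python code: it does not know whether x and y are numbers or Sym.
    return x * x + y * y

print(sum_of_two_squares(3, 4))   # 25, computed immediately ("eager")

result = sum_of_two_squares(Sym("x"), Sym("y"))   # traced: nothing is computed
print(Sym.graph)    # ['t1 = mul(x, x)', 't2 = mul(y, y)', 't3 = add(t1, t2)']
print(result.name)  # t3
```

The second call produces no number at all, only a description of the computation, which is the role the symbolic tensor plays during tracing.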

### <span style="color:#0b486b"> II.4. Placeholder Replacement and The Data API in TF 2.x </span>

TensorFlow is designed to work with tensors, not NumPy arrays. Therefore, we need data loading and processing operations to convert raw data (e.g., text or images in digital form) to the tensor format.

In TF 1.x we use a placeholder as a gate to feed data to train deep learning models. In TF 2.x, however, it is replaced by the Data API. Later in this unit, you will learn more about data processing operations such as batching, buffering, and transformation (see references [1,2] for more detail). In this early introduction, we just show some basic data operations that replace placeholders.

References:

[1] Chapter 13 in *Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems* by Aurélien Géron

[2] TensorFlow tutorial at https://www.tensorflow.org/guide/data 
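
As a rough picture of what batching and transformation steps do in a data pipeline, here is a conceptual plain-Python sketch using generators. It is not the TF 2.x Data API: the functions `batch` and `transform` and the 7-element dataset are hypothetical and chosen only for illustration.

```python
def batch(examples, batch_size):
    """Yield consecutive lists of at most `batch_size` examples."""
    current = []
    for example in examples:
        current.append(example)
        if len(current) == batch_size:
            yield current
            current = []
    if current:          # last, possibly smaller, batch
        yield current

def transform(batches, fn):
    """Apply `fn` to every example of every batch (a simple 'map' step)."""
    for b in batches:
        yield [fn(example) for example in b]

data = range(7)   # hypothetical dataset of 7 scalar examples
pipeline = transform(batch(data, batch_size=3), lambda v: v * 2)
for b in pipeline:
    print(b)
# [0, 2, 4]
# [6, 8, 10]
# [12]
```

Each batch is produced only when the loop asks for it, so the model is fed step by step instead of through a placeholder and `feed_dict`.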


In [None]:
import tensorflow as tf
import numpy as np
# Create a tensor with a numpy array
a = np.array([2., 4., 5.]) # dtype is float64
t = tf.constant(a) # dtype is float64

print('a', a, type(a), a.dtype)
print('t', t, type(t), t.dtype)

# Convert a tensor back to a numpy array 
b = t.numpy() # dtype is float64
print('b', b, type(b), b.dtype)

# convert to a tensor with function 
# This function converts Python objects of various types to Tensor objects. 
# It accepts Tensor objects, numpy arrays, Python lists, and Python scalars.
list_t = tf.convert_to_tensor([1.,2.,3.])
print('list_t', list_t, type(list_t), list_t.dtype)

# working with string 
s = 'Hello there'
string_t = tf.constant(s)
print('string_t', string_t, type(string_t), string_t.dtype)

Notice that NumPy uses 64-bit precision by default, while TensorFlow uses 32-bit. This is because 32-bit precision is generally more than enough for neural networks, plus it runs faster and uses less RAM. So when you create a tensor from a NumPy array, make sure to set `dtype=tf.float32`.
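
The memory and precision trade-off between the two formats can be seen without TensorFlow or NumPy. The sketch below uses Python's standard `struct` module to store the same number as a 32-bit and as a 64-bit float; the test value 0.1 is an arbitrary choice for illustration.

```python
import struct

value = 0.1

# Size in bytes of one number in each format
print(struct.calcsize("f"), struct.calcsize("d"))   # 4 8

# Round-trip the value through each format to see what is actually stored
as_float32 = struct.unpack("f", struct.pack("f", value))[0]
as_float64 = struct.unpack("d", struct.pack("d", value))[0]

print(as_float64 == value)       # True: Python floats are already 64-bit
print(as_float32 == value)       # False: 32 bits keep fewer digits
print(abs(as_float32 - value))   # a rounding error of roughly 1.5e-09
```

A 32-bit float takes half the memory, at the cost of a rounding error that is negligible for typical neural network weights.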

In [None]:
# Some functions can work with numpy array directly 
print(tf.square(a))

# We can multiply a numpy array with a tensor 
print(tf.multiply(a, t))


---
### <span style="color:#0b486b"> <div  style="text-align:center">**THE END**</div> </span>