# Review some basic concepts

This tutorial uses some materials from the course CS 20SI: TensorFlow for Deep Learning Research, Stanford University.

### 1. Data Flow graph
TensorFlow performs calculations in a directed **graph**. Basically, we need to assemble a graph and use a **session** to execute **operations** in the graph. 
<img src="resources/im1.png">

In [None]:
import tensorflow as tf
# initialize a and b
a = 5
b = 3
# define operations on the graph
c = tf.multiply(a,b)
d = tf.add(a,b)
e = tf.add(c,d)
# create a session and execute
sess = tf.Session()
output = sess.run(e)
print('Output of e: {0}'.format(output))
sess.close()

Why graphs? **To save computation** (only the subgraphs that lead to the values you want to fetch are run).

In [None]:
# define a useless operation
useless = tf.add(a,c)
# create a session and execute
sess = tf.Session()
output = sess.run(e)
print('Output of e: {0}'.format(output))
# useless was not fetched, so this prints the Tensor object, not a value
print('Output of useless: {0}'.format(useless))
sess.close()

*useless* is calculated only if we pass it to sess.run.

In [None]:
with tf.Session() as sess: # writing it this way frees us from calling sess.close()
    output = sess.run([e,useless])
    print('Output of e: {0}'.format(output[0]))
    print('Output of useless: {0}'.format(output[1]))
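
To make the idea of fetch-driven evaluation concrete, here is a minimal pure-Python sketch. It is a toy model under stated assumptions, not TensorFlow's implementation: `Node` and `run` are hypothetical names invented for this illustration. It mirrors the graph above (a = 5, b = 3) and records which nodes are actually evaluated for a given fetch.

```python
# Toy model of fetch-driven graph evaluation (not TensorFlow's real machinery).
# `Node` and `run` are hypothetical names used only for this illustration.
import operator


class Node:
    def __init__(self, name, fn, *inputs):
        self.name = name      # label used to trace evaluation
        self.fn = fn          # Python function applied to the input values
        self.inputs = inputs  # upstream Nodes or plain Python numbers


def run(fetch, evaluated):
    """Evaluate `fetch` and only the nodes it depends on."""
    if not isinstance(fetch, Node):
        return fetch  # plain numbers act as constants
    args = [run(i, evaluated) for i in fetch.inputs]
    evaluated.append(fetch.name)
    return fetch.fn(*args)


a, b = 5, 3
c = Node('c', operator.mul, a, b)
d = Node('d', operator.add, a, b)
e = Node('e', operator.add, c, d)
useless = Node('useless', operator.add, a, c)

trace = []
result = run(e, trace)
print(result)  # 23
print(trace)   # ['c', 'd', 'e']; 'useless' was never evaluated
```

Fetching `useless` as well would add it to the trace, which is the toy equivalent of passing it to sess.run.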

A graph can be separated into **multiple chunks** which can be run **simultaneously** on multiple CPUs, GPUs, or other devices.
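
As a rough illustration of that idea, the sketch below runs two independent chunks of the earlier graph concurrently. Threads stand in for devices here, and the helper names `chunk_mul` and `chunk_add` are hypothetical; TensorFlow's own device placement and scheduling are considerably more involved.

```python
# Toy illustration of running independent graph chunks concurrently.
# Threads stand in for devices; this is not how TensorFlow schedules work.
from concurrent.futures import ThreadPoolExecutor


def chunk_mul(a, b):
    # first independent chunk: c = a * b
    return a * b


def chunk_add(a, b):
    # second independent chunk: d = a + b
    return a + b


a, b = 5, 3
with ThreadPoolExecutor(max_workers=2) as pool:
    # c and d do not depend on each other, so they may run at the same time
    future_c = pool.submit(chunk_mul, a, b)
    future_d = pool.submit(chunk_add, a, b)
    # e depends on both, so it has to wait for their results
    e = future_c.result() + future_d.result()

print(e)  # 23
```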

What if we want more than one graph? You can, but you **SHOULD NOT**. By default, a session only runs the default graph. But if you really want to:

In [None]:
# this is to remove everything we added to the default graph up to now
tf.reset_default_graph() 
g = tf.Graph()
# if we don't indicate g as default, tf will add the node to default graph
with g.as_default(): 
    a = tf.add(3,5)
# and we have to indicate g as a parameter of tf.Session()
with tf.Session(graph=g) as sess: 
    print('Output of a: {0}'.format(sess.run(a)))

### 2. Using TensorBoard

In [None]:
import tensorflow as tf
a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)
with tf.Session() as sess:
    # add this line to use TensorBoard.
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    print('Output of x: {0}'.format(sess.run(x)))
writer.close() # close the writer when you're done using it

<img src="resources/im2.png">

We can name the nodes to get a clearer visualization.

In [None]:
import tensorflow as tf
a = tf.constant(2, name='a')
b = tf.constant(3, name='b')
x = tf.add(a, b, name='add')
with tf.Session() as sess:
    # add this line to use TensorBoard.
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    print('Output of x: {0}'.format(sess.run(x)))
writer.close() # close the writer when you're done using it

<img src="resources/im3.png">

### 3. Basic operators

#### 3.1 Constants

**tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)**

Constants store fixed data. Whenever a constant is declared, it is registered as a component of the graph and loaded together with the graph. That is why constants should only be used to store primitive data types.

In [None]:
import tensorflow as tf
a = tf.constant([2, 2], name="a")
b = tf.constant([[0, 1], [2, 3]], name="b")
x = tf.add(a, b, name="add")
y = tf.multiply(a, b, name="mul")
with tf.Session() as sess:
    # fetch into new names so the tensors x and y are not overwritten
    x_val, y_val = sess.run([x, y])
    print('x = ', x_val)
    print('y = ', y_val)
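
The cell above combines a vector of shape (2,) with a 2x2 matrix, so the vector is broadcast across the rows of the matrix. The plain-Python sketch below reproduces that arithmetic with nested lists standing in for tensors; it only illustrates the result and is not how TensorFlow computes it.

```python
# Plain-Python version of the arithmetic in the cell above.
# Nested lists stand in for tensors; this only illustrates broadcasting.
a = [2, 2]
b = [[0, 1], [2, 3]]

# The vector `a` is reused for every row of `b`.
x = [[a_j + b_ij for a_j, b_ij in zip(a, row)] for row in b]
y = [[a_j * b_ij for a_j, b_ij in zip(a, row)] for row in b]

print(x)  # [[2, 3], [4, 5]]
print(y)  # [[0, 2], [4, 6]]
```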

TensorFlow provides some **operations** to generate constants:
<img src="resources/im4.png">

To generate sequences:
<img src="resources/im5.png">

Or random tensors:
<img src="resources/im6.png">

In [None]:
tf.reset_default_graph()
a = tf.ones((2,4), dtype=tf.bool)
b = tf.zeros_like(a, dtype=tf.int8)
c = tf.linspace(1.0,10.0,12)
d = tf.random_normal((2,5), mean=3.0, stddev=5, seed=128)
with tf.Session() as sess:
    print('\na:')
    print(sess.run(a))
    print('\nb:')
    print(sess.run(b))
    print('\nc:')
    print(sess.run(c))
    print('\nd:')
    print(sess.run(d))
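
For intuition about the sequence operation used above, here is a plain-Python sketch of what `tf.linspace(1.0, 10.0, 12)` computes: 12 evenly spaced values with both endpoints included. The `linspace` function below is a hypothetical helper written for this illustration, not the TensorFlow op.

```python
# Plain-Python sketch of evenly spaced values, endpoints included.
# `linspace` is a hypothetical helper, not the TensorFlow op.
def linspace(start, stop, num):
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)  # num values span num - 1 gaps
    return [start + i * step for i in range(num)]


values = linspace(1.0, 10.0, 12)
print(len(values))  # 12
print(values[0], values[-1])
```

With 12 values there are 11 gaps, so the spacing is 9/11, roughly 0.818.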

Constants are **stored in the graph definition**. If there are many constants, loading the graph becomes expensive. For large data, variables should be used instead.

In [None]:
tf.reset_default_graph()
a = tf.constant([[1,2],[3,4]], dtype=tf.int16, name='a')
with tf.Session() as sess:
    print(sess.graph.as_graph_def())
    print('Value of constant a: \n', a.eval())

Do you see the entry **tensor_content**? Now let's see what happens with a **variable**.

In [None]:
tf.reset_default_graph()
a = tf.Variable([[1,2],[3,4]], dtype=tf.int16, name="Va")
with tf.Session() as sess:
    print(sess.graph.as_graph_def())
    # a.eval() raises an error here because the variable has not been initialized
    print('Value of variable a: \n', a.eval())

We again see a node, named **Va/initial_value**, that contains **tensor_content**, but the value of variable *a* has not been loaded yet.

#### 3.2 Variables

When you train a model, you use variables to hold and update parameters.

In [None]:
import tensorflow as tf
# create variable a with scalar value
a = tf.Variable(2, name="scalar")
# create variable b as a vector
b = tf.Variable([2, 3], name="vector")
# create variable c as a 2x2 matrix
c = tf.Variable([[0, 1], [2, 3]], name="matrix")

Variables need to be initialized. The easiest way is initializing all variables at once.

In [None]:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print('Value of variable a: ', sess.run(a))

or initializing only a subset of variables:

In [None]:
init_ab = tf.variables_initializer([a, b], name="init_ab")
with tf.Session() as sess:
    sess.run(init_ab)

or initializing a single variable:

In [None]:
W = tf.Variable(tf.ones([5,10]))
with tf.Session() as sess:
    sess.run(W.initializer)

After initializing, a variable can be read, saved and restored. This is an easy way to save and load a model.

In [None]:
tf.reset_default_graph()
W = tf.Variable(tf.ones([5,10]), name='W') # name the variable itself, not the tf.ones op
saver = tf.train.Saver({'W': W})
with tf.Session() as sess:
    sess.run(W.initializer)
    # read variable
    print(W.eval())
    # save variable
    save_path = saver.save(sess, "/tmp/W.ckpt")
    print('Model saved in file: ', save_path)

In [None]:
tf.reset_default_graph()
# the name must be identical to the saved variable
W = tf.Variable(tf.zeros([5,10]), name='W') 
saver = tf.train.Saver()
with tf.Session() as sess:
    # restore variable
    saver.restore(sess, "/tmp/W.ckpt")
    # read variable
    print(W.eval())
    # although W is defined as zeros,
    # after being restored from disk it contains ones
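
The name matching in the restore step can be pictured with a small pure-Python sketch. This is a toy under stated assumptions: `save` and `restore` are hypothetical helpers, and a JSON file stands in for the checkpoint; it is not the format or API of tf.train.Saver.

```python
# Toy sketch of name-based save and restore (not the tf.train.Saver format).
# `save` and `restore` are hypothetical helpers; JSON stands in for a checkpoint.
import json
import os
import tempfile


def save(variables, path):
    # variables: dict mapping a name to a nested list of values
    with open(path, 'w') as f:
        json.dump(variables, f)


def restore(variables, path):
    with open(path) as f:
        saved = json.load(f)
    for name in variables:
        # values are looked up by name, so the names have to match;
        # a name that was never saved raises KeyError
        variables[name] = saved[name]


path = os.path.join(tempfile.mkdtemp(), 'W.json')
save({'W': [[1.0] * 10 for _ in range(5)]}, path)

# A fresh "graph" in which W starts out as zeros.
fresh = {'W': [[0.0] * 10 for _ in range(5)]}
restore(fresh, path)
print(fresh['W'][0][:3])  # [1.0, 1.0, 1.0]
```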

The **assign()** method assigns a new value to a variable. Like every operation, it only takes effect when it is run in a session. In fact, **initializing** a variable is itself an **assign** operation.

In [None]:
W = tf.Variable(10)
W.assign(100) # this only creates an assign op; it is never run
with tf.Session() as sess:
    sess.run(W.initializer)
    print('Value of W: ', W.eval()) # >> 10
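
The behaviour in the cell above can be mimicked with a small toy model in which assignment returns an operation that does nothing until it is run. `ToyVariable` and its methods are hypothetical names for this sketch and do not reflect the TensorFlow API.

```python
# Toy model of deferred assignment (hypothetical class, not the TF API).
class ToyVariable:
    def __init__(self, initial_value):
        self.initial_value = initial_value
        self.value = None  # nothing is stored until initialize() runs

    def initialize(self):
        # initialization is itself just an assignment of the initial value
        self.value = self.initial_value

    def assign(self, new_value):
        # returns an operation (a callable); it does nothing until it is run
        def op():
            self.value = new_value
            return self.value
        return op


W = ToyVariable(10)
assign_op = W.assign(100)  # only builds the operation

W.initialize()
before = W.value           # the assign op has not been run yet
after = assign_op()        # running the op is what changes the value
print(before, after)       # 10 100
```

In the TensorFlow cell, the equivalent of the last step would be keeping the result of `W.assign(100)` and passing it to sess.run.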

Variables are defined in the graph but, unlike **constants**, their contents are loaded only when we run the **initialization**. Each session maintains its own copy of the variables.

In [None]:
W = tf.Variable(10)
sess1 = tf.Session()
sess2 = tf.Session()
sess1.run(W.initializer)
sess2.run(W.initializer)
print(sess1.run(W.assign_add(10))) # >> 20
print(sess2.run(W.assign_sub(2))) # >> 8
sess1.close()
sess2.close()
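
A toy model of per-session variable storage shows why the two results differ. `ToySession` is a hypothetical class written for this sketch, assuming each session simply keeps its own dictionary of values; the real implementation is more involved.

```python
# Toy model: each session keeps its own variable values
# (hypothetical ToySession class, not the TensorFlow implementation).
class ToySession:
    def __init__(self):
        self.values = {}  # per-session storage, keyed by variable name

    def initialize(self, name, initial_value):
        self.values[name] = initial_value

    def assign_add(self, name, delta):
        self.values[name] += delta
        return self.values[name]


sess1 = ToySession()
sess2 = ToySession()
sess1.initialize('W', 10)
sess2.initialize('W', 10)

r1 = sess1.assign_add('W', 10)  # affects only sess1's copy
r2 = sess2.assign_add('W', -2)  # affects only sess2's copy
print(r1, r2)  # 20 8
```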

#### 3.3 Placeholders

A TF program often has two phases:
* Assemble a graph
* Use a session to execute operations in the graph

Can we assemble a graph first without knowing the values needed for computation? For example, can we define f(x,y) = 2x + y without knowing x,y?

In [None]:
tf.reset_default_graph()
x = tf.placeholder(tf.float16, shape=[3])
y = tf.placeholder(tf.float16, shape=[3])
f = 2*x + y
with tf.Session() as sess:
    print(sess.run(f)) # error because x and y are unknown

Feed values to placeholders via a dictionary.

In [None]:
tf.reset_default_graph()
x = tf.placeholder(tf.float16, shape=[3])
y = tf.placeholder(tf.float16, shape=[3])
f = 2*x + y
with tf.Session() as sess:
    print(sess.run(f, {x:[1,2,3], y:[4,5,6]}))
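
The same two-phase pattern (define f first, supply x and y later) can be sketched in pure Python. `Placeholder`, `Expr` and `run` are hypothetical names for this toy model, which assumes plain lists in place of tensors; it is not how TensorFlow implements placeholders.

```python
# Toy placeholders: build f(x, y) = 2x + y first, supply values later.
# `Placeholder`, `Expr` and `run` are hypothetical names for this sketch.
class Placeholder:
    def __init__(self, name):
        self.name = name


class Expr:
    def __init__(self, fn, *inputs):
        self.fn = fn          # function applied to the evaluated inputs
        self.inputs = inputs  # Placeholders or other Exprs


def run(node, feed_dict):
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError('no value fed for placeholder %r' % node.name)
        return feed_dict[node]
    args = [run(i, feed_dict) for i in node.inputs]
    return node.fn(*args)


x = Placeholder('x')
y = Placeholder('y')
# element-wise 2*x + y over plain Python lists
f = Expr(lambda xs, ys: [2 * a + b for a, b in zip(xs, ys)], x, y)

print(run(f, {x: [1, 2, 3], y: [4, 5, 6]}))  # [6, 9, 12]

try:
    run(f, {})  # nothing fed, so evaluation cannot proceed
except ValueError as err:
    print('error:', err)
```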

#### 3.4 Lazy loading
Lazy loading is a typical mistake when working with TF. The following code is **normal loading**.

In [None]:
x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
z = tf.add(x, y) # the add node is created once, before the graph is executed
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(z)

<img src='./resources/im7.png'>

And the following code is **lazy loading**. Someone decides to save one line of code.

In [None]:
x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(tf.add(x, y))

<img src='./resources/im8.png'/>

**Lazy loading** makes the graph harder to read. It also adds redundant nodes to the graph, which bloats it.

<img src='./resources/im10.png'/>
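
The difference between the two loading styles can be counted with a toy graph that records every node it creates. `ToyGraph` is a hypothetical class for this sketch; it only models the bookkeeping, assuming that each call that builds an op registers one new node.

```python
# Toy graph that records every node ever created (hypothetical ToyGraph class).
class ToyGraph:
    def __init__(self):
        self.nodes = []

    def add(self, x, y):
        # creating an op registers a new node, as in a graph-building API
        self.nodes.append(('add', x, y))
        return len(self.nodes) - 1  # handle to the new node


# Normal loading: the add node is created once, then reused in the loop.
normal = ToyGraph()
z = normal.add('x', 'y')
for _ in range(10):
    _ = z  # stands in for sess.run(z)

# Lazy loading: a new add node is created on every iteration.
lazy = ToyGraph()
for _ in range(10):
    _ = lazy.add('x', 'y')  # stands in for sess.run(tf.add(x, y))

print(len(normal.nodes), len(lazy.nodes))  # 1 10
```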

### 4. Exercises

In [None]:
"""
Simple TensorFlow exercises
You should thoroughly test your code
"""

import tensorflow as tf
tf.reset_default_graph()

###############################################################################
# 1a: Create two random 0-d tensors x and y of any distribution.
# Create a TensorFlow object that returns x + y if x > y, and x - y otherwise.
# Hint: look up tf.cond()
# The first problem is solved for you as an example.
###############################################################################

x = tf.random_uniform([])  # Empty array as shape creates a scalar.
y = tf.random_uniform([])
out = tf.cond(tf.greater(x, y), lambda: tf.add(x, y), lambda: tf.subtract(x, y))
with tf.Session() as sess:
    print(sess.run(out))

In [None]:
###############################################################################
# 1b: Create two 0-d tensors x and y randomly selected from -1 and 1.
# Return x + y if x < y, x - y if x > y, 0 otherwise.
# Hint: Look up tf.case().
###############################################################################

# YOUR CODE

In [None]:
###############################################################################
# 1c: Create the tensor x of the value [[0, -2, -1], [0, 1, 2]] 
# and y as a tensor of zeros with the same shape as x.
# Return a boolean tensor that is True wherever x equals y element-wise.
# Hint: Look up tf.equal().
###############################################################################

# YOUR CODE

In [None]:
###############################################################################
# 1d: Create the tensor x of value 
# [29.05088806,  27.61298943,  31.19073486,  29.35532951,
#  30.97266006,  26.67541885,  38.08450317,  20.74983215,
#  34.94445419,  34.45999146,  29.06485367,  36.01657104,
#  27.88236427,  20.56035233,  30.20379066,  29.51215172,
#  33.71149445,  28.59134293,  36.05556488,  28.66994858].
# Get the indices of elements in x whose values are greater than 30.
# Hint: Use tf.where().
# Then extract elements whose values are greater than 30.
# Hint: Use tf.gather().
###############################################################################

# YOUR CODE

In [None]:
###############################################################################
# 1e: Create a diagonal 2-d tensor of size 6 x 6 with the diagonal values of 1,
# 2, ..., 6
# Hint: Use tf.range() and tf.diag().
###############################################################################

# YOUR CODE

In [None]:
###############################################################################
# 1f: Create a random 2-d tensor of size 10 x 10 from any distribution.
# Calculate its determinant.
# Hint: Look at tf.matrix_determinant().
###############################################################################

# YOUR CODE

In [None]:
###############################################################################
# 1g: Create tensor x with value [5, 2, 3, 5, 10, 6, 2, 3, 4, 2, 1, 1, 0, 9].
# Return the unique elements in x
# Hint: use tf.unique(). Keep in mind that tf.unique() returns a tuple.
###############################################################################

# YOUR CODE

In [None]:
###############################################################################
# 1h: Create two tensors x and y of shape 300 from any normal distribution,
# as long as they are from the same distribution.
# Use tf.less() and tf.where() (named tf.select() before TensorFlow 1.0) to return:
# - The mean squared error of (x - y) if the average of all elements in (x - y)
#   is negative, or
# - The sum of absolute value of all elements in the tensor (x - y) otherwise.
# Hint: see the Huber loss function in the lecture slides 3.
###############################################################################

# YOUR CODE

# END