# Modern Data Science 
**(Module 05: Deep Learning)**

---
- Materials in this module include resources collected from various open-source online repositories.
- You are free to use, change and distribute this package.

Prepared by and for 
**Student Members** |
2006-2018 [TULIP Lab](http://www.tulip.org.au), Australia

---


# Session A - Deep Learning with TensorFlow

**The purpose of this session is to demonstrate how to work with TensorFlow, an open source software library for developing deep neural network applications. In this practical session, we present the following topics:**

1. What is TensorFlow
2. How to install TensorFlow
3. A quick tour of TensorFlow

**References and additional reading and resources**
- [Installing Tensorflow on Windows](https://www.tensorflow.org/install/install_windows)
- [Tensorflow API documentations](https://www.tensorflow.org/versions/master/api_docs/python/)
- [Examples with Tensorflow](https://www.tensorflow.org/versions/master/get_started/)
---




## <span style="color:#0b486b">1. Getting started with TensorFlow</span>

TensorFlow is a powerful open source software library for numerical computation, particularly well-suited for large-scale Machine Learning and highly-optimized for Deep Learning. Its basic principle is simple: you first define in Python a graph of computations to perform, and then TensorFlow takes that graph and runs it efficiently using optimized C++ code.

TensorFlow has many advanced features. The **coolest** are:
* ***Programmer friendly***: The front-end high-level Python API of TensorFlow offers much more flexibility (at the cost of higher complexity) to efficiently create all sorts of computations, including any neural network architecture you can come up with. In addition, there are many higher-level packages that serve as TensorFlow's wrappers to provide even simpler APIs such as TF.Learn ([http://tflearn.org/](http://tflearn.org/), compatible with Scikit-Learn), Keras ([https://keras.io/](https://keras.io/)), to name a few. You can use these APIs to train various types of neural networks in just a few lines of code.
* ***Machine learner friendly***: TensorFlow automatically takes care of computing the gradients of the functions you define. This is called automatic differentiation (or `autodiff`).
* ***TensorBoard***: TensorFlow also comes with a great visualization tool called *TensorBoard* that allows you to browse through the computation graph, view learning curves, and more. This feature is extremely useful for researchers.
* ***Highly-optimized back-end***: TensorFlow includes highly efficient C++ implementations of many machine learning operations, particularly those needed to build neural networks. There is also a C++ API to define your own high-performance operations. TensorFlow can train a network with millions of parameters on a training set composed of billions of instances with millions of features each.
* ***Device switchable***: You can choose the device an operation runs on with one line of code, for example `with tf.device('/cpu:0'):` or `with tf.device('/gpu:0'):`.
* ***Parallelized and Distributed***: It is possible to break up the graph into several chunks and run them in parallel across multiple CPUs or GPUs. Moreover, TensorFlow also supports distributed computing, so you can train colossal neural networks on humongous training sets in a reasonable amount of time by splitting the computations across hundreds of servers.
* ***Cross-platform***: TensorFlow can run not only on Windows, Linux, and macOS, but also on mobile devices, including both iOS and Android.
* Last but not least, TensorFlow was developed by the **Google Brain** team and has long been supported and maintained by **Google**. It powers many of Google’s large-scale services, such as Google Cloud Speech, Google Photos, and Google Search.
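
To make the idea of automatic differentiation more concrete, here is a minimal plain-Python sketch. It uses forward-mode differentiation with dual numbers, which is a simpler technique than the reverse-mode approach TensorFlow uses, so treat it as an illustration of the concept only. The class `Dual` and the function `example_fn` are hypothetical names introduced for this sketch.

```python
# Toy forward-mode autodiff with dual numbers (NOT how TensorFlow does it).
class Dual:
    """A value paired with its derivative with respect to one chosen input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)


def example_fn(x, y):
    return x * x * y + y + 2


# Seed the derivative of the input we differentiate with respect to.
df_dx = example_fn(Dual(3.0, 1.0), Dual(4.0)).deriv  # analytically 2*x*y
df_dy = example_fn(Dual(3.0), Dual(4.0, 1.0)).deriv  # analytically x*x + 1
print(df_dx, df_dy)
```

The derivatives are obtained without writing any gradient formula by hand, which is the convenience that autodiff provides.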

## <span style="color:#0b486b">2. Installation</span>


### 2.1 How to install TensorFlow

<span style="color:blue">**Step 1.**</span> Install Anaconda (if you have Anaconda installed, you can skip this step)
  
<span style="color:blue">**Step 2.**</span> Install TensorFlow on Windows with CPU:
**`pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.2.1-cp36-cp36m-win_amd64.whl`**

If you want to install Tensorflow on other OSs and/or with GPU, you can follow the instructions in [this guide](https://www.tensorflow.org/install/).

### 2.2<span style="color:#0b486b"> Testing the installation</span>

<span style="color:blue">**Step 1.**</span> Import TensorFlow and print out the version

In [None]:
import tensorflow as tf
print(tf.__version__)

<span style="color:blue">**Step 2.**</span> Test TensorBoard

In [None]:
import tensorflow as tf

g = tf.Graph()

with g.as_default():
    a = tf.constant(5., name='a')
    b = tf.constant(2., name='b')
    c = tf.multiply(a, b, name='c')

log_dir = 'tf_logs/example00'
writer = tf.summary.FileWriter(log_dir, g)  # write the graph to an event file in folder log_dir

with tf.Session(graph=g) as sess:
    print(sess.run(c))

writer.close()  # Close the writer when you're done with it

In [None]:
!ls tf_logs/example00

* Now open Anaconda prompt and change to the directory that contains your notebook (and the `tf_logs` folder) then type:  
`tensorboard --logdir=tf_logs/example00`

* The terminal will display: Starting Tensorboard b'47' at http://0.0.0.0:6006 or http://[computer_name]:6006
* Open your browser and go to http://localhost:6006/. Click "Graphs" and you will see:  

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/dl/example00/TensorBoard.jpg", width=800>

* If you want to open tensorboard at **another** port such as 9009, type the following in the terminal:  
`tensorboard --logdir='tf_logs/example00' --port=9009`  

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/note.gif" width="37", align="left"></img> *Some Windows users may have trouble with TensorBoard. If so, type the following in the terminal instead:*<br>
`tensorboard --logdir=foo:tf_logs/example00`

## 2. A quick tour of TensorFlow

After TensorFlow is installed, we can import it as follows:

In [None]:
import tensorflow as tf

Let's go with a very simple piece of code first!

In [None]:
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x * x * y + y + 2

The code does not actually perform any computation yet. It just creates a computation graph. In fact, even the variables are not initialized yet.
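
The separation between building a graph and running it can be mimicked in a few lines of plain Python. The sketch below is a toy model, not TensorFlow's implementation: the `Node` class and the `as_node` helper are hypothetical names introduced only for illustration.

```python
# Toy model of deferred execution (NOT TensorFlow's implementation).
class Node:
    """A graph node that stores how to compute a value, not the value itself."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        return Node("add", (self, as_node(other)))

    def __mul__(self, other):
        return Node("mul", (self, as_node(other)))

    def evaluate(self):
        if self.op == "const":
            return self.value
        left, right = (n.evaluate() for n in self.inputs)
        return left + right if self.op == "add" else left * right


def as_node(x):
    """Wrap a plain number in a constant node; leave nodes untouched."""
    return x if isinstance(x, Node) else Node("const", value=x)


x = as_node(3)
y = as_node(4)
f = x * x * y + y + 2     # builds nodes only; nothing is computed here
result = f.evaluate()     # the computation happens now
print(result)
```

Until `evaluate()` is called, `f` is only a description of the computation, in the same spirit as the TensorFlow graph above.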

### <span style="color:#0b486b">2.1. TensorFlow session</span>

To evaluate this graph, you need to open a TensorFlow ***session*** and use it to initialize the variables and evaluate the function *`f`*. A TensorFlow session takes care of placing the operations onto devices such as CPUs and GPUs and running them, and it holds all the variable values.

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/note.gif" width="20", align="left"></img> &nbsp; In distributed TensorFlow, variable values are stored on the servers instead of the session.

The following code creates a session, initializes the variables, evaluates function *`f`* then closes the session (which frees up resources):

In [None]:
sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)
print(result)
sess.close()

* **`with`** block: If you don't want to repeat *`sess.run()`* all the time, you can use a *`with`* statement as below:

In [None]:
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()
    print(result)

Inside the *`with`* block, the session is set as the default session. Calling *`x.initializer.run()`* is equivalent to calling *`tf.get_default_session().run(x.initializer)`*, and similarly *`f.eval()`* is equivalent to calling *`tf.get_default_session().run(f)`*. This makes the code easier to read.<br> 

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/note.gif" width="20", align="left"></img> &nbsp; We don't need to call *`sess.close()`* here because the *`with`* context manager does it automatically.
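
The two behaviours of the *`with`* block described here (the session becomes the default one inside the block, and it is closed on exit) can be mimicked with a small context manager. This is a toy sketch under stated assumptions: `ToySession`, `get_default_session` and `_default_sessions` are hypothetical stand-ins written for this example, not TensorFlow classes.

```python
_default_sessions = []  # hypothetical stack of currently active sessions


class ToySession:
    def __init__(self):
        self.closed = False

    def run(self, fn):
        return fn()

    def close(self):
        self.closed = True

    def __enter__(self):
        _default_sessions.append(self)   # become the default session
        return self

    def __exit__(self, exc_type, exc, tb):
        _default_sessions.pop()          # stop being the default session
        self.close()                     # closed automatically on exit
        return False


def get_default_session():
    return _default_sessions[-1] if _default_sessions else None


with ToySession() as sess:
    inside = get_default_session() is sess
    value = get_default_session().run(lambda: 3 * 3 * 4 + 4 + 2)

print(inside, value, sess.closed, get_default_session())
```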

* **Interactive session**: Alternatively, you can do away with the *`with`* block by creating an interactive session (*`InteractiveSession`*) when you are working in a Jupyter notebook or a Python shell. The only difference from a regular Session is that when an *`InteractiveSession`* is created, it automatically sets itself as the default session. However, you do need to close the session manually when you are done.

In [None]:
sess = tf.InteractiveSession()
x.initializer.run()
y.initializer.run()
print(f.eval())
sess.close()

* **Global variables initialization**: Instead of manually running the initializer for every single variable, you can use the *`global_variables_initializer()`* function, and run it.

In [None]:
init = tf.global_variables_initializer()  # prepare an init node
with tf.Session() as sess:
    init.run()  # actually initialize all the variables
    result = f.eval()
    print(result)

### <span style="color:#0b486b">2.2. Managing TensorFlow graphs</span>

Any node you create is automatically added to the default graph. 

In [None]:
x1 = tf.Variable(1)
x1.graph is tf.get_default_graph()

In most cases this is fine, but sometimes you may want to manage multiple independent graphs. You can do this by creating a new *`Graph`* and temporarily making it the default graph inside a *`with`* block, like so:

In [None]:
graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)
x2.graph is graph

In [None]:
x2.graph is tf.get_default_graph()

Sometimes the default graph can accumulate duplicated nodes (variables), for example when a notebook cell is run several times. We can reset it to get a clean graph as follows:

In [None]:
tf.reset_default_graph()

### <span style="color:#0b486b">2.3. Lifecycle of a node value</span>

When you evaluate a node, TensorFlow automatically determines the set of nodes that it depends on and it evaluates these nodes first. Let's look at the following code:

In [None]:
w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval())  # 10
    print(z.eval())  # 15

The code evaluates *`y`* and *`z`*. For *`y`*, TensorFlow automatically detects that *`y`* depends on *`x`*, which depends on *`w`*, so it first evaluates *`w`*, then *`x`*, and then *`y`*. The schedule is: *`w -> x -> y`*. Likewise, the order for evaluating *`z`* is: *`w -> x -> z`*. It is important to note that it will ***not reuse*** the result of the previous evaluation of *`w`* and *`x`*. In short, the preceding code evaluates *`w`* and *`x`* ***twice***.<br>

All node values are dropped between graph runs, except variable values, which are maintained by the session across graph runs. A variable starts its life when its initializer is run, and it ends when the session is closed.

If you want to evaluate *`y`* and *`z`* more efficiently, i.e., without evaluating *`w`* and *`x`* twice as in the previous code, you must ask TensorFlow to simultaneously evaluate both *`y`* and *`z`* in just one graph run as follows:

In [None]:
with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val)  # 10
    print(z_val)  # 15
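
The cost difference between two separate runs and a single combined run can be illustrated with a toy evaluator that counts how often each node is computed. This is a plain-Python sketch, not TensorFlow code; the `run` function and the `counts` dictionary are hypothetical names, and the per-run `cache` stands in for node values that are dropped between graph runs.

```python
counts = {"w": 0, "x": 0}  # how many times each shared node is computed


def run(targets):
    """Evaluate the requested targets in ONE run; `cache` is discarded afterwards."""
    cache = {}

    def get(name):
        if name in cache:
            return cache[name]
        if name == "w":
            counts["w"] += 1
            val = 3
        elif name == "x":
            val = get("w") + 2
            counts["x"] += 1
        elif name == "y":
            val = get("x") + 5
        else:  # "z"
            val = get("x") * 3
        cache[name] = val
        return val

    return [get(t) for t in targets]


# Two separate runs: w and x are each computed twice.
run(["y"])
run(["z"])
separate = dict(counts)

# One combined run: w and x are each computed once.
counts.update(w=0, x=0)
y_val, z_val = run(["y", "z"])
combined = dict(counts)

print(separate, combined, y_val, z_val)
```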

Sometimes you have two **independent** operations (ops) but would like to specify which operation should run **first**. You can create a context manager that specifies control dependencies for all operations constructed within the context. For example, the following code triggers an increment of *`global_step`* each time we compute *`learning_rate`*:

In [None]:
tf.reset_default_graph()

starter_lr = 1.
decay_rate = 0.9
global_step = tf.Variable(0., trainable=False)
incr = tf.assign(global_step, global_step + 1)

with tf.control_dependencies([incr]):
    learning_rate = starter_lr * tf.pow(decay_rate, global_step)
    
with tf.Session() as sess:
    global_step.initializer.run()
    for i in range(5):
        print('Global Step %d: Learning rate = %f' % (global_step.eval(), learning_rate.eval()))

**Without** control dependencies, *`global_step`* will stay at `0.0`, and *`learning_rate`* will be `1.0`:

In [None]:
tf.reset_default_graph()

starter_lr = 1.
decay_rate = 0.9
global_step = tf.Variable(0., trainable=False)
incr = tf.assign(global_step, global_step + 1)

learning_rate = starter_lr * tf.pow(decay_rate, global_step)
    
with tf.Session() as sess:
    global_step.initializer.run()
    for i in range(5):
        print('Global step %d: Learning rate = %f' % (global_step.eval(), learning_rate.eval()))
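
For reference, the learning rate in both versions follows the exponential decay formula `learning_rate = starter_lr * decay_rate ** global_step`. The plain-Python sketch below works through that arithmetic for the version with control dependencies, assuming the increment is applied before each evaluation so that the step runs from 1 to 5. The name `schedule` is introduced only for this example.

```python
starter_lr = 1.0
decay_rate = 0.9

# Learning rate after each of the first five increments of global_step.
schedule = [starter_lr * decay_rate ** step for step in range(1, 6)]

for step, lr in enumerate(schedule, start=1):
    print("global_step=%d -> learning_rate=%f" % (step, lr))
```

Without the increment, the step stays at 0 and the formula gives `1.0 * 0.9 ** 0 = 1.0` every time, which matches the behaviour described above.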
        

### <span style="color:#0b486b">2.4. Placeholder</span>

Suppose that you are evaluating a function *`y`* that takes an input *`x`*. You want to do this many times, changing the input *`x`* each time (as in an iterative algorithm where the input data is replaced at every iteration). The simplest way to do this is to use *`placeholder`* nodes. These nodes are special because they don’t actually perform any computation; they just output the data you tell them to output at runtime. They are typically used to pass the training data to TensorFlow during training. Let's look at a simple example:

In [None]:
import numpy as np
import tensorflow as tf
tf.reset_default_graph()

x = tf.placeholder(tf.float32, shape=[None, 3])
y = x + 2
with tf.Session() as sess:
    print(y.eval(feed_dict={x: np.ones([1, 3])}))  #  feed 1x3 array 
    print(y.eval(feed_dict={x: np.zeros([2, 3])})) #  feed 2x3 array 

If you don’t specify a value at runtime for a placeholder, you get an exception. Try this:

In [None]:
with tf.Session() as sess:
    print(y.eval())
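
The `None` in `shape=[None, 3]` means that the first dimension may have any size, which is why both a 1x3 and a 2x3 array could be fed above. The following plain-Python sketch illustrates that matching rule. The helper `is_compatible` is hypothetical and written only for this example; it is not a TensorFlow function.

```python
def is_compatible(actual_shape, declared_shape):
    """Return True if actual_shape fits declared_shape, where None matches any size."""
    if len(actual_shape) != len(declared_shape):
        return False
    return all(d is None or a == d for a, d in zip(actual_shape, declared_shape))


declared = [None, 3]
one_row = is_compatible((1, 3), declared)     # any number of rows is accepted
two_rows = is_compatible((2, 3), declared)
wrong_cols = is_compatible((2, 4), declared)  # second dimension must be 3
print(one_row, two_rows, wrong_cols)
```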

### <span style="color:#0b486b">2.5. Save and restore models</span>

Once you have trained your model, you should save its parameters to disk so that you can come back to it whenever you want, use it in another program, compare it to other models, and so on. Moreover, you probably want to save checkpoints at regular intervals during training so that if your computer crashes or loses power during training you can continue from the last checkpoint rather than start over from scratch.

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/note.gif" width="20", align="left"></img> You might get an error if the folder path `models/example01` does not exist. You can create this folder in the directory containing this notebook.
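
One way to create the folder from within the notebook is with the standard library, as in this small sketch. The variable name `checkpoint_dir` is introduced only for this example.

```python
import os

checkpoint_dir = "models/example01"          # folder used for the checkpoints in this section
os.makedirs(checkpoint_dir, exist_ok=True)   # no error if the folder already exists
print(os.path.isdir(checkpoint_dir))
```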


In [None]:
tf.reset_default_graph()

theta = tf.Variable(tf.zeros([5]), name='theta')
train_op = tf.assign(theta, theta + 1.)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

n_epochs = 200
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        sess.run(train_op)
        if (epoch + 1) % 100 == 0:  # checkpoint every 100 epochs
            saver.save(sess, "models/example01/save_and_restore.ckpt")
        
    best_theta = theta.eval()
    print(best_theta)

Restoring a model is just as easy: you create a `Saver` at the end of the construction phase just like before, but then at the beginning of the execution phase, instead of initializing the variables using the `init` node, you call the `restore()` method of the `Saver` object:

In [None]:
tf.reset_default_graph()

theta = tf.Variable(tf.zeros([5]), name='theta')
train_op = tf.assign(theta, theta + 1.)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

n_epochs = 200
with tf.Session() as sess:
    saver.restore(sess, "models/example01/save_and_restore.ckpt")
    for epoch in range(n_epochs):
        sess.run(train_op)
        if (epoch + 1) % 100 == 0:  # checkpoint every 100 epochs
            saver.save(sess, "models/example01/save_and_restore_cont.ckpt")
        
    best_theta = theta.eval()
    print(best_theta)

After restoring and training for another 200 steps, `best_theta` is now `[400.  400.  400.  400.  400.]`.

### <span style="color:#0b486b">2.6. Visualize computational graph and learning curves in TensorBoard</span>

Normally, we rely on the `print()` function and `matplotlib` to visualize progress during training. There is a better way, i.e., using TensorBoard. If you feed it some training stats, it will display nice interactive visualizations of
these stats in your web browser (e.g., learning curves). You can also provide it the graph’s definition and
it will give you a great interface to browse through it. This is very useful to identify errors in the graph, to
find bottlenecks, and so on. Let's visualize learning rate and global step in the example above.

In [None]:
# construction
tf.reset_default_graph()

starter_lr = 1.
decay_rate = 0.9
global_step = tf.Variable(0., trainable=False)
incr = tf.assign(global_step, global_step + 1)

with tf.control_dependencies([incr]):
    learning_rate = starter_lr * tf.pow(decay_rate, global_step)

With the graph constructed as usual, we now add the summary ops:

In [None]:
tf.summary.scalar('learning_rate', learning_rate)
tf.summary.scalar('global_step', global_step)
merged = tf.summary.merge_all() # Merges all summaries collected in the default graph

The first two lines create two summary *`ops`* in the graph that will evaluate the *`learning_rate`* and *`global_step`* value and write them to a TensorBoard compatible binary log string called a summary. The third line creates a node that merges all summaries collected in the default graph. In the execution phase, you'll need to evaluate the merged node regularly during training (e.g., every 10 mini-batches). This will output a summary that you can then write to the events file using the *`file_writer`*.

In [None]:
import time
logdir = "tf_logs/example01/model-at-{}".format(time.strftime('%Y-%m-%d_%H.%M.%S'))
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

You need to use a different log directory every time you run your program, or else TensorBoard will merge stats from different runs, which will mess up the visualizations. The simplest solution for this is to include a timestamp in the log directory name.

Now for the execution phase. Evaluate the *`merged`* node at each step to obtain a summary, then write it to the events file using the *`file_writer`*:

In [None]:
with tf.Session() as sess:
    global_step.initializer.run()
    for i in range(50):
        merged_ = merged.eval()
        file_writer.add_summary(merged_, i + 1)

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/warning.png" width="40", align="left"></img> In an actual deep learning implementation, avoid logging training stats at every single training step, as this would significantly slow down training. Instead, log at intervals, for example every 200 iterations.
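
A common way to log at intervals is to test the step counter with the modulo operator. The sketch below shows only that control flow in plain Python. The names `log_every`, `n_steps` and `logged_steps` are assumptions made for this example, and the `append` call stands in for evaluating the merged summary and calling `file_writer.add_summary`.

```python
log_every = 200      # logging interval, using the figure of 200 iterations from the note
n_steps = 1000       # hypothetical number of training steps
logged_steps = []

for step in range(1, n_steps + 1):
    # ... one training step would run here ...
    if step % log_every == 0:
        logged_steps.append(step)  # stand-in for writing a summary at this step

print(logged_steps)
```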

Finally, you want to close the FileWriter at the end of the program:

In [None]:
file_writer.close()

Great! Now it’s time to fire up the TensorBoard server. You need to activate your virtual environment
if you created one, then start the server by running the *`tensorboard`* command, pointing it to the root log
directory. This starts the TensorBoard web server, listening on port 6006 (which is “goog” written upside
down :) )
Next open a browser and go to http://0.0.0.0:6006/ (or http://localhost:6006/). Welcome to
TensorBoard! In the Scalars tab, you'll see *`global_step`* and *`learning_rate`*:

<img src='https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/dl/example01/learning_rate.png' width=300>

### <span style="color:#0b486b">2.7. Name Scopes</span>

When dealing with more complex models such as neural networks, the graph can easily become cluttered with thousands of nodes. To avoid this, you can create name scopes to group related nodes. The following code defines several sum and product operations under a single name scope, `calc`:

In [None]:
tf.reset_default_graph()

a = tf.placeholder(tf.float32, shape=(), name='a')
b = tf.placeholder(tf.float32, shape=(), name='b')

with tf.name_scope('calc'):
    c = a + b
    d = a + b
    e = a * b
    f = tf.add(a, b, name='sum')

In [None]:
print(c.op.name)
print(d.op.name)
print(e.op.name)
print(f.op.name)

The name of each op defined within the scope is now prefixed with "calc/". `c`, `d`, and `e` get generic names, while `f` gets the name 'sum' that we passed into the operation. Note that `c` and `d` share the same generic name *`add`*, but `d` is created later, so it gets the name *`calc/add_1`*.

For more detail about *`tf.name_scope`*, read: https://www.tensorflow.org/api_docs/python/tf/Graph#name_scope.
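
The naming rule described in this section (scope prefix plus a numeric suffix for repeated names) can be sketched in plain Python. The `NameRegistry` class below is a hypothetical helper written for illustration. It is a simplification and does not reproduce every detail of how TensorFlow resolves name clashes.

```python
class NameRegistry:
    """Hypothetical helper mimicking how repeated op names get a numeric suffix."""

    def __init__(self):
        self._used = {}  # full name -> number of times it has been requested

    def unique(self, scope, name):
        full = "%s/%s" % (scope, name)
        count = self._used.get(full, 0)
        self._used[full] = count + 1
        return full if count == 0 else "%s_%d" % (full, count)


names = NameRegistry()
# Same sequence of ops as in the 'calc' scope: two adds, one mul, one named 'sum'.
generated = [names.unique("calc", n) for n in ["add", "add", "mul", "sum"]]
print(generated)
```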

### <span style="color:#0b486b">2.8. Modularity</span>

Suppose you want to create a graph that adds the output of two [rectified linear units (ReLU)](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)). A ReLU computes a linear function of the inputs, and outputs the result if it is positive, and 0 otherwise, as shown in the following equation:

$$h_{\mathbf{W}, \mathbf{b}}(\mathbf{x})=\max(\mathbf{W}^T\mathbf{x} + \mathbf{b}, 0)$$
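
As a quick numeric check of this equation, here is a plain-Python sketch for a single input vector. The function name `relu_scalar` and the example weights, bias and inputs are made up for illustration.

```python
def relu_scalar(weights, x, bias):
    """Compute max(w . x + b, 0) for one input vector."""
    z = sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias
    return max(z, 0.0)


weights = [1.0, -2.0, 0.5]
bias = -1.0

positive = relu_scalar(weights, [2.0, 1.0, 4.0], bias)  # z = 2 - 2 + 2 - 1 = 1
clipped = relu_scalar(weights, [0.0, 3.0, 0.0], bias)   # z = -6 - 1 = -7, clipped to 0
print(positive, clipped)
```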

The following does the job but is quite repetitive:

In [None]:
tf.reset_default_graph()

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")

b1 = tf.Variable(0.0, name="bias1")
b2 = tf.Variable(0.0, name="bias2")

z1 = tf.add(tf.matmul(X, w1), b1, name="z1")
z2 = tf.add(tf.matmul(X, w2), b2, name="z2")

relu1 = tf.maximum(z1, 0., name="relu1")
relu2 = tf.maximum(z2, 0., name="relu2")

output = tf.add(relu1, relu2, name="output")

logdir = 'tf_logs/example01/modularity'
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
file_writer.close()

<img src='images/example01/graph01.PNG' width=500>

The graph looks unorganized and hard to follow! If we wanted to create many ReLUs and output their sum in this way, the graph would become even messier.

The following code creates five ReLUs and outputs their sum (note that *`add_n()`* creates an operation that will compute the sum of a list of tensors):

In [None]:
tf.reset_default_graph()

def relu(X):
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="linear")
    return tf.maximum(z, 0., name="relu")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

logdir = 'tf_logs/example01/modularity_clean'
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
file_writer.close()

Note that when you create a node, TensorFlow checks whether its name already exists, and if it does, TensorFlow appends an underscore followed by an index to make the name unique. So the first ReLU contains nodes named "weights", "bias", "linear", and "relu" (plus many more nodes with their default name, such as "MatMul"); the second ReLU contains nodes named "weights_1", "bias_1", and so on; the third ReLU contains nodes named "weights_2", "bias_2", and so on. TensorBoard identifies such series and collapses them together to reduce clutter.

<img src='https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/dl/example01/graph02.PNG' width=500>

Let's try using name scopes to see if we can make the graph clearer!

In [None]:
tf.reset_default_graph()

def relu(X):
    with tf.name_scope('relu'):
        w_shape = (int(X.get_shape()[1]), 1)
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        z = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(z, 0., name="relu")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

logdir = 'tf_logs/example01/modularity_clearer'
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
file_writer.close()

<img src='https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/dl/example01/graph03.PNG' width=600>

The graph now looks much clearer. Notice that TensorFlow also gives the name scopes unique names by appending _1, _2, and so on. If we expand one of the ReLUs, we'll see:

<img src='https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/dl/example01/graph04.PNG' width=600>

<img src="https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/note.gif" width="20", align="left"></img> &nbsp; Weights and bias are now within the name scope `relu_2`, so we don't see `_2` appended to their names.

### Sharing Variables

Suppose you want to control the ReLU threshold (currently hardcoded to 0) using a shared threshold variable for all ReLUs. You can use the *`get_variable()`* function to create the shared variable if it does not exist yet, or reuse it if it already exists. The desired behavior (creating or reusing) is controlled by an attribute of the current *`variable_scope()`*. For example, the following code will create a variable named *`"relu/threshold"`* (as a scalar, since *`shape=()`*, and using
0.0 as the initial value):

In [None]:
with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))

<img src="https://github.com/tuliplab/mds/blob/master/Jupyter/image/warning.png" width="40", align="left"></img> If the variable has already been created by an earlier call to *`get_variable()`*, this code will raise an exception. This behavior prevents reusing variables by mistake. If you want to reuse a variable, you need to explicitly say so by setting the variable scope’s reuse attribute to *`True`*.


In [None]:
with tf.variable_scope("relu", reuse=True):
    threshold = tf.get_variable("threshold")

This code will fetch the existing *`"relu/threshold"`* variable, or raise an exception if it does not exist or if it was not created using *`get_variable()`*. Alternatively, you can set the reuse attribute to *`True`* inside the block by calling the scope’s *`reuse_variables()`* method:

In [None]:
with tf.variable_scope("relu") as scope:
    scope.reuse_variables()
    threshold = tf.get_variable("threshold")
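
The create-or-reuse rule can be summarised with a toy registry in plain Python. This is a sketch of the semantics described in this section only: `toy_get_variable` and `_variables` are hypothetical names, and real variable scopes do considerably more than this.

```python
_variables = {}  # hypothetical store: full variable name -> value


def toy_get_variable(scope, name, initial_value=None, reuse=False):
    full_name = "%s/%s" % (scope, name)
    if reuse:
        # Reuse mode: the variable must already exist.
        if full_name not in _variables:
            raise ValueError("%s does not exist" % full_name)
        return full_name, _variables[full_name]
    # Create mode: the variable must NOT exist yet.
    if full_name in _variables:
        raise ValueError("%s already exists; set reuse=True" % full_name)
    _variables[full_name] = initial_value
    return full_name, initial_value


created = toy_get_variable("relu", "threshold", initial_value=0.0)
reused = toy_get_variable("relu", "threshold", reuse=True)

try:
    toy_get_variable("relu", "threshold", initial_value=0.0)  # second creation attempt
    duplicate_error = False
except ValueError:
    duplicate_error = True

print(created, reused, duplicate_error)
```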

Now you have all the pieces you need to make the *`relu()`* function access the threshold variable without having to pass it as a parameter:

In [None]:
tf.reset_default_graph()

def relu(X):
    with tf.variable_scope('relu', reuse=True):
        threshold = tf.get_variable("threshold")
        w_shape = (int(X.get_shape()[1]), 1)
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        z = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(z, threshold, name="max")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

with tf.variable_scope("relu"): # create the variable
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
    
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name="output")

logdir = 'tf_logs/example01/customized_relu'
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
file_writer.close()

<img src='images/example01/graph05.PNG' width=800>

This code first defines the *`relu()`* function, then creates the *`relu/threshold`* variable (as a scalar that will later be initialized to 0.0) and builds five ReLUs by calling the *`relu()`* function. The *`relu()`* function reuses the *`relu/threshold`* variable, and creates the other ReLU nodes. Therefore, we see the "scalar" connections between the first created ReLU and the other 5 ReLUs, meaning that 5 ReLUs reuse the threshold scalar from the first created ReLU.

To avoid creating the first, unused ReLU, we can pass a *`reuse`* argument to the function *`relu()`* and set *`reuse=True`* for every call after the first ReLU is created.

In [None]:
tf.reset_default_graph()

def relu(X, reuse=False):
    with tf.variable_scope('relu') as scope:
        if reuse:
            scope.reuse_variables()
            
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
        
        w_shape = (int(X.get_shape()[1]), 1)
        w = tf.Variable(tf.random_normal(w_shape), name="weights")
        b = tf.Variable(0.0, name="bias")
        z = tf.add(tf.matmul(X, w), b, name="linear")
        return tf.maximum(z, threshold, name="max")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

relus = [relu(X, reuse=i>0) for i in range(5)]
output = tf.add_n(relus, name="output")

logdir = 'tf_logs/example01/customized_relu2'
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
file_writer.close()

<img src='https://raw.githubusercontent.com/tuliplab/mds/master/Jupyter/image/dl/example01/graph06.PNG' width=600>

This time, exactly 5 ReLUs are created, with *`ReLU_[1,2,3,4]`* reusing the threshold scalar from the first ReLU.