# Machine Learning (Summer 2018)

## Practice Session 8

June 5th, 2018

Ulf Krumnack

Institute of Cognitive Science
University of Osnabrück

## Today's Session

* exercise sheet 09
* TensorFlow basics

# TensorFlow Basics



Prerequisites:
* you have to install TensorFlow into your conda environment (`ml`)
```sh
conda install tensorflow
```
(There is also a package `tensorflow-gpu` that adds CUDA-based GPU computation.)

In [None]:
import tensorflow as tf
tf.InteractiveSession()
tf.__version__

## Workflow of TensorFlow

TensorFlow is based on computational graphs:
1. Define a graph
2. Run the graph in a session

## Computational graphs



In [None]:
tf.reset_default_graph()

one = tf.constant(1, name="a")
two = tf.constant(2, name="b")
three = tf.constant(3, name="c")

# sum = tf.add(one,two, name="our_sum")
#sum2 = one + two
mult = (one + two) * three

### Inspecting the graph

In [None]:
tf.get_default_graph().get_operations()

### Visualizing the Graph

TensorFlow provides its own tool for visualization tasks: TensorBoard

Idea:
1. store data to visualize
2. run TensorBoard to visualize this data

Here: we want to visualize the graph structure

Step 1: Create a log directory (make sure that it is empty):

In [None]:
log_dir = 'mylog'

def prepare_log_dir():
    if tf.gfile.Exists(log_dir):
        tf.gfile.DeleteRecursively(log_dir)
    tf.gfile.MakeDirs(log_dir)

prepare_log_dir()

Step 2: Write the graph info into that directory

In [None]:
writer = tf.summary.FileWriter(log_dir, tf.get_default_graph())
writer.close()

Step 3: Start TensorBoard to view the graph

In [None]:
!tensorboard --logdir=mylog

## Sessions

A session places a graph onto a computation device and provides methods to execute it.

In [None]:
sess = tf.Session()

To evaluate a graph we have to run it in a session:

In [None]:
result = sess.run(one)
print(result)

`sess.run` computes the value of a target tensor, for example the sum `one + two`, by
* first computing all predecessors of the target
* applying the target's operation (here `+`) to these values

The value of the target tensor(s) is returned.
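
The evaluation order described above can be sketched in plain Python. This is only a toy model under simplifying assumptions, not how TensorFlow is implemented: the `Node` class, the `run` function and the `trace` list are hypothetical names introduced for illustration. The graph mirrors `(one + two) * three` from the earlier cell.

```python
import operator

class Node:
    """A node in a toy computational graph (illustration only)."""
    def __init__(self, name, op=None, inputs=(), value=None):
        self.name = name      # label, similar to the `name` argument in TensorFlow
        self.op = op          # function combining the input values (None for constants)
        self.inputs = inputs  # predecessor nodes
        self.value = value    # fixed value, used by constants only

def run(target, trace):
    """Evaluate `target`; append node names to `trace` in evaluation order."""
    if target.op is None:
        result = target.value
    else:
        # First compute all predecessors, then apply the operation.
        args = [run(node, trace) for node in target.inputs]
        result = target.op(*args)
    trace.append(target.name)
    return result

# Step 1: define the graph (nothing is computed yet).
one = Node("a", value=1)
two = Node("b", value=2)
three = Node("c", value=3)
add = Node("add", op=operator.add, inputs=(one, two))
mult = Node("mul", op=operator.mul, inputs=(add, three))

# Step 2: run the graph for a chosen target.
trace = []
result = run(mult, trace)
print(result)  # 9
print(trace)   # ['a', 'b', 'add', 'c', 'mul']
```

The trace shows that the constants and the sum are evaluated before the product that depends on them.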

An alternative notation:

In [None]:
result = mult.eval(session=sess)
print(result)

The main difference is that `run` can evaluate multiple tensors at once:

In [None]:
result = sess.run((one,mult,two))
print(result)

## Placeholders

TensorFlow allows placeholders to be put into a graph:
* a placeholder is an empty node in the graph
* a value for a placeholder is provided when the graph is run in a session

Placeholders can be created with the `tf.placeholder` function:

In [None]:
tf.reset_default_graph()

param_x = tf.placeholder(dtype=tf.float32, name='x')
param_y = tf.placeholder(dtype=tf.float32, name='y')

sum_params = param_x + param_y

When visualizing the graph, placeholders are depicted as follows:

In [None]:
prepare_log_dir()

writer = tf.summary.FileWriter(log_dir, tf.get_default_graph())
writer.close()

In [None]:
!tensorboard --logdir=mylog

To supply values for the placeholders upon execution, a `feed_dict` has to be passed:

In [None]:
with tf.Session() as sess:
    result = sess.run(sum_params, feed_dict={param_x: 3, param_y: 2})
print(result)

This is also possible when using the `eval` method:

In [None]:
with tf.Session() as sess:
    result = sum_params.eval(session=sess, feed_dict={param_x: 1, param_y: 2})
print(result)

Instead of scalars, one can also provide vectors (or other arrays) for the placeholders:

In [None]:
with tf.Session() as sess:
    result = sum_params.eval(session=sess, feed_dict={param_x: [1,2,3], param_y: 3})
print(result)
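
In the cell above a vector is added to a scalar: the scalar is applied to every element of the vector (broadcasting). A minimal plain-Python sketch of that behaviour follows. It assumes only scalars and flat lists, and `add_broadcast` is a hypothetical helper, not a TensorFlow function.

```python
def add_broadcast(a, b):
    """Add two values that are each either a scalar or a flat list.

    Simplification: real broadcasting also handles higher-dimensional
    arrays and length-1 axes; this sketch does not.
    """
    a_is_list, b_is_list = isinstance(a, list), isinstance(b, list)
    if a_is_list and b_is_list:
        if len(a) != len(b):
            raise ValueError("lengths do not match")
        return [float(u + v) for u, v in zip(a, b)]
    if a_is_list:
        return [float(u + b) for u in a]   # scalar b is added to every element
    if b_is_list:
        return [float(a + v) for v in b]   # scalar a is added to every element
    return float(a + b)

result = add_broadcast([1, 2, 3], 3)
print(result)  # [4.0, 5.0, 6.0]
```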

Exercise:
* Implement the function $f: x\mapsto (x-4.2)^2$ as a computational graph
* Run the graph to evaluate $f(0)$, $f(4)$ and $f(5)$
* Plot the function using matplotlib
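
As a check for the second step, the expected values follow directly from the definition (a graph using `float32` may print slightly different digits):

$$f(0) = (0-4.2)^2 = 17.64,\qquad f(4) = (4-4.2)^2 = 0.04,\qquad f(5) = (5-4.2)^2 = 0.64$$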

In [None]:
import matplotlib.pyplot as plt

# plt.plot(*t) unpacks the list: equivalent to plt.plot(1, 2, '*')
t = [1,2,'*']
plt.figure()
plt.plot(*t)
plt.show()

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = tf.placeholder(dtype=tf.float32, name="x")
c = tf.constant(4.2)

f = (x-c)**2

xvals = np.linspace(0,10,100)
sess = tf.Session()
plt.figure()
plt.plot(*sess.run((x,f),feed_dict={x:xvals}))
plt.show()
# YOUR CODE HERE

## Variables

TensorFlow variables are nodes that hold a value that can change:
* they have to be given an initial value upon creation
* the value of a variable can be changed
* the main idea of variables is to hold model parameters

Upon creation, an initial value has to be provided:
* in practice, this often is some random number

In [None]:
tf.reset_default_graph()

var_theta = tf.Variable(3, name="theta")
var_rand = tf.Variable(tf.random_normal(shape = []))

The variables of a graph have to be initialized before they can be evaluated:
* this can be achieved with the method `tf.global_variables_initializer()`

In [None]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run((var_theta, var_rand)))

## Gradient descent

In [None]:
tf.reset_default_graph()

x = tf.Variable(tf.zeros(dtype=tf.float32, shape=[]), trainable=True, name='X')
c = tf.placeholder(dtype=tf.float32)
f = (x - c) ** 2

In [None]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print("f({}) = {}".format(*sess.run((x,f), feed_dict={c: 4.2})))

As TensorFlow knows the function definition, it can automatically compute the gradients:

In [None]:
gradient = tf.gradients(f,x)

The gradient can be evaluated like every other tensor:

In [None]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print("f({0}) = {1}, f'({0})={2}".format(*sess.run((x,f,gradient), feed_dict={c: 4.2})))
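
For comparison, the derivative can be worked out by hand. With $x = 0$ (the initial value of the variable) and $c = 4.2$:

$$f(x) = (x-c)^2,\qquad f'(x) = 2(x-c),\qquad f(0) = 17.64,\qquad f'(0) = 2\cdot(0-4.2) = -8.4$$

Since `tf.gradients` returns a list of gradient tensors, the gradient is printed as a one-element list.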

TensorFlow provides optimizers that are based on gradients, e.g. the `GradientDescentOptimizer`.
* note: this implicitly calls `tf.gradients`, so there is no need to call it explicitly

In [None]:
optimizer = tf.train.GradientDescentOptimizer(0.1)
train_step = optimizer.minimize(f)  # minimize f with respect to the trainable variable x
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(30):
        print("f({0}) = {1}, f'({0})={2}".format(*sess.run((x,f,gradient), feed_dict={c: 4.2})))
        sess.run(train_step, feed_dict={c: 4.2})

    print(x.eval(sess))
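
To see what each training step does, the same update can be written out by hand. The sketch below uses the plain gradient descent rule $x \leftarrow x - \eta\, f'(x)$ with $\eta = 0.1$ and $f'(x) = 2(x-c)$. It is a plain-Python illustration with hypothetical names (`grad_f`, `history`), not the optimizer's actual implementation, and it computes in double precision rather than `float32`.

```python
def f(x, c):
    return (x - c) ** 2

def grad_f(x, c):
    # Derivative of (x - c)^2 with respect to x
    return 2.0 * (x - c)

learning_rate = 0.1
c = 4.2
x = 0.0          # same starting point as the zero-initialized variable

history = []     # (x, f(x), f'(x)) before each update
for step in range(30):
    history.append((x, f(x, c), grad_f(x, c)))
    x = x - learning_rate * grad_f(x, c)   # one gradient descent step

print(x)  # close to the minimum at x = 4.2
```

Each step shrinks the distance to the minimum by the factor $1 - 2\eta = 0.8$, so after $n$ steps $x_n = 4.2\,(1 - 0.8^n)$.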

**Exercise**: Create an MLP with TensorFlow
1. Use a 2-3-2 layer architecture
2. Use `tf.sigmoid` as the activation function
3. Use $\frac12\sum(out-target)^2$ as the error function

In [None]:
tf.reset_default_graph()

# Inputs: one row per sample, two features each
x = tf.placeholder(dtype=tf.float32, shape=[None,2], name="x")

# Hidden layer: 2 inputs -> 3 units
W1 = tf.Variable(tf.zeros(shape=[2, 3]), name="w1")
b1 = tf.Variable(tf.zeros(shape=[3]), name="b1")
L1 = tf.sigmoid(tf.matmul(x,W1) + b1)

# Output layer: 3 hidden units -> 2 outputs (fed by L1, not by x)
W2 = tf.Variable(tf.zeros(shape=[3, 2]), name="w2")
b2 = tf.Variable(tf.zeros(shape=[2]), name="b2")
out = tf.sigmoid(tf.matmul(L1,W2) + b2)

# Targets: one row per sample, two values each
y = tf.placeholder(dtype=tf.float32, shape=[None,2], name="y")

error = tf.reduce_sum((out-y)**2)*.5
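
As a sanity check, the forward pass of a 2-3-2 network with sigmoid activations and the error $\frac12\sum(out-target)^2$ can be computed by hand for a single sample. The sketch below is plain Python and independent of TensorFlow. The input `x`, the `target` and the helper `dense` are hypothetical choices for illustration; all parameters are zero, matching the zero initialization in the cell above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: sigmoid(inputs . weights + biases)."""
    return [
        sigmoid(sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j])
        for j in range(len(biases))
    ]

# Parameters of a 2-3-2 network, all zero
W1 = [[0.0] * 3 for _ in range(2)]   # 2 inputs -> 3 hidden units
b1 = [0.0] * 3
W2 = [[0.0] * 2 for _ in range(3)]   # 3 hidden units -> 2 outputs
b2 = [0.0] * 2

x = [1.0, 0.0]        # hypothetical input sample
target = [1.0, 0.0]   # hypothetical target

hidden = dense(x, W1, b1)
out = dense(hidden, W2, b2)
error = 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))

print(hidden)  # [0.5, 0.5, 0.5]
print(out)     # [0.5, 0.5]
print(error)   # 0.25
```

With all-zero parameters every unit outputs $\sigma(0) = 0.5$ regardless of the input, which is one reason random initial values are usually preferred for training.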

Further Reading:

* [TensorFlow webpage](https://www.tensorflow.org/) with many [Tutorials](https://www.tensorflow.org/tutorials/)