# Tensorflow: introduction

### Why tensorflow?

As of today (end of 2017), there are only two widely used, up-to-date deep learning frameworks: [tensorflow](https://www.tensorflow.org/) and [pytorch](http://pytorch.org/). Both are python frameworks with similar design principles and functionality. In this course, we will be using tensorflow, as it is more mature and supported on more platforms. However, the main concepts carry over to pytorch.

### What is tensorflow?

Tensorflow is a framework to define computations, automatically differentiate them and execute them in an efficient manner (potentially on GPUs). In order to do so, tensorflow splits the process of creating an algorithm into two steps: first *defining* the computation to be executed, and then running or *executing* the computation (on CPU or GPU).
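
The split between defining and executing can be illustrated with a toy example in plain python. This is only a sketch of the idea, not how tensorflow is implemented: the classes `Constant` and `Add` and their `run` method are hypothetical names made up for this illustration.

```python
# Toy illustration of "define, then execute" in plain python.
# Constant, Add and run are hypothetical names, not tensorflow APIs.

class Constant:
    def __init__(self, value):
        self.value = value

    def run(self):
        return self.value


class Add:
    def __init__(self, left, right):
        # Only record the inputs; nothing is computed here.
        self.left = left
        self.right = right

    def run(self):
        # The computation happens only when run() is called.
        return self.left.run() + self.right.run()


# Step 1: define the computation.
total = Add(Constant(5), Constant(6))

# Step 2: execute it.
result = total.run()
print(result)  # prints 11
```

After step 1, `total` only describes the addition; the value 11 exists only once step 2 has run.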

### Installing tensorflow

For CPU, the installation is straightforward. We have provided an updated environment file, `AdvML.yml`. To update your environment, place `AdvML.yml` in your working directory and run:
```bash
conda env update -f AdvML.yml
```

If you would rather install tensorflow directly into your existing environment, simply run `pip install tensorflow` with the environment activated.

## Defining the computation: dataflow graphs

Tensorflow uses the concept of a graph (`tf.Graph`) to define the computation to be executed. The nodes in the graph are operations, and data flows along the edges of the graph.

![tensorflow graph example](https://www.tensorflow.org/images/tensors_flowing.gif)

The first step in creating a model will be defining the graph operations that correspond to the model. Tensorflow provides many convenience functions to define the nodes of the graphs, which are called *ops*.

In [1]:
import tensorflow as tf
import numpy as np

In [2]:
# In numpy, variables store values, and operators execute operations

a = np.array([5])
b = np.array([6])

c = np.add(a, b)
c

array([11])

In [3]:
# In tensorflow, on the other hand, operators define the result of the operation

a_tf = tf.constant(5)
b_tf = tf.constant(6)

c_tf = tf.add(a_tf, b_tf)
c_tf # Here, c_tf is not the sum, but rather, represents the operation of summing

<tf.Tensor 'Add:0' shape=() dtype=int32>

In [4]:
# In order to obtain the value, we must ask tensorflow to execute the operation.
# We will go into much more detail later.

with tf.Session() as sess:
    print(sess.run(c_tf))

11


### The default graph

Notice how we have not specified any graph in our code above? By default, tensorflow creates a global graph for us. However, this may sometimes be undesirable, and it is often better to be explicit: create a graph and set it as the default graph using `tf.Graph.as_default()` together with the `with` construct. In particular, we should be careful when running things in a notebook, as creating tensors is not idempotent: re-running a cell adds new nodes to the graph instead of replacing the existing ones!

In [5]:
# If you run this multiple times, you will see the name
# of the generated constant change!

# This is because a constant already exists with the given name,
# and the name must be unique.
x = tf.constant(4, name='a')
x

<tf.Tensor 'a:0' shape=() dtype=int32>

In [6]:
# This is more visible with get_variable, which does not silently
# rename the variable: it raises an error instead (which is better,
# as the problem is reported immediately).

# Run this cell twice to get an error.

x = tf.get_variable('a', initializer=0)

In [7]:
# Here we explicitly set a default graph
# for the duration of the with block.

# This is usually much better and allows us to
# avoid some hard to diagnose errors.

with tf.Graph().as_default():
    x = tf.constant(4, name='a')
x

<tf.Tensor 'a:0' shape=() dtype=int32>

## Executing the computation: sessions

Tensorflow uses the notion of a `tf.Session` to execute the computation. The session is used to manage the resources allocated to the execution, which is particularly important in the case of GPU execution. In particular, variables have no particular value outside of a session.

In most cases, it is best to separate the definition of the computation from its execution. We will thus usually create a graph completely, then create a session and execute the required computation.

When working in notebooks, it may be useful to use the `tf.InteractiveSession`, which registers itself as the default session.

Note that each session is bound to a single graph (by default, the graph that is the default one when the session is created), so we must create the session while the desired graph is in scope as the default graph.

In [8]:
with tf.Graph().as_default():
    x = tf.constant('Hello world')
    with tf.Session() as sess:
        print(sess.run(x))

b'Hello world'


## Tensors

Tensorflow is mostly designed to compute on tensors: multidimensional arrays of numerical data. They are conceptually similar to numpy's `np.ndarray`. Each tensor is characterised by two properties: its datatype (most often `tf.float32` or `tf.int32`), and its shape (e.g. 10 x 4, 20 x 10 x 3, 64 x 32 x 32 x 3). Each operation takes tensors as input, and produces tensors as output.
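
Since tensors are similar to numpy arrays, the two properties can be inspected on a numpy array as a quick illustration. This is a minimal sketch using numpy only; reading the shape 64 x 32 x 32 x 3 as a batch of 64 RGB images of 32 x 32 pixels is an assumption made for the example.

```python
import numpy as np

# Assumed interpretation: a batch of 64 RGB images of 32 x 32 pixels,
# stored as 32-bit floats.
images = np.zeros((64, 32, 32, 3), dtype=np.float32)

print(images.shape)  # (64, 32, 32, 3)
print(images.dtype)  # float32
print(images.ndim)   # 4

# An operation takes an array in and gives an array out, with its own shape:
# averaging over the last axis (the colour channels) removes that axis.
grey = images.mean(axis=-1)
print(grey.shape)    # (64, 32, 32)
```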

Additionally, all tensors can be named in tensorflow. This does not change their computational meaning, but can be very useful when debugging models and visualizing the created graphs.

Two types of tensors will be particularly notable: `tf.Variable` and `tf.placeholder`.

### Variables

Variables can be assigned values, and will remember those values across calls to `session.run`. This is the type we will use to define weights and biases in our networks.

### Placeholders

Placeholders are given values at runtime (when we call `session.run`). They allow us to pass arguments to the tensorflow operations we are about to use. We will often use them to provide the training data.
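
As a loose analogy in plain python (not tensorflow code), a variable behaves like state that persists between function calls, while a placeholder behaves like a function argument. The names `make_accumulator` and `step` are hypothetical and chosen only for this sketch.

```python
def make_accumulator(initial):
    # `state` plays the role of a variable: it is initialised once
    # and keeps its value between calls.
    state = {'total': initial}

    def step(x):
        # `x` plays the role of a placeholder: a fresh value is
        # supplied on every call.
        state['total'] += x
        return state['total']

    return step


step = make_accumulator(0)
first = step(3)
second = step(4)
print(first, second)  # prints 3 7
```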

## Operations on tensors

Tensorflow defines numerous operations on tensors, including pointwise arithmetic, tensor dot products, convolutions, etc. Tensorflow also overloads the normal python operators; if you use them, be careful about the exact semantics of each overload. Operations on tensors produce one or more tensors as output.
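
One overload worth checking is `*`, which multiplies elementwise in tensorflow, so a matrix product needs `tf.matmul`. Numpy follows the same convention for `*`, so the difference can be sketched without a session. The example below is numpy only and is meant as an illustration of the pitfall rather than as tensorflow code.

```python
import numpy as np

a = np.array([[0, 1], [2, 3]], dtype=np.float32)
b = np.array([[1, 2], [3, 4]], dtype=np.float32)

elementwise = a * b      # multiplies matching entries
matrix_product = a @ b   # same as np.matmul(a, b)

print(elementwise)
print(matrix_product)
```

The two results differ: the elementwise product is `[[0, 2], [6, 12]]`, whereas the matrix product is `[[3, 4], [11, 16]]`.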

In [9]:
# Multiplying 2 x 2 tensors in tensorflow

a = tf.constant([[0, 1], [2, 3]], dtype=tf.float32)
b = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)

matmul = tf.matmul(a, b)

with tf.Session() as sess:
    print(sess.run(matmul))

[[  3.   4.]
 [ 11.  16.]]


In [10]:
# An example of using variables

with tf.Graph().as_default():
    # create a variable with initial value 0
    a = tf.get_variable('variable_1', initializer=0)
    
    # this operation represents assigning the value 1 to a
    # Note that it does not run it!
    assign_to_a = tf.assign(a, 1)
    
    with tf.Session() as sess:
        # Note that variables must be initialized before they are used
        sess.run(tf.global_variables_initializer())
        print(sess.run(a))
        sess.run(assign_to_a)
        print(sess.run(a))

0
1


In [11]:
# An example of using placeholders

with tf.Graph().as_default():
    # We need to define the shape of the placeholder.
    # Use None to indicate that the extent is unknown
    a = tf.placeholder(tf.float32, shape=[None])
    
    sum_a = tf.reduce_sum(a)
    
    with tf.Session() as sess:
        # We provide the data when running using the feed_dict
        print(sess.run(sum_a, feed_dict={a: [1, 2, 3]}))
        print(sess.run(sum_a, feed_dict={a: [4, 5, 6]}))

6.0
15.0


## Random variables

Tensorflow is also able to generate random values, by defining random tensors. These tensors produce a new value according to the given distribution every time they are run. It is possible to make the generated sequence reproducible by setting a seed, although the sequence is only reproduced if the graph is identical.
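
The effect of a seed can be sketched with numpy's generator as an analogy. This is numpy code, not tensorflow: in tensorflow the seed would instead be set with `tf.set_random_seed` or the `seed` argument of `tf.random_normal`, and the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
first = np.random.RandomState(42).normal(size=3)
second = np.random.RandomState(42).normal(size=3)
print(np.array_equal(first, second))  # True

# Drawing again from the same generator gives new values.
rng = np.random.RandomState(42)
draw_1 = rng.normal(size=3)
draw_2 = rng.normal(size=3)
print(np.array_equal(draw_1, draw_2))  # False
```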

In [14]:
with tf.Graph().as_default():
    x = tf.random_normal(shape=[10])
    
    with tf.Session() as sess:
        print(sess.run(x))

[ 0.29278779  0.26010638  0.4999249  -1.2256645  -0.82805097  0.73793298
  0.85611039  0.47202969  0.58499163  0.99922848]


## Putting it together: some short examples

We will run some short (non-statistical) examples in tensorflow to illustrate the concepts we have learned above.

### Computing the sum of the first 100 integers

In [15]:
# Let us compute the sum of the first 100 integers (0 to 99) in tensorflow

tf.reset_default_graph()

# define our variables
current = tf.get_variable('current', initializer=0)
running_sum = tf.get_variable('sum', initializer=0)

# now define the steps to be taken.

increment_sum = running_sum.assign_add(current)
increment_current = current.assign_add(1)

# At this point, we have only defined the operations, but not executed anything.
# We need to instantiate a session to execute the operations.

In [16]:
# Let's now run it all:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    for i in range(100):
        sess.run(increment_sum)
        sess.run(increment_current)
    
    print('Total sum is {0}'.format(sess.run(running_sum)))

Total sum is 4950
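
As a sanity check, the same total can be computed in plain python, both directly and with the closed form n(n - 1)/2 for the sum of the integers 0 to n - 1. This is a small sketch that is independent of tensorflow.

```python
n = 100

# Sum of the integers 0, 1, ..., n - 1, computed directly ...
direct = sum(range(n))
# ... and with the closed form n * (n - 1) / 2.
closed_form = n * (n - 1) // 2

print(direct, closed_form)  # prints 4950 4950
```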


### Computing the mean of an array

In [17]:
# Let us compute the average of some data using tensorflow

tf.reset_default_graph()

# define the input

# This is a placeholder to receive the data for which we wish to compute
# the mean. We can think of it as an argument to a function.
data = tf.placeholder(tf.float32, [None])

# define an operation which computes the mean
mean = tf.reduce_mean(data)

In [18]:
# Let's now run it all

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    print(sess.run(mean, feed_dict={data: [1, 2, 1, 2]}))
    print(sess.run(mean, feed_dict={data: [3, 4, 3, 4]}))

1.5
3.5


### Generating a random array of a given size

In [24]:
tf.reset_default_graph()

# define the input for the size of the random variates
size = tf.placeholder(tf.int64, [1])

# define the operation that generates random normals
normals = tf.random_normal(size)

In [26]:
# Let's run it

# Note that as we have not executed anything yet,
# no random numbers have been generated.

# We thus expect different numbers every time we run the session

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    print(sess.run(normals, feed_dict={size: [2]}))
    print(sess.run(normals, feed_dict={size: [2]}))
    print(sess.run(normals, feed_dict={size: [3]}))

[ 0.44173703 -0.14030108]
[ 0.93984336  1.60454285]
[-1.56438577 -1.06462216 -0.09195652]
