# TensorFlow Fundamentals
This Jupyter Notebook contains a very fast introduction to TensorFlow (TF).
This introduction is **VERY** fast, and leaves many topics untouched or only lightly covered, but it should give you enough background to follow the later notebooks.
If you want a deeper dive, the following are good places to start:

#### External resources
* [Official getting started material](https://www.tensorflow.org/get_started/) - collection of good tutorials from beginner to very advanced
* **[Python API Guides](https://www.tensorflow.org/api_guides/python/array_ops)** - GREAT place to look up how TF works! Has some descriptions of how and why to use different parts.
* [Documentation](https://www.tensorflow.org/api_docs/python/) - Short and concise descriptions of how everything in TF works.
* [LearningTensorFlow.com](http://learningtensorflow.com/getting_started/) - A website dedicated to teaching TF. They have some good tutorials at varying levels.
* [Keynote (TensorFlow Dev Summit 2017)](https://www.youtube.com/watch?v=4n1AHvDvVvw&t=33s). 30 min video describing where TensorFlow is right now, and the underlying design principles.
* [TensorFlow at DeepMind (TensorFlow Dev Summit 2017)](https://www.youtube.com/watch?v=VdDmhOCw6J0). 20 min video describing how TensorFlow is used at DeepMind, and some of the results they have been able to achieve with it.

### What is TensorFlow
TensorFlow is a programming system in which you represent computations as graphs.
This graph is then executed by an efficient C++ backend.
This added layer of abstraction makes TensorFlow very flexible.
TensorFlow can be interfaced using different languages, with Python being the most common, and the same code can run seamlessly on different types of hardware.


![](tf_overview.png)


TensorFlow provides multiple APIs. 
The lowest level API, **TensorFlow Core**, provides you with fine-grained control.
Higher level APIs, such as `tf.contrib.learn`, are built on top of TensorFlow Core and are generally easier to learn and quicker to use.
They help manage data, training, and inference.
This guide begins with an introduction to **TensorFlow Core**.
Later, in other exercises, we will demonstrate how to use TensorFlow in a way that is closer to how it is used in the real world.

**NB**: Some of the APIs whose names contain `contrib` are still in development, and their interfaces may change.

To use TensorFlow you need to understand how TensorFlow:
* Represents computations as graphs.
* Executes graphs in the context of Sessions.
* Represents data as tensors.
* Uses feeds and fetches to get data into and out of arbitrary operations.



The two basic building blocks of TensorFlow are **tensors** and **operations** (called ops for short).
* **Tensors**: The edges in the graph
    * A Tensor is a typed multi-dimensional array. Tensors are how information flows through the graph.
    * Tensors are used for data and parameters.
* **Ops**: The nodes in the graph
    * An op takes zero or more Tensors, performs some computation, and produces zero or more Tensors.

A TensorFlow graph is a description of computations. To compute anything, a graph must be launched in a `Session`. A Session places the graph ops onto `Devices`, such as CPUs or GPUs, and provides methods to execute them. These methods return tensors produced by ops as [numpy](http://www.numpy.org/) ndarray objects in Python, and as `tensorflow::Tensor` instances in C and C++.
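
To make the "describe first, run later" idea concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented; the names `Node`, `constant`, `placeholder` and `run` are made up for this illustration. Building the expression only records ops and edges, and a value is produced only when `run` is called with a feed dictionary.

```python
# Toy illustration of "define the graph first, run it later".
# All names here (Node, constant, placeholder, run) are hypothetical and
# only mimic the idea; this is not TensorFlow's actual implementation.

class Node:
    """A node (op) in the graph; its inputs are the incoming edges."""
    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op          # 'const', 'placeholder', 'add' or 'mul'
        self.inputs = inputs  # upstream nodes whose outputs feed this op
        self.value = value    # only used by 'const' nodes
        self.name = name

    def __add__(self, other):
        return Node('add', (self, other))

    def __mul__(self, other):
        return Node('mul', (self, other))

    def __repr__(self):
        return "<Node op={} name={}>".format(self.op, self.name)


def constant(value, name=None):
    return Node('const', value=value, name=name)


def placeholder(name=None):
    return Node('placeholder', name=name)


def run(node, feed_dict=None):
    """Evaluate `node` by recursively evaluating the ops it depends on."""
    feed_dict = feed_dict or {}
    if node.op == 'placeholder':
        return feed_dict[node]       # the value is fed in at run time
    if node.op == 'const':
        return node.value
    left, right = (run(n, feed_dict) for n in node.inputs)
    return left + right if node.op == 'add' else left * right


# Build the graph: nothing is computed yet.
a = constant(2.0, name='a')
b = constant(-1.0, name='b')
x = placeholder(name='x')
y = a * x + b

print(y)                            # a Node object, not a number
print(run(y, feed_dict={x: 3.0}))   # 5.0
```

The same pattern of building a description and then evaluating it with fed values is what a `Session` does for a real TensorFlow graph, with the added work of placing ops on devices.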

TensorFlow can be used from C, C++, and Python programs, with Python being the most common and best supported.


# Part 1: The basics


First we import TensorFlow, and some other handy libraries.

In [None]:
## Python 2/3 compatibility
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

## Import libraries
import numpy as np
import datetime
import os
import sys
sys.path.append(os.path.join('.', '..')) # Allow us to import shared custom 
                                         # libraries, like utils.py

import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import clear_output

import utils # contains various helper functions that aren't 
             # important to understand

### Basic operations

Let us begin with a simple example, 2D linear regression: $y = ax + b$, where $x$ is the input, $y$ is the output, and $a$ and $b$ are the parameters.

For starters let us compute $y$ when $a=2$, and $b=-1$ for a couple of different $x$.

In [None]:
## Building the computational graph

# In case we have already created something: clear it
tf.reset_default_graph()

# Create the two variables
# The 'name' argument indicates what TF should call the variable internally.
# It is a good idea to properly name your ops, as it makes debugging and later analysis much easier!
a = tf.Variable(2., name="a")
b = tf.Variable(-1., name="b")

# Create an x variable, with some numbers
x_values = [-2, -1, 0, 1, 2]
x_var = tf.Variable(x_values, name="xVariable", dtype=tf.float32)

# Define y
with tf.name_scope('yFromVariable'): # Give y a name
    y_from_var = a*x_var + b

print(y_from_var)

#### What just happened?
Why didn't `print(y_from_var)` print out the answer?

You might have expected `y_from_var` to be the result of `a*x_var + b`, but instead we got a **Tensor** object.
This is because TensorFlow works by first creating a computational graph representation.
Nothing is computed until the graph is launched and run in a session.
This means that `y_from_var` is a symbolic handle to the output of a TF **operation**, i.e. a node in the computational graph, and not yet a number.

Let's run `y_from_var` now!

In [None]:
# Create an operation that will initialize all variables in the graph.
init_op = tf.global_variables_initializer()

# TensorFlow operations are performed by 'Sessions'
with tf.Session() as sess:
    # Run the operation that initializes the variables. 
    # We almost always do this as the first thing.
    sess.run(init_op)
    
    # Compute y by running the operation
    y_output = sess.run(y_from_var)

print('y_output is a ' + str(type(y_output)) + '\n')

# Print the results
print('{:4s}  {:4s}'.format('x_var', '  y'))
for i in range(len(x_values)):
    s = "{:4.1f} : {:4.1f}"
    print(s.format(x_values[i], y_output[i]))


That is all well and good, but what if we wanted to compute this for another $x$?
Right now we have defined `x_var` as a `tf.Variable`.
This makes changing it cumbersome.
A better approach is using `tf.placeholder`, which we will do now.

A `placeholder` has 3 important arguments:
* **`dtype`** specifies what kind of data we are dealing with. Generally use **tf.float32**, as most GPUs are optimized for 32-bit floating point numbers.
* **`shape`** lets TF know the dimensions of the placeholder. Writing `None` for a dimension allows its size to change between runs without having to rebuild the graph. This however prevents some optimization, so the size should be specified when possible.
* **`name`** is what TF will call the placeholder internally. For instance this is the name it will use if the placeholder causes an error.
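
The role of `None` in a shape can be sketched in plain Python. The helper `shape_is_compatible` below is a hypothetical function written for this illustration (it is not part of TensorFlow). It treats `None` as "any size along this dimension", while the number of dimensions still has to match.

```python
def shape_is_compatible(declared, actual):
    """Return True if `actual` fits `declared`, where None matches any size.

    Hypothetical helper for illustration only; not a TensorFlow function.
    """
    if len(declared) != len(actual):
        return False  # the number of dimensions is always fixed
    return all(d is None or d == a for d, a in zip(declared, actual))


# A shape of [None] accepts vectors of any length...
print(shape_is_compatible([None], [5]))        # True
print(shape_is_compatible([None], [30]))       # True
# ...but not a matrix, because the number of dimensions differs.
print(shape_is_compatible([None], [5, 2]))     # False
# Fixing a dimension makes the check stricter.
print(shape_is_compatible([None, 3], [8, 3]))  # True
print(shape_is_compatible([None, 3], [8, 4]))  # False
```

A common pattern is to leave only the batch dimension as `None` and fix the remaining dimensions, so that the same graph can be fed batches of different sizes.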

In [None]:
# Create x as a placeholder now!
x_ph = tf.placeholder(dtype=tf.float32, shape=[None], name="xPlaceholder")

# Define another y, using the placeholder x this time
with tf.name_scope('yFromPlaceholder'):
    y_from_ph = a*x_ph + b

x_new_values = [-0.2, -0.1, 0, 0.1, 0.2]

## Compute y
with tf.Session() as sess:
    sess.run(init_op)
    feed_dict = {x_ph : x_new_values}
    y_output = sess.run(y_from_ph, feed_dict=feed_dict)

    
# Print the results
print('{:4s}  {:4s}'.format(' x_ph', '  y'))
for i in range(len(x_new_values)):
    s = "{:4.1f} : {:4.1f}"
    print(s.format(x_new_values[i], y_output[i]))

#### What just happened?

This time, when we created the graph, we created $x$ as a `tf.placeholder`.
This means that `x_ph` simply stands in the place of real data.
So when we want to compute `y` for a particular value we simply **feed** that value into the graph, using a `feed_dict`.

If we wanted to change the values now, we simply need to feed a new value. 

These **variables** and **placeholders** can be a little hard to wrap your head around at first.
It can therefore be a good idea to read up on these:

#### External resources:
* **Variables**: 
    [Guide](https://www.tensorflow.org/programmers_guide/variables),
    [Documentation](https://www.tensorflow.org/api_docs/python/tf/Variable)
* **Placeholders**: 
    [Documentation](https://www.tensorflow.org/versions/r0.11/api_docs/python/io_ops/placeholders)

## Examining the graph with TensorBoard
When you execute the cell below you should see a graph that represents the work we have done so far.
Normally TensorBoard is opened in a separate browser window, but for now we will show it in-line.

Click a node to see its attributes and high level information.
**Double click a node to expand it**. Try double clicking on one of the y's.
Doing so will show you the operations that are necessary to compute $y$, i.e. a multiplication and an addition.
This is especially useful for examining the dimensions of your data as it flows through the graph.

The TensorBoard graph visualizer is a great tool for examining your model.
It is important to use `tf.name_scope` to dutifully name your variables properly!
Otherwise the graph visualizer quickly becomes unwieldy and useless.
This takes practice, but it is well worth it.
Proper usage of `tf.name_scope` also makes debugging easier, so it is a good habit to get into.

In this example we embed the graph visualizer in Jupyter.
The visualizer isn't made for this, and not all features are present.
Normally you would access the graph visualizer through **TensorBoard**, as we do at the end of this notebook.
If you are interested in how to embed TensorBoard in the notebook see the `../utils.py` file.

There are many details that we didn't cover here, but here is a good place to start:

#### External resources
* **TensorBoard**: 
    [Graph visualization](https://www.tensorflow.org/get_started/graph_viz), 
    [Visualizing Learning](https://www.tensorflow.org/get_started/summaries_and_tensorboard), 
    [Embedding Visualization](https://www.tensorflow.org/get_started/embedding_viz).

In [None]:
## Launch TensorBoard, and visualize the TF graph
tmp_def = utils.rename_nodes(sess.graph_def, lambda s:"/".join(s.split('_',1)))
utils.show_graph(tmp_def)

**Mini-assignment**: 
Try changing the argument of `tf.name_scope` when defining `y_from_var` and `y_from_ph` to `y_variable` and `y_placeholder`, respectively.
Then run the TensorBoard visualization code again.
* Notice what changed? Can you think of when this kind of thing is smart to do?

## <span style="color:red"> Exercise 1.1: Your first TensorFlow op</span>
For the first assignment you must implement Pythagoras' famous equation:
$$c = \sqrt{d^2 + e^2}$$

You should create $d$ and $e$ as placeholders, and then compute $c$ for $d = \{4, 5, 6\}$ and $e = \{3, 2, 1\}$.

It is **important** that you use TF ops for all the computations on the graph.
(e.g. use `tf.square` instead of `np.square`)
Otherwise TF can't optimize the code properly, and you risk it becoming VERY slow.
You can find the TF math ops that you need [here](https://www.tensorflow.org/versions/r0.11/api_docs/python/math_ops/basic_math_functions).

**NB**: You should not overwrite any of the variable names, as we need them later!

In [None]:
e_values = [3,2,1]
d_values = [4,5,6]

## Your code here!
# 1) Define the placeholders and the graph

# 2) Start a session, and compute the output
c_output = [0, 0, 0] # Use this variable name as your output



In [None]:
# Print out the results, and validate you get the correct values
true_values = np.sqrt(np.square(e_values) + np.square(d_values))
assingment_1_success = True

for i in range(len(c_output)):
    # Flag a failure if any value deviates from the expected result
    if not np.abs(true_values[i] - c_output[i]) < 1e-6:
        assingment_1_success = False
    print('Correct value {:4.3f}, your value {:4.3f}. '.format(true_values[i], c_output[i]), end='')
    if not assingment_1_success: 
        print("Oops :(")
        print("\nSomething went wrong, and the output isn't as expected."
              "\nGo back and have a look, or ask someone for help.")
        break
    print('Correct!')

    
if assingment_1_success:
    print('\nGood job! \nTake a break, stretch your legs, and then continue onwards!')
    

**Mini-assignment**:
After having successfully completed the assignment, go back and run the TensorBoard visualization code again.
* Did you remember to give your variables and placeholders meaningful names, or does everything look like a mess?

# Part 2: Linear Regression

Using TF as a substitute for numpy with extra steps is hardly inspiring.
Here in part 2 we will extend what we learnt in part 1 and begin introducing techniques relevant for deep learning.
We will do this through a linear regression example, but rather than use the ordinary least squares method we will use gradient descent.
Gradient descent, combined with the backpropagation algorithm that computes the gradients, is used in virtually all deep learning applications.

In case you aren't familiar with gradient descent, Andrew Ng (one of the big names in AI) explains it very well in [this 10 min video](https://www.youtube.com/watch?v=F6GSRDoB-Cg) from his [machine learning course](https://www.coursera.org/learn/machine-learning) (which is great!).
But briefly put: you define a differentiable **loss function** that measures how far away your model is from the data.
You then iteratively update the parameters in the direction of the negative gradient of the loss, so that the model fits the data better.

For real valued data we often use the **mean squared error** (MSE) loss function:

$$ loss = \frac{1}{n} \sum_{i=1}^{n} \left(y_{true,i} - y_{estimated,i}\right)^2$$
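
As a rough sketch of what gradient descent does with this loss, the snippet below fits $y = ax + b$ in plain Python. For the MSE loss the gradients are $\frac{\partial loss}{\partial a} = -\frac{2}{n}\sum_{i=1}^{n} (y_{true,i} - y_{estimated,i})\,x_i$ and $\frac{\partial loss}{\partial b} = -\frac{2}{n}\sum_{i=1}^{n} (y_{true,i} - y_{estimated,i})$. The noise-free data, the learning rate of 0.1, the number of steps, and the function names are assumptions chosen for illustration.

```python
def mse(a, b, xs, ys):
    """Mean squared error of the model y = a*x + b on the data (xs, ys)."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(a, b, xs, ys, learning_rate=0.1):
    """One gradient descent update of a and b using the MSE gradients."""
    n = len(xs)
    errors = [y - (a * x + b) for x, y in zip(xs, ys)]
    grad_a = -2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = -2.0 / n * sum(errors)
    # Move against the gradient to reduce the loss
    return a - learning_rate * grad_a, b - learning_rate * grad_b


# Noise-free toy data generated from y = -x (an assumption for this sketch)
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-x for x in xs]

a, b = 2.0, -1.0  # same starting point as the model in part 1
initial_loss = mse(a, b, xs, ys)
for _ in range(200):
    a, b = gradient_step(a, b, xs, ys)
final_loss = mse(a, b, xs, ys)

print("loss: {:.4f} -> {:.6f}".format(initial_loss, final_loss))
print("a = {:.3f}, b = {:.3f}".format(a, b))  # approaches a = -1, b = 0
```

Here the gradients are written out by hand. With a larger model that quickly becomes impractical, which is why having the framework differentiate the loss for you is so useful.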


The first step is to get some data.
We will just create some artificial, well behaved data to start with.
We create both a **training set** and a **validation set**.
The training set is used to tune the model parameters, and the validation set is used to evaluate the model during training.
It is common to also have a **test set** which is used for a final evaluation of the model, when the training is complete.
We won't bother with the test set for now.
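
When the data comes as a single collection rather than being generated on demand, the three sets are usually obtained by shuffling and splitting. Below is a minimal sketch in plain Python. The function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions made for this example.

```python
import random


def split_dataset(samples, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle `samples` and split them into train, validation and test lists.

    The fractions are illustrative defaults, not a recommendation;
    whatever is left after train and validation becomes the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# 20 (input, target) pairs following target = -input
data = [(x / 10.0, -x / 10.0) for x in range(-10, 10)]
train, valid, test = split_dataset(data)
print(len(train), len(valid), len(test))  # 12 4 4
```

Shuffling before splitting matters when the samples are ordered, as they are here. Otherwise the validation and test sets would cover a different input range than the training set.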

In [None]:
##### Creating the data
n_train = 30
n_valid = 30
train_input = np.linspace(-1, 1, n_train)
train_target = - train_input + np.random.randn(*train_input.shape) * 0.4

valid_input = np.linspace(-1, 1, n_valid)
valid_target = - valid_input + np.random.randn(*valid_input.shape) * 0.4


We will reuse the model we defined in part 1.
Before we begin we will visualize the data and the initial (terrible) model.

In [None]:
with tf.Session() as sess:
    sess.run(init_op)
    feed_dict = {x_ph : train_input}
    y_output = sess.run(y_from_ph, feed_dict=feed_dict)

plt.plot(train_input, y_output, c='k', label='Model')
plt.scatter(train_input, train_target, c='b', alpha=0.5, label='Training set')
plt.scatter(valid_input, valid_target, c='r', alpha=0.5, label='Validation set')
plt.legend(loc=4)
plt.show()

Now that we have that out of the way, we can train our model!

First we define the loss function, and the TF ops that will update our weights $a$ and $b$.

In [None]:
# We now also need a placeholder for y
y_ph = tf.placeholder(dtype=tf.float32, shape=[None], name='yPlaceholder')

# Define the loss function
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.square(y_ph - y_from_ph))

# Define that we wish to use gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# And finally create the op that updates our weights a and b
# One of the (many) cool things about TF is that it automatically 
# differentiates the loss function.
train_op = optimizer.minimize(loss, var_list=[a, b])

# We can use TensorBoard to track the training progress. 
# In this case we wish to track the loss
summary_loss = tf.summary.scalar("performance/loss", loss)

In [None]:
# Define where we want to save the TensorBoard summaries
timestr = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M")
logdir = os.path.join('.', 'logdir', timestr)

train_dict={x_ph: train_input, y_ph: train_target}
valid_dict={x_ph: valid_input, y_ph: valid_target}

max_epoch = 50

with tf.Session() as sess:
    sess.run(init_op)

    # We use two summary writers. This is a hack that allows us to write to two
    # different folders, and thus show the graphs in the same plot.
    # Which TensorBoard doesn't otherwise allow
    summary_writer_train = tf.summary.FileWriter(os.path.join(logdir, 'train'), sess.graph)
    summary_writer_valid = tf.summary.FileWriter(os.path.join(logdir, 'valid'), sess.graph)

    for e in range(max_epoch):
        # Update the parameters, and compute the summary
        cost_train, summary_train, _ = sess.run([loss, summary_loss, train_op], feed_dict=train_dict)
        # Note that we don't use the train_op on the validation set!
        cost_valid, summary_valid = sess.run([loss, summary_loss], feed_dict=valid_dict)

        # Write the summaries
        summary_writer_train.add_summary(summary_train, e)
        summary_writer_valid.add_summary(summary_valid, e)

        ## Visualize the training montage!
        y_output = sess.run(y_from_ph, {x_ph: train_input})  
        plt.plot(train_input, y_output, c='k', label='Model')
        plt.scatter(train_input, train_target, c='b', alpha=0.5, label='Training set')
        plt.scatter(valid_input, valid_target, c='r', alpha=0.5, label='Validation set')
        plt.legend(loc=1)
        plt.title('epoch = {:2}. train error = {:4.2}. valid error = {:4.2}'.format(e, cost_train, cost_valid) )
        clear_output(wait=True)
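        plt.show()

    # Close the writers so that all summaries are flushed to disk
    summary_writer_train.close()
    summary_writer_valid.close()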

print('Training complete!')

**Mini-assignment**: The training error will generally be worse than the validation error.
* Why is that? 

*(Don't worry if you can't figure it out, we will cover this later)*

We can see how the loss changes as a function of epochs by opening up TensorBoard.
These plots are called **learning curves**.

Open TensorBoard by typing the following in a terminal:

    tensorboard --logdir=<path/to/TensorBoard/logs>

In this case `<path/to/TensorBoard/logs>` is simply `logdir`, if you are already in the correct folder.
Access TensorBoard by opening your browser and entering the URL `localhost:6006`.

The learning curves can tell you a lot about how the network is doing, and how well the training is going.
Have a look at the learning curves.
In this case they look close to ideal, that is:
* Training and validation loss follow each other closely
* The loss asymptotes to a low value

In many cases the learning curves won't be this nice, and later we will go into more depth about what can be read from them.

While in TensorBoard have a look at the **graphs** tab, and look at how it changed since the first time we looked at it.

# Credits
Created by Toke Faurby ([faur](https://github.com/Faur)).
