In [8]:
%run ../../common_methods/import_all.py

from scipy import optimize
from scipy.integrate import quad, odeint
from scipy.interpolate import interp1d
from scipy.signal import detrend
from scipy.spatial import distance
from matplotlib.legend_handler import HandlerLine2D
import tensorflow as tf

from common_methods.setup_notebook import set_css_style, setup_matplotlib, config_ipython
config_ipython()
setup_matplotlib()
set_css_style()

# The TensorFlow playground

## Main concepts

The library is referred to as `tf` throughout, from `import tensorflow as tf`.

### Components

* `tf.Graph`: the computational program, made of `tf.Tensor` and `tf.Operation` objects
* `tf.Session`: the runtime, runs the graph by running its operations (`tf.Session.run()`)
* *operations* manipulate tensors
* `tf.Tensor`: the tensor object

Graph and Session are the main components of TF Core.

### Graph

The graph uses the [dataflow programming model](https://en.wikipedia.org/wiki/Dataflow_programming), which models a program as a directed graph: the nodes are operations and the edges carry the data (tensors) flowing between them.
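
To make the dataflow idea concrete, here is a toy sketch in plain Python. It is an illustration only and not how TensorFlow is implemented; the `Node` class and the `constant` helper are hypothetical names made up for this example. The point is the two phases: the graph is built first, and values are only computed when it is evaluated.

```python
# Toy dataflow graph: illustration only, not TensorFlow's implementation.

class Node:
    """A node holds an operation and the upstream nodes that feed it."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op                  # callable applied to the input values
        self.inputs = list(inputs)    # upstream nodes (the incoming edges)
        self.name = name

    def evaluate(self):
        # Pull the values from the upstream nodes, then apply this node's op
        values = [node.evaluate() for node in self.inputs]
        return self.op(*values)


def constant(value, name=None):
    """A source node: no inputs, always emits the same value."""
    return Node(lambda: value, name=name)


# Phase 1: build the graph, nothing is computed yet
a = constant(3, name="a")
b = constant(4, name="b")
total = Node(lambda x, y: x + y, inputs=[a, b], name="add")
scaled = Node(lambda x: 2 * x, inputs=[total], name="double")

# Phase 2: evaluate it, which is the role a session plays in TF
result = scaled.evaluate()
print(result)  # prints 14
```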

Note that a graph can be defined upfront and then populated; otherwise TF creates a default graph and adds to it as you define more tensors and operations.

Operations can be attached to a specific graph using a `with` statement, as in `g = tf.Graph()` followed by `with g.as_default(): ...`; the graph is then run in a session.
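
A minimal sketch of building an explicit graph and running it. It assumes TF 1.14+ or 2.x, where the 1.x API is available as `tensorflow.compat.v1` (in plain TF 1.x, `import tensorflow as tf` is enough and `disable_v2_behavior` is not needed). The names `demo_graph`, `p`, `q` and `pq_sum` are made up for the example.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()  # keep the graph/session behaviour used in these notes

# Create a graph explicitly and attach operations to it
demo_graph = tf.Graph()
with demo_graph.as_default():
    p = tf.constant(2, name="p")
    q = tf.constant(3, name="q")
    pq_sum = tf.add(p, q, name="pq_sum")

# The operations live in demo_graph, not in the default graph
in_custom_graph = pq_sum.graph is demo_graph
in_default_graph = pq_sum.graph is tf.get_default_graph()

# A session has to be told which graph to run
with tf.Session(graph=demo_graph) as demo_sess:
    result = demo_sess.run(pq_sum)

print(in_custom_graph, in_default_graph, result)
```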

### Tensors

A fancy name for a multi-dimensional array. Tensors are handles for values that "flow" in the graph. A tensor has a given shape and relates to data of a given type. Data types include string, int, float, etc. A tensor's rank is its number of dimensions. A tensor is evaluated in a session.
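
Since a tensor is essentially a multi-dimensional array, NumPy gives a quick feel for rank, shape and data type. This is an analogy using NumPy arrays, not TF tensors; what NumPy calls `ndim` is what TF calls rank.

```python
import numpy as np

scalar = np.array(5.0)                 # rank 0: a single number
vector = np.array([1, 2])              # rank 1
matrix = np.array([[1, 2], [0, 1]])    # rank 2

for arr in (scalar, vector, matrix):
    # ndim is the number of dimensions, shape the size along each of them
    print("rank:", arr.ndim, "shape:", arr.shape, "dtype:", arr.dtype)
```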

Tensors get a name, which is either the default coming from the operation that defines them (see later) or can be customised. 

#### Pre-existing tensors

These are the built-in ways to create tensors.

* `tf.constant`: tensor containing constants, can be numbers or arrays but they don't get to change - correspond to a Const node in the graph
* `tf.Variable`: a variable tensor, that is, one whose value can change via operations in the graph. A variable exists outside the context of a single `tf.Session.run()` call
* `tf.placeholder`: a placeholder acting like a promise to get some values later (see below for an example)
* others (...)
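
A small sketch contrasting the three kinds listed above: a constant stays fixed, a variable keeps its state between `run` calls, and a placeholder is fed at run time. It assumes TF 1.14+ or 2.x through `tensorflow.compat.v1`; the names `fixed`, `counter`, `step` and `increment` are made up for the example.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()

demo_graph = tf.Graph()
with demo_graph.as_default():
    fixed = tf.constant(10.0, name="fixed")                   # never changes
    counter = tf.Variable(0.0, name="counter")                # state that can change
    step = tf.placeholder(tf.float32, shape=(), name="step")  # fed at run time
    increment = tf.assign_add(counter, step)                  # op that updates the variable
    init = tf.global_variables_initializer()

with tf.Session(graph=demo_graph) as demo_sess:
    demo_sess.run(init)                                 # variables must be initialised first
    demo_sess.run(increment, feed_dict={step: 1.0})
    demo_sess.run(increment, feed_dict={step: 2.5})
    counter_value = demo_sess.run(counter)              # the state survived between run calls
    fixed_value = demo_sess.run(fixed)

print(counter_value, fixed_value)
```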

### Operations

* Each operation in a graph has a name, which can be given when creating the operation or otherwise gets a default value from the API

### Layers

Layers are a way to group operations together. In a neural net framework, they are the layers of the network. For instance, `tf.layers.Dense` is a fully connected layer with an optional activation, but it can be used more generally beyond a net, see [here](https://www.tensorflow.org/guide/low_level_intro#layers).

### Devices

Where to run the operations (for instance on CPU or GPU) can be chosen by using `tf.device`

## Play with tensors

Note that, given its [signature](https://www.tensorflow.org/api_docs/python/tf/constant), the name of a constant tensor defaults to "Const". If a name is not unique, TF appends "\_x" to it for identification, x being a counter that follows the order of definition. It also appends ":y", y being the index of the tensor in the operation it comes from ([ref](https://stackoverflow.com/questions/36150834/how-does-tensorflow-name-tensors)).

Tensors are named with the operation that defines/produces them. Note that calling things like `tf.constant` does call a `tf.Operation` which produces the tensor. Technically the name is given to the operation here, not to the tensor. 

In the little examples below, we're defining tensors and performing some operations; because no graph has been defined beforehand, they'll get added to the default graph.

In [9]:
# Define some tensors for numbers
a = tf.constant(3, dtype=tf.int32)
b = tf.constant(4)       # this will infer the type as int, if you put value "4.0" will infer type as float
c = tf.constant(5, dtype=tf.float32)
d = tf.constant(1.0)
e = tf.constant([1, 2], name='e')
f = tf.constant([[1, 2], [0, 1]], name='f')

a, b, b.dtype, d.dtype, c.dtype, c.name, a.name, e.name, f.name, tf.rank(f)

(<tf.Tensor 'Const:0' shape=() dtype=int32>,
 <tf.Tensor 'Const_1:0' shape=() dtype=int32>,
 tf.int32,
 tf.float32,
 tf.float32,
 'Const_2:0',
 'Const:0',
 'e:0',
 'f:0',
 <tf.Tensor 'Rank:0' shape=() dtype=int32>)

## Using tensors to perform simple numerical operations

Have TensorFlow do something you could easily do without it. To evaluate the operation and display its result, you need a session. Note that, much like `tf.add`, there are many operations available in the API. 

Note that operations may accept input that is not formatted as a tensor (tensor-like input), which gets converted into a tensor under the hood. See [docs](https://www.tensorflow.org/guide/graphs).

In [10]:
# summing them - returns a tensor
tot = a + b

type(tot)
tot
tot.dtype, tot.shape, tf.rank(tot)  # rank is 0 as it's a number

tensorflow.python.framework.ops.Tensor

<tf.Tensor 'add:0' shape=() dtype=int32>

(tf.int32, TensorShape([]), <tf.Tensor 'Rank_1:0' shape=() dtype=int32>)

In [11]:
# evaluate the result - you need a session

sess = tf.Session()
sess.run(tot)

7

In [12]:
# you can also add tensors like

tot2 = tf.add(a, b, name='add_op')   # name the operation, will appear in the graph
sess.run(tot2)

7

In [13]:
tot3 = tf.add(1, 2)
sess.run(tot3)

3

## Use placeholders

The following is like defining a function that takes two args as input and returns their sum. It will be evaluated with a session, which needs a `feed_dict` passed to know what the args are.

In [14]:
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
z = x + y

In [15]:
# this will execute the operation, with the passed values for x and y

sess.run(z, feed_dict={x: 1, y: 2.5})

3.5

In [16]:
# and of course can do same with add instead
s = tf.add(x, y)
sess.run(s, feed_dict={x: 1, y: 2.5})

3.5

## Showing how to use devices

Note that TF will choose the GPU if available, but you can decide to ship some operations to the CPU instead. You can customise which device runs which part of your graph.

You can use multiple devices for computation, see [here](https://www.tensorflow.org/guide/using_gpu#using_multiple_gpus).

In [17]:
# Ship operations to the CPU
with tf.device("/device:CPU:0"):
  z = x + y

# Ship operations to the GPU
with tf.device("/device:GPU:0"):
  z = x + y
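
Pinning operations to a GPU fails at run time on a machine without one. A sketch of one way around this, assuming TF 1.14+ or 2.x through `tensorflow.compat.v1`: the `allow_soft_placement` option of `tf.ConfigProto` lets TF fall back to an available device, and `log_device_placement` (off here) prints where each operation ends up. The names `u`, `v` and `w` are made up for the example.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()

demo_graph = tf.Graph()
with demo_graph.as_default():
    # Request the GPU, which may or may not exist on this machine
    with tf.device("/device:GPU:0"):
        u = tf.constant([1.0, 2.0])
        v = tf.constant([3.0, 4.0])
        w = u + v

# Soft placement lets TF pick another device if the requested one is missing;
# set log_device_placement=True to print the device chosen for each operation
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)

with tf.Session(graph=demo_graph, config=config) as demo_sess:
    w_value = demo_sess.run(w)

print(w_value)
```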

## Distributed computation

You can ship operations to different devices in a distributed setting. "ps" is the parameter server job, which typically holds the variables, while "worker" tasks do the computation.

In [18]:
with tf.device("/job:ps/task:0"):
  z = x + y

with tf.device("/job:ps/task:1"):
  z1 = x + y

with tf.device("/job:worker"):
  z2 = x + y

## Training a small model

Each element of the model has to be added one by one. This example is taken entirely from the low-level API tutorial listed in the resources; it trains a linear regression. To do so we'll use

* A `tf.layers.Dense` as the linear model, as it has a linear activation by default (different activations can be set)

We will create a graph for it rather than using the default one; it is written out for TB (TensorBoard) below so that it can be visualised. 

In [19]:
# create the graph
g = tf.Graph()

with g.as_default():

    # x and y training data, as matrices of single-number arrays
    x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)
    y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)

    # Using a Dense layer as the linear model
    linear_model = tf.layers.Dense(units=1)

    # Define the operation (a set of operations, as it's a layer) to run the model for prediction
    y_pred = linear_model(x)

    # Define the loss as a MSE
    loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)

    # Define a GD as optimizer with chosen learning rate, and the operation to train
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train = optimizer.minimize(loss)

    init = tf.global_variables_initializer()

    sess = tf.Session()
    sess.run(init)
    for i in range(100):
      _, loss_value = sess.run((train, loss))
      print(loss_value)

    print(sess.run(y_pred))

35.55902
24.81181
17.353725
12.177895
8.585681
6.092307
4.3614025
3.1595626
2.324834
1.7448418
1.3416101
1.0610335
0.86556923
0.72916746
0.6337526
0.5667821
0.5195532
0.48602697
0.4620135
0.444605
0.4317841
0.42215088
0.41473395
0.4088591
0.40405878
0.40000832
0.39648253
0.39332497
0.3904272
0.387714
0.38513297
0.38264787
0.38023347
0.37787226
0.37555212
0.37326446
0.3710034
0.36876485
0.36654592
0.36434466
0.3621595
0.3599894
0.35783386
0.35569218
0.3535639
0.351449
0.34934697
0.3472577
0.34518117
0.34311706
0.34106544
0.3390262
0.33699912
0.33498418
0.3329813
0.33099043
0.32901144
0.32704437
0.32508904
0.32314533
0.3212133
0.3192928
0.31738386
0.31548625
0.3136
0.311725
0.30986124
0.30800864
0.30616716
0.30433658
0.30251706
0.30070835
0.2989104
0.2971233
0.2953468
0.293581
0.29182574
0.29008093
0.2883466
0.28662264
0.28490892
0.2832055
0.28151226
0.27982914
0.2781561
0.27649307
0.2748399
0.27319673
0.2715633
0.26993972
0.26832575
0.26672146
0.2651268
0.26354167
0.26196596
0.26039967
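
What the training loop above does at each step can be sketched in plain NumPy: predict, compute the MSE, and move the weight and bias against the gradient. This is a hand-written equivalent, not TF's internals. The zero initialisation is an assumption made to keep the run deterministic (the Dense layer starts from a random kernel), so the loss values differ from the ones printed above.

```python
import numpy as np

# Same training data as in the TF example
x = np.array([[1.0], [2.0], [3.0], [4.0]])
y_true = np.array([[0.0], [-1.0], [-2.0], [-3.0]])

# Parameters of the linear model y = x * w + b
# (zero init is an assumption; the Dense layer uses a random kernel)
w = np.zeros((1, 1))
b = np.zeros(1)
learning_rate = 0.01

losses = []
for step in range(100):
    y_pred = x @ w + b                          # forward pass
    error = y_pred - y_true
    losses.append(float(np.mean(error ** 2)))   # mean squared error

    # Gradients of the MSE with respect to w and b
    grad_w = 2.0 * x.T @ error / len(x)
    grad_b = 2.0 * np.mean(error, axis=0)

    # Gradient descent update
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(losses[0], losses[-1])
```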


## More on sessions

Sessions define the context in which operations run. A session can be used within a `with` statement (`with tf.Session() as sess: ...`), otherwise you need to call `.close()` on it when done. It can be started locally or on a remote machine.
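
The two ways of handling a session's lifetime, side by side. A sketch assuming TF 1.14+ or 2.x through `tensorflow.compat.v1`; the names `product`, `managed_sess` and `manual_sess` are made up for the example.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()

demo_graph = tf.Graph()
with demo_graph.as_default():
    product = tf.multiply(tf.constant(6), tf.constant(7))

# Option 1: the with statement closes the session on exit
with tf.Session(graph=demo_graph) as managed_sess:
    managed_value = managed_sess.run(product)

# Option 2: manage the session by hand and close it explicitly
manual_sess = tf.Session(graph=demo_graph)
try:
    manual_value = manual_sess.run(product)
finally:
    manual_sess.close()  # releases the resources held by the session

print(managed_value, manual_value)
```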

## Keras

Keras is a high-level API that abstracts away much of the logic above. It can be called from TF as `tf.keras` and you can use pretty much the same code you'd use in Keras. Note that Keras has a callback for TB, so that can be used. 

Also note that you can run Keras jobs on multiple GPUs, see [docs](https://www.tensorflow.org/guide/keras#multiple_gpus).
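
For comparison with the low-level training example earlier in these notes, a sketch of the same linear regression through `tf.keras`. The data and learning setup mirror that example; the number of epochs is an arbitrary choice for illustration.

```python
import numpy as np
import tensorflow as tf

# Same toy data as the low-level linear regression example
x_train = np.array([[1.0], [2.0], [3.0], [4.0]], dtype=np.float32)
y_train = np.array([[0.0], [-1.0], [-2.0], [-3.0]], dtype=np.float32)

# A single Dense unit with no activation is a linear model
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=(1,))])

# Keras wires up the loss, the optimizer and the training loop
model.compile(optimizer="sgd", loss="mse")
history = model.fit(x_train, y_train, epochs=20, verbose=0)

predictions = model.predict(x_train, verbose=0)
print(history.history["loss"][-1], predictions.shape)
```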

## Estimators

Estimators in TF are classed as a high-level API. Estimators are full models that you can train and evaluate, then use for predictions. They're useful because all the logging and low-level preparation of the objects that go into building your models are handled for you. The base class is `tf.estimator.Estimator`, and there are some pre-made ones, like a linear regressor, which are sub-classes of this base class (find them under `tf.estimator`). Otherwise you can create your own custom ones, as instances of the base class.

Using estimators entails:

1. Creating input functions and choosing features to use
2. Defining the estimator to use those features and its hyperparams
3. Calling methods on the estimator to perform operations

Note that running a job via an estimator does not require you to write the summary yourself, as the estimator writes it to a `model_dir` on its own; you can customise the location via the `model_dir` kwarg of the Estimator. It will also use a default config (containing info such as how often to log summaries) unless you specify one in the `config` param, in which case you need to pass your own `tf.estimator.RunConfig`.

Note that a FileWriter is not needed with estimators (unless you need to set it yourself) as it's created automatically. It will merge and log metrics in TB every 100 steps by default. The `model_dir` is where everything is stored for the model, and it is created when the estimator is instantiated.

### Input functions

Input functions are functions that supply the features (as a dictionary) and labels (as an array), typically wrapped in a `tf.data.Dataset` object. A simple input function is, for instance (for an Iris dataset classification):

```py
def input_evaluation_set():
    features = {'SepalLength': np.array([6.4, 5.0]),
                'SepalWidth':  np.array([2.8, 2.3]),
                'PetalLength': np.array([5.6, 3.3]),
                'PetalWidth':  np.array([2.2, 1.0])}
    labels = np.array([2, 1])
    return features, labels
```

But it is usually convenient to use the Dataset API.

### Datasets

`tf.data.Dataset` is the API to parse data and create sets for training and evaluating. It is convenient as it handles different cases and formats for you. For instance, it is useful to fetch many data files into a single object.
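
A sketch of the Iris input function from above rewritten to return a `tf.data.Dataset`, plus a quick way to peek at one batch in a session. It assumes TF 1.14+ or 2.x through `tensorflow.compat.v1`; the function name `dataset_input_fn` and the `batch_size` argument are made up for the example.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()


def dataset_input_fn(batch_size=2):
    """Hypothetical input function returning a Dataset of (features, labels)."""
    features = {'SepalLength': np.array([6.4, 5.0]),
                'SepalWidth':  np.array([2.8, 2.3]),
                'PetalLength': np.array([5.6, 3.3]),
                'PetalWidth':  np.array([2.2, 1.0])}
    labels = np.array([2, 1])
    # Slice along the first axis so that each element is one (features, label) pair
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    return dataset.batch(batch_size)


demo_graph = tf.Graph()
with demo_graph.as_default():
    dataset = dataset_input_fn()
    # In graph mode a dataset is read through an iterator
    next_batch = tf.data.make_one_shot_iterator(dataset).get_next()

with tf.Session(graph=demo_graph) as demo_sess:
    batch_features, batch_labels = demo_sess.run(next_batch)

print(sorted(batch_features), batch_labels)
```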


### RunConfig

The config controls things like how often to log data for the summaries. You can attach it to an estimator via the `config` parameter and customise its params, like how often to record step data.

### Custom estimators

In tensorflow/models, there are two examples that do the same thing (classification on the Iris dataset), except that [one](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) uses a pre-made estimator as a DNNClassifier, the [other](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py) builds it as custom.

You need to build a custom estimator if the pre-made ones are not suitable for your task, for example if you need to build a network in a specific way or report your own custom metrics. Writing the input functions to sort out the features stays the same for both kinds of estimator, pre-made or custom. On top of that, for custom estimators you need to write a model function, which is what builds the actual model. A model function has signature 

```py
def model_fn(features, labels, mode, params):
```

with the first two args coming from the input function, `mode` as a `tf.estimator.ModeKeys` value (tells if used for training, evaluating or inferring) and `params` containing further configuration.

In all three modes the model function needs to return a `tf.estimator.EstimatorSpec`.

* For training mode, you need to set an optimizer, from `tf.train`; the `EstimatorSpec` will contain the training step and the loss
* For evaluation mode, the `EstimatorSpec` contains the loss and metrics - metrics are in `tf.metrics` and you can decide which ones to track, or craft custom ones
* For prediction mode, the `EstimatorSpec` contains the predictions
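
The three modes can be sketched in a single model function. This is a minimal sketch under stated assumptions, not the code from the linked examples: it assumes a TF version that still ships `tf.estimator` and `tf.layers` (TF 1.14 up to 2.15, through `tensorflow.compat.v1`), it reuses the toy regression data from earlier in these notes, and the feature key `"x"`, the prediction key `"y"`, the metric name `"mae"` and the `learning_rate` param are names made up for the example.

```python
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14 to 2.15

tf.disable_v2_behavior()


def input_fn():
    """Hypothetical input function: toy regression data, repeated indefinitely."""
    features = {"x": np.array([[1.0], [2.0], [3.0], [4.0]], dtype=np.float32)}
    labels = np.array([[0.0], [-1.0], [-2.0], [-3.0]], dtype=np.float32)
    return tf.data.Dataset.from_tensor_slices((features, labels)).repeat().batch(4)


def model_fn(features, labels, mode, params):
    # The model itself: a single linear unit
    predictions = tf.layers.dense(features["x"], units=1)

    # Prediction mode: only the predictions are needed (labels are None here)
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={"y": predictions})

    loss = tf.losses.mean_squared_error(labels=labels, predictions=predictions)

    # Evaluation mode: the loss plus the metrics to track
    if mode == tf.estimator.ModeKeys.EVAL:
        metrics = {"mae": tf.metrics.mean_absolute_error(labels, predictions)}
        return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=metrics)

    # Training mode: the loss plus the training step
    optimizer = tf.train.GradientDescentOptimizer(params["learning_rate"])
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)


estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir=tempfile.mkdtemp(),       # checkpoints and summaries are written here
    params={"learning_rate": 0.01},
)
estimator.train(input_fn=input_fn, steps=20)
eval_result = estimator.evaluate(input_fn=input_fn, steps=1)
print(eval_result)
```

Returning early for prediction mode keeps the loss and the optimizer out of the graph when no labels are available.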

### TPU support

Estimators with TPU support are documented [here](https://www.tensorflow.org/api_docs/python/tf/contrib/tpu/TPUEstimator); they have their own class inheriting from `Estimator`.

## Checkpoints

Checkpoints are stored when training, and you can customise how often. If the model is trained again, the last checkpoint gets loaded.

> TODO does this mean training resumes to where it was?
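
To see saving and loading a checkpoint in isolation, here is a sketch using the lower-level `tf.train.Saver`. This is a different, more manual route than the automatic checkpointing described above, so it illustrates the mechanism rather than that behaviour. It assumes TF 1.14+ or 2.x through `tensorflow.compat.v1`; the variable and file names are made up for the example.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()

checkpoint_dir = tempfile.mkdtemp()
checkpoint_prefix = os.path.join(checkpoint_dir, "model.ckpt")

demo_graph = tf.Graph()
with demo_graph.as_default():
    weight = tf.Variable(1.0, name="weight")
    double_weight = tf.assign(weight, weight * 2.0)
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()  # saves and restores the variables of this graph

# First session: change the variable and write a checkpoint
with tf.Session(graph=demo_graph) as first_sess:
    first_sess.run(init)
    first_sess.run(double_weight)  # weight is now 2.0
    saver.save(first_sess, checkpoint_prefix, global_step=1)

# Second session: no initialisation, the value comes from the checkpoint
with tf.Session(graph=demo_graph) as second_sess:
    latest = tf.train.latest_checkpoint(checkpoint_dir)
    saver.restore(second_sess, latest)
    restored_weight = second_sess.run(weight)

print(latest, restored_weight)
```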

## Protocol buffers

It's Google's way to serialise structured data. tensorflow/models contains some text files (`.proto`) which, when compiled with `protoc object_detection/protos/*.proto --python_out=.`, generate Python code whose purpose is to do this serialisation. 

## Tensorboard

It's a UI to visualise learning. It has several tabs, one of which visualises the graph. It is under continuous development and several features are planned.

* `tf.summary.FileWriter` writes the data for TB to display.
* There are multiple tabs, coming from the summary

You can run TB after having created the summary, by running `tensorboard --logdir=<dir_name>` from the command line. The summary will create a file named "events.out.tfevents.{timestamp}.{hostname}" in the directory where it is generated, and this directory is the path you need to give to the tensorboard command. TB runs on port 6006 by default.

### The summary

It is a TF operation served to TB and is generated in multiple types, which correspond to different tabs in the UI:

* scalar: displays the metrics of the model to track how it is performing while it trains/evaluates
* image: visualises what the model is learning as a set of images
* audio
* histogram/distribution: give aggregated data; the charts were made by someone who was previously at the NYT
* tensor: under dev - new thing 

A summary is serialised as protocol buffers.
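
A sketch of writing a scalar summary that TB can plot. It assumes TF 1.14+ or 2.x through `tensorflow.compat.v1`; the tracked value `fake_loss` is a made-up stand-in for a real metric, and the log directory is a temporary one.

```python
import glob
import os
import tempfile

import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x

tf.disable_v2_behavior()

log_dir = tempfile.mkdtemp()

demo_graph = tf.Graph()
with demo_graph.as_default():
    step_value = tf.placeholder(tf.float32, shape=(), name="step_value")
    # Made-up quantity that shrinks over the steps, standing in for a loss
    fake_loss = tf.divide(1.0, step_value + 1.0, name="fake_loss")
    tf.summary.scalar("fake_loss", fake_loss)   # register a scalar summary
    merged = tf.summary.merge_all()             # one op collecting all summaries

# The writer stores both the graph and the summaries in the log directory
writer = tf.summary.FileWriter(log_dir, graph=demo_graph)

with tf.Session(graph=demo_graph) as demo_sess:
    for step in range(10):
        summary = demo_sess.run(merged, feed_dict={step_value: float(step)})
        writer.add_summary(summary, global_step=step)
writer.close()

event_files = glob.glob(os.path.join(log_dir, "events.out.tfevents.*"))
print(event_files)
```

Pointing `tensorboard --logdir` at the printed directory should then show the curve in the scalar tab.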

### Displaying the graph

* The graph can be structured in the code for best display - nodes can be given labels, otherwise you risk not understanding anything of it
* Colour-coding is applied to tag elements that are of the same kind
* Subgraphs are clustered together, you can click on them to expand

### Hyperparams search

* Can generate its own summary

### Embedding viz

Projects high-dimensional data to 3 dimensions, but needs to be called specifically. For instance, it allows visualising data with PCA/t-SNE. 

In [20]:
# To create a writer for a custom graph - using the g one above

writer = tf.summary.FileWriter('.')
writer.add_graph(g)

## Resources

* [Intro to the low-level API, from Tensorflow official](https://www.tensorflow.org/guide/low_level_intro)
* Some parts on Tensorboard have been derived from the [Dev Summit 2017 video](https://www.tensorflow.org/guide/summaries_and_tensorboard) they have on the official TF page
* On [tensors](https://www.tensorflow.org/guide/tensors), on [graphs and sessions](https://www.tensorflow.org/guide/graphs)
* On using devices and [GPUs specifically](https://www.tensorflow.org/guide/using_gpu)
* On the [Keras TF API](https://www.tensorflow.org/guide/keras)
* On [TF Estimators](https://www.tensorflow.org/guide/premade_estimators) and especially on how to build a custom one, an [example](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
* Walkthrough of the MNIST [example](https://jhui.github.io/2017/03/12/TensorBoard-visualize-your-learning/)
* On [Protocol Buffers](https://developers.google.com/protocol-buffers/?hl=en)