<a href="https://colab.research.google.com/github/finiteautomata/ml-examples/blob/master/notebooks/tensorflow/low_level_introduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to TensorFlow

Adapted from the [TensorFlow low-level introduction guide](https://www.tensorflow.org/guide/low_level_intro).

## Tensor Values

The central unit of data in TensorFlow is the tensor. A tensor consists of a set of primitive values shaped into an array of any number of dimensions. A tensor's rank is its number of dimensions, while its shape is a tuple of integers specifying the array's length along each dimension. Here are some examples of tensor values:

In [1]:
3. # a rank 0 tensor; a scalar with shape []
[1., 2., 3.] # a rank 1 tensor; a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]

[[[1.0, 2.0, 3.0]], [[7.0, 8.0, 9.0]]]
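
To make rank and shape concrete, here is a minimal pure-Python sketch that derives both from nested lists like the ones above. The helpers `shape_of` and `rank_of` are hypothetical names introduced for illustration and are not part of TensorFlow; the sketch assumes regular (non-ragged) nesting.

```python
def shape_of(value):
    """Return the shape of a nested list as a list of lengths, outermost first."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:      # an empty list has no first element to descend into
            break
        value = value[0]   # assumes every sibling has the same length
    return shape


def rank_of(value):
    """The rank is simply the number of dimensions."""
    return len(shape_of(value))


examples = [
    3.,
    [1., 2., 3.],
    [[1., 2., 3.], [4., 5., 6.]],
    [[[1., 2., 3.]], [[7., 8., 9.]]],
]
for example in examples:
    print("rank", rank_of(example), "shape", shape_of(example))
```

A scalar is not a list, so the loop never runs and the shape is the empty list, which matches the `[]` shape of a rank 0 tensor.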

## TensorFlow Core Walkthrough
You might think of TensorFlow Core programs as consisting of two discrete sections:

- Building the computational graph (a `tf.Graph`).
- Running the computational graph (using a `tf.Session`).

## Graph
A computational graph is a series of TensorFlow operations arranged into a graph. The graph is composed of two types of objects.

- `tf.Operation` (or "ops"): The nodes of the graph. Operations describe calculations that consume and produce tensors.
- `tf.Tensor`: The edges in the graph. These represent the values that will flow through the graph. Most TensorFlow functions return `tf.Tensor` objects.


In [2]:

import numpy as np
import tensorflow as tf

a = tf.constant(3.0, dtype=tf.float32)
b = tf.constant(4.0) # also tf.float32 implicitly
total = a + b
print(a)
print(b)
print(total)

Tensor("Const:0", shape=(), dtype=float32)
Tensor("Const_1:0", shape=(), dtype=float32)
Tensor("add:0", shape=(), dtype=float32)


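As you can see, printing the tensors does not output the values 3.0, 4.0, and 7.0. Each `tf.Tensor` is only a handle to the result of an operation in the graph: it has a name, a shape, and a dtype, but no value until the graph is run in a session.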
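
The split between building and running can be imitated in a few lines of plain Python. This is only a toy sketch of deferred evaluation: the `Node` class and the `constant` and `run` helpers are hypothetical and do not mirror how TensorFlow is implemented.

```python
class Node:
    """A toy graph node: it records what to compute, not the result."""

    def __init__(self, name, op, inputs=(), value=None):
        self.name = name
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        # Building step: create a new node instead of adding numbers.
        return Node("add", "Add", inputs=(self, other))

    def __repr__(self):
        return 'Node("{}", op={})'.format(self.name, self.op)


def constant(value, name="Const"):
    return Node(name, "Const", value=value)


def run(node):
    """Running step: walk the graph and compute the actual value."""
    if node.op == "Const":
        return node.value
    if node.op == "Add":
        left, right = node.inputs
        return run(left) + run(right)
    raise ValueError("unknown op: {}".format(node.op))


a = constant(3.0)
b = constant(4.0, name="Const_1")
total = a + b

print(total)       # a description of the node, not a number
print(run(total))  # the value only appears once the graph is run
```

Printing `total` shows a description of the node, just as printing a `tf.Tensor` does; the number 7.0 appears only after the explicit `run` call.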

## TensorBoard
TensorFlow provides a utility called TensorBoard. One of TensorBoard's many capabilities is visualizing a computation graph. You can easily do this with a few simple commands.

First you save the computation graph to a TensorBoard summary file as follows:

In [3]:
!pip install tensorboardcolab

from tensorboardcolab import TensorBoardColab

tbc = TensorBoardColab()




Using TensorFlow backend.


Wait for 8 seconds...
TensorBoard link:
http://00912e05.ngrok.io


In [0]:
summary_writer = tbc.get_writer()

summary_writer.add_graph(tf.get_default_graph())

summary_writer.flush()

## Session

We need sessions to evaluate tensors. A session encapsulates the state of the TensorFlow runtime and runs TensorFlow operations. If a `tf.Graph` is like a `.py` file, a `tf.Session` is like the Python executable.

In [5]:
sess = tf.Session()
print(sess.run(total))

7.0


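During a single call to `tf.Session.run`, any `tf.Tensor` has exactly one value. For example, the following code calls `tf.random_uniform` to produce a `tf.Tensor` that generates a random 3-element vector (with values in [0, 1)). Each call to `run` yields a different vector, but within a single `run` the value is consistent, so `out1` and `out2` are computed from the same random input: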



In [6]:
vec = tf.random_uniform(shape=(3,))
out1 = vec + 1
out2 = vec + 2
print(sess.run(vec))
print(sess.run(vec))
print(sess.run((out1, out2)))

summary_writer.add_graph(tf.get_default_graph())
summary_writer.flush()

[0.5937327 0.6667255 0.8753474]
[0.9574045 0.2339592 0.9430821]
(array([1.773427 , 1.2733477, 1.3925507], dtype=float32), array([2.773427 , 2.2733479, 2.3925507], dtype=float32))
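
One way to picture this behavior is a cache that lives for exactly one `run` call. The following pure-Python sketch is an analogy under that assumption, not a description of TensorFlow internals; `RandomUniform`, `AddScalar`, `evaluate`, and `run` are hypothetical names.

```python
import random


class RandomUniform:
    """Toy node that produces `size` random values in [0, 1)."""

    def __init__(self, size):
        self.size = size


class AddScalar:
    """Toy node that adds a scalar to every element of its source node."""

    def __init__(self, source, scalar):
        self.source = source
        self.scalar = scalar


def evaluate(node, cache):
    # Each node is computed at most once per run, then reused.
    if node in cache:
        return cache[node]
    if isinstance(node, RandomUniform):
        result = [random.random() for _ in range(node.size)]
    else:
        result = [v + node.scalar for v in evaluate(node.source, cache)]
    cache[node] = result
    return result


def run(fetches):
    cache = {}  # a fresh cache for every call to run
    return [evaluate(node, cache) for node in fetches]


vec = RandomUniform(3)
out1 = AddScalar(vec, 1)
out2 = AddScalar(vec, 2)

first = run([vec])[0]
second = run([vec])[0]   # a separate run draws new random values
r1, r2 = run([out1, out2])  # one run: both outputs share the same draw

print(first)
print(second)
print(r1, r2)
```

Because `out1` and `out2` are fetched in the same call, their elements differ by exactly 1, matching the pattern in the session output above.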


## Placeholders

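As it stands, the graph is not very interesting because it always produces the same constant result. To accept external inputs, a graph can be parameterized with **placeholders**, which promise to provide a value later, much like a function argument.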

In [0]:
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
z = x + y

In [0]:
summary_writer.add_graph(tf.get_default_graph())
summary_writer.flush()

This is like defining

$$ z(x, y) = x + y $$

To evaluate this, we must provide values for $x$ and $y$ through the `feed_dict` argument of `tf.Session.run`:

In [9]:
print(sess.run(z, feed_dict={x: 3, y: 4.5}))
print(sess.run(z, feed_dict={x: [1, 3], y: [2, 4]}))

7.5
[3. 7.]


As we can see, a placeholder created without an explicit shape accepts data of any shape.

The only difference between placeholders and other `tf.Tensor` objects is that placeholders throw an error if no value is fed to them.
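
The feeding mechanism can be sketched in plain Python. This toy evaluator is an illustration only: `Placeholder`, `Add`, `add_values`, and `run` are hypothetical names, and the lookup in `feed_dict` is an assumption about how one might model the behavior described above.

```python
class Placeholder:
    """Toy input node that has no value of its own."""

    def __init__(self, name):
        self.name = name


class Add:
    """Toy node that adds the values of two other nodes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def add_values(u, v):
    # Element-wise addition for scalars or (nested) lists of equal shape.
    if isinstance(u, list):
        return [add_values(p, q) for p, q in zip(u, v)]
    return u + v


def run(node, feed_dict=None):
    feed_dict = feed_dict or {}
    if node in feed_dict:              # a fed value always wins
        return feed_dict[node]
    if isinstance(node, Placeholder):  # nothing fed: this is an error
        raise ValueError("no value fed for placeholder {!r}".format(node.name))
    return add_values(run(node.left, feed_dict), run(node.right, feed_dict))


x = Placeholder("x")
y = Placeholder("y")
z = Add(x, y)

print(run(z, {x: 3, y: 4.5}))
print(run(z, {x: [1, 3], y: [2, 4]}))

try:
    run(z, {x: 3})  # y is missing
except ValueError as error:
    print("error:", error)
```

In this sketch the only special thing about a placeholder is the error branch, which mirrors the statement above.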

## Datasets

Placeholders work for simple experiments, but `tf.data` datasets are the preferred method of streaming data into a model.

To get a runnable `tf.Tensor` from a `Dataset`, you first need to convert it to a `tf.data.Iterator` and then call the iterator's `get_next` method.

In [0]:
my_data = [
    [0, 1,],
    [2, 3,],
    [4, 5,],
    [6, 7,],
]
slices = tf.data.Dataset.from_tensor_slices(my_data)
next_item = slices.make_one_shot_iterator().get_next()

In [0]:


summary_writer.add_graph(tf.get_default_graph())
summary_writer.flush()

We can run `next_item` repeatedly until the end of the data is reached, at which point it raises `tf.errors.OutOfRangeError`:

In [12]:
while True:
  try:
    print(sess.run(next_item))
  except tf.errors.OutOfRangeError:
    break

[0 1]
[2 3]
[4 5]
[6 7]
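
This loop has the same structure as draining an ordinary Python iterator by hand. The sketch below is only an analogy: it uses the built-in `iter` and `next`, with `StopIteration` playing the role that `tf.errors.OutOfRangeError` plays above.

```python
my_data = [
    [0, 1],
    [2, 3],
    [4, 5],
    [6, 7],
]

rows = iter(my_data)  # analogous to creating a one-shot iterator
collected = []

while True:
    try:
        collected.append(next(rows))  # analogous to sess.run(next_item)
    except StopIteration:             # analogous to OutOfRangeError
        break

print(collected)
```

Like a one-shot iterator, `rows` is exhausted after the loop and cannot be rewound; a new iterator has to be created to read the data again.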


If the dataset depends on stateful operations, such as the random tensor below, we need to use an initializable iterator and run its initializer before reading from it:

In [13]:
r = tf.random_normal([10,3])
dataset = tf.data.Dataset.from_tensor_slices(r)
iterator = dataset.make_initializable_iterator()
next_row = iterator.get_next()

sess.run(iterator.initializer)
while True:
  try:
    print(sess.run(next_row))
  except tf.errors.OutOfRangeError:
    break

[-2.0293903   0.33965522 -1.0952375 ]
[-0.221313  -0.9814494 -1.80212  ]
[ 0.5486614  1.8664402 -1.2439016]
[-1.3527572   0.26240706 -1.1845875 ]
[ 0.78239065 -0.510512   -1.6641259 ]
[-0.90511626 -0.20295154  2.3680062 ]
[-0.14922859  0.31022593 -0.82432693]
[-0.48443475 -0.6864904   0.28561288]
[ 1.371163  -1.5768971  0.5658019]
[ 0.0979891 -2.3049946 -1.5383164]


## Layers

Layers are the preferred way to add trainable parameters to a computation graph. They package together both the variables and the operations that act on them. For example, a densely-connected layer performs a weighted sum across all inputs for each output and then applies an optional activation function. The connection weights and biases are managed by the layer object.

The following code creates a `tf.layers.Dense` layer that takes a batch of input vectors and produces a single output value for each of them:

In [0]:
x = tf.placeholder(tf.float32, shape=[None, 3])
linear_model = tf.layers.Dense(units=1)
y = linear_model(x)


summary_writer.add_graph(tf.get_default_graph())
summary_writer.flush()


Here, `None` means that the dimension is dynamic: the placeholder accepts tensors of shape $d \times 3$ for any batch size $d$.

Also, `units=1` means that the layer produces a single output value for each input vector.
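
The arithmetic of such a layer (one unit, no activation) can be written out by hand. This is a minimal sketch with made-up parameters: the function `dense` and the values of `weights` and `bias` are hypothetical, whereas a real `tf.layers.Dense` layer creates and manages its own variables.

```python
def dense(batch, weights, bias):
    """Apply a single-unit dense layer without activation to a batch.

    batch:   list of input vectors, shape d x 3 in this example
    weights: one weight per input feature
    bias:    a single float
    Returns a list of shape d x 1, one output per input vector.
    """
    return [
        [sum(x_i * w_i for x_i, w_i in zip(row, weights)) + bias]
        for row in batch
    ]


# Hypothetical parameter values chosen for illustration.
weights = [0.5, -1.0, 2.0]
bias = 0.25

print(dense([[1, 1, 1]], weights, bias))             # [[1.75]]
print(dense([[1, 2, 3], [4, 5, 6]], weights, bias))  # [[4.75], [9.25]]
```

The batch size only determines how many rows come out, which is why that dimension can be left as `None` in the placeholder.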

Layers contain internal variables that need to be initialized. While this can be done one by one, an easier way to do this is as follows:

In [0]:

init = tf.global_variables_initializer()
sess.run(init)

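**Important**: Calling `tf.global_variables_initializer` only creates and returns a handle to a TensorFlow operation. That op initializes all the global variables when we run it with `tf.Session.run`. It only covers variables that already existed in the graph when the initializer was created, so it should be one of the last things added during graph construction.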

In [16]:
sess.run(y, {x: [[1, 1, 1]]})

array([[-1.7966515]], dtype=float32)

## Layer shortcuts

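Each layer class (such as `tf.layers.Dense`) has a matching shortcut function (such as `tf.layers.dense`) that creates and runs the layer in a single call, so we can also write this more succinctly: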

In [17]:
x = tf.placeholder(tf.float32, shape=[None, 3])
y = tf.layers.dense(x, units=1)

init = tf.global_variables_initializer()
sess.run(init)

print(sess.run(y, {x: [[1, 2, 3], [4, 5, 6]]}))

[[1.5801585]
 [3.7080307]]


## Training

Let's first define some data: the inputs `x` and the expected output `y_true` for each input.

In [0]:
x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)
y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)


Now define the model, a linear model with a single output:

In [19]:
linear_model = tf.layers.Dense(units=1)

y_pred = linear_model(x)


sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

print(sess.run(y_pred))

[[0.32088196]
 [0.6417639 ]
 [0.9626459 ]
 [1.2835279 ]]


Let's define the loss. We will use the mean squared error (MSE), a standard loss for regression problems:

In [20]:
loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)

print(sess.run(loss))

7.481059
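
As a sanity check, the same number can be recomputed by hand from the predictions printed earlier. The sketch below copies `y_true` and the printed `y_pred` values into plain Python lists and averages the squared differences.

```python
# Values copied from the cells above.
y_true = [0.0, -1.0, -2.0, -3.0]
y_pred = [0.32088196, 0.6417639, 0.9626459, 1.2835279]

# Mean squared error: the average of the squared differences.
squared_errors = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
mse = sum(squared_errors) / len(squared_errors)

print(mse)  # close to the 7.481059 reported by the session
```

Small differences in the last digits are expected, since the printed predictions are rounded and TensorFlow computes in 32-bit floats.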


### Optimizer

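We use gradient descent with a learning rate of 0.01 to minimize the loss. The optimizer's `minimize` method returns an op that, each time it is run, adjusts every trainable variable slightly in the direction that reduces the loss.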
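
What a gradient descent step does for this model can be sketched by hand. The code below is a pure-Python illustration for the linear model $y = w x + b$ with the MSE loss, not the optimizer's actual implementation; the starting point `w = b = 0` is an assumption (the layer in this notebook initializes its weight randomly), so the loss values will not match the ones printed by TensorFlow.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.0, -1.0, -2.0, -3.0]

w, b = 0.0, 0.0       # assumed starting point, chosen for reproducibility
learning_rate = 0.01


def mse(w, b):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


losses = []
for step in range(100):
    # Prediction errors for the current parameters.
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # Gradients of the MSE with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    # Move against the gradient, scaled by the learning rate.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    losses.append(mse(w, b))

print("first loss: {:.3f}, last loss: {:.3f}".format(losses[0], losses[-1]))
```

The data satisfy $y = -x + 1$ exactly, so with enough steps `w` drifts toward -1 and `b` toward 1 while the loss shrinks toward zero.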

Since `train` is an op, not a tensor, it doesn't return a value when run. To see the progression of the loss during training, we run the loss tensor at the same time, producing output like the following:

In [22]:
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)


summary_writer.add_graph(tf.get_default_graph())

summary_writer.flush()

for i in range(100):
  _, loss_value = sess.run((train, loss))
  print("Loss function - iter {} : {:.3f}".format(i, loss_value))

Loss function - iter 0 : 0.155
Loss function - iter 1 : 0.154
Loss function - iter 2 : 0.153
Loss function - iter 3 : 0.152
Loss function - iter 4 : 0.151
Loss function - iter 5 : 0.150
Loss function - iter 6 : 0.149
Loss function - iter 7 : 0.148
Loss function - iter 8 : 0.147
Loss function - iter 9 : 0.147
Loss function - iter 10 : 0.146
Loss function - iter 11 : 0.145
Loss function - iter 12 : 0.144
Loss function - iter 13 : 0.143
Loss function - iter 14 : 0.142
Loss function - iter 15 : 0.141
Loss function - iter 16 : 0.141
Loss function - iter 17 : 0.140
Loss function - iter 18 : 0.139
Loss function - iter 19 : 0.138
Loss function - iter 20 : 0.137
Loss function - iter 21 : 0.136
Loss function - iter 22 : 0.136
Loss function - iter 23 : 0.135
Loss function - iter 24 : 0.134
Loss function - iter 25 : 0.133
Loss function - iter 26 : 0.132
Loss function - iter 27 : 0.132
Loss function - iter 28 : 0.131
Loss function - iter 29 : 0.130
Loss function - iter 30 : 0.129
Loss function - it