# Hackathon #1

Topics: 
- TF Graph
- Tensors
- Sessions
- Operations
- Placeholders
- Variables: initialization and assignment

Some material adapted from the TensorFlow documentation: https://www.tensorflow.org

This is all set up in an IPython notebook, so you can run any code you want to experiment with. Feel free to edit any cell, or add new ones to run your own code.

In [1]:
# We'll start with our library imports...
from __future__ import print_function

import tensorflow as tf  # to specify and run computation graphs

#### TF Graph

TensorFlow is a Python library for setting up computation graphs and running them. Computation graphs live in their own context, outside of the normal Python context. Thus, members of the graph can't be interacted with directly, but only through certain calls of the TensorFlow library. Today, we'll introduce the most basic tools you'll be using this semester to build and run highly-optimized computation graphs which run on both CPU and GPU.

<img src="https://imgur.com/8naUUs4.png" alt="TF Graph concept" width=400/>

A computational graph is a series of TensorFlow operations arranged into a graph. Each graph node corresponds to an operation that takes zero or more tensors as inputs and produces tensors as output. This is called the tensor-in tensor-out (TITO) model, and we think of tensors as the edges between node operations.

<img src="https://imgur.com/Sw1JNnC.png" alt="Example computation graph" width="250"/>

TensorFlow programs typically consist of two sections:
1. Building the computation graph
2. Running the computation graph

Both steps are written in Python using TensorFlow calls, but keeping them separate lets TensorFlow optimize the computation heavily enough to run contemporary deep learning algorithms.

#### Tensor

The basic unit of data in TensorFlow is the [Tensor](https://www.tensorflow.org/api_docs/python/tf/Tensor). A tensor consists of a set of primitive values (think `float` or `int`) shaped into an n-dimensional array. It represents an edge in the TensorFlow graph. We say that a tensor's _rank_ is its number of dimensions. Here are some examples of tensors:
```
its_complicated = tf.Variable(12.3 - 4.85j, dtype=tf.complex64)   # a rank 0 tensor; a scalar with shape []
first_primes = tf.Variable([2, 3, 5, 7, 11], dtype=tf.int32)      # a rank 1 tensor; a vector with shape [5]
myxor = tf.Variable([[False, True],[True, False]], dtype=tf.bool) # a rank 2 tensor; a matrix with shape [2, 2]
my_image = tf.zeros([10, 299, 299, 3])                            # a rank 4 tensor with shape [10, 299, 299, 3]
```
There are two ways of accessing the shape of a `tf.Tensor`. During graph construction, read the tensor's `shape` property (e.g., `my_image.shape`), which returns a [TensorShape](https://www.tensorflow.org/api_docs/python/tf/TensorShape) object; some dimensions may still be unknown at that point. At runtime, run the [tf.shape](https://www.tensorflow.org/api_docs/python/tf/shape) operation, which returns the fully resolved shape. Tensors can be reshaped with [tf.reshape](https://www.tensorflow.org/api_docs/python/tf/reshape).
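
As a hedged illustration of the two shape lookups, the sketch below builds a tensor whose first dimension is unknown until data is supplied. It relies on a placeholder and a session, both introduced later in this notebook. It assumes a TensorFlow install that ships the `tensorflow.compat.v1` module (1.15 or 2.x), and it works in its own `tf.Graph` so the notebook's default graph is left alone. The names `demo_graph`, `batch`, and `flat` are made up for this example.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.15 or 2.x, which provide compat.v1

demo_graph = tf.Graph()  # separate graph so the notebook's default graph is untouched
with demo_graph.as_default():
    # First dimension left unknown (None) until data is fed
    batch = tf.placeholder(tf.float32, shape=[None, 2, 3])
    static_shape = batch.shape         # TensorShape, available during graph construction
    dynamic_shape = tf.shape(batch)    # int32 Tensor, only has a value at run time
    flat = tf.reshape(batch, [-1, 6])  # -1 asks TensorFlow to infer that dimension

with tf.Session(graph=demo_graph) as demo_sess:
    runtime_shape, flat_value = demo_sess.run(
        [dynamic_shape, flat], feed_dict={batch: np.zeros([4, 2, 3])})

print(static_shape.as_list())  # [None, 2, 3]
print(runtime_shape)           # [4 2 3]
print(flat_value.shape)        # (4, 6)
```

The static shape reports `None` for the dimension that is not yet known, while `tf.shape` reports the concrete size once a batch of 4 has been fed.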

The simplest type of graph node is a [constant](https://www.tensorflow.org/api_docs/python/tf/constant):

In [2]:
m = tf.constant(3.0, dtype=tf.float32)
b = tf.constant(-4.0, name='b') # also tf.float32 implicitly
print(m, b)

Tensor("Const:0", shape=(), dtype=float32) Tensor("b:0", shape=(), dtype=float32)


Notice that printing the nodes does not output the values `3.0` and `-4.0` as you might expect. Instead, all information available statically about the `Tensor`s is printed. We can see that each `Tensor` has a name, a shape, and a data type.

#### TF Session

Now that there is something going on inside of the graph, we have to create a TensorFlow [Session](https://www.tensorflow.org/api_docs/python/tf/Session) to run it and get the results out. Then, we can use [Session.run](https://www.tensorflow.org/api_docs/python/tf/Session#run) or [tf.Tensor.eval](https://www.tensorflow.org/api_docs/python/tf/Tensor#eval) to get the value of a Tensor out of the graph.

In [3]:
sess = tf.Session()
print(sess.run([m, b])) # This prints what we expect
print("m using eval, passing session explicitly:", m.eval(session=sess)) # Another way of getting the value 
with sess.as_default():
    print("b using eval with session context: ", b.eval())         # This also works... 

[3.0, -4.0]
m using eval, passing session explicitly: 3.0
b using eval with session context:  -4.0


Each `Tensor` is uniquely identified by its name within the namespace of a `Graph`, so a `Tensor` can be retrieved by that name. If you declare the same name twice, TensorFlow automatically uniquifies the second name to avoid a collision. The Python object returned from a function call is just a handle (like a pointer) to the corresponding object in the TensorFlow graph.

In [4]:
b_clone = tf.constant(4.0, name='b')
with sess.as_default():
    print(b_clone.name) # notice that this has been uniquified

b_1:0
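
Looking a tensor up by its name can be sketched with `Graph.get_tensor_by_name`. This is a minimal example that assumes a TensorFlow install providing `tensorflow.compat.v1` (1.15 or 2.x). It uses a fresh graph, so the names start from a clean slate; `demo_graph`, `first`, and `second` are illustrative names only.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.15 or 2.x, which provide compat.v1

demo_graph = tf.Graph()  # fresh graph, so no earlier 'b' exists in it
with demo_graph.as_default():
    first = tf.constant(-4.0, name='b')
    second = tf.constant(4.0, name='b')  # same requested name; TensorFlow renames it

print(first.name, second.name)  # b:0 b_1:0

# Retrieve handles to the same tensors using only their unique names
same_as_first = demo_graph.get_tensor_by_name('b:0')
same_as_second = demo_graph.get_tensor_by_name('b_1:0')

with tf.Session(graph=demo_graph) as demo_sess:
    values = [float(v) for v in demo_sess.run([same_as_first, same_as_second])]
print(values)  # [-4.0, 4.0]
```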


#### Operations

Operations act as the nodes in the TF graph, and we never actually interact with them directly. Instead, when an operation is created using Python code, the tensor output of the operation is returned. We'll have to use [Session.run](https://www.tensorflow.org/api_docs/python/tf/Session#run) on the returned `Tensor` to run an operation and get the output.

In [5]:
# tf.add sums the two tensors provided
node3 = tf.add(m, b)
print("node3:", node3)
print("sess.run(node3):", sess.run(node3))

# list all operations
tf.get_default_graph().get_operations()

node3: Tensor("Add:0", shape=(), dtype=float32)
sess.run(node3): -1.0


[<tf.Operation 'Const' type=Const>,
 <tf.Operation 'b' type=Const>,
 <tf.Operation 'b_1' type=Const>,
 <tf.Operation 'Add' type=Add>]

Notice that when we print `node3`, it's a `Tensor` whose name ends with `:0`. The name of a `Tensor` usually ends with `:i` where `i` is an integer. This indicates that it's the `i`th output of the `Operation` of the same name. Internally, everything, including constants and Variables, is an operation with output Tensors.
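
To see an index other than `:0`, we need an operation with more than one output. The sketch below uses `tf.unique`, which returns two tensors from a single operation. It assumes a TensorFlow install providing `tensorflow.compat.v1` (1.15 or 2.x) and builds in a separate graph; the names `demo_graph`, `uniq`, `values`, and `positions` are chosen only for this illustration.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.15 or 2.x, which provide compat.v1

demo_graph = tf.Graph()  # separate graph so the notebook's default graph is untouched
with demo_graph.as_default():
    # tf.unique is one operation with two outputs: the distinct values and,
    # for each input element, its index into those distinct values
    values, positions = tf.unique(tf.constant([2, 2, 5, 7, 5]), name='uniq')

print(values.name, positions.name)        # uniq:0 uniq:1
print(values.op.name, positions.op.name)  # uniq uniq

with tf.Session(graph=demo_graph) as demo_sess:
    unique_values, index_map = demo_sess.run([values, positions])
print(unique_values, index_map)  # [2 5 7] [0 0 1 2 1]
```

Both tensors come from the same operation `uniq`; only the output index after the colon differs.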

#### Placeholders

A graph can be set up to accept external inputs using [tf.placeholder](https://www.tensorflow.org/api_docs/python/tf/placeholder).

In [6]:
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b  # '+' provides a shortcut for tf.add(a, b)
print("a:", a)
print("b:", b)
print("adder_node:", adder_node)

a: Tensor("Placeholder:0", dtype=float32)
b: Tensor("Placeholder_1:0", dtype=float32)
adder_node: Tensor("add_1:0", dtype=float32)


Many Python operators are overloaded by TensorFlow to be their pointwise equivalents using NumPy-style broadcasting (you can see the details [here](https://www.tensorflow.org/api_docs/python/tf/Tensor#methods), where everything with `__op__` form should be an overload). Much of the TensorFlow API closely resembles that of NumPy, including built-in features like [broadcasting](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html). This is important to understand in order to effectively use TensorFlow!

Now, the graph looks like this:
![img_2](https://www.tensorflow.org/images/getting_started_adder.png)

To run operations that depend directly or indirectly on placeholders, we must pass a `feed_dict` to the `Session.run` method. This is a dictionary whose keys are placeholders and whose values are the data to feed into them.

In [7]:
print(sess.run(adder_node, {a:3, b: 4.5}))
print(sess.run(adder_node, feed_dict={a: [1, 3], b: [2, 4]}))

7.5
[3. 7.]


Notice that, in the second `print`, we feed lists for the placeholders. Python lists and `numpy` arrays can both be converted into Tensors.
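
The broadcasting rules mentioned earlier decide which shapes can be combined elementwise. As a rough sketch, the example below shows them in plain NumPy, whose rules TensorFlow's overloaded operators are described above as following. The array names `column` and `row` are made up for this illustration.

```python
import numpy as np

column = np.array([[1.0], [2.0]])   # shape (2, 1)
row = np.array([10.0, 20.0, 30.0])  # shape (3,)

# The size-1 axis of `column` and the missing leading axis of `row` are
# stretched, so both operands behave as if they had shape (2, 3)
total = column + row
print(total.shape)  # (2, 3)
print(total)
# [[11. 21. 31.]
#  [12. 22. 32.]]

# A scalar broadcasts against an array of any shape
print(np.array([3.0, 7.0]) * 3.0)  # [ 9. 21.]
```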

We can make the computational graph more complex by adding another operation. For example:

In [8]:
add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b: 4.5}))

22.5


The graph now looks like this:
![img_3](https://www.tensorflow.org/images/getting_started_triple.png)

## Hackathon 1 Exercise 1

Write code to evaluate the function `f(x, y) = 7xy^2*cos(3x) + sqrt(5)*xy + exp(2y)` in the cell below using placeholders for `x` and `y`, and evaluate the function in a session. The output should be the result of the function for your choice of values of `x` and `y`. To get points for this hackathon, make sure you fill this out, save and download the notebook, and submit it on Handin. (If you're having trouble, look at the [TF documentation](https://www.tensorflow.org/api_docs/python/tf))

In [9]:
# Create placeholders for x and y; their values are fed in at run time
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

# First calculate term 1

# First multiply 7 and x
x7 = tf.multiply(7.,x,name='x7')
# Calculate y squared
yy = tf.square(y,name='yy')
# Multiply 7x by y^2
x7yy = tf.multiply(x7,yy,name='x7yy')
# Calculate x times 3
x3 = tf.multiply(x,3.,name='x3')
# Calculate cosine of x3
cosX3 = tf.cos(x3,name='cosX3')
# Calculate full term 1
term1 = tf.multiply(x7yy,cosX3,name='term1')

# Now calculate term 2

# Calculate square root of 5
sqrt5 = tf.sqrt(5.,name='sqrt5')
# Calculate x times y
xy = tf.multiply(x,y,name='xy')
# Calculate full term 2
term2 = tf.multiply(sqrt5,xy,name='term2')

# Now calculate term 3

# Calculate 2 times y
y2 = tf.multiply(y,2.,name='y2')
# Now calculate full term 3
term3 = tf.exp(y2,name='term3')

# Now add terms 1 and 2
term1_2 = tf.add(term1,term2,name='term1_2')
# Now add term 3
term1_2_3 = tf.add(term1_2,term3,name='term1_2_3')

# Now run this code within a session
print(sess.run(term1_2_3, { x:2. , y:7. }))

1203294.2
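
One way to sanity-check the graph's answer is to evaluate the same formula in plain Python. The sketch below uses only the standard `math` module; the helper name `f_reference` is hypothetical and is not part of the exercise.

```python
import math

def f_reference(x, y):
    """Plain-Python version of f(x, y) = 7xy^2*cos(3x) + sqrt(5)*xy + exp(2y)."""
    term1 = 7 * x * y ** 2 * math.cos(3 * x)
    term2 = math.sqrt(5) * x * y
    term3 = math.exp(2 * y)
    return term1 + term2 + term3

expected = f_reference(2.0, 7.0)
print(expected)

# The graph computes in float32 (about 7 significant digits), so compare
# against the printed result with a relative tolerance, not exact equality
print(math.isclose(expected, 1203294.2, rel_tol=1e-6))  # True
```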


#### Variables and initialization

Let's imagine now that we have a dataset of `(input, output)` pairs to which we want to fit a linear function.

To make the model store the parameters of the linear function, we need to be able to modify the graph to get new outputs with the same input. [Variables](https://www.tensorflow.org/api_docs/python/tf/Variable) allow us to add trainable parameters to a graph. They are constructed with a type and initial value and must be explicitly initialized.

High-level `Variable` how-to: https://www.tensorflow.org/programmers_guide/variables

In [10]:
W = tf.Variable([.3], dtype=tf.float32)  # slope
b = tf.Variable([-.2], dtype=tf.float32) # y-intercept
x = tf.placeholder(tf.float32)           # input placeholder
linear_model = W*x + b                   # define output
print("W:", W)
print("b:", b)
print("linear_model:", linear_model)

W: <tf.Variable 'Variable:0' shape=(1,) dtype=float32_ref>
b: <tf.Variable 'Variable_1:0' shape=(1,) dtype=float32_ref>
linear_model: Tensor("add_2:0", dtype=float32)


Constants are initialized when you call `tf.constant`, and their value can never change.
By contrast, variables are not initialized when you call `tf.Variable`.
To initialize all the variables in a TensorFlow program, we can explicitly call a special operation, [tf.global_variables_initializer](https://www.tensorflow.org/api_docs/python/tf/global_variables_initializer).

In [11]:
init = tf.global_variables_initializer()
sess.run(init)
# Now we can do things with the values of the Variables we defined
print(sess.run([W, b]))

[array([0.3], dtype=float32), array([-0.2], dtype=float32)]


Since `x` is a placeholder, we can evaluate `linear_model` for several values of `x` simultaneously by passing a list as input, which yields an array with one output per input.

In [12]:
print(sess.run(linear_model, {x: [1, 2, 3, 4]}))

[0.10000001 0.40000004 0.70000005 1.        ]


We've created a linear model to fit the data, but we don't yet know how well it performs. To evaluate the model on training data, we need a `y` placeholder to input the correct output values, and we need to write a loss function.

A loss function measures how far the current estimated output is from the correct output and can be used as a signal to update `Variable` values to get a better fitting model. We'll use a standard loss function for linear regression, squared loss, which sums the squares of the differences between the model output and the correct output, given an input.

`linear_model - y` creates a vector where each element is the corresponding example's error delta. We then call `tf.square` to square that error, and finally we sum all the squared errors to create a single scalar that combines the error of all examples using `tf.reduce_sum`. This gives an aggregate measure of how far the model is from fitting the function.

In [13]:
y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))

25.260002
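
The loss value can be reproduced step by step outside of TensorFlow. The sketch below redoes the arithmetic in NumPy with the same initial parameters (`0.3` and `-0.2`) and the same data. The names `w_val`, `b_val`, `xs`, and `ys` are chosen here to avoid clobbering the notebook's `W`, `b`, `x`, and `y`.

```python
import numpy as np

w_val, b_val = np.float32(0.3), np.float32(-0.2)  # same initial values as W and b
xs = np.array([1, 2, 3, 4], dtype=np.float32)
ys = np.array([0, -1, -2, -3], dtype=np.float32)

predictions = w_val * xs + b_val          # approximately [0.1, 0.4, 0.7, 1.0]
deltas = predictions - ys                 # approximately [0.1, 1.4, 2.7, 4.0]
squared = np.square(deltas)               # approximately [0.01, 1.96, 7.29, 16.0]
manual_loss = np.sum(squared)             # 0.01 + 1.96 + 7.29 + 16.0

print(manual_loss)  # approximately 25.26
```

The trailing digits can differ slightly from TensorFlow's output because of float32 rounding.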


We can improve the fit manually by re-assigning the values of `W` and `b` to the perfect values of `-1` and `1` respectively.
A variable is initialized to the value provided to `tf.Variable`, but can be changed using operations like [tf.assign](https://www.tensorflow.org/api_docs/python/tf/assign).

In [14]:
fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])
sess.run([fixW, fixb])
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))

0.0


Typically, rather than providing static values as the second argument to `tf.assign`, a placeholder is created to allow updating repeatedly as needed.
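
A minimal sketch of that pattern follows. It assumes a TensorFlow install providing `tensorflow.compat.v1` (1.15 or 2.x) and builds in a separate graph; the names `weight`, `new_weight`, and `update_weight` are illustrative and not part of the notebook.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.15 or 2.x, which provide compat.v1

demo_graph = tf.Graph()  # separate graph so the notebook's default graph is untouched
with demo_graph.as_default():
    weight = tf.Variable([0.3], dtype=tf.float32)
    new_weight = tf.placeholder(tf.float32, shape=[1])
    # A single assign operation, created once and reused for every update
    update_weight = tf.assign(weight, new_weight)
    init = tf.global_variables_initializer()

history = []
with tf.Session(graph=demo_graph) as demo_sess:
    demo_sess.run(init)
    for candidate in [-0.5, -1.0]:
        # Feed a different value each time; no new nodes are added to the graph
        demo_sess.run(update_weight, feed_dict={new_weight: [candidate]})
        history.append(float(demo_sess.run(weight)[0]))

print(history)  # [-0.5, -1.0]
```

Calling `tf.assign` with a fresh constant for every update would add a new operation to the graph each time, whereas the placeholder version keeps the graph fixed and only changes the fed data.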

We guessed the "perfect" values of `W` and `b`, but the whole point of machine learning is to find the correct model parameters automatically. We will show how to accomplish this in the next Hackathon.