# TensorFlow
Woo, it's TensorFlow again. Let's build some graphs!

## Programming Model
TensorFlow expresses a numeric computation as a graph. Each node is an operation with any number of inputs and outputs. Each edge is a tensor that flows between nodes. Suppose we have a simple network expressed by the following operation.

$$
h = \text{ReLU}(Wx + b)
$$

We can interpret it as three sequential operations: a matrix multiplication, an addition, and a ReLU.
![tf-graph](./assets/tf-graph.png)
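
To make the three steps concrete, here is a minimal plain-Python sketch (no TensorFlow involved). The helper names `matmul`, `add`, and `relu` and the small values of `W`, `x`, and `b` are made up for illustration.

```python
# Plain-Python sketch of h = ReLU(Wx + b) as three sequential steps.
# W, x, and b are made-up illustrative values, not from any real model.

def matmul(W, x):
    """Matrix-vector product: one output per row of W."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def add(u, b):
    """Element-wise vector addition."""
    return [u_i + b_i for u_i, b_i in zip(u, b)]

def relu(z):
    """Element-wise max(0, z)."""
    return [max(0.0, z_i) for z_i in z]

W = [[1.0, -2.0], [0.5, 0.5]]
x = [3.0, 1.0]
b = [-2.0, 1.0]

u = matmul(W, x)  # step 1: MatMul
z = add(u, b)     # step 2: Add
h = relu(z)       # step 3: ReLU
print(u, z, h)
```

Each intermediate result (`u`, `z`) plays the role of a tensor flowing along an edge of the graph.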

## Node Types
New nodes are automatically added to the underlying graph. We can inspect which nodes our graph contains by calling `get_operations` on the default graph. As we will see, there are multiple types of node.
```python
import tensorflow as tf  # TensorFlow 1.x API

tf.get_default_graph().get_operations()
```

### Variables
Variables are stateful nodes which output their current value. Their state is retained across multiple executions of a graph. We can think of variables as the parameters we wish to tune: training updates variables, not placeholders.
```python
b = tf.Variable(tf.zeros((100,)))
W = tf.Variable(tf.random_uniform((784, 100), -1, 1))
```

### Placeholder
Placeholders are nodes whose value is fed in at execution time.
```python
x = tf.placeholder(tf.float32, (100, 784))
```

### Mathematical operation
Mathematical operations are nodes such as `MatMul`, `Add`, and `Relu`; the list goes on.
```python
h = tf.nn.relu(tf.matmul(x, W) + b)
```

## Session
Once a graph is defined, we can deploy it with a session, which is a binding to a particular execution context. The `run` method has the following signature.
```python
sess.run(fetches, feed_dict=None)
```

### Arguments
There are two arguments we usually supply to `run`.
* Fetches - A graph node, or a list of graph nodes, to evaluate. `run` returns the outputs of these nodes.
* Feeds - A dictionary (the `feed_dict` argument) mapping graph nodes to concrete values. It specifies the value of each placeholder.

```python
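import numpy as np

sess = tf.Session()
sess.run(tf.global_variables_initializer())  # replaces the deprecated initialize_all_variables
sess.run(h, {x: np.random.random((100, 784))})  # the shape must be passed as a tuple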
```
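
The fetch/feed idea can be sketched with a toy graph evaluator in plain Python. This is only a mental model and not how TensorFlow is implemented; `Node`, `placeholder`, `constant`, and `run` are hypothetical names invented for this sketch.

```python
# Toy model of deferred execution; NOT how TensorFlow is implemented.
# Node, placeholder, constant, and run are hypothetical names.

class Node:
    def __init__(self, op=None, inputs=(), value=None):
        self.op = op          # function computing this node's output, or None
        self.inputs = inputs  # upstream nodes (incoming edges)
        self.value = value    # stored value for constant nodes

def placeholder():
    return Node()

def constant(value):
    return Node(value=value)

def run(fetch, feeds=None):
    """Evaluate `fetch`, taking placeholder values from the `feeds` dict."""
    feeds = feeds or {}
    if fetch in feeds:
        return feeds[fetch]
    if fetch.op is None:
        if fetch.value is None:
            raise ValueError("placeholder was not fed")
        return fetch.value
    args = [run(node, feeds) for node in fetch.inputs]
    return fetch.op(*args)

# Build a scalar version of relu(x * w + b); nothing is computed yet.
x = placeholder()
w = constant(2.0)
b = constant(-5.0)
product = Node(op=lambda a, c: a * c, inputs=(x, w))
total = Node(op=lambda a, c: a + c, inputs=(product, b))
h = Node(op=lambda a: max(0.0, a), inputs=(total,))

print(run(h, {x: 4.0}))  # the computation happens only here
```

Building the graph and running it are separate steps: the same `h` can be evaluated again with a different feed.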

### Losses
Since labels are not parameters we are going to tune, we use a placeholder for them. The loss itself is just another mathematical operation node.
```python
prediction = tf.nn.softmax(...) # Output of a neural network
label = tf.placeholder(tf.float32, [100, 10])
cross_entropy = -tf.reduce_sum(label * tf.log(prediction), axis=1)
```
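
As a worked example of the same formula, here is a plain-Python sketch of the per-example cross entropy. The labels and predictions are made-up values, and `cross_entropy` here is a hypothetical helper, not a TensorFlow function.

```python
import math

# Made-up batch of two examples over three classes.
labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]             # one-hot labels
predictions = [[0.2, 0.7, 0.1], [0.5, 0.25, 0.25]]      # softmax-like outputs

def cross_entropy(label, prediction):
    """-sum(label * log(prediction)) over the class axis."""
    return -sum(l * math.log(p) for l, p in zip(label, prediction))

# One loss value per example, mirroring the sum over axis=1.
losses = [cross_entropy(l, p) for l, p in zip(labels, predictions)]
print(losses)
```

With one-hot labels, only the log-probability of the correct class survives the sum, so each loss is `-log(p_correct)`.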

### Optimizer
It's time to compute gradients! We first need to define an optimizer object. This should be a familiar concept to me, since I wrote many different types of optimizer in CS231n.
```python
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
```

Each graph node has an attached gradient operation, which computes the local gradient and combines it with the upstream gradient of the loss. To use these gradients to update our parameters, we simply run the training step in a session.
```python
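sess = tf.Session()
sess.run(tf.global_variables_initializer())  # replaces the deprecated initialize_all_variables
for i in range(1000):
    # `data` is a dataset object defined elsewhere that yields mini-batches
    batch_x, batch_y = data.next_batch()
    sess.run(train_step, feed_dict={x: batch_x, label: batch_y})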
```
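
To see how a local gradient combines with the upstream gradient, here is a hand-written scalar sketch in plain Python. The squared-error loss, the values, and the learning rate are assumptions chosen for illustration; this is not what the optimizer above literally executes.

```python
# Forward pass: z = w*x + b, h = relu(z), loss = (h - y)^2
w, x, b, y = 2.0, 3.0, -1.0, 4.0
z = w * x + b
h = max(0.0, z)
loss = (h - y) ** 2

# Backward pass: each step multiplies its local gradient by the upstream one.
dloss_dh = 2.0 * (h - y)                       # local gradient of the loss
dloss_dz = dloss_dh * (1.0 if z > 0 else 0.0)  # upstream * ReLU local gradient
dloss_dw = dloss_dz * x                        # upstream * d(z)/d(w)
dloss_db = dloss_dz * 1.0                      # upstream * d(z)/d(b)

# One gradient descent step on the parameters.
learning_rate = 0.05
new_w = w - learning_rate * dloss_dw
new_b = b - learning_rate * dloss_db
new_loss = (max(0.0, new_w * x + new_b) - y) ** 2
print(loss, new_loss)
```

The update moves `w` and `b` against their gradients, so the loss after the step is smaller than before.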

### Shared Variables
What if we run multiple sessions on a cluster of machines but still train a single model, i.e. the same set of variables is used across multiple machines/sessions? We can share variables by name with `variable_scope`!
```python
with tf.variable_scope('foo'):
    v = tf.get_variable('v', shape=[1])  # v.name == "foo/v:0"

with tf.variable_scope('foo', reuse=True):
    v1 = tf.get_variable('v')  # shared variable found!

with tf.variable_scope('foo', reuse=False):
    v1 = tf.get_variable('v')  # raises ValueError: foo/v already exists
```

`variable_scope()` provides simple name-spacing to avoid clashes. `get_variable()` creates or accesses variables from within a variable scope.
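
The create-or-reuse behaviour can be mimicked with a small dictionary-backed store. This is a toy model only, not TensorFlow's implementation; `VariableStore` and its `get_variable` method are hypothetical names.

```python
# Toy model of name-based variable sharing; NOT TensorFlow's implementation.
# VariableStore and its get_variable method are hypothetical names.

class VariableStore:
    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, reuse, initial=0.0):
        full_name = f"{scope}/{name}"
        exists = full_name in self._vars
        if reuse and not exists:
            raise ValueError(f"{full_name} does not exist")
        if not reuse and exists:
            raise ValueError(f"{full_name} already exists")
        if not exists:
            # A one-element list stands in for mutable variable storage.
            self._vars[full_name] = [initial]
        return self._vars[full_name]

store = VariableStore()
v = store.get_variable("foo", "v", reuse=False)   # creates foo/v
v1 = store.get_variable("foo", "v", reuse=True)   # returns the same storage
v1[0] = 3.0                                       # visible through v as well
print(v is v1, v[0])
```

Asking for `foo/v` again with `reuse=False` raises `ValueError` in this sketch, matching the last case in the example above.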
  