# TensorFlow Tutorial and Intro

This tutorial follows Lecture 7 of CS224n, taught at Stanford University.

Compiled by **Aditya Thakkar**

**Variables** - stateful nodes that output their current value. Their state is retained across multiple executions of a graph, which makes it easy to save values and restore them into variables. Gradient descent updates these to minimize the loss.

**Nodes** - act as operations in a graph 

**Placeholders** - nodes whose values are fed in at execution time (for example, the inputs and labels supplied during training). Each placeholder is declared with a data type and a tensor shape.

**Mathematical Operations** - e.g. Add, MatMul, ReLU - act as nodes

In [None]:
import tensorflow as tf
import numpy as np

b = tf.Variable(tf.zeros((100,)))
W = tf.Variable(tf.random_uniform((784, 100), -1, 1))

x = tf.placeholder(tf.float32, (100, 784))

h = tf.nn.relu(tf.matmul(x,W) + b)

# In words:
# Create the weights W and bias b, including their initialization
# W: random uniform init from -1 to 1; b: zeros
# Create input placeholder x -> m x 784 input matrix (here m = 100)
# Build the flow graph
# h = ReLU(xW + b)
# h is a handle to the output of the ReLU node; nothing is computed yet
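
To make the shapes in this graph concrete, here is a minimal NumPy sketch of the same arithmetic, evaluated eagerly rather than through a graph. It is an illustration only, not how TensorFlow executes the graph; the fixed seed and the `rng` name are assumptions introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Same shapes as the graph above
b = np.zeros((100,))                          # bias, one entry per hidden unit
W = rng.uniform(-1.0, 1.0, size=(784, 100))   # weights, uniform in [-1, 1)
x = rng.random((100, 784))                    # a batch of 100 inputs with 784 features

# (100, 784) @ (784, 100) -> (100, 100); b broadcasts across the rows
h = np.maximum(x @ W + b, 0.0)                # ReLU clips negatives to zero

print(h.shape)
```

The product has one row per input example and one column per hidden unit, which is why `b` needs exactly 100 entries.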

Above, we just defined a **graph**. We deploy our graphs with **sessions**.

**Session** - a binding to a particular execution context, or an execution environment

**Fetches** - List of graph nodes. Returns the outputs of these nodes.

**Feeds** - Dictionary mapping from graph nodes to concrete values. Specifies the value of each graph node given in the dictionary.
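
The relationship between fetches and feeds can be sketched with a toy evaluator in plain Python. This is not TensorFlow's implementation; the `Node` class and the `run` function are hypothetical names made up for this illustration. Building the graph only records the operations, and values are computed only when `run` is called with a feed dictionary.

```python
class Node:
    """A graph node: an operation plus the nodes it takes as input."""
    def __init__(self, op=None, inputs=()):
        self.op = op            # callable combining input values; None marks a placeholder
        self.inputs = inputs    # upstream nodes

def run(fetches, feeds):
    """Evaluate each fetched node, reading placeholder values from `feeds`."""
    def evaluate(node):
        if node in feeds:                       # value supplied at execution time
            return feeds[node]
        if node.op is None:
            raise ValueError("placeholder was not fed")
        return node.op(*(evaluate(n) for n in node.inputs))
    return [evaluate(node) for node in fetches]

# Build the graph: nothing is computed here
x = Node()                                      # placeholder
y = Node()                                      # placeholder
total = Node(lambda a, b: a + b, (x, y))        # plays the role of Add
doubled = Node(lambda a: 2 * a, (total,))       # multiplies its input by 2

# Execute: fetch two nodes, feeding concrete values for the placeholders
results = run([total, doubled], {x: 3, y: 4})
print(results)  # [7, 14]
```

Fetching a node whose placeholders are missing from the feed dictionary raises an error in this sketch, which mirrors the idea that every placeholder a fetch depends on must be fed.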

In [None]:
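# Create a session, initialize all variables, then fill in placeholders
sess = tf.Session()  # Binds to a default device such as the CPU

# General form: sess.run(fetches, feeds)

# Lazy evaluation -> nothing is computed until sess.run is called
# (later TF 1.x releases rename this to tf.global_variables_initializer())
sess.run(tf.initialize_all_variables())

# Run the node we're interested in, feeding a concrete value for x
# (np.random.random takes the shape as a single tuple)
sess.run(h, {x: np.random.random((100, 784))})
# x is the placeholder that receives the input values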

# How to Define Loss
- Use placeholders for labels
- Build loss node using labels and predictions

In [None]:
prediction = tf.nn.softmax(...) # End of feed forward stage of NN
label = tf.placeholder(tf.float32, [100,10])

cross_entropy = -tf.reduce_sum(label * tf.log(prediction), axis=1)
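
As a worked example of this formula, the sketch below evaluates the same per-example cross-entropy in NumPy for two hand-picked rows. The numbers are made up for illustration, and three classes are used instead of ten to keep the arrays readable.

```python
import numpy as np

# Two examples, three classes; each prediction row sums to 1
prediction = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.1, 0.8]])
label = np.array([[1.0, 0.0, 0.0],     # one-hot: true class is 0
                  [0.0, 1.0, 0.0]])    # one-hot: true class is 1

# Summing over axis 1 leaves one loss value per example
cross_entropy = -np.sum(label * np.log(prediction), axis=1)

# With one-hot labels this reduces to -log(probability of the true class)
expected = -np.log(np.array([0.7, 0.1]))
print(np.allclose(cross_entropy, expected))  # True
```

The second example is penalized more heavily because the model assigned only 0.1 to the true class. Note that the logarithm of an exact zero is `-inf`, so practical code often clips the predictions or works from logits; that is a general numerical caveat rather than something these notes state.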

# Gradient Computation
- tf.train.GradientDescentOptimizer is an optimizer object
- tf.train.GradientDescentOptimizer(lr).minimize(cross_entropy) adds an **optimization operation** to the computation graph

TensorFlow graph **nodes** have attached gradient operations, so the gradient of the loss with respect to the **parameters** is computed automatically with backpropagation.
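
To illustrate what such a gradient looks like, the sketch below compares the well-known analytic gradient of softmax cross-entropy with respect to the logits, `softmax(z) - y`, against a finite-difference estimate. It uses NumPy rather than TensorFlow, and the helper names `softmax` and `loss` are assumptions for this example; TensorFlow derives such gradients itself from the operations in the graph.

```python
import numpy as np

def softmax(z):
    """Softmax of a 1-D vector, shifted by max(z) for numerical stability."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def loss(z, y):
    """Cross-entropy between one-hot label y and softmax(z)."""
    return -np.sum(y * np.log(softmax(z)))

z = np.array([2.0, -1.0, 0.5])   # logits for one example
y = np.array([0.0, 0.0, 1.0])    # one-hot label

analytic = softmax(z) - y        # closed-form gradient of the loss w.r.t. z

# Central finite differences as an independent check
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(len(z)):
    step = np.zeros_like(z)
    step[i] = eps
    numeric[i] = (loss(z + step, y) - loss(z - step, y)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True
```

A gradient descent step then moves each parameter a small distance against its gradient, scaled by the learning rate.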

In [None]:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In [None]:
# Creating the train_step op

prediction = tf.nn.softmax(...)
label = tf.placeholder(tf.float32, [None,10])

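# Mean over the batch of the per-example cross-entropy
# (axis replaces the older reduction_indices argument name)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(label * tf.log(prediction),
                                              axis=1))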

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In [None]:
# Training the model

sess = tf.Session() 
sess.run(tf.initialize_all_variables())

# batch_x and batch_label can be numpy arrays
# (data is assumed to be a dataset object that provides next_batch())
for i in range(1000):
    batch_x, batch_label = data.next_batch()
    sess.run(train_step, feed_dict={x: batch_x, 
                                   label: batch_label})
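
For readers without a TensorFlow 1.x installation, the same training loop can be approximated in plain NumPy. The sketch below trains a single softmax layer on synthetic data with hand-written gradients. The data, the `next_batch` helper, the layer sizes, and the small constant added inside the logarithm are all assumptions made for illustration and are not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, linearly separable data: 500 examples, 4 features, 3 classes
true_W = rng.normal(size=(4, 3))
data_x = rng.normal(size=(500, 4))
data_label = np.eye(3)[np.argmax(data_x @ true_W, axis=1)]   # one-hot labels

def next_batch(batch_size=100):
    """Hypothetical stand-in for data.next_batch(): sample a random minibatch."""
    idx = rng.integers(0, len(data_x), size=batch_size)
    return data_x[idx], data_label[idx]

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_cross_entropy(W, b, x, label):
    """Batch mean of the per-example cross-entropy."""
    prediction = softmax(x @ W + b)
    return np.mean(-np.sum(label * np.log(prediction + 1e-12), axis=1))

W = np.zeros((4, 3))
b = np.zeros(3)
lr = 0.5                                                     # same learning rate as above
initial_loss = mean_cross_entropy(W, b, data_x, data_label)

for i in range(1000):
    batch_x, batch_label = next_batch()
    prediction = softmax(batch_x @ W + b)
    grad_logits = (prediction - batch_label) / len(batch_x)  # d(mean loss)/d(logits)
    W -= lr * (batch_x.T @ grad_logits)                      # gradient descent update
    b -= lr * grad_logits.sum(axis=0)

final_loss = mean_cross_entropy(W, b, data_x, data_label)
print(final_loss < initial_loss)
```

Each pass through the loop plays the role of one `sess.run(train_step, ...)` call: compute predictions for the batch, compute gradients of the mean loss, and update the parameters in place.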