# TensorFlow tutorial

In [3]:
import tensorflow as tf
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

A computational graph is a series of TensorFlow operations arranged as a graph of nodes.  
A node takes zero or more tensors as inputs, which are arrays of rank n 
(arrays that have n dimensions), and produces tensors as outputs.

In [27]:
node0 = tf.constant(4)
node1 = tf.constant(3.0)
node2 = tf.constant(4, tf.float32)
print(node0)
print(node1)
print(node2)

Tensor("Const_33:0", shape=(), dtype=int32)
Tensor("Const_34:0", shape=(), dtype=float32)
Tensor("Const_35:0", shape=(), dtype=float32)


Nodes are evaluated in a **session**, an object whose run method executes the computational graph (the nodes, i.e. the series of TensorFlow operations).

In [29]:
sess = tf.Session()
sess.run([node1, node2])

[3.0, 4.0]

In [34]:
node3 = tf.add(node1, node2)
sess.run(node3)

7.0

Parametrize a graph by adding **placeholders**: inputs whose values are not given until the session runs the graph.  The documentation calls them parameters, but I like to think of them as variables that take external inputs: a promise to provide a value later, through the feed dictionary passed to sess.run.

In [48]:
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
add_node = a+b # a+b is a shortcut for tf.add(a,b)
add_and_triple = add_node * 3

print(sess.run(add_node, {a: 2, b: 4})) #add 2 and 4
print(sess.run(add_node, {a: [1,2], b: [4,5]}))  #[1+4,2+5]
print(sess.run(add_and_triple, {a:2,b:4}))

6.0
[ 5.  7.]
18.0
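The build-first, run-later pattern above can be mimicked in plain Python. This is only a toy sketch to show the idea of deferred evaluation with a feed dictionary; the class names (`Constant`, `Placeholder`, `Add`, `Multiply`) and the `run` helper are made up for illustration and are not how TensorFlow is implemented.

```python
# Toy deferred-evaluation graph (illustration only, not TensorFlow internals).
class Constant:
    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value


class Placeholder:
    def evaluate(self, feed):
        # The value is looked up only when the graph is run.
        return feed[self]


class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)


class Multiply:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)


def run(node, feed=None):
    """Evaluate a node, supplying placeholder values through feed."""
    return node.evaluate(feed or {})


# Building the graph computes nothing yet.
a = Placeholder()
b = Placeholder()
add_node = Add(a, b)
add_and_triple = Multiply(add_node, Constant(3))

# Values are supplied only at run time.
result_sum = run(add_node, {a: 2, b: 4})
result_triple = run(add_and_triple, {a: 2, b: 4})
print(result_sum, result_triple)
```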


**Variables** are the trainable parameters of our model, whose values we wish to optimize using the training data (aka parameters). <br> **Placeholders** hold the input data: the predictors/columns in our model.<br>
The initial value and type of a variable must be given. <br>
Constants are initialized with their given values when you create them, and they never change. <br>
Variables are not initialized when you create them; they only receive their initial values when you run the special initializer operation shown below.



In [65]:
W=tf.Variable([.3],dtype=tf.float32) #Variables are the parameters to optimize (the 2nd positional argument is trainable, so pass dtype by keyword)
b=tf.Variable([-.3],dtype=tf.float32) 
x=tf.placeholder(tf.float32) #Placeholders receive the input data
linear_model=W*x+b

init = tf.global_variables_initializer() #initializer object 
sess.run(init) #run the session with the initializer object to initialize all global variables

print(sess.run(linear_model,{x:[1,2,3,4]}))

[ 0.          0.30000001  0.60000002  0.90000004]


A **loss function** is an error metric used to evaluate the model's predictions, like the sum of squared errors sum((y-yhat)^2).  <br>Let's evaluate this model on training data.

In [71]:
y=tf.placeholder(tf.float32)
squared_deltas=tf.square(y-linear_model) #creates vector of squared error deltas (y-yhat)
loss=tf.reduce_sum(squared_deltas) #sums tensor elements (squared error deltas)
sess.run(loss, {x:[1,2,3,4],y:[0,-1,-2,-3]}) #run session which evaluates loss func of model

23.66
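As a sanity check, the same loss can be reproduced in plain Python from the initial parameter values (W = 0.3, b = -0.3) and the data fed above. The variable names here are chosen for illustration only.

```python
# Plain-Python recomputation of the sum of squared errors (no TensorFlow needed).
W, b = 0.3, -0.3
x_data = [1, 2, 3, 4]
y_data = [0, -1, -2, -3]

# Model predictions yhat = W*x + b
predictions = [W * x + b for x in x_data]

# Squared differences (y - yhat)^2, then their sum
squared_deltas = [(y - p) ** 2 for y, p in zip(y_data, predictions)]
loss_value = sum(squared_deltas)

print(round(loss_value, 2))
```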

Change the values of variables after they have been initialized by doing the following:

In [77]:
fixW = tf.assign(W,[-1.]) #re-assign parameter value W to -1
fixb = tf.assign(b,[1.])
sess.run([fixW,fixb]) #take re-assignment objects and evaluate them using run method
sess.run(loss, {x:[1,2,3,4],y:[0,-1,-2,-3]}) #evaluate the loss function

[array([-1.], dtype=float32), array([ 1.], dtype=float32)]

0.0

TensorFlow has **optimizers** which slowly change the value of each variable in order to minimize the loss function.  The simplest optimizer is **gradient descent**, which moves each variable against the derivative of the loss with respect to that variable, by an amount proportional to the derivative's magnitude. <br>

The **minimize()** method of the optimizer performs two steps: <br>
1) It runs compute_gradients(), which returns a list of (gradient, variable) pairs where each gradient is the partial derivative of the loss with respect to that variable <br>
2) It runs apply_gradients() to apply the gradients to the variables.
For plain gradient descent the update is coefficient = coefficient - (learning rate * gradient).  So the coefficients are revised from their old values on each iteration (each time the session runs the training operation with the training data).
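The update rule coefficient = coefficient - (learning rate * gradient) can be written out by hand for this linear model. The sketch below derives the two partial derivatives of the sum-of-squares loss manually and uses the same data, starting values, and learning rate as this notebook; it is a plain-Python illustration, not what the TensorFlow optimizer does internally.

```python
# Hand-written gradient descent for loss = sum((W*x + b - y)^2).
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

W, b = 0.3, -0.3          # same starting values as the notebook
learning_rate = 0.01

for _ in range(1000):
    # Prediction errors (yhat - y) for every sample
    errors = [(W * x + b) - y for x, y in zip(x_train, y_train)]
    # Partial derivatives of the loss with respect to W and b
    grad_W = sum(2 * e * x for e, x in zip(errors, x_train))
    grad_b = sum(2 * e for e in errors)
    # coefficient = coefficient - learning_rate * gradient
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(W, b)
```

After 1000 steps W and b should land close to -1 and 1, the values that fit this data exactly.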

In [89]:
optimizer = tf.train.GradientDescentOptimizer(.01) #select optimizer with learning rate = .01
train = optimizer.minimize(loss) 
#training operation: each run takes one gradient descent step on the specified loss func 
sess.run(init) #reset variables to their initial (non-optimal) values
for i in range(0,1000):
    sess.run(train,{x:[1,2,3,4], y:[0,-1,-2,-3]})
print(sess.run([W,b]))

[array([-0.9999969], dtype=float32), array([ 0.99999082], dtype=float32)]


## Gradient Descent Algorithm for Simple Linear Regression

In [91]:
sess = tf.Session()

#parameters
W=tf.Variable([.3],dtype=tf.float32)
b=tf.Variable([-.3],dtype=tf.float32)

#placeholders for the input data
x=tf.placeholder(tf.float32)
y=tf.placeholder(tf.float32)

#model
linear_model=W*x+b

#data
x_train=[1,2,3,4]
y_train=[0,-1,-2,-3]

#loss function
loss=tf.reduce_sum(tf.square(y-linear_model) ) 

#optimizer
optimizer = tf.train.GradientDescentOptimizer(.01) 
train = optimizer.minimize(loss) 

#training loop
init = tf.global_variables_initializer() 
sess.run(init) 
for i in range(0,1000):
    sess.run(train,{x:x_train, y:y_train})
print(sess.run([W,b]))

#results
curr_W, curr_b, curr_loss  = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

[array([-0.9999969], dtype=float32), array([ 0.99999082], dtype=float32)]
W: [-0.9999969] b: [ 0.99999082] loss: 5.69997e-11


# Udacity tutorial

In [16]:
hello_constant = tf.constant('Hello World!')
with tf.Session() as sess:  #within this session, run the following indented code
    output = sess.run(hello_constant)
    print(output)

Hello World!


In [17]:
#constants 
hello_constant = tf.constant('Hello World!') #A 0-Dimensional string tensor
A=tf.constant(123) # A 0-D int32 tensor
B=tf.constant([123,456,789]) # A 1-D int32 tensor
C=tf.constant([[123,456,789],[222,333,444]]) # A 2-D int32 tensor

###  Placeholders are input data that doesn't change as your model is trained over time. <br> tf.placeholder() returns a tensor that gets its value from data passed to the tf.Session.run() function.  Initial values aren't specified.  Values are specified at run time.

In [198]:
#placeholders (values must be fed using the feed_dict argument of Session.run())
x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32)
r = tf.placeholder(tf.int32)
z = tf.add(y,r)
with tf.Session() as sess:
    output = sess.run(z,feed_dict={x:'test',y:52,r:10})
    print(output)

62


**sess.run(output variable,feed_dict={input var 1: values, input var 2: values})** <br>
The first argument is the tensor whose value will be output, computed from the fed input values it depends on.  Notice above how x's input value is irrelevant, given that the output variable z only depends on y and r.

### Arithmetic operations must involve data of the same type.  Use tf.cast() to convert between types

In [None]:
#arithmetic operations; tf.add(), tf.multiply(), tf.divide()
#you can't do operations between unlike data types, so convert using cast
tf.constant(2.0)-tf.cast(tf.constant(2), tf.float32) 

In [122]:
#Convert x/y  - 1  to tensorflow.
x = tf.constant(10,tf.int32)
y = tf.constant(2,tf.int32)
z = tf.subtract(tf.divide(x,y), tf.cast(tf.constant(1), tf.float64))
#division of two int32s results in a float64

with tf.Session() as sess:
    output = sess.run(z)
    print(output)

4.0


### Variables are tensors whose value changes over time.  They are the parameters you are optimizing in the model.  You must specify them with initial values and then initialize them all with a command.  In a neural network, these are the weights and biases.

In [4]:
d=tf.Variable(tf.truncated_normal((2, 3))) 
c=tf.Variable(tf.ones((3, 2))) 
e=tf.matmul(d,c) #matrix multiplication (tf.multiply() would multiply element-wise instead)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(d)
    sess.run(c)
    sess.run(e)

array([[-0.44390768, -0.61203152,  0.23738734],
       [ 0.87302935,  0.16142465, -0.73896587]], dtype=float32)

array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]], dtype=float32)

array([[-0.81855184, -0.81855184],
       [ 0.29548812,  0.29548812]], dtype=float32)

In [163]:
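-0.30829522*1+1.1575352+0.02862122 #row 0 of d times a column of ones; these values appear to come from an earlier random draw of d, so the sum differs from e[0,0] shown above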

0.8778612000000001
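As a cross-check, the matrix product can be recomputed in plain Python from the d and c values printed in the In [4] output above. The `matmul` helper is a small illustrative function written for this note, not a TensorFlow call.

```python
# Values copied from the printed outputs of d (2x3) and c (3x2) above.
d = [[-0.44390768, -0.61203152, 0.23738734],
     [0.87302935, 0.16142465, -0.73896587]]
c = [[1.0, 1.0],
     [1.0, 1.0],
     [1.0, 1.0]]


def matmul(left, right):
    """Row-by-column matrix product of two nested lists."""
    rows, inner, cols = len(left), len(right), len(right[0])
    return [[sum(left[i][k] * right[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]


e = matmul(d, c)
print(e)
```

Because c is all ones, every entry in a row of e is just the sum of the matching row of d.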

In [5]:
d=tf.Variable(tf.truncated_normal((2, 5,2))) 
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(d)

array([[[ 0.53394204, -0.24607509],
        [-0.8625955 ,  0.77668595],
        [ 0.30873638,  0.15395114],
        [-1.70844173,  1.19698119],
        [ 1.36034513, -0.75936306]],

       [[ 0.24787042,  0.04737306],
        [-0.89776468, -0.31662938],
        [-1.02188563, -0.56089604],
        [ 0.90292042,  0.66548216],
        [ 1.73288608, -0.06248853]]], dtype=float32)

In [189]:
#variables must be initialized (with initial values) before they can be used
random_data=tf.Variable(tf.truncated_normal((2,2)))
weights=tf.Variable(tf.truncated_normal((2, 1)))
bias=tf.Variable(tf.zeros(3)) #tf.zeros(3) generates a vector of 3 zeros
results=tf.add(tf.matmul(random_data,weights),bias)

sess=tf.Session()
sess.run(tf.global_variables_initializer()) 
sess.run(random_data)
sess.run(weights)
sess.run(bias)
sess.run(results)             #this will return an error if the variable is not initialized         

array([[-0.15133592,  0.44067106],
       [-1.28744531,  0.0758683 ]], dtype=float32)

array([[ 1.68945396],
       [ 0.61692977]], dtype=float32)

array([ 0.,  0.,  0.], dtype=float32)

array([[ 0.01618803,  0.01618803,  0.01618803],
       [-2.1282742 , -2.1282742 , -2.1282742 ]], dtype=float32)

In [190]:
-0.15133592*1.68945396+0.44067106*0.61692977

0.016188026357213003
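The 2x3 shape of results comes from broadcasting: the (2,1) matrix product is added to the length-3 bias, so each row's single value is repeated across three columns. A plain-Python sketch using the values printed above (the variable names `column` and `results` are chosen for illustration):

```python
# Values copied from the printed outputs above.
random_data = [[-0.15133592, 0.44067106],
               [-1.28744531, 0.0758683]]
weights = [[1.68945396],
           [0.61692977]]
bias = [0.0, 0.0, 0.0]

# (2,2) x (2,1): one dot product per row of random_data
column = [sum(r * w[0] for r, w in zip(row, weights)) for row in random_data]

# Broadcasting: each row's value is added to every bias entry, giving shape (2,3)
results = [[value + b_j for b_j in bias] for value in column]
print(results)
```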

In [172]:
#alternatively, below seems to highlight that all indented actions belong to this session 
with tf.Session() as sess: 
    sess.run(tf.global_variables_initializer()) 
    sess.run(bias)

array([ 0.,  0.,  0.], dtype=float32)

tf.truncated_normal((5, 10)) produces a matrix of 5 rows and 10 columns of random values drawn from a truncated normal distribution: any value more than 2 standard deviations from the mean is dropped and re-drawn.
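The drop-and-re-draw behaviour can be sketched with Python's standard `random` module. The function below is a hypothetical stand-in written for illustration (its name and `seed` parameter are assumptions); it is not TensorFlow's implementation.

```python
import random


def truncated_normal(rows, cols, mean=0.0, stddev=1.0, seed=None):
    """Draw a rows x cols nested list, re-drawing any value that falls
    more than 2 standard deviations from the mean."""
    rng = random.Random(seed)

    def draw():
        while True:
            value = rng.gauss(mean, stddev)
            if abs(value - mean) <= 2 * stddev:
                return value

    return [[draw() for _ in range(cols)] for _ in range(rows)]


sample = truncated_normal(5, 10, seed=0)
print(len(sample), len(sample[0]))
```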

### Run a softmax function, which converts logit values (log-odds) to probabilities in the output layer of a Neural Network

In [203]:
def run():
    output = None
    logits = tf.placeholder(tf.float32) #placeholder for input data that we'll feed into sess 
    softmax =  tf.nn.softmax(logits)    #function that depends on value of logits placeholder
                                        #this will be the output variable 
    logit_data = [2.0, 1.0, 0.1] #input data

    
    with tf.Session() as sess:

        output = sess.run(softmax, feed_dict={logits:logit_data})
    return output
run()

array([ 0.65900117,  0.24243298,  0.09856589], dtype=float32)
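For reference, softmax exponentiates each logit and divides by the sum of the exponentials, so the outputs are positive and sum to 1. A plain-Python sketch with the same logits as above (subtracting the maximum first is a common numerical-stability trick and does not change the result):

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
```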

### Calculate the cross-entropy cost function (for classification) using softmax probabilities and one-hot encoded labels of those probabilities

cross-entropy for one observation <br>
cross-entropy = -log(y-hat) for the true label <br>
cross-entropy =  - sum over classes j [  log(softmax prob j)  x one-hot-encoded label j (1 for the true class, 0 otherwise) ]  <br>
cross-entropy =  - sum [  log(predicted prob)  x actual y label ]

<img src="pictures/cross-entropy-diagram.png" style="width: 400px;">

In [221]:
import tensorflow as tf

softmax_data = [0.7, 0.2, 0.1] #y-hat
one_hot_data = [1.0, 0.0, 0.0] #y

softmax = tf.placeholder(tf.float32)
one_hot = tf.placeholder(tf.float32)
cross_entropy=-tf.reduce_sum(tf.multiply(one_hot,tf.log(softmax)))

#Print cross entropy from session
with tf.Session() as sess:
    output=sess.run(cross_entropy,feed_dict={softmax:softmax_data,one_hot:one_hot_data})
print(output)

0.356675
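The same number falls out of a one-line plain-Python computation: with a one-hot label only the true class contributes, so the sum reduces to -log(0.7).

```python
import math

softmax_data = [0.7, 0.2, 0.1]   # predicted probabilities (y-hat)
one_hot_data = [1.0, 0.0, 0.0]   # one-hot encoded true label (y)

# cross-entropy = -sum(y * log(y-hat)) over the classes
cross_entropy_value = -sum(y * math.log(p)
                           for y, p in zip(one_hot_data, softmax_data))
print(cross_entropy_value)
```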


In [225]:
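#floating-point round-off: adding a tiny value to a huge one a million times drifts away from the exact answer of 1.0
a = 1000000000
for i in range(1000000):
    a = a + 1e-6
print(a - 1000000000)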

0.953674316406
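The cell above appears to be a floating-point precision demo: near one billion, a 64-bit float cannot represent a step as small as 1e-6 exactly, so every addition is rounded and the error accumulates. A sketch contrasting that with the same additions starting from zero (variable names are illustrative):

```python
# One million additions of 1e-6 on top of a large starting value.
big = 1000000000.0
for _ in range(1000000):
    big += 1e-6
big_drift = big - 1000000000.0   # exact answer would be 1.0

# The same additions starting from zero stay very close to 1.0.
small = 0.0
for _ in range(1000000):
    small += 1e-6

print(big_drift, small)
```

The takeaway is that mixing very large and very small magnitudes in one sum loses precision, while values of similar scale add up accurately.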


In [3]:
for j in range(0,100,30):
    print(j)

0
30
60
90


In [13]:
def batches(batch_size, features, labels):
    """
    Split features and labels into batches of at most batch_size samples.
    Returns a list of [feature_batch, label_batch] pairs.
    """
    assert len(features) == len(labels)
    output_batches = []
    
    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)
        
    return output_batches


In [20]:
# 4 Samples of features
example_features = [
    ['F11','F12','F13','F14'],
    ['F21','F22','F23','F24'],
    ['F31','F32','F33','F34'],
    ['F41','F42','F43','F44']]
# 4 Samples of labels
example_labels = [
    ['L11','L12'],
    ['L21','L22'],
    ['L31','L32'],
    ['L41','L42']]


[[['F41', 'F42', 'F43', 'F44']], [['L41', 'L42']]]
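The call that produced the output above is not shown in the cell. One call that is consistent with it, assuming a batch size of 3 and taking the last batch, is sketched below; both the batch size and the indexing are guesses, and the function is repeated so the block runs on its own.

```python
def batches(batch_size, features, labels):
    """Split features and labels into [feature_batch, label_batch] pairs."""
    assert len(features) == len(labels)
    output_batches = []
    for start_i in range(0, len(features), batch_size):
        end_i = start_i + batch_size
        output_batches.append([features[start_i:end_i], labels[start_i:end_i]])
    return output_batches


example_features = [
    ['F11', 'F12', 'F13', 'F14'],
    ['F21', 'F22', 'F23', 'F24'],
    ['F31', 'F32', 'F33', 'F34'],
    ['F41', 'F42', 'F43', 'F44']]
example_labels = [
    ['L11', 'L12'],
    ['L21', 'L22'],
    ['L31', 'L32'],
    ['L41', 'L42']]

# Assumed batch size of 3: 4 samples split into a batch of 3 and a batch of 1.
example_batches = batches(3, example_features, example_labels)
last_batch = example_batches[-1]
print(last_batch)
```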

In [29]:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

In [34]:
mnist = input_data.read_data_sets('.../MNIST_data', one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting .../MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting .../MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting .../MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting .../MNIST_data/t10k-labels-idx1-ubyte.gz


In [37]:
train_features = mnist.train.images
test_features = mnist.test.images

In [53]:
import pandas as pd
pd.DataFrame(train_features).loc[0,:].value_counts()

0.000000    616
0.996078     53
0.921569      7
0.239216      6
0.984314      4
0.780392      4
0.917647      3
0.807843      3
0.894118      3
0.458824      3
0.329412      3
0.992157      2
0.349020      2
0.450980      2
0.545098      2
0.945098      2
0.266667      2
0.949020      2
0.082353      2
0.741176      2
0.223529      2
0.615686      2
0.843137      1
0.294118      1
0.160784      1
0.149020      1
0.835294      1
0.658824      1
0.415686      1
0.933333      1
           ... 
0.862745      1
0.188235      1
0.015686      1
0.733333      1
0.980392      1
0.972549      1
0.090196      1
0.964706      1
0.960784      1
0.352941      1
0.952941      1
0.941177      1
0.690196      1
0.937255      1
0.466667      1
0.443137      1
0.050980      1
0.462745      1
0.337255      1
0.243137      1
0.662745      1
0.905882      1
0.098039      1
0.650980      1
0.321569      1
0.745098      1
0.019608      1
0.874510      1
0.870588      1
0.890196      1
Name: 0, dtype: int64

In [57]:
import numpy as np
np.exp(-5)

0.006737946999085467

In [59]:
import numpy as np

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1/(1+np.exp(-x))

def sigmoid_prime(x):
    """
    Derivative of the sigmoid function
    """
    return sigmoid(x) * (1 - sigmoid(x))

learnrate = 0.5
x = np.array([1, 2, 3, 4])
y = np.array(0.5)

# Initial weights
w = np.array([0.5, -0.5, 0.3, 0.1])

### Calculate one gradient descent step for each weight
### Note: Some steps have been consolidated, so there are
###       fewer variable names than in the above sample code

# TODO: Calculate the node's linear combination of inputs and weights
h = np.dot(x,w)

# TODO: Calculate output of neural network
nn_output = sigmoid(h)

# TODO: Calculate error of neural network
error = (y-sigmoid(h))

# TODO: Calculate the error term
#       Remember, this requires the output gradient, which we haven't
#       specifically added a variable for.
error_term = error*sigmoid_prime(h)

# TODO: Calculate change in weights
del_w = [learnrate* error_term * x[0],
          learnrate* error_term * x[1],
          learnrate* error_term * x[2],
          learnrate* error_term * x[3]]

print('Neural Network output:')
print(nn_output)
print('Amount of Error:')
print(error)
print('Change in Weights:')
print(del_w)

Neural Network output:
0.689974481128
Amount of Error:
-0.189974481128
Change in Weights:
[-0.020318691802303994, -0.040637383604607988, -0.060956075406911982, -0.081274767209215976]
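Written as formulas, the single gradient descent step computed in the cell above is, with learning rate $\eta$ (`learnrate`) and error term $\delta$ (`error_term`); the symbols are chosen here for illustration:

```latex
h = \sum_i w_i x_i, \qquad
\hat{y} = \sigma(h) = \frac{1}{1 + e^{-h}}, \qquad
\delta = (y - \hat{y})\,\sigma'(h), \qquad
\sigma'(h) = \sigma(h)\,\bigl(1 - \sigma(h)\bigr), \qquad
\Delta w_i = \eta\,\delta\,x_i
```

With the values in the cell, $h = 0.8$, $\hat{y} \approx 0.68997$, $y - \hat{y} \approx -0.18997$, $\sigma'(h) \approx 0.21391$, so $\delta \approx -0.040637$ and $\Delta w_1 = 0.5 \times \delta \times 1 \approx -0.020319$, which agrees with the first entry of the printed change in weights.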


In [41]:
weights = tf.Variable(tf.truncated_normal([2, 3],seed=1))
bias = tf.Variable(tf.truncated_normal([3]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(weights))
    print('')
    print(sess.run(bias))

[[ 0.28183395 -0.308662   -0.44758153]
 [-0.80714601 -0.29025713  0.51695603]]

[-1.6614666  -0.94179952 -0.38951397]
