## Getting Started with TensorFlow

<p><span style="font-size:0.8em">Based on [Getting Started with TensorFlow](https://www.tensorflow.org/get_started/get_started).</span></p>

In [2]:
import tensorflow as tf
import sys as sy

Check your TensorFlow and Python versions.

In [3]:
print("tensorflow version:",tf.__version__)
print("python version is:", sy.version)

tensorflow version: 1.2.0
python version is: 3.5.3 |Continuum Analytics, Inc.| (default, May 15 2017, 10:43:23) [MSC v.1900 64 bit (AMD64)]


### Create a TensorFlow Computation Graph
<p>In a TensorFlow program, you build a computational graph by adding *nodes* to it. Nothing is computed until you execute the operations in the graph. The graph is run
in a TensorFlow [session](https://www.tensorflow.org/api_docs/python/tf/Session). <code>tf.Session</code> is a class for running (or executing) TensorFlow graph operations.</p>
<p>
Let's begin with *Hello World* in TensorFlow.
</p>

In [64]:
hello = tf.constant('Hello World')
sess = tf.Session()
print(sess.run(hello))

b'Hello World'


In the *Hello World* program, the tf.constant <code>hello</code> is a *node*. Running the node in a *Session* performs the operation the node specifies. It can seem unusual to run a constant, but this is how all operations work in TensorFlow. Watch what happens if we print the tf.constant directly, without a session.

In [5]:
print(hello)

Tensor("Const:0", shape=(), dtype=string)


The tf.constant named <code>hello</code> is a tensor of type string. The value cannot be evaluated outside a session. <br>
Consider basic math operations. 

In [6]:
nodeA = tf.constant(7)  # constant nodes; plain Python ints would also be converted by tf.add
nodeB = tf.constant(3)
nodeC = tf.add(nodeA,nodeB)
print(sess.run(nodeC))

10


Rather than constants, consider the use of parameters in a graph. This is accomplished by using a TensorFlow **placeholder**, a node whose value is supplied when the graph is run. Consider the following [tf.placeholder](https://www.tensorflow.org/get_started/get_started) examples.

In [7]:
node1 = tf.placeholder(tf.float32)
node2 = tf.placeholder(tf.float32)
nodeAdd = tf.add(node1,node2)

Values are provided to a placeholder through a <code>feed_dict</code> (feed dictionary). Since placeholders are nodes, they must be fed a value and evaluated in a TensorFlow session.

In [65]:
print(sess.run(nodeAdd,feed_dict ={node1:2.718, node2:3.14158} ))

5.85958
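
As a rough analogy only (this is not how TensorFlow is implemented), a placeholder behaves like a function parameter: the computation is described first and receives its values only when it is run. The sketch below mimics the example above in plain Python. The function name `node_add` and the dictionary keys are hypothetical and exist only for illustration.

```python
def node_add(feed):
    """Hypothetical stand-in for the nodeAdd graph node.

    Nothing is computed when the function is defined; the addition
    happens only when it is called with a dictionary of input values,
    much like sess.run(nodeAdd, feed_dict=...).
    """
    return feed["node1"] + feed["node2"]


# Supplying the values plays the role of the feed_dict.
result = node_add({"node1": 2.718, "node2": 3.14158})
print(result)
```

The printed value is approximately 5.85958, the same sum the session returned.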


In addition to placeholders, which must be fed input data inside a session, there are also TensorFlow **Variables**, created with [tf.Variable](https://www.tensorflow.org/api_docs/python/tf/Variable). They can take on different values, and they maintain state in the TensorFlow graph across calls to run() of the TensorFlow session. Note the following example of a very simple neural-network-like computation that uses both <code>tf.Variable</code> and <code>tf.placeholder</code>.

### Simple Network

In [17]:
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
x = tf.placeholder(tf.float32)
linear_model =  x * W + b

In the example above we have a <code>tf.Variable</code> tensor (a node) for the weights in a neural network called <code>W</code>, a tf.Variable for the bias <code>b</code>, and a tf.placeholder for the inputs <code>x</code>. The tensor <code>linear_model</code> is the output. Before we execute these operations in a TensorFlow graph, the tf.Variable items must be initialized. This initialization must take place in a TensorFlow session.

In [18]:
init = tf.global_variables_initializer()
sess.run(init)

Now we evaluate <code>linear_model</code> in a session. The placeholder is used to accept input data. 

In [20]:
print(sess.run(linear_model, feed_dict={x:[1,2,3,4]}))

[ 0.          0.30000001  0.60000002  0.90000004]


Suppose we wanted to sum the products of the weights and the inputs, as we do in a **neural network**. In this case we would use [tf.reduce_sum](https://www.tensorflow.org/api_docs/python/tf/reduce_sum), which sums the elements across a tensor. Consider the following modification. In this case we sum the products of the weights and the inputs and then add the bias. This is a simple implementation of a single neuron.

In [21]:
linear_model2 = tf.reduce_sum(x*W) + b
print(sess.run(linear_model2, feed_dict={x:[1,2,3,4]}))

[ 2.70000005]
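
To make the arithmetic behind <code>tf.reduce_sum(x*W) + b</code> explicit, here is a minimal plain-Python sketch of the same computation. It assumes the initial values used above (W = 0.3, b = -0.3) and treats the single weight as applying to every input, which is how the one-element `W` broadcasts over `x`. The names `weighted_sum` and `neuron_output` are introduced here for illustration.

```python
# Plain-Python check of tf.reduce_sum(x * W) + b for the values used above.
W = 0.3    # same initial value as the tf.Variable W
b = -0.3   # same initial value as the tf.Variable b
x = [1, 2, 3, 4]

# The single weight scales every input; the products are then summed.
weighted_sum = sum(W * xi for xi in x)

# The bias is added once, after the sum.
neuron_output = weighted_sum + b
print(neuron_output)
```

The result is approximately 2.7 (0.3 × 10 − 0.3), which agrees with the session output up to floating-point rounding.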


### Model Loss (Error)

Continuing with the original example (<code>linear_model</code>), suppose we wanted to evaluate the accuracy of our model. A loss (or error) function compares the model output with the expected output. Since the error can be positive or negative, it is common practice to square it. This ensures that the error is positive and also penalizes large errors more than small ones. To compute the error of our model, we will provide a <code>tf.placeholder</code> for the truth data. We will also square and sum the errors (often referred to as the *sum of squared errors*).

In [29]:
y = tf.placeholder(tf.float32)
squared_errors = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_errors)
print("The error (loss) is:", sess.run(loss, feed_dict = {x:[1,2,3,4], y:[0,-1,-2,-3]}))

The error (loss) is: 23.66
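
The value 23.66 can be checked by hand. The following plain-Python sketch recomputes the sum of squared errors, assuming the initial values W = 0.3 and b = -0.3 and the same inputs and truth data fed to the session. The names `xs`, `ys`, `predictions`, and `loss_value` are chosen here for illustration.

```python
# Recompute the sum of squared errors without TensorFlow.
W, b = 0.3, -0.3          # initial weight and bias from the model above
xs = [1, 2, 3, 4]         # inputs fed to the placeholder x
ys = [0, -1, -2, -3]      # truth data fed to the placeholder y

# Model output for each input: W * x + b
predictions = [W * x + b for x in xs]

# Square each error so that positive and negative errors do not cancel.
squared_errors = [(p - y) ** 2 for p, y in zip(predictions, ys)]

loss_value = sum(squared_errors)
print(round(loss_value, 2))
```

The individual errors are 0, 1.3, 2.6, and 3.9, so the squares are 0, 1.69, 6.76, and 15.21, which sum to 23.66.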


<p>
The error is high because of the values of W and b. In machine learning, the model is trained, with data, to minimize the error. When the model is trained, the weights are adjusted to minimize the error.  One of the most widely used techniques to minimize error is ***gradient descent***. There are many resources for learning gradient descent. [Here](https://spin.atomicobject.com/2014/06/24/gradient-descent-linear-regression/) is a recommended reference. 
</p>
<p>
TensorFlow provides tools to train the model. These tools are provided in the [tf.train](https://www.tensorflow.org/api_docs/python/tf/train) API, which offers an easy way to train networks using gradient descent. The [<code>tf.train.GradientDescentOptimizer</code>](https://www.tensorflow.org/api_docs/python/tf/train/GradientDescentOptimizer) class implements the gradient descent algorithm. The <code>minimize</code> method of this class computes the gradients of the loss and applies them to the variables. We pass a learning rate parameter, which governs the size of the gradient descent steps.
</p>
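
To illustrate the idea behind gradient descent, here is a minimal plain-Python sketch for the sum-of-squared-errors loss of this linear model. It is not TensorFlow's implementation: the gradients are written out by hand, and the function names `sse_gradients` and `gradient_descent` are hypothetical. It assumes the initial values W = 0.3 and b = -0.3, a learning rate of 0.01, and 1000 update steps.

```python
def sse_gradients(W, b, xs, ys):
    """Return (dL/dW, dL/db) for L = sum((W*x + b - y)**2)."""
    residuals = [W * x + b - y for x, y in zip(xs, ys)]
    grad_W = sum(2 * r * x for r, x in zip(residuals, xs))
    grad_b = sum(2 * r for r in residuals)
    return grad_W, grad_b


def gradient_descent(W, b, xs, ys, learning_rate=0.01, steps=1000):
    """Repeatedly step W and b against the gradient of the loss."""
    for _ in range(steps):
        grad_W, grad_b = sse_gradients(W, b, xs, ys)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    return W, b


xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]
W_trained, b_trained = gradient_descent(0.3, -0.3, xs, ys)
print(W_trained, b_trained)
```

Each step moves the parameters a small distance in the direction that reduces the loss; the learning rate scales that distance. With these settings the parameters approach W = -1 and b = 1.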

In [40]:
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(loss)

Finally, we run the optimization step 1000 times in a TensorFlow session.

In [47]:
training_steps = 1000
sess.run(init) 
for i in range(training_steps): 
    sess.run(train, feed_dict = {x:[1,2,3,4], y:[0,-1,-2,-3]})
        

After training the model, the weights should be adjusted to minimize the error. Let's check the loss:

In [50]:
# Reuse the loss node defined earlier; rebuilding it would only add duplicate nodes to the graph.
print("The error (loss) is:", sess.run(loss, feed_dict = {x:[1,2,3,4], y:[0,-1,-2,-3]}))

The error (loss) is: 5.69997e-11


Now let's check the weights and the bias.

In [59]:
print("After training, the weight is:", sess.run(W))
print("and the bias is", sess.run(b))

After training, the weight is: [-0.9999969]
and the bias is [ 0.99999082]
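
As an independent check on these values, the best-fit line for four points can also be computed in closed form. The sketch below uses the standard least-squares formulas for simple linear regression in plain Python. This is a different technique from gradient descent and is shown only for comparison; the variable names are chosen here for illustration.

```python
# Closed-form least-squares fit of y = slope * x + intercept.
xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_xx = sum(x * x for x in xs)

# Standard formulas for simple linear regression.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(slope, intercept)  # prints: -1.0 1.0
```

The data lie exactly on the line y = -x + 1, so the optimum is W = -1 and b = 1. The trained values above are within about 1e-5 of that optimum.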


If we reset the variables to their original weight and bias, the loss returns to its initial value.

In [60]:
sess.run(init)  # re-running the initializer restores W and b to their starting values
print("The error (loss) is:", sess.run(loss, feed_dict = {x:[1,2,3,4], y:[0,-1,-2,-3]}))

The error (loss) is: 23.66


To double check, watch what happens if we hard-set the weight and the bias to their trained values.

In [63]:
fixW = tf.assign(W, [-0.9999969])
fixb = tf.assign(b, [0.99999082])
sess.run([fixW, fixb])
print(sess.run(loss, {x:[1,2,3,4], y:[0,-1,-2,-3]}))

5.69997e-11


### Summary
In this exercise we performed the following: 
- built a linear regression model using tf.Variable and tf.placeholder 
- briefly looked at how a similar approach could be used to build a simple neural network 
- calculated the error in our simple network
- used the <code>tf.train</code> API to train our network using **gradient descent**
- evaluated the error after training the network
