In [1]:
import tensorflow as tf

In [2]:
node1 = tf.constant(3., dtype=tf.float32)
node2 = tf.constant(4.)

In [3]:
print node1
print node2

Tensor("Const:0", shape=(), dtype=float32)
Tensor("Const_1:0", shape=(), dtype=float32)


In [4]:
print node1 + node2
print tf.add(node1, node2)

Tensor("add:0", shape=(), dtype=float32)
Tensor("Add:0", shape=(), dtype=float32)


So, the operations above did not print 7.0 as we might expect. Each one only added a node to the computation graph and returned another tensor; nothing has been computed yet. 
We need to explicitly create a session and run the operation in it to get the intended output!
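
To make the "build first, run later" idea concrete, here is a minimal pure-Python sketch. It is not how TensorFlow is implemented; `Node`, `constant`, and `run` are hypothetical names invented for this illustration.

```python
class Node(object):
    """Hypothetical graph node: it records an operation, not a result."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        # Adding two nodes only builds a new "add" node; nothing is computed.
        return Node("add", (self, other))

    def __repr__(self):
        return "Node(op=%r)" % self.op


def constant(value):
    """Create a node that holds a fixed value."""
    return Node("const", value=value)


def run(node):
    """Evaluate a node by recursively evaluating its inputs (our 'session')."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        left, right = node.inputs
        return run(left) + run(right)
    raise ValueError("unknown op: %r" % node.op)


total = constant(3.0) + constant(4.0)
print(total)       # a node description, not a number
print(run(total))  # the value is only computed here
```

As in the cells above, printing `total` shows a description of the node, and the sum only appears once the node is explicitly run.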

In [5]:
sess = tf.Session()

In [6]:
print sess.run([node1, node2])

[3.0, 4.0]


In [7]:
print sess.run(node1+node2)

7.0


Next, let's look at a feature that lets us do the addition dynamically. <br>In the example above, the fixed values of node1 and node2 were added to produce 7. <br>What if we want to supply different input values at will?
Placeholders are used for this purpose!

In [8]:
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
c = a + b  # TensorFlow overloads +, so this is equivalent to tf.add(a, b)

In [9]:
print sess.run(c, {a: 3., b: 4.})

7.0


In [10]:
print sess.run(c, {a: 4., b: 5.})

9.0


For readers who know C, a simple analogy is that a placeholder is like a pointer that has been declared but not yet given a value. <br>
We then supply values at will by passing a dictionary that maps each placeholder (the key) to the value it should take.

We can also pass in vectors or matrices instead of scalar values for a and b.
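
The feed-dictionary idea can be sketched in plain Python as follows. This is only an analogy under simplifying assumptions, not TensorFlow's implementation: `Placeholder` and `add` are hypothetical helpers, and only scalars and flat lists are handled.

```python
class Placeholder(object):
    """Hypothetical stand-in for a placeholder: a named slot with no value."""

    def __init__(self, name):
        self.name = name


def add(a, b):
    """Return a function that computes a + b once a feed dictionary is given."""
    def evaluate(feed):
        left, right = feed[a], feed[b]
        if isinstance(left, list):
            # Vectors are added element by element.
            return [l + r for l, r in zip(left, right)]
        return left + right
    return evaluate


a = Placeholder("a")
b = Placeholder("b")
c = add(a, b)  # nothing is computed yet; c waits for values

print(c({a: 3.0, b: 4.0}))
print(c({a: 4.0, b: 5.0}))
print(c({a: [1.0, 2.0], b: [3.0, 4.0]}))
```

The same `c` is reused with different dictionaries, which mirrors how the session runs one graph with different fed values.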

In [11]:
print sess.run(c, {a: [1., 2.], b: [3.3, 4.3]})

[ 4.30000019  6.30000019]


In [12]:
add_and_triple = c * 3

In [13]:
sess.run(add_and_triple, {a: 3., b: 2.5})

16.5

Let's design a linear model of the form y = mx + b, where <br>
y = dependent variable <br>
x = independent variable <br>
m = slope (the weight, named W in the code below) <br>
b = bias or intercept <br>
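
Before building this in TensorFlow, it may help to evaluate the formula by hand. The sketch below is plain Python; the function name `linear_model` and the sample slope, intercept, and inputs are chosen for illustration.

```python
def linear_model(W, b, xs):
    """Evaluate y = W * x + b for every x in xs."""
    return [W * x + b for x in xs]


# Slope 0.3 and intercept -0.3, evaluated at four sample inputs.
predictions = linear_model(0.3, -0.3, [1, 2, 3, 4])
print(predictions)  # roughly 0.0, 0.3, 0.6, 0.9, up to floating-point rounding
```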

In [14]:
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
x = tf.placeholder(tf.float32)

In [15]:
linear_model = W * x + b

In [16]:
init = tf.global_variables_initializer()
sess.run(init)

In [17]:
print sess.run(linear_model, {x: [1, 2, 3, 4]})

[ 0.          0.30000001  0.60000002  0.90000004]


In [18]:
y = tf.placeholder(tf.float32)

In [19]:
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)

In [20]:
print sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]})

23.66


Now, let's try to reduce the loss. <br>
We will use the GradientDescentOptimizer, which uses forward and backward propagation to reduce the loss by updating W and b after each iteration of gradient descent.
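
To see roughly what the optimizer does on each iteration, here is a plain-Python sketch of gradient descent on the same sum-of-squares loss. It assumes the same data, starting values, and learning rate as this notebook; the gradients are derived by hand here, whereas TensorFlow computes them automatically, and the helper names `loss` and `gradient_step` are made up for this sketch.

```python
def loss(W, b, xs, ys):
    """Sum of squared differences between predictions W*x + b and targets y."""
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys))


def gradient_step(W, b, xs, ys, learning_rate):
    """Apply one gradient descent update to W and b."""
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    grad_W = sum(2.0 * e * x for e, x in zip(errors, xs))  # d(loss)/dW
    grad_b = sum(2.0 * e for e in errors)                  # d(loss)/db
    return W - learning_rate * grad_W, b - learning_rate * grad_b


xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.0, -1.0, -2.0, -3.0]
W, b = 0.3, -0.3  # same starting point as the TensorFlow variables

initial_loss = loss(W, b, xs, ys)
for _ in range(1000):
    W, b = gradient_step(W, b, xs, ys, learning_rate=0.01)
final_loss = loss(W, b, xs, ys)

print(initial_loss)
print(W)
print(b)
print(final_loss)
```

Each step moves W and b a small distance against the gradient, so the loss shrinks gradually instead of jumping straight to its minimum.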

In [21]:
optimizer = tf.train.GradientDescentOptimizer(0.01)  # 0.01 is the learning rate (alpha)
train = optimizer.minimize(loss)

In [24]:
sess.run(init)
for i in range(1000):
    if i % 100 == 0:
        print "Iterations: {}".format(i)
    sess.run(train, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]})

Iterations: 0
Iterations: 100
Iterations: 200
Iterations: 300
Iterations: 400
Iterations: 500
Iterations: 600
Iterations: 700
Iterations: 800
Iterations: 900


In [25]:
print sess.run([W, b])

[array([-0.9999969], dtype=float32), array([ 0.99999082], dtype=float32)]


What this means is that the loss is at its minimum for W close to -1 and b close to 1. <br>
If we plot the line y = -x + 1 on a coordinate system, it passes through all four training points, (1, 0), (2, -1), (3, -2) and (4, -3), so loss = 0 when W, b = -1., 1. 

Let's check whether this is correct.

In [26]:
fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])

In [28]:
sess.run([fixW, fixb])
print sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]})

0.0


So, by using machine learning, we found the values above, -0.9999969 for W and 0.99999082 for b, which very nearly minimize the loss. <br>
We hard coded -1 and 1 as a check for this particular model because it is linear and easy to interpret. In case of complex functions and models we won't be able to do this, and hence we will use the GradientDescentOptimizer or other optimizers, possibly with some hyperparameter tuning, to minimize the loss :)