## Intro to Tensorflow - Part IV - Implementation of Neural Networks

In [2]:
import tensorflow as tf

In [3]:
#!pip install --upgrade pip

In [4]:
#!pip install --upgrade tensorflow

#### First reset the default graph

In [5]:
tf.reset_default_graph()

In [6]:
#Import a few other packages some included others provided to you for this project
import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
from tensorflow.python.framework import ops
from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict

%matplotlib inline
np.random.seed(1)

### Warmup exercise

In [7]:
y = tf.constant(9, dtype=tf.float32, name='y')                   #Define y, and set to 9
y_hat = tf.constant(5, dtype=tf.float32, name='y_hat')           #Define y_hat and set it to 5
loss = tf.Variable((y-y_hat)**2,dtype=tf.float32, name='loss')   #define loss as variable and its calcualtion

init = tf.global_variables_initializer()                         #initialize everything

with tf.Session() as sess:                                       #now run the session
    sess.run(init)
    print(loss.eval())
    

Instructions for updating:
Colocations handled automatically by placer.
16.0


Writing and running programs in TensorFlow cab be decribed as these 5 steps:

1. Create Tensors (variables) that are not yet executed/evaluated. 
2. Write operations between those Tensors.

3. Initialize your Tensors. 
4. Create a Session. 
5. Run the Session. This will run the operations you'd written above. 

The first two creates the Tensorflow graph and the last 3 involves evaluating the graph.

Therefore, when we created a variable for the loss, we simply defined the loss as a function of other quantities, but did not evaluate its value. To evaluate it, we had to run `init=tf.global_variables_initializer()`. That initialized the loss variable, and in the last line we were finally able to evaluate the value of `loss` and print its value.

Now let us look at another easy example. Run the cell below:

In [8]:
x = tf.constant(5, dtype=tf.float32, name='x')
y = tf.constant(3, dtype=tf.float32, name='y')
f = x**2 + y -3
print(f)

Tensor("sub_1:0", shape=(), dtype=float32)


In [9]:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(f.eval())

25.0


Great! To summarize, remember to **initialize your variables, create a session and run the operations inside the session.**

Next, we are take look at placeholders. It is an important utility in Tensorflow.  A placeholder is an object whose value you can specify only later. To specify values for a placeholder, you can pass in values by using a "feed dictionary" (feed_dict variable). Below, we created a placeholder for x. This allows us to pass in a number later when we run the session.

In [10]:
#define x as palceholder for an number (float32 type in this case)
x = tf.placeholder(dtype=tf.float32, name='x')

with tf.Session() as sess:
    for val in range(10):
        #val of x are fed in for x at each step and calculate the y-values
        y = sess.run(x**2 -3, feed_dict={x:val})
        print('y = ', y)
sess.close()

y =  -3.0
y =  -2.0
y =  1.0
y =  6.0
y =  13.0
y =  22.0
y =  33.0
y =  46.0
y =  61.0
y =  78.0


## EX 1.
#### Randomly initiate two vectors of dimension 10 in Tensorflow and calcuate the euclidean disance between them.

In [11]:
tf.reset_default_graph()

x1 = tf.placeholder(dtype=tf.float32, shape=[10], name='x1')
x2 = tf.placeholder(dtype=tf.float32, shape=[10], name='x2')
dist = tf.sqrt(tf.reduce_sum(tf.square(x1-x2)))

dictt = dict()
dictt[x1] = np.random.randn(10)
dictt[x2] = np.random.randn(10)

with tf.Session() as sess:
    dist = sess.run(dist, feed_dict=dictt)

print(dist)

2.9352398


When you first defined `x` you did not have to specify a value for it. A placeholder is simply a variable that you will assign data to only later, when running the session. We say that you **feed data** to these placeholders when running the session. 

Here's what's happening: When you specify the operations needed for a computation, you are telling TensorFlow how to construct a computation graph. The computation graph can have some placeholders whose values you will specify only later. Finally, when you run the session, you are telling TensorFlow to execute the computation graph.

## Now we are going to start putting together the elements that we need to implement neural networks algorithm in steps.

### First let us try to get matrix multiplication in Tensorflow working right for us

In [12]:
tf.reset_default_graph()
X = tf.constant(np.ones([5,5]), name = "X")
W = tf.constant(np.ones([5,2]), name = "W")
b = tf.constant(np.random.randn(5,1), name = "b")
with tf.Session() as sess:
    z = sess.run(tf.add(tf.matmul(X,W),b))
    print('z =', z)

z = [[3.89938082 3.89938082]
 [6.14472371 6.14472371]
 [5.90159072 5.90159072]
 [5.50249434 5.50249434]
 [5.90085595 5.90085595]]


 ## Exercise 2:  
 #### Define three matrices of dimensions 3x3, 3x4, and 4x4 of random numbers as tensor constants and and calculate the product of all three using Tensorflow </color>

In [13]:
m1 = tf.random.uniform([3,3],0,10)
m2 = tf.random.uniform([3,4],0,10)
m3 = tf.random.uniform([4,4],0,10)

result = tf.matmul(tf.matmul(m1,m2),m3)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print("m1: ",m1.eval())
    print("m2: ",m2.eval())
    print("m3: ",m3.eval())
    print("result: ",result.eval())

m1:  [[1.6376138 9.015155  8.805347 ]
 [4.0429068 2.9545856 3.3601058]
 [3.2843328 2.3481107 4.79424  ]]
m2:  [[9.853394  8.788494  4.856735  4.311963 ]
 [7.5058246 1.5067458 0.4169774 3.2545424]
 [1.5525901 4.2622805 6.7423544 8.861276 ]]
m3:  [[7.374016   5.0932813  0.58565855 7.4639797 ]
 [8.329463   9.170397   9.827399   3.7922823 ]
 [0.78127384 6.058632   2.571025   4.896269  ]
 [1.4031482  8.543531   9.286342   6.7104254 ]]
result:  [[2263.9229 1303.5292 2546.9397 1683.8037]
 [1774.635  1037.498  2020.506  1245.8259]
 [2319.7334 1330.837  2617.2427 1654.3594]]


#### 1. Linear function (hypothesis) calculation

In [14]:
tf.reset_default_graph()

In [15]:
def lin_fun(X, W, b):
    z = tf.add(tf.matmul(X,W), b)
    return z

In [16]:
Xr = np.random.randn(10,10)

In [17]:
#X = tf.placeholder(tf.float64,shape=(10,10), name='X')
#W = tf.placeholder(tf.float64, shape=(10,10), name='W')
Xr = np.random.randn(10,10)
W1 = np.random.randn(10,20)
b1 = np.random.randn(1)
X = tf.placeholder(tf.float64, name='X')
W = tf.placeholder(tf.float64, name='W')
b = tf.placeholder(tf.float64, name='b')
#W1 = tf.constant(np.random.randn(10,10), name = "W1")
#b1 = tf.constant(np.random.randn(1,1))
with tf.Session() as sess:
    z = sess.run([lin_fun(X,W,b)], feed_dict={X:Xr, W:W1, b:b1})
    #Let us verify we are getting the right matriz out
    print(z)

[array([[ 4.62737425, -1.69324339,  0.92007405,  2.99304044, -2.50177457,
        -0.45683678,  3.20199762, -2.42816813,  6.22272206, -1.64962547,
        -3.6257785 , -2.49528668,  2.43891757,  2.22727518, -1.48238289,
        -4.78707147, -1.40704019,  5.66844999, -1.26449309,  0.89697317],
       [-1.89259926, -0.2213172 , -3.437651  ,  0.2937006 ,  0.74708009,
        -2.71666738, -0.28484516, -0.69911654, -1.73927413,  0.74653693,
         1.38692545,  4.276587  ,  0.9710673 , -0.06279389, -0.3905815 ,
         2.73713401, -0.65487695, -1.27293267, -1.02699808,  1.74792675],
       [ 2.41740599,  0.17966539, -3.86559159, -1.85116677,  1.24332134,
        -4.51650803,  2.91184696,  2.86649683, -0.95066753,  1.43532332,
         2.79845187, -2.26473934,  3.67224225, -0.9128823 ,  7.25634433,
        -0.69043275, -3.03226484,  4.08194484,  0.5746102 ,  1.57566876],
       [-2.46866651, -0.39065236, -0.60933301, -0.5641226 ,  2.37432786,
        -2.23169274, -1.58291303,  0.66333944, 

### This following is one step of the neural network calcualtion: from input to ready for activation using the function we just defined

In [18]:
#For linear function calculation for 10 nodes with 10 feature inputs
tf.reset_default_graph()
#first the fearure vector of dimension 1x10
X = tf.constant(np.random.randn(1,10), name = "X")
#the weights matrix W of dimension 10x10 that produces 10 activations
W = tf.constant(np.random.randn(10,10), name = "W")
#now 1X10 bias b vector added
b = tf.constant(np.random.randn(1,10), name = "b")
with tf.Session() as sess:
    t = tf.matmul(X,W)
    print('t =',sess.run(t))
    print('b = ', sess.run(b))
    #calculate the linear function
    z = sess.run(lin_fun(X, W,b))
    print('z =', z)

t = [[-2.24152505 -1.90377908  0.10518085 -0.87580661 -0.28258289  0.0369864
  -1.02475681  1.50358803  1.55950109  0.56875331]]
b =  [[ 0.68400133 -0.35340998 -1.78791289  0.36184732 -0.42449279 -0.73153098
  -1.56573815  1.01382247 -2.22711263 -1.6993336 ]]
z = [[-1.55752372 -2.25718906 -1.68273204 -0.51395929 -0.70707568 -0.69454458
  -2.59049496  2.5174105  -0.66761155 -1.13058029]]


## Exercise 3 
#### Calculate the linear function for input vector with 20 features and 10 activations

In [19]:
tf.reset_default_graph()

X = tf.constant(np.random.randn(1,20), name="X")     #20 features
W = tf.constant(np.random.randn(20,10), name="W")    #10 activations
b = tf.constant(np.random.randn(1,10), name='b')     #1x10 bias vector

with tf.Session() as sess:
    t = tf.matmul(X,W)
    print('t =',sess.run(t))
    print('b = ', sess.run(b))
    z = sess.run(lin_fun(X,W,b))
    print('z = ', z)

t = [[-1.8805546  -3.96946606 -1.45661758 -4.99158183  4.29165632 -9.17251828
  -4.80516969 10.05421238  8.58201632 -7.72605462]]
b =  [[ 0.4972691   0.2373327  -2.14444405 -0.36956243 -0.01745495  0.73140252
   0.95449567  0.09574677  1.0334508  -0.14627327]]
z =  [[-1.3832855  -3.73213336 -3.60106163 -5.36114425  4.27420136 -8.44111577
  -3.85067402 10.14995915  9.61546713 -7.87232789]]


### Now let us calculate the activations for the first layer

In [20]:
with tf.Session() as sess:
    a = sess.run(tf.sigmoid(z))
    print(a)
    #The index of the largest entry, the choice
    print(tf.argmax(a, axis=1).eval())  #the 

[[2.00481853e-01 2.33819024e-02 2.65695223e-02 4.67358520e-03
  9.86268029e-01 2.15762660e-04 2.08225975e-02 9.99960924e-01
  9.99933315e-01 3.81000846e-04]]
[7]


## Ex 4: 
#### Try working the same (the activations a, and the index of the max entry in each column) for a 5x5 matrix X, 5x3 matrix W and appropriate b vector

In [21]:
tf.reset_default_graph()

X = tf.constant(np.random.randn(5,5), name='X')
W = tf.constant(np.random.randn(5,3), name='W')
b = tf.constant(np.random.randn(1,3), name='b')

with tf.Session() as sess:
    z = sess.run(lin_fun(X,W,b))
    a = sess.run(tf.sigmoid(z))
    print(a)
    print(tf.argmax(a,axis=1).eval())

[[0.90946291 0.9937326  0.77963643]
 [0.83993068 0.01280862 0.64267794]
 [0.52497656 0.00616977 0.17312897]
 [0.82515619 0.00700432 0.31356929]
 [0.95898659 0.68184104 0.78902589]]
[1 0 0 0 0]


## Neural Networks -  Mutilayers:

In [22]:
tf.reset_default_graph()
#first the fearure vector of dimension 1x10
X = tf.constant(np.random.randn(2,10), name = "X")
#the weights matrix W of dimension 10x10 that produces 10 activations
W1 = tf.constant(np.random.randn(10,20), name = "W1")
W2 = tf.constant(np.random.randn(20,15), name = "W2")
#now 1X10 bias b vector added
b1 = tf.constant(np.random.randn(1,1), name = "b1")
b2 = tf.constant(np.random.randn(1,1), name = "b2")
with tf.Session() as sess:
    t1 = tf.matmul(X,W1)
    t2 = tf.matmul(t1,W2)
    print('t1 =',sess.run(t1))
    print('t2 =',sess.run(t2))
    print('b1 = ', sess.run(b1))
    print('b2 = ', sess.run(b2))
    #calculate the linear function
    z1 = sess.run(lin_fun(X, W1,b1))
    z2 = sess.run(lin_fun(t1, W2,b2))
    print('z1 =', z1)
    print('z2 =', z2)

t1 = [[ 1.70121682 -1.62706074 -0.18924164 -1.41491851 -2.88615108 -1.10396498
   0.65810806 -0.23895881  2.19750782 -1.5547158   1.28408889  0.02355386
  -2.40309095 -2.1223578   0.99424222  1.07489665  1.87616587 -3.20846533
  -3.30278407 -1.31521552]
 [ 0.58289675 -1.22631104  0.43508554 -0.60279374 -0.97048915 -2.11769391
  -0.84678621 -0.1974878   1.28243424  0.51543104 -1.59911334  0.54260597
  -0.40830782 -2.34240684 -2.75400065  0.38536477 -0.79593992 -1.4294501
  -1.9398684  -0.24621014]]
t2 = [[ -9.76588684  -2.53390881  -5.99305799   1.53001142  -2.11687331
    4.42344809 -12.6956139   -3.83823601  -6.46113219   5.3011592
  -12.61066027  -7.12376503   8.64522211 -18.1649041    2.89023458]
 [ -5.9804357    2.86481399  -2.95527775  -8.98020707   5.10571433
    5.94252481  -2.76498024  -7.66044768   0.33286772  -1.21344299
   -6.92441945  -1.90733013   8.12604065 -11.95382493   0.8442229 ]]
b1 =  [[-0.07094967]]
b2 =  [[0.03406586]]
z1 = [[ 1.63026714 -1.69801041 -0.26019132 -1

In [23]:
with tf.Session() as sess:
    print(tf.argmax(z2, axis=1).eval())

[12 12]


## Ex 5:
#### Write the code for a neural network algorithm where input is 5x10 matrix, first layer has 20 activations, the second has 15, the third has 20, and the multiclassification output has 5 classes

In [26]:
tf.reset_default_graph()

#((XW + b)W2 +b2 ...)Wn +bn

#weights
X = tf.constant(np.random.randn(5,10))
W1 = tf.constant(np.random.randn(10,20))
W2 = tf.constant(np.random.randn(20,15))
W3 = tf.constant(np.random.randn(15,20))
W4 = tf.constant(np.random.randn(20,5))

#bias
b1 = tf.constant(np.random.randn(1,20))
b2 = tf.constant(np.random.randn(1,15))
b3 = tf.constant(np.random.randn(1,20))
b4 = tf.constant(np.random.randn(1,5))

#layers
z1 = lin_fun(X,W1,b1)
z2 = lin_fun(z1,W2,b2)
z3 = lin_fun(z2,W3,b3)
z4 = lin_fun(z3,W4,b4)

#argmax
mpo = tf.argmax(z4, axis=1)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print('z1 = ',z1.eval())
    print('z2 = ',z2.eval())
    print('z3 = ',z3.eval())
    print('z4 = ',z4.eval())
    print('argmax: ',mpo.eval())

z1 =  [[ -1.08000371   0.5739863    2.69522937  -2.87023972   0.53528184
    5.30545654   0.9228641    5.99089519   4.62482081  -0.68012586
   -4.66142356   3.3322048   -5.44911842  -1.53937687   1.68942127
   -1.32224183   1.30442347   1.80407338   4.73213777   1.20305486]
 [  0.60396624  -1.96363422  -1.79466465   3.30220639  -5.08630385
    3.86470833   6.14063237   0.27559071  -1.47517721   1.41696779
   -0.59904935   0.14347314   0.90410854  -2.16343364   2.43257249
   -0.01267571  -0.67353074   2.20289598   2.81399876  -6.67031613]
 [  2.35934527   1.38875148   0.26833781  -0.92991011   2.04458768
   -1.34757159   1.38849419   0.45125217  -0.79773443  -3.92290062
    1.14322825  -2.13118471  -0.68368382  -4.37336123   1.33492066
    0.24055074   0.52661102   2.84682108  -0.52686645   2.65455451]
 [ -1.28740666   2.48105255   1.35657981  -0.11098387   0.23706786
   -2.16368006  -1.57245473   1.18480708  -0.53468199  -5.01983669
    0.11379635  -2.69011468   0.59995688  -4.4381233 

## Softmax calcluation

In [25]:
# correct solution:
def softmax(x):
    """Compute softmax values for each sets of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0) # only difference