<a href="https://colab.research.google.com/github/hikmatfarhat-ndu/CSC645/blob/master/shallow_tensorflow_answer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Using TensorFlow to model the shallow network
In this exercise we use TensorFlow to redo the shallow network that we previously trained from first
principles to recognize the "flower" shape.


### Reading the data
First, recall that TensorFlow stacks the samples row-wise instead of column-wise,
as we did when we implemented gradient descent ourselves. Therefore, in the last line of the
function load_dataset() below we do not take the transpose of X and Y as we did before.
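
To make the two layouts concrete, here is a small NumPy sketch. The array values and the names `X_rows` and `X_cols` are made up for illustration and are not part of the exercise. With samples stacked row-wise the data matrix has shape (m, n_x); the column-wise layout we used earlier is simply its transpose.

```python
import numpy as np

# Three made-up samples with two features each (illustrative values only)
X_rows = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]], dtype='float32')  # row-wise: shape (m, n_x)

X_cols = X_rows.T  # column-wise layout from the earlier exercise: shape (n_x, m)

print(X_rows.shape)  # (3, 2)
print(X_cols.shape)  # (2, 3)

# The same sample is a row in one layout and a column in the other
print(np.array_equal(X_rows[0], X_cols[:, 0]))  # True
```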

In [None]:
import tensorflow as tf
import numpy as np

def load_dataset(n):
    np.random.seed(1)
    m = n # number of examples
    N = int(m/2) # number of points per class
    D = 2 # dimensionality
    X = np.zeros((m,D),dtype='float32') # data matrix where each row is a single example
    Y = np.zeros((m,1), dtype='float32') # labels vector (0 for red, 1 for blue)
    a = 4 # maximum radius of the flower

    for j in range(2):
        ix = range(N*j,N*(j+1))
        t = np.linspace(j*3.12,(j+1)*3.12,N) + np.random.randn(N)*0.2 # theta
        r = a*np.sin(4*t) + np.random.randn(N)*0.2 # radius
        X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]
        Y[ix] = j

    return X, Y



### Defining the parameters
Below we define the parameters that are needed. We know that n_x=2 and n_y=1, but we extract them from the shapes of
X and Y after we call load_dataset(). We also set the number
of data points to 500.

In [None]:
learning_rate = 5
nb_iterations = 10000
num_data=500 #number of data points
X,Y=load_dataset(num_data)

# Network parameters
n_h = 4 # number of neurons in the hidden layer
n_x = X.shape[1] # number of neurons in the input layer
n_y = Y.shape[1] # number of neurons in the output layer


### Initialization

Since TensorFlow stacks the data row-wise, forward propagation is slightly different from what we are used to.
Let $W^1$, $W^2$, $b^1$, $b^2$ be the weights and biases of the first and second layers respectively. Then forward propagation is defined as
\begin{align*}
Z^1&=X\cdot W^1+b^1\\
A^1 &=\sigma(Z^1)\\
Z^2 &=A^1\cdot W^2+b^2\\
A^2 &=\sigma(Z^2)
\end{align*}
According to the above equations, you have to define the TensorFlow variables that will hold the weights and biases, so $W^1$ has shape (n_x, n_h) and $W^2$ has shape (n_h, n_y).
The biases are set to zero using the TensorFlow function tf.zeros([size]), and the weights are initialized randomly with tf.initializers.RandomNormal(), called with the shape [size1, size2], using the appropriate sizes.
Because the code below runs eagerly (TensorFlow 2), no placeholders are needed for X and Y; the arrays are passed directly to the model.
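
The equations above can be sketched in plain NumPy to check the shapes before writing the TensorFlow version. This is only an illustration: the helper names `sigmoid` and `forward`, the `_demo` arrays, and the sample count of 5 are assumptions made for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Row-wise forward pass: each row of X is one sample."""
    Z1 = X @ W1 + b1      # (m, n_h); b1 of shape (n_h,) broadcasts over the rows
    A1 = sigmoid(Z1)
    Z2 = A1 @ W2 + b2     # (m, n_y)
    A2 = sigmoid(Z2)
    return A2

# Illustrative sizes: 5 samples, n_x=2, n_h=4, n_y=1
rng = np.random.default_rng(0)
m_demo, n_x_demo, n_h_demo, n_y_demo = 5, 2, 4, 1
X_demo = rng.normal(size=(m_demo, n_x_demo))
W1_demo = rng.normal(size=(n_x_demo, n_h_demo))
W2_demo = rng.normal(size=(n_h_demo, n_y_demo))
b1_demo = np.zeros(n_h_demo)
b2_demo = np.zeros(n_y_demo)

A2_demo = forward(X_demo, W1_demo, b1_demo, W2_demo, b2_demo)
print(A2_demo.shape)  # (5, 1)
```

Note that the bias vectors need no reshaping: a vector of length n_h added to an (m, n_h) matrix is broadcast across all rows.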

In [None]:

initializer = tf.initializers.RandomNormal()

W1=tf.Variable(initializer([n_x,n_h]),trainable=True,dtype=tf.float32) #weights of the first layer
W2=tf.Variable(initializer([n_h,n_y]),trainable=True,dtype=tf.float32) #weights of the second layer

b1=tf.Variable(tf.zeros([n_h]))            #biases of the first layer
b2=tf.Variable(tf.zeros([n_y]))            #biases of the second layer
print(n_y)

### Defining the model
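Our model has two layers. The function "model" below returns the output of our model for a given input. Note that it returns the pre-activation $Z^2$ (the logits), not $A^2$; the final sigmoid is applied inside the loss and in the prediction function.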

In [None]:
def model(inputs):
    # Hidden fully connected layer with n_h neurons (pre-activation Z1)
    layer_1 = tf.add(tf.matmul(inputs, W1), b1)
    # Output layer with n_y neurons (one here); returns the logits Z2, without the final sigmoid
    out_layer = tf.matmul(tf.sigmoid(layer_1), W2) + b2
    return out_layer

Once the model is defined, the remaining code is similar to our previous exercise. We define the loss
as the average cross-entropy, but this time, since it is binary classification, we use the sigmoid instead
of the softmax function. The optimizer then uses gradient descent to minimize the loss.
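
As an illustration of what a sigmoid cross-entropy computed from logits looks like, here is a NumPy sketch. The function names `stable_sigmoid_cross_entropy` and `naive_cross_entropy` are hypothetical helpers written for this example. The stable form is, to our understanding, the rearrangement described in the TensorFlow documentation for tf.nn.sigmoid_cross_entropy_with_logits; treat that as an assumption and check the documentation if it matters.

```python
import numpy as np

def stable_sigmoid_cross_entropy(logits, labels):
    # Rearranged form of -y*log(sigmoid(x)) - (1-y)*log(1-sigmoid(x))
    # that avoids overflow for logits of large magnitude
    return np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))

def naive_cross_entropy(logits, labels):
    # Textbook definition: apply the sigmoid first, then take logs
    a = 1.0 / (1.0 + np.exp(-logits))
    return -labels * np.log(a) - (1 - labels) * np.log(1 - a)

# Made-up logits and labels, for illustration only
logits_demo = np.array([[-2.0], [0.0], [3.0]])
labels_demo = np.array([[0.0], [1.0], [1.0]])

stable = stable_sigmoid_cross_entropy(logits_demo, labels_demo)
naive = naive_cross_entropy(logits_demo, labels_demo)
print(np.allclose(stable, naive))  # True
print(stable.mean())               # the averaged loss for these three samples
```

Working from the logits rather than from $A^2$ is why the model function does not apply the final sigmoid itself.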

In [None]:

# Define the loss: mean sigmoid cross-entropy computed from the logits
def loss(pred,label):
   return tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=pred, labels=label))


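The model is defined; now we train it. TensorFlow 2 executes eagerly, so no session is needed: tf.GradientTape records the forward pass and the optimizer applies the resulting gradients.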
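
For comparison with our earlier from-scratch version, the gradients that tape.gradient returns can be written out by hand. The NumPy sketch below does this for the row-wise convention and takes one gradient descent step. All names (`mean_loss`, `gradients`, the `_demo` arrays) and the step size are assumptions made for this illustration; it is not the code used in the exercise.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_loss(X, Y, W1, b1, W2, b2):
    # Mean binary cross-entropy of the two-layer network (row-wise samples)
    A2 = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return np.mean(-Y * np.log(A2) - (1 - Y) * np.log(1 - A2))

def gradients(X, Y, W1, b1, W2, b2):
    # Backpropagation written out by hand for the row-wise convention
    m = X.shape[0]
    A1 = sigmoid(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)
    dZ2 = (A2 - Y) / m                   # derivative of the mean loss w.r.t. Z2
    dW2 = A1.T @ dZ2
    db2 = dZ2.sum(axis=0)
    dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)   # chain rule through the hidden sigmoid
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)
    return dW1, db1, dW2, db2

# Made-up data and parameters, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 2))
Y_demo = rng.integers(0, 2, size=(6, 1)).astype(float)
W1_demo = rng.normal(size=(2, 4)); b1_demo = np.zeros(4)
W2_demo = rng.normal(size=(4, 1)); b2_demo = np.zeros(1)

lr_demo = 0.1  # illustrative step size, not the notebook's learning_rate
before = mean_loss(X_demo, Y_demo, W1_demo, b1_demo, W2_demo, b2_demo)
dW1, db1, dW2, db2 = gradients(X_demo, Y_demo, W1_demo, b1_demo, W2_demo, b2_demo)
W1_demo, b1_demo = W1_demo - lr_demo * dW1, b1_demo - lr_demo * db1
W2_demo, b2_demo = W2_demo - lr_demo * dW2, b2_demo - lr_demo * db2
after = mean_loss(X_demo, Y_demo, W1_demo, b1_demo, W2_demo, b2_demo)
print(before, after)  # the loss should decrease for a small enough step
```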

In [None]:
# Plain gradient descent optimizer
optimizer = tf.optimizers.SGD(learning_rate)

def train(model, inputs, labels):
    # Record the forward pass so the loss can be differentiated
    with tf.GradientTape() as tape:
        current_loss = loss(model(inputs), labels)
    grads = tape.gradient(current_loss, [W1, W2, b1, b2])
    optimizer.apply_gradients(zip(grads, [W1, W2, b1, b2]))

print(loss(model(X), Y))  # loss before training

for i in range(nb_iterations):
    if i % 100 == 0:
        print(loss(model(X), Y))
    train(model, X, Y)
 
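def prediction(X):
    # Convert the logits to probabilities, then threshold at 0.5
    a = tf.math.sigmoid(model(X))
    return tf.cast(a > 0.5, tf.int32)

pT = tf.transpose(prediction(X))  # shape (1, m)
print(np.dot(pT, Y))              # samples predicted 1 whose label is 1
print(np.dot(1 - pT, 1 - Y))      # samples predicted 0 whose label is 0
correct = np.dot(pT, Y) + np.dot(1 - pT, 1 - Y)
accuracy = 100 * float(np.squeeze(correct)) / float(Y.shape[0])
print("Accuracy=" + str(accuracy))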
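
The two dot products in the accuracy computation count the correct predictions of each class. A small NumPy sketch with made-up predictions and labels (the names `pred_demo` and `labels_demo` are assumptions for this example) shows the counting, along with an equivalent elementwise comparison.

```python
import numpy as np

# Made-up predictions and labels as (m, 1) columns, for illustration only
pred_demo = np.array([[1], [0], [1], [0], [1]])
labels_demo = np.array([[1], [0], [0], [0], [1]])

true_pos = np.dot(pred_demo.T, labels_demo)          # predicted 1 and label 1
true_neg = np.dot(1 - pred_demo.T, 1 - labels_demo)  # predicted 0 and label 0
correct_demo = np.squeeze(true_pos + true_neg)

accuracy_demo = 100 * float(correct_demo) / labels_demo.shape[0]
print(accuracy_demo)  # 80.0

# Equivalent and more direct: percentage of positions where they agree
print(100 * np.mean(pred_demo == labels_demo))  # 80.0
```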
