# Introduction to Neural Networks 


**Ecole Centrale Nantes**

**Diana Mateus**


**Participants:**

Sakthi Vikneswar Suresh Babu

## General description
In this lab we will create a simple classifier based on neural networks. We will proceed in two parts:
- In the first part, to better understand the operations involved, we will create a single-neuron model and optimize its parameters "by hand". For this part we will only use the **NumPy** library.
- In the second part, we will build a multi-layer perceptron with **Keras** and **TensorFlow**. TensorFlow is already installed on the university computers. If you are using your own computer, you should either have **TensorFlow** installed or use the online **Colab** platform.




In [None]:
import numpy as np
import matplotlib.pyplot as plt
import h5py

### Loading the dataset
Start by running the following cells to load and visualize the data.

In [None]:
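# Colab only: mount Google Drive and point IMDIR at the dataset folder.
# When running locally, skip this cell and set IMDIR to your local dataset directory instead.
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
IMDIR = '/content/drive/MyDrive/Colab Notebooks/dataset'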


In [None]:
def load_dataset(IMDIR):
    # Read the training images and labels; the context manager closes the file afterwards
    with h5py.File(IMDIR+'/train_catvnoncat.h5', "r") as train_dataset:
        train_x = np.array(train_dataset["train_set_x"][:])
        train_y = np.array(train_dataset["train_set_y"][:])
    # Read the test images, labels and class names
    with h5py.File(IMDIR+'/test_catvnoncat.h5', "r") as test_dataset:
        test_x = np.array(test_dataset["test_set_x"][:])
        test_y = np.array(test_dataset["test_set_y"][:])
        classes = np.array(test_dataset["list_classes"][:])

    # Store the labels as row vectors of shape (1, number of samples)
    train_y = train_y.reshape((1, train_y.shape[0]))
    test_y = test_y.reshape((1, test_y.shape[0]))

    return train_x, train_y, test_x, test_y, classes

train_x, train_y, test_x, test_y, classes=load_dataset(IMDIR)

#### Visualize data

In [None]:
# run several times to visualize different data points
# the title shows the ground truth class labels (0=no cat , 1 = cat)
index = np.random.randint(low=0,high=train_y.shape[1])
plt.imshow(train_x[index])
plt.title("Image "+str(index)+" label "+str(train_y[0,index]))
plt.show()
print ("Train X shape: " + str(train_x.shape))
print ("We have "+str(train_x.shape[0]), 
       "images of dimensionality " 
       + str(train_x.shape[1])+ "x"
       + str(train_x.shape[2])+ "x"
       + str(train_x.shape[3]))

#### Preprocessing
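In the following cells we vectorize the images: instead of a 3-D array (height x width x color channels), each image is given to the models as a 1-D vector. After the reshape and transpose, each column of the data matrix is one image. The normalization then divides by 255, which converts the pixel values to floats and scales the intensities to lie between 0 and 1.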

In [None]:
train_x, train_y, test_x, test_y, classes=load_dataset(IMDIR)
train_x = train_x.reshape(train_x.shape[0], -1).T
test_x = test_x.reshape(test_x.shape[0], -1).T
print ("Train X shape: " + str(train_x.shape))
print ("Train Y shape: " + str(train_y.shape))
print ("Test X shape: " + str(test_x.shape))
print ("Test Y shape: " + str(test_y.shape))

In [None]:
train_x = train_x/255.
test_x = test_x/255.
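
As a quick sanity check of the reshape and normalization steps, the same two operations can be tried on a tiny made-up array. This is only a sketch: the array `toy` and its size are assumptions chosen for illustration, not part of the dataset.

```python
import numpy as np

# Hypothetical toy batch: 2 "images" of size 2x2 with 3 channels, values 0..23
toy = np.arange(2 * 2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 2, 3)

# Flatten each image into one column: the result has shape (features, samples)
toy_flat = toy.reshape(toy.shape[0], -1).T

# Divide by 255: intensities end up in [0, 1] and uint8 becomes float64
toy_norm = toy_flat / 255.

print(toy_flat.shape)   # (12, 2)
print(toy_norm.dtype)   # float64
```

Column `k` of `toy_flat` holds all pixel values of image `k`, which is the layout the single-neuron model expects for $X$.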

### 1. Classification with a single neuron 


**a)** Fill in the following three functions to define the single-neuron model (a single neuron in the hidden layer):
- A function **initialize_parameters** for the neuron. The function randomly initializes the model's weights with small values and initializes the bias to 0. How many weights are required? Pass this number as a parameter to the function.
- A function **sigmoid** that computes the sigmoid activation function.
- A function **neuron** that, given an input vector, the weights and the bias, computes the output of the single-neuron model.

In [None]:
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))
#print(sigmoid(0.5))

In [None]:
def initialize_parameters(dim):
    w = np.random.randn(dim,1)*0.01
    b = 0
    return w, b
print(initialize_parameters(100))    

In [None]:
def neuron(w,b,X):
    pred_y = sigmoid(np.matmul(w.T,X) + b)
    return pred_y
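
One simple way to check these functions, sketched here on made-up data, is to use all-zero weights and a zero bias: the pre-activation is then 0 for every sample, so every output should be $\sigma(0)=0.5$. The names `X_toy` and `w_zero` are hypothetical, and the two functions are repeated only to keep the snippet self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(w, b, X):
    return sigmoid(np.matmul(w.T, X) + b)

# Hypothetical toy input: 4 features (rows), 3 samples (columns)
X_toy = np.random.rand(4, 3)
w_zero = np.zeros((4, 1))

out = neuron(w_zero, 0.0, X_toy)
print(out.shape)  # (1, 3): one prediction per sample
print(out)        # every entry is 0.5, since sigmoid(0) = 0.5
```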
  

**b)** **Forward Pass:**
Use the three functions above to compute a first forward pass for the input matrix $X$ containing the loaded dataset, for some initialization of the weights and bias.
 
 \begin{align}
 Y_{\rm pred}=\sigma(w^\top X+b) = [y_{\rm pred}^{(1)},y_{\rm pred}^{(2)},\dots,y_{\rm pred}^{(m)}]
 \end{align}
 

In [None]:
w,b = initialize_parameters(train_x.shape[0])
FP = neuron(w,b,train_x)
print(FP.shape)

**c) Cost estimation:**
 
We will use a binary cross-entropy loss, so that the empirical risk can be computed as:
 \begin{align}
 E = - \frac{1}{m} \sum_{i=1}^m \left[
 y^{(i)} \log(y_{\rm pred}^{(i)}) +
 (1-y^{(i)}) \log(1-y_{\rm pred}^{(i)}) \right]
 \end{align}
 
 The following cross-entropy function should return the scalar cost value computed over the entire dataset.

In [None]:
def crossentropy(Y,Ypred):
  # number of samples, read from the labels instead of hard-coding 209
  m = Y.shape[1]
  # vectorized sum of the per-sample log-likelihood terms
  cost = np.sum(Y*np.log(Ypred) + (1-Y)*np.log(1-Ypred))
  return -cost/m

crossentropy(train_y,FP)
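
Two properties of this cost are worth checking on a toy example. First, if every prediction is 0.5, the cost equals $\log 2 \approx 0.693$ whatever the labels are. Second, predictions of exactly 0 or 1 make `np.log` return `-inf`. A common workaround, shown below as a sketch, is to clip the predictions away from 0 and 1. The function name `crossentropy_clipped` and the margin `eps` are assumptions, not part of the lab statement.

```python
import numpy as np

def crossentropy_clipped(Y, Ypred, eps=1e-12):
    # eps is a hypothetical safety margin that keeps log() away from 0
    Ypred = np.clip(Ypred, eps, 1.0 - eps)
    return float(-np.mean(Y * np.log(Ypred) + (1 - Y) * np.log(1 - Ypred)))

Y_toy = np.array([[1, 0, 1, 0]])

# Uninformed predictions: the cost is log(2), about 0.693
uninformed = np.full((1, 4), 0.5)
print(crossentropy_clipped(Y_toy, uninformed))

# Saturated (and correct) predictions: the cost stays finite and close to 0
saturated = np.array([[1.0, 0.0, 1.0, 0.0]])
print(crossentropy_clipped(Y_toy, saturated))
```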

**d) Back propagation:**

After initializing the parameters and doing a forward pass, we need to backpropagate the cost: we compute its gradient with respect to the model parameters, which is later used to update the weights and the bias.

\begin{align}
\frac{\partial E}{\partial w} &= 
 \frac{1}{m} X(Y_{\rm pred}-Y)^\top = 
 \frac{1}{m} \sum_{i=1}^m x^{(i)}(y^{(i)}_{\rm pred}-y^{(i)})\\
\frac{\partial E}{\partial b} &= 
 \frac{1}{m} \sum_{i=1}^m(y^{(i)}_{\rm pred}-y^{(i)})
\end{align}

For a derivation of the gradient, see
https://en.wikipedia.org/wiki/Cross_entropy
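
The main steps of that derivation can be sketched with the chain rule. The symbols $z^{(i)}$ (pre-activation) and $\ell^{(i)}$ (loss of sample $i$) are notation introduced here for convenience and do not appear elsewhere in the lab.

\begin{align}
\ell^{(i)} &= -\left[ y^{(i)} \log(y_{\rm pred}^{(i)}) + (1-y^{(i)}) \log(1-y_{\rm pred}^{(i)}) \right], \qquad y_{\rm pred}^{(i)} = \sigma(z^{(i)}), \quad z^{(i)} = w^\top x^{(i)} + b\\
\frac{\partial \ell^{(i)}}{\partial y_{\rm pred}^{(i)}} &= \frac{y_{\rm pred}^{(i)} - y^{(i)}}{y_{\rm pred}^{(i)}(1-y_{\rm pred}^{(i)})}, \qquad \frac{\partial y_{\rm pred}^{(i)}}{\partial z^{(i)}} = y_{\rm pred}^{(i)}(1-y_{\rm pred}^{(i)})\\
\frac{\partial \ell^{(i)}}{\partial z^{(i)}} &= y_{\rm pred}^{(i)} - y^{(i)}, \qquad \frac{\partial z^{(i)}}{\partial w} = x^{(i)}, \quad \frac{\partial z^{(i)}}{\partial b} = 1
\end{align}

Multiplying the last two factors and averaging over the $m$ samples, since $E = \frac{1}{m}\sum_{i=1}^m \ell^{(i)}$, gives the expressions for $\partial E/\partial w$ and $\partial E/\partial b$ stated above.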

Fill in the backpropagation function, which receives as input the training set (X,Y) as well as the current predictions, and returns the gradients used to update the weights and the bias.

Hint: When the error is computed for several samples simultaneously, the gradient is averaged over the contribution of different samples.

In [None]:
def backpropagate(X, Y, Ypred):
    # number of samples
    m = X.shape[1]

    # gradients of the cost, averaged over the m samples
    dw1 = 1/m * np.matmul(X, (Ypred-Y).T)   # shape (number of features, 1)
    db1 = 1/m * np.sum(Ypred-Y)             # scalar
    grads = {"dw": dw1,
             "db": db1} 
    
    return grads
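
A standard way to gain confidence in analytic gradients is to compare them with central finite differences of the cost. The snippet below is a minimal sketch on random toy data. The helper names `cost_fn` and `analytic_grads`, the toy sizes and the step `h` are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_fn(w, b, X, Y):
    # binary cross-entropy averaged over the samples
    Ypred = sigmoid(np.matmul(w.T, X) + b)
    return -np.mean(Y * np.log(Ypred) + (1 - Y) * np.log(1 - Ypred))

def analytic_grads(w, b, X, Y):
    # same formulas as in the backpropagation step of this lab
    m = X.shape[1]
    Ypred = sigmoid(np.matmul(w.T, X) + b)
    dw = np.matmul(X, (Ypred - Y).T) / m
    db = np.sum(Ypred - Y) / m
    return dw, db

# Hypothetical toy problem: 5 features, 8 samples
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(5, 8))
Y_toy = rng.integers(0, 2, size=(1, 8))
w = rng.normal(size=(5, 1)) * 0.1
b = 0.2

dw, db = analytic_grads(w, b, X_toy, Y_toy)

# Central differences: perturb one parameter at a time by +h and -h
h = 1e-6
dw_num = np.zeros_like(w)
for k in range(w.shape[0]):
    step = np.zeros_like(w)
    step[k, 0] = h
    dw_num[k, 0] = (cost_fn(w + step, b, X_toy, Y_toy)
                    - cost_fn(w - step, b, X_toy, Y_toy)) / (2 * h)
db_num = (cost_fn(w, b + h, X_toy, Y_toy)
          - cost_fn(w, b - h, X_toy, Y_toy)) / (2 * h)

print(np.max(np.abs(dw - dw_num)))  # should be tiny
print(abs(db - db_num))             # should be tiny
```

If the two estimates disagree by much more than rounding error, the usual suspects are a missing $1/m$ factor or a transposed matrix product.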



**e) Optimization**
After initializing the parameters, computing the cost function, and calculating the gradients, we can now update the parameters using gradient descent. Use the functions implemented above to fill in the "gradient_descent" function, which optimizes the parameters given a training set X, Y, a fixed number of iterations, and a learning_rate. Store the value of the loss function at each iteration so that it can be plotted afterwards.

In [None]:
def gradient_descent(X, Y, iterations, learning_rate):
    costs = []
    # one weight per input feature; use the argument X rather than the global train_x
    w, b = initialize_parameters(X.shape[0])
    
    for i in range(iterations):
        Ypred = neuron(w,b,X)
        cost = crossentropy(Y, Ypred)
        grads= backpropagate(X,Y,Ypred)
        
        #update parameters
        w = w - learning_rate * grads['dw']
        b = b - learning_rate * grads['db']
        costs.append(cost)
        
        if i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))
       
    return w,b, costs

w, b, costs = gradient_descent(train_x,train_y,iterations=3000, learning_rate = 0.015)
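
The same update rule can be tried on a small synthetic problem where the outcome is easy to predict. The sketch below uses made-up, linearly separable data. The data, the learning rate of 0.1 and the 500 iterations are assumptions for illustration, not values from the lab.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m = 200
# Hypothetical toy data: 2 features, label 1 when the two features sum to a positive value
X_toy = rng.normal(size=(2, m))
Y_toy = (X_toy.sum(axis=0, keepdims=True) > 0).astype(float)

w = np.zeros((2, 1))
b = 0.0
learning_rate = 0.1
costs = []
for i in range(500):
    # forward pass and cost
    Ypred = sigmoid(np.matmul(w.T, X_toy) + b)
    costs.append(-np.mean(Y_toy * np.log(Ypred) + (1 - Y_toy) * np.log(1 - Ypred)))
    # gradient step on the weights and the bias
    w = w - learning_rate * np.matmul(X_toy, (Ypred - Y_toy).T) / m
    b = b - learning_rate * np.sum(Ypred - Y_toy) / m

print(costs[0], costs[-1])  # the first value is log(2); the last one is clearly lower
```

Starting from zero weights, the first cost is $\log 2$ because every prediction is 0.5. A cost that fails to decrease on such an easy problem usually points to a sign error in the update or to a learning rate that is too large.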

**f) Plot the training curve**
Plot the evolution of the cost over the iterations.

In [None]:
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations')
plt.show()

**g) Prediction**
Use the optimized parameters to make predictions both for the train and test sets and compute the accuracy for each. What do you observe?

In [None]:
def predict(w, b, X):
  # probabilities from the trained neuron, shape (1, number of samples)
  tmp = neuron(w, b, X)
  # threshold at 0.5; keep the (1, m) row layout so that it lines up with the labels
  y_pred = (tmp > 0.5).astype(float)

  return y_pred

# predict 
train_pred_y = predict(w, b, train_x)
test_pred_y = predict(w, b, test_x)
print("Train Acc: {} %".format(100 - np.mean(np.abs(train_pred_y - train_y)) * 100))
print("Test Acc: {} %".format(100 - np.mean(np.abs(test_pred_y - test_y)) * 100))
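
A shape pitfall is worth checking when computing the accuracy this way. If the predictions are stored as a column of shape (m, 1) while the labels are a row of shape (1, m), NumPy broadcasting silently produces an (m, m) matrix of all pairwise differences, and the resulting "accuracy" is meaningless. The toy arrays below are made up to illustrate the effect.

```python
import numpy as np

y_true = np.array([[1, 0, 1, 1]])   # shape (1, 4), same layout as train_y
p_row = np.array([[1, 0, 0, 1]])    # predictions as a row, shape (1, 4)
p_col = p_row.T                     # same predictions as a column, shape (4, 1)

print((p_row - y_true).shape)   # (1, 4): element-wise difference
print((p_col - y_true).shape)   # (4, 4): broadcasting compares every pair

acc_row = 100 - np.mean(np.abs(p_row - y_true)) * 100
acc_col = 100 - np.mean(np.abs(p_col - y_true)) * 100
print(acc_row)   # 75.0, the true accuracy (3 of 4 correct)
print(acc_col)   # 50.0, an artifact of the (4, 4) broadcast
```

Printing the shapes of the prediction and label arrays before subtracting them is a cheap way to catch this.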
    