In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import utils

<a name="3"></a>
##  Neural Networks


 

- Implementing a neural network with a single perceptron and two input nodes for linear regression
- Implementing forward propagation using matrix multiplication




<a name="3.1"></a>
###  Linear Regression


**Linear regression** is a linear approach for modelling the relationship between a scalar response (**dependent variable**) and one or more explanatory variables (**independent variables**). You will work with a linear regression with $2$ independent variables.

A linear regression model with two independent variables $x_1$ and $x_2$ can be written as

$$\hat{y} = w_1x_1 + w_2x_2 + b = Wx + b,\tag{6}$$

where $Wx$ is the dot product of the input vector $x = \begin{bmatrix} x_1 & x_2\end{bmatrix}$ and the parameter vector $W = \begin{bmatrix} w_1 & w_2\end{bmatrix}$, and the scalar parameter $b$ is the intercept.
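
As a quick worked instance of equation (6), take some illustrative values that are not from the dataset: $w_1 = 2$, $w_2 = -1$, $b = 0.5$, $x_1 = 3$ and $x_2 = 4$. Then

$$\hat{y} = 2 \cdot 3 + (-1) \cdot 4 + 0.5 = 2.5.$$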

The goal is the same: find the "best" parameters $w_1$, $w_2$ and $b$ such that the differences between the original values $y_i$ and the predicted values $\hat{y}_i$ are minimal.

We can use a neural network model to do that. Matrix multiplication will be at the core of the model!

<a name="3.2"></a>
###  Neural Network Model with a Single Perceptron and Two Input Nodes

Again, you will use only one perceptron, but with two input nodes shown in the following scheme:


The perceptron output for a training example $x = \begin{bmatrix} x_1& x_2\end{bmatrix}$ can be written as a dot product:

$$z = w_1x_1 + w_2x_2+ b = Wx + b$$

where weights are in the vector $W = \begin{bmatrix} w_1 & w_2\end{bmatrix}$ and bias $b$ is a scalar. The output layer will have the same single node $\hat{y}= z$.

Organise all training examples in a matrix $X$ of shape ($2 \times m$), so that each column holds the values $x_1$ and $x_2$ of one training example. Then the matrix multiplication of $W$ ($1 \times 2$) and $X$ ($2 \times m$) gives a ($1 \times m$) vector

$$WX = 
\begin{bmatrix} w_1 & w_2\end{bmatrix} 
\begin{bmatrix} 
x_1^{(1)} & x_1^{(2)} & \dots & x_1^{(m)} \\ 
x_2^{(1)} & x_2^{(2)} & \dots & x_2^{(m)} \\ \end{bmatrix}
=\begin{bmatrix} 
w_1x_1^{(1)} + w_2x_2^{(1)} & 
w_1x_1^{(2)} + w_2x_2^{(2)} & \dots & 
w_1x_1^{(m)} + w_2x_2^{(m)}\end{bmatrix}.$$

And the model can be written as

\begin{align}
Z &=  W X + b,\\
\hat{Y} &= Z,
\tag{8}\end{align}

where $b$ is broadcast to a vector of size ($1 \times m$). These are the calculations to perform in the forward propagation step.
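
A minimal NumPy sketch of this forward step may help to see the shapes involved. The values of `W`, `b` and `X` below are made up for illustration (they are not the fitted parameters), and the point is only that the ($1 \times 1$) bias is broadcast across all $m$ columns:

```python
import numpy as np

# Illustrative values only (not the fitted parameters).
W = np.array([[2.0, -1.0]])          # shape (1, 2)
b = np.array([[0.5]])                # shape (1, 1)
X = np.array([[3.0, 0.0, 1.0],       # x_1 for m = 3 examples
              [4.0, 1.0, 1.0]])      # x_2 for m = 3 examples

Z = W @ X + b                        # b is broadcast to shape (1, 3)
Y_hat = Z

print(Z.shape)   # (1, 3)
print(Y_hat)     # values: 2.5, -0.5, 1.5
```

The first column reproduces the single-example calculation $2 \cdot 3 - 1 \cdot 4 + 0.5 = 2.5$.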


Now you can compare the resulting vector of predictions $\hat{Y}$ ($1 \times m$) with the original vector of data $Y$. This is done with the so-called **cost function**, which measures how close your vector of predictions is to the training data and so evaluates how well the parameters $w$ and $b$ solve the problem. Many different cost functions are available, depending on the nature of your problem. For your simple neural network you can calculate it as:

$$\mathcal{L}\left(w, b\right)  = \frac{1}{2m}\sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right)^2.\tag{5}$$

The aim is to minimize the cost function during training, which minimizes the differences between the original values $y_i$ and the predicted values $\hat{y}_i$ (the division by $2m$ is just for scaling purposes).
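
To make the formula concrete, here is a small sketch that evaluates the cost on three made-up predictions and targets (the numbers are assumptions chosen so the arithmetic is easy to follow):

```python
import numpy as np

Y_hat = np.array([[1.0, 2.0, 3.0]])   # illustrative predictions, shape (1, m)
Y = np.array([[1.0, 2.0, 5.0]])       # illustrative targets, shape (1, m)
m = Y.shape[1]

# Squared differences are 0, 0 and 4, so the cost is 4 / (2 * 3).
cost = np.sum((Y_hat - Y) ** 2) / (2 * m)
print(cost)  # about 0.667
```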

When the weights have just been initialized with random values and no training has been done yet, you can't expect good results.

The next step is to adjust the weights and bias, in order to minimize the cost function. This process is called **backward propagation** and is done iteratively: you update the parameters with a small change and repeat the process.

*Note*: Backward propagation is not covered in this Course - it will be discussed in the next Course of this Specialization.

The general **methodology** to build a neural network is to:
1. Define the neural network structure (number of input units, number of hidden units, etc.).
2. Initialize the model's parameters.
3. Loop:
    - Implement forward propagation (calculate the perceptron output),
    - Implement backward propagation (to get the required corrections for the parameters),
    - Update parameters.
4. Make predictions.
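
The four steps above can be sketched end to end on synthetic data. Everything in this block is an assumption made for illustration: the data are generated from $y = 3x_1 - 2x_2 + 1$ without noise, the parameters start at zero, and the learning rate and iteration count are arbitrary choices that happen to converge here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, noise-free data (an assumption for this sketch): y = 3*x1 - 2*x2 + 1
m = 200
X = rng.standard_normal((2, m))            # shape (n_x, m)
Y = 3.0 * X[0:1] - 2.0 * X[1:2] + 1.0      # shape (1, m)

# 1. Structure: n_x inputs, a single output node.
n_x = X.shape[0]

# 2. Initialize parameters.
W = np.zeros((1, n_x))
b = np.zeros((1, 1))

# 3. Loop.
learning_rate = 0.1
for _ in range(2000):
    Y_hat = W @ X + b                      # forward propagation
    dZ = Y_hat - Y                         # backward propagation: cost gradients
    dW = dZ @ X.T / m
    db = np.sum(dZ) / m
    W = W - learning_rate * dW             # parameter update
    b = b - learning_rate * db

# 4. Predict for one new example (x1, x2) = (1, 1).
x_new = np.array([[1.0], [1.0]])
print(W, b, W @ x_new + b)
```

Because the data are noise-free, the loop should recover weights close to $3$ and $-2$ and a bias close to $1$, so the prediction for $(1, 1)$ is close to $2$.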

<a name="3.3"></a>
###  Parameters of the Neural Network

The neural network you will be working with has $3$ parameters: two weights and one bias. You will start by initializing the weights with small random numbers and the bias with zero, so that the algorithm has a starting point. The parameters will be stored in a dictionary.

<a name="3.4"></a>
###  1)  Forward propagation



Implement `forward_propagation()`.

**Instructions**:
- Look at the mathematical representation of your model:
\begin{align}
Z &=  W X + b\\
\hat{Y} &= Z,
\end{align}
- The steps you have to implement are:
    1. Retrieve each parameter from the dictionary "parameters" by using `parameters[".."]`.
    2. Implement forward propagation. Compute `Z` by multiplying the arrays `W` and `X` and adding the bias `b`. Set the prediction array `Y_hat` equal to `Z`.

In [2]:
# GRADED FUNCTION: forward_propagation

def forward_propagation(X, parameters):
    """
    Argument:
    X -- input data of size (n_x, m), where n_x is the input dimension (2 in our example) and m is the number of training examples
    parameters -- python dictionary containing your parameters (output of initialization function)
    
    Returns:
    Y_hat -- The output of size (1, m)
    """
    # Retrieve each parameter from the dictionary "parameters".
    W = parameters["W"]
    b = parameters["b"]
    
    # Implement Forward Propagation to calculate Z.
    ### START CODE HERE ### (~ 2 lines of code)
    Z = np.dot(W, X) + b
    Y_hat = Z
    ### END CODE HERE ###
    

    return Y_hat

<a name="3.5"></a>
### 2) Defining the cost function

The cost function used to train this model is

$$\mathcal{L}\left(w, b\right)  = \frac{1}{2m}\sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right)^2$$

The next implementation is not graded.

In [3]:
def compute_cost(Y_hat, Y):
    """
    Computes the cost function as a sum of squares
    
    Arguments:
    Y_hat -- The output of the neural network of shape (n_y, number of examples)
    Y -- "true" labels vector of shape (n_y, number of examples)
    
    Returns:
    cost -- sum of squares scaled by 1/(2*number of examples)
    
    """
    # Number of examples.
    m = Y.shape[1]

    # Compute the cost function.
    cost = np.sum((Y_hat - Y)**2)/(2*m)
    
    return cost

<a name="3.5"></a>
### 3) Initializing the parameters

In [4]:
def initialize_parameters(n_x):
    """
    Initializes parameters for a single-layer neural network.

    Arguments:
    n_x -- size of the input layer

    Returns:
    parameters -- python dictionary containing parameters:
                    W -- weight matrix of shape (1, n_x), small random values
                    b -- bias of shape (1, 1), initialized to zero
    """
    np.random.seed(1)
    W = np.random.randn(1, n_x) * 0.01
    b = np.zeros((1, 1))
    parameters = {"W": W, "b": b}
    
    return parameters
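
Because the seed is fixed inside the function, every call returns the same starting point. The sketch below repeats the same recipe under a hypothetical name, `init_sketch`, so that it runs on its own, and checks the shapes and the reproducibility:

```python
import numpy as np

def init_sketch(n_x):
    # Same recipe as above: fixed seed, small Gaussian weights, zero bias.
    np.random.seed(1)
    W = np.random.randn(1, n_x) * 0.01
    b = np.zeros((1, 1))
    return {"W": W, "b": b}

p1 = init_sketch(2)
p2 = init_sketch(2)

print(p1["W"].shape, p1["b"].shape)          # (1, 2) (1, 1)
print(np.array_equal(p1["W"], p2["W"]))      # True
```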


<a name="3.5"></a>
### 4) Training the neural network

In [28]:
def train_nn(parameters, Y_hat, X, Y, learning_rate=0.01):
    """
    Update the parameters using gradient descent.

    Arguments:
    parameters -- python dictionary containing parameters
    Y_hat -- The output of the neural network of shape (1, m)
    X -- input data of shape (n_x, m)
    Y -- "true" labels vector of shape (1, m)
    learning_rate -- learning rate for gradient descent

    Returns:
    parameters -- updated parameters
    """
    W = parameters["W"]
    b = parameters["b"]
    m = X.shape[1]

    # Compute gradients
    dZ = Y_hat - Y
    dW = np.dot(dZ, X.T) / m
    db = np.sum(dZ) / m

    # Update parameters
    W = W - learning_rate * dW
    b = b - learning_rate * db

    parameters = {"W": W, "b": b}
    
    return parameters
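
The gradients used in `train_nn` follow from differentiating the cost function with respect to the parameters. With $\hat{Y} = WX + b$,

$$\frac{\partial \mathcal{L}}{\partial W} = \frac{1}{m}\left(\hat{Y} - Y\right)X^T, \qquad \frac{\partial \mathcal{L}}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}^{(i)} - y^{(i)}\right).$$

One way to gain confidence in these expressions is to compare them with central finite differences. The sketch below does this on random data; the helper `cost`, the data and the step size `eps` are assumptions made for the check and are not part of the notebook's functions.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 50
X = rng.standard_normal((2, m))
Y = rng.standard_normal((1, m))
W = rng.standard_normal((1, 2))
b = np.zeros((1, 1))

def cost(W, b):
    # Same cost as in the text: sum of squares scaled by 1 / (2m).
    return np.sum((W @ X + b - Y) ** 2) / (2 * m)

# Analytic gradients, written the same way as in train_nn.
dZ = W @ X + b - Y
dW = dZ @ X.T / m
db = np.sum(dZ) / m

# Central finite differences, one parameter at a time.
eps = 1e-6
dW_num = np.zeros_like(W)
for j in range(W.shape[1]):
    step = np.zeros_like(W)
    step[0, j] = eps
    dW_num[0, j] = (cost(W + step, b) - cost(W - step, b)) / (2 * eps)
db_num = (cost(W, b + eps) - cost(W, b - eps)) / (2 * eps)

print(np.allclose(dW, dW_num, atol=1e-6), np.isclose(db, db_num, atol=1e-6))
```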


<a name="ex06"></a>
### 5) Implementing the Neural Network

Now you're ready to implement your neural network. The next function implements the training process and returns the updated parameters dictionary, which you can then use to make predictions.

In [16]:
# GRADED FUNCTION: nn_model

def nn_model(X, Y, num_iterations=1000, print_cost=False):
    """
    Arguments:
    X -- dataset of shape (n_x, number of examples)
    Y -- labels of shape (1, number of examples)
    num_iterations -- number of iterations in the loop
    print_cost -- if True, print the cost every 100 iterations
    
    Returns:
    parameters -- parameters learnt by the model. They can then be used to make predictions.
    """
    
    n_x = X.shape[0]
    
    # Initialize parameters
    parameters = initialize_parameters(n_x) 
    
    # Loop
    for i in range(0, num_iterations):
         
        ### START CODE HERE ### (~ 2 lines of code)
        # Forward propagation. Inputs: "X, parameters". Outputs: "Y_hat".
        Y_hat = forward_propagation(X, parameters)
        
        # Cost function. Inputs: "Y_hat, Y". Outputs: "cost".
        cost = compute_cost(Y_hat, Y)
        ### END CODE HERE ###
        
        # Parameters update.
        parameters = train_nn(parameters, Y_hat, X, Y, learning_rate=0.001)
        
        # Print the cost every 100 iterations.
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    return parameters

In [17]:
df = pd.read_csv("C://Users//Ravikrishna J//OneDrive//Desktop//NN//toy.csv")

In [18]:
df.head(500)

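| Unnamed: 0 | x1 | x2 | y |
|---|---|---|---|
| 0 | 1.624345 | -1.719394 | 0 |
| 1 | -0.611756 | 0.057121 | 0 |
| 2 | -0.528172 | -0.799547 | 0 |
| 3 | -1.072969 | -0.291595 | 0 |
| 4 | 0.865408 | -0.258983 | 0 |
| ... | ... | ... | ... |
| 495 | -0.828628 | -0.116444 | 0 |
| 496 | 0.528880 | -2.277298 | 0 |
| 497 | -2.237087 | -0.069625 | 0 |
| 498 | -1.107713 | 0.353870 | 1 |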


Let's first turn the data into NumPy arrays so that it can be passed to our function.

In [19]:
X = np.array(df[['x1','x2']]).T
Y = np.array(df['y']).reshape(1,-1)
df

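| Unnamed: 0 | x1 | x2 | y |
|---|---|---|---|
| 0 | 1.624345 | -1.719394 | 0 |
| 1 | -0.611756 | 0.057121 | 0 |
| 2 | -0.528172 | -0.799547 | 0 |
| 3 | -1.072969 | -0.291595 | 0 |
| 4 | 0.865408 | -0.258983 | 0 |
| ... | ... | ... | ... |
| 495 | -0.828628 | -0.116444 | 0 |
| 496 | 0.528880 | -2.277298 | 0 |
| 497 | -2.237087 | -0.069625 | 0 |
| 498 | -1.107713 | 0.353870 | 1 |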
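
The transpose in the cell above matters because the model expects one training example per column. A small sketch with a made-up three-row DataFrame (`df_demo` and its values are illustrative stand-ins for the real CSV) shows the resulting shapes:

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the real CSV (values are illustrative).
df_demo = pd.DataFrame({"x1": [1.0, 2.0, 3.0],
                        "x2": [4.0, 5.0, 6.0],
                        "y": [0, 1, 0]})

X_demo = df_demo[["x1", "x2"]].to_numpy().T       # shape (2, 3): one feature per row
Y_demo = df_demo["y"].to_numpy().reshape(1, -1)   # shape (1, 3)

print(X_demo.shape, Y_demo.shape)  # (2, 3) (1, 3)
```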


Run the next block to update the parameters dictionary with the fitted weights.

In [20]:
parameters = nn_model(X, Y, num_iterations=5000, print_cost=True)

Cost after iteration 0: 0.269763
Cost after iteration 100: 0.229133
Cost after iteration 200: 0.196193
Cost after iteration 300: 0.169488
Cost after iteration 400: 0.147836
Cost after iteration 500: 0.130282
Cost after iteration 600: 0.116048
Cost after iteration 700: 0.104507
Cost after iteration 800: 0.095149
Cost after iteration 900: 0.087561
Cost after iteration 1000: 0.081408
Cost after iteration 1100: 0.076418
Cost after iteration 1200: 0.072371
Cost after iteration 1300: 0.069090
Cost after iteration 1400: 0.066428
Cost after iteration 1500: 0.064269
Cost after iteration 1600: 0.062519
Cost after iteration 1700: 0.061098
Cost after iteration 1800: 0.059947
Cost after iteration 1900: 0.059012
Cost after iteration 2000: 0.058254
Cost after iteration 2100: 0.057639
Cost after iteration 2200: 0.057141
Cost after iteration 2300: 0.056736
Cost after iteration 2400: 0.056408
Cost after iteration 2500: 0.056141
Cost after iteration 2600: 0.055925
Cost after iteration 2700: 0.055750
Cost

<a name="4"></a>
## 4 - Make your predictions!

Now that you have the fitted parameters, you are able to predict any value with your neural network! You just need to perform the following computation:

$$ Z = W X + b$$ 

where $W$ and $b$ are stored in the parameters dictionary.

<a name="ex07"></a>
 

In [22]:
# GRADED FUNCTION: predict

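def predict(X, parameters):
    """
    Arguments:
    X -- input data of shape (n_x, m)
    parameters -- python dictionary containing the fitted W and b

    Returns:
    Z -- predictions of shape (1, m)
    """
    W = parameters['W']
    b = parameters['b']

    Z = np.dot(W, X) + b

    return Z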

In [23]:
y_hat = predict(X,parameters)

In [24]:
df['y_hat'] = y_hat[0]

Now let's check some predicted values versus the original ones:

In [27]:
# Test the neural network
predictions = predict(X, parameters)
print("Predictions:", predictions)
print("True Labels:", Y)

Predictions: [[ 0.54099273  0.36745281  0.17642235  0.15291907  0.69640498 -0.06619126
   0.8630934   0.33431738  0.45998803  0.59224306  1.34927822  0.00237099
   0.47554181  0.12513148  1.00854926  0.02889406  0.27234619  0.41901051
   0.57903619  0.95981531  0.57879422  0.70338819  0.80321414  1.10090243
   1.01250042  0.51151777  0.81466114  0.11257742  0.60725144  1.02082811
   0.73623707  0.21105812  0.26944469  0.3843581   0.26451475  0.42070141
   0.38451211  0.49853852  0.53365968  0.81750432  0.36305137  0.09414849
  -0.07555741  1.24293968 -0.02048062 -0.0785237   0.50591098  1.40933491
   0.88266049  0.40391156  0.56063211  0.04732428  0.45933166  0.05659851
   0.42864718  0.94232361  0.69824718  1.14672812  0.38355304  0.85885977
   0.52705001  0.83322314  1.01028217  0.57680154  0.24778455  1.49066605
   0.99678032  0.96869685  0.77611733  0.15688588  0.00834056  0.54385618
   0.65922248  1.22724567  0.64485437  0.09198853  0.53065401  0.71368334
   0.53983719  1.00776168