# Coding the forward propagation algorithm
In this exercise, you'll write code to do forward propagation (prediction) for your first neural network.


Each data point is a customer. The first input is how many accounts they have, and the second input is how many children they have. The model will predict how many transactions the customer makes in the next year. You will use this data throughout the first 2 chapters of this course.

# Building a neural network without an activation function, using NumPy

In [1]:
import numpy as np

In [17]:
input_data=np.array([2,3])
input_data

array([2, 3])

In [3]:
weights={'node_0':np.array([1,1]),
        'node_1':np.array([-1,1]),
        'output':np.array([2,-1])}

In [4]:
node_0_value=(input_data*weights['node_0']).sum()

In [5]:
node_1_value=(input_data*weights['node_1']).sum()

In [6]:
hidden_layer_values=np.array([node_0_value,node_1_value])

In [7]:
print(hidden_layer_values)

[5 1]


In [8]:
output=(hidden_layer_values*weights['output']).sum()

In [9]:
output

9
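Each node value above is a dot product, so the same forward pass can be sketched with matrix products. This is only an illustrative sketch, not part of the original exercise; the names `hidden_weights`, `hidden_values`, and `prediction` are introduced here and do not appear in the course code.

```python
import numpy as np

input_data = np.array([2, 3])
weights = {'node_0': np.array([1, 1]),
           'node_1': np.array([-1, 1]),
           'output': np.array([2, -1])}

# Stack the two hidden-node weight vectors into a 2x2 matrix (one row per node)
hidden_weights = np.array([weights['node_0'], weights['node_1']])

# One matrix-vector product computes both hidden node values at once
hidden_values = hidden_weights @ input_data

# The output is the dot product of the output weights and the hidden values
prediction = weights['output'] @ hidden_values

print(hidden_values, prediction)   # [5 1] 9
```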

# Neural networks with an activation function

# The Rectified Linear Activation Function
As Dan explained to you in the video, an "activation function" is a function applied at each node. It converts the node's input into some output.

The rectified linear activation function (called ReLU) has been shown to lead to very high-performance networks. This function takes a single number as an input, returning 0 if the input is negative, and the input if the input is positive.

Here are some examples:
relu(3) = 3 
relu(-3) = 0 
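For a whole array of node inputs, the same rule can be applied element-wise. The sketch below assumes NumPy's `np.maximum` is acceptable for this; the name `relu_vectorized` is hypothetical and is not the function the exercise asks for.

```python
import numpy as np

def relu_vectorized(values):
    """Element-wise ReLU: negative entries become 0, others pass through."""
    return np.maximum(0, values)

print(relu_vectorized(np.array([3, -3])))   # [3 0]
print(relu_vectorized(-3))                  # 0
```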

In [10]:
def relu(input):
    '''Define your relu activation function here'''
    # Calculate the value for the output of the relu function: output
    output = max(0, input)
    
    # Return the value just calculated
    return(output)

In [15]:
# Calculate node 0 value: node_0_output
node_0_input = (input_data * weights['node_0']).sum()
node_0_output = relu(node_0_input)
node_0_output

5

In [16]:
# Calculate node 1 value: node_1_output
node_1_input = (input_data * weights['node_1']).sum()
node_1_output = relu(node_1_input)
node_1_output

1

In [13]:
# Put node values into array: hidden_layer_outputs
hidden_layer_outputs = np.array([node_0_output, node_1_output])

# Calculate model output (do not apply relu)
model_output = (hidden_layer_outputs * weights['output']).sum()


In [14]:
# Print model output
print(model_output)

9


# Applying the network to many observations/rows of data
You'll now define a function called predict_with_network() which will generate predictions for multiple data observations, which are pre-loaded as input_data. As before, weights are also pre-loaded. In addition, the relu() function you defined in the previous exercise has been pre-loaded.

In [19]:
input_data=[np.array([3,5]),np.array([1,-1]),np.array([0,0]),np.array([8,4])]
input_data

[array([3, 5]), array([ 1, -1]), array([0, 0]), array([8, 4])]

In [20]:
weights={'node_0':np.array([2,4]),
        'node_1':np.array([4,-5]),
        'output':np.array([2,7])}

In [21]:
def predict_with_network(input_data_row, weights):

    # Calculate node 0 value
    node_0_input = (input_data_row * weights['node_0']).sum()
    node_0_output = relu(node_0_input)

    # Calculate node 1 value
    node_1_input = (input_data_row * weights['node_1']).sum()
    node_1_output = relu(node_1_input)

    # Put node values into array: hidden_layer_outputs
    hidden_layer_outputs = np.array([node_0_output, node_1_output])
    
    # Calculate model output
    input_to_final_layer = (hidden_layer_outputs * weights['output']).sum()
    model_output = relu(input_to_final_layer)
    
    # Return model output
    return(model_output)


In [22]:
# Create empty list to store prediction results
results = []
for input_data_row in input_data:
    # Append prediction to results
    results.append(predict_with_network(input_data_row, weights))

# Print results
print(results)
        

[52, 63, 0, 148]
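The loop above handles one row at a time. As a rough alternative, the rows can be stacked into a matrix so that every prediction is computed in one pass. This is a sketch under the assumption that all rows have the same length; `input_matrix`, `hidden_weights`, `relu_vectorized`, and `batch_results` are illustrative names.

```python
import numpy as np

def relu_vectorized(values):
    # Element-wise ReLU (hypothetical helper for this sketch)
    return np.maximum(0, values)

# One observation per row
input_matrix = np.array([[3, 5], [1, -1], [0, 0], [8, 4]])
weights = {'node_0': np.array([2, 4]),
           'node_1': np.array([4, -5]),
           'output': np.array([2, 7])}

# Column 0 holds the weights of node_0, column 1 the weights of node_1
hidden_weights = np.column_stack([weights['node_0'], weights['node_1']])

# Hidden layer outputs for every row at once, shape (4, 2)
hidden_outputs = relu_vectorized(input_matrix @ hidden_weights)

# Output layer, with ReLU applied as in predict_with_network() above
batch_results = relu_vectorized(hidden_outputs @ weights['output'])

print(batch_results.tolist())   # [52, 63, 0, 148]
```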


# Deep learning (important)

# Multi-layer neural networks
In this exercise, you'll write code to do forward propagation for a neural network with 2 hidden layers. Each hidden layer has two nodes. The input data has been preloaded as input_data. The nodes in the first hidden layer are called node_0_0 and node_0_1. Their weights are pre-loaded as weights['node_0_0'] and weights['node_0_1'] respectively.

The nodes in the second hidden layer are called node_1_0 and node_1_1. Their weights are pre-loaded as weights['node_1_0'] and weights['node_1_1'] respectively.

We then create a model output from the hidden nodes using weights pre-loaded as weights['output'].

In [23]:
input_data=np.array([3,5])

In [24]:
weights={'node_0_0':np.array([2,4]),
        'node_0_1':np.array([4,-5]),
        'node_1_0':np.array([-1,2]),
        'node_1_1':np.array([1,2]),
        'output':np.array([2,7])}

In [25]:
def predict_with_network(input_data):
    # Calculate node 0 in the first hidden layer
    node_0_0_input = (input_data * weights['node_0_0']).sum()
    node_0_0_output = relu(node_0_0_input)

    # Calculate node 1 in the first hidden layer
    node_0_1_input = (input_data * weights['node_0_1']).sum()
    node_0_1_output = relu(node_0_1_input)

    # Put node values into array: hidden_0_outputs
    hidden_0_outputs = np.array([node_0_0_output, node_0_1_output])

    # Calculate node 0 in the second hidden layer
    node_1_0_input = (hidden_0_outputs * weights['node_1_0']).sum()
    node_1_0_output = relu(node_1_0_input)

    # Calculate node 1 in the second hidden layer
    node_1_1_input = (hidden_0_outputs * weights['node_1_1']).sum()
    node_1_1_output = relu(node_1_1_input)

    # Put node values into array: hidden_1_outputs
    hidden_1_outputs = np.array([node_1_0_output, node_1_1_output])
    
    # Calculate output here: model_output
    model_output = (hidden_1_outputs * weights['output']).sum()
    
    # Return model_output
    return(model_output)

output = predict_with_network(input_data)
print(output)


182
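Writing out every node by hand gets long as layers are added. One possible generalisation, shown here only as a sketch, keeps each hidden layer as a weight matrix and loops over the layers. The names `hidden_layers`, `output_weights`, and `forward` are assumptions made for this example.

```python
import numpy as np

def relu_vectorized(values):
    # Element-wise ReLU (hypothetical helper for this sketch)
    return np.maximum(0, values)

input_data = np.array([3, 5])

# Each hidden layer is a matrix with one row of weights per node
hidden_layers = [
    np.array([[2, 4], [4, -5]]),    # node_0_0, node_0_1
    np.array([[-1, 2], [1, 2]]),    # node_1_0, node_1_1
]
output_weights = np.array([2, 7])

def forward(inputs, layers, out_weights):
    """Apply each hidden layer with ReLU, then the output weights (no ReLU)."""
    activations = inputs
    for layer in layers:
        activations = relu_vectorized(layer @ activations)
    return out_weights @ activations

print(forward(input_data, hidden_layers, output_weights))   # 182
```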


# Optimizing a neural network with backward propagation
Here, you'll learn how to optimize the predictions generated by your neural networks. You'll do this using a method called backward propagation, which is one of the most important techniques in deep learning. Understanding how it works will give you a strong foundation to build from in the second half of the course.



# Coding how weight changes affect accuracy
Now you'll get to change weights in a real network and see how they affect model accuracy!

Have a look at the following neural network:

[Figure: Ch2Ex4]

Its weights have been pre-loaded as weights_0. Your task in this exercise is to update a single weight in weights_0 to create weights_1, which gives a perfect prediction (in which the predicted value is equal to target_actual: 3).

In [26]:
# The data point you will make a prediction for
input_data = np.array([0, 3])

# Sample weights
weights_0 = {'node_0': [2, 1],
             'node_1': [1, 2],
             'output': [1, 1]
            }


In [29]:
def predict_with_network(input_data_row, weights):

    # Calculate node 0 value
    node_0_input = (input_data_row * weights['node_0']).sum()
    node_0_output = relu(node_0_input)

    # Calculate node 1 value
    node_1_input = (input_data_row * weights['node_1']).sum()
    node_1_output = relu(node_1_input)

    # Put node values into array: hidden_layer_outputs
    hidden_layer_outputs = np.array([node_0_output, node_1_output])
    
    # Calculate model output
    input_to_final_layer = (hidden_layer_outputs * weights['output']).sum()
    model_output = relu(input_to_final_layer)
    
    # Return model output
    return(model_output)

In [30]:
# The actual target value, used to calculate the error
target_actual = 3

# Make prediction using original weights
model_output_0 = predict_with_network(input_data, weights_0)


In [31]:
# Calculate error: error_0
error_0 = model_output_0 - target_actual


In [32]:
error_0

6

# Changing the weight

In [43]:
# Change a single weight (the second output weight, from 1 to 0) so that the
# prediction equals target_actual
weights_1 = {'node_0': [2, 1], 'node_1': [1, 2], 'output': [1, 0]}

In [34]:
# Make prediction using new weights: model_output_1
model_output_1 = predict_with_network(input_data, weights_1)

# Calculate error: error_1
error_1 = model_output_1 - target_actual


In [35]:
error_1

0
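One way to explore how a single weight affects the error is to sweep that weight over a few values and recompute the prediction each time. The sketch below varies only the second output weight; the candidate values and the helper name `predict` are chosen for illustration and are not part of the exercise.

```python
import numpy as np

def relu(value):
    return max(0, value)

def predict(row, weights):
    # Same structure as predict_with_network() above
    node_0 = relu((row * weights['node_0']).sum())
    node_1 = relu((row * weights['node_1']).sum())
    hidden = np.array([node_0, node_1])
    return relu((hidden * weights['output']).sum())

input_data = np.array([0, 3])
target_actual = 3
weights_0 = {'node_0': [2, 1], 'node_1': [1, 2], 'output': [1, 1]}

errors = {}
for candidate in [1.0, 0.5, 0.0]:
    # Copy weights_0 and change only the second output weight
    trial = dict(weights_0, output=[1, candidate])
    errors[candidate] = predict(input_data, trial) - target_actual
    print(candidate, errors[candidate])
```

With these numbers the error shrinks from 6 to 3 to 0 as the candidate weight goes from 1 to 0.5 to 0.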

# Scaling up to multiple data points
You've seen how different weights will have different accuracies on a single prediction. But usually, you'll want to measure model accuracy on many points. You'll now write code to compare model accuracies for two different sets of weights, which have been stored as weights_0 and weights_1.

input_data is a list of arrays. Each item in that list contains the data to make a single prediction. target_actuals is a list of numbers. Each item in that list is the actual value we are trying to predict.

In this exercise, you'll use the mean_squared_error() function from sklearn.metrics. It takes the true values and the predicted values as arguments.

You'll also use the preloaded predict_with_network() function, which takes an array of data as the first argument, and weights as the second argument.

In [36]:
input_data=[np.array([0,3]),np.array([1,2]),np.array([-1,-2]),np.array([4,0])]
input_data

[array([0, 3]), array([1, 2]), array([-1, -2]), array([4, 0])]

In [37]:
target_actuals=[1,3,5,7]

In [45]:
from sklearn.metrics import mean_squared_error

# Create model_output_0 
model_output_0 = []
# Create model_output_1
model_output_1 = []

weights_1={'node_0': [2, 1], 'node_1': [1, 1.5], 'output': [1, 1.5]}

In [46]:
for row in input_data:
    # Append prediction to model_output_0
    model_output_0.append(predict_with_network(row, weights_0))
    
    # Append prediction to model_output_1
    model_output_1.append(predict_with_network(row, weights_1))


In [47]:
# Calculate the mean squared error for model_output_0: mse_0
mse_0 = mean_squared_error(target_actuals, model_output_0)

# Calculate the mean squared error for model_output_1: mse_1
mse_1 = mean_squared_error(target_actuals, model_output_1)


In [48]:
# Print mse_0 and mse_1
print("Mean squared error with weights_0: %f" %mse_0)
print("Mean squared error with weights_1: %f" %mse_1)

Mean squared error with weights_0: 37.500000
Mean squared error with weights_1: 49.890625
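To see what `mean_squared_error()` computes, the value for weights_0 can be reproduced by hand with NumPy. In this sketch the predictions are hard-coded; they were worked out from weights_0 and the four input rows above, and the names `predictions_0` and `mse_manual` are illustrative.

```python
import numpy as np

target_actuals = np.array([1, 3, 5, 7])

# Predictions for the four rows using weights_0 (worked out by hand)
predictions_0 = np.array([9, 9, 0, 12])

# Square each error, then average
squared_errors = (predictions_0 - target_actuals) ** 2   # [64, 36, 25, 25]
mse_manual = squared_errors.mean()

print(mse_manual)   # 37.5
```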


# Calculating slopes
You're now going to practice calculating slopes. For the mean-squared error loss function, the slope with respect to the weights is 2 * x * (xb - y), or 2 * input_data * error, where error is the prediction minus the target. Note that x and b may have multiple numbers (x is a vector for each data point, and b is a vector). In this case, the output will also be a vector, which is exactly what you want.

You're ready to write the code to calculate this slope while using a single data point. You'll use pre-defined weights called weights as well as data for a single point called input_data. The actual value of the target you want to predict is stored in target.
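A quick way to convince yourself that 2 * input_data * error is the right slope is to compare it with a numerical estimate. The sketch below assumes a squared-error loss on a single data point and uses example values for `weights`, `input_data`, and `target`; the helper `squared_error` and the step size are assumptions made for illustration.

```python
import numpy as np

weights = np.array([1.0, 2.0])
input_data = np.array([3.0, 4.0])
target = 6.0

def squared_error(w):
    # Loss for a single data point: (prediction - target) squared
    return ((w * input_data).sum() - target) ** 2

# Analytic slope: 2 * input_data * error
error = (weights * input_data).sum() - target
analytic = 2 * input_data * error

# Numerical slope: nudge each weight up and down by a small step
step = 1e-6
numeric = np.zeros_like(weights)
for i in range(len(weights)):
    nudge = np.zeros_like(weights)
    nudge[i] = step
    numeric[i] = (squared_error(weights + nudge)
                  - squared_error(weights - nudge)) / (2 * step)

print(analytic)   # [30. 40.]
print(numeric)    # should be very close to the analytic slope
```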

If the learning rate is 0.01, a weight of 2 with a slope of -24 would be updated to 2 - 0.01(-24) = 2.24.
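The same arithmetic as a short sketch. The variable names are illustrative, with the current weight of 2 and the slope of -24 read off the formula above.

```python
weight = 2
slope = -24
learning_rate = 0.01

# Step against the slope: subtract learning_rate * slope from the weight
new_weight = weight - learning_rate * slope

print(round(new_weight, 2))   # 2.24
```

Because the slope is negative, subtracting it moves the weight upward.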

Code to calculate slopes and update weights:

In [1]: import numpy as np
In [2]: weights = np.array([1, 2])
In [3]: input_data = np.array([3, 4])
In [4]: target = 6
In [5]: learning_rate = 0.01
In [6]: preds = (weights * input_data).sum()
In [7]: error = preds - target
In [8]: print(error)
5
In [9]: gradient = 2 * input_data * error
In [10]: gradient
Out[10]: array([30, 40])
In [11]: weights_updated = weights - learning_rate * gradient
In [12]: preds_updated = (weights_updated * input_data).sum()
In [13]: error_updated = preds_updated - target
In [14]: print(error_updated)
2.5
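A single update moves the prediction closer to the target; repeating the update keeps shrinking the error. The loop below is a sketch using the same starting weights, data point, and learning rate, run for an arbitrary five steps. The list `errors` is introduced here to record the error before each update.

```python
import numpy as np

weights = np.array([1.0, 2.0])
input_data = np.array([3.0, 4.0])
target = 6.0
learning_rate = 0.01

errors = []
for step in range(5):
    preds = (weights * input_data).sum()
    error = preds - target
    errors.append(error)

    # Slope of the squared error with respect to each weight
    gradient = 2 * input_data * error

    # Move each weight a small step against its slope
    weights = weights - learning_rate * gradient

print(np.round(errors, 4))
```

With these particular numbers each update roughly halves the error, so the recorded errors start at 5 and 2.5 and keep falling.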

# In the code above, the slope is the same thing as the gradient

# BACKPROPAGATION

# In the big picture, we are trying to estimate the slope of the loss function w.r.t. each weight

# Backpropagation recap (most important)

1. Start at a random set of weights.
2. Use forward propagation to make a prediction.
3. Use backward propagation to calculate the slope of the loss function w.r.t. each weight.
4. Multiply that slope by the learning rate and subtract it from the current weight.
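The four steps above can be sketched for a tiny network with one hidden layer of two ReLU nodes. This is an illustrative sketch, not the course's implementation: it assumes a squared-error loss on a single data point, applies the chain rule by hand, and uses made-up values for the data point, target, learning rate, and number of steps. All variable names are introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

input_data = np.array([3.0, 4.0])
target = 6.0
learning_rate = 0.001

# 1) Start at a random set of weights
#    (kept positive here so that both ReLU nodes start out active)
hidden_weights = rng.uniform(0.1, 1.0, size=(2, 2))   # one row per hidden node
output_weights = rng.uniform(0.1, 1.0, size=2)

losses = []
for step in range(200):
    # 2) Forward propagation to make a prediction
    hidden_input = hidden_weights @ input_data
    hidden_output = np.maximum(0, hidden_input)
    prediction = output_weights @ hidden_output
    error = prediction - target
    losses.append(error ** 2)

    # 3) Backward propagation: slope of the loss w.r.t. each weight
    slope_output = 2 * error * hidden_output
    slope_hidden_output = 2 * error * output_weights
    # The slope of ReLU is 1 where its input is positive, otherwise 0
    slope_hidden_input = slope_hidden_output * (hidden_input > 0)
    slope_hidden = np.outer(slope_hidden_input, input_data)

    # 4) Multiply each slope by the learning rate and subtract it
    output_weights = output_weights - learning_rate * slope_output
    hidden_weights = hidden_weights - learning_rate * slope_hidden

print(losses[0], losses[-1])
```

The last recorded loss should be smaller than the first, which is the point of repeating steps 2 to 4.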


# When slopes are calculated on one batch at a time rather than on the full data, the method is called stochastic gradient descent; gradient descent uses all the data for each slope calculation
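The difference can be sketched with a simple linear model (no hidden layer), where each weight update uses the slope from one small batch instead of the full data. Everything in this example is hypothetical: the synthetic data, the `true_weights` used to generate it, the batch size, and the learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data set: 100 rows, 2 inputs, targets built from known weights
inputs = rng.normal(size=(100, 2))
true_weights = np.array([1.5, -2.0])
targets = inputs @ true_weights

weights = np.zeros(2)
learning_rate = 0.05
batch_size = 10

for epoch in range(20):
    # Shuffle the row order at the start of each pass over the data
    order = rng.permutation(len(inputs))
    for start in range(0, len(inputs), batch_size):
        batch = order[start:start + batch_size]
        errors = inputs[batch] @ weights - targets[batch]

        # Slope of the mean squared error on this batch only
        gradient = 2 * inputs[batch].T @ errors / len(batch)
        weights = weights - learning_rate * gradient

print(np.round(weights, 3))
```

Setting `batch_size` to `len(inputs)` would turn this into plain gradient descent, with one update per pass over the data.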