# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint



## Learning Objectives

At the end of the experiment, you will be able to:

* understand backpropagation using a multilayer perceptron (MLP)


In [None]:
#@title Experiment Explanation Video
from IPython.display import HTML

HTML("""<video width="800" height="300" controls>
  <source src="https://cdn.talentsprint.com/talentsprint1/archives/sc/aiml/aiml_batch_15/preview_videos/Back_Propagation.mp4" type="video/mp4">
</video>
""")



## Dataset

#### History

This is a multivariate dataset introduced by R. A. Fisher (often called the father of modern statistics) to showcase linear discriminant analysis. It is arguably the best-known dataset in the pattern recognition literature.


The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other. 

#### Description
The Iris dataset consists of 150 data instances in 3 classes (Iris Versicolor, Iris Setosa, and Iris Virginica), each with 50 instances.


For each flower, we have the following attributes:

- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm


## Domain Information



Iris plants are flowering plants with showy flowers. They are very popular among movie directors because they make an excellent background.

They are predominantly found in dry, semi-desert, or colder rocky mountainous areas in Europe and Asia. They have long, erect flowering stems and can produce white, yellow, orange, pink, purple, lavender, blue, or brown colored flowers. There are 260 to 300 types of iris.

![alt text](https://cdn-images-1.medium.com/max/1275/1*7bnLKsChXq94QjtAiRn40w.png)

As you can see, the flowers have 3 sepals and 3 petals. The sepals usually spread or droop downwards, and the petals stand upright, partly behind the sepal bases. However, the length and width of the sepals and petals vary for each type.


## AI / ML Technique

A feedforward neural network is an artificial neural network in which the connections between units do not form a cycle. Feedforward neural networks were the first type of artificial neural network invented and are simpler than their counterparts. Information travels only forward in the network (no loops): first through the input nodes, then through the hidden nodes, and finally through the output nodes. Backpropagation uses the output error to adjust the weights in order to minimize the loss.

### Setup Steps

In [1]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id="2100121"#@param{type:"string"}

In [4]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password="5142192291"#@param{type:"string"}

In [3]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython

ipython = get_ipython()
  
notebook="U3W13_23_Backpropagation_iris_B" #name of the notebook

def setup():
#  ipython.magic("sx pip3 install torch")
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/Iris.csv")
    from IPython.display import HTML, display
    display(HTML('<script src="https://dashboard.talentsprint.com/aiml/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():
    ipython.magic("notebook -e "+ notebook + ".ipynb")
    
    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword()}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:        
        print(r["err"])
        return None        
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None
    
    elif getAnswer() and getComplexity() and getAdditional() and getConcepts() and getWalkthrough() and getComments() and getMentorSupport():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional, 
              "concepts" : Concepts, "record_id" : submission_id, 
              "answer" : Answer, "id" : Id, "file_hash" : file_hash,
              "notebook" : notebook, "feedback_walkthrough":Walkthrough ,
              "feedback_experiments_input" : Comments,
              "feedback_mentor_support": Mentor_support}

      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:        
        print(r["err"])
        return None   
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://aiml.iiith.talentsprint.com/notebook_submissions")
        #print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
        return submission_id
    else: return submission_id
    

def getAdditional():
  try:
    if not Additional: 
      raise NameError
    else:
      return Additional  
  except NameError:
    print ("Please answer Additional Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None
  
def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None
  
  
def getWalkthrough():
  try:
    if not Walkthrough:
      raise NameError
    else:
      return Walkthrough
  except NameError:
    print ("Please answer Walkthrough Question")
    return None
  
def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None
  

def getMentorSupport():
  try:
    if not Mentor_support:
      raise NameError
    else:
      return Mentor_support
  except NameError:
    print ("Please answer Mentor support Question")
    return None

def getAnswer():
  try:
    if not Answer:
      raise NameError 
    else: 
      return Answer
  except NameError:
    print ("Please answer Question")
    return None
  

def getId():
  try: 
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup 
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup() 
else:
  print ("Please complete Id and Password cells before running setup")



Setup completed successfully


### Importing Required Packages

In [5]:
import pandas as pd
import numpy as np

# Importing torch package
import torch

### Loading the Data

In [6]:
# Load data using Pandas
iris = pd.read_csv("/content/Iris.csv")
iris.head()

Unnamed: 0,Id,sepal_length,sepal_width,petal_length,petal_width,species
0,1,5.1,3.5,1.4,0.2,Iris-setosa
1,2,4.9,3.0,1.4,0.2,Iris-setosa
2,3,4.7,3.2,1.3,0.2,Iris-setosa
3,4,4.6,3.1,1.5,0.2,Iris-setosa
4,5,5.0,3.6,1.4,0.2,Iris-setosa


In [7]:
# Shuffling the data using .sample()
df_sample = iris.sample(frac=1)
df_sample.head()

Unnamed: 0,Id,sepal_length,sepal_width,petal_length,petal_width,species
81,82,5.5,2.4,3.7,1.0,Iris-versicolor
127,128,6.1,3.0,4.9,1.8,Iris-virginica
51,52,6.4,3.2,4.5,1.5,Iris-versicolor
27,28,5.2,3.5,1.5,0.2,Iris-setosa
61,62,5.9,3.0,4.2,1.5,Iris-versicolor


### One-hot Encoding

Let us now convert the species column of df_sample to a one-hot encoding. One-hot encoding gives us 3 target columns, one for each neuron in the output layer.

To understand more about one-hot encoding using pandas read [pandas.get_dummies](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html)

In [17]:
# One-hot encoding to get 3 outputs (Setosa, Virginica, Versicolor)
species_one_hot = pd.get_dummies(df_sample.species) # Convert categorical variable into dummy/indicator variables.
df = df_sample.join(species_one_hot)
df = df.drop(["Id", "species"], axis=1) 
df.loc[[145, 60, 141]]  # Select few samples to see the DataFrame

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,Iris-setosa,Iris-versicolor,Iris-virginica
145,6.7,3.0,5.2,2.3,0,0,1
60,5.0,2.0,3.5,1.0,0,1,0
141,6.9,3.1,5.1,2.3,0,0,1
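
As a minimal illustration of what `pd.get_dummies` does, the sketch below encodes a small made-up Series (`toy_species` is hypothetical and not part of the Iris data). Each distinct label becomes one column, and every row has a single 1 in the column of its label.

```python
import pandas as pd

# Hypothetical toy labels, used only for illustration
toy_species = pd.Series(
    ["Iris-setosa", "Iris-virginica", "Iris-setosa", "Iris-versicolor"]
)

# dtype=int requests 0/1 integers; newer pandas versions default to booleans
toy_one_hot = pd.get_dummies(toy_species, dtype=int)

print(list(toy_one_hot.columns))  # one column per distinct label, sorted
print(toy_one_hot.values)         # exactly one 1 per row
```

The columns come out in sorted label order, which is why the label columns in the DataFrame above appear as Iris-setosa, Iris-versicolor, Iris-virginica.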


### Feedforward and Back Propagation 

In the feedforward pass, data flows in one direction from input to output through the hidden layers, and the predicted output is compared with the real value to get the error. To minimize the error, propagate it backward: find the derivative of the error with respect to each weight, scale it by the learning rate, and subtract the result from the weight value.

In [18]:
y = df[['Iris-setosa','Iris-versicolor','Iris-virginica']].values # Labels
x = df[['sepal_length','sepal_width','petal_length','petal_width']].values # Features
N = y.size
print(y.shape)
print(x.shape)
type(x), type(y)

(150, 3)
(150, 4)


(numpy.ndarray, numpy.ndarray)

In [22]:
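# y is a NumPy array, so it is indexed by position in the shuffled DataFrame,
# not by the original row labels used with df.loc above
print(y[145])
print(y[60])
print(y[141])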

[0 1 0]
[1 0 0]
[0 0 1]


#### Activation Function

The sigmoid function is an S-shaped curve that maps any real-valued input to a value between 0 and 1, which can be interpreted as a probability.

![alt text](https://cdn.talentsprint.com/aiml/Experiment_related_data/sigmoid.png)


In [23]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x)) # YOUR CODE HERE

#### Defining Hyperparameters for MLP

![alt text](https://cdn.talentsprint.com/aiml/Experiment_related_data/neural_net.png)

In [24]:
# Initializing the parameters
learning_rate = 0.01

n_input = 4  # Input features are 4
n_hidden = 2 # Hidden neurons are 2
n_output = 3 # Output neurons are 3

#### Randomize weights

Generate random input weights and hidden weights. The input weights have shape 4×2, as there are 4 input features and 2 hidden neurons. Similarly, the hidden weights have shape 2×3, as there are 2 hidden neurons and 3 output neurons.

**Note:** Refer to [np.random.normal](https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html)


In [25]:
# Randomly selecting the weights
np.random.seed(10)
weights_1 = np.random.normal(size=(n_input, n_hidden))   # Input weights shape is (4, 2)
print("weight1\n", weights_1)

weights_2 = np.random.normal(size=(n_hidden, n_output))  # Hidden weights shape is (2, 3)
print("weight2\n", weights_2)

print(sep='\n')
print(weights_1.shape, weights_2.shape)

weight1
 [[ 1.3315865   0.71527897]
 [-1.54540029 -0.00838385]
 [ 0.62133597 -0.72008556]
 [ 0.26551159  0.10854853]]
weight2
 [[ 0.00429143 -0.17460021  0.43302619]
 [ 1.20303737 -0.96506567  1.02827408]]

(4, 2) (2, 3)


#### Update the weights using feedforward and backpropagation

![alt text](https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/BackProp.PNG)

Basic algorithm that explains feedforward and backpropagation:

1. Inputs are taken.
2. The weights are usually randomly initialized.
3. Calculate the output of every neuron, from the input layer through the hidden layers to the output layer.
4. Apply the sigmoid function at the hidden and output layers.
5. Calculate the error in the outputs.

    *Error = Predicted Output – Actual Output*
6. Travel back from the output layer to the hidden layer to adjust the weights such that the error is minimized.
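
The steps above can be sketched as a short training loop. This is only an illustrative sketch under stated assumptions: the data (`x_toy`, `y_toy`) is randomly generated rather than taken from the Iris file, and the names `train_step` and `losses`, the number of iterations, and the random seed are hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_step(x, y, w1, w2, lr):
    """One feedforward pass followed by one backpropagation update."""
    # Feedforward, with sigmoid at the hidden and output layers
    hidden = sigmoid(x @ w1)
    output = sigmoid(hidden @ w2)
    # Error = predicted output - actual output
    error = output - y
    # Travel back: output layer first, then hidden layer
    delta_out = error * output * (1 - output)
    delta_hidden = (delta_out @ w2.T) * hidden * (1 - hidden)
    w2_new = w2 - lr * (hidden.T @ delta_out)
    w1_new = w1 - lr * (x.T @ delta_hidden)
    # Report the MSE measured before this update
    return w1_new, w2_new, np.mean(error ** 2)

# Assumed toy data: 30 samples, 4 features, 3 one-hot classes
rng = np.random.default_rng(0)
x_toy = rng.normal(size=(30, 4))
y_toy = np.eye(3)[rng.integers(0, 3, size=30)]

# Randomly initialized weights with the same shapes as in this experiment
w1 = rng.normal(size=(4, 2))
w2 = rng.normal(size=(2, 3))

losses = []
for _ in range(200):
    w1, w2, loss = train_step(x_toy, y_toy, w1, w2, lr=0.01)
    losses.append(loss)

print("first MSE:", losses[0], "last MSE:", losses[-1])
```

Repeating the update many times is what lets the error keep shrinking; a single pass only takes one small step.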







In [26]:
# Feedforward
hidden_layer_inputs = np.dot(x, weights_1)
hidden_layer_outputs = sigmoid(hidden_layer_inputs)

output_layer_inputs = np.dot(hidden_layer_outputs, weights_2)
output_layer_outputs = sigmoid(output_layer_inputs)

Calculate the Mean Square Error

In [27]:
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y, output_layer_outputs) # YOUR CODE HERE to calculate the MSE with initial weights i.e. 'y' and the 'output_layer_outputs'
mse

0.3365993169123329
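
For reference, the mean squared error can also be computed directly with NumPy. The sketch below uses small made-up arrays (`y_true_toy` and `y_pred_toy` are hypothetical) and averages the squared differences over every sample and every output, which should agree with the default behaviour of `mean_squared_error` for multi-output targets.

```python
import numpy as np

# Hypothetical one-hot targets and predictions for two samples
y_true_toy = np.array([[1, 0, 0], [0, 1, 0]])
y_pred_toy = np.array([[0.5, 0.5, 0.0], [0.0, 1.0, 0.5]])

# Mean of the squared differences over all samples and outputs
mse_manual = np.mean((y_true_toy - y_pred_toy) ** 2)
print(mse_manual)  # 0.125
```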

**Chain rule to find the change in error with respect to W2 (Weights):**

$\frac{\partial E}{\partial W_{2}} = \frac{\partial E}{\partial O} \cdot \frac{\partial O}{\partial Z_{2}} \cdot \frac{\partial Z_{2}}{\partial W_{2}}$

* **Change in error with respect to output**

  Suppose the actual output is represented as y and the predicted output is represented as O, then the error would be calculated as:

  $E = \frac{1}{2}\left ( y-O \right )^{2}$

  Differentiate the error with respect to the output

  $\frac{\partial E}{\partial O} = -\left (y-O  \right )$

   $\frac{\partial E}{\partial O} = O-y$

  In the code, predicted output is represented as output_layer_outputs and the actual output is represented as y

* **Change in output with respect to Z2** 

  Since $O = sigmoid(Z_{2})$, $\frac{\partial O}{\partial Z_{2}}$ is simply the derivative of the sigmoid function

  $f\left ( x \right ) = \frac{1}{1+e^{-x}}$

  $f'(x) = (1+e^{-x})^{-1}\left[1-(1+e^{-x})^{-1}\right]$

  ${f}'\left ( x \right ) = sigmoid(x)(1-sigmoid(x))$

  $\frac{\partial O}{\partial Z_{2}} = O(1-O)$ 

  In the code,  O is represented as output_layer_outputs

* **Change in $Z_2$ with respect to $W_2$ (Weights)**

  $Z_2 = W_2^{T} \cdot A_1$

  Differentiating $Z_2$ with respect to $W_2$ gives $A_1$ itself

  $\frac{\partial Z_{2}}{\partial W_{2}} = A_1$, where $A_1 = sigmoid(Z_{1})$
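
The sigmoid derivative identity used in the second bullet above can be checked numerically. The following sketch compares $sigmoid(x)(1-sigmoid(x))$ against a central finite difference; the helper name `sigmoid_derivative`, the step size `eps` and the test points are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Analytic form: f'(x) = f(x) * (1 - f(x))
    s = sigmoid(x)
    return s * (1 - s)

points = np.linspace(-5, 5, 11)
eps = 1e-5

# Central finite-difference approximation of the derivative
numeric = (sigmoid(points + eps) - sigmoid(points - eps)) / (2 * eps)
analytic = sigmoid_derivative(points)

max_diff = np.max(np.abs(numeric - analytic))
print("largest difference:", max_diff)
```

The derivative peaks at 0.25 for x = 0 and shrinks towards 0 for large positive or negative inputs.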

**Chain rule to find the change in error with respect to W2 (Weights):**

$\frac{\partial E}{\partial W_{2}} = \frac{\partial E}{\partial O} \cdot \frac{\partial O}{\partial Z_{2}} \cdot \frac{\partial Z_{2}}{\partial W_{2}}$

$\frac{\partial E}{\partial W_{2}} = (O-y) \cdot O(1-O) \cdot A_1$

**Update the weights using Gradient Descent**

$W_{new} = W_{old} - lr * \frac{\partial E}{\partial W_{2}} $

In the code, $W_{old}$ is represented as weights_2, $lr$ as learning_rate and $\frac{\partial E}{\partial W_{2}}$ as weights_2_update

Similarly, the weights W1 are updated using the chain rule.
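
Written out with the same symbols, a sketch of that chain rule for $W_1$ is given below. Here $Z_1$ denotes the hidden-layer input, $A_1 = sigmoid(Z_1)$, and $X$ is the input; the symbol $X$ is introduced only for this sketch (the code calls it x).

$\frac{\partial E}{\partial W_{1}} = \frac{\partial E}{\partial O} \cdot \frac{\partial O}{\partial Z_{2}} \cdot \frac{\partial Z_{2}}{\partial A_{1}} \cdot \frac{\partial A_{1}}{\partial Z_{1}} \cdot \frac{\partial Z_{1}}{\partial W_{1}}$

$\frac{\partial E}{\partial W_{1}} = (O-y) \cdot O(1-O) \cdot W_2 \cdot A_1(1-A_1) \cdot X$

In the code, the product of the first three factors is represented as hidden_layer_error, multiplying it by $A_1(1-A_1)$ gives hidden_layer_delta, and the final product with the input gives weights_1_update.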


In [28]:
# Updating the weights with respect to w2 (hidden weights)

derivative_error = output_layer_outputs - y  # ∂E/∂O
derivative_output = output_layer_outputs * (1 - output_layer_outputs) # ∂O/∂Z2
Change_in_output = derivative_output * derivative_error # ∂E/∂O * ∂O/∂Z2
weights_2_update =   np.dot(hidden_layer_outputs.T, Change_in_output) # ∂E/∂W2 = ∂E/∂O * ∂O/∂Z2 * ∂Z2/∂W2

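# Updating the weights with respect to w1 (input weights)
hidden_layer_error = np.dot(Change_in_output, weights_2.T) # ∂E/∂A1: output error propagated back through weights_2
hidden_layer_delta = hidden_layer_error * hidden_layer_outputs * (1 - hidden_layer_outputs) # ∂E/∂Z1 = ∂E/∂A1 * ∂A1/∂Z1
weights_1_update = np.dot(x.T, hidden_layer_delta) # ∂E/∂W1 = ∂E/∂Z1 * ∂Z1/∂W1

# Updating weights using gradient descent
# (the capitalized names Weights_1 and Weights_2 hold the updated weights; the lowercase originals are left unchanged)
Weights_2 = weights_2 - learning_rate * weights_2_update
Weights_1 = weights_1 - learning_rate * weights_1_update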
print("Updated Weights_2\n",Weights_2)
print("Updated Weight_1\n", Weights_1)

Updated Weights_2
 [[-0.12055543 -0.15566187  0.32814046]
 [ 1.11539675 -0.95315439  0.93061291]]
Updated Weight_1
 [[ 1.32146309  0.43668387]
 [-1.55238712 -0.13935396]
 [ 0.61825296 -0.92535942]
 [ 0.26495896  0.04267472]]
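
One way to gain confidence in update formulas like these is a numerical gradient check. The sketch below is not part of the original experiment: it uses random toy data (`x_toy`, `y_toy`), assumes the summed error $E = \frac{1}{2}\sum (y-O)^{2}$, and compares the backpropagation gradients with central finite differences. The helper names `loss`, `backprop_grads` and `numerical_grad` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(x, y, w1, w2):
    # Summed squared error, E = 1/2 * sum((y - O)^2)
    out = sigmoid(sigmoid(x @ w1) @ w2)
    return 0.5 * np.sum((y - out) ** 2)

def backprop_grads(x, y, w1, w2):
    # Same chain-rule steps as in the cell above
    hidden = sigmoid(x @ w1)
    out = sigmoid(hidden @ w2)
    delta_out = (out - y) * out * (1 - out)
    delta_hidden = (delta_out @ w2.T) * hidden * (1 - hidden)
    return x.T @ delta_hidden, hidden.T @ delta_out

def numerical_grad(f, w, eps=1e-6):
    # Central finite difference, one weight at a time
    grad = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[idx] += eps
        w_minus[idx] -= eps
        grad[idx] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return grad

# Assumed toy data and weights with the shapes used in this experiment
rng = np.random.default_rng(1)
x_toy = rng.normal(size=(10, 4))
y_toy = np.eye(3)[rng.integers(0, 3, size=10)]
w1 = rng.normal(size=(4, 2))
w2 = rng.normal(size=(2, 3))

g1, g2 = backprop_grads(x_toy, y_toy, w1, w2)
n1 = numerical_grad(lambda w: loss(x_toy, y_toy, w, w2), w1)
n2 = numerical_grad(lambda w: loss(x_toy, y_toy, w1, w), w2)

print("max difference for W1:", np.max(np.abs(g1 - n1)))
print("max difference for W2:", np.max(np.abs(g2 - n2)))
```

If the two gradients agree to several decimal places, the analytic derivation and its implementation are very likely consistent with each other.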


In [29]:
# Feed Forward with the updated weights
Hidden_layer_inputs = np.dot(x, Weights_1)
Hidden_layer_outputs = sigmoid(Hidden_layer_inputs)

Output_layer_inputs = np.dot(Hidden_layer_outputs, Weights_2)
Output_layer_outputs = sigmoid(Output_layer_inputs)

In [30]:
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(Output_layer_outputs,y) # YOUR CODE HERE to calculate the MSE with updated weights i.e. 'y' and the 'Output_layer_outputs'
mse

0.2568535636380534

Our goal with backpropagation is to update each of the weights in the network so that the predicted output is closer to the actual output.

### Please answer the questions below to complete the experiment:




In [31]:
#@title  What is backpropagation? { run: "auto", form-width: "500px", display-mode: "form" }
Answer = "It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn." #@param ["","It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn.", "To develop learning algorithm for single-layer feedforward neural network"]


In [32]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "Good, But Not Challenging for me" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [34]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "nn" #@param {type:"string"}


In [35]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "Yes" #@param ["","Yes", "No"]


In [36]:
#@title  Experiment walkthrough video? { run: "auto", vertical-output: true, display-mode: "form" }
Walkthrough = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [37]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [38]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [39]:
#@title Run this cell to submit your notebook for grading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")

Your submission is successful.
Ref Id: 11205
Date of submission:  12 Dec 2020
Time of submission:  02:24:04
View your submissions: https://aiml.iiith.talentsprint.com/notebook_submissions
