# Connect Intensive - Machine Learning Nanodegree
# Lesson 4: Neural Nets Mini-Project


## Objectives
  - Understand the fundamental concepts underlying [Artificial Neural Networks](https://en.wikipedia.org/wiki/Artificial_neural_network).
  - Use Python to develop and test simple Artificial Neural Networks.
 
 
## Acknowledgements
  - This lesson is adapted from one of [Thomas Corcoran's sessions](https://github.com/tccorcoran/Connect).


## 1. Build a Perceptron

In [None]:
# ----------
# 
# In this exercise, you will put the finishing touches on a perceptron class.
#
# Finish writing the activate() method by using np.dot to compute signal
# strength and then add in a threshold for perceptron activation.
#
# ----------


import numpy as np


class Perceptron:
    """
    This class models an artificial neuron with step activation function.
    """

    def __init__(self, weights = np.array([1]), threshold = 0):
        """
        Initialize weights and threshold based on input arguments. Note that no
        type-checking is being performed here for simplicity.
        """
        self.weights = weights
        self.threshold = threshold
    
    def activate(self,inputs):
        """
        Takes in @param inputs, a list of numbers with the same length as the
        weights. @return the output of a threshold perceptron for those inputs,
        based on the perceptron's weights and threshold.
        """ 

        # INSERT YOUR CODE HERE

        # TODO: calculate the strength with which the perceptron fires

        # TODO: return 0 or 1 based on the threshold
            
        return result


def test():
    """
    A few tests to make sure that the perceptron class performs as expected.
    If every assertion passes, a confirmation message is printed.
    """
    p1 = Perceptron(np.array([1, 2]), 0.)
    assert p1.activate(np.array([ 1,-1])) == 0 # < threshold --> 0
    assert p1.activate(np.array([-1, 1])) == 1 # > threshold --> 1
    assert p1.activate(np.array([ 2,-1])) == 0 # on threshold --> 0

    print("Tests successfully completed")


if __name__ == "__main__":
    test()
    


## 2. Threshold Meditation

What do you think the advantage of a **perceptron** is, compared with simply returning the dot product without a threshold? 
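
To make the comparison concrete, here is a minimal sketch that puts the raw dot product next to the thresholded output. The helper names `raw_signal` and `thresholded` are hypothetical and not part of the exercise code; the weights, inputs, and threshold are borrowed from the test in section 1.

```python
import numpy as np


def raw_signal(weights, inputs):
    # Hypothetical helper: the unthresholded signal is an unbounded real number
    return float(np.dot(weights, inputs))


def thresholded(weights, inputs, threshold=0.0):
    # Hypothetical helper: 1 if the signal is strictly above the threshold, else 0
    return int(np.dot(weights, inputs) > threshold)


weights = np.array([1, 2])
samples = [np.array([1, -1]), np.array([-1, 1]), np.array([2, -1])]

raw = [raw_signal(weights, x) for x in samples]
fired = [thresholded(weights, x) for x in samples]

print(raw)    # signal strengths
print(fired)  # discrete decisions
```

The raw values can take any magnitude, while the thresholded values are always 0 or 1, which may help when thinking about the question above.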

## 3. Where to train Perceptrons

We want to build networks of **perceptrons** that contain interesting functions. What are the parameters of the **perceptrons** that we will want to modify? 

•	Output functions  
•	Input values  
•	Thresholds   
•	Weights   
•	Input functions   






## 4. Perceptron Inputs

**Neural networks** are built out of components like perceptron units. What do inputs to networks of perceptrons look like? 

•	A matrix of numerical values, with some labeled output.  
•	A directed graph.  
•	An unlabeled matrix of numerical values.  
•	A set of classifications of numerical values.  
•	A matrix of numerical values with classifications for each row.   




## 5. Neural Net Outputs

What information can we get as the output of a **neural network**? 

•	A directed graph, the neural network itself.   
•	A single scalar-valued number.   
•	The classification of a vector.  
•	A vector valued output for any vector input.   





## 6. Perceptron Update Rule
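
As a reference point for this exercise, the perceptron training rule is commonly written as `w <- w + eta * (target - prediction) * x`. The sketch below applies one such step outside the class. `perceptron_step` is a hypothetical helper, not part of the exercise code, and it assumes the same strict `>` comparison against the threshold that `activate()` uses.

```python
import numpy as np


def perceptron_step(weights, x, target, eta=0.1, threshold=0.0):
    """Return the weights after one training example (hypothetical helper)."""
    prediction = int(np.dot(x, weights) > threshold)
    # Return a new array rather than updating in place, so that integer
    # weight arrays are promoted to float instead of being modified in place
    return weights + eta * (target - prediction) * x


w = perceptron_step(np.array([1, 1, 1]), np.array([2, 0, -3]), target=1)
print(w)
```

When the prediction already matches the target, the error term is zero and the weights are left unchanged.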


In [None]:
# ----------
#
# In this exercise, you will update the perceptron class so that it can update
# its weights.
#
# Finish writing the update() method so that it updates the weights according
# to the perceptron update rule.
# 
# ----------

import numpy as np


class Perceptron:
    """
    This class models an artificial neuron with step activation function.
    """

    def __init__(self, weights = np.array([1]), threshold = 0):
        """
        Initialize weights and threshold based on input arguments. Note that no
        type-checking is being performed here for simplicity.
        """
        self.weights = weights
        self.threshold = threshold
        
        print(self.weights)
        print(self.threshold)


    def activate(self, values):
        """
        Takes in @param values, a list of numbers with the same length as the
        weights. @return the output of a threshold perceptron for those inputs,
        based on the perceptron's weights and threshold.
        """
               
        # First calculate the strength with which the perceptron fires
        strength = np.dot(values,self.weights)
        
        # Then return 0 or 1 depending on strength compared to threshold  
        return int(strength > self.threshold)


    def update(self, values, train, eta=.1):
        """
        Takes in a 2D array @param values consisting of a LIST of inputs and a
        1D array @param train, consisting of a corresponding list of expected
        outputs. Updates internal weights according to the perceptron training
        rule using these values and an optional learning rate, @param eta.
        """
        
        
        
        
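        # YOUR CODE HERE

        # TODO: for each data point...

            # TODO: obtain the neuron's prediction for that point

            # TODO: update self.weights based on prediction accuracy, learning
            # rate and input value
            # NOTE: the tests below pass integer weight arrays, so assign a new
            # array (self.weights = self.weights + ...) rather than using +=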

            
            
            
            

def test():
    """
    A few tests to make sure that the perceptron class performs as expected.
    If every assertion passes, a confirmation message is printed.
    """
    def sum_almost_equal(array1, array2, tol = 1e-6):
        return sum(abs(array1 - array2)) < tol

    p1 = Perceptron(np.array([1,1,1]),0)
    p1.update(np.array([[2,0,-3]]), np.array([1]))
    assert sum_almost_equal(p1.weights, np.array([1.2, 1, 0.7]))

    p2 = Perceptron(np.array([1,2,3]),0)
    p2.update(np.array([[3,2,1],[4,0,-1]]),np.array([0,0]))
    assert sum_almost_equal(p2.weights, np.array([0.7, 1.8, 2.9]))

    p3 = Perceptron(np.array([3,0,2]),0)
    p3.update(np.array([[2,-2,4],[-1,-3,2],[0,2,1]]),np.array([0,1,0]))
    assert sum_almost_equal(p3.weights, np.array([2.7, -0.3, 1.7]))
    
    print("Tests successfully completed")

if __name__ == "__main__":
    test()

## 7. Layered Network Example

In general we place units together to form **layered networks**. This will be represented as follows. 

[[node,node,node], #input layer  
[node,node],       #hidden layer  
[node]]            #output layer  

Given weights for the hidden layer nodes of [1,1,-5] and [3,-4,2],  
and weights for the output layer of [2,-1],  
what will this network output for inputs [1,2,3]? 
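
One way to check a hand calculation is to treat each layer as a matrix of weights. The sketch below assumes every node simply passes along its weighted sum with no threshold; with step-activation perceptrons the intermediate values would differ. The variable names are illustrative only.

```python
import numpy as np

inputs = np.array([1, 2, 3])

# One row of weights per hidden node
hidden_weights = np.array([[1, 1, -5],
                           [3, -4, 2]])
output_weights = np.array([2, -1])

# Assumption: linear units, so each node outputs its dot product unchanged
hidden_out = hidden_weights.dot(inputs)
network_out = output_weights.dot(hidden_out)

print(hidden_out)
print(network_out)
```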





In [None]:
from IPython.display import display

import numpy as np 

input_layer = [1,2,3]
hidden_layer = [1,1,-5]
hidden_layer_2 = [3,-4,2]
output_layer = [2,-1]

step_1 = None  # TODO: fill in

step_2 = None  # TODO: fill in

step_3 = None  # TODO: fill in

step_4 = None  # TODO: fill in


step_4

## 8. Linear Representational Power

We would like additional layers to give us more representational power. For linear units, which just take weighted sums, they do not. Given the following network, where each node simply passes along the dot product of its inputs with its weights, write down the weights of a single linear node that computes the same function. 

[[input,input],  
[[3,2],[-1,4],[3,-5]],  
[[1,2,-1]]]  


[,]
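
To check a candidate answer, note that stacking linear layers amounts to multiplying their weight matrices. The following sketch compares the two-layer network against a single node on a few random inputs; the random inputs and the variable names are assumptions made for illustration.

```python
import numpy as np

hidden_weights = np.array([[3, 2], [-1, 4], [3, -5]])  # 3 hidden nodes, 2 inputs each
output_weights = np.array([[1, 2, -1]])                # 1 output node, 3 inputs

# Weights of the equivalent single linear node
combined = output_weights.dot(hidden_weights)

rng = np.random.RandomState(0)
for _ in range(5):
    x = rng.randn(2)
    two_layer = output_weights.dot(hidden_weights.dot(x))
    one_node = combined.dot(x)
    # Both computations agree for every input
    assert np.allclose(two_layer, one_node)

print(combined.shape)
```

Printing `combined` shows the weights themselves, so you may prefer to work them out by hand first.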



## 9. Build the XOR Network


In [None]:
# ----------
#
# In this exercise, you will create a network of perceptrons that can represent
# the XOR function, using a network structure like those shown in the previous
# quizzes.
#
# You will need to do two things:
# First, create a network of perceptrons with the correct weights
# Second, define a procedure EvalNetwork() which takes in a list of inputs and
# outputs the value of this network.
#
# ----------

import numpy as np

class Perceptron:
    """
    This class models an artificial neuron with step activation function.
    """

    def __init__(self, weights = np.array([1]), threshold = 0):
        """
        Initialize weights and threshold based on input arguments. Note that no
        type-checking is being performed here for simplicity.
        """
        self.weights = weights
        self.threshold = threshold


    def activate(self, values):
        """
        Takes in @param values, a list of numbers with the same length as the
        weights. @return the output of a threshold perceptron for those inputs,
        based on the perceptron's weights and threshold.
        """
               
        # First calculate the strength with which the perceptron fires
        strength = np.dot(values,self.weights)
        
        # Then return 0 or 1 depending on strength compared to threshold  
        return int(strength > self.threshold)
    

# Part 1: Set up the perceptron network
Network = [
    # input layer, declare input layer perceptrons here
    [ ... ],
    # output node, declare output layer perceptron here
    [ ... ]
]

# Part 2: Define a procedure to compute the output of the network, given inputs
def EvalNetwork(inputValues, Network):
    """
    Takes in @param inputValues, a list of input values, and @param Network
    that specifies a perceptron network. @return the output of the Network for
    the given set of inputs.
    """
    
    # YOUR CODE HERE

    # Be sure your output value is a single number
    return OutputValue

def test():
    """
    Prints the output of the network for each XOR input pair, next to the
    value it is expected to produce.
    """
    print("0 XOR 0 = 0?:", EvalNetwork(np.array([0,0]), Network))
    print("0 XOR 1 = 1?:", EvalNetwork(np.array([0,1]), Network))
    print("1 XOR 0 = 1?:", EvalNetwork(np.array([1,0]), Network))
    print("1 XOR 1 = 0?:", EvalNetwork(np.array([1,1]), Network))

if __name__ == "__main__":
    test()

## 10. Discretion Quiz

One problem with perceptron units is that their outputs are discrete. This makes it difficult to address regression problems with them, and it means a network needs more complexity to capture some concepts. 

For example: 

Suppose we want a network of perceptrons to predict house prices. Given a network with structure [2,2,1], that is, two input nodes, two hidden nodes, and one output node, how many distinct prices could the network assign to a house? 
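
A quick experiment can support whatever answer you arrive at. The sketch below draws random weights and inputs for a [2,2,1] network and records every distinct output it sees. It assumes the input nodes pass their values through unchanged and that every perceptron uses a threshold of 0; `step` is a hypothetical helper.

```python
import numpy as np


def step(strength, threshold=0.0):
    # Hypothetical helper: step activation, applied elementwise
    return (strength > threshold).astype(int)


rng = np.random.RandomState(0)
seen = set()
for _ in range(1000):
    w_hidden = rng.randn(2, 2)  # two hidden perceptrons, two inputs each
    w_out = rng.randn(2)        # one output perceptron, two inputs
    x = rng.randn(2)
    hidden = step(w_hidden.dot(x))
    seen.add(int(step(w_out.dot(hidden))))

print(sorted(seen))
```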





## 11. Activation Function Sandbox

As discussed in the previous lesson, to solve the problem of having only a few discrete outputs from our neural net, we'll apply a transition (activation) function.

We'll start by letting you test out a variety of functions, numerically approximating their derivatives in order to apply a gradient descent update rule.
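
As a sanity check on the numerical approach, the sketch below compares a central-difference approximation with the known closed form for the logistic function, d/dx logistic(x) = logistic(x)*(1-logistic(x)). The names `logistic` and `central_difference` and the evaluation point are chosen for illustration.

```python
import numpy as np


def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))


def central_difference(f, x, h=1e-5):
    # Symmetric finite difference around x
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.5
numeric = central_difference(logistic, x)
analytic = logistic(x) * (1 - logistic(x))

print(abs(numeric - analytic))
```

The gap between the two values should be tiny, which suggests the numerical estimate is adequate for experimenting with activation functions whose derivative is awkward to write down.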




In [None]:
# NOTE: This exercise has to be run within the online quiz.


# ----------
# 
# Python Neural Networks code originally by Szabo Roland and used with
# permission
#
# Modifications, comments, and exercise breakdowns by Mitchell Owen,
# (c) Udacity
#
# Retrieved originally from http://rolisz.ro/2013/04/18/neural-networks-in-python/
#
#
# Neural Network Sandbox
#
# Define an activation function activate(), which takes in a number and
# returns a number.
# Using test run you can see the performance of a neural network running with
# that activation function, where the inputs are 8x8 images of digits (0-9) and
# the outputs are digit predictions made by the network.
#
# ----------

import numpy as np


def activate(strength):
    # Try out different functions here. Input strength will be a number, with
    # another number as output.
    return np.power(strength,2)
    
def activation_derivative(activate, strength):
    # Numerically approximate the derivative with a central difference
    return (activate(strength+1e-5)-activate(strength-1e-5))/(2e-5)

## 12. Activation Function Quiz

We have decided that we need a function which is continuous (to avoid the discrete problem of perceptrons) but not linear (to allow us to represent non-linear functions). Which of the following seems appropriate? 


•	Sine  
•	Arctangent   
•	Natural logarithm   
•	Cube root  
•	Logistic function   


## 13. Perceptron Vs Sigmoid


What will the difference be between a single perceptron and a sigmoid unit on binary classification problems? 

•	There will be no difference.   
•	One gives more information but they will give the same answer.   
•	They will sometimes give different answers.   
•	They will always give different answers.  
•	They will give different answers in certain rare circumstances.   
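
To explore this question empirically, the sketch below feeds the same inputs to a step unit and a sigmoid unit that share one weight vector. It assumes a perceptron threshold of 0 and a cut-off of 0.5 for turning the sigmoid output into a class; both choices, the helper names, and the random inputs are assumptions. The weights are borrowed from the test in section 16.

```python
import numpy as np


def perceptron_output(weights, x, threshold=0.0):
    return int(np.dot(weights, x) > threshold)


def sigmoid_output(weights, x):
    return 1.0 / (1.0 + np.exp(-np.dot(weights, x)))


weights = np.array([3.0, -2.0, 1.0])
rng = np.random.RandomState(0)

results = []
for _ in range(5):
    x = rng.randn(3)
    hard = perceptron_output(weights, x)
    soft = sigmoid_output(weights, x)
    # Store the step output, the sigmoid output, and the sigmoid's class at 0.5
    results.append((hard, soft, int(soft > 0.5)))
    print(hard, round(float(soft), 3))
```

Comparing the first and last entries of each tuple in `results` shows how often the two units agree under these assumptions.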











## 14. Sigmoid Learning

We need to train our networks of sigmoid units like we trained networks of perceptrons. How should we determine our update rules? 

•	Arbitrarily.   
•	Using our intuition.    
•	Using domain knowledge.   
•	Using calculus.   
•	Using trigonometry.   
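
For reference, here is a sketch of how an update rule for a single sigmoid unit can be derived. It assumes a squared-error loss, which this lesson does not state explicitly; $\sigma$ is the logistic function, $s$ the input strength, $y$ the expected output, $\hat{y}$ the unit's output, and $\eta$ the learning rate.

```latex
\begin{aligned}
s &= \sum_i w_i x_i, \qquad \hat{y} = \sigma(s), \qquad E = \tfrac{1}{2}\,(y - \hat{y})^2 \\
\frac{\partial E}{\partial w_i} &= -(y - \hat{y})\,\sigma'(s)\,x_i,
  \qquad \sigma'(s) = \sigma(s)\,\bigl(1 - \sigma(s)\bigr) \\
\Delta w_i &= -\eta\,\frac{\partial E}{\partial w_i}
  = \eta\,(y - \hat{y})\,\sigma(s)\,\bigl(1 - \sigma(s)\bigr)\,x_i
\end{aligned}
```

Each factor corresponds to one ingredient of the update: learning rate, signal accuracy, function slope, and input value.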





## 15. Gradient Descent Issues

Using calculus, **gradient descent** can provide us with a locally optimal set of weight changes, under certain assumptions. However, some issues can arise. Which of the following do you think could be problematic? 

•	Local extrema.   
•	Lengthy run times.   
•	Infinite loops.  
•	Failure to completely converge.   
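
The first of these issues is easy to reproduce in one dimension. The sketch below runs plain gradient descent on a hand-picked function with two minima, starting from two different points. The function, learning rate, and step count are arbitrary illustrative choices and do not come from the lesson.

```python
def f(x):
    # Illustrative function with two minima of different depth
    return x**4 - 3 * x**2 + x


def grad_f(x):
    return 4 * x**3 - 6 * x + 1


def descend(x, eta=0.01, steps=1000):
    # Plain gradient descent with a fixed learning rate
    for _ in range(steps):
        x -= eta * grad_f(x)
    return x


left = descend(-2.0)
right = descend(2.0)

print(left, f(left))
print(right, f(right))
```

Both runs stop where the gradient is essentially zero, yet they end at different points with different function values, so the result depends on where the descent starts.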



## 16. Sigmoid Programming Exercise


In [None]:
# ----------
# 
# As with the previous perceptron exercises, you will complete some of the core
# methods of a sigmoid unit class.
#
# There are two functions for you to finish:
# First, in activate(), write the sigmoid activation function.
# Second, in update(), write the gradient descent update rule. Updates should be
#   performed online, revising the weights after each data point.
# 
# ----------


import numpy as np


class Sigmoid:
    """
    This class models an artificial neuron with sigmoid activation function.
    """

    def __init__(self, weights = np.array([1])):
        """
        Initialize weights based on input arguments. Note that no type-checking
        is being performed here for simplicity of code.
        """
        self.weights = weights

        # NOTE: You do not need to worry about these two attributes for this
        # programming quiz, but they will be useful if you want to create
        # a network out of these sigmoid units!
        self.last_input = 0 # strength of last input
        self.delta      = 0 # error signal
    
    def activate(self, values):
        """
        Takes in @param values, a list of numbers with the same length as the
        weights. @return the output of a sigmoid unit for those inputs, based
        on the unit's weights.
        """
        
        # YOUR CODE HERE
        
        # First calculate the strength of the input signal.
        strength = np.dot(values, self.weights)
        self.last_input = strength
        
        # TODO: Modify strength using the sigmoid activation function and
        # return as output signal.
        # HINT: You may want to create a helper function to compute the
        #   logistic function since you will need it for the update function.
        
        return result
    
    def update(self, values, train, eta=.1):
        """
        Takes in a 2D array @param values consisting of a LIST of inputs and a
        1D array @param train, consisting of a corresponding list of expected
        outputs. Updates internal weights according to gradient descent using
        these values and an optional learning rate, @param eta.
        """

        # For each data point...
        for X, y_true in zip(values, train):
            # obtain the output signal for that point
            y_pred = self.activate(X)

            # YOUR CODE HERE

            # TODO: compute derivative of logistic function at input strength
            # Recall: d/dx logistic(x) = logistic(x)*(1-logistic(x))

            # TODO: update self.weights based on learning rate, signal accuracy,
            # function slope (derivative) and input value
            

def test():
    """
    A few tests to make sure that the sigmoid unit class performs as expected.
    If every assertion passes, test() returns True and a confirmation message
    is printed.
    """
    def sum_almost_equal(array1, array2, tol = 1e-5):
        return sum(abs(array1 - array2)) < tol

    # Float arrays are used so that in-place weight updates behave as expected
    u1 = Sigmoid(weights=np.array([3., -2., 1.]))
    assert abs(u1.activate(np.array([1,2,3])) - 0.880797) < 1e-5
    
    u1.update(np.array([[1,2,3]]),np.array([0]))
    assert sum_almost_equal(u1.weights, np.array([2.990752, -2.018496, 0.972257]))

    u2 = Sigmoid(weights=np.array([0., 3., -1.]))
    u2.update(np.array([[-3,-1,2],[2,1,2]]),np.array([1,0]))
    assert sum_almost_equal(u2.weights, np.array([-0.030739, 2.984961, -1.027437]))
    return True


if test():
    print("All tests completed successfully")