# Intro to Neural Networks

Recall that neural networks are layered networks of nodes. One of the simplest nodes is called the perceptron.


### Perceptrons
Perceptrons are not new to machine learning. In fact, here is an image of a machine designed by the US Navy for image classification.  
![made of wires](Mark_I_perceptron.jpeg "Mark I perceptron Machine")

Most simply, a perceptron maps a series of inputs to an output. Each input is weighted by a certain amount, and a mapping applied to the weighted inputs determines the output of the perceptron. We will compute the weighting of the inputs by taking the dot product of the input vector and the weight vector. This number is also known as the strength or the activity of the inputs. Our mapping to the perceptron output will be a simple step function that compares the strength of the inputs to some predefined threshold value.
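
As a small illustration of the dot product and step function described above, here is a sketch of a two-input perceptron that behaves like a logical AND. The function name `step_perceptron`, the weights `[1, 1]`, and the threshold `1.5` are illustrative assumptions, not values from the lesson.

```python
import numpy as np

def step_perceptron(inputs, weights, threshold):
    """Return 1 if the weighted input strength exceeds the threshold, else 0."""
    strength = np.dot(inputs, weights)  # weighted sum of the inputs
    return 1 if strength > threshold else 0

# Hypothetical weights and threshold chosen so the unit behaves like AND.
and_weights = np.array([1, 1])
and_threshold = 1.5

and_outputs = [step_perceptron(np.array(pair), and_weights, and_threshold)
               for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(and_outputs)  # [0, 0, 0, 1]
```

Only the input `(1, 1)` produces a strength (2) above the threshold, so it is the only case that fires.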

In [1]:
# ----------
# 
# In this exercise, you will put the finishing touches on a perceptron class.
#
# Finish writing the activate() method by using np.dot to compute signal
# strength and then add in a threshold for perceptron activation.
#
# ----------


import numpy as np


class Perceptron0(object):
    """
    This class models an artificial neuron with step activation function.
    """

    def __init__(self, weights = np.array([1]), threshold = 0):
        """
        Initialize weights and threshold based on input arguments. Note that no
        type-checking is being performed here for simplicity.
        """
        self.weights = weights
        self.threshold = threshold
    
    def activate(self,inputs):
        """
        Takes in @param inputs, a list of numbers equal to length of weights.
        @return the output of a threshold perceptron with given inputs based on
        perceptron weights and threshold.
        """ 
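        # Signal strength is the dot product of inputs and weights; compare it
        # to the threshold once (the threshold must not be subtracted first).
        strength = np.dot(inputs, self.weights)
        result = (1 if strength > self.threshold else 0)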
            
        return result


def test():
    """
    A few tests to make sure that the perceptron class performs as expected.
    Nothing should show up in the output if all the assertions pass.
    """
    p1 = Perceptron0(np.array([1, 2]), 0.)
    assert p1.activate(np.array([ 1,-1])) == 0 # < threshold --> 0
    assert p1.activate(np.array([-1, 1])) == 1 # > threshold --> 1
    assert p1.activate(np.array([ 2,-1])) == 0 # on threshold --> 0
    return True

if test():
    print("All tests completed")

All tests completed


### Question
What are the advantages of using some threshold and step function rather than just outputting the weighted inputs (dot product)?

### Question
What _parameter_ is learnable in a perceptron? In other words, what can be modified to allow the perceptron or a network of perceptrons to model an arbitrary function?

### Question
What does the input to a network of perceptrons look like?
A) Tensor of weights
B) Matrix of numerical values
C) Matrix of classifications
D) Matrix of numerical values and classifications for each row

### Question
Are Neural Networks used for classification or regression?

### Perceptron update rule

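Remember the update rule for perceptrons, where $\eta$ is the learning rate, $y_i$ is the expected output, $\hat{y}_i$ is the perceptron's prediction, and $x_i$ is the input vector:
$$ \Delta w = \eta \, (y_i - \hat{y}_i) \, x_i $$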
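
To see the rule in action over several passes, here is a minimal sketch that trains a single step unit on logical OR. The function name `train_perceptron`, the zero initial weights, the fixed threshold of 0, and the constant-1 bias column are all assumptions made for this example.

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, max_epochs=50):
    """Apply the perceptron update rule online until an epoch has no errors."""
    weights = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # Step activation with a threshold of 0.
            y_hat = 1 if np.dot(x_i, weights) > 0 else 0
            if y_hat != y_i:
                # delta_w = eta * (y_i - y_hat) * x_i
                weights = weights + eta * (y_i - y_hat) * x_i
                mistakes += 1
        if mistakes == 0:
            break
    return weights

# Logical OR. The last column is a constant 1, so its weight acts as a
# learnable bias (an assumption of this sketch).
X_or = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y_or = np.array([0, 1, 1, 1])

w_or = train_perceptron(X_or, y_or)
predictions = [1 if np.dot(x_i, w_or) > 0 else 0 for x_i in X_or]
print(predictions)  # [0, 1, 1, 1]
```

The weights change only when a prediction is wrong, which is why the loop can stop as soon as a full pass produces no mistakes.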


In [56]:
# ----------
#
# In this exercise, you will update the perceptron class so that it can update
# its weights.
#
# Finish writing the update() method so that it updates the weights according
# to the perceptron update rule. Updates should be performed online, revising
# the weights after each data point.
# 
# ----------


class Perceptron1(Perceptron0):
    """
    This class models an artificial neuron with step activation function.
    """

    def __init__(self, *args):
        """
        Initialize weights and threshold based on input arguments. Note that no
        type-checking is being performed here for simplicity.
        """
        super(Perceptron1,self).__init__(*args )


    def update(self, X, y, eta=.1):
        """
        Takes in a 2D array @param X consisting of a LIST of inputs and a
        1D array @param y, consisting of a corresponding list of expected
        outputs. Updates internal weights according to the perceptron training
        rule using these values and an optional learning rate, @param eta.
        """
        
        for setData, yVal in zip(X, y):
            yprediction = self.activate(setData)
            self.weights = self.weights + eta * (yVal - yprediction) * setData

def test():
    """
    A few tests to make sure that the perceptron class performs as expected.
    Nothing should show up in the output if all the assertions pass.
    """
    def sum_almost_equal(array1, array2, tol = 1e-6):
        return sum(abs(array1 - array2)) < tol

    p1 = Perceptron1(np.array([1,1,1]),0)
    p1.update(np.array([[2,0,-3]]), np.array([1]))
    assert sum_almost_equal(p1.weights, np.array([1.2, 1, 0.7]))
    
    p2 = Perceptron1(np.array([1,2,3]),0)
    p2.update(np.array([[3,2,1],[4,0,-1]]),np.array([0,0]))
    assert sum_almost_equal(p2.weights, np.array([0.7, 1.8, 2.9]))
    
    p3 = Perceptron1(np.array([3,0,2]),0)
    p3.update(np.array([[2,-2,4],[-1,-3,2],[0,2,1]]),np.array([0,1,0]))
    assert sum_almost_equal(p3.weights, np.array([2.7, -0.3, 1.7]))


    return True
if test():
    print("All tests completed successfully")

All tests completed successfully


## Networks
So far we've just covered single-node perceptron units. As you saw in the videos this week, there are limits to the types of data a single perceptron unit can classify: it only works on linearly separable data. Let's investigate putting together multiple layers of perceptron units. Networks of these units work by passing the outputs of each layer forward through the weighted connections of the network until they reach the final output layer.

### Layered Network
Layered networks consist of an input layer, some number of hidden nodes, and an output layer that outputs the classification or regression results.

#### Question
Given the network structure below, with weights on the edges of the graph, what will be the output of this network?

![](Network1.png "NN")

```
[
[ [1], [2], [3] ],       # Input layer (these are not weights, but input values)
[ [1,1,-5], [3,-4,2] ],  # Hidden layer
[ [2,-1] ]               # Output layer
]
```
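
If you want to check a hand calculation, the following sketch propagates the input values through the listed weights. It assumes every unit is purely linear (no threshold applied) and that each inner list holds the incoming weights of one unit. The figure may intend something different, so treat `forward_linear` as a hypothetical helper rather than the lesson's method.

```python
import numpy as np

def forward_linear(inputs, layers):
    """Propagate inputs through layers of purely linear units (no threshold)."""
    signal = np.asarray(inputs, dtype=float)
    for layer_weights in layers:
        # Each row holds the incoming weights of one unit in this layer.
        signal = np.dot(np.asarray(layer_weights, dtype=float), signal)
    return signal

input_values = [1, 2, 3]
hidden_layer = [[1, 1, -5], [3, -4, 2]]
output_layer = [[2, -1]]

network_output = forward_linear(input_values, [hidden_layer, output_layer])
print(network_output)
```

If the units are meant to be threshold perceptrons instead, a step function would need to be applied to `signal` after each layer.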

## Build an XOR Network

In this exercise you will build a network capable of modeling the exclusive OR function. The weights are given below for a threshold of 1.
![](XORweights.png "XOR")


In [54]:
# ----------
#
# In this exercise, you will create a network of perceptrons that can represent
# the XOR function, using a network structure like those shown in the previous
# quizzes.
#
# You will need to do two things:
# First, create a network of perceptrons with the correct weights
# Second, define a procedure EvalNetwork() which takes in a list of inputs and
# outputs the value of this network.
#
# ----------


# Part 1: Set up the perceptron network
Network = [
    # input layer, declare input layer perceptrons here
    [ ... ], \
    # output node, declare output layer perceptron here
    [ ... ]
]

# Part 2: Define a procedure to compute the output of the network, given inputs
def EvalNetwork(inputValues, Network):
    """
    Takes in @param inputValues, a list of input values, and @param Network
    that specifies a perceptron network. @return the output of the Network for
    the given set of inputs.
    """
    
    # YOUR CODE HERE

    # Be sure your output value is a single number
    return OutputValue

def test():
    """
    A few tests to make sure that the perceptron class performs as expected.
    """
    print("0 XOR 0 = 0?: {}".format(EvalNetwork(np.array([0,0]), Network)))
    print("0 XOR 1 = 1?: {}".format(EvalNetwork(np.array([0,1]), Network)))
    print("1 XOR 0 = 1?: {}".format(EvalNetwork(np.array([1,0]), Network)))
    print("1 XOR 1 = 0?: {}".format(EvalNetwork(np.array([1,1]), Network)))


SyntaxError: invalid syntax (<ipython-input-54-68cb1796d015>, line 18)
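
For comparison, here is one possible way such a network could be wired. This is a sketch under stated assumptions, not the intended solution. The weights and thresholds are hypothetical (they are not the values from the figure, and the thresholds differ from the threshold of 1 mentioned above), and `step_unit`, `xor_layers`, and `eval_xor_network` are names invented for this example.

```python
import numpy as np

def step_unit(inputs, weights, threshold):
    """Threshold unit: 1 if the weighted sum exceeds the threshold, else 0."""
    return 1 if np.dot(inputs, weights) > threshold else 0

# Hypothetical (weights, threshold) pairs; not the values from the figure.
xor_layers = [
    [([1, 1], 0.5),    # hidden unit 1 behaves like OR
     ([1, 1], 1.5)],   # hidden unit 2 behaves like AND
    [([1, -1], 0.5)],  # output fires when OR is on and AND is off
]

def eval_xor_network(input_values, layers):
    """Feed the inputs through each layer in turn and return a single 0/1 value."""
    signal = np.asarray(input_values)
    for layer in layers:
        signal = np.array([step_unit(signal, w, t) for w, t in layer])
    return int(signal[0])

xor_outputs = [eval_xor_network(pair, xor_layers)
               for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(xor_outputs)  # [0, 1, 1, 0]
```

The hidden layer is what makes this work: XOR is not linearly separable, so no single threshold unit can produce this output pattern on its own.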

## Discreteness

The outputs of perceptron units are discrete. Consider a network with the structure [2,2,1], that is, 2 input nodes, 2 hidden nodes, and 1 output node. How many possible outputs does this network have? _Hint: The answer is NOT two_

## Continuity
As discussed in the previous lesson, to solve the problem of having only a very few discrete outputs from our neural net, we'll apply a continuous activation function in place of the step function.

We'll start by letting you test out a variety of functions, numerically approximating their derivatives in order to apply a gradient descent update rule.

**You'll want to run this in the [Udacity sandbox](https://classroom.udacity.com/nanodegrees/nd009/parts/596c7dc6-8049-4785-adfe-7c83ca19b00f/modules/d9424f23-4c4f-452f-8215-c48ae6fc0a56/lessons/7224990821/concepts/72258808312589030923#) to test on their dataset**



In [None]:
# ----------
# 
# Python Neural Networks code originally by Szabo Roland and used with
# permission
#
# Modifications, comments, and exercise breakdowns by Mitchell Owen,
# (c) Udacity
#
# Retrieved originally from http://rolisz.ro/2013/04/18/neural-networks-in-python/
#
#
# Neural Network Sandbox
#
# Define an activation function activate(), which takes in a number and
# returns a number.
# Using test run you can see the performance of a neural network running with
# that activation function, where the inputs are 8x8 images of digits (0-9) and
# the outputs are digit predictions made by the network.
#
# ----------


def activate(strength):
    # Try out different functions here. Input strength will be a number, with
    # another number as output.
    return np.power(strength,2)
    
def activation_derivative(activate, strength):
    # numerically approximate the derivative with a central difference
    return (activate(strength+1e-5)-activate(strength-1e-5))/(2e-5)
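
A quick way to gain confidence in a numerical derivative is to compare it with a known analytic one. The sketch below does this for the squaring function. The names `central_difference` and `square`, and the test point `x0 = 3.0`, are assumptions made for this illustration.

```python
import numpy as np

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric (central) finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    return np.power(x, 2)

x0 = 3.0
approx = central_difference(square, x0)
exact = 2 * x0  # analytic derivative of x**2
print(abs(approx - exact) < 1e-6)  # True
```

For a quadratic the central difference is exact up to floating-point rounding. For other functions the error shrinks roughly with the square of the step size `h`, until rounding error starts to dominate.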

## Sigmoid unit

A sigmoid unit is similar to a perceptron unit, but its activation function is the non-linear and differentiable logistic function $S(x) = \frac{1}{1 + e^{-x}}$, whose derivative is

$$ \frac{\mathrm d}{\mathrm d x} \left( \frac{1}{1 + e^{-x}} \right) = \frac{1}{1 + e^{-x}} \left( 1 - \frac{1}{1 + e^{-x}} \right) $$

The gradient descent update evaluates this derivative at the input strength $w \cdot x_i$:

$$ \Delta w = \eta \, (y_i - \hat{y}_i) \, S'(w \cdot x_i) \, x_i $$
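
The following sketch shows how a single online update might be computed from these formulas. The helper names `logistic`, `logistic_derivative`, and `sigmoid_update` are hypothetical. The sample weights, inputs, and target mirror the first test case of the exercise, so the printed values can be compared against the numbers asserted there.

```python
import numpy as np

def logistic(x):
    """Logistic function S(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def logistic_derivative(x):
    """S'(x) = S(x) * (1 - S(x))."""
    s = logistic(x)
    return s * (1.0 - s)

def sigmoid_update(weights, x_i, y_i, eta=0.1):
    """Return the weights after one online gradient descent step."""
    strength = np.dot(x_i, weights)  # input strength w . x_i
    y_hat = logistic(strength)       # output of the sigmoid unit
    return weights + eta * (y_i - y_hat) * logistic_derivative(strength) * x_i

w = np.array([3.0, -2.0, 1.0])
x = np.array([1.0, 2.0, 3.0])
print(logistic(np.dot(x, w)))     # approximately 0.880797
print(sigmoid_update(w, x, 0.0))  # approximately [ 2.990752 -2.018496  0.972257]
```

Note that the derivative is evaluated at the input strength, not at the raw inputs. Far from zero the slope is small, so saturated units learn slowly.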

In [2]:
# ----------
# 
# As with the previous perceptron exercises, you will complete some of the core
# methods of a sigmoid unit class.
#
# There are two functions for you to finish:
# First, in activate(), write the sigmoid activation function.
# Second, in update(), write the gradient descent update rule. Updates should be
#   performed online, revising the weights after each data point.
# 
# ----------

import numpy as np


class Sigmoid:
    """
    This class models an artificial neuron with sigmoid activation function.
    """

    def __init__(self, weights = np.array([1])):
        """
        Initialize weights based on input arguments. Note that no type-checking
        is being performed here for simplicity of code.
        """
        self.weights = weights

        # NOTE: You do not need to worry about these two attributes for this
        # programming quiz, but they will be useful if you want to create
        # a network out of these sigmoid units!
        self.last_input = 0 # strength of last input
        self.delta      = 0 # error signal

    def activate(self, values):
        """
        Takes in @param values, a list of numbers equal to length of weights.
        @return the output of a sigmoid unit with given inputs based on unit
        weights.
        """
        
        # YOUR CODE HERE
        
        # First calculate the strength of the input signal.
        strength = np.dot(values, self.weights)
        self.last_input = strength
        
        # TODO: Modify strength using the sigmoid activation function and
        # return as output signal.
        # HINT: You may want to create a helper function to compute the
        #   logistic function since you will need it for the update function.
        
        return result
    
    def update(self, values, train, eta=.1):
        """
        Takes in a 2D array @param values consisting of a LIST of inputs and a
        1D array @param train, consisting of a corresponding list of expected
        outputs. Updates internal weights according to gradient descent using
        these values and an optional learning rate, @param eta.
        """

        # TODO: for each data point...
        for x_i, y_true in zip(values, train):
            # obtain the output signal for that point
            y_pred = self.activate(x_i)

            # YOUR CODE HERE

            # TODO: compute derivative of logistic function at input strength
            # Recall: d/dx logistic(x) = logistic(x)*(1-logistic(x))

            # TODO: update self.weights based on learning rate, signal accuracy,
            # function slope (derivative) and input value
            

def test():
    """
    A few tests to make sure that the sigmoid unit class performs as expected.
    Nothing should show up in the output if all the assertions pass.
    """
    def sum_almost_equal(array1, array2, tol = 1e-5):
        return sum(abs(array1 - array2)) < tol

    u1 = Sigmoid(weights=[3,-2,1])
    assert abs(u1.activate(np.array([1,2,3])) - 0.880797) < 1e-5
    
    u1.update(np.array([[1,2,3]]),np.array([0]))
    assert sum_almost_equal(u1.weights, np.array([2.990752, -2.018496, 0.972257]))

    u2 = Sigmoid(weights=[0,3,-1])
    u2.update(np.array([[-3,-1,2],[2,1,2]]),np.array([1,0]))
    assert sum_almost_equal(u2.weights, np.array([-0.030739, 2.984961, -1.027437]))

    return True

if test():
    print("All tests completed successfully")

NameError: global name 'result' is not defined