## Build a Perceptron

In [52]:
from numpy import asarray, insert, ndarray, piecewise
class Perceptron(ndarray):
    
    def __new__(cls, weights):
        return asarray(weights).view(cls)
    
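    def activate(self,inputs):
        # The first weight is the bias, so inputs must be one element shorter.
        if len(inputs)+1 != len(self):
            raise ValueError(
                'input vector of dimension {} is not valid for a bias/weight vector of dimension {}'.format(
                    len(inputs), len(self)))
        # Prepend the constant bias input of -1 before taking the dot product.
        inputs = insert(inputs,0,-1)
        return self._threshold(self.dot(inputs))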

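    def update(self,inputs,target,eta=.1):
        # Perceptron rule: compare the current output to the target, then
        # nudge every weight (including the bias at index 0) toward it.
        result = self.activate(inputs)
        inputs = insert(inputs,0,-1)
        self[:] += eta*(target-result)*inputs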
          
    def _threshold(self, energy):
            
        return piecewise(energy, [energy < 0, energy >= 0], [0 , 1])  
                           
            

In [53]:
p_1 = Perceptron([0,-1,.2,3])

In [54]:
p_1.activate([3,3,7])

8.3583900315119233e-09

In [55]:
p_1.activate([1,1,1])

0.099750489119685135

In [64]:
print(p_1)
for _ in range(1000):
    p_1.update([1,1,1],1)
print(p_1)
p_1.activate([1,1,1])

[ 2.44828551 -3.44828551 -2.24828551  0.55171449]
[ 2.49415566 -3.49415566 -2.29415566  0.50584434]


0.99958074949915265

## Threshold Meditation
What do you think the advantage of a **perceptron** is, compared with simply returning the dot product without a threshold?

## Where to train Perceptrons
We want to build networks of **perceptrons** that compute interesting functions. Which parameters of the perceptrons will we want to modify?

1. Output functions
1. Input Values
1. Thresholds
1. Weights
1. Input functions

<!-- 3, 4 Sure! We can modify the thresholds, or we could also include threshold changes in the weights. -->

## Perceptron Inputs
**Neural networks** are built out of components like perceptron units. What do inputs to networks of perceptrons look like?

1. A matrix of numerical values, with some labeled input
1. A directed graph.
1. An unlabeled matrix of numerical values.
1. A set of classifications of numerical values.
1. A matrix of numerical values with classifications for each row.

<!-- 5 A single perceptron is very much like linear regression. Therefore it should take the same kinds of inputs. However the outputs of perceptrons will generally be classifications, not numerical. -->

## Neural Net Outputs
What information can we get as the output of a **neural network**?

1. A directed graph, the neural network itself.
1. A single scalar-valued number.
1. The classification of a vector.
1. A vector valued output for any vector input.

<!-- All. In general, neural nets are much more flexible than thresholded perceptron networks! -->

## Perceptron Update Rule

In [47]:
import numpy as np

class Perceptron:
        
    def __init__(self,weights,threshold):
        self.weights = np.array(weights)
        self.threshold = np.array(threshold)
    
    def activate(self,inputs):
        '''Takes in @param inputs, a list of numbers.
        @return the output of a threshold perceptron with
        given weights, threshold, and inputs.
        ''' 
        
        activation = np.dot(self.weights,inputs)
        return self._threshold(activation)
        

    def update(self,values,train,eta=.1):
        '''Takes in a 2D array @param values consisting of a LIST of inputs
        and a 1D array @param train, consisting of a corresponding list of 
        expected outputs.
        Updates internal weights according to the perceptron training rule
        using these values and an optional learning rate, @param eta.
        '''
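        # Pair each expected output in train with its input vector.
        for target,value in zip(train,values):
            result = self.activate(value)
            # Perceptron rule: w <- w + eta*(target - output)*input
            self.weights = self.weights + eta*(target-result)*np.asarray(value)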
               
        
    def _threshold(self, energy):
            
        return np.piecewise(energy, [energy < self.threshold, energy >= self.threshold], [0 , 1])  

In [48]:
p_1 = Perceptron([-1,.2,3],0)

In [49]:
values = [np.array([3,1,2]),np.array([2,-1,4]),np.array([0,-5,10]),np.array([1,1,2])]
targets = np.array([0,0,1,1])
for value in values:
    print(p_1.activate(value))

1.0
1.0
1.0
1.0


In [50]:
p_1.update(values,targets)
p_1.weights
for value in values:
    print(p_1.activate(value))

1.0
1.0
1.0
1.0


In [51]:
p_1.update(values,targets)
p_1.weights
for value in values:
    print(p_1.activate(value))

0.0
1.0
1.0
1.0


In [52]:
p_1.update(values,targets)
p_1.weights
for value in values:
    print(p_1.activate(value))

0.0
1.0
1.0
1.0


In [53]:
p_1.update(values,targets)
p_1.weights
for value in values:
    print(p_1.activate(value))

0.0
0.0
1.0
1.0


## Layered Network Update
In general we place units together to form **layered networks**. This will be represented as follows:

     [[node, node, node],   # input layer
     [node, node],          # hidden layer
     [node]]                # output layer
     
Given weights for the hidden layer of `[1, 1, -5]` and `[3, -4, 2]`, and weights for the output layer of `[2, -1]`, what will this network output for inputs `[1, 2, 3]`?

<!-- -25 -->
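The layer-by-layer evaluation described above can be sketched as follows. This assumes each node simply passes along the dot product of its weights with the previous layer's outputs (no threshold); the function name `forward` and the weights used here are hypothetical and chosen only for illustration.

```python
import numpy as np

def forward(layers, inputs):
    """Feed inputs through a list of layers; each layer is a list of weight vectors."""
    activations = np.asarray(inputs, dtype=float)
    for layer in layers:
        # Every node in the layer sees all outputs of the previous layer
        activations = np.array([np.dot(weights, activations) for weights in layer])
    return activations

# Hypothetical weights: two hidden nodes with three inputs each, one output node
hidden_layer = [[1, 0, 2], [-1, 1, 0]]
output_layer = [[1, 3]]

result = forward([hidden_layer, output_layer], [2, 1, 1])
print(result)
```

The same function can be called with the weights from the question to check an answer worked out by hand.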

## Linear Representational Power
We would like these additional layers to give us more representational power. For linear units, which just take weighted sums, they do not. Given the following network, where each node simply passes along the dot product of its inputs with its weights, write down the weights of a single linear node that computes the same function.

    [[input_1, input_2],
    [[3, 2], [-1, 4], [3, -5]],
    [[1, 2, -1]]]
    
    input_1 = 
    input_2 = 
    
<!-- -2, 15 -->    
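One way to see why linear layers add no representational power: composing two weighted sums is itself a weighted sum, so the layers collapse into a single matrix product. The sketch below checks this numerically with randomly drawn weights (hypothetical values, not the ones from the question).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical network: two inputs, three linear hidden nodes, one linear output
W_hidden = rng.integers(-5, 6, size=(3, 2)).astype(float)
w_output = rng.integers(-5, 6, size=3).astype(float)

# Weights of the equivalent single node: output weights times hidden weights
w_single = w_output @ W_hidden

x = rng.normal(size=2)
two_layer = w_output @ (W_hidden @ x)   # evaluate layer by layer
one_node = w_single @ x                 # evaluate the collapsed node

print(np.isclose(two_layer, one_node))
```

The agreement holds for any input because matrix multiplication is associative; it stops holding as soon as a non-linear function is applied between the layers.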

## Build the XOR Network

In [None]:
#
#   In this exercise, you will create a network of perceptrons which
#   represents the XOR function. Use the same network structure you used
#   in the previous quizzes.
#
#   You will need to do two things:
#   First, create a network of perceptrons with the correct weights
#   Second, define a procedure EvalNetwork() which takes in a list of 
#   inputs and outputs the value of this network.

import numpy as np

class Perceptron:

    def evaluate(self,values):
        '''Takes in @param values, a list of numbers.
        @return the output of a threshold perceptron with
        the stored weights and threshold, given values as inputs.
        ''' 
               
        #First calculate the strength with which the perceptron fires
        strength = np.dot(values,self.weights)
        
        #Then evaluate the return value of the perceptron
        if strength >= self.threshold:
            result = 1
        else:
            result = 0

        return result

    def __init__(self,weights=None,threshold=None):
        if weights is not None:
            self.weights = weights
        if threshold is not None:
            self.threshold = threshold
            

Network = [
    #input layer, declare perceptrons here
    [ ... ], \
    #output node, declare one perceptron here
    [ ... ]
]


def EvalNetwork(inputValues, Network):
    
    
    # Be sure your output values are single numbers
    return OutputValues

## Discretion Quiz
One problem that perceptron units have is that their outputs are discrete. This makes it difficult to address regression problems with them, and requires them to have more complexity to capture some concepts. 

For example:

Given a network of perceptrons with structure `[2, 2, 1]`, that is, two input nodes, two hidden nodes, and one output node, how many possible prices could the network assign to a house?

<!-- 4 -->

## Continuity
As discussed in the previous lesson, to solve the problem of having only a few discrete outputs from our neural net, we'll apply a transition function.

We'll start by letting you test out a variety of functions, numerically approximating their derivatives in order to apply a gradient descent update rule.

## Activation Function Sandbox

In [None]:
#
#   Python Neural Networks code originally by Szabo Roland and used by permission
#
#   Modifications, comments, and exercise breakdowns by Mitchell Owen, (c) Udacity
#
#   Retrieved originally from http://rolisz.ro/2013/04/18/neural-networks-in-python/
#
#
#   Neural Network Sandbox
#
#   Define an activation function activate(), which takes in a number and returns a number.
#   Using test run you can see the performance of a neural network running with that activation function.
#
import numpy as np


def activate(strength):
    return np.power(strength,2)
    
def activation_derivative(activate, strength):
    #numerically approximate
    return (activate(strength+1e-5)-activate(strength-1e-5))/(2e-5)

## Activation Function
We have decided that we need a function which is continuous (to avoid the discrete problem of perceptrons) but not linear (to allow us to represent non-linear functions). Which of the following seems appropriate?

1. Sine
1. Arctangent
1. Natural logarithm
1. Cube root
1. Logistic Function

<!-- 5 Great choice! Computing the derivative is essentially the same as computing the function itself, so this is also a relatively efficient choice. --> 
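To illustrate why the logistic function's derivative is cheap to compute, the sketch below compares the closed form `logistic(x) * (1 - logistic(x))` with a central-difference approximation like the one used in the sandbox. The function names are illustrative and are not taken from the sandbox code.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def logistic_derivative(x):
    # Reuses the function value: d/dx logistic(x) = logistic(x) * (1 - logistic(x))
    s = logistic(x)
    return s * (1.0 - s)

def central_difference(f, x, h=1e-5):
    # Numerical approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

xs = np.linspace(-4, 4, 9)
max_gap = np.max(np.abs(logistic_derivative(xs) - central_difference(logistic, xs)))
print(max_gap)
```

The two agree to within numerical error, but the closed form needs only the value the unit already computed during its forward pass.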

## Perceptron Vs Sigmoid
What will the difference be between a single perceptron and a sigmoid unit on binary classification problems?

1. There will be no difference
1. One gives more information but they will give the same answer
1. They will sometimes give different answers.
1. They will always give different answers.
1. They will give different answers in certain rare circumstances.

<!-- 2 Right! With the same weights, a sigmoid unit thresholded at 0.5 classifies exactly as the perceptron does, but its continuous output also indicates how far the input is from the decision boundary. -->
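A small sketch of the comparison, assuming both units share the same weights, the perceptron uses a threshold of 0, and the sigmoid output is turned into a class by cutting at 0.5. The weights and samples are hypothetical.

```python
import numpy as np

def perceptron_output(weights, x):
    # Hard threshold at 0 on the weighted sum
    return 1 if np.dot(weights, x) >= 0 else 0

def sigmoid_output(weights, x):
    # Smooth value in (0, 1) computed from the same weighted sum
    return 1.0 / (1.0 + np.exp(-np.dot(weights, x)))

weights = np.array([-1.0, 0.2, 3.0])
samples = [
    np.array([3.0, 1.0, 2.0]),
    np.array([1.0, 1.0, -2.0]),
    np.array([0.0, -5.0, 0.5]),
]

for x in samples:
    p = perceptron_output(weights, x)
    s = sigmoid_output(weights, x)
    # Columns: perceptron class, sigmoid value, sigmoid class after cutting at 0.5
    print(p, round(float(s), 3), int(s >= 0.5))
```

The first and last columns always match, while the middle column shows the extra information the sigmoid carries: values near 0.5 mark inputs close to the decision boundary.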

## Sigmoid Learning
We need to train our network of sigmoid units as we trained networks of perceptrons. How should we determine our update rules?

1. Arbitrarily
1. Using our intuition
1. Using domain knowledge
1. Using calculus
1. Using trigonometry

<!-- 4 Right! We want to deal with small, gradual changes of continuous functions. This is exactly where we should use calculus!
-->
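As a sketch of what the calculus looks like for a single sigmoid unit, assume a squared-error loss (an assumption; the text does not name the loss). Here $t$ is the target, $x_i$ the inputs, $w_i$ the weights, $\eta$ the learning rate, and $\sigma$ the logistic function with $\sigma'(s) = \sigma(s)(1 - \sigma(s))$.

```latex
\begin{aligned}
E &= \tfrac{1}{2}(t - y)^2, \qquad y = \sigma(s), \qquad s = \sum_i w_i x_i \\
\frac{\partial E}{\partial w_i} &= -(t - y)\,\sigma'(s)\,x_i = -(t - y)\,y\,(1 - y)\,x_i \\
\Delta w_i &= -\eta\,\frac{\partial E}{\partial w_i} = \eta\,(t - y)\,y\,(1 - y)\,x_i
\end{aligned}
```

Compared with the perceptron rule $\Delta w_i = \eta\,(t - y)\,x_i$, the only new factor is the slope $y(1 - y)$ of the logistic function at the unit's current input.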

## Gradient Descent Issues
Using calculus, **gradient descent** can provide us with a locally optimal set of weight changes, under certain assumptions. However, some issues can arise. Which of the following do you think could be problematic?

1. local extrema
1. lengthy run times
1. infinite loops
1. failure to completely converge

<!-- 1,2, 4 -->
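The local-extrema issue can be seen on a one-dimensional example. The sketch below runs plain gradient descent on a hypothetical non-convex function (chosen only for illustration, unrelated to any network in this lesson) from two starting points.

```python
def f(x):
    # Hypothetical loss with two separate minima
    return x**4 - 3 * x**2 + x

def df(x):
    # Derivative of f
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x, eta=0.01, steps=500):
    # Repeatedly step against the gradient
    for _ in range(steps):
        x -= eta * df(x)
    return x

left = gradient_descent(-2.0)
right = gradient_descent(2.0)
print(left, f(left))
print(right, f(right))
```

Both runs stop where the gradient vanishes, but they settle in different minima with different loss values, so the result depends on where the search starts.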

## Sigmoid Programming Exercise

In [None]:
#
#   As with the perceptron exercise, you will modify the
#   last functions of this sigmoid unit class
#
#   There are two functions for you to finish:
#   First, in activate(), write the sigmoid activation function
#
#   Second, in update(), write the gradient descent update rule
#
#   NOTE: the following exercises creating classes for functioning
#   neural networks are HARD, and are not efficient implementations.
#   Consider them an extra challenge, not a requirement!

import numpy as np

class Sigmoid:
        
    def activate(self,values):
        '''Takes in @param values, a list of numbers.
        @return the output of a sigmoid unit with the
        stored weights, given values as inputs.
        ''' 
               
        #First calculate the strength with which the unit fires
        strength = self.strength(values)
        self.last_input = strength
        
        #YOUR CODE HERE
        #modify strength using the sigmoid activation function
        
        return result
        
    def strength(self,values):
        strength = np.dot(values,self.weights)
        return strength
        
    def update(self,values,train,eta=.1):
        '''
        Updates the sigmoid unit with expected return
        values @param train and learning rate @param eta
        
        By modifying the weights according to the gradient descent rule
        '''
        
        #YOUR CODE HERE
        #modify the perceptron training rule to a gradient descent
        #training rule you will need to use the derivative of the
        #logistic function evaluated at the last input value.
        #Recall: d/dx logistic(x) = logistic(x)*(1-logistic(x))
        
        result = self.activate(values)
        for i in range(0,len(values)):
            self.weights[i] += eta*(train - result)*values[i]
        
    def __init__(self,weights=None):
        if weights is not None:
            self.weights = weights
            
            
unit = Sigmoid(weights=[3,-2,1])
unit.update([1,2,3],[0])
print(unit.weights)
#Expected: [2.99075, -2.0185, .97225]
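
The expected weights in the final comment can be checked independently of the class. The sketch below is one possible standalone version of the gradient descent step, not the exercise's reference solution; the names `logistic` and `sigmoid_update` are hypothetical, and the target is passed as a plain number rather than a one-element list.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_update(weights, values, target, eta=0.1):
    """Return the weights after one gradient descent step on a single sample."""
    weights = np.asarray(weights, dtype=float)
    values = np.asarray(values, dtype=float)
    output = logistic(np.dot(values, weights))
    # Error times the logistic slope, using d/dx logistic(x) = output * (1 - output)
    gradient_scale = (target - output) * output * (1.0 - output)
    return weights + eta * gradient_scale * values

new_weights = sigmoid_update([3, -2, 1], [1, 2, 3], target=0)
print(np.round(new_weights, 5))
```

The result agrees with the expected values above to about four decimal places.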