# The Perceptron Algorithm.

### A perceptron is the building block of neural networks.
#### Below, I implement the perceptron algorithm in Python and document it extensively.



# The Concept

- A perceptron is a single node in a neural network. It receives an input signal (during training, a row of "training data") and processes it with the linear function defined below.

```python
activation = sum(weight_of_input * input) + bias
```

- The activation computed above is then used to predict the output: the perceptron's output depends only on whether the activation is at least zero.

```python
prediction = 1.0 if activation >= 0.0 else 0.0
```
- The line above shows that the perceptron is a binary classifier: it assigns every input to one of two classes (1.0 or 0.0).
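
As a worked example of the two snippets above, here is a minimal sketch that computes the activation and the prediction for a single input. The function name `predict_one` and the numbers are illustrative assumptions, not part of the implementation later in this document.

```python
def predict_one(inputs, weights, bias):
    """Hypothetical helper: weighted sum of the inputs plus bias, then a step at zero."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    prediction = 1.0 if activation >= 0.0 else 0.0
    return activation, prediction

# Illustrative values: 0.5 * 2.0 + (-0.25) * 1.0 + (-0.5) = 0.25
activation, prediction = predict_one([2.0, 1.0], [0.5, -0.25], -0.5)
print("activation=%s, prediction=%s" % (activation, prediction))
```

Because the activation is non-negative here, the input falls into class 1.0; lowering the bias far enough would flip it to class 0.0.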

##### The "weight_of_input" values mentioned above are estimated using Stochastic Gradient Descent, which I will elaborate on next.

## Stochastic Gradient Descent

- Gradient Descent is the process of minimizing a cost function by repeatedly stepping in the direction of its negative gradient.
- Stochastic Gradient Descent is used to evaluate and update the weights of the perceptron at each iteration of training.
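
To make the idea concrete, here is a minimal sketch of plain (non-stochastic) gradient descent on a one-dimensional cost function. The cost `(w - 3)**2`, the starting point, and the step size are assumptions chosen purely for illustration; they are unrelated to the perceptron itself.

```python
def gradient_descent(gradient, start, learning_rate, n_steps):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(n_steps):
        w = w - learning_rate * gradient(w)
    return w

# Illustrative cost: cost(w) = (w - 3)**2, whose gradient is 2 * (w - 3).
# Its minimum is at w = 3.
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), start=0.0,
                         learning_rate=0.1, n_steps=100)
print("w after 100 steps: %.6f" % w_min)
```

With these values each step shrinks the distance to the minimum by a constant factor of 0.8, so the result ends up very close to 3.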

###### The basic concept of stochastic gradient descent is that each instance of the training data is shown to the perceptron model one at a time and the model makes a decision based on each training instance. The error is calculated and the model is updated to reduce the error for the next prediction.

- Stochastic gradient descent is used to find the set of weights that results in the smallest prediction error on the training data.

```python
w = w + learning_rate * (expected_value - predicted_value) * x
```
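- "w" is the weight being optimized.
- "learning_rate" is the step size, which is configured manually (e.g. 0.01).
- "expected_value - predicted_value" is the prediction error of the model on the current training instance.
- "x" is the input value that the weight "w" is applied to.
- The bias is updated the same way, but without an input term: bias = bias + learning_rate * (expected_value - predicted_value).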
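
The update rule above can be sketched as a single training step. The helper name `update_step` and the sample numbers are assumptions for illustration; the bias is updated with the error alone, since it has no input attached to it.

```python
def update_step(weights, bias, inputs, expected, predicted, learning_rate):
    """Hypothetical helper: apply one perceptron update for one training instance."""
    error = expected - predicted
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * error
    return new_weights, new_bias

# The model predicted 1.0 but the expected class is 0.0, so error = -1.0
# and every weight moves against its input.
new_weights, new_bias = update_step([0.0, 0.0], 0.0, [2.0, 1.0],
                                    expected=0.0, predicted=1.0,
                                    learning_rate=0.1)
print("weights=%s, bias=%s" % (new_weights, new_bias))
```

When the prediction is already correct, the error is zero and the weights are left unchanged.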

# The Implementation of the Perceptron Algorithm.

## Step - 1 :

- In order for the perceptron algorithm to make predictions, we need to define a function in charge of those predictions.

- This prediction function is needed both to evaluate candidate weights during stochastic gradient descent and, once the model is finalized, to make predictions on new data.

In [6]:
# Predict the output value for a row, given a set of weights.
# The last element of the row is the class label and is not used as an input.
# sample row = [2.7810836,2.550537003,0]
# sample weights = [-0.1, 0.20653640140000007, -0.23418117710000003]
def make_prediction(row, weights):
    # weights[0] is the bias: it is not tied to any input, so the activation starts from it.
    activation = weights[0]
    # Add weight * input for every feature (every value except the trailing label).
    for i in range(len(row) - 1):
        activation += weights[i + 1] * row[i]
    # Step function: class 1.0 if the activation is non-negative, else class 0.0.
    return 1.0 if activation >= 0.0 else 0.0

w = [-0.1, 0.20653640140000007, -0.23418117710000003]
r = [2.7810836,2.550537003,0]

mp = make_prediction(r,w)

print("Prediction: %s" % mp)

Prediction: 0.0


### Now that we have seen how make_prediction works, we can apply it to a larger test dataset to check that it behaves as expected.

In [7]:
# test predictions
dataset = [[2.7810836,2.550537003,0],
    [1.465489372,2.362125076,0],
    [3.396561688,4.400293529,0],
    [1.38807019,1.850220317,0],
    [3.06407232,3.005305973,0],
    [7.627531214,2.759262235,1],
    [5.332441248,2.088626775,1],
    [6.922596716,1.77106367,1],
    [8.675418651,-0.242068655,1],
    [7.673756466,3.508563011,1]]
weights = [-0.1, 0.20653640140000007, -0.23418117710000003]
for row in dataset:
    prediction = make_prediction(row, weights)
    print("Expected=%d, Predicted=%d" % (row[-1], prediction))

Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=0, Predicted=0
Expected=1, Predicted=1
Expected=1, Predicted=1
Expected=1, Predicted=1
Expected=1, Predicted=1
Expected=1, Predicted=1
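
The ten matching lines above can be condensed into a single accuracy figure. This is a minimal sketch: the `accuracy` helper is a hypothetical addition, and `make_prediction`, the dataset, and the weights are repeated from the cells above only so that the block runs on its own.

```python
def make_prediction(row, weights):
    # Same logic as the function defined earlier: bias plus weighted inputs, then a step.
    activation = weights[0]
    for i in range(len(row) - 1):
        activation += weights[i + 1] * row[i]
    return 1.0 if activation >= 0.0 else 0.0

def accuracy(dataset, weights):
    """Hypothetical helper: fraction of rows whose prediction equals the label."""
    correct = sum(1 for row in dataset if make_prediction(row, weights) == row[-1])
    return correct / float(len(dataset))

dataset = [[2.7810836, 2.550537003, 0],
           [1.465489372, 2.362125076, 0],
           [3.396561688, 4.400293529, 0],
           [1.38807019, 1.850220317, 0],
           [3.06407232, 3.005305973, 0],
           [7.627531214, 2.759262235, 1],
           [5.332441248, 2.088626775, 1],
           [6.922596716, 1.77106367, 1],
           [8.675418651, -0.242068655, 1],
           [7.673756466, 3.508563011, 1]]
weights = [-0.1, 0.20653640140000007, -0.23418117710000003]

print("Accuracy: %.1f%%" % (100.0 * accuracy(dataset, weights)))
```

Note that this measures accuracy on the same rows the weights were fitted to, so it says nothing yet about how the model handles unseen data.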


In [23]:
# Estimate Perceptron weights using stochastic gradient descent
def train_weights(train, l_rate, n_epoch):
    # Start with the bias (weights[0]) and one weight per input, all set to zero
    weights = [0.0 for i in range(len(train[0]))]
    # Loop over epochs (one epoch = one full pass over the training data)
    for epoch in range(n_epoch):
        sum_error = 0.0
        # For each row in the training data...
        for row in train:
            # Predict with the current weights using make_prediction defined above
            prediction = make_prediction(row, weights)
            # Debug output: the row, the weights used, and the resulting prediction
            print("row %s" % row)
            print("weights %s" % weights)
            print("prediction %s" % prediction)
            # Error = expected label (last value in the row) minus the prediction
            error = row[-1] - prediction
            print("EROR %s" % error)
            sum_error += error**2
            print("SUM_EROR %s" % sum_error)
            # The bias has no input, so it is updated with the error alone
            weights[0] = weights[0] + l_rate * error
            # Update the weight of each input (every value except the trailing label)
            for i in range(len(row)-1):
                weights[i + 1] = weights[i + 1] + l_rate * error * row[i]
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))
    return weights
 
# Training weights with a learning rate, epoch and training data.
dataset = [[2.7810836,2.550537003,0],
    [1.465489372,2.362125076,0],
    [3.396561688,4.400293529,0],
    [1.38807019,1.850220317,0],
    [3.06407232,3.005305973,0],
    [7.627531214,2.759262235,1],
    [5.332441248,2.088626775,1],
    [6.922596716,1.77106367,1],
    [8.675418651,-0.242068655,1],
    [7.673756466,3.508563011,1]]
l_rate = 0.1
n_epoch = 5
weights = train_weights(dataset, l_rate, n_epoch)
print("WEIGHTS %s" % weights)

row [2.7810836, 2.550537003, 0]
weights [0.0, 0.0, 0.0]
prediction 1.0
EROR -1.0
SUM_EROR 1.0
row [1.465489372, 2.362125076, 0]
weights [-0.1, -0.27810836, -0.2550537003]
prediction 0.0
EROR 0.0
SUM_EROR 1.0
row [3.396561688, 4.400293529, 0]
weights [-0.1, -0.27810836, -0.2550537003]
prediction 0.0
EROR 0.0
SUM_EROR 1.0
row [1.38807019, 1.850220317, 0]
weights [-0.1, -0.27810836, -0.2550537003]
prediction 0.0
EROR 0.0
SUM_EROR 1.0
row [3.06407232, 3.005305973, 0]
weights [-0.1, -0.27810836, -0.2550537003]
prediction 0.0
EROR 0.0
SUM_EROR 1.0
row [7.627531214, 2.759262235, 1]
weights [-0.1, -0.27810836, -0.2550537003]
prediction 0.0
EROR 1.0
SUM_EROR 2.0
row [5.332441248, 2.088626775, 1]
weights [0.0, 0.48464476140000007, 0.020872523199999993]
prediction 1.0
EROR 0.0
SUM_EROR 2.0
row [6.922596716, 1.77106367, 1]
weights [0.0, 0.48464476140000007, 0.020872523199999993]
prediction 1.0
EROR 0.0
SUM_EROR 2.0
row [8.675418651, -0.242068655, 1]
weights [0.0, 0.48464476140000007, 0.02087252319