# Representing a Perceptron

A perceptron is an artificial neuron that can make a simple decision. It has three main components:

 - **inputs**: each input corresponds to a feature, e.g. age, weight, or height.
 
 - **weights**: each weight assigns a certain amount of importance to its input. The larger the weight, the bigger the role that input plays in determining the output.
 
 - **output**: the inputs and weights together produce an output. The type of output varies with the nature of the problem, e.g. it could be a binary 1 or 0 (Yes or No), or any value within a range.

In [1]:
class Perceptron:
  def __init__(self, num_inputs=2, weights=[1,1]):
    # store the number of inputs and the weight assigned to each input
    self.num_inputs = num_inputs
    self.weights = weights
    
cool_perceptron = Perceptron()    
print(cool_perceptron)
print(type(cool_perceptron))

<__main__.Perceptron object at 0x7f80cc675ef0>
<class '__main__.Perceptron'>


### Generating an Output

Turning the inputs and weights into an output is a two-step process:

1. **determining the weighted sum**

The weighted sum is the sum of the products of each input and its corresponding weight, where `x` is an input and `w` is its weight.

![Perceptron](img/perceptron-1.png)

We can implement it using the following process:

1. Start with a `weighted_sum` of 0.

2. Start with the first input and multiply it by its corresponding weight. Add this result to `weighted_sum`.

3. Go to the next input and multiply it by its corresponding weight. Add this result to `weighted_sum`.

4. Repeat this process for all inputs.

In [2]:
class Perceptron:
  def __init__(self, num_inputs=2, weights=[2,1]):
    self.num_inputs = num_inputs
    self.weights = weights
    
  def weighted_sum(self, inputs):
    # create variable to store weighted sum
    weighted_sum = 0
    for i in range(self.num_inputs):
      weighted_sum += inputs[i] * self.weights[i]
      
    return weighted_sum  
cool_perceptron = Perceptron()
print(cool_perceptron.weighted_sum([24, 55]))

103
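
The loop above can also be written more compactly. The following is a minimal sketch rather than part of the class: the standalone function `weighted_sum` and its two arguments are names chosen here for illustration, and it assumes `inputs` and `weights` have the same length.

```python
def weighted_sum(inputs, weights):
    """Return the sum of each input multiplied by its corresponding weight."""
    # zip pairs each input with its weight; if the lengths differed,
    # the extra values would be silently ignored
    return sum(x * w for x, w in zip(inputs, weights))

print(weighted_sum([24, 55], [2, 1]))  # 24*2 + 55*1 = 103
```

The explicit loop makes each step visible, while the `zip` version avoids indexing mistakes.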


2. **constrain the weighted sum to produce a desired output**

An `activation function` transforms the weighted sum into the desired, constrained output. For example, if the weighted sum was in the range 100-1000 but the desired output was a binary 1 or 0 (Yes or No), an `activation function` would be used to map one onto the other.

If you want to train a perceptron to detect whether a point is above or below a line, you could use the `sign activation function`. It returns `+1` if the weighted sum is positive and `-1` if it is negative. The implementation below treats a weighted sum of exactly 0 as positive.

In [3]:
class Perceptron:
  def __init__(self, num_inputs=2, weights=[1,1]):
    self.num_inputs = num_inputs
    self.weights = weights
    
  def weighted_sum(self, inputs):
    weighted_sum = 0
    for i in range(self.num_inputs):
      weighted_sum += self.weights[i]*inputs[i]
    return weighted_sum
  
  def activation(self, weighted_sum):
    if weighted_sum >= 0:
      return 1
    else:
      return -1
    

cool_perceptron = Perceptron()
print(cool_perceptron.weighted_sum([24, 55]))
print(cool_perceptron.activation(55))

79
1
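
The two steps can be chained so that a set of inputs goes straight to a prediction. This is a small sketch under assumptions: the function names `sign_activation` and `predict` are hypothetical helpers, not part of the class above, and a weighted sum of 0 maps to `+1` to match the `>= 0` check in the class.

```python
def sign_activation(weighted_sum):
    # Non-negative sums map to +1, negative sums map to -1
    return 1 if weighted_sum >= 0 else -1

def predict(inputs, weights):
    # Step 1: compute the weighted sum of inputs and weights
    total = sum(x * w for x, w in zip(inputs, weights))
    # Step 2: constrain the weighted sum with the activation function
    return sign_activation(total)

print(predict([24, 55], [1, 1]))   # weighted sum 79  -> 1
print(predict([24, -55], [1, 1]))  # weighted sum -31 -> -1
```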


## Training the Perceptron

At the moment our perceptron will be particularly bad at making predictions, since we're using arbitrary weights. We can train the perceptron by providing it with a training set, a collection of random inputs with correctly labeled outputs. Each time we execute a 'training cycle' we change its weights slightly, until the perceptron correctly matches all the input-output pairs.

Every time the output does not match the expected label, we say that the perceptron has made a training error, a quantity that measures how badly the perceptron is performing. The goal is to continue training the perceptron until the training error reaches 0.

The training error is calculated by subtracting the predicted label value from the actual label value.

In our example, since we're using the `sign activation function`, the output of the perceptron will be `+1` or `-1`. Since the labels are also `+1` or `-1`, there are four possible outcomes:

![Perceptron](img/perceptron-2.png)
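
The four outcomes can also be enumerated in a few lines. This is an illustrative sketch: the helper `training_error` and the `outcomes` dictionary are names introduced here, and the error is computed as actual minus predicted, as described above.

```python
from itertools import product

def training_error(actual, prediction):
    # Subtract the predicted label from the actual label
    return actual - prediction

# Map every (actual, prediction) pair to its training error
outcomes = {(a, p): training_error(a, p) for a, p in product([1, -1], repeat=2)}

for (actual, prediction), error in outcomes.items():
    print(f"actual={actual:+d}  predicted={prediction:+d}  error={error:+d}")
```

The error is 0 whenever the prediction agrees with the label, `+2` when the perceptron predicts `-1` for a `+1` label, and `-2` when it predicts `+1` for a `-1` label.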

In [7]:
class Perceptron:
  def __init__(self, num_inputs=2, weights=[1,1]):
    self.num_inputs = num_inputs
    self.weights = weights
    
  def weighted_sum(self, inputs):
    weighted_sum = 0
    for i in range(self.num_inputs):
      weighted_sum += self.weights[i]*inputs[i]
    return weighted_sum
   
  def activation(self, weighted_sum):
    if weighted_sum >= 0:
      return 1
    if weighted_sum < 0:
      return -1
    
  def training(self, training_set):
    for inputs in training_set:                   
      prediction = self.activation(self.weighted_sum(inputs))
      actual = training_set[inputs]
      error = actual - prediction
      
cool_perceptron = Perceptron()

### Tweaking the Weights

We tweak the weights of our perceptron, improving its performance, until the error is 0. We can't change the inputs; the only parameters we can change are the weights.

The goal is to find the optimal combination of weights that will produce the correct output for as many points as possible in the dataset.

We can’t just play around randomly with the weights until the correct combination magically pops up. There needs to be a way to guarantee that the perceptron improves its performance over time.

This is where the Perceptron Algorithm comes in. Although the full algorithm is complex, its most important part is the rule that updates each weight:

![Perceptron](img/perceptron-3.png)

```py
# formula above refactored

weight += error * input
```

We keep tweaking the weights until the perceptron correctly predicts every label in the training set. This means that multiple passes through the `training_set` might be needed before the Perceptron Algorithm comes to a halt. If a complete pass finds no error, the perceptron must have correctly predicted the labels for all points.
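
To make the update rule concrete, here is a single update worked through by hand. The starting weights `[1, 1]` and the point `(3, 0)` with label `-1` are values picked for illustration only.

```python
weights = [1, 1]
inputs, actual = (3, 0), -1

weighted_sum = sum(x * w for x, w in zip(inputs, weights))  # 3*1 + 0*1 = 3
prediction = 1 if weighted_sum >= 0 else -1                  # +1, which is wrong
error = actual - prediction                                  # -1 - 1 = -2

# Apply `weight += error * input` to every weight
weights = [w + error * x for w, x in zip(weights, inputs)]
print(weights)  # [-5, 1]

# With the updated weights the same point is now classified correctly
new_sum = sum(x * w for x, w in zip(inputs, weights))        # 3*-5 + 0*1 = -15
print(1 if new_sum >= 0 else -1)                             # -1
```

Note that the second weight is unchanged because its input is 0: an input that contributed nothing to the wrong prediction receives no correction.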

### The Bias Weight

There are times when a minor adjustment is needed for the perceptron to be more accurate. For this we use the bias weight: an extra weight whose input always has the value 1 and whose starting value is arbitrary. The updated `weighted_sum` formula:

![Perceptron](img/perceptron-4.png)

This means updating the code: `num_inputs` becomes 3 (instead of 2), and we add a bias weight to the list of weights, which becomes `[1,1,1]` instead of `[1,1]`. Each set of inputs also gains a constant 1 as its last value, to pair with the bias weight.
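
One way to sketch the bias idea in isolation is shown below. It assumes the bias weight is stored last in the list of weights; the helper name `weighted_sum_with_bias` is hypothetical and does not appear in the class.

```python
def weighted_sum_with_bias(inputs, weights):
    """Weighted sum where the last weight is the bias weight."""
    # Append the constant bias input of 1 so it pairs with the bias weight
    extended = list(inputs) + [1]
    if len(extended) != len(weights):
        raise ValueError("expected one weight per input plus one bias weight")
    return sum(x * w for x, w in zip(extended, weights))

print(weighted_sum_with_bias([24, 55], [1, 1, 1]))  # 24 + 55 + 1 = 80
```

Because the bias input is always 1, the bias weight shifts the weighted sum by a fixed amount regardless of the other inputs.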

In [14]:
# complete perceptron code
class Perceptron:
  def __init__(self, num_inputs=3, weights=[1,1,1]):
    self.num_inputs = num_inputs
    # copy the list so training never mutates the shared default argument
    self.weights = list(weights)
    
  def weighted_sum(self, inputs):
    weighted_sum = 0
    for i in range(self.num_inputs):
      weighted_sum += self.weights[i]*inputs[i]
    return weighted_sum
  
  def activation(self, weighted_sum):
    if weighted_sum >= 0:
      return 1
    if weighted_sum < 0:
      return -1
    
  def training(self, training_set):
    # note: this loop only ends if the training set is linearly separable
    foundLine = False
    while not foundLine:
      total_error = 0
      for inputs in training_set:
        prediction = self.activation(self.weighted_sum(inputs))
        actual = training_set[inputs]
        error = actual - prediction
        total_error += abs(error)
        # nudge every weight in proportion to the error and its input
        for i in range(self.num_inputs):
          self.weights[i] = self.weights[i] + (error * inputs[i])
      # a full pass without any error means every point is classified correctly
      if total_error == 0:
        foundLine = True
      
cool_perceptron = Perceptron()
# the third value of each point is the constant bias input of 1
small_training_set = {(0,3,1):1, (3,0,1):-1, (0,-3,1):-1, (-3,0,1):1}

cool_perceptron.training(small_training_set)

### Finding a Linear Classifier

A perceptron's weights can be used to find the slope and intercept of the line that the perceptron represents, which can then be plotted.

```py
slope = -self.weights[0]/self.weights[1]
intercept = -self.weights[2]/self.weights[1]
```

This allows us to visualize the perceptron at the first iteration of the training process and at the last.

<img src="img/perceptron-5.png" width="300"><img src="img/perceptron-6.png" width="300">

You can see in the second plot (last iteration) that the perceptron found the `linear classifier`, or `decision boundary`, that separates the two distinct sets of points in the training set.
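
The slope and intercept formulas follow from setting the weighted sum to zero, `w_x*x + w_y*y + w_bias = 0`, and solving for `y`. A small sketch of that calculation is given below; the function name `decision_line` and the example weights `[-5, 1, -1]` are illustrative assumptions, with the bias weight stored last.

```python
def decision_line(weights):
    """Return (slope, intercept) for weights [w_x, w_y, w_bias]."""
    w_x, w_y, w_bias = weights
    if w_y == 0:
        # The boundary is a vertical line, so it has no finite slope
        raise ValueError("vertical boundary: slope is undefined")
    return -w_x / w_y, -w_bias / w_y

slope, intercept = decision_line([-5, 1, -1])
print(slope, intercept)  # 5.0 1.0
```

Any point on the resulting line, such as `(1, 6)` for `y = 5x + 1`, gives a weighted sum of exactly 0 with these weights.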

### Limitations

There are limits to what a single perceptron can do. The example we've covered consisted of data points that were `linearly separable`, i.e. a single line could easily separate the two dissimilar sets of points.

What would happen if the data points were scattered in such a way that a line could no longer classify the points? A single perceptron with only two inputs wouldn't work for such a scenario because it cannot represent a non-linear decision boundary.
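
The classic case is XOR-style data, where no single line separates the two labels. The sketch below is an assumption-laden illustration, not the training method from this lesson verbatim: the standalone `train` function and its `max_epochs` cap are introduced here so that the loop stops even when no separating line exists, and each point carries a constant bias input of 1 as its last value.

```python
def train(training_set, weights, max_epochs=100):
    """Perceptron updates with a cap on passes; returns (weights, converged)."""
    weights = list(weights)  # work on a copy of the starting weights
    for _ in range(max_epochs):
        total_error = 0
        for inputs, actual in training_set.items():
            total = sum(x * w for x, w in zip(inputs, weights))
            prediction = 1 if total >= 0 else -1
            error = actual - prediction
            total_error += abs(error)
            # weight += error * input, applied to every weight
            weights = [w + error * x for w, x in zip(weights, inputs)]
        if total_error == 0:
            return weights, True   # a full pass with no mistakes
    return weights, False          # gave up after max_epochs passes

# XOR-style labels: +1 when exactly one of the first two inputs is 1
xor_set = {(0, 0, 1): -1, (0, 1, 1): 1, (1, 0, 1): 1, (1, 1, 1): -1}
weights, converged = train(xor_set, [1, 1, 1])
print(converged)  # False
```

However many passes are allowed, at least one of the four points is always misclassified, so the error never reaches 0.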

That's when more perceptrons and features (inputs) come into play!

By increasing the number of features and perceptrons, we can build Multilayer Perceptrons, also known as Neural Networks, which can solve much more complicated problems.