### 10.2: Neural Networks: Perceptron Part 1 - The Nature of Code

[Playlist link](https://www.youtube.com/watch?v=ntKn5TPHHAk&list=PLRqwX-V7Uu6Y7MdSCaIfsxc561QI0U0Tb&index=2)

[README link](https://github.com/shiffman/NOC-S17-2-Intelligence-Learning/tree/master/week4-neural-networks)

#### What is a Perceptron

A Perceptron is a model of a single neuron, the simplest possible ANN that we can build

![](https://natureofcode.com/book/imgs/chapter10/ch10_03.png)

There is a single neuron. It receives 2 ips (X0 and X1). These ips come into the neuron. Some type of mathematical process happens in the neuron and then there is an op (Y)

This neuron/perceptron is like the general ML recipe: it receives ips and gives some op

In order to understand what exactly is happening inside the NN we need to come up with some sort of a scenario.

Say I have a 2D space. This space is divided by a line. Some pts will be on one side of the line (class A). Other pts will be on the other side (class B)

We want to use the Perceptron for a **Classification** prob. Every pt in our 2D dataset has a coordinate (x, y)

Here x is input 0 (X0) to our perceptron and y is input 1 (X1). The op is Class A or Class B.

The Perceptron will op +1 (if class A) or -1 (if class B)

![](./data/img/diag9.png)

We are going to use Supervised Learning for this. We give the perceptron a pt whose class we already know. Say we give it a pt of Class A. The perceptron guesses the class. If it guesses A, all good: we just repeat the training process with the next pt.
If it guesses B, we tweak the algo to try and get the correct answer. This tweaking is a process called Gradient Descent.

#### What is the algo that runs inside the neuron

Ips as they flow into the neuron are **weighted**.
Each one of the connections has a wt

wt for X0 = w0
wt for X1 = w1

Algo: create a sum of all ips multiplied by their wts (the weighted sum of all the ips)

Step 1.

Sum = X0.w0 + X1.w1

Step 2.

**Activation Function**: It allows us to conform the op to some desired range. There are many diff AFs

Here we will use a very simple AF. We want only 2 ops: +1 or -1. So we have to take the Sum and transform it

$sign(n) = \begin{cases} +1 & \text{if } n \geq 0 \\ -1 & \text{if } n < 0 \end{cases}$

---

This entire process is called **Feed Forward**: The ips come in, they get mul by the wts, get added, the weighted sum gets passed thru an activation function, we get +1 or -1 as op
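A minimal plain-Java sketch of the feed-forward pass described above (no Processing runtime needed). The class name `FeedForwardSketch`, the use of `double` instead of `float`, and the hand-picked wts are assumptions for illustration only:

```java
// Feed forward: weighted sum of the ips, then the SIGN activation function.
class FeedForwardSketch {

  // Activation function: conforms the weighted sum to +1 or -1.
  static int sign(double n) {
    return n >= 0 ? 1 : -1;
  }

  // Multiply each ip by its wt, add everything up, then apply the AF.
  static int feedForward(double[] inputs, double[] weights) {
    double sum = 0;
    for (int i = 0; i < weights.length; i++) {
      sum += inputs[i] * weights[i];
    }
    return sign(sum);
  }

  public static void main(String[] args) {
    double[] weights = {0.5, -1.0}; // hypothetical wts, chosen by hand
    double[] inputs = {2.0, 3.0};   // X0 = 2, X1 = 3
    // sum = 2 * 0.5 + 3 * (-1.0) = -2.0, so the op is -1
    System.out.println(feedForward(inputs, weights)); // prints -1
  }
}
```

Note that a weighted sum of exactly 0 gives +1, because SIGN treats n >= 0 as positive.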

#### How to pick the wts

The idea is that through the SL process we search for the optimal wt values that will give us results with least error.

We can start off by taking random values of wts. **However there are many other methods for choosing initial wt values.**



#### Setting up basic Perceptron class

``` java

class Perceptron {
// create an array to store wts

float[] weights = new float[2];

// constructor: here we loop through all the wts and give them a random value bw -1 and +1
Perceptron(){
  for(int i = 0; i < weights.length; ++i){
    weights[i] = random(-1, 1);
  }
}

}
```

#### A Perceptron should be able to receive ips and compute a guess(op) using a weighted sum and Activation Function:

``` java

int sign(float n){
    // this is our Activation Function
    if (n >= 0){
      return 1;
    }
    else{
      return -1;
    }
  }
  
  int guess(float[] inputs){
    //here we compute the weighted sum
    float sum = 0;
    for (int i = 0; i < weights.length; ++i){
      sum += inputs[i] * weights[i];
    }
    
    int output = sign(sum);
    
    return output;
    
  }

```

#### Run the class

``` java

Perceptron p;

void setup(){
  size(200, 200);
  p = new Perceptron();
  float inputs[] = {-1, 0.5};
  int guess = p.guess(inputs);
  println(guess);
}

void draw(){


}

```
Here we have coded the basic structure and functionalities of the Perceptron

---

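Let's consider the line **y = x** in our 2D data space.
Any pt with y > x (above the line in the usual maths sense) is labelled +1, any other pt is -1. Processing's y axis points down, so on screen the +1 pts are drawn below the line.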

Basically we want to create a **known dataset to train the Perceptron**

``` java

class Point {
  float x;
  float y;
  int label;
  
  //Constructor
  Point(){
    x = random(width);
    y = random(height);
    if (x < y){
      label = 1;
    } else {
      label = -1;
    }
  }
  
  void show(){
    stroke(0);
    if (label == 1){
      fill(255);
    } else {
      fill(0);
    }
    
    ellipse(x, y, 8, 8);
  }
}

```

So now we have points randomly scattered on either side of a line. This is basically our **known Training Data**

1. We need to take this training data, one pt at a time, and pass it in as an ip

2. We get a guess from the Perceptron (+1 or -1) by using the weighted sum and the activation function

3. We need to do **something** depending on whether the guess is correct or incorrect

#### What is this something

We have a guess from the Perceptron (+1 or -1) by using the weighted sum and the activation function

We also have an answer

We can compute error

    error = answer - guess
    
    Error will be either 0, +2 or -2
    
We are trying to find out the optimal wts

$error =  answer - guess$
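Since answer and guess can each only be +1 or -1, there are just four cases to check. A small plain-Java sketch that enumerates them (the class name `ErrorTableSketch` is made up for this example):

```java
// Enumerates error = answer - guess for every answer/guess combination.
class ErrorTableSketch {

  static int error(int answer, int guess) {
    return answer - guess;
  }

  public static void main(String[] args) {
    int[] labels = {1, -1};
    for (int answer : labels) {
      for (int guess : labels) {
        System.out.println("answer=" + answer
            + " guess=" + guess
            + " error=" + error(answer, guess));
      }
    }
  }
}
```

A correct guess gives 0, guessing -1 when the answer is +1 gives +2, and guessing +1 when the answer is -1 gives -2. The sign of the error therefore tells us which direction to push the wts.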


![](./data/img/diag10.png)

$ w_{0} = w_{0} + \Delta w_{0} $

$ w_{1} = w_{1} + \Delta w_{1} $



If there is a mistake, I want to tweak the wts. Maybe the weighted sum got me to -1 when it should have been +1. So we can adjust the wts to push the op towards +1


#### How to change the wts

We do this by a process called **Gradient Descent**

Think of it like steering: we have a current velocity vector and a target, and from the target we get a desired vel vector

$steer = desired - velocity$

If we get this steer vector and add it to the current vel, it's going to cause me to turn and go towards that target

![](./data/img/diag11.png)

- This steer vector is like the error

- The desired vector is like the answer

- The vel is like my guess

$\Delta w_{0} = error \times x_{0}$

$\Delta w_{1} = error \times x_{1}$

**How much to steer** - This is controlled by the **Learning Rate**

$\Delta w_{0} = error \times x_{0} \times learningrate$

$\Delta w_{1} = error \times x_{1} \times learningrate$
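A worked example of a single update, as a plain-Java sketch. The class name `WeightUpdateSketch`, the starting wts, the pt (1, 3) and the learning rate of 0.1 are all hypothetical values picked to keep the arithmetic easy to follow:

```java
// Applies delta_w[i] = error * x[i] * learningRate to every wt once.
class WeightUpdateSketch {

  static double[] update(double[] weights, double[] inputs,
                         int error, double learningRate) {
    double[] updated = new double[weights.length];
    for (int i = 0; i < weights.length; i++) {
      updated[i] = weights[i] + error * inputs[i] * learningRate;
    }
    return updated;
  }

  public static void main(String[] args) {
    double[] weights = {0.5, -0.5}; // hypothetical starting wts
    double[] inputs = {1.0, 3.0};   // pt (1, 3), whose answer is +1
    // weighted sum = 0.5 * 1 + (-0.5) * 3 = -1.0, so the guess is -1
    int error = 1 - (-1);           // answer - guess = +2
    double[] updated = update(weights, inputs, error, 0.1);
    // delta w0 = 2 * 1 * 0.1 = 0.2 and delta w1 = 2 * 3 * 0.1 = 0.6,
    // so the new wts are roughly (0.7, 0.1)
    System.out.println(updated[0] + ", " + updated[1]);
  }
}
```

With the new wts the weighted sum for (1, 3) is about 0.7 + 0.3 = 1.0, so the same pt would now be guessed as +1. A smaller learning rate would move the wts in the same direction but by a smaller step.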

**Supervised Learning Algorithm**:

1. Provide the perceptron with ips for which there is a known answer

2. Ask the perceptron to guess an answer

3. Compute the error (Did it get the answer right or wrong)

4. Adjust all the wts acc to the error

5. Return to Step 1 and repeat
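The five steps above can be sketched end to end in plain Java. This is only an illustration under stated assumptions: the class name `TrainingLoopSketch`, the six hand-picked pts (labelled +1 when x < y, like the Point class above), the zero starting wts and the learning rate of 0.1 are not from the original sketch, which uses random pts and random wts:

```java
// Supervised learning loop for a 2-ip perceptron without bias.
class TrainingLoopSketch {

  static final double LEARNING_RATE = 0.1; // hypothetical value

  // Small fixed dataset: label is +1 when x < y, otherwise -1.
  static final double[][] POINTS = {{1, 3}, {2, 5}, {0, 2}, {3, 1}, {5, 2}, {2, 0}};
  static final int[] LABELS = {1, 1, 1, -1, -1, -1};

  static int sign(double n) {
    return n >= 0 ? 1 : -1;
  }

  static int guess(double[] weights, double[] inputs) {
    double sum = 0;
    for (int i = 0; i < weights.length; i++) {
      sum += inputs[i] * weights[i];
    }
    return sign(sum);
  }

  // One pass over the data; returns how many guesses were wrong.
  static int trainOnePass(double[] weights, double[][] points, int[] labels) {
    int mistakes = 0;
    for (int p = 0; p < points.length; p++) {      // step 1: known answer
      int guess = guess(weights, points[p]);       // step 2: guess
      int error = labels[p] - guess;               // step 3: error
      if (error != 0) {
        mistakes++;
      }
      for (int i = 0; i < weights.length; i++) {   // step 4: adjust wts
        weights[i] += error * points[p][i] * LEARNING_RATE;
      }
    }
    return mistakes;
  }

  // Step 5: repeat until a full pass has no mistakes (or we give up).
  static double[] train(double[][] points, int[] labels, int maxPasses) {
    double[] weights = {0.0, 0.0};
    for (int pass = 0; pass < maxPasses; pass++) {
      if (trainOnePass(weights, points, labels) == 0) {
        break;
      }
    }
    return weights;
  }

  public static void main(String[] args) {
    double[] weights = train(POINTS, LABELS, 100);
    System.out.println("wts: " + weights[0] + ", " + weights[1]);
    for (int p = 0; p < POINTS.length; p++) {
      System.out.println("label=" + LABELS[p] + " guess=" + guess(weights, POINTS[p]));
    }
  }
}
```

On this tiny dataset the wts settle at roughly (-0.4, 0.4) after two corrections, which matches the intuition that y should count positively and x negatively when the rule is x < y.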


---

We need a train() function which receives ip and the target (correct answer):

``` java
// training function (lr is the learning rate; assumed here to be a float field of Perceptron, e.g. float lr = 0.1)
  
  void train(float[] inputs, int target){
    // compute the guess for the set of ips
    int guess = guess(inputs);
    // compute the error
    int error = target - guess;
    
    // now that the error is known tune each wt
    
    for (int i = 0; i < weights.length; ++i){
      weights[i] += error * inputs[i] * lr;
    }
    
    println("After tuning the wts are ...........................");
    println(weights[0], weights[1]);
    
    
  }

```

Train each pt:

``` java

for (Point pt: points){
    float[] inputs = {pt.x, pt.y};
    int target = pt.label;
    
    brain.train(inputs, target);
    
    int guess = brain.guess(inputs);
    if (guess == target){
      fill(0, 255, 0);
    }
    else {
      fill(255, 0, 0);
    }
    noStroke();
    ellipse(pt.x, pt.y, 16, 16);
  }

```

If we comment out the line:

    brain.train(inputs, target);
    
and if we run the code with the initial guess for the wts (without tuning the wts at every iteration):

![](./data/img/diag12.png)

On the other hand, if we tune the wts at each iteration:

![](./data/img/diag13.png)



#### Demonstrating the training process with a mouse click

Basically the idea here is to run the train() function every time the mouse is pressed and see how the changed wts affect the points

We keep track of how many pts have been used to train using a global var: counter

``` java

// keep a counter to track how many pts  have been used to train the perceptron
int counter = 0;

```

We write a mousePressed() function which will train the Perceptron using the next point. The wts will be tuned. As the draw() function keeps running in the background, the points will be colored green (correct guess) or red (incorrect guess)

draw() function:

``` java

void draw(){
  background(255);
  
  // boundary
  
  stroke(0);
  line(0, 0, width, height);
  
  //for point in points, display the points
  
  for (Point pt : points){
    pt.show();
  }
  
  // color each pt green if the guess is correct, red if it is incorrect
  for (Point pt: points){
    float[] inputs = {pt.x, pt.y};
    int target = pt.label;
    int guess = brain.guess(inputs);
    if (guess == target){
      fill(0, 255, 0);
    }
    else {
      fill(255, 0, 0);
    }
    noStroke();
    ellipse(pt.x, pt.y, 16, 16);
  }
}

```

mousePressed() function:

``` java

void mousePressed(){

  // stop once every pt has been used, otherwise points[counter] would go out of bounds
  if (counter >= points.length){
    return;
  }

  // train the perceptron using the next point and see how the wt changes and how that affects the labelling
  Point required_point = points[counter];
  float[] inputs = {required_point.x, required_point.y};
  int target = required_point.label;
  brain.train(inputs, target);
  counter += 1;
  
}


```

The mousePressed function takes each pt and basically computes a guess and changes the wts acc to the error

Now the draw() function just runs in the background continuously and colors the pts



#### Bias

Say there is a coordinate system where (0, 0) is at the position shown

Consider the black line. I want to use a perceptron to classify data above or below that line

![](./data/img/diag14.png)

Or maybe we want to use the same system but I want to categorize data above or below the orange line

![](./data/img/diag15.png)

We want the perceptron (same exact code) to categorize data in both these scenarios

In orange case (0,0) should be +1

In black case (0, 0) should be -1

But in our current Perceptron, if I feed in (0, 0) as the ips, then whatever the values of the wts, the weighted sum = 0

sign(0)  = 1

So it's always 1

So it's always class A

But (0, 0) can be class A in some scenarios and class B in others

So we need a **bias**.

This is like a 3rd ip that is always = 1, and its wt is **Wbias**

![](./data/img/diag16.png)

Now imagine you have (0, 0) as the ips. The weighted sum is then just Wbias x 1 = Wbias, so if Wbias >= 0 we get +1, else we get -1
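A plain-Java sketch of the guess with a bias ip added. The class name `BiasSketch`, the layout of the wts array as {w0, w1, Wbias} and the specific wt values are assumptions made for this example:

```java
// Perceptron guess with a 3rd ip that is always 1, weighted by Wbias.
class BiasSketch {

  static int sign(double n) {
    return n >= 0 ? 1 : -1;
  }

  // weights = {w0, w1, wBias}; the bias ip is fixed at 1.
  static int guess(double x0, double x1, double[] weights) {
    double sum = x0 * weights[0] + x1 * weights[1] + 1.0 * weights[2];
    return sign(sum);
  }

  public static void main(String[] args) {
    double[] positiveBias = {0.3, -0.8, 0.5};  // hypothetical wts, Wbias > 0
    double[] negativeBias = {0.3, -0.8, -0.5}; // same w0 and w1, Wbias < 0
    // For (0, 0) only Wbias contributes to the weighted sum.
    System.out.println(guess(0, 0, positiveBias)); // prints 1
    System.out.println(guess(0, 0, negativeBias)); // prints -1
  }
}
```

The same code can now put (0, 0) in either class, which is what the black-line and orange-line scenarios need. During training Wbias would be tuned like any other wt, with its ip taken as 1.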

---

NN are designed to solve a function. Think about LinReg when we were solving y = mx + b

So Wbias is really there to solve the y intercept

And w0 and w1 are solving the slope
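
To make the link with y = mx + b explicit, here is a short derivation. It assumes the perceptron with bias described above and that w1 is not 0; m and b are the slope and intercept symbols borrowed from the LinReg line equation. The boundary between the two classes is where the weighted sum is exactly 0:

$ w_{0} x + w_{1} y + w_{bias} = 0 $

$ y = -\frac{w_{0}}{w_{1}} x - \frac{w_{bias}}{w_{1}} $

$ m = -\frac{w_{0}}{w_{1}}, \quad b = -\frac{w_{bias}}{w_{1}} $

So the ratio of w0 to w1 sets the slope, and Wbias (scaled by w1) sets the y intercept. With Wbias = 0 the boundary is forced to pass through (0, 0).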