# What is classification?

#### Process of categorizing a given set of data into classes
#### Can be performed on both structured and unstructured data
#### Starts with predicting the class of the given data points
#### Classes are referred to as targets, or labels

---


### A perceptron is a single-layer neural network

#### We will use the OR gate as an example



In [3]:
import numpy as np 
x = np.array([[0,0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1]) 
x

array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

In [4]:
y

array([0, 1, 1, 1])

#### Weight vector

* It is a vector that assigns importance to each input feature
* Since each input is a 2 element vector, the weights are also 2-element: [w1, w2]
* Both weights are set to 1, so both inputs are treated equally

In [6]:
w = np.array([1, 1])
w

array([1, 1])

#### Bias

* Bias shifts the decision boundary
* It acts like a default output before seeing any input, and helps the model learn functions that don't pass through the origin
* Moves the threshold so that only [0, 0] produces a 0

In [7]:
b = -0.5

In [8]:
def activation(z):
    if z >= 0:
        return 1
    else:
        return 0

In [9]:
pred = []
for a in x:
    y_hat = np.dot(a,w)+b
    pred.append(activation(y_hat))

In [10]:
pred

[0, 1, 1, 1]
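
The loop above can also be written in vectorized form. The following is a minimal sketch that assumes the same `x`, `w`, and `b` values as above; the helper name `predict` is introduced here for illustration and is not part of the original notebook. It also shows why the bias matters: with a bias of 0, the input `[0, 0]` scores exactly 0, which passes the `z >= 0` test and is classified as 1.

```python
import numpy as np

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w = np.array([1, 1])

def predict(inputs, weights, bias):
    # Hypothetical helper: weighted sum per row, then the same step activation (z >= 0)
    z = inputs @ weights + bias
    return (z >= 0).astype(int)

with_bias = predict(x, w, -0.5)    # scores: -0.5, 0.5, 0.5, 1.5
without_bias = predict(x, w, 0.0)  # scores: 0, 1, 1, 2

print(with_bias)     # [0 1 1 1] matches the OR gate
print(without_bias)  # [1 1 1 1] misclassifies [0, 0]
```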

---

## Perceptron Learning

1) Concept: Train the weights over several epochs, feeding the prediction error back to the perceptron in each epoch
2) During error propagation, we adjust the weight values until the perceptron makes accurate predictions
3) Weight(new) = Weight(old) - Learning_rate * (Gradient of error with respect to weight)
4) Error = Expected_Output - Predicted_Output
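
The minus sign in the weight update and the error definition fit together as follows. This is a sketch under a stated assumption: the error is measured as a squared error on the raw linear output (the delta rule). Here $t$ is the expected output, $\hat{y}$ the predicted output, $x_i$ the i-th input, and $\alpha$ the learning rate; these symbols are introduced for illustration.

```latex
E = \tfrac{1}{2}\,(t - \hat{y})^2, \qquad \hat{y} = \sum_i w_i x_i + b

\frac{\partial E}{\partial w_i} = -(t - \hat{y})\, x_i

w_i^{\text{new}} = w_i^{\text{old}} - \alpha \frac{\partial E}{\partial w_i}
                 = w_i^{\text{old}} + \alpha\,(t - \hat{y})\, x_i
```

Subtracting the gradient is therefore the same as adding `alpha * input * error`. The classic perceptron rule has the same form but uses the thresholded prediction in place of the raw output when computing the error.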

In [11]:
import math



Epochs are the number of times you train over the full dataset
Alpha is the learning rate: higher means faster learning, but riskier (overshooting)

In [29]:
epochs = 4
alpha = 0.7

In [13]:
w0 = np.random.random() 
w1 = np.random.random()  
w2 = np.random.random() 
print("Initial weights:")
print("w0: ", w0, " w1: ", w1, " w2: ", w2)

Initial weights:
w0:  0.9321873169825927  w1:  0.44509219729200633  w2:  0.09739391520545049


These variables store the weight updates (deltas). They are initially set to 1, but are recalculated at every training step.

In [14]:
del_w0 = 1
del_w2 = 1
del_w1 = 1

In [15]:
train_data_temp = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
train_data = np.asarray(train_data_temp)
op = np.array([0, 1, 1, 1, 1, 1, 1, 1])

In [16]:
train_data

array([[0, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [1, 0, 0],
       [0, 1, 1],
       [1, 0, 1],
       [1, 1, 0],
       [1, 1, 1]])

In [17]:
op

array([0, 1, 1, 1, 1, 1, 1, 1])

y_hat = (x · w) + bias

#### This code does the following:
1) Initializes the bias
2) Training Loop:

       a) Outer loop repeats training for epochs times
       b) Inner loop goes through each training sample x and learns from it


3) y_hat is the prediction. Linear combination of inputs and weights, like a raw score before activation
4) Activation function: If weighted sum > 0, predict 1, else predict 0
5) This function turns the continuous score y_hat into a discrete classification decision
6) Error is calculated using the correct label op[j] and predicted label act
7) Then it uses the Perceptron Learning rule

       a) delta_w = alpha * input * error
       b) if error = 0, don't change anything
       c) if error = +1 (prediction too low), increase weight
       d) if error = -1 (prediction too high), decrease weight


8) Then we update the weights using w = w + del_w
9) This is the core of training, slowly nudging the weights toward better predictions

In [30]:
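bias = 0  # Kept fixed in this loop; it can also be trained like the weights
for i in range(epochs):
    # enumerate gives the sample index j, used to look up the expected output op[j]
    for j, x in enumerate(train_data):
        # Raw score: weighted sum of the inputs plus the bias
        y_hat = w0*x[0] + w1*x[1] + w2*x[2] + bias

        # Step activation: predict 1 only if the score is strictly positive
        act = 1 if y_hat > 0 else 0
        err = op[j] - act

        # Perceptron learning rule: delta = learning rate * input * error
        del_w0 = alpha*x[0]*err
        del_w1 = alpha*x[1]*err
        del_w2 = alpha*x[2]*err

        w0 = w0 + del_w0
        w1 = w1 + del_w1
        w2 = w2 + del_w2

        #print("epoch: ", i+1, " error: ", err)
        #print(del_w0, del_w1, del_w2)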
print("\nFinal Weights = ")
print("w0: ", w0, " w1: ", w1, " w2: ", w2)


Final Weights = 
w0:  0.9321873169825927  w1:  0.44509219729200633  w2:  0.09739391520545049
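
In the run above the final weights equal the initial weights. The printed initial weights are all positive and the bias is 0, so every sample is already classified correctly (the all-zero input scores 0 and predicts 0, every other input scores above 0), the error is 0 at every step, and no update is applied. The sketch below shows the same rule actually changing the weights. It makes several assumptions that are not in the original notebook: the weights start at zero instead of random values, the bias is trained with the same rule, `epochs` is raised to 10, and training stops early once an epoch has no mistakes.

```python
import numpy as np

train_data = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0],
                       [0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]])
op = np.array([0, 1, 1, 1, 1, 1, 1, 1])

alpha = 0.7
epochs = 10             # assumption: a few more epochs than the original 4
weights = np.zeros(3)   # assumption: zero initialization so that mistakes occur
bias = 0.0              # assumption: the bias is trained like the weights

for epoch in range(epochs):
    mistakes = 0
    for sample, target in zip(train_data, op):
        act = 1 if sample @ weights + bias > 0 else 0
        err = target - act
        # Same rule as above, applied to all weights at once and to the bias
        weights = weights + alpha * err * sample
        bias = bias + alpha * err
        mistakes += int(err != 0)
    if mistakes == 0:
        break  # every sample classified correctly in this epoch

predictions = [1 if s @ weights + bias > 0 else 0 for s in train_data]
print("weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

Because the OR data is linearly separable, the rule settles on weights that reproduce `op` after a handful of epochs.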


--- 


# Types of Classification

## Binary Classification

### Involves exactly two classes, typically one representing the normal state and the other the abnormal state

## MultiClass Classification

### Data points are grouped into several well known classes

## MultiLabeled Classification

### Each data point can be assigned more than one class label at the same time

## Imbalance Classification

### Classes are unevenly distributed: some classes have far more examples than others
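
The difference between these types shows up in the shape and content of the label array. The following is an illustrative sketch with made-up labels, assuming the common convention that a multi-label sample can carry several labels at once; all variable names are introduced here for illustration.

```python
import numpy as np

# Binary: one label per sample, two possible classes
binary_labels = np.array([0, 1, 1, 0])

# Multiclass: one label per sample, more than two possible classes
multiclass_labels = np.array([0, 2, 1, 2])

# Multi-label: one row per sample, one column per class, several 1s allowed per row
multilabel_labels = np.array([[1, 0, 1],
                              [0, 1, 0],
                              [1, 1, 1]])
labels_per_sample = multilabel_labels.sum(axis=1)

# Imbalanced: the classes have very different numbers of samples
imbalanced_labels = np.array([0] * 95 + [1] * 5)
class_counts = np.bincount(imbalanced_labels)

print(labels_per_sample)  # [2 1 3]
print(class_counts)       # [95  5]
```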


--- 

# Learner Type Terminology

## Lazy Learner

### Lazy learner delays the generalization of training data until a query is made to the system. 
#### Example: KNN

## Eager Learner

### Before obtaining a test dataset, eager learners build a classification model using a training dataset. 
#### Example: ANN, Naive Bayes
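
The contrast can be sketched on the OR gate data used earlier. This is a minimal illustration, not a reference implementation: `LazyOneNN` and `EagerPerceptron` are hypothetical classes written for this example, and the `alpha` and `epochs` defaults are assumptions. The lazy learner only stores the data in `fit` and does all its work at query time; the eager learner builds its model (weights and bias) in `fit` and no longer needs the training data afterwards.

```python
import numpy as np

X_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_train = np.array([0, 1, 1, 1])

class LazyOneNN:
    """Hypothetical 1-nearest-neighbour learner: fit only stores the data."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, query):
        # All computation happens at query time
        distances = np.linalg.norm(self.X - query, axis=1)
        return int(self.y[np.argmin(distances)])

class EagerPerceptron:
    """Hypothetical perceptron: fit builds the model before any query arrives."""
    def fit(self, X, y, alpha=0.5, epochs=10):
        self.w, self.b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, ti in zip(X, y):
                err = ti - int(xi @ self.w + self.b > 0)
                self.w = self.w + alpha * err * xi
                self.b = self.b + alpha * err
        return self

    def predict(self, query):
        # Only the learned weights and bias are used
        return int(query @ self.w + self.b > 0)

lazy = LazyOneNN().fit(X_train, y_train)
eager = EagerPerceptron().fit(X_train, y_train)

query = np.array([1, 1])
print(lazy.predict(query), eager.predict(query))
```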