# Perceptron
##### Author: Chenyang Skylar Li

## Introduction
The Perceptron, one of the earliest machine learning models, is a binary classification algorithm introduced by Frank Rosenblatt in 1957. Inspired by biological neurons, Rosenblatt developed it at the Cornell Aeronautical Laboratory as a simple model for pattern recognition. As a type of artificial neuron, the Perceptron works on linearly separable datasets and is a foundational building block of modern neural networks.

## Mathematical Foundations

The Perceptron computes a linear combination of input features and a bias term. The resulting value is passed through an activation function to produce the output.

`output = activation(sum(w_i * x_i) + b)`

The activation function is typically the Heaviside step function (output 0 or 1) or the sign function (output -1 or +1):

![activation function](/assests/images/perceptron_activation_function.png)
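As a concrete illustration of the formula above, here is a minimal sketch that computes the output for a single input vector. The weights, bias, and input values are made up for the example, and the step function is assumed to return 1 for a strictly positive input and 0 otherwise.

```python
import numpy as np

def step(z):
    """Heaviside step: 1 where z > 0, else 0 (convention assumed here)."""
    return np.where(z > 0, 1, 0)

# Hypothetical values chosen only for illustration
w = np.array([0.5, -1.0, 2.0])   # weights w_i
x = np.array([1.0, 2.0, 0.5])    # input features x_i
b = 0.25                         # bias

z = np.dot(w, x) + b             # 0.5*1 - 1*2 + 2*0.5 + 0.25 = -0.25
output = step(z)                 # z is not positive, so the output is 0

print(z, output)
```

Because the weighted sum is negative, the instance falls on the "0" side of the decision boundary.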




### Weight Update Rule

If an instance is misclassified, update the weights and bias as follows:

`w_i = w_i + learning_rate * (target - output) * x_i`

`b = b + learning_rate * (target - output)`
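To make the rule concrete, the following sketch applies one update to a misclassified instance. All numbers are invented for illustration. When the prediction is correct, `target - output` is 0, so the same formulas leave the weights and bias unchanged.

```python
import numpy as np

learning_rate = 0.1
w = np.array([0.2, -0.4])   # current weights (hypothetical)
b = 0.0                     # current bias
x = np.array([1.0, 2.0])    # one training instance
target = 1                  # its true label

# Forward pass with the step activation
z = np.dot(w, x) + b                 # 0.2 - 0.8, roughly -0.6
output = 1 if z > 0 else 0           # predicts 0, so the instance is misclassified

# Apply the update rule
error = target - output              # 1 - 0 = 1
w = w + learning_rate * error * x    # moves w toward x: roughly [0.3, -0.2]
b = b + learning_rate * error        # roughly 0.1

print(w, b)
```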

## Learning Algorithm

1. Initialize the weights and bias to zero or small random values.
2. For each training instance:
   - Compute the output using the activation function.
   - Update the weights and bias if the output is not correct.
3. Repeat step 2 for the desired number of epochs or until convergence.
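The steps above can be sketched as a short training loop that stops as soon as an epoch finishes without any update. The helper name `train_perceptron`, its parameters, and the logical AND data are assumptions made for this illustration. A learning rate of 1 is used here only to keep the arithmetic exact.

```python
import numpy as np

def train_perceptron(X, y, learning_rate=1.0, max_epochs=100):
    """Hypothetical helper: returns (weights, bias, epochs_run)."""
    w = np.zeros(X.shape[1])   # step 1: initialize weights and bias to zero
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        n_updates = 0
        for xi, target in zip(X, y):                  # step 2: visit each instance
            output = 1 if np.dot(w, xi) + b > 0 else 0
            if output != target:                      # update only on a mistake
                w += learning_rate * (target - output) * xi
                b += learning_rate * (target - output)
                n_updates += 1
        if n_updates == 0:                            # step 3: converged, stop early
            return w, b, epoch
    return w, b, max_epochs

# Logical AND is linearly separable, so training converges
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w, b, epochs_run = train_perceptron(X, y)
predictions = np.where(X @ w + b > 0, 1, 0)
print(predictions, epochs_run)
```

Stopping on an update-free epoch is one simple way to detect convergence: if no instance triggered an update, every training instance is already classified correctly.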


In [6]:
import numpy as np

# Define a Perceptron class
class Perceptron:
    """
    Perceptron(n_features, learning_rate=0.01, epochs=1000)
    
    A simple binary classification algorithm that learns a linear decision boundary between two classes.
    
    Parameters
    ----------
    n_features : int
        The number of input features for each training example.
    learning_rate : float, optional (default=0.01)
        The learning rate used to update the weights and bias during training.
    epochs : int, optional (default=1000)
        The number of passes (epochs) over the training data.
    
    Attributes
    ----------
    weights : array-like, shape (n_features,)
        The weights learned by the perceptron during training.
    bias : float
        The bias learned by the perceptron during training.
    
    Methods
    -------
    fit(X, y)
        Train the perceptron on the given training data (input vectors X and target outputs y).
    predict(X)
        Make predictions for the given input vectors X based on the learned weights and bias.
    """
    
    # Constructor for the Perceptron class
    def __init__(self, n_features, learning_rate=0.01, epochs=1000):
        self.n_features = n_features
        self.learning_rate = learning_rate
        self.epochs = epochs
        # self.weights = np.zeros(n_features)  # Use this line to replace below line if you want to initialize the weights to zero
        self.weights = np.random.randn(n_features)  # Initialize the weights randomly
        self.bias = 0  # Initialize the bias to zero
        
    # Define the predict method for the Perceptron class
    def predict(self, X):
        # Calculate the linear output by taking the dot product of the input X and the weights, and adding the bias
        linear_output = np.dot(X, self.weights) + self.bias
        # Apply the heaviside step function to the linear output to get the predicted class label (0 or 1)
        return np.where(linear_output > 0, 1, 0)
        # return np.sign(linear_output)  # Use this line to replace above line if you want to use the sign function instead of the heaviside step function
    
    # Define the fit method for the Perceptron class
    def fit(self, X, y):
        # Iterate through the specified number of epochs
        for _ in range(self.epochs):
            # Iterate through each training example (input vector) and its corresponding target output
            for xi, target in zip(X, y):
                # Make a prediction for the input vector
                output = self.predict(xi)
                # Calculate the update factor for the weights and bias based on the difference between the predicted output and the target output
                update = self.learning_rate * (target - output)
                # Update the weights and bias based on the update factor and the input vector
                self.weights += update * xi
                self.bias += update

## Pros and Cons

**Pros:**
- Simple and easy to implement.
- Efficient for linearly separable data.
- Can be used as a building block for more complex models (e.g., multi-layer perceptrons).

**Cons:**
- Cannot solve problems that are not linearly separable (e.g., XOR).
- Sensitive to the choice of learning rate and initial weights.
- Never converges on data that is not linearly separable: the weights keep changing, so training stops only when the epoch limit is reached.
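The limitation on non-separable data can be seen on XOR, the classic example. The sketch below is an illustration written for this section and does not use the `Perceptron` class. It runs the update rule with a learning rate of 1 and records how many mistakes occur in each epoch.

```python
import numpy as np

# XOR: no single line separates the two classes
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

w, b = np.zeros(2), 0.0
mistakes_per_epoch = []
for _ in range(50):
    mistakes = 0
    for xi, target in zip(X, y):
        output = 1 if np.dot(w, xi) + b > 0 else 0
        if output != target:
            w += (target - output) * xi   # learning rate of 1 for simplicity
            b += (target - output)
            mistakes += 1
    mistakes_per_epoch.append(mistakes)

# Every epoch contains at least one mistake, so training never settles
print(min(mistakes_per_epoch))
```

No choice of weights and bias classifies all four XOR points correctly, so every epoch is certain to contain at least one update.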

## Suitable Tasks and Datasets

The Perceptron algorithm is a binary classification model that is well-suited to certain types of tasks and datasets:

1. **Binary Classification Tasks**: The Perceptron is designed to handle problems where each instance can be classified into one of two classes. This makes it useful for tasks such as classifying emails as spam or not spam, or classifying loan applicants as high risk or low risk.

2. **Linearly Separable Data**: The Perceptron is most effective when the two classes can be separated by a hyperplane in the feature space, a condition known as linear separability. If a straight line (in 2D) or a plane (in 3D) separates the two classes, the Perceptron is guaranteed to find a separating boundary after a finite number of updates.

3. **Large Scale Datasets**: The Perceptron, due to its simplicity and efficiency, can effectively handle large scale datasets. It learns iteratively from the data, making it suitable for online learning tasks where data is processed sequentially.

Some example applications:
- Predicting whether an email is spam or not.
- Classifying handwritten digits (0 and 1).
- Separating two different types of plants based on their features.

## References

1. Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386-408.
2. Minsky, M., & Papert, S. (1969). Perceptrons: An introduction to computational geometry. Cambridge, MA: MIT Press.

In [3]:
# Import necessary libraries
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Load the Iris dataset and keep the first 100 samples (classes 0 and 1) for binary classification
data = load_iris()
X, y = data.data[:100], data.target[:100]

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the Perceptron
perceptron = Perceptron(n_features=X_train.shape[1], learning_rate=0.01, epochs=1000)
perceptron.fit(X_train, y_train)

# Make predictions and calculate accuracy
predictions = perceptron.predict(X_test)
accuracy = np.mean(predictions == y_test)
print(f"Accuracy: {accuracy:.2f}")


Accuracy: 1.00


In [5]:
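# Example linear outputs (a separate name avoids overwriting the feature matrix X)
linear_output = np.array([1, -2, 3, -4, 5, 0])


# Apply the sign function to the linear output (note: np.sign returns 0 for an input of 0)
predicted_labels = np.sign(linear_output)

# Print the linear output and the predicted labels
print("Linear output:", linear_output)
print("Predicted labels:", predicted_labels)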

Linear output: [ 1 -2  3 -4  5  0]
Predicted labels: [ 1 -1  1 -1  1  0]
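
The sign function returns -1 or +1 (and 0 for an input of exactly 0), so using it as the activation assumes targets encoded as -1/+1 rather than 0/1. A minimal sketch of the conversion, assuming 0/1 labels like those in the Iris example (the label values below are illustrative):

```python
import numpy as np

y = np.array([0, 1, 1, 0])      # labels encoded as 0/1 (illustrative values)
y_signed = 2 * y - 1            # maps 0 -> -1 and 1 -> +1
y_back = (y_signed + 1) // 2    # maps -1 -> 0 and +1 -> 1

print(y_signed)  # [-1  1  1 -1]
print(y_back)    # [0 1 1 0]
```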
