# Perceptron

The **perceptron** is a fundamental building block of machine learning (ML) and artificial neural networks. It is an artificial neuron that takes multiple inputs, multiplies each input by a weight, sums the results, and passes the sum through an activation function to produce an output.

The perceptron is a binary classifier: it assigns each input to one of two classes.
***
The perceptron was proposed by **Frank Rosenblatt** in 1957 and is one of the earliest models of an artificial neural network. It gained popularity due to its ability to learn and classify patterns, and it laid the foundation for many other artificial neural network models. It was inspired by the biological neurons found in the brain. 

![perceptron](https://andreyex.ru/wp-content/uploads/2019/07/TensorFlow-odnoslojnyj-perseptron_1.jpg)
***
#### Perceptron Convergence Theorem:

The perceptron convergence theorem states that if a dataset is linearly separable, then the <u>perceptron learning algorithm is guaranteed to find a separating hyperplane in a finite number of steps</u>. This theorem provides theoretical support for the effectiveness of perceptrons in solving linear classification problems.
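
The guarantee can be observed on a small example. The sketch below assumes the logical AND dataset (which is linearly separable) and a step activation with threshold 0; the helper name `train_until_converged` and the `max_epochs` safety cap are illustrative choices, not part of the theorem. Training stops at the first epoch in which no example is misclassified.

```python
import numpy as np

def train_until_converged(X, y, learning_rate=1.0, max_epochs=1000):
    """Apply the perceptron rule until one full epoch makes no mistakes.

    Returns (weights, epochs_used, total_updates). The max_epochs cap is
    only a safety net for data that is not linearly separable.
    """
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias input 1
    w = np.zeros(Xb.shape[1])
    updates = 0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, target in zip(Xb, y):
            output = 1 if np.dot(w, x) >= 0 else 0
            if output != target:
                # Weights change only on misclassified examples.
                w += learning_rate * (target - output) * x
                mistakes += 1
        updates += mistakes
        if mistakes == 0:
            return w, epoch, updates
    raise RuntimeError("no separating hyperplane found within max_epochs")

X_and = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])

w, epochs, updates = train_until_converged(X_and, y_and)
print(f"epochs: {epochs}, updates: {updates}, weights: {w}")
```

On data that is not linearly separable (XOR, for example) the loop would never reach a mistake-free epoch, which is why the sketch raises an error once the cap is hit.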

## Basic components of a perceptron:

The perceptron consists of several essential components:

1. **Input Values**: The perceptron receives input values that represent features or attributes of the input data. Each input is assigned a weight, indicating its significance in influencing the perceptron's output.

2. **Weights**: The weights associated with the inputs determine their relative importance. These weights can be positive or negative real numbers and play a crucial role in the perceptron's decision-making process.

3. **Weighted Sum**: The perceptron calculates the weighted sum by multiplying each input value with its corresponding weight and then summing them up. This step represents the combination of inputs and weights.

4. **Activation Function**: The weighted sum is passed through an activation function, which introduces non-linearity into the perceptron's output. In the classic perceptron this is a step function that decides whether the perceptron activates (outputs 1) or remains inactive (outputs 0) for the input it receives.

5. **Threshold/Bias**: The perceptron can incorporate a bias term or threshold, which acts as an offset. It affects the decision boundary of the perceptron by adjusting the activation level required for the perceptron to fire.

6. **Output**: The output of the perceptron is generated by applying the activation function to the weighted sum (plus bias, if present). It represents the perceptron's classification decision or prediction, such as assigning an input to a specific class or producing a binary output.

Together, these components let the perceptron take input data, compute a weighted sum, apply an activation function, and produce an output that can be used for classification or prediction.
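
A minimal sketch of how the six components fit together in a single forward pass. The function name `perceptron_output` and all numeric values are made up for illustration, and a step activation with threshold 0 is assumed.

```python
import numpy as np

def perceptron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, followed by a step activation."""
    weighted_sum = np.dot(weights, inputs) + bias
    return 1 if weighted_sum >= 0 else 0

# Hypothetical values chosen only to illustrate the computation.
inputs = np.array([1.0, 0.0, 1.0])     # input values
weights = np.array([0.5, -0.6, 0.2])   # one weight per input
bias = -0.4                            # offset that shifts the decision boundary

# Weighted sum is about 0.5 + 0.0 + 0.2 - 0.4 = 0.3, which is >= 0, so this prints 1.
print(perceptron_output(inputs, weights, bias))
```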

During training, a learning algorithm iteratively adjusts the perceptron's weights and bias using the provided training data. The objective is to find weights and a bias that classify the training examples correctly or, when that is not possible, that minimize the discrepancy between the predicted and desired outputs.

Combining multiple perceptrons, or using richer architectures such as multi-layer perceptrons (MLPs), makes it possible to tackle problems that require non-linear decision boundaries.

## Types of perceptrons:

There are **two main types** of perceptrons: single-layer perceptrons and multi-layer perceptrons (a class of feedforward neural networks).

- Single-layer perceptrons: These consist of a single layer of artificial neurons, which take inputs, compute a weighted sum, and apply an activation function. Single-layer perceptrons are limited to linearly separable problems, where data points can be divided into distinct classes by a straight line or hyperplane.

- Multi-layer perceptrons (MLPs): These are composed of one or more hidden layers between the input and output layers. MLPs can learn and represent complex non-linear relationships, making them capable of solving more complicated problems.
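
Logical XOR is a standard example of a problem that no single straight line can separate, yet two layers of step units handle it. The sketch below uses hand-picked weights (they are not learned, and other choices would work equally well): the hidden layer computes OR and NAND, and the output unit takes the AND of the two.

```python
import numpy as np

def step(z):
    """Step activation: 1 where z >= 0, else 0 (works on scalars and arrays)."""
    return (np.asarray(z) >= 0).astype(int)

# Hidden layer: row 0 computes OR, row 1 computes NAND (hand-picked weights).
W_hidden = np.array([[1.0, 1.0],
                     [-1.0, -1.0]])
b_hidden = np.array([-0.5, 1.5])

# Output unit: AND of the two hidden units.
w_out = np.array([1.0, 1.0])
b_out = -1.5

def xor_network(x):
    """Forward pass through the two-layer network for one input pair."""
    hidden = step(W_hidden @ x + b_hidden)
    return int(step(w_out @ hidden + b_out))

for x in np.array([[0, 0], [0, 1], [1, 0], [1, 1]]):
    print(f"Input: {x}, XOR: {xor_network(x)}")
```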

## Math behind perceptron:

1. **Weighted sum of inputs with bias term**:

   $$
   \text{weighted\_sum} = w_0 \cdot b + w_1 \cdot x_1 + w_2 \cdot x_2 + \ldots + w_n \cdot x_n
   $$

   where $w_0$ represents the bias weight, $b$ is the bias input (usually set to 1), and $x_i$ are the input features.

2. **Activation function with threshold**:
   $$
   \text{output} = \begin{cases} 
                     1, & \text{if } \text{weighted\_sum} \geq \text{threshold} \\
                     0, & \text{otherwise}
                   \end{cases}
   $$
   
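   The threshold is a fixed value, often set to 0. A nonzero threshold can equivalently be absorbed into the bias weight $w_0$, because comparing the weighted sum with a threshold $\theta$ is the same as comparing the weighted sum minus $\theta$ with 0.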

3. **Updating weights with learning rate**:

   $$
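   w_i \leftarrow w_i + \text{learning\_rate} \times (y - \text{output}) \times x_i
   $$

   where $w_i$ is the weight for the $i$-th input, $y$ is the target output, $\text{output}$ is the perceptron's prediction, and $x_i$ is the $i$-th input value. The bias weight $w_0$ is updated the same way, with the bias input $b$ in place of $x_i$. When the prediction is correct, $y - \text{output} = 0$ and the weights do not change.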

4. **Perceptron Loss function**:
   $$
   \text{Loss} = \frac{1}{N} \sum_{i=1}^{N} (\text{target}_i - \text{output}_i)^2
   $$
   
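   This is the mean squared error (MSE) between the target output $\text{target}_i$ and the predicted output $\text{output}_i$ over $N$ training examples. Because both values are either 0 or 1, each squared difference is 1 for a misclassified example and 0 otherwise, so the loss equals the fraction of misclassified examples. It is useful for monitoring training; the weight update itself is not obtained by differentiating this loss, since the step activation has zero slope everywhere except at the threshold.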

5. **Perceptron Update Rule**:
   $$
   \Delta w_i = \text{learning\_rate} \times (\text{target} - \text{output}) \times x_i
   $$

   This is the change $\Delta w_i$ applied to the weight of the $i$-th input at each training step. It is the term added to $w_i$ in the weight update of step 3, with $\text{target}$ playing the role of $y$.
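
The formulas above can be traced by hand on a single training example. The following sketch uses made-up starting weights and one made-up example, with the bias input fixed to 1 and the threshold set to 0; the numbered comments refer to the steps in the list above.

```python
import numpy as np

learning_rate = 0.1
b = 1.0                                # bias input
weights = np.array([0.0, 0.2, -0.4])   # [w_0, w_1, w_2], made-up starting values
x = np.array([1.0, 1.0])               # one training example (x_1, x_2)
y = 1                                  # its target output

x_aug = np.insert(x, 0, b)             # [b, x_1, x_2]

# 1. Weighted sum: w_0*b + w_1*x_1 + w_2*x_2 (about -0.2 here)
weighted_sum = np.dot(weights, x_aug)

# 2. Step activation with threshold 0 (gives 0, so the example is misclassified)
output = 1 if weighted_sum >= 0 else 0

# 5. Weight change for every component, including the bias weight
delta_w = learning_rate * (y - output) * x_aug

# 3. Apply the update
new_weights = weights + delta_w

# 4. Loss over a small batch of made-up targets and outputs
targets = np.array([0, 1, 1, 1])
outputs = np.array([0, 0, 1, 1])
mse = np.mean((targets - outputs) ** 2)
error_rate = np.mean(targets != outputs)

print("weighted_sum:", weighted_sum)
print("output:", output)
print("delta_w:", delta_w)
print("new_weights:", new_weights)
print("mse:", mse, "error_rate:", error_rate)
```

With these particular numbers, the updated weights give a weighted sum above 0 for the same example, so a single update is enough to correct the prediction. The last line also shows that, for 0/1 outputs, the MSE coincides with the fraction of misclassified examples.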




## Importing libraries

```python
import numpy as np
```

## Single-layer perceptron

```python
class Perceptron:
    def __init__(self, num_features, learning_rate=0.1, num_epochs=100):
        """
        Initialize the Perceptron classifier.

        Args:
            num_features (int): Number of input features.
            learning_rate (float, optional): Learning rate for weight updates. Defaults to 0.1.
            num_epochs (int, optional): Number of training epochs. Defaults to 100.
        """
        self.num_features = num_features
        self.learning_rate = learning_rate
        self.num_epochs = num_epochs
        self.weights = np.zeros(num_features + 1)  # Additional weight for bias term

    def activation(self, x):
        """
        Activation function for the perceptron.

        Args:
            x (float): Input value.

        Returns:
            int: Output of the activation function (0 or 1).
        """
        return 1 if x >= 0 else 0

    def predict(self, x):
        """
        Perform prediction using the trained perceptron.

        Args:
            x (ndarray): Input features.

        Returns:
            int: Predicted class label (0 or 1).
        """
        x_with_bias = np.insert(x, 0, 1)  # Add bias term
        activation_input = np.dot(x_with_bias, self.weights)
        return self.activation(activation_input)

    def train(self, X, y):
        """
        Train the perceptron using the training data.

        Args:
            X (ndarray): Input features of shape (num_samples, num_features).
            y (ndarray): Target labels of shape (num_samples,).

        Raises:
            ValueError: If the dimensions of X and y are inconsistent.
        """
        if X.shape[0] != y.shape[0]:
            raise ValueError("Dimensions of X and y are inconsistent.")

        for _ in range(self.num_epochs):
            for x, target in zip(X, y):
                prediction = self.predict(x)
                self.weights += self.learning_rate * (target - prediction) * np.insert(x, 0, 1)

# Usage example with logical OR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # Input features for logical OR
y = np.array([0, 1, 1, 1])  # Target labels for logical OR

perceptron = Perceptron(num_features=2, learning_rate=0.1, num_epochs=100)
perceptron.train(X, y)

# Test the trained perceptron with logical OR inputs
test_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
for x in test_data:
    prediction = perceptron.predict(x)
    print(f"Input: {x}, Prediction: {prediction}")
```

Output:

```
Input: [0 0], Prediction: 0
Input: [0 1], Prediction: 1
Input: [1 0], Prediction: 1
Input: [1 1], Prediction: 1
```
