# Perceptron
Notes from the CampusX Deep Learning playlist, lecture 4.

## Perceptrons: The Fundamental Building Block of Artificial Neural Networks

### 1. Introduction to Perceptrons
*   A **Perceptron is a fundamental building block of Artificial Neural Networks (ANNs)**.
*   It is an **algorithm** used for **supervised machine learning**.
*   Its design makes it a foundational component for Deep Learning.
*   It can also be understood as a **mathematical model or a mathematical function**.
*   Understanding single Perceptrons is crucial before moving to Multi-layer Perceptrons (MLPs).

### 2. Design and Working of a Perceptron
The Perceptron operates through a series of steps involving inputs, weights, a bias, a summation, and an activation function.

*   **Inputs (x1, x2, etc.)**: These are the features or data points provided to the Perceptron.
*   **Weights (w1, w2, etc.)**: Numerical values assigned to each input, representing the **strength of the connection** between that input and the Perceptron's processing unit. Weights also indicate **feature importance**: a weight with a larger magnitude means its input has more influence on the output (comparable across features only when they are on similar scales).
*   **Bias (b)**: An additional numerical value, often connected to an input of '1', which acts as an offset.
*   **Summation Block**: In this block, a **dot product** is calculated. Each input is multiplied by its corresponding weight, and these products are summed, along with the bias.
    *   The formula for the sum (let's call it `Z`) is: `Z = (w1 * x1) + (w2 * x2) + b` (for two inputs, `x1` and `x2`, where `b` is multiplied by an implicit '1').
    *   For multiple inputs (e.g., `x1, x2, x3`), the formula extends to `Z = (w1 * x1) + (w2 * x2) + (w3 * x3) + b`.
*   **Activation Function**: The calculated `Z` value is then passed through an **activation function**.
    *   **Purpose**: To bring the potentially unbounded `Z` value into a **given range** (e.g., -1 to 1, or 0 to 1).
    *   **Example: Step Function**: A common activation function where:
        *   If `Z` is greater than or equal to 0, the output is 1.
        *   If `Z` is less than 0, the output is 0.
    *   Other activation functions, such as ReLU, also exist.
*   **Output**: The final result from the activation function (e.g., 0 or 1) is the Perceptron's output.
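
The steps above (weighted sum, bias, step activation) can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation; the function names and the numeric values for `weights` and `bias` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def step(z):
    """Step activation from the notes: 1 if z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def perceptron_output(inputs, weights, bias):
    """Compute Z = w1*x1 + w2*x2 + ... + b, then apply the step function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return z, step(z)


# Hypothetical weights and bias, picked only for illustration.
weights = [0.5, -1.0]
bias = 0.25

z, y = perceptron_output([2.0, 1.0], weights, bias)
print(z, y)  # Z = 0.5*2.0 + (-1.0)*1.0 + 0.25 = 0.25, so the output is 1
```

Swapping `step` for another activation function changes only the last stage; the summation stays the same.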

### 3. Training and Prediction with a Perceptron
Like other machine learning algorithms, Perceptrons undergo two main stages: training and prediction.

*   **Training**:
    *   The **core objective of the training process is to calculate the optimal values for the weights (w) and bias (b)**.
    *   This is done by feeding the Perceptron **supervised training data** (e.g., student IQ, CGPA, and their known placement status).
    *   The Perceptron learns from this data to adjust its weights and bias so that its predictions align with the known outcomes.
*   **Prediction**:
    *   Once the Perceptron has been trained and the values of its weights and bias are determined, it can be used to **make predictions for new, unseen data**.
    *   For a new input (e.g., a student's IQ and CGPA), these values are passed through the Perceptron's design (summation, activation function) using the learned weights and bias to produce a predicted output (e.g., whether the student will be placed or not).
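
The notes do not say how the weights and bias are adjusted during training. One common choice is the classic perceptron update rule: whenever a training point is misclassified, nudge the weights and bias in the direction of the error. The sketch below assumes that rule; `train_perceptron`, `lr`, and `epochs` are hypothetical names, and the AND-gate data is a toy stand-in for a real dataset such as the IQ/CGPA example.

```python
def predict(x, weights, bias):
    """Prediction stage: weighted sum plus bias, then step activation."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0


def train_perceptron(X, y, lr=1.0, epochs=100):
    """Training stage, assuming the classic perceptron update rule."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, y):
            error = target - predict(x, weights, bias)  # -1, 0, or +1
            if error != 0:
                # Move the boundary toward classifying this point correctly.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                errors += 1
        if errors == 0:  # every training point is classified correctly
            break
    return weights, bias


# Toy, linearly separable data: the logical AND of two binary inputs.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

weights, bias = train_perceptron(X, y)
print([predict(x, weights, bias) for x in X])  # [0, 0, 0, 1]
```

After training, `predict` can be called on inputs that were not in the training data, which is the prediction stage described above.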

### 4. Perceptron vs. Biological Neuron
<img src='https://miro.medium.com/v2/resize:fit:1400/1*VbECYuxPu5CzEWzitCfRDA@2x.jpeg' alt='Neuron and perceptron image'>
Deep Learning is often said to be inspired by the human nervous system. While there are similarities between a Perceptron and a biological neuron, it's crucial to understand that they are **not identical copies**.

*   **Similarities (Weak Inspiration)**:
    *   Both are considered **building blocks**: Neurons are the building blocks of the nervous system, and Perceptrons are building blocks of ANNs.
    *   **Input/Output Flow**: The inputs to a Perceptron can be roughly compared to a neuron's **dendrites** (which receive input). The Perceptron's summation and activation blocks can be compared to the neuron's **cell body**, which contains the nucleus (where incoming signals are combined), and the Perceptron's output to the neuron's **axon** (which transmits output).
    *   **Network Formation**: Many neurons connect to form a nervous system; similarly, many Perceptrons can form a neural network.

*   **Differences (Perceptron is not a true copy)**:
    *   **Complexity**: Biological neurons are **highly complex** and deeply interconnected in ways that are far more intricate than a simple Perceptron. Perceptrons are comparatively much simpler.
    *   **Internal Processing**: The internal processing within a neuron involves complex electrochemical reactions and is **not fully understood**. In contrast, a Perceptron's internal processing is a **simple, known mathematical function** (summation and activation).
    *   **Neuroplasticity**: Biological neurons exhibit **neuroplasticity**: their connections (for example, dendrite thickness) can strengthen, weaken, or form anew over time, especially during learning. In a Perceptron, the weights are adjusted only during training; once training ends, the **connections (weights) stay fixed**, and no new connections are created.

### 5. Geometric Intuition and Limitations
Visualising how a Perceptron works geometrically helps in understanding its function.

*   **Decision Boundary**: A Perceptron's calculation (`Z = w1*x1 + w2*x2 + b`) essentially defines a boundary.
    *   In a **2D space** (with two input features like IQ and CGPA), the equation `w1*x1 + w2*x2 + b = 0` represents a **straight line**.
    *   This line **divides the data into two regions**. Points on one side of the line fall into one class (e.g., placed), and points on the other side fall into the second class (e.g., not placed).
    *   Hence, a Perceptron acts as a **binary classifier**, separating data into two classes by creating distinct regions.
    *   In a **3D space** (with three input features), the decision boundary is a **plane**.
    *   For **4D or higher dimensions**, the decision boundary is a **hyperplane**.
    *   The fundamental idea remains the same: the Perceptron **always divides the data into two parts or regions**.

*   **Major Limitation**:
    *   The most significant limitation of Perceptrons is that they can **only classify data well when it is linearly separable**, or close to it.
    *   This means the data points for different classes must be able to be separated by a single straight line (in 2D), a plane (in 3D), or a hyperplane (in higher dimensions).
    *   Perceptrons **fail when faced with completely non-linear datasets** (i.e., data that cannot be cleanly separated by a single line or plane). This inherent limitation significantly restricted the power and widespread use of single Perceptrons, leading to the development of more complex models like Multi-layer Perceptrons.
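
Both ideas in this section can be checked on a toy example. The sketch below first rewrites the boundary `w1*x1 + w2*x2 + b = 0` as a line `x2 = slope*x1 + intercept`, then searches a coarse grid of weights for a line that separates two tiny datasets: AND (linearly separable) and XOR (a standard example that is not). All function names and the grid are hypothetical. A grid search that finds nothing is not a proof in general, but for XOR it is known that no separating line exists.

```python
from itertools import product


def boundary_line(w1, w2, b):
    """Rewrite w1*x1 + w2*x2 + b = 0 as x2 = slope*x1 + intercept (needs w2 != 0)."""
    return -w1 / w2, -b / w2


def classify(point, w1, w2, b):
    """Class 1 on or above the boundary (Z >= 0), class 0 on the other side."""
    return 1 if w1 * point[0] + w2 * point[1] + b >= 0 else 0


def find_separator(points, labels, grid):
    """Return some (w1, w2, b) from the grid that classifies every point, else None."""
    for w1, w2, b in product(grid, repeat=3):
        if all(classify(p, w1, w2, b) == t for p, t in zip(points, labels)):
            return (w1, w2, b)
    return None


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
grid = [i / 2 for i in range(-6, 7)]  # -3.0, -2.5, ..., 3.0

and_fit = find_separator(points, [0, 0, 0, 1], grid)  # linearly separable
xor_fit = find_separator(points, [0, 1, 1, 0], grid)  # not linearly separable

print(and_fit is not None)             # True: at least one line separates AND
print(xor_fit)                         # None: no line separates XOR
print(boundary_line(1.0, 1.0, -1.5))   # (-1.0, 1.5), i.e. x2 = -x1 + 1.5
```

The line `x2 = -x1 + 1.5` is one valid boundary for AND: only the point `(1, 1)` lies on its positive side.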

### 6. Practical Application Example
This section applies a Perceptron using `scikit-learn`.

*   A dataset of students' CGPA, placement exam marks, and placement status is used; the example assumes the data is linearly separable.
*   The `sklearn.linear_model.Perceptron` class is imported and used to create a Perceptron object.
*   The `.fit()` method is called to train the model, which learns values for `w1`, `w2`, and `b`; these are exposed afterwards as `coef_` and `intercept_`.

#### Code example

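```python
import numpy as np
import pandas as pd

# Load the placement dataset: CGPA, placement exam marks, and the placed label
df = pd.read_csv('https://raw.githubusercontent.com/pankaj-2708/Machine-Learning/refs/heads/main/Datasets/placement.csv')

# Preview the first five rows
df.head()
```

|   | cgpa | placement_exam_marks | placed |
|---|------|----------------------|--------|
| 0 | 7.19 | 26.0 | 1 |
| 1 | 7.46 | 38.0 | 1 |
| 2 | 7.54 | 40.0 | 1 |
| 3 | 6.42 | 8.0 | 1 |
| 4 | 7.23 | 17.0 | 0 |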


```python
from sklearn.linear_model import Perceptron

# Features: every column except the label; target: the 'placed' column
X = df.drop(columns=['placed'])
y = df['placed']

pr = Perceptron()
pr.fit(X, y)  # the perceptron is now trained
```

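Parameters of the fitted `Perceptron` (defaults):

| Parameter | Value |
|-----------|-------|
| penalty | None |
| alpha | 0.0001 |
| l1_ratio | 0.15 |
| fit_intercept | True |
| max_iter | 1000 |
| tol | 0.001 |
| shuffle | True |
| verbose | 0 |
| eta0 | 1.0 |
| n_jobs | None |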

```python
pr.coef_       # learned weights, one per feature (w1, w2)
# array([[-25.65,  -8.  ]])

pr.intercept_  # learned bias (b)
# array([-15.])
```
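
To connect these numbers back to the formula `Z = w1*x1 + w2*x2 + b`, the sketch below plugs the printed `pr.coef_` and `pr.intercept_` values into a hand-written prediction for the first row of `df.head()`. The values are copied as displayed (so they may be rounded), `manual_predict` is a hypothetical helper, and the step function is the one from these notes, not a call into scikit-learn.

```python
# Values copied from the printed pr.coef_ and pr.intercept_ (possibly rounded).
weights = [-25.65, -8.0]  # w1 for cgpa, w2 for placement_exam_marks
bias = -15.0              # b


def manual_predict(cgpa, exam_marks):
    """Compute Z by hand and apply the step function from the notes."""
    z = weights[0] * cgpa + weights[1] * exam_marks + bias
    return z, (1 if z >= 0 else 0)


# First row of df.head(): cgpa = 7.19, placement_exam_marks = 26.0
z, label = manual_predict(7.19, 26.0)
print(z, label)  # Z is about -407.42, so the predicted class is 0
```

Because both weights and the bias are negative here, `Z` is negative for any non-negative inputs, so these particular values predict class 0 for every student. That is worth checking against the labels, for example with `pr.score`, before relying on the model.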