# Perceptron

The perceptron, a precursor to the multilayer perceptron, is a single-neuron binary classifier defined as

$\hat{y} = f(w_1x_1 + w_2x_2 + \dots + w_nx_n + b)$

where $w$ is the weight vector, $x$ is the input vector, $b$ is the bias, and $f$ is the activation function. For a perceptron, $f$ is the step function.
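Written out, and assuming the same thresholding convention as the code in this section (output 1 when the input is at least 0), the step function and the equivalent vector form of the definition are:

$f(z) = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{otherwise} \end{cases}$

$\hat{y} = f(w^\top x + b)$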

Example:

In [68]:
import numpy as np

x = np.array([0.1, 0.9])
w = np.array([0.8, 0.2])
b = 0.5

z = w @ x + b
print("z:", z)

def f(z):
    return int(z >= 0)

print("y_hat:", f(z))


z: 0.76
y_hat: 1


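The weights and the bias are learned from the training data. For each training example with true label $y$ and prediction $\hat{y}$, they are updated using the learning rate $\eta$ as follows: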

$w := w + \eta(y-\hat{y})x$

$b := b + \eta(y-\hat{y})$

In [6]:
w = np.array([0.8, 0.2])
b = 0.5

x = np.array([1, 2])
y = 1        # true label
y_hat = 0    # assumed misprediction, set by hand to illustrate the update
eta = 0.1    # learning rate

w_new = w + eta * (y - y_hat) * x
b_new = b + eta * (y - y_hat)
print("Updated weights:", w_new)
print("Updated bias:", b_new)

Updated weights: [0.9 0.4]
Updated bias: 0.6
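The error term $y - \hat{y}$ can only be 0, +1, or -1, which decides whether the update does nothing, moves the weights toward $x$, or moves them away from $x$. The following is a minimal sketch of all three cases; `perceptron_update` is a hypothetical helper name introduced here for illustration, and the inputs reuse the values from the cell above.

```python
import numpy as np

def perceptron_update(w, b, x, y, y_hat, eta):
    """Apply the perceptron update rule once and return the new (w, b)."""
    error = y - y_hat  # 0 if correct, +1 or -1 if wrong
    return w + eta * error * x, b + eta * error

w = np.array([0.8, 0.2])
b = 0.5
x = np.array([1.0, 2.0])
eta = 0.1

# (target, pred) pairs: correct, predicted too low, predicted too high
for target, pred in [(1, 1), (1, 0), (0, 1)]:
    w_new, b_new = perceptron_update(w, b, x, target, pred, eta)
    print(f"y={target}, y_hat={pred} -> w={w_new}, b={b_new}")
```

A correct prediction leaves the parameters unchanged, so training only moves the decision boundary on misclassified examples.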


#### Perceptron from scratch:

In [None]:
class Perceptron:
    def __init__(self, learning_rate=0.1, epochs=10):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.weights = None
        self.bias = None

    def activation(self, z):
        return np.where(z >= 0, 1, 0)

    def fit(self, X, y):
        n_samples, n_features = X.shape

        # Initialize weights and bias to zero
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        # Training loop
        for epoch in range(self.epochs):
            for i in range(n_samples):
                # Linear combination and prediction
                z = np.dot(X[i], self.weights) + self.bias
                y_pred = self.activation(z)

                # Perceptron update rule
                error = y[i] - y_pred
                self.weights += self.learning_rate * error * X[i]
                self.bias += self.learning_rate * error

    def predict(self, X):
        z = np.dot(X, self.weights) + self.bias
        return self.activation(z)


In [None]:
X = np.array([
    [1, 1],
    [1, 0],
    [0, 1],
    [0, 0]
])

y = np.array([1, 1, 1, 0]) # OR logic gate

model = Perceptron(learning_rate=0.01, epochs=10)
model.fit(X, y)
predictions = model.predict(X)
print("Predictions:", predictions)

Predictions: [1 1 1 0]


#### Limitation:

The perceptron cannot learn non-linear decision boundaries. XOR is the classic example:

In [69]:
y = np.array([0, 1, 1, 0]) # XOR is not linearly separable, so training should fail

model = Perceptron(learning_rate=0.1, epochs=50)
model.fit(X, y)
predictions = model.predict(X)
print("Predictions:", predictions)

Predictions: [1 1 1 1]
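To see why no choice of weights can work, here is a short derivation, assuming the step function outputs 1 exactly when $z \ge 0$ (as in the `activation` method above). Fitting XOR would require all four of the following to hold:

$(0,0) \mapsto 0: \quad b < 0$

$(1,0) \mapsto 1: \quad w_1 + b \ge 0$

$(0,1) \mapsto 1: \quad w_2 + b \ge 0$

$(1,1) \mapsto 0: \quad w_1 + w_2 + b < 0$

Adding the second and third conditions gives $w_1 + w_2 + 2b \ge 0$, so $w_1 + w_2 + b \ge -b > 0$ by the first condition. This contradicts the fourth, so no single line separates the two XOR classes, regardless of the learning rate or the number of epochs.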


A single-layer perceptron can only handle linearly separable problems, which makes it too limited for most real tasks. To model more complex, nonlinear patterns, we add hidden layers and use smooth activation functions instead of a hard threshold. This leads to the multilayer perceptron (MLP), where stacked layers learn richer representations and can be trained efficiently using backpropagation.
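As a rough illustration of why a hidden layer helps, the sketch below wires two perceptron units into a hidden layer and feeds them into a third. It is a simplification: the weights are hand-picked rather than learned by backpropagation, and it keeps the hard step activation instead of a smooth one, so it only shows that XOR becomes representable once a hidden layer is added. The names `W_hidden`, `b_hidden`, `w_out`, and `b_out` are introduced here for illustration.

```python
import numpy as np

def step(z):
    # Same hard threshold as the perceptron: 1 if z >= 0, else 0
    return np.where(z >= 0, 1, 0)

X_xor = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])

# Hidden layer: two perceptron units with hand-picked (not learned) weights.
# Column 0 computes OR of the inputs, column 1 computes NAND.
W_hidden = np.array([[1.0, -1.0],
                     [1.0, -1.0]])
b_hidden = np.array([-0.5, 1.5])

# Output unit: AND of the two hidden units, which yields XOR.
w_out = np.array([1.0, 1.0])
b_out = -1.5

hidden = step(X_xor @ W_hidden + b_hidden)
xor_pred = step(hidden @ w_out + b_out)
print("Predictions:", xor_pred)  # Predictions: [0 1 1 0]
```

The hidden units map the four inputs to a new representation in which the two classes are linearly separable, which is the role the hidden layers of an MLP play when their weights are learned from data.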