Homework: Implementing a Perceptron

In this task, you will implement a simple perceptron model from scratch in Python. The goal is to give you hands-on experience with how a perceptron works and how it can be used for binary classification.

Instructions:

1. Define a function called `perceptron` that takes in three parameters:
    - `X`: a numpy array of shape `(n_samples, n_features)` representing the input data.
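    - `y`: a numpy array of shape `(n_samples,)` representing the target labels (0 or 1; the XOR data-generation code in the Data section produces -1 and 1 instead, so use a step function whose outputs match your labels).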
    - `eta`: the learning rate for updating the weights.

2. Initialize the weight vector `w` with random values; it should have shape `(n_features,)`.

3. Define a for loop that iterates over a specified number of epochs (e.g., 100). Within each epoch, iterate over all the training samples and do the following:
    - Compute the predicted output `y_pred` for sample `i` by taking the dot product of the input vector `X[i]` with the weight vector `w` and passing the result through a step function (e.g., the Heaviside function).
    - Compute the error `err` as the difference between the true label `y[i]` and the predicted output `y_pred`.
    - Update the weight vector `w` using the following formula: `w = w + eta * err * X[i]`

4. After all epochs have been completed, return the learned weight vector `w`.

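5. Test your perceptron implementation on a simple binary classification problem, such as the XOR problem. Generate random data points and labels for the XOR problem and train your perceptron on this data. Print the learned weights and test the perceptron on new data points to see how well it classifies them. Note that XOR is not linearly separable, so a single perceptron cannot classify it perfectly; make sure training stops after a fixed number of epochs.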
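
The prediction, error, and update in step 3 can be sketched for a single sample as follows. This is only a minimal illustration, assuming 0/1 labels and a Heaviside step; the helper names `heaviside` and `update_once` and the example numbers are hypothetical and not part of the assignment.

```python
import numpy as np

def heaviside(z):
    """Step function: returns 1 when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0

def update_once(w, x_i, y_i, eta):
    """Apply the perceptron update rule for one sample (x_i, y_i)."""
    y_pred = heaviside(np.dot(x_i, w))   # predicted label in {0, 1}
    err = y_i - y_pred                   # 0 if correct, +1 or -1 otherwise
    return w + eta * err * x_i           # w is unchanged when err == 0

# Hypothetical numbers chosen so the first prediction is wrong
w = np.array([0.5, -0.5])
x_i = np.array([1.0, 2.0])
w_new = update_once(w, x_i, y_i=1, eta=0.1)
print("Updated weights:", w_new)
```

Here the weighted sum is negative, so the prediction is 0 while the label is 1; the weights therefore move toward `x_i`, to roughly `[0.6, -0.3]`. A correctly classified sample leaves `w` unchanged.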

Bonus:

- Modify the perceptron to implement the perceptron learning algorithm with a bias term.
- Modify the perceptron to implement the adaptive linear neuron (Adaline) algorithm.
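
As a starting point for the bonus items, the two variants might look like the sketch below. It assumes labels in {-1, 1}; the function names `perceptron_with_bias` and `adaline`, the `seed` parameter, the default hyperparameters, and the toy data are all assumptions made for illustration. The Adaline version uses batch gradient descent on the mean squared error of the linear output, which is one common formulation rather than the only one.

```python
import numpy as np

def perceptron_with_bias(X, y, eta=0.1, epochs=100, seed=0):
    """Perceptron with a separate bias term b; labels must be -1 or 1."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x_i, y_i in zip(X, y):
            y_pred = 1 if np.dot(x_i, w) + b >= 0 else -1
            if y_pred != y_i:
                # The bias is updated like a weight on a constant input of 1
                w = w + eta * (y_i - y_pred) * x_i
                b = b + eta * (y_i - y_pred)
                errors += 1
        if errors == 0:  # every sample classified correctly
            break
    return w, b

def adaline(X, y, eta=0.01, epochs=100):
    """Adaline: the error is computed on the linear output, before thresholding."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        output = X @ w + b                      # linear activation, no step function
        err = y - output                        # continuous-valued error
        w = w + eta * (X.T @ err) / len(y)      # gradient step on the mean squared error
        b = b + eta * err.mean()
    return w, b

# Hypothetical, linearly separable toy data
X_toy = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y_toy = np.array([1, 1, -1, -1])

w_p, b_p = perceptron_with_bias(X_toy, y_toy)
w_a, b_a = adaline(X_toy, y_toy)
print("Perceptron:", w_p, b_p)
print("Adaline:", w_a, b_a)
```

The main difference is where the error is measured: the perceptron updates only on misclassified samples, whereas Adaline updates on every pass using the unthresholded output.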

Deliverables:

- A Jupyter notebook or Python script that implements the perceptron from scratch and solves the XOR problem.
- A brief report explaining your implementation and results.

Data:

You can generate the XOR problem data using the following code:
--------------------------------------------
import numpy as np

np.random.seed(0)
X = np.random.randn(100, 2)

y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

y = np.where(y, 1, -1)

--------------------------------------------
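
A quick sanity check of the generated data can be sketched as follows. It repeats the generation code above so that it runs on its own; the variable names `counts` and `same` are illustrative. The last check shows that the label is 1 exactly when the two coordinates have opposite signs, i.e. when their product is negative.

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(100, 2)
y = np.where(np.logical_xor(X[:, 0] > 0, X[:, 1] > 0), 1, -1)

# Shapes should be (100, 2) for the inputs and (100,) for the labels
print("Shapes:", X.shape, y.shape)

# Labels should contain only -1 and 1
labels, counts = np.unique(y, return_counts=True)
print("Label counts:", dict(zip(labels.tolist(), counts.tolist())))

# XOR of the signs is equivalent to a negative product of the coordinates
same = np.array_equal(y, np.where(X[:, 0] * X[:, 1] < 0, 1, -1))
print("Label equals sign test on x1 * x2:", same)
```

Because the label depends on the product of the two features rather than on a weighted sum, no single straight line is expected to separate the two classes.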


In [6]:
# Define a perceptron function that takes X, y, a learning rate, and an epoch count

import numpy as np

def perceptron(X, y, eta=0.1, epochs=100):
    # Initialise the weight vector with random values (labels are expected in {-1, 1})
    n_samples, n_features = X.shape
    w = np.random.randn(n_features)

    # Run for at most `epochs` passes: an unbounded loop never terminates
    # on data that is not linearly separable, such as XOR
    for _ in range(epochs):
        # Count the misclassified points in this epoch
        errors = 0

        # Loop through each training example
        for i in range(n_samples):
            # Prediction: threshold the weighted sum at 0 (np.sign would return 0 at exactly 0)
            y_pred = 1 if np.dot(X[i], w) >= 0 else -1

            # Check if there's an error
            if y_pred != y[i]:
                # Update weights if the example is misclassified
                w = w + eta * (y[i] - y_pred) * X[i]
                errors += 1

        # Stop early if every point is correctly classified
        if errors == 0:
            break

    return w

In [8]:
## Generate XOR problem data

import numpy as np

np.random.seed(0)
X = np.random.randn(100, 2)

y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

# Map True/False to 1/-1 (apply this only once: a second pass would turn every label into 1)
y = np.where(y, 1, -1)

In [None]:
# Train the perceptron with a learning rate of 0.1
w = perceptron(X, y, eta=0.1)

In [None]:
# Print the learned weights
print("Learned weights:", w)

In [None]:
# Test the perceptron on new data points
test_points = np.array([[0.5, 0.5], [-0.5, -0.5], [0.5, -0.5], [-0.5, 0.5]])
for point in test_points:
    prediction = np.dot(point, w)
    prediction = 1 if prediction >= 0 else -1
    print(f"Test point {point} -> Prediction: {prediction}")

Steps followed:

* Initialization of weights: I first initialized the model's weights with random values. These weights determine how strongly each feature influences the model's output.

* Prediction: For each input sample, I computed the weighted sum of the features and applied a threshold (step) function to obtain the predicted class (-1 or 1, matching the labels in the XOR data).

* Error calculation: The error is the difference between the true value (label) and the perceptron’s prediction. If the prediction is incorrect, the error is used to adjust the model's weights.

* Weight update: I updated the weights based on the error, the learning rate, and the input data, following the perceptron update rule.

* Testing: After training, I tested the model on new data points to see what predictions the perceptron made.
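
The testing step can also be quantified by measuring accuracy over the whole dataset. The sketch below is an illustration only: the helpers `predict` and `accuracy` are hypothetical names, and `w_example` is an arbitrary weight vector standing in for the weights returned by `perceptron`.

```python
import numpy as np

def predict(X, w):
    """Return labels in {-1, 1} using the same threshold as the test cell."""
    return np.where(X @ w >= 0, 1, -1)

def accuracy(X, y, w):
    """Fraction of samples whose predicted label matches the true label."""
    return float(np.mean(predict(X, w) == y))

# Regenerate the XOR data so this example runs on its own
np.random.seed(0)
X = np.random.randn(100, 2)
y = np.where(np.logical_xor(X[:, 0] > 0, X[:, 1] > 0), 1, -1)

# Arbitrary placeholder weights; substitute the output of perceptron(X, y)
w_example = np.array([1.0, -1.0])
acc = accuracy(X, y, w_example)
print(f"Training accuracy: {acc:.2f}")
```

Since XOR is not linearly separable, an accuracy well below 100% is typically expected here regardless of which weight vector is used.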