# Simple Perceptron Implementation
_by Mihai Dan Nadăș (mihai.nadas@ubbcluj.ro), January 2025_

This notebook implements a version of the perceptron introduced in Frank Rosenblatt's 1958 paper, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain".

We will use basic math concepts, avoiding linear algebra (i.e., working with vectors and matrices) to the extent possible.

## Objective

The goal is to train a model with two weights, $w_{1}$ and $w_{2}$, one for each coordinate of an input point $(x_{1},\ x_{2})$, and one bias, $b$. The model adapts the algebraic equation $y = mx + c$, commonly known as the slope-intercept form of a line.

The model will handle a simple classification task on a linearly separable dataset. Each data point is $(x, f(x))$, labeled Class 0 when $x$ is even and Class 1 when $x$ is odd, where $f$ is the following function:

$$
f: \mathbb{N} \to \mathbb{N}, \quad f(x) =
\begin{cases}
x & \text{if } x \bmod 2 = 0, \\
2x & \text{if } x \bmod 2 = 1.
\end{cases}
$$

## Dataset

We will first generate a dataset using the Python Standard Library.

In [None]:
```python
import random

def generate_dataset(num_items=20, start=0, stop=100):
    random.seed(42)
    dataset = set()
    
    while len(dataset) < num_items:
        x1 = random.randint(start, stop)
        if x1 % 2 == 0:
            x2 = x1
            y = 0  # Label as Class 0 if x1 is even
        else:
            x2 = 2 * x1
            y = 1  # Label as Class 1 if x1 is odd
        dataset.add((x1, x2, y))
    
    return list(dataset)

dataset = generate_dataset()

# Split the dataset into training and test sets
train_ratio = 0.8
num_train = int(len(dataset) * train_ratio)
dataset_train, dataset_test = dataset[:num_train], dataset[num_train:]
print(f"Training set (n={len(dataset_train)}): {dataset_train}")
print(f"Test set (n={len(dataset_test)}): {dataset_test}")
```

## Visual Representation

Using _Matplotlib_ and _pandas_, we will visually represent the training and test datasets.

In [None]:
```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_datasets(train_dataset, test_dataset):
    # Combine datasets into a DataFrame for easier handling
    train_df = pd.DataFrame(train_dataset, columns=["x1", "x2", "class"])
    train_df["set"] = "Train"

    test_df = pd.DataFrame(test_dataset, columns=["x1", "x2", "class"])
    test_df["set"] = "Test"

    combined_df = pd.concat([train_df, test_df], ignore_index=True)

    # Define colors and markers
    colors = {0: "blue", 1: "red"}
    markers = {"Train": "o", "Test": "x"}

    # Plot each group using Matplotlib
    fig, ax = plt.subplots()
    for (dataset, cls), group in combined_df.groupby(["set", "class"]):
        ax.scatter(
            group["x1"],
            group["x2"],
            color=colors[cls],
            label=f"{dataset} Dataset - Class {cls}",
            s=40,
            marker=markers[dataset],
            edgecolors='k'  # Add edge color for better visibility
        )

    # Manage legend and labels
    handles, labels = ax.get_legend_handles_labels()
    by_label = dict(zip(labels, handles))
    ax.legend(by_label.values(), by_label.keys(), title="Dataset and Class", loc="best")
    ax.set_xlabel("x1", fontsize=12)
    ax.set_ylabel("x2", fontsize=12)
    ax.set_title("Training and Test Datasets", fontsize=14)
    ax.grid(True)

    # Enhance layout
    plt.tight_layout()


plot_datasets(dataset_train, dataset_test)
plt.show()
```

## Defining a Linear Classifier

With our dataset prepared, we now turn to the mathematical foundation that enables our model to classify an input $(x_{1}, x_{2})$ as belonging to class $0$ or $1$, as follows:

$$
c: \mathbb{N}^2 \to \{0,1\}, \quad
c(x_{1},x_{2}) =
\begin{cases} 
1, & \text{if } (x_{1},x_{2}) \in \text{Class 1}, \\
0, & \text{if } (x_{1},x_{2}) \in \text{Class 0}.
\end{cases}
$$

This classification can be achieved by using the algebraic representation of a line in a Cartesian coordinate system, described by:

$$
z(x_{1}, x_{2}) = w_{1}x_{1} + w_{2}x_{2} + b,
$$

where:
- $w_{1}$ and $w_{2}$ are the weights; together they set the orientation of the line $z = 0$, whose slope is $-w_{1}/w_{2}$ when $w_{2} \neq 0$,
- $b$ is the bias; it shifts the line, which crosses the $x_{2}$-axis at $-b/w_{2}$.
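
To make the link with the slope-intercept form $y = mx + c$ explicit, the boundary can be written by setting $z = 0$ and solving for $x_{2}$, assuming $w_{2} \neq 0$:

$$
w_{1}x_{1} + w_{2}x_{2} + b = 0
\quad \Longrightarrow \quad
x_{2} = -\frac{w_{1}}{w_{2}}\,x_{1} - \frac{b}{w_{2}}.
$$

For instance, the values $w_{1}=-1$, $w_{2}=1$, $b=-1$ (picked here purely for illustration) give the line $x_{2} = x_{1} + 1$.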

From the plotted graph above, it becomes clear that the two classes are linearly separable. This makes the use of a linear separation boundary, trained using Rosenblatt's Perceptron algorithm, appropriate.
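
The separability claim can also be checked numerically. The sketch below is an illustration rather than part of the original notebook: it assumes a point is assigned to Class 1 when $z \geq 0$ and to Class 0 otherwise, and the helper names `make_point` and `separates` are hypothetical. It tests a candidate line against every input from 0 to 100, the range used by `generate_dataset`.

```python
def make_point(x):
    """Return (x1, x2, label) following f: an even x maps to x, an odd x to 2 * x."""
    if x % 2 == 0:
        return x, x, 0
    return x, 2 * x, 1


def separates(w1, w2, b, start=0, stop=100):
    """Return True if the line w1*x1 + w2*x2 + b = 0 classifies every point correctly."""
    for x in range(start, stop + 1):
        x1, x2, label = make_point(x)
        # Assumed decision rule: Class 1 when z >= 0, Class 0 otherwise
        predicted = 1 if w1 * x1 + w2 * x2 + b >= 0 else 0
        if predicted != label:
            return False
    return True


# Hand-picked candidate line: z = -x1 + x2 - 1
print(separates(-1, 1, -1))  # prints True
```

For this candidate, every even input gives $z = -1$ and every odd input gives $z = x_{1} - 1 \geq 0$, so at least one separating line exists over the whole range.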

As an illustration, the next cell plots the line defined by $w_{1}=-1.5$, $w_{2}=1.1$, and $b=-10$ on top of the previous graph.

In [None]:
```python
def plot_decision_boundary(w1, w2, c, dataset_train, dataset_test):
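    # The boundary is the set of points where w1*x1 + w2*x2 + c = 0.
    # Solving for x2 requires w2 != 0; a vertical boundary is not handled here.
    if w2 == 0:
        raise ValueError("w2 must be non-zero to plot x2 as a function of x1")

    # Evaluate the boundary over the same x1 range used to generate the dataset
    x1_values = range(0, 101)
    x2_values = [(-w1 * x1 - c) / w2 for x1 in x1_values]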
    
    # Plot the datasets
    plot_datasets(dataset_train, dataset_test)
    
    # Plot the decision boundary line
    plt.plot(x1_values, x2_values, label=f"{w1:.2f}x1 + {w2:.2f}x2 + {c:.2f} = 0", linestyle='--', color='green')
    
    # Enhancements: Add title, labels, and grid for better clarity
    plt.title("Plot of Datasets with Decision Boundary")
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.grid(True)
    plt.legend(loc="best")

# Plot the dataset together with a hand-picked decision boundary
plot_decision_boundary(-1.5, 1.1, -10, dataset_train, dataset_test)
plt.show()
```

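With $w_{1}=-1.5$, $w_{2}=1.1$, and a bias of $-10$, the line separates the two classes over most of the range. It is not perfect: an odd $x_{1}$ produces the point $(x_{1}, 2x_{1})$, for which $z = 0.7x_{1} - 10$, so any odd $x_{1} \leq 13$ falls on the Class 0 side. Other configurations do considerably worse. For example, with $w_{1}=0.1$, $w_{2}=0.1$, and a bias of $0.5$ (the argument `c` in the code), $z$ is positive for every point with non-negative coordinates, so the boundary does not separate the classes at all.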

In [None]:
```python
plot_decision_boundary(0.1, 0.1, 0.5, dataset_train, dataset_test)
plt.show()
# With these values the line x2 = -x1 - 5 lies entirely below the plotted data,
# so every point sits on its positive side and is assigned to Class 1.
# All Class 0 points are therefore misclassified, which shows why the weights and bias matter.
```

## Evaluating the Performance of the Classifier

Having defined $z = w_{1}x_{1} + w_{2}x_{2} + b$ (the bias is named `c` in the code) and seen how the line $z = 0$ behaves for some hand-picked values, we can assess the classifier quantitatively using the _accuracy_ metric.
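
Accuracy here means the fraction of samples whose predicted class matches the label. Writing $\hat{y}_{i}$ for the predicted class of sample $i$, $y_{i}$ for its label, and $n$ for the number of samples (notation introduced only for this formula):

$$
\text{accuracy} = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}\left[\hat{y}_{i} = y_{i}\right].
$$

For instance, 3 correct predictions out of 4 samples give an accuracy of $0.75$.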

### Defining the Classifier

Before we evaluate our classifier's performance, let's formally define it:

$$
c: \mathbb{N}^2 \to \{0, 1\}, \quad
c(x_{1}, x_{2}) =
\begin{cases} 
1, & \text{if } z(x_{1}, x_{2}) \geq 0, \\
0, & \text{if } z(x_{1}, x_{2}) < 0.
\end{cases}
$$

In other words, an input $(x_{1}, x_{2})$ is classified as $1$ when $z(x_{1}, x_{2}) \geq 0$, that is, when it lies on the decision boundary or on its positive side; otherwise it is classified as $0$.
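
As a worked example, take the hand-picked values $w_{1}=-1.5$, $w_{2}=1.1$, and a bias of $-10$ used in this notebook's code cells, and two points produced by $f$: $(14, 14)$ from the even input $14$ and $(81, 162)$ from the odd input $81$.

$$
\begin{aligned}
z(14, 14) &= -1.5 \cdot 14 + 1.1 \cdot 14 - 10 = -15.6 < 0 &&\Rightarrow\ c(14, 14) = 0, \\
z(81, 162) &= -1.5 \cdot 81 + 1.1 \cdot 162 - 10 = 46.7 \geq 0 &&\Rightarrow\ c(81, 162) = 1.
\end{aligned}
$$

Both results agree with the labels assigned by parity.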

Let's implement this classifier in code and continue to discuss methods for evaluating its performance.

In [None]:
```python
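def zx(x1, x2, w1, w2, c):
    # Weighted sum z = w1*x1 + w2*x2 + c, where c is the bias
    return w1 * x1 + w2 * x2 + c


def cx(x1, x2, w1, w2, c):
    # Threshold z at zero: class 1 if z >= 0, otherwise class 0
    return 1 if zx(x1, x2, w1, w2, c) >= 0 else 0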

def accuracy(dataset, w1, w2, c):
    correct = sum(y == cx(x1, x2, w1, w2, c) for x1, x2, y in dataset)
    accuracy_percent = correct / len(dataset) * 100
    print(f"Accuracy on dataset: {correct}/{len(dataset)} correct, {accuracy_percent:.2f}% accuracy")
    return accuracy_percent / 100

# Calculate and print accuracy on the training and test sets for both hand-picked configurations
train_accuracy = accuracy(dataset_train, -1.5, 1.1, -10)
test_accuracy = accuracy(dataset_test, -1.5, 1.1, -10)
print(f"Training set accuracy: {train_accuracy:.2f}")
print(f"Test set accuracy: {test_accuracy:.2f}")

train_accuracy = accuracy(dataset_train, 0.1, 0.1, 0.5)
test_accuracy = accuracy(dataset_test, 0.1, 0.1, 0.5)
print(f"Training set accuracy: {train_accuracy:.2f}")
print(f"Test set accuracy: {test_accuracy:.2f}")
```

### Discussion on Accuracy

As the two runs above show, different weights and bias values yield different accuracies. Each combination defines a different decision boundary, which may or may not split the dataset into the correct classes. The challenge is to find values that classify the data well. This is done through a process known as _model training_.

## Model Training

We now train the model by iteratively refining the weights and bias. Starting from zero, the parameters are adjusted after every training sample, and the pass over the training set is repeated for a fixed number of epochs. The objective is to raise the accuracy to an acceptable level by finding parameters that fit the dataset.
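
The training loop in the next cell applies the perceptron update rule after each sample. Written out, with $\eta$ for the learning rate, $y$ for the true label, and $\hat{y}$ for the predicted class (the symbols $\eta$ and $\hat{y}$ are notation chosen here; the bias $b$ appears as `c` in the code):

$$
\begin{aligned}
w_{1} &\leftarrow w_{1} + \eta\,(y - \hat{y})\,x_{1}, \\
w_{2} &\leftarrow w_{2} + \eta\,(y - \hat{y})\,x_{2}, \\
b &\leftarrow b + \eta\,(y - \hat{y}).
\end{aligned}
$$

Since $y - \hat{y} \in \{-1, 0, 1\}$, a correct prediction leaves the parameters unchanged. A Class 1 point predicted as 0 raises $z$ for that point, and a Class 0 point predicted as 1 lowers it.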

In [None]:
```python
# First, let's initialize the weights and bias to zero
w1, w2, c = 0, 0, 0

# Let's define the learning rate
learning_rate = 0.1

# Define the number of epochs
num_epochs = 100

# Create a list to store the details of the epochs
epoch_details = []

# Start the training loop
for epoch in range(num_epochs):
    print(f"Epoch {epoch + 1}")
    for x1, x2, y in dataset_train:
        z = w1 * x1 + w2 * x2 + c  # weighted sum for the current sample
        y_hat = 1 if z >= 0 else 0
        w1 += learning_rate * (y - y_hat) * x1
        w2 += learning_rate * (y - y_hat) * x2
        c += learning_rate * (y - y_hat)
        print(f"  x1={x1}, x2={x2}, y={y}, z={z:.2f}, y_hat={y_hat}, w1={w1:.2f}, w2={w2:.2f}, c={c:.2f}")
        # Append the details to the list
        epoch_details.append({
            "epoch": epoch + 1,
            "x1": x1,
            "x2": x2,
            "y": y,
            "z": z,
            "y_hat": y_hat,
            "w1": w1,
            "w2": w2,
            "c": c
        })

# Create a DataFrame from the collected details
epoch_details_df = pd.DataFrame(epoch_details)

# Display the first few rows of the DataFrame
epoch_details_df.head()
```
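
The same update can be packaged as a reusable function. The following is a minimal sketch under stated assumptions, not the notebook's own method: the name `train_perceptron`, the early stop once an epoch produces no mistakes, and the four-sample toy dataset are all introduced here for illustration.

```python
def train_perceptron(samples, learning_rate=0.1, max_epochs=100):
    """Train on (x1, x2, y) samples; return (w1, w2, c, epochs_run)."""
    w1, w2, c = 0.0, 0.0, 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x1, x2, y in samples:
            y_hat = 1 if w1 * x1 + w2 * x2 + c >= 0 else 0
            error = y - y_hat  # -1, 0, or 1
            if error != 0:
                mistakes += 1
                w1 += learning_rate * error * x1
                w2 += learning_rate * error * x2
                c += learning_rate * error
        # Assumption for this sketch: stop as soon as a full pass makes no mistakes
        if mistakes == 0:
            return w1, w2, c, epoch
    return w1, w2, c, max_epochs


# Toy samples following f: even x1 -> (x1, x1, 0), odd x1 -> (x1, 2 * x1, 1)
toy_samples = [(2, 2, 0), (4, 4, 0), (3, 6, 1), (5, 10, 1)]
w1, w2, c, epochs_run = train_perceptron(toy_samples)
print(f"w1={w1:.2f}, w2={w2:.2f}, c={c:.2f}, epochs={epochs_run}")
```

Stopping early avoids running the remaining epochs once the training set is classified without error. A call such as `train_perceptron(dataset_train)` would apply the same loop to the notebook's training set.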