# **KogSys-ML-B Introduction to Machine Learning**
## **Perceptron**
---

To set up a new conda environment suitable for this notebook, you can use the following console commands:

```bash
conda create -y -n perc python=3.13
conda activate perc
python -m pip install -r requirements.txt
```

**Note**: Conda can take up a lot of disk space when you use many environments. Consider regularly deleting environments you no longer need and running ``conda clean --all`` to remove unused packages and cached files.

You can also install the requirements for this notebook into an existing environment by running the cell below:

In [1]:
# !python -m pip install -q -U -r requirements.txt

In [2]:
from __future__ import annotations

from typing import Any

import numpy as np
from sklearn.base import ClassifierMixin
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

np.random.seed(2025)

### **Implement a Perceptron**

The goal is to implement an efficient thresholded perceptron using ``numpy``. For today, we will only be relying on ``scikit-learn`` for the ``ClassifierMixin`` base class for our perceptron (and for data management).
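
To make "thresholded" and "efficient" a bit more concrete, here is a minimal sketch of a vectorised threshold that handles both a single sample and a set of samples. The helper name ``threshold`` and the toy numbers are made up for illustration, and the bias is omitted for brevity; it is not meant as the reference solution.

```python
import numpy as np

w = np.array([1.0, -1.0])                     # made-up weights, bias omitted

single = np.array([2.0, 0.5])                 # one sample, shape (2,)
batch = np.array([[2.0, 0.5], [0.0, 3.0]])    # two samples, shape (2, 2)


def threshold(X: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Return +1 where the weighted sum is positive, -1 otherwise (illustrative helper)."""
    net = np.atleast_1d(X @ w)                # a single sample yields a scalar; wrap it in an array
    return np.where(net > 0, 1, -1)


out_single = threshold(single, w)             # weighted sum 1.5 -> +1
out_batch = threshold(batch, w)               # weighted sums 1.5 and -3.0 -> +1, -1
```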

We want our Perceptron to do the following things (in order of importance):
1. Implement the Perceptron Training Rule and classify both _single and sets of samples_
2. Automatically adapt to any input we pass as training data, i.e. do not require a fixed number of input features on construction
3. Work with any class labels, i.e. no matter whether the classes are named ``0`` and ``1`` or "apple" and "pear"
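
As a reminder, the Perceptron Training Rule updates the weights after every sample via $w \leftarrow w + \eta \, (t - o) \, x$, where $t \in \{-1, +1\}$ is the target, $o$ the thresholded output and $\eta$ the learning rate. A single update step can be sketched as follows; the function name ``training_rule_step`` and the toy numbers are purely illustrative, and the bias is left out for brevity.

```python
import numpy as np


def training_rule_step(w: np.ndarray, x: np.ndarray, t: int, lr: float) -> np.ndarray:
    """Return the weights after one Perceptron Training Rule update (hypothetical helper)."""
    o = 1 if np.dot(w, x) > 0 else -1   # thresholded output in {-1, +1}
    return w + lr * (t - o) * x         # no change when o == t


w = np.array([0.5, -0.5])
x = np.array([1.0, 2.0])

# w·x = 0.5 - 1.0 = -0.5, so o = -1; with target t = +1 the weights move towards x
w_new = training_rule_step(w, x, t=1, lr=0.1)

# a correctly classified sample (t = -1) leaves the weights untouched
w_same = training_rule_step(w, x, t=-1, lr=0.1)
```

Because $(t - o)$ is either $0$ or $\pm 2$, only misclassified samples change the weights.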

#### **$b$ or $w_0$?**

You have to choose whether to implement the bias as a standalone additive term $b$ or as part of the weight vector ($w_0$). Both implementations have their advantages and drawbacks, and neither is "easier" than the other. The $w_0$ approach requires you to transform the data within some (maybe just one?) of the methods. The $b$ approach avoids that transformation, but requires a few additional lines of code which are easily forgotten.
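
Both options compute the same weighted sum. The following sketch checks this numerically on made-up data; all variable names and the random toy values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 toy samples with 3 features (made-up data)
w = rng.normal(size=3)        # weights
b = rng.normal()              # standalone bias

# b approach: the bias is added after the dot product
out_b = X @ w + b

# w0 approach: prepend a constant 1 to every sample and b to the weight vector
X_aug = np.concatenate([np.ones((X.shape[0], 1)), X], axis=-1)
w_aug = np.concatenate([[b], w])
out_w0 = X_aug @ w_aug

print(np.allclose(out_b, out_w0))
```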

For now: pick an option, and stick with it.

#### **``forward`` and ``predict``**

From our implementation of a Decision Tree ensemble in the third tutorial session, you know how the ``ClassifierMixin`` base class works, and that it requires the implementation of the ``predict`` method. For neural networks, it makes sense to add an intermediate method, the ``forward`` method. The ``forward`` method calculates the raw network output, whereas the ``predict`` method turns that output into a class prediction.

Note how this relates to the third point of our implementation goal: The predict method works as a translator, turning the $+1$ and $-1$ network outputs into a prediction for arbitrary class labels.
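
One possible way to do this translation is sketched below, assuming the two labels are taken from ``np.unique`` in sorted order. The label values and the pretend network outputs are made up for illustration.

```python
import numpy as np

labels = np.array(["pear", "apple", "apple", "pear"])   # made-up training labels
class_0, class_1 = np.unique(labels)                      # sorted: "apple", "pear"

# encode labels as network targets: class_0 -> -1, class_1 -> +1
targets = np.where(labels == class_0, -1, 1)

# decode raw network outputs back into the original labels
outputs = np.array([1, -1, 1, 1])                         # pretend forward() results
predictions = np.where(outputs == 1, class_1, class_0)
```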

In [3]:
class Perceptron(ClassifierMixin):
    """
    Class implementing a thresholded Perceptron using the w0 approach
    """

    def __init__(self):
        """ """

        super().__init__()

        self.__w: np.ndarray = np.random.randn(0)
        self.__class_0: Any = 0
        self.__class_1: Any = 1

    @property
    def w(self) -> np.ndarray:
        return self.__w

    def __init_weights(self, X: np.ndarray) -> None:
        """
        Helper method to initialize weights. This is called from within fit if the dimensions of the input do not match the dimensions of the weights.

        Parameters
        ----------
        X: np.ndarray
            A training sample from which the correct dimensionality will be inferred.
        """
        
        self.__w = np.random.randn(X.shape[-1] + 1)      # for the w0 approach, create a weight vector which is 1 larger than the number of attributes

    def forward(self, X: np.ndarray) -> np.ndarray:
        """
        Calculate the network output for input ``X``

        Parameters
        ----------
        X: np.ndarray
            Instances, either one-dimensional for a single instance or 2-dimensional for a set of instances.

        Returns
        -------
        np.ndarray
            Array of network outputs.
        """
        
        X = np.concatenate([np.ones(X.shape[:-1] + (1,)), X], axis = -1)

        x = np.atleast_1d(np.dot(X, self.__w))      # a single instance yields a scalar; wrap it in an array

        x = np.where(x > 0, 1, -1)                   # threshold: +1 for a positive weighted sum, else -1

        return x
    
    def predict(self, X: np.ndarray) -> np.ndarray:
        """
        Turn the network output into a prediction of one of the two class labels seen during training.

        Parameters
        ----------
        X: np.ndarray
            Instances, either one-dimensional for a single instance or 2-dimensional for a set of instances.

        Returns
        -------
        np.ndarray
            Array of concept predictions.
        """

        return np.array([self.__class_1 if o == 1 else self.__class_0 for o in self.forward(X)])

    def fit(
        self,
        X: np.ndarray,
        y: np.ndarray,
        lr: float = 0.001,
        max_epoch: int = 500,
        resume: bool = False,
    ) -> Perceptron:
        """
        Train the Perceptron on ``X`` labeled with ``y``.

        Parameters
        ----------
        X: np.ndarray
            Training instances, either one-dimensional for a single instance or 2-dimensional for a set of instances.
        y: np.ndarray
            Labels. Must contain exactly two classes, i.e. two distinct values.
        lr: float (Optional)
            Learning rate, default 0.001
        max_epoch: int (Optional)
            The maximum number of training epochs to run, default 500
        resume: bool (Optional)
            Resume training, i.e. do not initialize weights.

        Returns
        -------
        Perceptron
            itself

        Raises
        ------
        ValueError
            If ``X`` has an invalid number of dimensions.
        ValueError
            If ``y`` has an invalid number of classes.
        """

        # ensure valid inputs
        if X.ndim not in (1, 2):
            raise ValueError(
                f"Invalid Dimension: X is of dimension {X.ndim}, but must be of dimension 1 for a single instance or 2 for a set of instances."
            )
        if len(np.unique(y)) != 2:
            raise ValueError(
                f"Invalid number of classes: Requires 2 distinct classes in y but found {len(np.unique(y))}."
            )
        
        self.__class_0, self.__class_1 = np.unique(y)

        # re-initialize weights, if necessary
        if X.shape[-1] != (self.__w.shape[0] - 1) and not resume:
            self.__init_weights(X)

        for _ in range(max_epoch):
            for _x, _y in zip(X, y):
                _y = -1 if _y == self.__class_0 else 1
                self.__w += lr * (_y - self.forward(_x)) * np.concatenate([np.ones(_x.shape[:-1] + (1,)), _x], axis = -1)

            if self.score(X, y) == 1:
                break

        return self

In [4]:
class Perceptron(ClassifierMixin):
    """
    Class implementing a thresholded Perceptron using the bias approach
    """

    def __init__(self):
        """ """

        super().__init__()

        self.__w: np.ndarray = np.random.randn(0)
        self.__b: np.ndarray = np.random.randn(1)
        self.__class_0: Any = 0
        self.__class_1: Any = 1

    @property
    def w(self) -> np.ndarray:
        return self.__w

    @property
    def b(self) -> np.ndarray:
        return self.__b

    def __init_weights(self, X: np.ndarray) -> None:
        """
        Helper method to initialize weights. This is called from within fit if the dimensions of the input do not match the dimensions of the weights.

        Parameters
        ----------
        X: np.ndarray
            A training sample from which the correct dimensionality will be inferred.
        """

        self.__w = np.random.randn(X.shape[-1])
        self.__b = np.random.randn(1)

    def forward(self, X: np.ndarray) -> np.ndarray:
        """
        Calculate the network output for input ``X``

        Parameters
        ----------
        X: np.ndarray
            Instances, either one-dimensional for a single instance or 2-dimensional for a set of instances.

        Returns
        -------
        np.ndarray
            Array of network outputs.
        """

        x = np.dot(X, self.__w) + self.__b

        x = np.where(x > 0, 1, -1)                   # threshold: +1 for a positive weighted sum, else -1

        return x
    
    def predict(self, X: np.ndarray) -> np.ndarray:
        """
        Turn the network output into a prediction of one of the two class labels seen during training.

        Parameters
        ----------
        X: np.ndarray
            Instances, either one-dimensional for a single instance or 2-dimensional for a set of instances.

        Returns
        -------
        np.ndarray
            Array of concept predictions.
        """
        return np.array([self.__class_1 if o == 1 else self.__class_0 for o in self.forward(X)])

    def fit(
        self,
        X: np.ndarray,
        y: np.ndarray,
        lr: float = 0.001,
        max_epoch: int = 500,
        resume: bool = False,
    ) -> Perceptron:
        """
        Train the Perceptron on ``X`` labeled with ``y``.

        Parameters
        ----------
        X: np.ndarray
            Training instances, either one-dimensional for a single instance or 2-dimensional for a set of instances.
        y: np.ndarray
            Labels. Must contain exactly two classes, i.e. two distinct values.
        lr: float (Optional)
            Learning rate, default 0.001
        max_epoch: int (Optional)
            The maximum number of training epochs to run, default 500
        resume: bool (Optional)
            Resume training, i.e. do not initialize weights.

        Returns
        -------
        Perceptron
            itself

        Raises
        ------
        ValueError
            If ``X`` has an invalid number of dimensions.
        ValueError
            If ``y`` has an invalid number of classes.
        """

        # ensure valid inputs
        if X.ndim not in (1, 2):
            raise ValueError(
                f"Invalid Dimension: X is of dimension {X.ndim}, but must be of dimension 1 for a single instance or 2 for a set of instances."
            )
        if len(np.unique(y)) != 2:
            raise ValueError(
                f"Invalid number of classes: Requires 2 distinct classes in y but found {len(np.unique(y))}."
            )
        
        self.__class_0, self.__class_1 = np.unique(y)

        # re-initialize weights, if necessary
        if X.shape[-1] != self.__w.shape[0] and not resume:
            self.__init_weights(X)

        for _ in range(max_epoch):
            for _x, _y in zip(X, y):
                _y = -1 if _y == self.__class_0 else 1
                self.__w += lr * (_y - (o := self.forward(_x))) * _x
                self.__b += lr * (_y - o)

            if self.score(X, y) == 1:
                break

        return self

### **Test Your Perceptron**

You may leave the following two cells as they are. Your Perceptron should be able to classify this split of the ``iris`` dataset with perfect accuracy, taking essentially no time to run.

In [5]:
X, y = load_iris(return_X_y = True)

y = np.array([1 if _y == 0 else 0 for _y in y])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

In [6]:
p = Perceptron()

p = p.fit(X_train, y_train)
p.score(X_test, y_test)

1.0
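
The ``score`` method used above is inherited from ``ClassifierMixin`` and reports the mean accuracy of ``predict`` on the given data. What that number means can be sketched with made-up labels and predictions (the arrays below are illustrative and unrelated to the ``iris`` split):

```python
import numpy as np

y_true = np.array([1, 0, 0, 1, 0])   # made-up labels
y_pred = np.array([1, 0, 1, 1, 0])   # made-up predictions with one mistake

# mean accuracy: fraction of predictions that match the labels (4 of 5 here)
accuracy = float(np.mean(y_pred == y_true))
```

A score of ``1.0`` therefore means that every test instance was classified correctly.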