# **KogSys-ML-B: Introduction to Machine Learning**
## **$k$-Nearest Neighbor Classification**

---

To set up a new conda environment suitable for this notebook, you can use the following console commands:

```bash
conda create -y -n knn python=3.13
conda activate knn
python -m pip install -r requirements.txt
```

**Note**: Conda can consume a lot of disk space when you keep many environments. Consider regularly deleting environments you no longer need and running ``conda clean --all`` to remove unused packages and cached files.

You can also install the requirements for this notebook into an existing environment by running the cell below:

In [None]:
# !python -m pip install -q -U -r requirements.txt

In [None]:
from typing import Optional

import numpy as np
from numpy.typing import ArrayLike
from sklearn.base import ClassifierMixin
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import euclidean_distances
from sklearn.model_selection import train_test_split

### **$k$-NN Algorithm**

Complete the implementation of the ``KNNClassifier`` in the next cell. The class inherits from scikit-learn's ``ClassifierMixin``, which provides a ``score`` method that relies on ``predict``, so you need to implement both the ``fit`` and ``predict`` methods.
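To illustrate why both methods are needed, here is a minimal sketch of how ``ClassifierMixin`` is typically used. The ``MajorityClassifier`` class and the toy data are hypothetical and not part of the exercise; the only point is that the inherited ``score`` method calls your ``predict`` and compares the result with the true labels.

```python
import numpy as np
from sklearn.base import ClassifierMixin


class MajorityClassifier(ClassifierMixin):
    """Hypothetical toy classifier: always predicts the most frequent training label."""

    def fit(self, X, y):
        # Remember the label that occurs most often in the training data.
        labels, counts = np.unique(y, return_counts=True)
        self.majority_ = labels[np.argmax(counts)]
        return self

    def predict(self, X):
        # Return the stored majority label once per passed sample.
        return np.full(len(X), self.majority_)


# Toy data (assumed for illustration): three samples of class 0, one of class 1.
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = np.array([0, 0, 0, 1])

clf = MajorityClassifier().fit(X_toy, y_toy)
# score() is inherited from ClassifierMixin and computes the accuracy of predict().
accuracy = clf.score(X_toy, y_toy)
print(accuracy)
```

Because the toy classifier always predicts class 0, it gets three of the four samples right.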

As a lazy learning approach, $k$-NN does its work during _inference_, not during training. In other words, while the ``fit`` method is fairly simple, ``predict`` is more complex than you may be used to from previous algorithms.

**Note**: This is the shortest machine learning algorithm (in terms of lines of code) we have implemented so far, provided you find the right methods.
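The inference-time logic described above can be sketched on toy data as follows. This is one possible approach, not the reference solution: the function name ``knn_predict_sketch``, the toy arrays, and the choice of ``np.argsort`` and ``np.bincount`` are assumptions made for illustration. The vote via ``np.bincount`` requires non-negative integer labels and resolves ties in favour of the smallest label.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances


def knn_predict_sketch(X_train, y_train, X_query, k=3):
    """Hypothetical helper: majority vote among the k nearest training samples."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    distances = euclidean_distances(X_query, X_train)
    # Column indices of the k closest training samples for every query row.
    nearest = np.argsort(distances, axis=1)[:, :k]
    # Labels of those neighbours, shape (n_query, k).
    neighbor_labels = np.asarray(y_train)[nearest]
    # Majority vote per row (assumes non-negative integer labels).
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])


# Two well-separated toy clusters (assumed for illustration).
X_toy = np.array(
    [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
)
y_toy = np.array([0, 0, 0, 1, 1, 1])
queries = np.array([[0.5, 0.5], [5.5, 5.5]])

predictions = knn_predict_sketch(X_toy, y_toy, queries, k=3)
print(predictions)
```

Note that all the work happens at prediction time; "training" here amounts to keeping ``X_toy`` and ``y_toy`` around.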


In [None]:
class KNNClassifier(ClassifierMixin):
    def __init__(self, k: int = 5) -> None:
        """
        Parameters
        ----------
        k: int
            Number of nearest neighbors to consider for prediction.
        """
        self.__k = k

        self.X: Optional[ArrayLike] = None
        self.y: Optional[ArrayLike] = None

    @property
    def k(self) -> int:
        """
        k getter method.

        Returns
        -------
        int
            k
        """
        return self.__k

    def set_k(self, k: int) -> None:
        """
        k setter method.

        Parameters
        ----------
        k: int
            new k
        """
        self.__k = k

    def fit(self, X: ArrayLike, y: ArrayLike) -> "KNNClassifier":
        """
        Store training data.

        Parameters
        ----------
        X: ArrayLike
            Training data
        y: ArrayLike
            Training labels

        Returns
        -------
        KNNClassifier
            itself
        """
        raise NotImplementedError

    def predict(self, X: ArrayLike) -> ArrayLike:
        """
        Predict class labels for the passed samples.

        Parameters
        ----------
        X: ArrayLike
            Samples to predict labels for
        
        Returns
        -------
        ArrayLike
            Array containing predictions for each passed instance.
        """
        raise NotImplementedError

### **Test Algorithm**

You don't need to change anything in the next cell. You should get perfect accuracy with the defined split.

In [None]:
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

knn = KNNClassifier()
knn.fit(X_train, y_train)
knn.score(X_test, y_test)
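As a sanity check, you could compare your result with scikit-learn's own ``KNeighborsClassifier`` on the same split. The snippet below is a sketch under the assumption that the default settings (Euclidean distance, uniform weights) match your implementation; small deviations are possible if ties are broken differently. The names ``reference`` and ``reference_accuracy`` are introduced here for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same data and split as in the test cell above.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Reference model with the same k as the default of KNNClassifier.
reference = KNeighborsClassifier(n_neighbors=5)
reference.fit(X_train, y_train)

reference_accuracy = reference.score(X_test, y_test)
print(f"scikit-learn reference accuracy: {reference_accuracy:.3f}")
```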