# L4c: K-Nearest Neighbors Approach for Classification
In this lecture, we explore the K-Nearest Neighbors (KNN) algorithm, a simple yet effective non-parametric method for classification tasks. KNN classifies a new data point by assigning it the majority class of its $K$ closest neighbors in the feature space.

> __Learning Objectives:__
>
> By the end of this lecture, you will be able to:
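> * __State the key assumption__: Explain the assumption that _similar_ inputs have _similar_ labels, and why the notion of similarity has to be specified.
> * __Describe the algorithm__: List the inputs of a KNN classifier (a reference dataset, the number of neighbors $K$, and a distance metric) and the steps it performs for each new point.
> * __Classify a new point__: Compute distances to the reference points, identify the $K$ nearest neighbors, and assign a label by majority vote.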

Let's get started!
___

## K-Nearest Neighbor Classification
K-nearest neighbor (KNN) classification predicts the label of a new instance directly from a labeled reference dataset; the same idea extends to regression tasks. The algorithm finds the $K$ reference points closest to the new instance in the feature space and then assigns the new instance the majority class among these neighbors.

> __Key assumption__: The key assumption of a [K-nearest neighbor classifier](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) is that _similar_ inputs have _similar_ labels (in classification tasks) or _similar_ outputs (in regression tasks). The tricky part of this assumption is deciding what we mean by _similar_: in practice, similarity is defined by the distance metric we choose.
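To see why the meaning of _similar_ matters, here is a minimal sketch in Python that compares two distance metrics on the same data. The points, the names `query` and `candidates`, and the use of Manhattan distance as the alternative metric are illustrative assumptions, not part of the lecture's dataset.

```python
import math

def euclidean(a, b):
    # Euclidean (L2) distance between two equal-length points.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    # Manhattan (L1) distance, used here only as a contrasting metric.
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

# Hypothetical query point and two hypothetical reference points.
query = (0.0, 0.0)
candidates = {"a": (3.0, 3.0), "b": (0.0, 5.0)}

# The nearest candidate is the one with the smallest distance to the query.
nearest_euclidean = min(candidates, key=lambda name: euclidean(candidates[name], query))
nearest_manhattan = min(candidates, key=lambda name: manhattan(candidates[name], query))

print(nearest_euclidean, nearest_manhattan)
```

Under Euclidean distance, point `a` is closer to the query ($\sqrt{18} \approx 4.24$ versus $5$), while under Manhattan distance point `b` is closer ($5$ versus $6$). The same data can therefore have different nearest neighbors depending on the metric.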

Let's look at some pseudo-code for KNN classification.

__Initialization__: Provide a reference dataset $\mathcal{D} = \{(\mathbf{x}_{i},y_{i}) \mid i = 1,2,\dots,n\}$, where the vectors $\mathbf{x}_i \in \mathbb{R}^{m}$ are $m$-dimensional feature vectors and the target variables are discrete labels $y_i \in \{-1,1\}$. Specify the number of neighbors $K$, where $1 \leq K < n$. Choose a distance metric $d(\cdot,\cdot)$ (typically Euclidean distance: $d(\mathbf{x}, \mathbf{x}^*) = \|\mathbf{x} - \mathbf{x}^*\|_2$).

__For each new point__ $\mathbf{x}^*$ to classify __do__:

1. Compute distances: Calculate the distance from $\mathbf{x}^*$ to all reference points in $\mathcal{D}$: $d_i = d(\mathbf{x}_i, \mathbf{x}^*)$ for $i = 1, 2, \dots, n$.

2. Find $K$ nearest neighbors: Identify the set $\mathcal{N}_K(\mathbf{x}^*) = \{\mathbf{x}_{i_1}, \mathbf{x}_{i_2}, \dots, \mathbf{x}_{i_K}\}$ of $K$ reference points with the smallest distances to $\mathbf{x}^*$.

3. Majority vote: Count the class labels among the $K$ nearest neighbors and assign the majority class to $\mathbf{x}^*$: $$\hat{y}^* = \arg\max_{c \in \{-1,1\}} \sum_{j=1}^{K} \mathbb{1}(y_{i_j} = c)$$ where $y_{i_j}$ is the label of neighbor $\mathbf{x}_{i_j}$ and $\mathbb{1}(\cdot)$ is the indicator function that returns 1 if the condition is true and 0 otherwise. With two classes, choosing an odd $K$ guarantees that the vote cannot end in a tie.

4. Return the predicted label $\hat{y}^*$ for the new point $\mathbf{x}^*$.
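The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function name `knn_classify`, the toy dataset, and the tie-handling behavior (ties in distance are broken by index order, ties in the vote by whichever label is encountered first among the nearest neighbors) are assumptions made for this example.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Euclidean distance d(x, x*) = ||x - x*||_2.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(X, y, x_star, K, distance=euclidean):
    """Predict the label of x_star from reference points X with labels y."""
    n = len(X)
    if not 1 <= K < n:
        raise ValueError("K must satisfy 1 <= K < n")

    # Step 1: distance from x_star to every reference point.
    distances = [distance(x_i, x_star) for x_i in X]

    # Step 2: indices of the K reference points with the smallest distances.
    nearest = sorted(range(n), key=lambda i: distances[i])[:K]

    # Step 3: majority vote over the labels of the K nearest neighbors.
    votes = Counter(y[i] for i in nearest)

    # Step 4: return the most common label.
    return votes.most_common(1)[0][0]

# Hypothetical reference dataset with two well-separated groups.
X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
y = [-1, -1, -1, 1, 1, 1]

print(knn_classify(X, y, (2.0, 2.0), K=3))  # all 3 nearest neighbors have label -1
print(knn_classify(X, y, (6.0, 7.0), K=3))  # all 3 nearest neighbors have label 1
```

Sorting all $n$ distances keeps the sketch short; for large reference datasets, a partial selection of the $K$ smallest distances would avoid the full sort.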

## Summary
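KNN classifies a new point by finding its $K$ closest reference points under a chosen distance metric and assigning the majority label among them.

> __Key Takeaways:__
>
> * __Similarity drives the prediction__: KNN assumes that _similar_ inputs have _similar_ labels, so the distance metric that defines similarity directly shapes the result.
> * __The algorithm has three inputs__: A labeled reference dataset $\mathcal{D}$, the number of neighbors $K$ with $1 \leq K < n$, and a distance metric $d(\cdot,\cdot)$ such as the Euclidean distance.
> * __Prediction is a majority vote__: For each new point, compute the distances to all reference points, select the $K$ nearest, and return the most common label among them.

Once the reference dataset, $K$, and the distance metric are specified, KNN classification needs no training step beyond storing the data.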
___