# K-Nearest Neighbors (KNN) from Scratch

K-Nearest Neighbors (KNN) is a simple, non-parametric algorithm for classification and regression. To predict for a new sample, it finds the k training points closest to that sample in feature space and returns their majority label (classification) or the average of their values (regression).

In this notebook, you'll scaffold the steps to implement KNN from scratch: distance metrics, neighbor search, prediction, feature scaling, and evaluation.

## 📏 Distance Metrics

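KNN relies on a distance metric to decide which training points count as closest. Two common choices are Euclidean distance (the square root of the summed squared coordinate differences) and Manhattan distance (the sum of the absolute coordinate differences).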

### Task:
- Scaffold functions to compute Euclidean and Manhattan distances between two points.
- Add docstrings explaining their use.

In [None]:
def euclidean_distance(x1, x2):
    """
    Compute the Euclidean distance between two points.
    Args:
        x1, x2 (np.ndarray): Input vectors.
    Returns:
        float: Euclidean distance.
    """
    # TODO: Implement Euclidean distance
    pass

def manhattan_distance(x1, x2):
    """
    Compute the Manhattan distance between two points.
    Args:
        x1, x2 (np.ndarray): Input vectors.
    Returns:
        float: Manhattan distance.
    """
    # TODO: Implement Manhattan distance
    pass
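
The two TODOs above could be filled in along these lines. This is a minimal sketch rather than the reference solution; it assumes NumPy is imported as `np` and that both inputs have the same shape.

```python
import numpy as np

def euclidean_distance(x1, x2):
    """Euclidean (L2) distance: square root of the summed squared differences."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.sqrt(np.sum(diff ** 2)))

def manhattan_distance(x1, x2):
    """Manhattan (L1) distance: sum of the absolute coordinate differences."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.sum(np.abs(diff)))

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
print(euclidean_distance(a, b))  # 5.0
print(manhattan_distance(a, b))  # 7.0
```

The two metrics disagree on the same pair of points (5.0 versus 7.0), so the choice of metric can change which neighbors count as nearest.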

## 🔢 Finding the K Nearest Neighbors

For a given sample, find the k closest points in the training set using the chosen distance metric.

### Task:
- Scaffold a function to find the indices of the k nearest neighbors.
- Add a docstring explaining its use.

In [None]:
def find_k_nearest_neighbors(X_train, x_query, k, distance_fn):
    """
    Find the indices of the k nearest neighbors in X_train to x_query.
    Args:
        X_train (np.ndarray): Training data (n_samples x n_features).
        x_query (np.ndarray): Query point (n_features,).
        k (int): Number of neighbors.
        distance_fn (callable): Distance function.
    Returns:
        list: Indices of the k nearest neighbors.
    """
    # TODO: Find k nearest neighbors
    pass
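
One way to approach the neighbor search is to compute the distance from the query to every training point and keep the k smallest. The sketch below assumes that brute-force strategy; the `euclidean` helper is a hypothetical stand-in for whichever distance function you implemented, and the range check on `k` is an added assumption, not part of the scaffold.

```python
import numpy as np

def find_k_nearest_neighbors(X_train, x_query, k, distance_fn):
    """Return the indices of the k training points closest to x_query."""
    if not 1 <= k <= len(X_train):
        raise ValueError("k must be between 1 and the number of training samples")
    # Distance from the query to every training point (brute force).
    distances = np.array([distance_fn(x, x_query) for x in X_train])
    # A stable sort keeps the original order among points at equal distance.
    order = np.argsort(distances, kind="stable")
    return order[:k].tolist()

# Hypothetical stand-in for the distance function from the previous section.
def euclidean(x1, x2):
    return float(np.linalg.norm(np.asarray(x1) - np.asarray(x2)))

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.5, 0.0]])
print(find_k_nearest_neighbors(X_train, np.array([0.0, 0.0]), 2, euclidean))  # [0, 3]
```

This costs one distance computation per training point for every query, which is fine for small datasets but slow for large ones.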

## 🏷 KNN Classification

Predict the class label for a query point by majority vote among its k nearest neighbors.

### Task:
- Scaffold a function to predict the class label for a query point.
- Add a docstring explaining its use.

In [None]:
def knn_predict_class(X_train, y_train, x_query, k, distance_fn):
    """
    Predict the class label for x_query using KNN.
    Args:
        X_train (np.ndarray): Training data.
        y_train (np.ndarray): Training labels.
        x_query (np.ndarray): Query point.
        k (int): Number of neighbors.
        distance_fn (callable): Distance function.
    Returns:
        int or str: Predicted class label.
    """
    # TODO: Predict class label using majority vote
    pass
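
A possible majority-vote implementation is sketched below. To stay self-contained it repeats the neighbor search inline instead of calling `find_k_nearest_neighbors`, and it breaks ties in favor of the label seen first among the nearest neighbors. That tie-breaking rule is an assumption; the notebook does not prescribe one.

```python
from collections import Counter

import numpy as np

def knn_predict_class(X_train, y_train, x_query, k, distance_fn):
    """Predict the label of x_query by majority vote among its k nearest neighbors."""
    distances = np.array([distance_fn(x, x_query) for x in X_train])
    neighbor_idx = np.argsort(distances, kind="stable")[:k]
    # Neighbor labels, ordered from nearest to farthest, as plain Python values.
    labels = np.asarray(y_train)[neighbor_idx].tolist()
    # most_common lists equally frequent labels in first-seen order,
    # so a tie goes to the label that appears nearest to the query.
    return Counter(labels).most_common(1)[0][0]

# Hypothetical stand-in for the distance function from the earlier section.
def euclidean(x1, x2):
    return float(np.linalg.norm(np.asarray(x1) - np.asarray(x2)))

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array(["a", "a", "a", "b", "b"])
print(knn_predict_class(X_train, y_train, np.array([0.2, 0.1]), 3, euclidean))  # a
print(knn_predict_class(X_train, y_train, np.array([5.5, 5.0]), 3, euclidean))  # b
```

Choosing an odd k is a common way to avoid ties in binary classification.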

## 🔢 KNN Regression (Optional)

For regression, predict the value as the average of the k nearest neighbors' values.

### Task:
- Scaffold a function to predict a regression value for a query point.
- Add a docstring explaining its use.

In [None]:
def knn_predict_regression(X_train, y_train, x_query, k, distance_fn):
    """
    Predict the regression value for x_query using KNN.
    Args:
        X_train (np.ndarray): Training data.
        y_train (np.ndarray): Training values.
        x_query (np.ndarray): Query point.
        k (int): Number of neighbors.
        distance_fn (callable): Distance function.
    Returns:
        float: Predicted value.
    """
    # TODO: Predict regression value using average of neighbors
    pass
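
The regression variant differs from classification only in the last step: the neighbors' target values are averaged instead of counted. The sketch below assumes an unweighted mean (every neighbor counts equally) and uses a hypothetical `manhattan` helper as the distance function.

```python
import numpy as np

def knn_predict_regression(X_train, y_train, x_query, k, distance_fn):
    """Predict a value for x_query as the mean target of its k nearest neighbors."""
    distances = np.array([distance_fn(x, x_query) for x in X_train])
    neighbor_idx = np.argsort(distances, kind="stable")[:k]
    # Unweighted average of the neighbors' target values.
    return float(np.mean(np.asarray(y_train, dtype=float)[neighbor_idx]))

# Hypothetical stand-in for the distance function from the earlier section.
def manhattan(x1, x2):
    return float(np.sum(np.abs(np.asarray(x1) - np.asarray(x2))))

X_train = np.array([[1.0], [2.0], [3.0], [10.0]])
y_train = np.array([1.0, 2.0, 3.0, 10.0])
# The three nearest points to 2.1 are 2.0, 3.0, and 1.0, so the mean is 2.0.
print(knn_predict_regression(X_train, y_train, np.array([2.1]), 3, manhattan))  # 2.0
```

The outlier at 10.0 has no influence on the prediction because it is not among the three nearest neighbors.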

## ⚖️ Feature Scaling / Normalization

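Feature scaling is important for KNN because distance metrics are sensitive to the scale of each feature: a feature measured in the thousands (e.g., income) will dominate one measured in single digits (e.g., number of rooms) unless both are rescaled to comparable ranges.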

### Task:
- Scaffold a function to normalize features (e.g., min-max or z-score normalization).
- Add a docstring explaining its use.

In [None]:
def normalize_features(X):
    """
    Normalize features (e.g., min-max or z-score normalization).
    Args:
        X (np.ndarray): Feature matrix.
    Returns:
        np.ndarray: Normalized features.
    """
    # TODO: Normalize features
    pass
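
As one illustration, here is a z-score version of the scaffold: each column is shifted to mean 0 and scaled to standard deviation 1. Treating constant columns by leaving them at zero is an assumption made here to avoid division by zero. In practice the mean and standard deviation would usually be computed on the training set and reused for the test set, which this single-argument signature does not show.

```python
import numpy as np

def normalize_features(X):
    """Z-score normalization: zero mean and unit standard deviation per column."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Constant columns have std 0; dividing by 1 instead leaves them at 0.
    std[std == 0] = 1.0
    return (X - mean) / std

X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])
Z = normalize_features(X)
print(Z)               # both columns now hold the same values
print(Z.mean(axis=0))  # approximately 0 for every column
print(Z.std(axis=0))   # approximately 1 for every column
```

Before scaling, the second column would contribute roughly 100 times more to any distance than the first; after scaling, both contribute equally.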

## 🏋 Training and Evaluation Loop

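KNN is a lazy learner: it has no explicit training step and simply stores the training data, deferring all computation to prediction time. You can still evaluate its performance by predicting on a held-out test set and comparing the predictions against the true labels or values.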

### Task:
- Scaffold functions to compute accuracy (for classification) and mean squared error (MSE, for regression).
- Add docstrings explaining their use.

In [None]:
def compute_accuracy_knn(y_true, y_pred):
    """
    Compute accuracy for KNN classification.
    Args:
        y_true (np.ndarray): True labels.
        y_pred (np.ndarray): Predicted labels.
    Returns:
        float: Accuracy (0 to 1).
    """
    # TODO: Compute accuracy
    pass

def compute_mse_knn(y_true, y_pred):
    """
    Compute mean squared error for KNN regression.
    Args:
        y_true (np.ndarray): True values.
        y_pred (np.ndarray): Predicted values.
    Returns:
        float: Mean squared error.
    """
    # TODO: Compute MSE
    pass
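
Both metrics reduce to a one-line NumPy expression. The following sketch assumes `y_true` and `y_pred` are array-likes of equal length; it does not validate the inputs.

```python
import numpy as np

def compute_accuracy_knn(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def compute_mse_knn(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

print(compute_accuracy_knn([1, 0, 1, 1], [1, 0, 0, 1]))   # 0.75
print(compute_mse_knn([1.0, 2.0, 3.0], [1.0, 2.0, 6.0]))  # 3.0
```

To evaluate a KNN model, call the prediction function once per test sample, collect the results into `y_pred`, and pass them to the matching metric.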

## 🧠 Final Summary: KNN in ML

- KNN is a simple, interpretable algorithm that handles both classification and regression with the same neighbor-search core.
- Understanding distance metrics, feature scaling, and the curse of dimensionality (distances become less informative as the number of features grows) is key to using KNN effectively.
- KNN is a useful baseline and a great way to build intuition for more complex ML models.
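
The curse of dimensionality mentioned in the summary can be observed with a small experiment. The sketch below is illustrative only: `distance_contrast` is a hypothetical helper, and the uniform random data, point count, and seed are arbitrary assumptions. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import numpy as np

def distance_contrast(n_features, n_points=500, seed=0):
    """Relative gap between the farthest and nearest point from a random query."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_points, n_features))  # uniform points in the unit hypercube
    q = rng.random(n_features)              # random query point
    d = np.linalg.norm(X - q, axis=1)       # Euclidean distance to every point
    return float((d.max() - d.min()) / d.min())

for dim in (2, 10, 100, 1000):
    print(dim, round(distance_contrast(dim), 3))
```

The printed contrast should shrink as the dimension grows, meaning the nearest and farthest points become almost equally distant and the notion of a "nearest" neighbor carries less information.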