In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Problem 2: Exponentially Weighted k-Nearest Neighbors

The **exponentially weighted k-Nearest Neighbors (k-ENW)** method is a variant of the kNN algorithm that uses all the neighbors of a new point $x_{\text{new}}$ instead of only the $k$ nearest ones.
Each neighbor is assigned a weight that depends on its rank when the neighbors are sorted by distance from $x_{\text{new}}$ (nearest first) and on a constant $\alpha$ between 0 and 1.
Specifically, the weight of the $i$-th nearest neighbor is given by:

$$
w_i = \alpha(1-\alpha)^{i-1} \quad \mbox{for }i=1,2,\ldots,m
$$

where $m$ is the total number of neighbors (that is, the number of training points, since every training point is used).
Note that the weight of the first nearest neighbor ($i=1$) is simply $\alpha$.
The choice of $\alpha$ determines the weight assigned to each neighbor. 
Larger values of $\alpha$ give more weight to the nearest neighbors and produce weights that decay rapidly to zero as $i$ increases.
Smaller values of $\alpha$ give more equal weight to all the neighbors and produce weights that decay more slowly to zero.
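
As a quick illustration of how $\alpha$ shapes the weights, the following sketch evaluates the formula above for a few values of $\alpha$. The helper name `exponential_weights` and the choice $m=5$ are assumptions made for this example only. Because the weights form a geometric sequence, they sum to $1-(1-\alpha)^m$, which is close to (but not exactly) 1.

```python
import numpy as np

def exponential_weights(m, alpha):
    """Weights w_i = alpha * (1 - alpha)**(i - 1) for i = 1, ..., m (hypothetical helper)."""
    i = np.arange(1, m + 1)
    return alpha * (1 - alpha) ** (i - 1)

# Compare a small, a moderate, and a large alpha for m = 5 neighbors.
for alpha in (0.1, 0.5, 0.9):
    w = exponential_weights(5, alpha)
    print("alpha =", alpha, "weights =", np.round(w, 4), "sum =", round(float(w.sum()), 4))
```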

## Part 1: Implementing Exponentially Weighted k-Nearest Neighbors

Your task is to implement the exponentially weighted k-nearest neighbors method using the weight function described above.
Write a function `exponential_weighted_knn` that takes as input:

- the training data matrix `X`, where each row represents a data point and each column represents a feature
- the training labels vector `y`, where the i-th entry represents the label of the i-th training data point
- the new data matrix `X_new`, where each row represents a new data point and each column represents a feature
- the parameter `alpha` used to compute the weights

The function should return a vector `y_pred` containing the predicted labels for the new data.

In [2]:
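def exponential_weighted_knn(X, y, X_new, alpha):
    # TODO: compute y_pred, the vector of predicted labels for the rows of X_new
    y_pred = None  # placeholder so the stub runs before it is implemented
    return y_pred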
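
One possible way to approach Part 1 is sketched below; it is not the only valid solution. It assumes Euclidean distance, treats the problem as classification by weighted vote (the class with the largest total weight wins), and breaks ties in favor of the smallest label. The name `ewknn_predict_sketch` is hypothetical and chosen so it does not clash with the function you are asked to write.

```python
import numpy as np

def ewknn_predict_sketch(X, y, X_new, alpha):
    """Predict labels by an exponentially weighted vote over all training points."""
    X = np.asarray(X, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    y = np.asarray(y)
    m = X.shape[0]

    # Squared Euclidean distances, shape (n_new, m). The metric is an assumption;
    # squared distances give the same neighbor ordering as plain distances.
    d2 = (
        (X_new ** 2).sum(axis=1)[:, None]
        - 2.0 * X_new @ X.T
        + (X ** 2).sum(axis=1)[None, :]
    )

    # order[j, r] is the index of the (r+1)-th nearest training point to X_new[j].
    order = np.argsort(d2, axis=1, kind="stable")
    sorted_labels = y[order]

    # w[r] = alpha * (1 - alpha)**r is the weight of the (r+1)-th nearest neighbor.
    w = alpha * (1 - alpha) ** np.arange(m)

    # Sum the weights per class and pick the class with the largest total.
    classes = np.unique(y)
    scores = np.stack(
        [((sorted_labels == c) * w).sum(axis=1) for c in classes], axis=1
    )
    return classes[np.argmax(scores, axis=1)]

# Toy check on one-dimensional data with two well-separated groups.
X_toy = np.array([[0.0], [1.0], [10.0], [11.0]])
y_toy = np.array([0, 0, 1, 1])
print(ewknn_predict_sketch(X_toy, y_toy, np.array([[0.5], [9.0]]), alpha=0.5))
```

Note that this sketch builds the full distance matrix between `X_new` and `X` in memory. For a large training set it may be necessary to process `X_new` in batches.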

## Part 2: The MNIST dataset revisited

In [3]:
# load the data
url = 'https://raw.githubusercontent.com/um-perez-alvaro/Data-Science-Theory/master/Data/digits.csv'
data = pd.read_csv(url)
data.head(5)

Unnamed: 0,pixel 0,pixel 1,pixel 2,pixel 3,pixel 4,pixel 5,pixel 6,pixel 7,pixel 8,pixel 9,...,pixel 775,pixel 776,pixel 777,pixel 778,pixel 779,pixel 780,pixel 781,pixel 782,pixel 783,label
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,9


In [4]:
X = data.iloc[:,0:784].to_numpy() # pixels
y = data['label'].to_numpy() # labels

**Part 2-a:** Split the dataset into training and validation sets.
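
A minimal sketch of a random split is shown below. The helper name `train_val_split`, the 20% validation fraction, and the fixed seed are all assumptions made for illustration; the demo arrays are stand-ins so the snippet runs on its own, and in the notebook you would pass `X` and `y` instead.

```python
import numpy as np

def train_val_split(X, y, val_fraction=0.2, seed=0):
    """Shuffle the rows once, then hold out a fraction of them for validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])
    n_val = int(round(val_fraction * X.shape[0]))
    val_idx, train_idx = perm[:n_val], perm[n_val:]
    return X[train_idx], X[val_idx], y[train_idx], y[val_idx]

# Stand-in data: 10 points with 2 features each, labeled 0..9.
X_demo = np.arange(20, dtype=float).reshape(10, 2)
y_demo = np.arange(10)
X_train, X_val, y_train, y_val = train_val_split(X_demo, y_demo, val_fraction=0.2)
print(X_train.shape, X_val.shape)
```

Shuffling before splitting matters when the rows of the file are ordered in some systematic way (for example, grouped by label).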

**Part 2-b:** Use the validation set to tune the parameter $\alpha$.
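
One simple way to tune $\alpha$ is a grid search: predict the validation labels for each candidate value and keep the one with the highest validation accuracy. The sketch below assumes a hypothetical helper `tune_alpha` that accepts any predictor with the signature `(X, y, X_new, alpha)`, so the Part 1 function can be passed in directly. The function `stand_in_predict`, the synthetic two-cluster data, and the grid of candidate values are placeholders so the snippet runs on its own.

```python
import numpy as np

def tune_alpha(predict_fn, X_train, y_train, X_val, y_val, alphas):
    """Return (best_alpha, accuracies) over the candidate values in alphas."""
    alphas = np.asarray(alphas, dtype=float)
    accuracies = np.array([
        np.mean(predict_fn(X_train, y_train, X_val, a) == y_val) for a in alphas
    ])
    # np.argmax returns the first maximum, so ties go to the earliest candidate.
    return float(alphas[np.argmax(accuracies)]), accuracies

def stand_in_predict(X, y, X_new, alpha):
    # Compact stand-in for the Part 1 function (Euclidean distance, weighted vote).
    d = np.linalg.norm(X_new[:, None, :] - X[None, :, :], axis=2)
    labels = y[np.argsort(d, axis=1, kind="stable")]
    w = alpha * (1 - alpha) ** np.arange(X.shape[0])
    classes = np.unique(y)
    scores = np.stack([((labels == c) * w).sum(axis=1) for c in classes], axis=1)
    return classes[scores.argmax(axis=1)]

# Synthetic two-class data, used only to exercise the search loop.
rng = np.random.default_rng(0)
X_tr = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
y_tr = np.repeat([0, 1], 30)
X_va = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(3, 1, (10, 2))])
y_va = np.repeat([0, 1], 10)

candidate_alphas = [0.05, 0.1, 0.3, 0.5, 0.9]
best_alpha, accs = tune_alpha(stand_in_predict, X_tr, y_tr, X_va, y_va, candidate_alphas)
print("best alpha:", best_alpha)
print("validation accuracies:", np.round(accs, 3))
```

Plotting the accuracies against the candidate values (for example with `plt.plot`) can help decide whether the grid should be refined around the best value.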

**Part 2-c:** Test your classifier on the following test set:

In [None]:
# load test data
url = 'https://raw.githubusercontent.com/um-perez-alvaro/Data-Science-Theory/master/Data/digits_test.csv'
test_data = pd.read_csv(url)

In [None]:
X_test = test_data.iloc[:,0:784].to_numpy() # pixels
y_test = test_data['label'].to_numpy() # labels
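
Once predictions for `X_test` are available, the classifier can be summarized by its accuracy and by a confusion matrix. The sketch below uses hypothetical helpers `accuracy` and `confusion_matrix`, and assumes the labels are integers from 0 to `n_classes - 1` (the digits 0 to 9 for this dataset). The short demo arrays stand in for `y_test` and the predicted labels.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def confusion_matrix(y_true, y_pred, n_classes=10):
    """C[i, j] counts the points with true label i that were predicted as j."""
    C = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an (i, j) pair repeats.
    np.add.at(C, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return C

# Stand-in labels with 3 classes; in the notebook, use y_test and your predictions.
y_true_demo = np.array([0, 1, 2, 2, 1])
y_pred_demo = np.array([0, 2, 2, 2, 1])
print("accuracy:", accuracy(y_true_demo, y_pred_demo))
print(confusion_matrix(y_true_demo, y_pred_demo, n_classes=3))
```

The off-diagonal entries of the confusion matrix show which digits are most often mistaken for one another.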