# Assignment 3 - KNN Classifiers

The goal of this assignment is to apply the K-Nearest Neighbors (KNN) classification algorithm to the digits dataset, which contains images of handwritten digits.

The dataset comprises 8x8 pixel images of handwritten digits (0 through 9). Each image is represented as a 64-dimensional vector, where each dimension holds one pixel's grayscale value (an integer from 0 to 16).
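
As a quick check of this representation, the sketch below (assuming scikit-learn and NumPy are installed) confirms that each row of `digits.data` is the corresponding 8x8 array in `digits.images` flattened row by row. The variable names `first_image_flat` and `same_pixels` are chosen here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# images holds the 8x8 arrays; data holds the same pixels as 64-dimensional rows
print(digits.images.shape)
print(digits.data.shape)

# Flattening an image row by row reproduces its feature vector
first_image_flat = digits.images[0].reshape(-1)
same_pixels = np.array_equal(first_image_flat, digits.data[0])
print(same_pixels)
```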

## Tasks:

### 1. Data Loading and Exploration (2 points)

In [None]:
# Import necessary libraries
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt

# Load the dataset
digits = load_digits()

# Explore the dataset
print(f"Data shape: {digits.data.shape}")
print(f"Target shape: {digits.target.shape}")

# Visualize the first few images and their labels
"""
Question: Visualize the Digits Dataset (2 points)
Display the first ten images from the digits dataset with their labels. Use a 2x5 grid of subplots, and set the figure size to 10x5 inches. Show each image in grayscale and title it with its label. Adjust the layout to prevent overlap.

Hints:

Use the 'ax.imshow()' method to display an image and 'ax.set_title()' to set its title.
Write your code below:
"""
fig, axes = plt.subplots(2, 5, figsize=(10, 5))
for ax, image, label in zip(axes.ravel(), digits.images, digits.target):
    # Show the 8x8 image in grayscale
    ax.imshow(image, cmap="gray")
    # Set the subplot title to the image's label
    ax.set_title(f"Label: {label}")
plt.tight_layout()
plt.show()


### 2. Data Preprocessing (2 points)

In [None]:
from sklearn.model_selection import train_test_split

# Split the dataset into training and testing sets
"""
Question: Dataset Splitting (2 points)
Divide the digits dataset into training and testing sets.

Hints:

Utilize the train_test_split function.
Remember to set test_size and random_state.
"""
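# Hold out part of the data for testing. test_size=0.2 and random_state=42 are
# example values; the assignment does not prescribe them.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)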

### 3. KNN Classification (2 points)

In [None]:
from sklearn.neighbors import KNeighborsClassifier

# Initialize the KNN classifier
knn = KNeighborsClassifier(n_neighbors=5)

# Train the classifier and predict labels for the test set
"""
Question: KNN Classifier Training and Prediction (2 points)
Given the initialized KNN classifier with n_neighbors set to 5, perform the following tasks:

1. Train the classifier using the training data.
2. Predict the labels for the test dataset using the trained classifier.

Hints:

Use the 'fit' method to train the classifier.
Utilize the 'predict' method to make predictions on the test data.
"""
# Fit the model on the training data
knn.fit(X_train, y_train)

# Predict labels for X_test and store them in y_pred
y_pred = knn.predict(X_test)
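
To make the idea behind `fit` and `predict` concrete, here is a minimal from-scratch sketch of the KNN decision rule on made-up 2-D points. The function name `knn_predict`, the toy data, and the choice of Euclidean distance with a simple majority vote are illustrative assumptions; `KNeighborsClassifier` is more elaborate internally, but with its default settings it follows the same distance-and-vote principle.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two small clusters labeled 0 and 1
train_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_points, train_labels, (0.5, 0.5), k=3))
print(knn_predict(train_points, train_labels, (5.5, 5.5), k=3))
```

There is no real training step in this sketch: "fitting" a KNN model amounts to storing the training points, and all the work happens at prediction time.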

### 4. Evaluation (2 points)

In [None]:
from sklearn.metrics import accuracy_score, confusion_matrix

"""
Question: Evaluating Classifier Performance (2 points)
After obtaining the predicted labels from the trained KNN classifier, evaluate its performance using the following metrics:

Calculate the accuracy of the classifier and display it in percentage format.
Compute the confusion matrix for the true labels and predicted labels.

Hints:

Use the accuracy_score function to get the accuracy.
The confusion_matrix function can help in deriving the confusion matrix.
"""
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy with k=5: {accuracy*100:.2f}%")

# Confusion matrix
conf_mat = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_mat)
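
For reading the printed matrix, the following sketch builds a confusion matrix by hand for a made-up three-class example. The helper `build_confusion_matrix` and the toy labels are hypothetical; the layout (rows are true labels, columns are predicted labels) matches the convention used by scikit-learn's `confusion_matrix`.

```python
def build_confusion_matrix(y_true, y_pred, n_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix

# Toy labels (hypothetical), classes 0, 1, and 2
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

matrix = build_confusion_matrix(y_true, y_pred, n_classes=3)
for row in matrix:
    print(row)

# Correct predictions sit on the diagonal, so accuracy is diagonal sum / total
correct = sum(matrix[i][i] for i in range(3))
accuracy = correct / len(y_true)
print(f"Accuracy: {accuracy * 100:.2f}%")
```

Off-diagonal entries show which digits get confused with which: an entry in row i, column j counts samples of class i that were predicted as class j.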

### 5. Experimenting with different k values (2 points)

In [None]:
k_values = [1, 3, 5, 7, 9, 11, 13, 15]
accuracies = []

"""
Question: Experimenting with Different k-values in KNN (2 points)
The k-value in KNN determines the number of neighbors to consider when making a prediction. To find an optimal k-value, perform the following:

Iterate over different k-values given in the k_values list.
For each k-value, train the KNN classifier and predict the labels for the test set.
Calculate and store the accuracy for each k-value in the accuracies list.
Finally, plot a graph of accuracy vs. k-values to visualize the performance of the KNN classifier for different k-values.

Hints:

Use a loop to iterate over each k-value.
Remember to initialize the KNN classifier with the current k-value in each iteration.
"""
for k in k_values:
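    # Initialize a fresh classifier with the current k, then fit it
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Predict on the test set and score the predictions
    y_pred = knn.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    # Store the accuracy for this k
    accuracies.append(accuracy)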

# Plot the results
plt.plot(k_values, accuracies, marker='o')
plt.xlabel("k values")
plt.ylabel("Accuracy")
plt.title("Accuracy vs. k values in KNN")
plt.show()
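
Once `accuracies` is filled, the best-scoring k can also be read off programmatically rather than from the plot. The sketch below uses made-up accuracy values purely for illustration (they are not results from the digits dataset), and the names `example_accuracies`, `best_k`, and `best_accuracy` are hypothetical.

```python
k_values = [1, 3, 5, 7, 9, 11, 13, 15]
# Hypothetical accuracies, one per k value; replace with the measured list
example_accuracies = [0.97, 0.98, 0.99, 0.98, 0.97, 0.97, 0.96, 0.96]

# Pair each k with its accuracy and keep the pair with the highest accuracy;
# on ties, max() returns the first (smallest) k encountered
best_k, best_accuracy = max(zip(k_values, example_accuracies), key=lambda pair: pair[1])
print(f"Best k: {best_k} (accuracy {best_accuracy * 100:.2f}%)")
```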