# K-Nearest Neighbors (KNN)
K-Nearest Neighbors (KNN) is a simple, non-parametric algorithm used for both classification and regression. In KNN classification, a query point is assigned the majority class among its K nearest training samples in feature space. scikit-learn's `KNeighborsClassifier` measures closeness with Euclidean distance by default.
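
The majority-vote rule can be sketched from scratch in a few lines. This is a minimal illustration, not how scikit-learn implements it: the function name `knn_predict` and the toy arrays `X_toy` and `y_toy` are hypothetical, and Euclidean distance is assumed.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify one query point by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those k neighbors
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small, well-separated clusters
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, np.array([0.15, 0.15]), k=3))  # 0
print(knn_predict(X_toy, y_toy, np.array([0.95, 0.95]), k=3))  # 1
```

Note that the distance step touches every training sample for each query, which is why a naive implementation like this one slows down as the training set grows.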

```python
# K-Nearest Neighbors (KNN) Classification Notebook

# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Generate synthetic binary classification data
np.random.seed(0)
X = np.random.rand(100, 2)  # 100 samples with 2 features
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # Class 1 if the sum of the features is greater than 1

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train the KNN classifier
k = 5  # Number of neighbors
model = KNeighborsClassifier(n_neighbors=k)
model.fit(X_train, y_train)

# Predict using the model
y_pred = model.predict(X_test)

# Evaluate model performance
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

# Print evaluation results
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", class_report)

# Create a mesh grid covering the feature space for decision boundary visualization
x_min, x_max = X[:, 0].min() - 0.1, X[:, 0].max() + 0.1
y_min, y_max = X[:, 1].min() - 0.1, X[:, 1].max() + 0.1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01), np.arange(y_min, y_max, 0.01))

# Predict the class at every grid point and reshape back to the grid
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Draw the decision regions first so the test points are plotted on top of them;
# the 'RdBu' colormap shades class 0 red and class 1 blue, matching the scatter colors
plt.contourf(xx, yy, Z, alpha=0.3, cmap='RdBu')

# Overlay the test points, colored by their true class
plt.scatter(X_test[y_test == 0][:, 0], X_test[y_test == 0][:, 1], color='red', label='Class 0')
plt.scatter(X_test[y_test == 1][:, 0], X_test[y_test == 1][:, 1], color='blue', label='Class 1')

plt.title('KNN Decision Boundary')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()
```

### Explanation of Code Components

1. **Data Generation**: 100 synthetic points with two features are drawn uniformly from the unit square. A point is labeled class 1 if the sum of its two features exceeds 1, and class 0 otherwise.

2. **Data Splitting**: The dataset is split into training (80%) and testing (20%) sets, with a fixed `random_state` so the split is reproducible.

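3. **Model Training**: A KNN classifier is fitted to the training data with K = 5 neighbors. KNN is a lazy learner, so fitting mainly stores the training samples and defers the distance computations to prediction time.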

4. **Prediction**: Predictions are made on the test set.

5. **Model Evaluation**:
   - **Accuracy**: The proportion of correctly classified instances.
   - **Confusion Matrix**: A table of counts in which rows are the true classes and columns are the predicted classes, so off-diagonal entries are misclassifications.
   - **Classification Report**: Includes precision, recall, and F1-score for each class.

6. **Visualization**: A scatter plot shows the true classes of the test points. The model's decision regions are drawn as filled contours by predicting the class at every point of a fine grid over the feature space.
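
The evaluation metrics in step 5 can all be derived from the confusion matrix. The sketch below uses small hypothetical label arrays (`y_true` and `y_hat` are made up for illustration and are not the notebook's results) and treats class 1 as the positive class.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels, for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_hat = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 0])

# scikit-learn convention: rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # fraction of all predictions that are correct
precision = tp / (tp + fp)        # of the points predicted as class 1, how many are class 1
recall = tp / (tp + fn)           # of the true class 1 points, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(cm)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

# The hand-computed accuracy agrees with scikit-learn's helper
assert np.isclose(accuracy, accuracy_score(y_true, y_hat))
```

`classification_report` prints the same precision, recall, and F1 values for each class in turn, along with their averages.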

### Note
- The choice of K (number of neighbors) can significantly impact model performance: a small K tends to follow noise in the training data, while a large K smooths over local structure. K is commonly chosen by cross-validation.
- KNN can be computationally intensive for large datasets, since a brute-force search computes the distance from each query point to every training sample at prediction time.
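
One way to choose K by cross-validation is sketched below. It assumes the same kind of synthetic data as the notebook; the candidate list `candidate_ks` and the use of 5 folds are illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data of the same form as in the notebook
rng = np.random.RandomState(0)
X_cv = rng.rand(100, 2)
y_cv = (X_cv[:, 0] + X_cv[:, 1] > 1).astype(int)

# Hypothetical candidate values; odd K avoids tied votes in binary problems
candidate_ks = [1, 3, 5, 7, 9, 11]

mean_scores = {}
for k in candidate_ks:
    # 5-fold cross-validated accuracy for this value of K
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X_cv, y_cv, cv=5)
    mean_scores[k] = scores.mean()

# Pick the K with the highest mean accuracy across folds
best_k = max(mean_scores, key=mean_scores.get)

for k, score in mean_scores.items():
    print(f"k={k}: mean CV accuracy = {score:.3f}")
print("Best k:", best_k)
```

In practice the search would run on the training split only, so the held-out test set remains an unbiased check of the final model.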