# k-Nearest Neighbors (k-NN) - Product Purchase Prediction
In this notebook, we will use the k-Nearest Neighbors (k-NN) algorithm to predict whether a person will purchase a product based on their **age**, **estimated salary**, and **credit score**.

We will proceed step by step: loading the dataset, exploring and visualizing it, then building and evaluating the model.

## Step 1: Import Required Libraries
We begin by importing the necessary Python libraries for data handling, visualization, and machine learning.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

print("Libraries imported successfully!")

## Step 2: Load and Explore the Dataset
We will load a dataset of customer records. Each row holds a customer's **age**, **estimated salary**, and **credit score**, along with a `purchased` label indicating whether they bought the product (`1`) or not (`0`).

In [None]:
# Load the dataset
data = pd.read_csv('data/purchase_data.csv')

# Show the first few rows
print(data.head())
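
Beyond `head()`, it can help to check column types, missing values, and class balance before modelling. The sketch below is only illustrative: the small DataFrame is a made-up stand-in for `data/purchase_data.csv`, assuming the same four column names used later in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for data/purchase_data.csv; all values are made up
data = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63],
    "estimated_salary": [21000.0, 54000.0, 88000.0, 97000.0, 39000.0, 72000.0],
    "credit_score": [580, 640, 720, 760, 610, 700],
    "purchased": [0, 0, 1, 1, 0, 1],
})

# Column types and number of missing values per column
print(data.dtypes)
missing_per_column = data.isna().sum()
print(missing_per_column)

# Share of each class in the target; a strong imbalance would make
# plain accuracy harder to interpret later on
class_share = data["purchased"].value_counts(normalize=True)
print(class_share)

# Summary statistics show how different the feature scales are
print(data.describe())
```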

## Step 3: Visualize the Data
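Let's visualize how **age** and **credit score** relate to the purchase decision, using one scatter plot per feature. Because `purchased` is binary, the points in each plot fall on two horizontal lines (`0` and `1`).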

In [None]:
# Age vs Purchased
plt.scatter(data['age'], data['purchased'], c='blue')
plt.xlabel('Age')
plt.ylabel('Purchased')
plt.title('Age vs Purchased')
plt.show()

# Credit Score vs Purchased
plt.scatter(data['credit_score'], data['purchased'], c='green')
plt.xlabel('Credit Score')
plt.ylabel('Purchased')
plt.title('Credit Score vs Purchased')
plt.show()
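
Since a binary target puts all points on two horizontal lines, a per-class average can serve as a numeric cross-check of what the plots suggest. This is a minimal sketch on made-up data, assuming the same column names as the dataset loaded above.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset; all values are made up
data = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63],
    "estimated_salary": [21000.0, 54000.0, 88000.0, 97000.0, 39000.0, 72000.0],
    "credit_score": [580, 640, 720, 760, 610, 700],
    "purchased": [0, 0, 1, 1, 0, 1],
})

# Mean of each feature for non-buyers (0) and buyers (1)
features = ["age", "estimated_salary", "credit_score"]
group_means = data.groupby("purchased")[features].mean()
print(group_means)
```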

## Step 4: Prepare the Data
Now we select the input features (`age`, `estimated_salary`, `credit_score`) and the target variable (`purchased`).
We then split the data into a training set (80%) and a test set (20%), so that the model can later be evaluated on rows it has not seen during training.

In [None]:
# Feature matrix and target vector
X = data[['age', 'estimated_salary', 'credit_score']]
y = data['purchased']

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print("✅ Data prepared and split into training and test sets.")
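
k-NN classifies by distance, so a feature measured in tens of thousands (salary) can outweigh one measured in tens (age). One common remedy is to standardize the features, fitting the scaler on the training set only. The sketch below assumes synthetic data and a made-up labelling rule; `stratify=y` is an optional addition that keeps the class proportions similar in both splits.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (values and label rule are made up)
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "estimated_salary": rng.uniform(15000, 150000, size=n),
    "credit_score": rng.integers(300, 851, size=n),
})
data["purchased"] = (data["estimated_salary"] > 80000).astype(int)

X = data[["age", "estimated_salary", "credit_score"]]
y = data["purchased"]

# stratify=y keeps the share of buyers similar in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Fit the scaler on the training data only, then apply it to both sets,
# so no information from the test set leaks into preprocessing
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0))  # close to 0 for every feature
print(X_train_scaled.std(axis=0))   # close to 1 for every feature
```

If scaling is used, the same fitted `scaler` has to be applied to any new row before calling `predict`.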

## Step 5: Train the k-NN Model
We create a `KNeighborsClassifier` with `n_neighbors=5` and call its `.fit()` method to train it on the training data.

In [None]:
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("✅ Model trained successfully.")
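
To make the idea behind the classifier concrete, here is a from-scratch sketch of the k-NN decision rule: measure the distance from a query point to every training point, take the `k` closest, and let them vote. The helper `knn_predict_one` and the toy arrays are hypothetical and only for illustration; this is not how scikit-learn implements the search internally.

```python
import numpy as np

def knn_predict_one(X_train, y_train, query, k=5):
    """Predict the label of one query point by majority vote (illustrative)."""
    # Euclidean distance from the query to every training point
    distances = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Count the labels among those neighbours; ties go to the lower label here
    votes = np.bincount(y_train[nearest])
    return int(np.argmax(votes))

# Two small, well-separated clusters (made-up values)
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

near_cluster_0 = knn_predict_one(X_train, y_train, np.array([1.0, 0.9]), k=3)
near_cluster_1 = knn_predict_one(X_train, y_train, np.array([5.1, 5.0]), k=3)
print(near_cluster_0, near_cluster_1)  # prints: 0 1
```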

## Step 6: Make Predictions
We use the trained model to predict the purchase label (`0` or `1`) for each row of the test set.

In [None]:
predictions = knn.predict(X_test)
print("✅ Predictions completed.")
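
Besides hard labels, `KNeighborsClassifier` also offers `predict_proba`, which with the default uniform weights reports the share of neighbours voting for each class. The sketch below uses a tiny made-up one-feature dataset rather than the notebook's data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy one-feature training set (made-up values)
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Two points deep inside a cluster and one in between
X_new = np.array([[1.5], [11.5], [6.0]])

labels = knn.predict(X_new)
probabilities = knn.predict_proba(X_new)  # columns follow knn.classes_

print(knn.classes_)
print(labels)
print(probabilities)
```

A row with a split vote signals a less certain prediction than a row where all neighbours agree.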

## Step 7: Evaluate the Model
We evaluate the model using accuracy: the fraction of test-set predictions that match the true labels.

In [None]:
accuracy = accuracy_score(y_test, predictions)
print(f"✅ Model Accuracy: {accuracy:.2f}")
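
Accuracy is a single number and does not show which kind of mistake the model makes, and `n_neighbors=5` is only one possible setting. The sketch below, on synthetic data with a made-up labelling rule, prints a confusion matrix and a classification report, then compares a few values of `k` with 5-fold cross-validation on the training set. Wrapping the classifier in a pipeline with `StandardScaler` is an assumption of this sketch, not part of the steps above.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: columns are age, estimated salary, credit score
rng = np.random.default_rng(0)
n = 300
X = np.column_stack([
    rng.integers(18, 70, size=n),
    rng.uniform(15000, 150000, size=n),
    rng.integers(300, 851, size=n),
]).astype(float)
y = ((X[:, 0] > 40) & (X[:, 2] > 600)).astype(int)  # made-up rule

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Scaling and k-NN in one estimator, so the scaler is fit on training data only
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, predictions)
print(cm)
print(classification_report(y_test, predictions))

# Compare several k values using cross-validation on the training set
cv_scores = {}
for k in (1, 3, 5, 7, 9, 11):
    candidate = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_scores[k] = cross_val_score(candidate, X_train, y_train, cv=5).mean()

best_k = max(cv_scores, key=cv_scores.get)
print(cv_scores)
print(f"Best k by cross-validated accuracy: {best_k}")
```

Choosing `k` on the training folds keeps the test set untouched for the final evaluation.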

## Step 8: Try Your Own Prediction
Now you can try a custom example. Enter the **age**, **estimated salary**, and **credit score**, and the model will predict whether the person would buy the product.

In [None]:
# User input
age_input = int(input("Enter age: "))
salary_input = float(input("Enter estimated salary: "))
credit_input = int(input("Enter credit score: "))

# Build a one-row DataFrame with the same feature names used in training,
# then make the prediction
new_data = pd.DataFrame([{
    'age': age_input,
    'estimated_salary': salary_input,
    'credit_score': credit_input
}])
prediction = knn.predict(new_data)[0]
label = 'Will Purchase' if prediction == 1 else 'Will Not Purchase'
print(f"Prediction: {label}")
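
Because `input()` pauses the notebook until something is typed, a small function can be handy for scripted or repeated predictions. The helper `predict_purchase`, its non-negativity check, and the toy training data below are all hypothetical; the sketch assumes a model trained on unscaled features with the same three column names.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

FEATURES = ["age", "estimated_salary", "credit_score"]

def predict_purchase(model, age, estimated_salary, credit_score):
    """Return a readable label for one customer (hypothetical helper)."""
    values = {
        "age": age,
        "estimated_salary": estimated_salary,
        "credit_score": credit_score,
    }
    # Basic sanity check; the allowed ranges are an assumption of this sketch
    for name, value in values.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    # One-row DataFrame with the same column order used for training
    row = pd.DataFrame([values], columns=FEATURES)
    prediction = model.predict(row)[0]
    return "Will Purchase" if prediction == 1 else "Will Not Purchase"

# Toy training data (made-up values) so the example runs on its own
train = pd.DataFrame({
    "age": [22, 25, 28, 30, 48, 52, 55, 60],
    "estimated_salary": [20000.0, 22000.0, 25000.0, 27000.0,
                         90000.0, 95000.0, 99000.0, 105000.0],
    "credit_score": [550, 560, 580, 600, 720, 740, 760, 780],
    "purchased": [0, 0, 0, 0, 1, 1, 1, 1],
})
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(train[FEATURES], train["purchased"])

print(predict_purchase(knn, 50, 92000.0, 730))  # prints: Will Purchase
print(predict_purchase(knn, 24, 21000.0, 555))  # prints: Will Not Purchase
```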