# Practical Exercise: Predicting Student Grades with k-Nearest Neighbors
In this exercise, you will use the **k-Nearest Neighbors (k-NN)** algorithm to predict each student's grade category from study-habit and lifestyle features.

**Dataset features:**
- `hours_studied`: daily hours dedicated to studying
- `sleep_hours`: average sleep time per night
- `coffee_intake`: 1 if drinks coffee, 0 otherwise

**Target variable:**
- `grade_category`: 0 = Low, 1 = Medium, 2 = High


## Step 1: Load Required Libraries

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report
print("✅ Libraries loaded.")

✅ Libraries loaded.


## Step 2: Load and Inspect the Dataset

In [2]:
df = pd.read_csv('data/knn_student_grades.csv')
print("📄 First rows of the dataset:")
print(df.head())

print("\n📊 Dataset description:")
print(df.describe())

📄 First rows of the dataset:
   hours_studied  sleep_hours  coffee_intake  grade_category
0            3.3          5.5              0               1
1            6.6          7.1              0               1
2            4.6          4.6              0               1
3            2.5          4.6              0               0
4            6.7          6.2              0               1

📊 Dataset description:
       hours_studied  sleep_hours  coffee_intake  grade_category
count     160.000000   160.000000     160.000000      160.000000
mean        4.856875     6.778125       0.481250        1.325000
std         2.076663     1.392282       0.501217        0.532586
min         0.400000     4.000000       0.000000        0.000000
25%         3.275000     5.800000       0.000000        1.000000
50%         5.000000     6.900000       0.000000        1.000000
75%         6.600000     7.800000       1.000000        2.000000
max         9.000000    10.000000       1.000000        2.000000

## Step 3: Prepare the Data

In [3]:
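# Features (model inputs) and target (the label to predict)
X = df[['hours_studied', 'sleep_hours', 'coffee_intake']]
y = df['grade_category']

# Hold out 25% of the rows for testing; random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)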
print("✅ Data split completed.")

✅ Data split completed.
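
The split above shuffles rows without looking at the labels, so a rare class can end up with very few test samples (the report in Step 5 shows only one Low student in the test set). One common option is the `stratify` parameter of `train_test_split`, which keeps class proportions roughly equal in both parts. The sketch below is only an illustration: `X_demo` and `y_demo` are made-up stand-ins with an assumed class balance, not the real `knn_student_grades.csv` data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 160 rows, 3 features, imbalanced 3-class label.
# The class counts (8 / 96 / 56) are an assumption chosen for illustration.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(160, 3))
y_demo = np.array([0] * 8 + [1] * 96 + [2] * 56)

# stratify=y_demo asks for the same class proportions in train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)

# Count how many samples of each class landed in each part
train_counts = np.bincount(y_tr, minlength=3)
test_counts = np.bincount(y_te, minlength=3)
print("train class counts:", train_counts)
print("test class counts: ", test_counts)
```

Applied to this exercise, that would mean passing `stratify=y` in Step 3. Doing so changes which rows land in the test set, so the numbers printed in the later steps would change as well.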


## Step 4: Train the k-NN Model

In [4]:
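# Each prediction is a majority vote among the 3 nearest training samples
model = KNeighborsClassifier(n_neighbors=3)
# For k-NN, fit() mainly stores (and indexes) the training data; the neighbor search runs at prediction time
model.fit(X_train, y_train)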
print("✅ Model training complete.")

✅ Model training complete.
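
To make the idea behind the classifier concrete, here is a minimal from-scratch sketch of the k-NN decision rule: measure the Euclidean distance from the query to every training point, keep the `k` closest, and return the most common label among them. The `knn_predict` helper and the toy rows are hypothetical and exist only for illustration. This is not scikit-learn's implementation, which uses faster neighbor search and may break ties differently.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair every training point's Euclidean distance to the query with its label
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k nearest points and take a majority vote
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up rows in the order (hours_studied, sleep_hours, coffee_intake)
toy_points = [
    (2.0, 5.0, 0), (2.5, 4.5, 0),   # labeled 0 (Low)
    (5.0, 7.0, 1), (5.5, 6.5, 0),   # labeled 1 (Medium)
    (8.0, 8.0, 1), (8.5, 7.5, 1),   # labeled 2 (High)
]
toy_labels = [0, 0, 1, 1, 2, 2]

toy_prediction = knn_predict(toy_points, toy_labels, query=(5.2, 6.8, 1), k=3)
print("Predicted class:", toy_prediction)
```

Because the vote depends on raw distances, a feature measured on a larger scale contributes more to the result. Standardizing the features before fitting is a common way to balance their influence.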


## Step 5: Evaluate the Model

In [5]:
# Predict labels for the held-out test set, then compare them with the true labels
y_pred = model.predict(X_test)
print(f"🔍 Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

🔍 Accuracy: 0.93

Classification Report:
              precision    recall  f1-score   support

           0       0.50      1.00      0.67         1
           1       0.92      1.00      0.96        24
           2       1.00      0.80      0.89        15

    accuracy                           0.93        40
   macro avg       0.81      0.93      0.84        40
weighted avg       0.94      0.93      0.93        40
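
The figures in the report can be traced back to raw counts. Classes 0 and 1 both have recall 1.00 and class 2 has precision 1.00, so the rounded values above are consistent with a single confusion matrix. The sketch below writes that matrix out by hand and recomputes the per-class metrics. The matrix is inferred from the report and was not printed by the notebook.

```python
# Confusion matrix reconstructed from the rounded report above (an inference).
# Rows are true classes, columns are predicted classes.
confusion = [
    [1, 0, 0],    # true 0 (Low)
    [0, 24, 0],   # true 1 (Medium)
    [1, 2, 12],   # true 2 (High)
]

metrics = {}
for c in range(len(confusion)):
    true_positives = confusion[c][c]
    predicted_as_c = sum(row[c] for row in confusion)   # column sum
    actually_c = sum(confusion[c])                      # row sum (support)
    precision = true_positives / predicted_as_c
    recall = true_positives / actually_c
    f1 = 2 * precision * recall / (precision + recall)
    metrics[c] = {"precision": precision, "recall": recall,
                  "f1": f1, "support": actually_c}
    print(f"class {c}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={actually_c}")

# Accuracy is the share of samples on the diagonal
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[c][c] for c in range(len(confusion))) / total
print(f"accuracy={accuracy:.3f}")
```

The precision of 0.50 for class 0 rests on just two predictions, one of them wrong, so it says little about how well the model identifies Low students in general.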



## Step 6: Try a Custom Student Profile

In [6]:
print("🔍 Enter student information to predict grade category:")
study = float(input("Hours studied per day: "))
sleep = float(input("Hours of sleep per night: "))
coffee = int(input("Drinks coffee? (0 = No, 1 = Yes): "))

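# Wrap the inputs in a one-row DataFrame with the same column names used for training
student_df = pd.DataFrame([[study, sleep, coffee]], columns=X.columns)
pred = model.predict(student_df)[0]
# Map the numeric class (0, 1, 2) to its readable label
label = ['Low', 'Medium', 'High'][pred]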
print(f"Prediction: 📘 The predicted grade category is **{label}**")

🔍 Enter student information to predict grade category:


Hours studied per day:  3
Hours of sleep per night:  8
Drinks coffee? (0 = No, 1 = Yes):  1


Prediction: 📘 The predicted grade category is **Medium**
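
To see why a profile gets a particular label, it can help to look at the neighbors behind the vote. With the default uniform weights, `predict_proba` returns the fraction of the `k` neighbors in each class, and `kneighbors` returns the distances and row positions of those neighbors. The sketch below uses a tiny made-up training set (`toy_X`, `toy_y`) as an assumption, not the exercise's fitted `model`.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

columns = ['hours_studied', 'sleep_hours', 'coffee_intake']

# Hypothetical toy training rows, for illustration only
toy_X = pd.DataFrame(
    [[2.0, 5.0, 0], [2.5, 4.5, 0],
     [5.0, 7.0, 1], [5.5, 6.5, 0],
     [8.0, 8.0, 1], [8.5, 7.5, 1]],
    columns=columns,
)
toy_y = [0, 0, 1, 1, 2, 2]

toy_model = KNeighborsClassifier(n_neighbors=3)
toy_model.fit(toy_X, toy_y)

query = pd.DataFrame([[5.2, 6.8, 1]], columns=columns)

# Share of the 3 neighbors that belong to each class (ordered as toy_model.classes_)
proba = toy_model.predict_proba(query)[0]
# Distances to, and row positions of, the 3 nearest training samples
distances, indices = toy_model.kneighbors(query)

print("classes:       ", toy_model.classes_)
print("vote shares:   ", proba)
print("neighbor rows: ", indices[0])
print("distances:     ", distances[0].round(3))
```

In the exercise itself, the same two calls could be made on `model` with `student_df` to check how close the vote was for the entered profile.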
