# Case Study 2.C.02 — Mobile Phone Price Classification

This notebook mirrors the Python scripts and is designed for classroom demos.

## Goals
1. Create scatter plots colored by **Price Range**.
2. Train and evaluate two classifiers using only **Battery Power** and **Front Camera Megapixels**:
   - KNeighborsClassifier (with StandardScaler)
   - RandomForestClassifier

Dataset file: `K4.0026_2.C.02_MobilePhone.csv`


In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers the '3d' projection on older Matplotlib)

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

DATA_PATH = "/mnt/data/K4.0026_2.C.02_MobilePhone.csv"

In [None]:
# Load the dataset
df = pd.read_csv(DATA_PATH)
df.head()

## Encode labels for coloring
We map `Price Range` to integers: `{'l': 0, 'm': 1, 'h': 2}`.
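
As a hedged illustration of why this mapping is worth checking, the sketch below applies the same dictionary to a small made-up Series. The values, including the stray `'x'`, are hypothetical and not taken from the dataset. `Series.map` with a dict returns `NaN` for any key it does not find, so an unexpected label would otherwise flow into the `c=` argument unnoticed.

```python
import pandas as pd

label_map = {'l': 0, 'm': 1, 'h': 2}

# Hypothetical labels; 'x' stands in for an unexpected category
toy = pd.Series(['l', 'h', 'm', 'x', 'l'], name='Price Range')

codes = toy.map(label_map)  # labels missing from label_map become NaN
unmapped = sorted(toy[codes.isna()].unique())

print(codes.tolist())
print("Unmapped labels:", unmapped)
```

Running the same `isna()` check on `df['Price Range'].map(label_map)` before plotting would confirm that every row received a color code.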

In [None]:
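# Integer codes used only for coloring the scatter plots
label_map = {'l': 0, 'm': 1, 'h': 2}
# Any label missing from label_map becomes NaN here
y_color = df['Price Range'].map(label_map).values
# Show the raw labels next to their integer codes as a sanity check
np.unique(df['Price Range']), np.unique(y_color)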

## Part 1 — Visualization
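
The plots in this part pass integer codes to `c=`, which colors the points but does not say which color belongs to which class. One possible way to add a legend is to draw one `scatter` call per class, as in the minimal sketch below. It uses synthetic stand-in data (the feature values and labels are randomly generated assumptions, not the case-study dataset).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import numpy as np

label_map = {'l': 0, 'm': 1, 'h': 2}

# Synthetic stand-in data (assumption): 60 random points with random labels
rng = np.random.default_rng(0)
x = rng.uniform(500, 2000, size=60)
y = rng.uniform(2, 64, size=60)
labels = rng.choice(list(label_map), size=60)

fig, ax = plt.subplots(figsize=(7, 5))
# One scatter call per class gives each class its own color and legend entry
for name in label_map:
    mask = labels == name
    ax.scatter(x[mask], y[mask], s=30, label=name)
ax.set_xlabel('Feature A (synthetic)')
ax.set_ylabel('Feature B (synthetic)')
legend = ax.legend(title='Price Range')
ax.grid(True)
fig.tight_layout()
```

In a notebook, finishing the cell with `plt.show()` displays the figure. The same loop could be run over boolean masks built from `df['Price Range']`.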

In [None]:
# 1) Battery Power vs Internal Memory
plt.figure(figsize=(7,5))
plt.scatter(df['battery_power'], df['int_memory'], c=y_color, s=30)
plt.xlabel('Battery Power')
plt.ylabel('Internal Memory (GB)')
plt.title('Battery Power vs Internal Memory (colored by Price Range)')
plt.grid(True)
plt.tight_layout()
plt.show()

In [None]:
# 2) Front Camera Megapixels vs Bluetooth
plt.figure(figsize=(7,5))
plt.scatter(df['frontcamermegapixels'], df['blue'], c=y_color, s=30)
plt.xlabel('Front Camera Megapixels')
plt.ylabel('Bluetooth (0/1)')
plt.title('Front Camera Megapixels vs Bluetooth (colored by Price Range)')
plt.grid(True)
plt.tight_layout()
plt.show()

In [None]:
# 3) Internal Memory vs Front Camera Megapixels
plt.figure(figsize=(7,5))
plt.scatter(df['int_memory'], df['frontcamermegapixels'], c=y_color, s=30)
plt.xlabel('Internal Memory (GB)')
plt.ylabel('Front Camera Megapixels')
plt.title('Internal Memory vs Front Camera Megapixels (colored by Price Range)')
plt.grid(True)
plt.tight_layout()
plt.show()

In [None]:
# 4) 3D: Battery Power, Bluetooth, Front Camera Megapixels
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(df['battery_power'], df['blue'], df['frontcamermegapixels'], c=y_color, s=30)
ax.set_xlabel('Battery Power')
ax.set_ylabel('Bluetooth (0/1)')
ax.set_zlabel('Front Camera Megapixels')
ax.set_title('3D: Battery Power, Bluetooth, Front Camera Megapixels (colored by Price Range)')
plt.tight_layout()
plt.show()

## Part 2 — Classification (KNN and Random Forest)
We use **only** `battery_power` and `frontcamermegapixels` as features, per the case study.
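
The KNN model is paired with `StandardScaler` because KNN relies on distances, and a feature with a much larger numeric range can dominate them. The sketch below illustrates the effect on synthetic data only. The feature ranges and the rule that generates the class are invented for the demonstration and say nothing about how the real dataset behaves.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): one wide-range feature that carries
# no signal and one narrow-range feature that determines the class
rng = np.random.default_rng(42)
n = 600
wide = rng.uniform(500, 2000, size=n)   # uninformative, large scale
narrow = rng.uniform(0, 20, size=n)     # informative, small scale
X_demo = np.column_stack([wide, narrow])
y_demo = (narrow > 10).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=42
)

# Same KNN settings, with and without standardization
knn_raw = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
knn_scaled = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]).fit(X_tr, y_tr)

acc_raw = knn_raw.score(X_te, y_te)
acc_scaled = knn_scaled.score(X_te, y_te)
print(f"KNN without scaling: {acc_raw:.3f}")
print(f"KNN with scaling:    {acc_scaled:.3f}")
```

Random forests split on one feature at a time, so they are not affected by feature scale in this way and need no scaler.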

In [None]:
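# Feature matrix: the two columns named in the case study
X = df[['battery_power', 'frontcamermegapixels']].values
y_str = df['Price Range'].values
# LabelEncoder sorts classes alphabetically, so its codes (see le.classes_)
# need not match label_map, which is used only for plot colors
le = LabelEncoder().fit(y_str)
y = le.transform(y_str)
# Stratified 75/25 split approximately preserves the class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)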

In [None]:
# KNN with StandardScaler
# The scaler lives inside the pipeline, so it is fit on the training split only
knn_pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5))
])
knn_pipe.fit(X_train, y_train)
y_pred_knn = knn_pipe.predict(X_test)
print("KNN Accuracy:", round(accuracy_score(y_test, y_pred_knn), 4))
# Rows are true classes, columns are predicted classes, both ordered as le.classes_
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_knn))
print("Classification Report:\n", classification_report(y_test, y_pred_knn, target_names=le.classes_))

In [None]:
# Random Forest (tree splits do not depend on feature scale, so no scaler is used)
rf = RandomForestClassifier(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)
print("RandomForest Accuracy:", round(accuracy_score(y_test, y_pred_rf), 4))
# Rows are true classes, columns are predicted classes, both ordered as le.classes_
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_rf))
print("Classification Report:\n", classification_report(y_test, y_pred_rf, target_names=le.classes_))