# 📱 Mobile Price Range Prediction using Random Forest + RandomizedSearchCV

_Predicting phone price ranges using machine learning with clean code, hyperparameter tuning, and ✨vibes✨._

## 🔍 Introduction

In this notebook, we use a dataset of mobile phone specifications to **predict the price range** a phone falls into: budget, mid-range, or high-end.

We’ll train a `RandomForestClassifier` and use `RandomizedSearchCV` to find the best hyperparameters. 🌳✨

This task mirrors a common **e-commerce** problem: recommending or auto-tagging a price range from a product's specifications.


## 📚 Importing Libraries

In [1]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import RandomizedSearchCV

## 🧼 Data Overview + Preprocessing

In [2]:
training_df = pd.read_csv('/kaggle/input/mobile-price-classification/train.csv')
testing_df = pd.read_csv('/kaggle/input/mobile-price-classification/test.csv')

In [3]:
training_df.head()

,battery_power,blue,clock_speed,dual_sim,fc,four_g,int_memory,m_dep,mobile_wt,n_cores,...,px_height,px_width,ram,sc_h,sc_w,talk_time,three_g,touch_screen,wifi,price_range
0,842,0,2.2,0,1,0,7,0.6,188,2,...,20,756,2549,9,7,19,0,0,1,1
1,1021,1,0.5,1,0,1,53,0.7,136,3,...,905,1988,2631,17,3,7,1,1,0,2
2,563,1,0.5,1,2,1,41,0.9,145,5,...,1263,1716,2603,11,2,9,1,1,0,2
3,615,1,2.5,0,0,0,10,0.8,131,6,...,1216,1786,2769,16,8,11,1,0,0,2
4,1821,1,1.2,0,13,1,44,0.6,141,2,...,1208,1212,1411,8,2,15,1,1,0,1


In [4]:
training_df.isnull().sum()

battery_power    0
blue             0
clock_speed      0
dual_sim         0
fc               0
four_g           0
int_memory       0
m_dep            0
mobile_wt        0
n_cores          0
pc               0
px_height        0
px_width         0
ram              0
sc_h             0
sc_w             0
talk_time        0
three_g          0
touch_screen     0
wifi             0
price_range      0
dtype: int64

In [5]:
testing_df.columns

Index(['id', 'battery_power', 'blue', 'clock_speed', 'dual_sim', 'fc',
       'four_g', 'int_memory', 'm_dep', 'mobile_wt', 'n_cores', 'pc',
       'px_height', 'px_width', 'ram', 'sc_h', 'sc_w', 'talk_time', 'three_g',
       'touch_screen', 'wifi'],
      dtype='object')

In [6]:
training_df.columns

Index(['battery_power', 'blue', 'clock_speed', 'dual_sim', 'fc', 'four_g',
       'int_memory', 'm_dep', 'mobile_wt', 'n_cores', 'pc', 'px_height',
       'px_width', 'ram', 'sc_h', 'sc_w', 'talk_time', 'three_g',
       'touch_screen', 'wifi', 'price_range'],
      dtype='object')
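
The two column listings above differ only in `id` (test set) and `price_range` (training set). A quick way to confirm that is a set comparison. The sketch below copies the column names from the outputs above instead of reading the CSV files, so it runs anywhere; `feature_cols`, `train_cols`, and `test_cols` are illustrative names, not part of the notebook.

```python
# Column names copied from the two outputs above (not read from the CSVs).
feature_cols = [
    "battery_power", "blue", "clock_speed", "dual_sim", "fc", "four_g",
    "int_memory", "m_dep", "mobile_wt", "n_cores", "pc", "px_height",
    "px_width", "ram", "sc_h", "sc_w", "talk_time", "three_g",
    "touch_screen", "wifi",
]
train_cols = feature_cols + ["price_range"]
test_cols = ["id"] + feature_cols

# Columns that exist on one side only.
only_in_train = sorted(set(train_cols) - set(test_cols))
only_in_test = sorted(set(test_cols) - set(train_cols))

# After dropping those, the feature order should match so that a scaler
# fitted on the training features lines up with the test features.
same_order = (
    [c for c in train_cols if c != "price_range"]
    == [c for c in test_cols if c != "id"]
)

print(only_in_train, only_in_test, same_order)
```

With the real DataFrames, the same check would use `training_df.columns` and `testing_df.columns` in place of the hard-coded lists.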

In [7]:
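# Features and target from the labelled training data
X = training_df.drop(['price_range'], axis=1)
y = training_df['price_range']
# Despite its name, y_test holds the test *features*; the test set has no labels
y_test = testing_df.drop(['id'], axis=1)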

In [8]:
scaler = StandardScaler()

In [9]:
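X_train = scaler.fit_transform(X)   # learn mean/std from the training features only
y_test = scaler.transform(y_test)   # apply the same training statistics to the test features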
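
One possible refinement: because the scaler above is fitted on the whole training set before cross-validation, every validation fold has already contributed to the scaling statistics. Wrapping the scaler and the model in a `Pipeline` refits the scaler inside each fold. The sketch below uses synthetic data from `make_classification` as a stand-in for the phone dataset (the 4 classes, 400 rows, and 50 trees are assumptions chosen for speed), and `pipe`, `X_demo`, and `y_demo` are illustrative names.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the phone data (assumed: 20 features, 4 classes).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=20, n_informative=8, n_classes=4, random_state=42
)

# The scaler is now a pipeline step, so it is refitted on each training fold.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
])

# 5-fold cross-validated accuracy of the whole pipeline.
scores = cross_val_score(pipe, X_demo, y_demo, cv=5)
print(scores.mean())
```

Tree-based models split on thresholds, so they are largely insensitive to feature scaling. The pipeline matters more if the random forest is later swapped for a scale-sensitive model.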

## 🌳 Model: Random Forest + RandomizedSearchCV

In [10]:
rf = RandomForestClassifier()

In [11]:
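# Fraction of features considered at each split
max_features = [0.2, 0.4, 0.5, 0.7, 1.0]
# bootstrap = [True, False]  # not searched: max_samples only applies when bootstrap=True (the default)
# Fraction of training rows drawn for each tree
max_samples = [0.2, 0.4, 0.5, 0.7, 1.0]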

In [12]:
parameters = {'max_features' : max_features,
             'max_samples' : max_samples}
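
To get a feel for the size of this search space, the combinations can be counted with `ParameterGrid`. This is a small sketch that repeats the two lists from the cells above; `n_combinations` is an illustrative name.

```python
from sklearn.model_selection import ParameterGrid

# Same candidate values as in the cells above.
max_features = [0.2, 0.4, 0.5, 0.7, 1.0]
max_samples = [0.2, 0.4, 0.5, 0.7, 1.0]
parameters = {"max_features": max_features, "max_samples": max_samples}

# ParameterGrid enumerates every combination of the listed values.
n_combinations = len(ParameterGrid(parameters))
print(n_combinations)
```

When every parameter is given as a list, `RandomizedSearchCV` samples without replacement, so a search with `n_iter=20` would try 20 distinct combinations out of this grid.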

In [13]:
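search = RandomizedSearchCV(
    estimator=rf,
    param_distributions=parameters,
    n_iter=20,        # try 20 of the 25 possible combinations
    cv=5,             # 5-fold cross-validation
    n_jobs=-1,        # use all available CPU cores
    random_state=42   # make the sampled combinations reproducible
)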

In [14]:
search.fit(X_train,y)

In [15]:
search.best_params_

{'max_samples': 0.7, 'max_features': 0.7}

In [16]:
search.best_score_

0.9035
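
Beyond the single best score, `cv_results_` holds the mean and spread for every sampled combination, which helps judge whether the winner is clearly better or just noise. The sketch below runs a much smaller search on synthetic data so it finishes quickly. The data, the reduced grid, and the names `demo_search`, `results`, and `top` are all assumptions for illustration, not the notebook's actual run.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the phone data (assumed: 20 features, 4 classes).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=4, random_state=42
)

# A deliberately small search: 5 of 9 combinations, 3 folds, 30 trees.
demo_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(n_estimators=30, random_state=42),
    param_distributions={
        "max_features": [0.3, 0.6, 0.9],
        "max_samples": [0.3, 0.6, 0.9],
    },
    n_iter=5,
    cv=3,
    random_state=42,
)
demo_search.fit(X_demo, y_demo)

# One row per sampled combination, best-ranked first.
results = pd.DataFrame(demo_search.cv_results_)
cols = [
    "param_max_features", "param_max_samples",
    "mean_test_score", "std_test_score", "rank_test_score",
]
top = results[cols].sort_values("rank_test_score")
print(top)
```

In the notebook itself, the same table would come from `pd.DataFrame(search.cv_results_)`.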

## Model Training

In [17]:
rf.set_params(**search.best_params_)  # apply the tuned hyperparameters found by the search
rf.fit(X_train, y)

## Model Prediction

In [18]:
y_pred = rf.predict(y_test)
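
Since the test set has no labels, these predictions cannot be scored directly. One option is to hold out part of the labelled training data and look at accuracy and the confusion matrix there. The sketch below does this on synthetic data (an assumed stand-in with 4 classes) using the `max_features=0.7` and `max_samples=0.7` values reported by the search; `model`, `X_val`, and `cm` are illustrative names.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the phone data (assumed: 20 features, 4 classes).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=20, n_informative=8, n_classes=4, random_state=42
)

# Keep 20% of the labelled rows aside; stratify to preserve class balance.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

model = RandomForestClassifier(
    n_estimators=50, max_features=0.7, max_samples=0.7, random_state=42
)
model.fit(X_tr, y_tr)

val_pred = model.predict(X_val)
val_accuracy = accuracy_score(y_val, val_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_val, val_pred)
print(val_accuracy)
print(cm)
```

Off-diagonal cells in the confusion matrix show which price ranges get mixed up, which a single accuracy number hides.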

## 📁 Submission File

In [19]:
submission = pd.DataFrame({
    "id": testing_df["id"],  # reuse the ids shipped with the test set
    "price_range": y_pred
})


submission.to_csv("submission.csv", index=False)
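
Before uploading, it can be worth reading the file back and checking its shape. The sketch below writes a tiny made-up submission to a temporary directory and runs a few checks. The ids and predictions are hypothetical values, and `demo_submission` and `check` are illustrative names.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical ids and predictions standing in for the real ones.
demo_ids = pd.Series([1, 2, 3, 4, 5], name="id")
demo_pred = np.array([0, 3, 2, 1, 3])
demo_submission = pd.DataFrame({"id": demo_ids, "price_range": demo_pred})

# Write to a temporary folder and read the file back as the grader would.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "submission.csv"
    demo_submission.to_csv(path, index=False)
    check = pd.read_csv(path)

columns_ok = list(check.columns) == ["id", "price_range"]
no_missing = not check.isnull().any().any()
ids_unique = check["id"].is_unique
row_count_ok = len(check) == len(demo_pred)

print(columns_ok, no_missing, ids_unique, row_count_ok)
```

For the real file, the same checks would run on `pd.read_csv("submission.csv")`, with the row count compared against `len(testing_df)`.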
