## Laptop Prices Prediction using Random Forest Regressor

## 1. Introduction

In this project, we analyze a dataset of laptops that lists each machine's specifications and its price in euros (`Price_euros`). The goal is to build a Random Forest Regressor that predicts one target column from the remaining features.

## 2. Libraries Used

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

## 3. Dataset Overview

In [2]:
data = pd.read_csv(r"C:\Users\devad\Downloads\laptop_prices.csv")
data.head()

| | Company | Product | TypeName | Inches | Ram | OS | Weight | Price_euros | Screen | ScreenW | ... | RetinaDisplay | CPU_company | CPU_freq | CPU_model | PrimaryStorage | SecondaryStorage | PrimaryStorageType | SecondaryStorageType | GPU_company | GPU_model |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Apple | MacBook Pro | Ultrabook | 13.3 | 8 | macOS | 1.37 | 1339.69 | Standard | 2560 | ... | Yes | Intel | 2.3 | Core i5 | 128 | 0 | SSD | No | Intel | Iris Plus Graphics 640 |
| 1 | Apple | Macbook Air | Ultrabook | 13.3 | 8 | macOS | 1.34 | 898.94 | Standard | 1440 | ... | No | Intel | 1.8 | Core i5 | 128 | 0 | Flash Storage | No | Intel | HD Graphics 6000 |
| 2 | HP | 250 G6 | Notebook | 15.6 | 8 | No OS | 1.86 | 575.0 | Full HD | 1920 | ... | No | Intel | 2.5 | Core i5 7200U | 256 | 0 | SSD | No | Intel | HD Graphics 620 |
| 3 | Apple | MacBook Pro | Ultrabook | 15.4 | 16 | macOS | 1.83 | 2537.45 | Standard | 2880 | ... | Yes | Intel | 2.7 | Core i7 | 512 | 0 | SSD | No | AMD | Radeon Pro 455 |
| 4 | Apple | MacBook Pro | Ultrabook | 13.3 | 8 | macOS | 1.37 | 1803.6 | Standard | 2560 | ... | Yes | Intel | 3.1 | Core i5 | 256 | 0 | SSD | No | Intel | Iris Plus Graphics 650 |


**Dataset Details**:

- The dataset contains information about various laptops, including their brand, specifications, and price.

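- The dataset has 23 columns: a mix of numeric fields (such as `Inches`, `Ram`, `Weight`, and `Price_euros`) and categorical fields (such as `Company`, `OS`, and `CPU_model`).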

- Some categorical columns are converted into numeric form using LabelEncoder.
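The column overview described above can be sketched on a small sample. The `sample` frame below is an assumption for illustration: it copies a few columns from the `data.head()` output rather than loading the full CSV, and the name `categorical_cols` is hypothetical.

```python
import pandas as pd

# A few rows and columns copied from the data.head() output above;
# the notebook itself loads the full CSV instead.
sample = pd.DataFrame({
    "Company": ["Apple", "Apple", "HP", "Apple", "Apple"],
    "TypeName": ["Ultrabook", "Ultrabook", "Notebook", "Ultrabook", "Ultrabook"],
    "Inches": [13.3, 13.3, 15.6, 15.4, 13.3],
    "Ram": [8, 8, 8, 16, 8],
    "Weight": [1.37, 1.34, 1.86, 1.83, 1.37],
    "Price_euros": [1339.69, 898.94, 575.0, 2537.45, 1803.6],
})

print(sample.shape)   # (rows, columns)
print(sample.dtypes)  # text columns show a non-numeric dtype

# Non-numeric columns are the ones that need encoding before training
categorical_cols = sample.select_dtypes(exclude="number").columns.tolist()
print(categorical_cols)

# Pairwise correlation between the numeric columns only
print(sample.select_dtypes(include="number").corr())
```

Running the same three calls on the full `data` frame would give the column count, the columns that need encoding, and a first look at how the numeric features relate to each other.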

## 4. Data Preprocessing

### Checking Null Values:

In [3]:
print(data.isnull().sum())

Company                 0
Product                 0
TypeName                0
Inches                  0
Ram                     0
OS                      0
Weight                  0
Price_euros             0
Screen                  0
ScreenW                 0
ScreenH                 0
Touchscreen             0
IPSpanel                0
RetinaDisplay           0
CPU_company             0
CPU_freq                0
CPU_model               0
PrimaryStorage          0
SecondaryStorage        0
PrimaryStorageType      0
SecondaryStorageType    0
GPU_company             0
GPU_model               0
dtype: int64


- There are no null values in the dataset, so no imputation is required.

### Converting Categorical Data:

In [4]:
# Columns stored as text that must be converted to integer codes
categorical_cols = [
    'Company', 'Product', 'TypeName', 'OS', 'Screen', 'Touchscreen',
    'IPSpanel', 'RetinaDisplay', 'CPU_company', 'CPU_model',
    'PrimaryStorageType', 'SecondaryStorageType', 'GPU_company', 'GPU_model',
]

# Fit a fresh LabelEncoder on each column so every column gets its own mapping
for col in categorical_cols:
    data[col] = LabelEncoder().fit_transform(data[col])
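To make the encoding step concrete, here is a minimal sketch of what `LabelEncoder` does to a single column. The `companies` list reuses the five `Company` values from the `data.head()` output; the variable names are hypothetical.

```python
from sklearn.preprocessing import LabelEncoder

# Company values of the first five rows, taken from data.head()
companies = ["Apple", "Apple", "HP", "Apple", "Apple"]

encoder = LabelEncoder()
codes = encoder.fit_transform(companies)

# classes_ holds the distinct labels in sorted order; each label's
# integer code is its position in that array.
print(list(encoder.classes_))
print(codes.tolist())

# A fitted encoder can map the codes back to the original labels
print(list(encoder.inverse_transform(codes)))
```

The integer codes carry no real ordering, which tree-based models such as random forests tend to tolerate. Keeping one fitted encoder per column is what makes decoding possible later.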

## 5. Model Building

### Splitting the Dataset:

In [5]:
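# Target as written: the label-encoded RetinaDisplay column (0/1), not the price.
# To predict price instead, set y = data['Price_euros'] and drop that column from x.
x = data.drop(['RetinaDisplay'], axis=1)
y = data['RetinaDisplay']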

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
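Since the title refers to price prediction, the sketch below shows how the same split might look with `Price_euros` as the target. It is an illustration under stated assumptions, not the notebook's own cell: the `demo` frame is synthetic, its values are made up, and the names `x_price`, `y_price`, `x_tr`, and `x_te` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded DataFrame (values are made up)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Ram": rng.choice([4, 8, 16], size=10),
    "Inches": rng.choice([13.3, 15.6], size=10),
    "RetinaDisplay": rng.integers(0, 2, size=10),
    "Price_euros": rng.uniform(300, 2500, size=10),
})

# Price is the target; every other column is a feature
x_price = demo.drop(columns=["Price_euros"])
y_price = demo["Price_euros"]

# Same 80/20 split and seed as the notebook cell
x_tr, x_te, y_tr, y_te = train_test_split(
    x_price, y_price, test_size=0.2, random_state=42
)
print(x_tr.shape, x_te.shape)
```

Dropping the target from the feature matrix matters here: leaving `Price_euros` in `x` would let the model read the answer directly.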

### Training the Model:

In [6]:
model = RandomForestRegressor()
model.fit(x_train, y_train)
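The model above is created with default settings, so each run can build a slightly different forest. The sketch below uses synthetic data and hypothetical names (`X_demo`, `y_demo`, `forest_a`, `forest_b`) to show how fixing `random_state` makes training repeatable and how `feature_importances_` can be inspected after fitting.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: only the first feature drives the target
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 5.0 * X_demo[:, 0] + rng.normal(scale=0.1, size=200)

# Two forests with the same seed are trained on the same data
forest_a = RandomForestRegressor(n_estimators=50, random_state=42).fit(X_demo, y_demo)
forest_b = RandomForestRegressor(n_estimators=50, random_state=42).fit(X_demo, y_demo)

# With a fixed random_state both forests make the same predictions
same = np.allclose(forest_a.predict(X_demo), forest_b.predict(X_demo))
print(same)

# Importances sum to 1; the informative feature should dominate
print(forest_a.feature_importances_)
```

On the laptop data, the same attribute would indicate which specifications the forest relies on most.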

## 6. Model Evaluation


### Making Predictions:

In [7]:
y_pred = model.predict(x_test)

### Calculating Metrics:

In [8]:
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared Score:", r2)

Mean Squared Error: 0.0006074509803921569
R-squared Score: 0.9477519841269841
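For reference, the two metrics can be reproduced by hand. Mean squared error is the average of the squared residuals, and R-squared is one minus the ratio of the residual sum of squares to the total sum of squares around the mean. The arrays below are made-up numbers chosen for easy arithmetic, not values from the dataset.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up true values and predictions for illustration
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_hat = np.array([2.5, 5.0, 3.0, 8.0])

# MSE: mean of squared residuals
mse_manual = np.mean((y_true - y_hat) ** 2)  # 0.5625

# R-squared: 1 - SS_res / SS_tot
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# The manual values agree with scikit-learn's implementations
print(mse_manual, mean_squared_error(y_true, y_hat))
print(r2_manual, r2_score(y_true, y_hat))
```

Because MSE is expressed in squared units of the target, a value near zero is only meaningful relative to the target's scale: a 0/1 column yields far smaller errors than a price in euros would.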


## 7. Conclusion  

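- The **Random Forest Regressor** reached an **R-squared score** of about **0.948** (mean squared error about **0.0006**) on the test set, as printed in Section 6. The target in the code is the encoded `RetinaDisplay` column, so these figures describe that target rather than `Price_euros`.  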
- Further improvements can be made by **feature selection**, **hyperparameter tuning**, and using **advanced models**.  
- This project successfully demonstrated the process of **data preprocessing**, **model building**, and **evaluation** using **Random Forest Regressor**.  
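As one possible way to approach the hyperparameter tuning mentioned above, the sketch below runs a small grid search with cross-validation. Everything in it is an assumption for illustration: the data is synthetic, the grid values are arbitrary, and the names `X_demo`, `y_demo`, `param_grid`, and `search` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data with two informative features
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 4))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.2, size=120)

# Small, arbitrary grid; a real search would cover more values
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,           # 3-fold cross-validation on the training data
    scoring="r2",
)
search.fit(X_demo, y_demo)

print(search.best_params_)  # best combination found in the grid
print(search.best_score_)   # mean cross-validated R-squared for it
```

In the notebook, `search.fit` would be called on `x_train` and `y_train`, leaving the held-out test set for the final evaluation.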
