### Multiple Linear Regression
### What is Multiple Linear Regression?
#### Think of it like this:
#### In Simple Linear Regression, we had one input (X) and one output (Y):
#### Price = 𝑚 × RAM + 𝑏
#### But in Multiple Linear Regression, we have many inputs (X1, X2, X3... Xn) and one output (Y):
#### Price = 𝑚1 × RAM + 𝑚2 × Storage + 𝑚3 × ProcessorSpeed + 𝑏
#### A laptop’s price doesn’t depend only on RAM but also on storage, processor speed, brand, and other factors.
#### Multiple Linear Regression finds the coefficients (𝑚1, 𝑚2, 𝑚3, ...) and the intercept (𝑏) that best fit the data, by minimizing the squared differences between predicted and actual prices.
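
To make the formula concrete, here is a small sketch that evaluates it for one laptop. The coefficient and intercept values are hypothetical, picked only for easy arithmetic; they are not fitted from `laptop.csv`.

```python
import numpy as np

# Hypothetical coefficients for illustration only (not fitted values).
# m1: price per GB of RAM, m2: price per GB of storage, m3: price per GHz
coefficients = np.array([3000.0, 20.0, 10000.0])
intercept = 5000.0  # b

# One laptop: 16 GB RAM, 512 GB storage, 3.0 GHz processor
features = np.array([16.0, 512.0, 3.0])

# Price = m1*RAM + m2*Storage + m3*ProcessorSpeed + b
predicted_price = float(coefficients @ features + intercept)
print(predicted_price)  # 48000 + 10240 + 30000 + 5000 = 93240.0
```

The prediction is a dot product of the feature vector with the coefficient vector, plus the intercept. Fitting the model means choosing those numbers from data instead of by hand.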

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
import matplotlib.pyplot as plt

In [2]:
laptop = pd.read_csv('laptop.csv')
laptop.sample(5)

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,brand,name,price,spec_rating,processor,CPU,Ram,Ram_type,ROM,ROM_type,GPU,display_size,resolution_width,resolution_height,OS,warranty
526,550,629,MSI,Pulse 15 B13VGK-1296IN Gaming Laptop,154990,71.0,13th Gen Intel Core i7 13700H,"14 Cores (6P + 8E), 20 Threads",16GB,DDR5,1TB,SSD,8GB NVIDIA GeForce RTX 4070,15.6,2560.0,1440.0,Windows 11 OS,2
52,53,55,Honor,MagicBook X14 2023 ‎FRI-F56 Laptop,52990,66.0,12th Gen Intel Core i5 12450H,"Octa Core (4P + 4E), 12 Threads",16GB,DDR4,512GB,SSD,Intel UHD Graphics,14.0,1920.0,1200.0,Windows 11 OS,1
486,509,576,Acer,Aspire Lite 15 AL15-51 2023 Laptop,45999,64.0,11th Gen Intel Core i5 1155G7,"Quad Core, 8 Threads",16GB,DDR4,1TB,SSD,Intel Iris Xe Graphics,15.6,1920.0,1080.0,Windows 11 OS,1
68,69,72,Xiaomi,Redmi Book 14 2023 Laptop,43990,69.323529,12th Gen Intel Core i5 12500H,"12 Cores (4P + 8E), 16 Threads",16GB,DDR5,512GB,SSD,Intel Iris Xe Graphics,14.0,2880.0,1800.0,Windows 11 OS,1
57,58,61,HP,15S-FQ5202TU Laptop,51780,69.323529,12th Gen Intel Core i5 1235U,"10 Cores (2P + 8E), 12 Threads",8GB,DDR4,512GB,SSD,Intel Iris Xe Graphics,15.6,1920.0,1080.0,Windows 11 OS,1


In [3]:
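# Extract the leading number from strings such as '16GB' and convert it to int
laptop['Ram'] = laptop['Ram'].str.extract(r'(\d+)', expand=False).astype(int)
# Caution: this keeps only the digits, so '1TB' becomes 1 while '512GB' becomes 512.
# The outputs shown in this notebook were produced with this parsing.
laptop['ROM'] = laptop['ROM'].str.extract(r'(\d+)', expand=False).astype(int)
# spec_rating and display_size are already numeric, so they are used as-is
# Multiple linear regression: use several input features instead of one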
X = laptop[['Ram', 'ROM', 'spec_rating', 'display_size']]
y = laptop['price']
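
The pattern `(\d+)` keeps only the digits of the `ROM` column, so `'1TB'` is read as 1 and ranks below `'512GB'`. One possible fix is a unit-aware conversion, sketched below. `storage_to_gb` is a hypothetical helper, the sample values are made up, and counting 1 TB as 1024 GB is an assumption. Using it in the notebook would change the fitted coefficients and scores shown in the later cells.

```python
import re
import pandas as pd

def storage_to_gb(text: str) -> int:
    """Hypothetical helper: convert '512GB' or '1TB' to a size in GB."""
    match = re.fullmatch(r'\s*(\d+)\s*(GB|TB)\s*', text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized storage value: {text!r}")
    size, unit = int(match.group(1)), match.group(2).upper()
    # Assumption: 1 TB is counted as 1024 GB; use 1000 for decimal units.
    return size * 1024 if unit == 'TB' else size

rom_demo = pd.Series(['512GB', '1TB', '256GB', '2TB'])  # made-up sample values
rom_gb = rom_demo.map(storage_to_gb)
print(rom_gb.tolist())  # [512, 1024, 256, 2048]
```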

In [5]:
# Split the data into training (80%) and test (20%) sets; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [6]:
lrmodel = LinearRegression()
lrmodel.fit(X_train,y_train)

In [7]:
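# coef_ holds one coefficient per feature (in column order); coef_[0] is the Ram coefficient
print("Slope (m):", lrmodel.coef_[0])  # Price change per extra GB of RAM, other features held fixed
print("Intercept (b):", lrmodel.intercept_)  # Predicted price when all features are 0 (an extrapolation, not a base price)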

Slope (m): 5220.154180949262
Intercept (b): -209089.8588983183
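
A model fitted on four features has four coefficients, and `coef_[0]` shows only the first. The sketch below pairs each coefficient with its column name. It runs on synthetic data generated from known coefficients, so `X_demo`, `y_demo`, and `true_coefs` are illustrative assumptions, not values from `laptop.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic features with the same column names as the notebook (made-up values)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    'Ram': rng.choice([8, 16, 32], size=50),
    'ROM': rng.choice([256, 512, 1024], size=50),
    'spec_rating': rng.uniform(60, 80, size=50),
    'display_size': rng.choice([14.0, 15.6, 16.0], size=50),
})

# Hypothetical "true" relationship, noise-free so the fit can recover it
true_coefs = np.array([2000.0, 30.0, 1500.0, 500.0])
y_demo = X_demo.to_numpy() @ true_coefs - 50000.0

model = LinearRegression().fit(X_demo, y_demo)

# One coefficient per feature, labelled by column name
coef_by_feature = pd.Series(model.coef_, index=X_demo.columns)
print(coef_by_feature)
print("Intercept:", model.intercept_)
```

In the notebook itself, the same idea would be `pd.Series(lrmodel.coef_, index=X.columns)`. Each value is the predicted price change for a one-unit increase in that feature with the others held fixed.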


In [8]:
y_pred = lrmodel.predict(X_test)

In [9]:
# Mean absolute error (MAE) is the average size of the model's mistakes, in the same units as price.
# The model here uses four features (not RAM alone); its predictions are off by about 24,424 Rs on average.
# No fixed MAE value counts as "good": judge it against the typical price. Log loss is a classification metric and does not apply to regression.
mae = mean_absolute_error(y_test,y_pred)
print("Mean absolute error: ", mae)

Mean absolute error:  24424.289800691313
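
To show what this number measures, here is a small sketch that computes MAE by hand with NumPy and relates it to the average price. The `actual` and `predicted` arrays are made-up values, not rows from the test set.

```python
import numpy as np

# Made-up prices for illustration only
actual = np.array([50000.0, 80000.0, 120000.0, 45000.0])
predicted = np.array([55000.0, 70000.0, 110000.0, 48000.0])

# MAE = mean of |actual - predicted|
abs_errors = np.abs(actual - predicted)  # [5000, 10000, 10000, 3000]
mae = abs_errors.mean()
print(mae)  # 7000.0

# Dividing by the mean actual price gives a scale-free view of the error
relative_mae = mae / actual.mean()
print(round(relative_mae * 100, 1), '% of the average price')
```

An MAE of a few thousand can be small for laptops priced near 100,000 and large for items priced near 5,000, which is why the ratio is often more informative than the raw value.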


In [10]:
# score() returns R² (coefficient of determination) on the test set: the share of price variance the model explains
regression_score = lrmodel.score(X_test,y_test)
print(regression_score*100,'%')

64.98257997642413 %
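
The value returned by `score` is R², not an accuracy percentage. A minimal sketch of the calculation follows, using made-up arrays. `r_squared` is a hypothetical helper written for this illustration.

```python
import numpy as np

def r_squared(actual: np.ndarray, predicted: np.ndarray) -> float:
    """R² = 1 - SS_res / SS_tot (hypothetical helper for illustration)."""
    ss_res = np.sum((actual - predicted) ** 2)      # error left after the model
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # error of always predicting the mean
    return float(1.0 - ss_res / ss_tot)

# Made-up prices for illustration only
actual = np.array([50000.0, 80000.0, 120000.0, 45000.0])
predicted = np.array([55000.0, 70000.0, 110000.0, 48000.0])

r2 = r_squared(actual, predicted)
print(r2)

# Baseline: predicting the mean price for every laptop gives R² = 0
baseline = np.full_like(actual, actual.mean())
print(r_squared(actual, baseline))  # 0.0
```

Read this way, a score of about 0.65 means the model's squared error on the test set is about 35% of what a predict-the-mean baseline would produce.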
