# Supervised Regression Project

This supervised regression project predicts car prices using several regression techniques: K-Nearest Neighbors (KNN) Regressor, AdaBoost, Gradient Boosting, Decision Tree Regressor (DTR), Random Forest, and Linear Regression. A typical workflow consists of collecting and preprocessing data, selecting or creating meaningful features, training and assessing the models, fine-tuning their settings, comparing their performance, and choosing the most accurate model for making predictions. This process lets us use data to make informed forecasts and decisions.

### Importing Necessary Libraries

In [264]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

### Loading Dataset

In [265]:
data=pd.read_csv(r"C:\Users\babua\Downloads\CarPrice_Assignment.csv")
data.head(5)

| | car_ID | symboling | CarName | fueltype | aspiration | doornumber | carbody | drivewheel | enginelocation | wheelbase | ... | enginesize | fuelsystem | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | alfa-romero giulia | gas | std | two | convertible | rwd | front | 88.6 | ... | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| 1 | 2 | 3 | alfa-romero stelvio | gas | std | two | convertible | rwd | front | 88.6 | ... | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| 2 | 3 | 1 | alfa-romero Quadrifoglio | gas | std | two | hatchback | rwd | front | 94.5 | ... | 152 | mpfi | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| 3 | 4 | 2 | audi 100 ls | gas | std | four | sedan | fwd | front | 99.8 | ... | 109 | mpfi | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| 4 | 5 | 2 | audi 100ls | gas | std | four | sedan | 4wd | front | 99.4 | ... | 136 | mpfi | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |


### Data Exploration 

In [266]:
data.shape

(205, 26)

In [267]:
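# info is a method, so it must be called with parentheses to print the column summary.
# The bare attribute `data.info` (used in the original run) only echoes the bound-method repr shown below.
data.info()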

<bound method DataFrame.info of      car_ID  symboling                   CarName fueltype aspiration  \
0         1          3        alfa-romero giulia      gas        std   
1         2          3       alfa-romero stelvio      gas        std   
2         3          1  alfa-romero Quadrifoglio      gas        std   
3         4          2               audi 100 ls      gas        std   
4         5          2                audi 100ls      gas        std   
..      ...        ...                       ...      ...        ...   
200     201         -1           volvo 145e (sw)      gas        std   
201     202         -1               volvo 144ea      gas      turbo   
202     203         -1               volvo 244dl      gas        std   
203     204         -1                 volvo 246   diesel      turbo   
204     205         -1               volvo 264gl      gas      turbo   

    doornumber      carbody drivewheel enginelocation  wheelbase  ...  \
0          two  convertible   

In [268]:
data.describe()

| | car_ID | symboling | wheelbase | carlength | carwidth | carheight | curbweight | enginesize | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 |
| mean | 103.0 | 0.834146 | 98.756585 | 174.049268 | 65.907805 | 53.724878 | 2555.565854 | 126.907317 | 3.329756 | 3.255415 | 10.142537 | 104.117073 | 5125.121951 | 25.219512 | 30.75122 | 13276.710571 |
| std | 59.322565 | 1.245307 | 6.021776 | 12.337289 | 2.145204 | 2.443522 | 520.680204 | 41.642693 | 0.270844 | 0.313597 | 3.97204 | 39.544167 | 476.985643 | 6.542142 | 6.886443 | 7988.852332 |
| min | 1.0 | -2.0 | 86.6 | 141.1 | 60.3 | 47.8 | 1488.0 | 61.0 | 2.54 | 2.07 | 7.0 | 48.0 | 4150.0 | 13.0 | 16.0 | 5118.0 |
| 25% | 52.0 | 0.0 | 94.5 | 166.3 | 64.1 | 52.0 | 2145.0 | 97.0 | 3.15 | 3.11 | 8.6 | 70.0 | 4800.0 | 19.0 | 25.0 | 7788.0 |
| 50% | 103.0 | 1.0 | 97.0 | 173.2 | 65.5 | 54.1 | 2414.0 | 120.0 | 3.31 | 3.29 | 9.0 | 95.0 | 5200.0 | 24.0 | 30.0 | 10295.0 |
| 75% | 154.0 | 2.0 | 102.4 | 183.1 | 66.9 | 55.5 | 2935.0 | 141.0 | 3.58 | 3.41 | 9.4 | 116.0 | 5500.0 | 30.0 | 34.0 | 16503.0 |
| max | 205.0 | 3.0 | 120.9 | 208.1 | 72.3 | 59.8 | 4066.0 | 326.0 | 3.94 | 4.17 | 23.0 | 288.0 | 6600.0 | 49.0 | 54.0 | 45400.0 |


In [269]:
#unique values
l={}
for feature in data.columns:
    unique_values=data[feature].unique()
    l[feature]={"unique values":unique_values,'length of feature':len(unique_values)}
for feature,info in l.items():
    print(f"Feature: {feature}")
    print(f"Unique values: {info['unique values']}")
    print(f"Number of unique values: {info['length of feature']}")
    print("-" * 30)

Feature: car_ID
Unique values: [  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107 108
 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162
 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198
 199 200 201 202 203 204 205]
Number of unique values: 205
------------------------------
Feature: symboling
Unique values: [ 3  1  2  0 -1 -2]
Number of unique value
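
When only the counts matter, pandas can produce them without a manual loop. The following is a minimal sketch on a small made-up frame (the `toy` data is an assumption for illustration, not rows from the car dataset); `DataFrame.nunique()` returns the number of distinct values per column.

```python
import pandas as pd

# Hypothetical toy frame; column names borrowed from the dataset, values invented
toy = pd.DataFrame({
    "symboling": [3, 3, 1, 2, 2],
    "fueltype": ["gas", "gas", "gas", "diesel", "gas"],
})

# One count of distinct values per column
summary = toy.nunique()
print(summary)
```

On the real data the equivalent call would be `data.nunique()`, which is easier to scan than the full list of values for columns such as `car_ID`.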

### Data Preprocessing

In [270]:
data.isnull().sum()

car_ID              0
symboling           0
CarName             0
fueltype            0
aspiration          0
doornumber          0
carbody             0
drivewheel          0
enginelocation      0
wheelbase           0
carlength           0
carwidth            0
carheight           0
curbweight          0
enginetype          0
cylindernumber      0
enginesize          0
fuelsystem          0
boreratio           0
stroke              0
compressionratio    0
horsepower          0
peakrpm             0
citympg             0
highwaympg          0
price               0
dtype: int64

In [271]:
data.duplicated().sum()

0

In [272]:
data.dtypes

car_ID                int64
symboling             int64
CarName              object
fueltype             object
aspiration           object
doornumber           object
carbody              object
drivewheel           object
enginelocation       object
wheelbase           float64
carlength           float64
carwidth            float64
carheight           float64
curbweight            int64
enginetype           object
cylindernumber       object
enginesize            int64
fuelsystem           object
boreratio           float64
stroke              float64
compressionratio    float64
horsepower            int64
peakrpm               int64
citympg               int64
highwaympg            int64
price               float64
dtype: object

In [273]:
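#flagging outliers with the IQR rule (rows are only identified here, not removed, so data is unchanged)
for column in data.columns:
    if data.dtypes[column]!='object':
        Q1 = data[column].quantile(0.25)
        Q3 = data[column].quantile(0.75)
        IQR = Q3-Q1
        lower_bound = Q1 - 1.5 * IQR
        upper_bound = Q3 + 1.5 * IQR
        # a value is an outlier if it falls below the lower bound OR above the upper bound
        outliers = data[(data[column] < lower_bound) | (data[column] > upper_bound)]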
data.head(5)

| | car_ID | symboling | CarName | fueltype | aspiration | doornumber | carbody | drivewheel | enginelocation | wheelbase | ... | enginesize | fuelsystem | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | alfa-romero giulia | gas | std | two | convertible | rwd | front | 88.6 | ... | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| 1 | 2 | 3 | alfa-romero stelvio | gas | std | two | convertible | rwd | front | 88.6 | ... | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| 2 | 3 | 1 | alfa-romero Quadrifoglio | gas | std | two | hatchback | rwd | front | 94.5 | ... | 152 | mpfi | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| 3 | 4 | 2 | audi 100 ls | gas | std | four | sedan | fwd | front | 99.8 | ... | 109 | mpfi | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| 4 | 5 | 2 | audi 100ls | gas | std | four | sedan | 4wd | front | 99.4 | ... | 136 | mpfi | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |


In [274]:
data.shape

(205, 26)
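
The outlier cell computes IQR bounds for each numeric column but never filters `data`, which is why the shape is still (205, 26). If the intent were to drop outlying rows, one possible approach is sketched below on a made-up frame. The helper name `iqr_mask`, the multiplier `k`, and the toy values are assumptions for illustration; whether rows should be dropped at all on a 205-row dataset is a separate judgment call.

```python
import pandas as pd

def iqr_mask(frame, k=1.5):
    """Return a boolean Series: True for rows whose numeric values all lie inside the IQR fences."""
    numeric = frame.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Comparisons align the per-column bounds with the frame's columns
    inside = (numeric >= q1 - k * iqr) & (numeric <= q3 + k * iqr)
    return inside.all(axis=1)

# Hypothetical toy data: 500.0 lies far outside the fences for "price"
toy = pd.DataFrame({
    "price": [10.0, 11.0, 12.0, 13.0, 500.0],
    "body": ["a", "b", "c", "d", "e"],
})

cleaned = toy[iqr_mask(toy)]
print(cleaned.shape)
```

Filtering on every numeric column at once can remove many rows, so applying the mask to a chosen subset of columns may be preferable.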

In [275]:
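#checking relevance of categorical values in price determination
# categorical_data is not defined in the cells shown; it is assumed here to be the
# object-dtype columns, which matches the order of the output below
categorical_data=data.select_dtypes(include=['object']).columns
for column in categorical_data: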
    groups = data[column].unique()
    anova_results = stats.f_oneway(*[data['price'][data[column] == group] for group in groups])
    
    print(f"ANOVA for {column}:")
    print("F-statistic:", anova_results.statistic)
    print("P-value:", anova_results.pvalue)
    print("\n")

ANOVA for CarName:
F-statistic: 8.421907781044004
P-value: 6.414986719214914e-16


ANOVA for fueltype:
F-statistic: 2.2927407366575174
P-value: 0.13153563336537924


ANOVA for aspiration:
F-statistic: 6.636621968649918
P-value: 0.010700300833183433


ANOVA for doornumber:
F-statistic: 0.20594600575940436
P-value: 0.6504483953298938


ANOVA for carbody:
F-statistic: 8.031976496876302
P-value: 5.031712258477608e-06


ANOVA for drivewheel:
F-statistic: 70.3205526496926
P-value: 6.632887281209634e-24


ANOVA for enginelocation:
F-statistic: 23.9697400547047
P-value: 1.993019639057392e-06


ANOVA for enginetype:
F-statistic: 9.376220306463633
P-value: 4.692664568743044e-09


ANOVA for cylindernumber:
F-statistic: 57.568880995353695
P-value: 8.065780498463557e-41


ANOVA for fuelsystem:
F-statistic: 15.641864574663314
P-value: 2.990385908932205e-16




The ANOVA results indicate that several categorical features are associated with price. CarName, carbody, drivewheel, enginelocation, enginetype, cylindernumber, and fuelsystem all have very small p-values (about 5e-06 or lower), so the mean price differs clearly between their categories. aspiration is weaker but still significant at the 5% level (p ≈ 0.011). fueltype (p ≈ 0.13) and doornumber (p ≈ 0.65) show no significant difference in mean price. The next cell drops fueltype and doornumber; it also drops aspiration, the weakest of the significant features. The CarName result should be read with caution: the column has far more distinct values than any other feature, so most of its groups contain very few cars.
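
To make the reading of the p-values explicit, the sketch below compares the values printed above against a significance threshold. The 0.05 threshold is an assumption (the notebook does not state one), and the p-values are copied from the output in rounded form.

```python
# p-values copied (rounded) from the ANOVA output above
p_values = {
    "CarName": 6.41e-16,
    "fueltype": 0.1315,
    "aspiration": 0.0107,
    "doornumber": 0.6504,
    "carbody": 5.03e-06,
    "drivewheel": 6.63e-24,
    "enginelocation": 1.99e-06,
    "enginetype": 4.69e-09,
    "cylindernumber": 8.07e-41,
    "fuelsystem": 2.99e-16,
}

alpha = 0.05  # assumed conventional threshold

significant = [name for name, p in p_values.items() if p < alpha]
not_significant = [name for name, p in p_values.items() if p >= alpha]

print("significant:", significant)
print("not significant:", not_significant)
```

Under this threshold only `fueltype` and `doornumber` fail to reach significance; `aspiration` passes, though by a much smaller margin than the other features.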


In [276]:
#removing unwanted columns
new_data=data.drop(['doornumber','fueltype','aspiration'],axis=1)
new_data.head(5)

| | car_ID | symboling | CarName | carbody | drivewheel | enginelocation | wheelbase | carlength | carwidth | carheight | ... | enginesize | fuelsystem | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg | price |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | alfa-romero giulia | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 13495.0 |
| 1 | 2 | 3 | alfa-romero stelvio | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 | 16500.0 |
| 2 | 3 | 1 | alfa-romero Quadrifoglio | hatchback | rwd | front | 94.5 | 171.2 | 65.5 | 52.4 | ... | 152 | mpfi | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 | 16500.0 |
| 3 | 4 | 2 | audi 100 ls | sedan | fwd | front | 99.8 | 176.6 | 66.2 | 54.3 | ... | 109 | mpfi | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 | 13950.0 |
| 4 | 5 | 2 | audi 100ls | sedan | 4wd | front | 99.4 | 176.6 | 66.4 | 54.3 | ... | 136 | mpfi | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 | 17450.0 |


In [277]:
categoricalv=list(new_data.select_dtypes(include=['object']).columns)
categoricalv

['CarName',
 'carbody',
 'drivewheel',
 'enginelocation',
 'enginetype',
 'cylindernumber',
 'fuelsystem']

## Creating Machine Learning Models

##### Importing Necessary Libraries 

In [278]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor,GradientBoostingRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error,mean_squared_error,r2_score
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline

In [279]:
X=new_data.drop('price',axis=1)
y=new_data['price']
X.head(5)

| | car_ID | symboling | CarName | carbody | drivewheel | enginelocation | wheelbase | carlength | carwidth | carheight | ... | cylindernumber | enginesize | fuelsystem | boreratio | stroke | compressionratio | horsepower | peakrpm | citympg | highwaympg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 3 | alfa-romero giulia | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | four | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 |
| 1 | 2 | 3 | alfa-romero stelvio | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | ... | four | 130 | mpfi | 3.47 | 2.68 | 9.0 | 111 | 5000 | 21 | 27 |
| 2 | 3 | 1 | alfa-romero Quadrifoglio | hatchback | rwd | front | 94.5 | 171.2 | 65.5 | 52.4 | ... | six | 152 | mpfi | 2.68 | 3.47 | 9.0 | 154 | 5000 | 19 | 26 |
| 3 | 4 | 2 | audi 100 ls | sedan | fwd | front | 99.8 | 176.6 | 66.2 | 54.3 | ... | four | 109 | mpfi | 3.19 | 3.4 | 10.0 | 102 | 5500 | 24 | 30 |
| 4 | 5 | 2 | audi 100ls | sedan | 4wd | front | 99.4 | 176.6 | 66.4 | 54.3 | ... | five | 136 | mpfi | 3.19 | 3.4 | 8.0 | 115 | 5500 | 18 | 22 |


In [280]:
y.head(5)

0    13495.0
1    16500.0
2    16500.0
3    13950.0
4    17450.0
Name: price, dtype: float64

In [281]:
#splitting data for testing and training
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=.3,random_state=2)

### Encoding Categorical Values

In [282]:
ct=make_column_transformer(
    (OneHotEncoder(handle_unknown='ignore'),['CarName','carbody','drivewheel','enginelocation','enginetype','cylindernumber','fuelsystem']),remainder='passthrough')
ct.fit_transform(X)


<205x194 sparse matrix of type '<class 'numpy.float64'>'
	with 4443 stored elements in Compressed Sparse Row format>
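
The `handle_unknown='ignore'` setting matters here because a category that appears only in the test split would otherwise raise an error at prediction time. The sketch below, on small made-up frames (`train_demo`, `test_demo`, and `ct_demo` are hypothetical names), illustrates how an unseen category is encoded as all zeros in its one-hot columns.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: "wagon" never appears in the training frame
train_demo = pd.DataFrame({
    "carbody": ["sedan", "hatchback", "sedan"],
    "enginesize": [109, 152, 136],
})
test_demo = pd.DataFrame({"carbody": ["wagon"], "enginesize": [130]})

ct_demo = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["carbody"]),
    remainder="passthrough",
)
ct_demo.fit(train_demo)

encoded = ct_demo.transform(test_demo)
# The transformer may return a sparse matrix; convert so the row is easy to inspect
if hasattr(encoded, "toarray"):
    encoded = encoded.toarray()

# Columns: carbody=hatchback, carbody=sedan, enginesize
print(encoded)
```

For a column such as `CarName`, where most values are distinct, many test rows are likely to hit this all-zeros case, so the encoded name carries little information for them.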

### Linear Regression

In [283]:
lr=LinearRegression()
pipe=make_pipeline(ct,lr)
pipe.fit(X_train,y_train)

In [284]:
y_pred=pipe.predict(X_test)
y_pred

array([11421.05919618, 15602.64943981,  7712.77788756, 17067.0926779 ,
       10462.33999471,  8626.10554711, 15446.14528478, 14669.91973108,
       10097.29015127,  9805.25759781, 18823.83139557,  6068.00489633,
        8241.41067026, 35647.05250353, 27282.67197058,  3774.98868988,
        6085.47112815,  6796.11108484, 29715.19531346,  9311.50366136,
       12875.64289011,  8404.31094747,  9500.61937999, 21262.3679622 ,
        7230.32575388,  3800.14767814,  6564.62726325,  9358.90826563,
       19115.00209738, 20904.71700873,  8560.66646583,  5034.35389241,
       12327.52650181, 21789.81986214,  5905.8261014 , 26154.83896245,
       19523.71769195, 26204.67244645, 15659.93039618, 10214.41860772,
       17842.59321798, 34814.27066918, 29539.10255502, 10280.43022749,
       13180.16192845,  7141.38109626, 10618.49509288,  7867.1862835 ,
       15285.23532321, -3194.08074711, 43255.77734521, 11961.58287358,
        5985.58821888,  8892.90801657, 15883.88041919, 12635.35029958,
      

In [285]:
#Evaluating model's performance
mse=mean_squared_error(y_test,y_pred)
r2score=r2_score(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print("mean square error = ",mse)
print("mean absolute error = ",mae)
print("r2 score = ",r2score)

mean square error =  29936883.222583182
mean absolute error =  4021.986127730151
r2 score =  0.36083215742724994


### Decision Tree Regressor

In [286]:
dt=DecisionTreeRegressor()
pipe=make_pipeline(ct,dt)
pipe.fit(X_train,y_train)

In [287]:
# making predictions on test data
y_pred=pipe.predict(X_test)
y_pred

array([ 7609., 14869.,  8949., 16925., 16515.,  6649., 12170., 16900.,
        8358., 11248., 17450.,  7898.,  8495., 15250., 11850.,  6849.,
        8058.,  7499., 37028.,  6095.,  7957.,  6575., 16845., 15250.,
       10595.,  6295.,  6338., 10898., 11549., 16500., 11845., 12290.,
        6669., 15250.,  6649., 17199., 16515., 17199., 12290.,  8358.,
       11900., 40960., 23875.,  5399., 11248., 10595.,  6649., 10595.,
        7957.,  8449., 32250.,  7895., 10595., 10595., 17199., 13860.,
       15690., 17199.,  8921., 10945.,  6649., 12170.])

In [288]:
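#Evaluating model's performance
# r2_score expects (y_true, y_pred); the r2 value printed below came from the
# original call with the arguments swapped, so it will differ on re-run
mse=mean_squared_error(y_test,y_pred)
r2score=r2_score(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print("mean square error = ",mse)
print("mean absolute error = ",mae)
print("r2 score = ",r2score)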

mean square error =  8296531.519094983
mean absolute error =  2019.2231129032257
r2 score =  0.8206907954998894
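
scikit-learn's `r2_score` takes its arguments in the order `(y_true, y_pred)`, and unlike MSE and MAE it is not symmetric, because the denominator is the variance of whichever array is passed first. A minimal sketch with made-up numbers (the two lists are assumptions for illustration) shows the effect of swapping them.

```python
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical targets and predictions
y_true_demo = [1.0, 2.0, 3.0, 4.0]
y_pred_demo = [2.0, 2.0, 3.0, 3.0]

# MSE is symmetric: argument order does not matter
mse_forward = mean_squared_error(y_true_demo, y_pred_demo)
mse_swapped = mean_squared_error(y_pred_demo, y_true_demo)

# R^2 is not symmetric: the first argument defines the baseline variance
r2_forward = r2_score(y_true_demo, y_pred_demo)
r2_swapped = r2_score(y_pred_demo, y_true_demo)

print("MSE:", mse_forward, mse_swapped)
print("R2 (y_true, y_pred):", r2_forward)
print("R2 (y_pred, y_true):", r2_swapped)
```

Here the MSE is 0.5 either way, while R² is 0.6 in the correct order and -1.0 when swapped, so R² values computed with different argument orders are not comparable.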


### Random Forest Regressor 

In [289]:
rf=RandomForestRegressor()
pipe=make_pipeline(ct,rf)
pipe.fit(X_train,y_train)

In [290]:
y_pred=pipe.predict(X_test)
y_pred

array([ 7024.54 , 13865.15 ,  9022.61 , 14580.88 , 15625.97 ,  6658.73 ,
       13094.59 , 10816.9  ,  8853.72 , 10873.44 , 13944.13 ,  8107.14 ,
        8546.6  , 14048.17 , 13155.35 ,  7028.06 ,  8214.82 ,  7227.58 ,
       33442.88 ,  6515.7  ,  8160.14 ,  6500.51 , 19088.3  , 14167.07 ,
        9599.03 ,  6089.43 ,  7609.27 , 10328.84 , 12783.41 , 15232.53 ,
       11563.57 , 11958.92 ,  6443.43 , 15059.98 ,  6443.94 , 17705.63 ,
       16615.76 , 17768.38 , 11183.48 ,  8853.72 , 14565.7  , 35426.815,
       19298.41 ,  5878.65 , 10448.36 ,  8697.85 ,  6665.67 ,  7971.09 ,
        7923.47 , 10068.21 , 34926.855,  9190.49 ,  8339.48 ,  8456.55 ,
       18571.57 , 14627.39 , 15719.2  , 18016.31 ,  9797.81 , 12029.64 ,
        6803.52 , 10532.35 ])

In [291]:
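#Evaluating model's performance
# r2_score expects (y_true, y_pred); the r2 value printed below came from the
# original call with the arguments swapped, so it will differ on re-run
mse=mean_squared_error(y_test,y_pred)
r2score=r2_score(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print("mean square error = ",mse)
print("mean absolute error = ",mae)
print("r2 score = ",r2score)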

mean square error =  6298867.440558049
mean absolute error =  1596.5337580645164
r2 score =  0.8402807105691861


### AdaBoostRegressor 

In [292]:
ada=AdaBoostRegressor()
pipe=make_pipeline(ct,ada)
pipe.fit(X_train,y_train)

In [293]:
y_pred=pipe.predict(X_test)
y_pred

array([ 8043.56666667, 14744.58974359,  9414.15789474, 14339.2       ,
       15284.10843373,  8080.19230769, 14399.35294118,  9228.4047619 ,
        8328.44444444, 11174.34482759, 14744.58974359,  8306.29032258,
        8426.44642857, 14702.97916667, 14499.79545455,  8114.42857143,
        8166.35294118,  8114.42857143, 34455.35714286,  8043.56666667,
        9366.875     ,  8043.56666667, 19245.46296296, 14442.69230769,
        9698.5       ,  8043.56666667,  8131.41666667, 10285.        ,
       13986.89230769, 14744.58974359, 13040.23076923, 12035.25806452,
        8043.56666667, 15284.10843373,  8043.56666667, 16714.94117647,
       15284.10843373, 16714.94117647, 11151.5       ,  8328.44444444,
       14823.3030303 , 35234.34210526, 20110.53488372,  8043.56666667,
       10020.94736842,  8449.14285714,  8080.19230769,  8114.42857143,
        8131.41666667, 11563.57142857, 33574.13888889,  9366.875     ,
        8328.44444444,  8328.44444444, 18257.4       , 15284.10843373,
      

In [294]:
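#Evaluating model's performance
# r2_score expects (y_true, y_pred); the r2 value printed below came from the
# original call with the arguments swapped, so it will differ on re-run
mse=mean_squared_error(y_test,y_pred)
r2score=r2_score(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print("mean square error = ",mse)
print("mean absolute error = ",mae)
print("r2 score = ",r2score)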

mean square error =  6556290.539913769
mean absolute error =  1732.3834461960662
r2 score =  0.817861874383555


### GradientBoostingRegressor

In [295]:
gb=GradientBoostingRegressor()
pipe=make_pipeline(ct,gb)
pipe.fit(X_train,y_train)

In [296]:
y_pred=pipe.predict(X_test)
y_pred

array([ 6650.29331968, 13862.77064977,  9014.53929679, 16288.67847359,
       15216.26476884,  7125.3627887 , 13904.95475736, 10442.80651572,
        9024.1247563 , 11037.50983373, 13595.53920565,  7603.82247311,
        8752.00168901, 16071.69097176, 14017.49723493,  7168.99977465,
        8105.32222388,  7674.7962827 , 32816.79003937,  6397.5558684 ,
        8713.93772151,  6428.26733189, 19608.66510152, 15097.53773954,
        9273.59036959,  5965.46778934,  7342.73776306, 10778.30146541,
       12978.06721502, 16914.854789  , 11867.19331141, 13835.99956254,
        6353.14610211, 16596.18752125,  6428.26733189, 17828.39944874,
       15951.60015596, 17828.39944874, 12590.71814582,  9024.1247563 ,
       14693.18562578, 37240.47103226, 20381.48093616,  6112.31721144,
       10300.46153502,  8208.35824515,  7046.14484669,  7571.08208187,
        7761.70704159, 10515.8547776 , 32702.8882881 ,  9123.33217315,
        7860.87017618,  8196.46221112, 17752.94501214, 15860.35939491,
      

In [297]:
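#Evaluating model's performance
# r2_score expects (y_true, y_pred); the r2 value printed below came from the
# original call with the arguments swapped, so it will differ on re-run
mse=mean_squared_error(y_test,y_pred)
r2score=r2_score(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print("mean square error = ",mse)
print("mean absolute error = ",mae)
print("r2 score = ",r2score)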

mean square error =  5537691.697621403
mean absolute error =  1536.4564809951708
r2 score =  0.8618416234703037


### KNeighborsRegressor

In [298]:
knn= KNeighborsRegressor()
pipe=make_pipeline(ct,knn)
pipe.fit(X_train,y_train)

In [299]:
y_pred=pipe.predict(X_test)
y_pred

array([ 6509.6, 13733. ,  9159.4, 10611. , 17975. ,  6010.4, 11851.8,
       11127.2,  8071.4,  9465.4, 13562.6,  8002.4,  8719.6, 14039.2,
       11804.6,  7319. ,  8262. ,  7140.6, 25073.8,  6446.6,  7696.4,
        6041.6, 19031.6, 11127.2,  9238.6,  6242.6,  8070. ,  9318.4,
       13736. , 13562.6, 13131. , 10465. ,  6509.6, 15797. ,  6043.6,
       16332.4, 16869.8, 16332.4,  8885. ,  8071.4, 16153.8, 37933.2,
       19789. ,  6263.2,  9444.8,  8436.2,  6010.4,  8626.4,  9155.8,
       10448. , 37933.2,  9414. ,  8070. ,  8626.4, 17975. , 16869.8,
       17007.4, 15917.2, 11280. , 12161. ,  7319. ,  9238.6])

In [300]:
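#Evaluating model's performance
# r2_score expects (y_true, y_pred); the r2 value printed below came from the
# original call with the arguments swapped, so it will differ on re-run
mse=mean_squared_error(y_test,y_pred)
r2score=r2_score(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print("mean square error = ",mse)
print("mean absolute error = ",mae)
print("r2 score = ",r2score)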

mean square error =  10407776.257869177
mean absolute error =  2139.329564516129
r2 score =  0.7388971267089172
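
Each model above is scored on a single 70/30 split, which gives a fairly noisy estimate on 205 rows. `cross_val_score`, imported earlier but not used, is one way to get a less split-dependent comparison. The sketch below runs on synthetic data from `make_regression`, not on the car-price dataset, so its numbers say nothing about the models' ranking on the real data; the model settings and the 5-fold choice are assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (NOT the car-price dataset)
X_demo, y_demo = make_regression(n_samples=120, n_features=5, noise=10.0, random_state=0)

# Fixed random_state values keep the comparison reproducible
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

cv_mae = {}
for name, model in models.items():
    # scikit-learn reports negated MAE so that higher is better; flip the sign back
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="neg_mean_absolute_error")
    cv_mae[name] = -scores.mean()

for name, mae in sorted(cv_mae.items(), key=lambda item: item[1]):
    print(f"{name}: mean MAE = {mae:.2f}")
```

With the notebook's data, each estimator would be wrapped as `make_pipeline(ct, model)` and scored on `X` and `y`, so that the encoder is refitted inside every fold.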



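In conclusion, six regression models were trained with default hyperparameters and compared on the same 30% test split: KNN Regressor, AdaBoost, Decision Tree Regressor, Random Forest, Linear Regression, and Gradient Boosting. Gradient Boosting had the lowest errors (MSE ≈ 5.54 million, MAE ≈ 1,536), with Random Forest close behind (MAE ≈ 1,597), while Linear Regression performed worst (MAE ≈ 4,022, R² ≈ 0.36). On this split, Gradient Boosting is therefore the most effective choice. Because the tree-based models were fitted without a fixed random_state, the exact numbers will vary slightly between runs.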
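
The comparison can be summarized in one table. The sketch below collects the MSE and MAE values printed in the evaluation cells (rounded to two decimals), derives RMSE from MSE, and sorts by MAE; the frame name `results` is an assumption for illustration.

```python
import pandas as pd

# MSE and MAE copied (rounded) from the evaluation outputs in this notebook
results = pd.DataFrame(
    {
        "mse": [29936883.22, 8296531.52, 6298867.44, 6556290.54, 5537691.70, 10407776.26],
        "mae": [4021.99, 2019.22, 1596.53, 1732.38, 1536.46, 2139.33],
    },
    index=[
        "Linear Regression",
        "Decision Tree",
        "Random Forest",
        "AdaBoost",
        "Gradient Boosting",
        "KNN",
    ],
)

# RMSE is in the same units as price, which makes it easier to interpret than MSE
results["rmse"] = results["mse"] ** 0.5

ranked = results.sort_values("mae")
print(ranked.round(2))
```

Sorting by MAE or by MSE gives the same winner here, and both metrics are unaffected by the argument order passed to the scoring functions.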