
Tasks:
1. Load the dataset and encode the categorical variables.
2. Select the features.
3. Train an SVR model with different kernels:
   - RBF
   - Polynomial
4. Evaluate each model using:
   - MAE
   - RMSE
   - R² score
5. Conclude which kernel is best for predicting insurance cost.

In [None]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


In [None]:
# Load insurance dataset
df = pd.read_csv("/content/insurance.csv")

# View first 5 rows
df.head()


,age,sex,bmi,children,smoker,region,charges
0,19,female,27.9,0,yes,southwest,16884.924
1,18,male,33.77,1,no,southeast,1725.5523
2,28,male,33.0,3,no,southeast,4449.462
3,33,male,22.705,0,no,northwest,21984.47061
4,32,male,28.88,0,no,northwest,3866.8552


In [None]:
# One-hot encoding for categorical columns
df_encoded = pd.get_dummies(df, drop_first=True)

df_encoded.head()


,age,bmi,children,charges,sex_male,smoker_yes,region_northwest,region_southeast,region_southwest
0,19,27.9,0,16884.924,False,True,False,False,True
1,18,33.77,1,1725.5523,True,False,False,True,False
2,28,33.0,3,4449.462,True,False,False,True,False
3,33,22.705,0,21984.47061,True,False,True,False,False
4,32,28.88,0,3866.8552,True,False,True,False,False
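To make the effect of `drop_first=True` concrete, here is a minimal sketch on a made-up three-row frame (the values are illustrative and not taken from `insurance.csv`). For each categorical column, pandas sorts the categories and drops the first one, so `smoker` keeps only `smoker_yes` and the dropped `region` level becomes the implicit baseline.

```python
import pandas as pd

# Toy data for illustration only (not rows from the insurance dataset)
toy = pd.DataFrame({
    "age": [19, 18, 28],
    "smoker": ["yes", "no", "no"],
    "region": ["southwest", "southeast", "northwest"],
})

# drop_first=True removes the alphabetically first level of each category
encoded = pd.get_dummies(toy, drop_first=True)
print(list(encoded.columns))
```

A row whose dummy columns for `region` are all `False` therefore belongs to the dropped level (`northwest` in this toy frame).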


In [None]:
# Features and target
X = df_encoded.drop("charges", axis=1)
y = df_encoded["charges"]

X.head()


,age,bmi,children,sex_male,smoker_yes,region_northwest,region_southeast,region_southwest
0,19,27.9,0,False,True,False,False,True
1,18,33.77,1,True,False,False,True,False
2,28,33.0,3,True,False,False,True,False
3,33,22.705,0,True,False,True,False,False
4,32,28.88,0,True,False,True,False,False


In [None]:
# Hold out 20% of the rows for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)


In [None]:
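# Standardize the features: SVR kernels are sensitive to feature scale
scaler = StandardScaler()

# Fit on the training set only, then reuse the same statistics on the test set
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)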
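The fit-on-train, transform-on-test pattern can also be bundled into a single estimator. The sketch below uses scikit-learn's `make_pipeline` on synthetic data; the names `X_demo`, `y_demo`, and `model` are hypothetical and only stand in for the notebook's variables.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption: not the insurance dataset)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(size=100)

# The pipeline fits the scaler on whatever is passed to fit(),
# then applies the same scaling automatically inside predict()
model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
model.fit(X_demo[:80], y_demo[:80])
preds = model.predict(X_demo[80:])
```

One design benefit is that the scaler can never be accidentally fitted on test rows, which also keeps cross-validation free of leakage.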


In [None]:
# RBF-kernel SVR with scikit-learn defaults (C=1.0, epsilon=0.1, gamma="scale")
svr_rbf = SVR(kernel="rbf")
svr_rbf.fit(X_train_scaled, y_train)

y_pred_rbf = svr_rbf.predict(X_test_scaled)

print("SVR with RBF Kernel")
print("MAE:", mean_absolute_error(y_test, y_pred_rbf))
print("RMSE:", np.sqrt(mean_squared_error(y_test, y_pred_rbf)))
print("R2 Score:", r2_score(y_test, y_pred_rbf))


SVR with RBF Kernel
MAE: 8612.408423351833
RMSE: 12889.096314656128
R2 Score: -0.07008155372454805
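A negative R² means the model's squared error is larger than that of always predicting the mean of the test targets. The short sketch below, on made-up numbers, shows that a mean predictor scores exactly 0 while a constant prediction away from the mean (here the median of a right-skewed sample) scores below 0.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up, right-skewed targets for illustration
y_true = np.array([1000.0, 2000.0, 3000.0, 14000.0])

# Constant predictions: the mean (5000) and the median (2500)
mean_pred = np.full_like(y_true, y_true.mean())
median_pred = np.full_like(y_true, np.median(y_true))

r2_mean = r2_score(y_true, mean_pred)      # baseline: zero by definition
r2_median = r2_score(y_true, median_pred)  # worse than the mean baseline
print(r2_mean, r2_median)
```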


In [None]:
# Degree-3 polynomial-kernel SVR; all other hyperparameters stay at their defaults
svr_poly = SVR(kernel="poly", degree=3)
svr_poly.fit(X_train_scaled, y_train)

y_pred_poly = svr_poly.predict(X_test_scaled)

print("SVR with Polynomial Kernel")
print("MAE:", mean_absolute_error(y_test, y_pred_poly))
print("RMSE:", np.sqrt(mean_squared_error(y_test, y_pred_poly)))
print("R2 Score:", r2_score(y_test, y_pred_poly))


SVR with Polynomial Kernel
MAE: 8607.801381076031
RMSE: 12872.961371328372
R2 Score: -0.0674041125836411


In [None]:
results = pd.DataFrame({
    "Kernel": ["RBF", "Polynomial"],
    "MAE": [
        mean_absolute_error(y_test, y_pred_rbf),
        mean_absolute_error(y_test, y_pred_poly)
    ],
    "RMSE": [
        np.sqrt(mean_squared_error(y_test, y_pred_rbf)),
        np.sqrt(mean_squared_error(y_test, y_pred_poly))
    ],
    "R2 Score": [
        r2_score(y_test, y_pred_rbf),
        r2_score(y_test, y_pred_poly)
    ]
})

results


,Kernel,MAE,RMSE,R2 Score
0,RBF,8612.408423,12889.096315,-0.070082
1,Polynomial,8607.801381,12872.961371,-0.067404
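The metric calls are repeated for every kernel, so a small helper can build the same table from a dictionary of predictions. This is a sketch; `regression_report` is a hypothetical name, and the demo arrays are made up to keep the block self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, predictions):
    """Return one row of MAE, RMSE, and R2 per named prediction array."""
    rows = []
    for name, y_pred in predictions.items():
        rows.append({
            "Kernel": name,
            "MAE": mean_absolute_error(y_true, y_pred),
            "RMSE": np.sqrt(mean_squared_error(y_true, y_pred)),
            "R2 Score": r2_score(y_true, y_pred),
        })
    return pd.DataFrame(rows)


# Made-up demo values: one perfect prediction and one offset by 1
y_true_demo = np.array([1.0, 2.0, 3.0, 4.0])
report = regression_report(
    y_true_demo,
    {"perfect": y_true_demo, "offset": y_true_demo + 1.0},
)
print(report)
```

In the notebook this could be called as `regression_report(y_test, {"RBF": y_pred_rbf, "Polynomial": y_pred_poly})`.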


In [None]:
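print("""
Conclusion:
On this test split the Polynomial kernel (degree 3) scores marginally better
than the RBF kernel on every metric: MAE 8607.80 vs 8612.41, RMSE 12872.96
vs 12889.10, and R² -0.0674 vs -0.0701. The gap is negligible, and both R²
values are negative, so each model does worse than always predicting the mean
charge. With default hyperparameters and an unscaled target, neither kernel
is a good choice yet; C, epsilon, and the target scaling need tuning before
a best kernel can be named.
""")



Conclusion:
On this test split the Polynomial kernel (degree 3) scores marginally better
than the RBF kernel on every metric: MAE 8607.80 vs 8612.41, RMSE 12872.96
vs 12889.10, and R² -0.0674 vs -0.0701. The gap is negligible, and both R²
values are negative, so each model does worse than always predicting the mean
charge. With default hyperparameters and an unscaled target, neither kernel
is a good choice yet; C, epsilon, and the target scaling need tuning before
a best kernel can be named.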

