In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_squared_error, r2_score


In [3]:
# Load the dataset
data = pd.read_csv('/content/Car price.csv')  # Replace this path with the location of your dataset


In [4]:
# Select relevant features and target variable
features = ['symboling', 'wheelbase', 'carlength', 'carwidth', 'carheight', 'curbweight',
            'enginesize', 'boreratio', 'stroke', 'compressionratio', 'horsepower', 'peakrpm',
            'citympg', 'highwaympg']
target = 'price'

In [5]:
X = data[features]
y = data[target]

In [6]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [7]:
# Create and train the RidgeCV model
model = RidgeCV(alphas=[0.1, 1.0, 10.0])  # Provide a list of alpha values to test
model.fit(X_train, y_train)
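Ridge regression penalizes coefficient size, so features on very different scales (for example curb weight versus bore ratio) are penalized unevenly. The sketch below is one possible way to standardize the features before `RidgeCV` and to inspect which alpha cross-validation selected. It uses synthetic data, and the names `X_demo`, `y_demo`, and `scaled_model` are illustrative assumptions, not part of the notebook above.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): two features on very different scales
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.normal(2500, 500, 200),  # large-scale feature, e.g. a weight
    rng.normal(3.3, 0.3, 200),   # small-scale feature, e.g. a ratio
])
y_demo = 5.0 * X_demo[:, 0] + 2000.0 * X_demo[:, 1] + rng.normal(0, 500, 200)

alphas = [0.1, 1.0, 10.0]

# Standardize features, then let RidgeCV pick alpha by cross-validation
scaled_model = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas))
scaled_model.fit(X_demo, y_demo)

# alpha_ holds the regularization strength that RidgeCV selected
chosen_alpha = scaled_model.named_steps["ridgecv"].alpha_
print("Selected alpha:", chosen_alpha)
```

If the selected alpha sits at either end of the list, it may be worth widening the grid, for example with `np.logspace`.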

In [8]:
# Make predictions on the test set
y_pred = model.predict(X_test)


In [9]:
# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)  # same value as squared=False, which newer scikit-learn releases no longer accept
r2 = r2_score(y_test, y_pred)

In [11]:
print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
print("R-squared:", r2*100)

Mean Squared Error: 14191049.705572823
Root Mean Squared Error: 3767.1009683273455
R-squared: 82.02390814726087


1. Mean Squared Error (MSE): The MSE value of 14191049.705572823 is the average squared difference between the predicted and actual prices. Because the errors are squared, the MSE is expressed in squared price units and is dominated by the largest errors, so it is hard to interpret on its own. A higher MSE means larger deviations from the true values, but whether a given value is large depends on the scale of the target variable.

2. Root Mean Squared Error (RMSE): The RMSE value of 3767.1009683273455 is the square root of the MSE, which puts the error back in the same units as the price. It describes the typical size of a prediction error, with large errors weighted more heavily than in a plain average of absolute errors. A lower RMSE indicates better model performance, as it reflects smaller deviations between predicted and actual values.

3. R-squared (R^2): R-squared, also known as the coefficient of determination, represents the proportion of the variance in the dependent variable (target variable) that is predictable from the independent variables (features) used in the model. The printed value of 82.02390814726087 is a percentage, because the code multiplies the score by 100; the underlying R-squared is about 0.82. A score of 1 is a perfect fit and 0 is no better than always predicting the mean. On a held-out test set the score can even be negative if the model does worse than that. In this case, approximately 82% of the variance in the test-set prices is explained by the features in the model.
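To make the three definitions above concrete, the following sketch computes MSE, RMSE, and R-squared by hand with NumPy and checks them against scikit-learn. The four prices in `y_true_demo` and `y_pred_demo` are made-up illustrative values, not taken from the car dataset.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up prices for illustration only (assumption)
y_true_demo = np.array([10000.0, 15000.0, 20000.0, 25000.0])
y_pred_demo = np.array([11000.0, 14000.0, 22000.0, 24000.0])

residuals = y_true_demo - y_pred_demo

# MSE: mean of the squared residuals (squared price units)
mse_manual = np.mean(residuals ** 2)

# RMSE: square root of the MSE (same units as the price)
rmse_manual = np.sqrt(mse_manual)

# R-squared: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print("MSE:", mse_manual)
print("RMSE:", rmse_manual)
print("R-squared:", r2_manual)
print("Matches scikit-learn:",
      np.isclose(mse_manual, mean_squared_error(y_true_demo, y_pred_demo)),
      np.isclose(r2_manual, r2_score(y_true_demo, y_pred_demo)))
```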

Based on these metrics, the model explains most of the variation in price, with an R-squared of about 0.82 on the test set. Whether an RMSE of roughly 3767 is acceptable depends on the price range in your data: the same error is small relative to an expensive car and large relative to a cheap one. These figures also come from a single train/test split, so they can shift with a different `random_state`. To put them in context, compare them with a simple baseline (such as always predicting the mean training price) and with other models.
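One way to run the baseline comparison mentioned above is scikit-learn's `DummyRegressor`, which always predicts the mean of the training target. The sketch below uses synthetic data so that it runs on its own. `X_demo`, `y_demo`, and the coefficients used to generate them are assumptions for illustration, so the printed numbers say nothing about the car dataset.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): a linear signal plus noise
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
y_demo = (X_demo @ np.array([3000.0, -1500.0, 500.0])
          + 13000.0 + rng.normal(0, 1000, 200))

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Baseline: always predict the mean of the training target
baseline = DummyRegressor(strategy="mean").fit(X_tr, y_tr)
ridge = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X_tr, y_tr)

baseline_rmse = np.sqrt(mean_squared_error(y_te, baseline.predict(X_te)))
ridge_rmse = np.sqrt(mean_squared_error(y_te, ridge.predict(X_te)))

print("Baseline RMSE:", baseline_rmse)
print("Ridge RMSE:", ridge_rmse)
```

To apply this to the notebook, swap in `X_train`, `X_test`, `y_train`, and `y_test` from the earlier cells. A model is only useful if its RMSE is clearly below the baseline's.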